Off-and-on trying out an account over at @tal@oleo.cafe due to scraping bots bogging down lemmy.today to the point of near-unusability.

  • 1 Post
  • 447 Comments
Joined 2 years ago
Cake day: October 4th, 2023






  • Are Motorola ok?

    Depends on what you value in a phone. Like, I like a vanilla OS, a lot of memory, large battery, and a SIM slot. I don’t care much about the camera quality and don’t care at all about size and weight (in fact, if someone made a tablet-sized phone, I’d probably switch to that). That’s almost certainly not the mix that some other people want.

    There’s some phone comparison website I was using a while back that has a big database of phones and lets you compare and search based on specifications.

    goes looking

    This one:

    https://www.phonearena.com/phones



  • I don’t think that memory manufacturers are in some plot to promote SaaS. It’s just that they can make a ton of money off the demand right now for AI buildout, and they’re trying to make as much money as they can in the limited window that they have. All kinds of industries are going to be collateral damage for a while. Doesn’t require a more complicated explanation.

    Michael Crichton had a way of putting “it’s not about you” in Sphere that I remember liking.

    searches

    “I’m afraid that’s true,” Norman said. “The sphere was built to test whatever intelligent life might pick it up, and we simply failed that test.”

    “Is that what you think the sphere was made for?” Harry said. “I don’t.”

    “Then what?” Norman said.

    “Well,” Harry said, “look at it this way: Suppose you were an intelligent bacterium floating in space, and you came upon one of our communication satellites, in orbit around the Earth. You would think, What a strange, alien object this is, let’s explore it. Suppose you opened it up and crawled inside. You would find it very interesting in there, with lots of huge things to puzzle over. But eventually you might climb into one of the fuel cells, and the hydrogen would kill you. And your last thought would be: This alien device was obviously made to test bacterial intelligence and to kill us if we make a false step.

    “Now, that would be correct from the standpoint of the dying bacterium. But that wouldn’t be correct at all from the standpoint of the beings who made the satellite. From our point of view, the communications satellite has nothing to do with intelligent bacteria. We don’t even know that there are intelligent bacteria out there. We’re just trying to communicate, and we’ve made what we consider a quite ordinary device to do it.”

    Like, two years back, there was a glut of memory in the market. Samsung was losing a lot of money. They weren’t losing money back then because they were trying to promote personal computer ownership any more than they’re trying to deter personal computer ownership in 2026. It’s just that demand can gyrate more-rapidly than production capacity can adjust.





  • https://stackoverflow.com/questions/30869297/difference-between-memfree-and-memavailable

    Rik van Riel’s comments when adding MemAvailable to /proc/meminfo:

    /proc/meminfo: MemAvailable: provide estimated available memory

    Many load balancing and workload placing programs check /proc/meminfo to estimate how much free memory is available. They generally do this by adding up “free” and “cached”, which was fine ten years ago, but is pretty much guaranteed to be wrong today.

    It is wrong because Cached includes memory that is not freeable as page cache, for example shared memory segments, tmpfs, and ramfs, and it does not include reclaimable slab memory, which can take up a large fraction of system memory on mostly idle systems with lots of files.

    Currently, the amount of memory that is available for a new workload, without pushing the system into swap, can be estimated from MemFree, Active(file), Inactive(file), and SReclaimable, as well as the “low” watermarks from /proc/zoneinfo.

    However, this may change in the future, and user space really should not be expected to know kernel internals to come up with an estimate for the amount of free memory.

    It is more convenient to provide such an estimate in /proc/meminfo. If things change in the future, we only have to change it in one place.

    Looking at the htop source:

    https://github.com/htop-dev/htop/blob/main/MemoryMeter.c

       /* we actually want to show "used + shared + compressed" */
       double used = this->values[MEMORY_METER_USED];
       if (isPositive(this->values[MEMORY_METER_SHARED]))
          used += this->values[MEMORY_METER_SHARED];
       if (isPositive(this->values[MEMORY_METER_COMPRESSED]))
          used += this->values[MEMORY_METER_COMPRESSED];
    
       written = Meter_humanUnit(buffer, used, size);
    

    It’s adding used, shared, and compressed memory to get the amount actually tied up, but disregarding cached memory, which, based on the comment above, is problematic, since some of that cached memory may not actually be available for use.

    free and top, on the other hand, use the kernel’s MemAvailable directly. From procps’s free.c:

    https://gitlab.com/procps-ng/procps/-/blob/master/src/free.c

    	printf(" %11s", scale_size(MEMINFO_GET(mem_info, MEMINFO_MEM_AVAILABLE, ul_int), args.exponent, flags & FREE_SI, flags & FREE_HUMANREADABLE));
    

    In short: You probably want to trust /proc/meminfo’s MemAvailable (which is what top and free will show), and htop is probably giving a misleadingly-low number.
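
    If you want to check it yourself, here’s a minimal sketch (my own, not from either tool) that reads MemAvailable straight out of /proc/meminfo the way free does. Requires Linux 3.14+, which is when the field was added:

        /* Sketch: scan /proc/meminfo for the kernel's own estimate of
         * available memory, rather than recomputing it in user space. */
        #include <stdio.h>

        int main(void) {
            FILE *f = fopen("/proc/meminfo", "r");
            if (!f) {
                perror("fopen /proc/meminfo");
                return 1;
            }
            char line[256];
            unsigned long long kb;
            while (fgets(line, sizeof line, f)) {
                /* Lines look like: "MemAvailable:   12345678 kB" */
                if (sscanf(line, "MemAvailable: %llu kB", &kb) == 1) {
                    printf("MemAvailable: %llu kB (%.1f GiB)\n",
                           kb, kb / (1024.0 * 1024.0));
                    break;
                }
            }
            fclose(f);
            return 0;
        }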



  • If databases are involved they usually offer some method of dumping all data to some kind of text file. Usually relying on their binary data is not recommended.

    It’s not so much text versus binary. It’s that a normal backup program that just treats a live database file as a file to back up is liable to have the DBMS software write to the database while it’s being backed up, resulting in a backed-up file that’s a mix of old and new versions and may be corrupt.

    Either:

    1. The DBMS needs to have a way to create a dump that won’t change during the backup, possibly triggered by the backup software if it’s aware of the DBMS (see the SQLite sketch below for one example)

    or:

    2. One needs to have filesystem-level support to grab an atomic snapshot (e.g. one takes an atomic snapshot using something like btrfs and then backs up the snapshot rather than the live filesystem). This avoids the issue of the database file changing while the backup runs.

    In general, if this is a concern, I’d tend to favor #2 as an option, because it’s an all-in-one solution that deals with all of the problems of files changing while being backed up: DBMSes are just a particularly thorny example of that.

    Full disclosure: I mostly use ext4 myself, rather than btrfs. But I also don’t run live DBMSes.
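
    As a concrete example of option #1 at the API level: SQLite ships an online backup interface that copies a live database while respecting its locking, so the copy comes out consistent even if other connections keep writing. A minimal sketch, with error handling trimmed and placeholder filenames:

        /* Sketch: use SQLite's online backup API to get a consistent copy
         * of a live database. Build with -lsqlite3. */
        #include <sqlite3.h>
        #include <stdio.h>

        int main(void) {
            sqlite3 *src, *dst;
            if (sqlite3_open("live.db", &src) != SQLITE_OK ||
                sqlite3_open("backup.db", &dst) != SQLITE_OK) {
                fprintf(stderr, "open failed\n");
                return 1;
            }
            /* The backup object copies pages under the source database's
             * locking protocol, so the result is a consistent snapshot. */
            sqlite3_backup *b = sqlite3_backup_init(dst, "main", src, "main");
            if (b) {
                sqlite3_backup_step(b, -1); /* -1 = copy all pages */
                sqlite3_backup_finish(b);
            }
            sqlite3_close(src);
            sqlite3_close(dst);
            return 0;
        }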

    EDIT: Plus, #2 also provides consistency across different files on the filesystem, though that’s usually less-critical. Like, you won’t run into a situation where software on your computer updates File A, then does a sync(), then updates File B, but your backup program grabs the new version of File B and the old version of File A. Absent help from the filesystem, your backup program won’t know where write barriers spanning different files are happening.

    In practice, that’s not usually a huge issue, since fewer software packages are gonna be impacted by this than by write ordering internal to a single file, but it is permissible for a program, under Unix filesystem semantics, to expect that write ordering to persist and to kerplode if it doesn’t…and a traditional backup won’t preserve it the way that a backup with help from the filesystem can.



  • Move Over, ChatGPT (Technology@beehaw.org, 6 days ago)

    In all fairness, while this is a particularly bad case, the fact that it’s often very difficult to safely fiddle with environment variables at runtime in a process, but very convenient to use them to cram extra parameters into a library, has meant that a lot of human programmers who should know better have created problems like this too.

    IIRC, setting the timezone for some of the Posix time APIs on Linux has the same problem, and that’s a system library. And IIRC SDL and some other graphics libraries, plus Linux 3D stuff, have used this as a way to pass parameters out-of-band to libraries, which becomes a problem when programs start dicking with them at runtime. I remember reading an article from someone working on Linux gaming about how various programs and libraries for games would setenv() to fiddle with these, and races associated with that were responsible for a substantial number of crashes that they’d seen.

    setenv() is not thread-safe or signal-safe. In general, reading environment variables in a program is fine, but modifying them is not safe in very many situations.

    searches

    Yeah, the first thing I see is someone talking about how its lack of thread-safety is a problem for TZ, which is the time thing that’s been a pain for me a couple times in the past.

    https://news.ycombinator.com/item?id=38342642
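
    To make the hazard concrete, here’s a minimal sketch of the racy pattern (my own construction, not code from the linked thread): one thread flipping TZ with setenv() while another reads the time. POSIX gives no thread-safety guarantees for this combination, and with glibc it can return torn results or crash outright:

        /* One thread mutates the environment while another reads it via the
         * time APIs; this is the race described above. Build with -lpthread. */
        #define _POSIX_C_SOURCE 200809L
        #include <pthread.h>
        #include <stdio.h>
        #include <stdlib.h>
        #include <time.h>

        static void *mutator(void *arg) {
            (void)arg;
            for (int i = 0; i < 100000; i++) {
                /* The out-of-band-parameter pattern: flip TZ back and forth. */
                setenv("TZ", (i & 1) ? "UTC" : "America/New_York", 1);
                tzset();
            }
            return NULL;
        }

        static void *reader(void *arg) {
            (void)arg;
            time_t now = time(NULL);
            struct tm result;
            for (int i = 0; i < 100000; i++) {
                /* localtime_r() consults TZ internally, racing with setenv(). */
                localtime_r(&now, &result);
            }
            return NULL;
        }

        int main(void) {
            pthread_t a, b;
            pthread_create(&a, NULL, mutator, NULL);
            pthread_create(&b, NULL, reader, NULL);
            pthread_join(a, NULL);
            pthread_join(b, NULL);
            puts("done (if it didn't crash first)");
            return 0;
        }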

    Back on your issue:

    Claude, being very smart and very good at drawing a straight line between two points, wrote code that took the authentication token from the HTTP request header, modified the process’s environment variables, then called the library

    for the uninitiated - a process’s environment variables are global. and HTTP servers are famously pretty good at dealing with multiple requests at once.

    Note also that a number of webservers used to fork to handle requests — and I’m sure that there are still some now that do so, though it’s certainly not the highest-performance way to do things — and in that situation, this code could avoid problems.

    searches

    It sounds like Apache used to and apparently still can do this:

    https://old.reddit.com/r/PHP/comments/102vqa2/why_does_apache_spew_a_new_process_for_each/
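
    For illustration, a minimal sketch of why fork-per-request sidesteps the race: each child gets a private copy of the environment, so a per-request setenv() in the child can’t stomp on other requests. The AUTH_TOKEN name here is hypothetical:

        /* Each forked child owns a private copy of the environment, so its
         * setenv() never leaks into the parent or sibling requests. */
        #define _POSIX_C_SOURCE 200809L
        #include <stdio.h>
        #include <stdlib.h>
        #include <sys/wait.h>
        #include <unistd.h>

        int main(void) {
            setenv("AUTH_TOKEN", "parent-value", 1);
            pid_t pid = fork();
            if (pid == 0) {
                /* Child: set the variable for "this request" only. */
                setenv("AUTH_TOKEN", "child-value", 1);
                printf("child sees:  %s\n", getenv("AUTH_TOKEN"));
                _exit(0);
            }
            waitpid(pid, NULL, 0);
            /* Parent's environment is untouched by the child's setenv(). */
            printf("parent sees: %s\n", getenv("AUTH_TOKEN"));
            return 0;
        }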

    But it does highlight one of the “LLMs don’t have a broad, deep understanding of the world, and that creates problems for coding” issues that people have talked about. Like, part of what someone is doing when writing software is identifying situations where behavior isn’t defined and clarifying that, either via asking for requirements to be updated or via looking out-of-band to understand what’s appropriate. An LLM that’s working by looking at what’s commonly done in its training set just isn’t in a good place to do that, and that’s kinda a fundamental limitation.

    I’m pretty sure that the general case of writing software is AI-hard, where the “AI” referred to by the term is an artificial general intelligence that incorporates a lot of knowledge about the world. That is, you can probably make an AI that writes software, but it won’t be just an LLM of the “generative AI” sort that we have now.

    There might be ways that you could incorporate an LLM into software that can itself write software. But I don’t think that it’s just going to be a raw “rely on an LLM taking in a human-language set of requirements and spitting out code”. There are just things that that can’t handle reasonably.


  • I think that the problem will be if software comes out that doesn’t target home PCs. That’s not impossible. I mean, that happens today with Web services. Closed-weight AI models aren’t going to be released to run on your home computer. I don’t use Office 365, but I understand that at least some of that is a cloud service.

    Like, say the developer of Video Game X says “I don’t want to target a ton of different pieces of hardware. I want to tune for a single one. I don’t want to target multiple OSes. I’m tired of people pirating my software. I can reduce cheating. I’m just going to release for a single cloud platform.”

    Nobody is going to take your hardware away. And you can probably keep running Linux or whatever. But…not all the new software you want to use may be something that you can run locally, if it isn’t released for your platform. Maybe you’ll use some kind of thin-client software — think telnet, ssh, RDP, VNC, etc for past iterations of this — to use that software remotely on your Thinkpad. But…can’t run it yourself.

    If it happens, I think that that’s what you’d see. More and more software would just be available only to run remotely. Phones and PCs would still exist, but they’d increasingly run a thin client, not run software locally. Same way a lot of software migrated to web services that we use with a Web browser, but with a protocol and software more aimed at low-latency, high-bandwidth use. Nobody would ban existing local software, but a lot of it would stagnate. A lot of new and exciting stuff would only be available as an online service. More and more people would buy computers that are only really suitable for use as a thin client — fewer resources, closer to a smartphone than what we conventionally think of as a computer.

    EDIT: I’d add that this is basically the scenario that the AGPL is aimed at dealing with. The concern was that people would just run open-source software as a service. They could build on that base, make their own improvements. They’d never release binaries to end users, so they wouldn’t hit the traditional GPL’s obligation to release source to anyone who gets the binary. The AGPL requires source distribution to people who even just use the software.


  • I will say that, realistically, in terms purely of physical distance, a lot of the world’s population is in a city and probably isn’t too far from a datacenter.

    https://calculatorshub.net/computing/fiber-latency-calculator/

    It’s about five microseconds of latency per kilometer down fiber optics, so ten microseconds per kilometer round-trip.

    I think a larger issue might be bandwidth for some applications. Like, if you want to unicast uncompressed video to every computer user, say, you’re going to need an ungodly amount of bandwidth.

    DisplayPort looks like it’s currently up to 80 Gb/sec. Okay, not everyone is currently saturating that, but if you want comparable capability, that’s what you’re going to have to be moving from a datacenter to every user. For video alone. And that’s assuming that they don’t have multiple monitors or something.
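
    As a rough sanity check on both numbers, a back-of-the-envelope sketch (the distance and display figures here are my own assumptions, purely illustrative):

        /* Back-of-the-envelope: fiber round-trip latency at ~5 us/km each
         * way, and the raw bit rate of uncompressed video. */
        #include <stdio.h>

        int main(void) {
            /* Light in fiber covers roughly 200,000 km/s, i.e. ~5 us/km. */
            double km = 100.0; /* assumed distance to the datacenter */
            double round_trip_ms = km * 2.0 * 5.0 / 1000.0;
            printf("round trip over %.0f km of fiber: %.1f ms\n",
                   km, round_trip_ms);

            /* Uncompressed 4K at 120 Hz, 30 bits per pixel. */
            double bps = 3840.0 * 2160.0 * 30.0 * 120.0;
            printf("uncompressed 4K120: %.1f Gb/s per display\n", bps / 1e9);
            return 0;
        }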

    I can believe that it is cheaper to have many computers in a datacenter. I am not sold that any gains will more than offset the cost of the staggering fiber rollout that this would require.

    EDIT: There are situations where it is completely reasonable to use (relatively) thin clients. That’s, well, what a lot of the Web is — browser thin clients accessing software running on remote computers. I’m typing this comment into Eternity before it gets sent to a Lemmy instance on a server in Oregon, much further away than the closest datacenter to me. That works fine.

    But “do a lot of stuff in a browser” isn’t the same thing as “eliminate the PC entirely”.