• 1 Post
  • 308 Comments
Joined 2 years ago
Cake day: June 6th, 2023




  • If you are familiar with the concept of an NP-complete problem, the weights are just one possible solution.

    The Traveling Salesman Problem is probably the easiest analogy to make. It’s as though we’re all trying to find the shortest path through a bunch of points (e.g. towns), and when someone says “here is a path that I think is pretty good”, that is analogous to sharing network weights for an AI. We can then all openly test that solution against other solutions and determine which is “best” (a toy sketch of that checking is below).

    What they aren’t telling you is whether they somehow benefit from people traveling that path (maybe they own all the gas stations along it, or maybe they’ve hired highwaymen to rob people on it). And figuring out whether that’s the case in a hyper-dimensional space is non-trivial.
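
    To make the analogy concrete: checking any proposed tour is a quick loop, even though searching for the best tour blows up combinatorially. The sketch below is a made-up illustration (hypothetical towns and coordinates, not anyone’s real benchmark).

    ```python
    # Scoring a proposed tour is one cheap pass, even though *finding* the best
    # tour is the hard part. Openly testing shared weights on benchmarks plays
    # the same role for a model.
    import math

    towns = {"A": (0, 0), "B": (3, 4), "C": (6, 0), "D": (3, -4)}  # made-up coordinates

    def tour_length(tour):
        """Total distance of a closed route visiting each town once."""
        total = 0.0
        for here, there in zip(tour, tour[1:] + tour[:1]):
            (x1, y1), (x2, y2) = towns[here], towns[there]
            total += math.hypot(x2 - x1, y2 - y1)
        return total

    # Two published "solutions"; anyone can check which is better.
    print(tour_length(["A", "B", "C", "D"]))  # 20.0
    print(tour_length(["A", "C", "B", "D"]))  # 24.0 -> worse, reject it
    ```

    The part you can’t see from the published route alone is how the publisher found it, or what they stand to gain from everyone using it.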


  • It’s not sunk cost, dude. We agreed that $120 will get them 5 years of service that meets their needs. Even if they switch to jellyfin after 5 years, they still got their money’s worth.

    It’s only a sunk cost if they end up worse off than if they had switched earlier. I guess if you’re arguing that they would still have that $120 if they switched today, I would argue they should still pay that $120 toward Jellyfin’s development. And that’s assuming they have time to switch to Jellyfin AND that it fits 100% of their use cases, either of which could be untrue.


  • Or Plex currently does everything they need it to, and $120 for 5+ years of keeping that going without any interruption of service is very reasonable (that works out to at most $2 a month). In the meantime, Jellyfin will only get better, and there might even be other options available by then.

    Stop trying to make the issue black and white, one-size-fits-all. There are perfectly legitimate reasons for people to use both Plex and Jellyfin.





  • It seems like the issue here is that users want to be spoken to in colloquial language they understand, but any document a legal entity produces MUST be in unambiguous “legal” language.

    So unless there’s a way to write a separate “unofficial FAQ” with what they want to say, they are limited to what they legally have to say.

    And maybe that’s a good thing. Maybe now they need to create a formal document specifying, in the best legalese, exactly what they mean when they say they “will never sell your data”, because if there’s any ambiguity around it, then customers deserve to have it disambiguated. Unfortunately, it’s probably not going to read as quick and catchy as an ambiguous statement.


  • Afaik the cookie policy on your site is not GDPR compliant, at least as it is currently worded. If all cookies are “technically necessary” for the site to function, then I think all you need to do is say that. (I think for a wiki it’s acceptable to require clients to allow caching of image data, so your server doesn’t have to pay for more bandwidth.)
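
    For the caching part, if the wiki happened to be served by something like Flask, it could be as simple as a long-lived cache header on image responses, with no cookies involved at all. Everything below (route, paths, max-age) is a placeholder sketch, not your actual setup.

    ```python
    # Hypothetical sketch: serve wiki images with a Cache-Control header so
    # browsers cache them and repeat views don't cost the server bandwidth.
    from flask import Flask, send_from_directory

    app = Flask(__name__)

    @app.route("/images/<path:filename>")
    def images(filename):
        resp = send_from_directory("static/images", filename)
        resp.headers["Cache-Control"] = "public, max-age=604800"  # cacheable for a week
        return resp
    ```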


  • My recommendation would be to have two machines: new hw for all your services, and the old hw for your NAS. Each could run whatever OS you’re comfortable with. Most everything on the services machine could live in docker configs, including the network mount points to the NAS (rough compose sketch below). You might be able to get away with using the 1080 Ti in the services box, depending on what all you want to do (AI stuff or newer stream transcoding requirements may require newer hw).

    Moving the data from the old NAS to a new one without new disks will be a challenge, yes.

    I have a TrueNAS box and used jails for services. I recently set up a Debian box separately, and am switching from jails on TrueNAS to Docker on Debian. Wish I had done this from the start.
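
    For the mount-point part, a compose volume backed by NFS might look roughly like this. The address, export path, and service are placeholders for whatever your NAS actually exports, so treat it as a sketch rather than a working config.

    ```yaml
    # Hypothetical example: a service whose media library lives on an NFS share
    # exported by the NAS. Adjust the address, export path, and service to taste.
    services:
      jellyfin:
        image: jellyfin/jellyfin
        volumes:
          - media:/media:ro

    volumes:
      media:
        driver: local
        driver_opts:
          type: nfs
          o: "addr=192.168.1.50,nfsvers=4,ro"
          device: ":/mnt/tank/media"
    ```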




  • I agree that you can’t know if the AI has been deliberately trained to act nefariously given the right circumstances. But I maintain that it’s (currently) impossible to know if any AI has been inadvertently trained to do the same, so the security implications are no different. If you’ve given an AI the ability to exfiltrate data without any oversight, you’ve already messed up, no matter whether you’re using a single AI you trained yourself, a black box full of experts, or DeepSeek directly.

    But all this is about whether merely sharing weights is “open source”, and you’ve convinced me that it’s not. There needs to be a classification, similar to “source available”; this would be like “weights available”.



  • Is there any good LLM that fits this definition of open source, then? I thought the “training data” for good AI was always just: the entire internet, and they were all ethically dubious that way.

    What is the concern with only having weights? It’s not arbitrary code execution, so there’s no security risk or loss of control over your own computing, which are the usual goals of open source in the first place.

    To me the weights are less of a “blob” and more like an approximate solution to an NP-hard problem. Training is traversing the search space, and sharing a model is just saying “hey, this point looks useful, others should check it out” (sketch of that below). But maybe that is a blob, since I don’t know how they got there.
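
    Concretely, “checking it out” can just mean loading the published weights and scoring them on data you trust. The model, weights file, and loader below are all hypothetical, but the shape of it is roughly:

    ```python
    import torch

    def evaluate(model, weights_path, eval_loader):
        """Load someone else's published weights and score them on your own data."""
        model.load_state_dict(torch.load(weights_path, map_location="cpu"))
        model.eval()
        correct = total = 0
        with torch.no_grad():
            for inputs, labels in eval_loader:
                preds = model(inputs).argmax(dim=1)
                correct += (preds == labels).sum().item()
                total += labels.numel()
        return correct / total  # a higher score means "a better point", by your benchmark
    ```

    What you can’t do from the weights alone is reproduce the search that landed on that point, which is the part that still feels blob-like.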



  • Yeah, I agree that in the long term those two sentiments are inconsistent, but in the short term we have to deal with allegedly misguided layoffs and worse user experiences, which I think makes both fair to criticise. Maybe firing everyone and using slop AI will make your company go bankrupt in a few years, and that’s great; in the meantime, employees everywhere can rightfully complain about the slop and the jobs.

    But yeah, I don’t think it’s fair to complain about how “inefficient” an early technology is and also call it “magic beans”.