• 1 Post
  • 26 Comments
Joined 2 years ago
Cake day: December 18th, 2023


  • “They” is the copyright industry. The same people who are suing AI companies for money want the Internet Archive gone for more money.

    I share the fear that the copyrightists will reach a happy compromise with the bigger AI companies and monopolize knowledge. But for now, the AI companies are fighting for fair use, and the Internet Archive is already benefiting from those precedents.







  • It’s a bit of a split among libertarians. Some very notable figures, like Ayn Rand, were strong believers in IP. In fact, Rand’s dogmas align closely with what is falsely represented as left-wing thought in the context of AI.

    It’s really irritating to me how often conservative capitalist ideals are passed off as left-wing. The common attitudes toward corporations channel Adam Smith. I think of myself as pragmatic and find that Smith, or even Hayek, had some good points (not Rand, though). But it’s absolutely grating how uneducated it all is. Worst of all, it makes me realize that, for all the anti-capitalist rhetoric, the favored policies would just make everything worse.




  • For fastest inference, you want to fit the entire model in VRAM. Plus, you need a few GB extra for context.

    Context means the text (plus images, etc.) it works on. For a chatbot, that’s the chat log, plus any texts you want it to summarize, translate, or answer questions about.

    Models can be quantized, which is a kind of lossy compression. They get smaller but also dumber. As with JPGs, the quality loss is insignificant at first and absolutely worth it.

    Inference can be split between GPU and CPU, substituting normal RAM for VRAM. That makes it slower, but it will probably still feel smooth.

    Basically, it’s all trade-offs between quality, context size, and speed.
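
    To put rough numbers on those trade-offs, here’s a minimal back-of-envelope sketch. The 7B parameter count, the quantization levels, and the per-token context cost are illustrative assumptions, not figures for any specific model:

    ```python
    # Ballpark VRAM estimate for running an LLM locally.
    # All constants below are illustrative assumptions.

    def weights_gb(params_billions: float, bits_per_weight: float) -> float:
        """Size of the weights alone: parameters x bits per weight, in GB."""
        return params_billions * 1e9 * bits_per_weight / 8 / 1e9

    def context_gb(tokens: int, gb_per_1k_tokens: float = 0.5) -> float:
        """Memory for the context (KV cache); grows with context length.
        0.5 GB per 1k tokens is a made-up ballpark; the real figure depends
        on layer count, hidden size, and whether the cache is quantized."""
        return tokens / 1000 * gb_per_1k_tokens

    for bits, label in [(16, "fp16"), (8, "Q8"), (4, "Q4")]:
        w = weights_gb(7, bits)               # a hypothetical 7B model
        total = w + context_gb(8192)          # with an 8k-token context
        print(f"{label}: ~{w:.1f} GB weights, ~{total:.1f} GB total")
    ```

    For the GPU/CPU split, llama.cpp (for example) exposes this directly: its --n-gpu-layers option controls how many layers go to VRAM, with the rest running from normal RAM.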







  • > Text explaining why the neural network representation of common features (typically with weighted proportionality to their occurrence) does not meet the definition of a mathematical average. Does it not favor common response patterns?

    Hmm. I’m not really sure why anyone would write such a text. There is no “weighted proportionality” (or pathways). Is this a common conception?

    You don’t need it to be an average of the real world to be an average. I can calculate as many average values as I want from entirely fictional worlds. It’s still a type of model that favors what it sees often over what it sees rarely. There is a form of probability embedded in it, corresponding to a form of average.

    I guess you picked up on the fact that transformers output a probability distribution. I don’t think anyone calls those an average, though you could have an average distribution. Come to think of it, before you use that distribution to pick the next token, you usually mess with it a little to make the output more or less “creative” (there’s a small sketch of this below). That’s certainly no longer an average.

    You can see a neural net as a kind of regression analysis, but I don’t think I have ever heard anyone call that a kind of average either. I’m also skeptical that you can see a transformer as a regression, but I don’t know this stuff well enough. When you train on some data more often than on other data, that is not how you would do a regression. Certainly, once you start RLHF training, you have left regression territory for good.

    The GPT-isms might be there because they are overrepresented in the fine-tuning data. They might also come from the RLHF, or be brought out by the system prompt.
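
    In case it’s unclear what “messing with it” means: here is a minimal sketch of temperature scaling, one common way to do it (the logit values are made up):

    ```python
    import numpy as np

    def softmax(x):
        e = np.exp(x - np.max(x))  # subtract max for numerical stability
        return e / e.sum()

    logits = np.array([2.0, 1.0, 0.2, -1.0])  # made-up scores for 4 candidate tokens

    for temperature in (0.5, 1.0, 1.5):
        probs = softmax(logits / temperature)
        # Low temperature sharpens the distribution (less "creative"),
        # high temperature flattens it (more "creative").
        print(temperature, probs.round(3))
    ```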


  • I accidentally clicked reply, sorry.

    > B) you do know there’s a lot of different definitions of average, right?

    I don’t think any of those definitions applies here, but I’m no expert on averages. In any case, the training data is not representative of the internet or anything else. The model is also not trained equally on all of that data, and not only on such text. What you get out is not representative of anything.
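
    To illustrate that the definitions really do disagree, even on the same toy data (arbitrary numbers):

    ```python
    from statistics import mean, median, mode

    data = [1, 1, 1, 2, 3, 100]
    print(mean(data), median(data), mode(data))  # 18.0, 1.5, 1: three different "averages"
    ```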




  • Who exactly creates the image is not the only issue; maybe I gave it too much prominence. Another factor is that the use of copyrighted training data is still being negotiated/litigated in the US. It will help if they tread lightly.

    My opinion is that it has to be legal on First Amendment grounds, or more generally on freedom-of-expression grounds. Fair use (a US doctrine) derives from the First Amendment, though not exclusively. If AI services can’t be used for creating protected speech, like parody, then that severely limits what the average person can express.

    What worries me is that the major lawsuits involve Big Tech companies. They have an interest in far-reaching IP laws; just not quite far-reaching enough to cut off their R&D.