• 29 Posts
  • 179 Comments
Joined 1 year ago
Cake day: September 13th, 2024


  • The question is: What is an effective legal framework that focuses on the precise harms, doesn’t allow AI vendors to easily evade accountability, and doesn’t inflict widespread collateral damage?

    This is entirely my opinion and I’m likely wrong about many things, but at minimum:

    1. The model has to be open source: freely downloadable, runnable, and copyleft, satisfying the distribution requirements of any copyleft source material it was trained on (I’m willing to give a free pass on matching each specific license and accept any copyleft license in general, since different copyleft licenses can have contradictory distribution requirements, but IMO the leap from permissive to copyleft is the more important part). I suspect this alone would kill the AI bubble, because as soon as they can’t exclusively profit off it they won’t see AI as “the future” anymore.

    2. All training data needs to be freely downloadable and independently hosted by the AI creator. It goes without saying that only material you can legally copy and host on your own server can be used as training data. This solves the IP theft issue: IMO if your work is licensed such that it can be redistributed in its entirety, it should logically also be okay to use it as training data, and if you can’t even legally host it on your own server, using it to train AI is off the table. The independently hosted dataset (complete with metadata about where each item came from) also serves as attribution, since you can then search the training data for creators (see the sketch after this list).

    3. Pay server owners for use of their resources. If you’re scraping for AI you at the very least need to have a way for server owners to send you bills. And no content can be scraped from the original source more than once, see point 2.

    4. Either have a mechanism for tracking attribution and accurately generating references along with the code, or, if that’s too challenging, I’m personally also okay with a blanket policy where anything AI generated is public domain. The idea that you can use AI-generated code derived from open source in your proprietary app, and can then sue anyone who has the audacity to copy your AI-generated code, is ridiculous and unacceptable.
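
    To illustrate point 2: a rough sketch of how a searchable provenance manifest could double as attribution. The JSONL-style format, field names, and entries here are all hypothetical, purely for illustration.

    ```python
    import json

    # Hypothetical manifest: one JSON record per training document, hosted
    # alongside the dataset itself by the AI creator.
    MANIFEST_LINES = [
        '{"source_url": "https://example.org/post/42", "author": "Alice", "license": "CC-BY-4.0"}',
        '{"source_url": "https://git.example.com/bob/somelib", "author": "Bob", "license": "MIT"}',
    ]

    def works_by_creator(manifest_lines, creator):
        """Return every training record attributed to a given creator."""
        records = (json.loads(line) for line in manifest_lines)
        return [r for r in records if r["author"] == creator]

    # A creator can check whether (and under which license) their work was used.
    for record in works_by_creator(MANIFEST_LINES, "Alice"):
        print(record["source_url"], record["license"])
    ```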


  • “Wait, not like that”: Free and open access in the age of generative AI

    I hate this take. “Open source” is not “public domain” or “free rein to do whatever the hell you want with no acknowledgement to the original creator.” Even the most permissive MIT license has terms that every single AI company shamelessly violates. All code derived from open source code needs to at the very least reference the original author, so unless the AI can reliably and accurately cite where the code it generates came from, all AI-generated code that gets incorporated into any publicly distributed software violates the license of every single open source project it has ever scraped.

    And that’s to say nothing of projects with copyleft licenses that place conditions on how the code can then be distributed. Can AI reliably avoid using information from those codebases when generating proprietary code? No? And that’s not a problem because?

    I absolutely hate the hypocrisy that permeates the discourse around AI and copyright. Knocking off Studio Ghibli’s art style is apparently the worst atrocity you can commit but god forbid open source developers, most of whom are working for free, have similar complaints about how their work is used.

    Just because you “can’t” obey the license terms due to some technical limitation doesn’t mean you deserve a free pass from them. It means the technology is either too immature to be used or shouldn’t be used at all. Also, why aren’t they using LLMs when scraping to read the licenses and exclude anything other than pure public domain? Or better yet, use literally last century’s technology to read the robots.txt and actually respect it. It’s not even a technical limitation; it’s a case of “doing the right thing is too restrictive and won’t let us accomplish what we want, so we demand the right thing be expanded to cover what we’re trying to do.”
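
    And respecting robots.txt really is last-century tech. A rough sketch using nothing but Python’s standard library (the bot name and URLs are made-up placeholders, not anyone’s real crawler):

    ```python
    from urllib.robotparser import RobotFileParser

    # Placeholder crawler identity and target site.
    USER_AGENT = "ExampleAIScraper"
    ROBOTS_URL = "https://example.com/robots.txt"
    PAGE_URL = "https://example.com/some/article"

    robots = RobotFileParser()
    robots.set_url(ROBOTS_URL)
    robots.read()  # fetch and parse the site's robots.txt

    if robots.can_fetch(USER_AGENT, PAGE_URL):
        delay = robots.crawl_delay(USER_AGENT)  # honor Crawl-delay if the site sets one
        print(f"Allowed to fetch {PAGE_URL} (crawl delay: {delay})")
    else:
        print(f"robots.txt disallows {PAGE_URL} for {USER_AGENT}; don't scrape it")
    ```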

    Open source only has one or two core demands: credit me for my work, and possibly distribute derivatives in a way I can still take advantage of. And even that’s not good enough for these AI chuds; they think we’re the unreasonable ones for having these demands and not letting them use our code with no strings attached.

    This is where many creators find themselves today, particularly in response to AI training. But the solutions they’re reaching for — more restrictive licenses, paywalls, or not publishing at all — risk destroying the very commons they originally set out to build.

    Yeah, blame the people getting exploited and not the people doing the exploiting, why don’t you.

    Particularly with AI, there’s also no indication that tightening the license even works. We already know that major AI companies have been training their models on all rights reserved works in their ongoing efforts to ingest as much data as possible. Such training may prove to have been permissible in US courts under fair use, and it’s probably best that it does.

    No. Fuck that. There’s nothing fair about scraping an independent creator’s website (costing them real money) and then making massive profits from it. The creator literally fucking paid to have their work stolen.

    If a kid learns that carbon dioxide traps heat in Earth’s atmosphere or how to calculate compound interest thanks to an editor’s work on a Wikipedia article, does it really matter if they learned it via ChatGPT or by asking Siri or from opening a browser and visiting Wikipedia.org?

    Yes. And the fact that it’s stolen isn’t even the biggest problem by a long shot. In fact, even Wikipedia is a pretty shitty source; do what your high school teacher said you should do and search Wikipedia for its citations, not the articles themselves.

    Don’t let AI teach you anything you can’t instantly verify with an authoritative source. It doesn’t know anything and therefore can’t teach anything by definition.

    Instead of worrying about “wait, not like that”, I think we need to reframe the conversation to […] “wait, not in ways that threaten open access itself”.

    Okay, let’s do that then. All AI training threatens open access itself. If not by ensuring the creator can never make money to sustain their work, then by LITERALLY COSTING THE CREATORS MONEY WHEN THEIR CONTENT IS SCRAPED! So the conclusion hasn’t changed.

    The true threat from AI models training on open access material is not that more people may access knowledge thanks to new modalities. It’s that those models may stifle Wikipedia and other free knowledge repositories, benefiting from the labor, money, and care that goes into supporting them while also bleeding them dry. It’s that trillion dollar companies become the sole arbiters of access to knowledge after subsuming the painstaking work of those who made knowledge free to all, killing those projects in the process.

    And how does shaming the victims of that knowledge theft for having the audacity to try and do something about it help exactly?

    Anyone at an AI company who stops to think for half a second should be able to recognize they have a vampiric relationship with the commons.

    […]

    And yet many AI companies seem to give very little thought to this,

    “Anyone at a Southern slave plantation who stops to think for half a second should be able to recognize they have a vampiric relationship with their black slaves.” Yeah, they know. That’s the point.



  • it was called CROSS PLATFORM APPS

    Absolutely not, unless it’s as sandboxed as the web (and even the web isn’t sandboxed that well).

    Working with software has only made me not trust software that isn’t open source.

    Why we’re giving any random software full user-level access in 2026 is beyond me.





  • are tucked away behind unintuitive context menus

    That are well documented and don’t change once you figure out where they are. “UX” is code for “we’ll rearrange everything you need twice a year and force you to constantly re-learn our app because fuck you.”

    if you open the app for the first time and immediately think “this looks like it was last updated in 2003”, it’s not a good thing

    Why not? To me it’s reassuring, because it means the procedures I memorized years ago probably haven’t changed. It’s the same reason people like the command line so much. Office software UI is a solved problem and arguably peaked in 2003, before MS Office started adding all the bullshit; it doesn’t need to be updated every single year.


  • Boo. It’s one of the last pieces of GUI software without user infantilization syndrome. Go use Google Docs if you want your software to coddle you.

    I swear, if LibreOffice starts talking to me like I’m a child the way MS Office does, or starts having animations that actively slow me down and spike my CPU usage just to open a menu or something…

    Also, I’ve noticed a pretty strong correlation between “modern UX” and instability in office software. I don’t think I’ve ever had LibreOffice crash on me; the last major UX revision of MS Office definitely crashed more often than LibreOffice; and the latest version of MS Office crashes at least once every time I have to use it, taking my unsaved work with it even with autosave on. I don’t know what “experience” they’re aiming for, but not crashing and not causing data loss should probably be prioritized over making it look pretty.



  • “No longer needed” is probably never going to happen, but IMO being needed by fewer companies is inevitable. I see “vibe coding” as an extension of those website builders like Squarespace: definitely not suitable for a large website or a company whose entire business model is software and/or web-based services, but good enough that the owner of a small, non-tech company who just happens to need a website or simple app can do it themselves instead of paying someone on Fiverr or something to do it. Unfortunately, that means the options for new developers looking for easy experience-building jobs that could eventually help them land a better-paying position will be even more limited than they are now.



  • HiddenLayer555@lemmy.ml to Programmer Humor@lemmy.ml · Zero Trust Architecture

    This raises an interesting issue: should house guests expect to be given Wi-Fi access? I’ve personally never even asked for Wi-Fi when I go over to someone else’s house, because frankly I don’t trust their network. I don’t know what “smart devices” are port scanning every other device or collecting MAC addresses, and I don’t know if they’ve ever updated their router firmware or whether it’s been infected by one of the numerous pieces of malware automatically scanning the internet for unpatched routers. Not worth it; I’d rather use mobile data or not access the internet until I go home. Also, I don’t want Google or Cloudflare to know who my friends are and where they live by having my browser fingerprint show up on their IP.



  • Is Erlang special in its architecture or is it more that it’s functional?

    One day I’ll learn how to do purely functional, maybe even purely declarative, programming. But first I have to train my brain to think of computer programs like that.

    Is there a functional and/or declarative language that has memory management features similar to Rust as opposed to a garbage collector?


  • but literally beating the flagship desktop chips in single-core performance

    See, this is what I despise about x86. AFAIK it’s literally RISC on the bare metal, but there are hundreds of “instructions” implemented in microcode, which is basically just a translation layer. You’re not allowed to write code for the actual RISC implementation because that’s a trade secret or something. So obviously single-core performance would be shit, because you’re basically running an emulator all the time.

    RISC-V can’t come fast enough. Maybe someone will even make a chip that’s RISC-V but with the same instruction/microcode support as x86. So you can run RISC-V code directly or do the microcode thing and pretend you’re on x86. Though that would probably get the shit sued out of them by Intel because god forbid there’s actual innovation that the original creator can’t cash in on.