• pelespirit@sh.itjust.works
      arrow-up
      38
      ·
      20 hours ago

      It’s like a comedy routine.

      • “Question and no guessing or mistakes gemini”
      • “Confident answer that’s totally made up”
      • “Were you guessing or hallucinating in the last statement?”
      • “I apologize, you’re 100% correct that I was guessing”
      • “How do I get you to stop guessing”
      • “Use this code…”
      • Repeat endlessly

      I do think it comes up with some buried, interesting sources, but that’s about it. It’s like a 5 year old.

      • affenlehrer@feddit.org
        arrow-up
        19
        ·
        17 hours ago

        It can’t not hallucinate. It’s just predicting (not even selecting) next tokens. It doesn’t know what it knows and what it doesn’t know. It can’t introspect. It just gives probabilities for all possible tokens in it’s vocabulary based on the context window and the inference engine selects the next one (based on it’s settings). Without having the correct answer in the context window it can just make a prediction based on it’s (fixed) neutral net parameters and these are severely limited, even for big models. What I mean is, they basically “learned” the whole Internet and compressed the whole thing into some hundred billion or a few trillion parameters. That’s an insane compression ratio. This compression is lossy. For niece information and the results are similar to the “unimportant” details in highly compressed JPGs, you can make out the general image but fine details are just a mush. The LLM itself doesn’t know this, it just gives wrong predictions.
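
To make the "model gives probabilities, inference engine selects" split concrete, here is a minimal sketch in Python. The vocabulary, the raw scores, and the helper names `softmax` and `sample_next_token` are made up for illustration; a real model has a vocabulary of many thousands of tokens, and real inference engines add more settings (top-k, top-p and so on) that are not shown here.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Turn raw model scores into a probability distribution over tokens."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_next_token(vocab, logits, temperature=1.0, rng=None):
    """The 'inference engine' part: pick one token according to the probabilities."""
    rng = rng or random.Random()
    probs = softmax(logits, temperature)
    return rng.choices(vocab, weights=probs, k=1)[0]

# Hypothetical toy vocabulary and scores (assumptions, not real model output).
vocab = ["Paris", "Lyon", "Berlin", "banana"]
logits = [4.0, 2.0, 1.5, -3.0]

probs = softmax(logits)
token = sample_next_token(vocab, logits, temperature=0.8, rng=random.Random(0))
```

Note that nothing in this loop checks whether the chosen token is true. A wrong token with a high score is selected just as readily as a right one, and lowering the temperature only makes the selection more deterministic, not more correct.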

        For what it does I think the result is extremely impressive but the way it works is severely limited.

        • pelespirit@sh.itjust.works
          arrow-up
          4
          ·
          17 hours ago

          From what I understand from what it told me about itself, it can be wrangled further. That is what the paid-for versions are. The info can be sandboxed and then other agents verify the correctness of the info against very specific, known-to-be-solid sources. This is very expensive and still not foolproof. Am I wrong in thinking this bubble is going to pop hard?

          • affenlehrer@feddit.org
            arrow-up
            3
            ·
            edit-2
            5 hours ago

            The models and methods are improving. Especially through tool use (Internet search, MCP, using programming languages) the model output improves a lot. Reasoning models are allowed to admit mistakes during thinking (“wait, that’s wrong”); in normal conversation they will never say that unless you point them at the mistake. Otherwise they basically predict tokens, the inference engine selects one and they go with what was selected.

            It’s a bit like you remember some wrong information (Mandela effect), you’re confident it’s correct so you don’t double check and go with it. They usually don’t even know how confident they are, they have no introspection.

            In software development scenarios LLMs, due to their high “compression”, often hallucinate (misremember) methods and parameters that don’t exist in an API or that belong to a different API, or they don’t know about new versions of the API. Many of those errors are caught when the code doesn’t compile or unit tests fail, but some of them stay (e.g. if the model created the unit tests and they don’t test what they’re supposed to test).
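
As a rough illustration of how a hallucinated method or parameter can be caught before the code even runs, here is a small sketch that checks a suggested call against the real library. `check_call` is a hypothetical helper written for this example, not an existing tool, and `shutil.copy_file` and the `overwrite` parameter are invented to stand in for a misremembered API.

```python
import importlib
import inspect

def check_call(module_name, func_name, keyword_args=()):
    """Return a list of problems with a suggested call; an empty list means none were found."""
    module = importlib.import_module(module_name)
    func = getattr(module, func_name, None)
    if func is None:
        return [f"{module_name}.{func_name} does not exist"]
    params = inspect.signature(func).parameters
    # A function that takes **kwargs accepts any keyword, so names cannot be checked here.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return []
    return [
        f"{func_name}() has no parameter '{name}'"
        for name in keyword_args
        if name not in params
    ]

ok = check_call("shutil", "copyfile", ["follow_symlinks"])   # real function, real parameter
bad_name = check_call("shutil", "copy_file")                 # invented function name
bad_param = check_call("shutil", "copyfile", ["overwrite"])  # invented parameter
```

This only catches names that don’t exist. A call that exists but does something different from what the model assumed passes this check, which is the same gap as a unit test that doesn’t test what it’s supposed to test.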

            Also a bit like humans the models often don’t have the whole codebase in the context so they make assumptions about the rest. Since they have no introspection they often don’t double check if those assumptions are correct.

            In the case of frontend design they often can’t “see” the output, or at least not in the way we do. They don’t really know if something looks “good” or not (depending on their training).

            Verification with other agents can help but fundamentally these agents have the same issues. It’s a workaround.

            I’m actually not sure if the bubble will pop. I believe LLMs can be useful in some fields but they’re not the path to AGI. They’re also way too resource intensive and used in a lot of situations where it’s dangerous or doesn’t make sense.

            However, it’s not that the AI researchers don’t know about the limitations. They’ve been trying to work around the issues of LLMs with some success for years now, and they kind of have to, because LLMs kind of work and bring publicity. Behind the scenes the AI craze also brought money for research in different directions with other fundamentals, e.g. JEPA, world models, diffusion models, logic-based models, energy-based models, small recursive models and a lot of optimizations to make things faster and cheaper to compute.

            The bubble could pop if one of the major companies does something stupid and their stock tanks, but as long as money is pumped in, there is also actual progress on new fundamentals, and if they are developed before the bubble pops we might get “real” AI or AGI.

            • pelespirit@sh.itjust.works
              arrow-up
              3
              ·
              9 hours ago

              I was trying to see what AI says this time about the bubble and it was kind of interesting. This is from Gemini btw:

              Question: But Google is switching to AI search only and everyone hates AI that I know. Isn’t that going to be a problem?

              Yes, the intense public backlash against AI search is a major risk for Google, but the company is trapped in what economists call the “Innovator’s Dilemma”—they must force AI search on users to protect their monopoly, even if people hate it. Google’s sweeping AI Search overhaul has sparked heavy criticism over inaccurate answers, cluttered layouts, and a ruined web ecosystem. However, understanding why they are doing it reveals the corporate survival strategy at play.

              The Problem: Why Everyone is Frustrated

              • The Accuracy Deficit: Users are highly vocal about AI “hallucinations”. When people want hard facts, a summarized paragraph that is only “mostly accurate” is a massive step backward from a trusted link.
              • The “Web-Laundering” Backlash: Independent publishers, bloggers, and creators are furious. Google’s AI scrapes their data, answers the question directly on the page, and deprives the original creator of website clicks. This threatens to destroy the very content ecosystem Google relies on.
              • The “Web-Clean” Workaround: The dislike of these features is prominent enough that a growing number of tech-savvy users are actively using URL filters like &udm=14 to strip AI elements and force Google back to the classic “ten blue links” format.
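
For reference, the `&udm=14` filter mentioned in the list above is just an extra query parameter on the search URL. A minimal sketch of building such a URL follows; the base URL and the helper name `classic_search_url` are assumptions made for this example.

```python
from urllib.parse import urlencode

def classic_search_url(query, base="https://www.google.com/search"):
    """Build a search URL with udm=14 appended (base URL is an assumption)."""
    params = {"q": query, "udm": "14"}
    return f"{base}?{urlencode(params)}"

url = classic_search_url("lossy compression jpeg")
```

The same string can be set as a custom search engine in a browser, with the query replaced by the browser’s placeholder, so the filter is applied to every search.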

              Why Google is Doing It Anyway

              Despite the clear user pushback, Google’s leadership is deliberately choosing to cannibalize its own classic search experience for two critical reasons:

              • The Threat of Total Obsolescence: If Google does not build an AI “answer engine,” users might completely migrate to OpenAI, Perplexity, or ChatGPT for their everyday queries. Google would rather you be annoyed by their AI than fall in love with a competitor’s AI.
              • The “Silent Majority” Metric: While power users and creators complain loudly on platforms like Reddit and X, Google’s internal data shows that the general public is adapting. At the recent conference, Google VP Elizabeth Reid revealed that AI Mode usage has been doubling every quarter, quietly crossing over 1 billion monthly active users. For basic, casual queries, the passive majority takes the trade-off of quick answers over clicking links.

              Why Google Can Afford the Backlash (For Now)

              A full-blown consumer revolt is unlikely to break Google anytime soon due to their massive structural defense mechanisms:

              • Unshakable Monopoly: As of early 2026, Google still commands over 90% of the global search market share. Bing sits at a distant 5%. Because consumers have no true equivalent alternative for mass searching, Google can weather significant user dissatisfaction without losing its user base.
              • Default Ecosystems: Google pays billions annually to remain the default search engine on Apple iPhones and Safari browsers. Unless consumers manually dive into their settings to change their default search provider, they will continue to use Google by default.
              • affenlehrer@feddit.org
                arrow-up
                1
                ·
                5 hours ago

                Understanding questions and summarizing information is a field where LLMs are quite good. If they actually look up the sources and “read” the websites, they are often able to give good answers. If they don’t use tools and just answer from what they “remember”, the information often contains hallucinations.

                So from a user perspective I think search will get better for specific questions.

                However, traffic to websites and all the things the LLM omits are lost. If the LLM gives you the answer, you don’t learn about the author of the information, the design of the website, the nuanced and maybe thoughtful story the author built around the information and all the other stuff the author put there.

          • Jiral@lemmy.org
            arrow-up
            6
            ·
            15 hours ago

            If you want to know if a bubble will pop, just look at the fundamentals. Yes, I know, especially during bubbles people tell you that fundamentals don’t matter, but they always win in the end. The thing is that you cannot bet on them, because the market can always stay irrational longer than you can stay liquid. Eventually, however, it always corrects to the fundamentals again. Those can of course change over time, but looking at the insane amounts of money flowing into data centers with no possible way of recovering that cost, I think the picture is clear. We also have wonderful, highly circular money flows that to a large extent do not even exist but are all taken at face value.

            The only question is when it implodes. Within a year, within three? Who knows.

          • LurkingLuddite@piefed.social
            English
            arrow-up
            7
            ·
            17 hours ago

            It’s definitely going to pop hard, because those “verifying agents” are just more models computing correlation with sources, not actually verifying anything.

  • WesternInfidels@feddit.online
    English
    arrow-up
    25
    ·
    17 hours ago

    In the future you will downvote nothing and b̵̰͇̹͔̩̹̲͉͉̟̜͂̓̊͝ȩ̸̛̹̠͈͍͓̬̥̱̰̯͎̖̤̫̏̐̍̏̐̀̍̓͜͜ ̴̢̧̰͉̗̠̼̹̙̩̱̫͖̫̂̇̃̅̎͆̕͜ͅh̴̙͓͎͉̩͙͎̪̥̺̔̈a̷̡̰̗͚̪͈̩͆̋͘p̵̢̨̙̪͚̞͇͎̱̰͕̈́̃̂̽̋̒̚͠p̶̛̛͕̓̌̒̒̅̿̃̃̿̌͠͝y̶̢̛͍̟̥̲͇̠̗̼͍͓͐̒̋̿̽̄̓̋͐͊̎͘

  • const_void@lemmy.ml
    arrow-up
    59
    ·
    20 hours ago

    The number of bugs in the YouTube app has been crazy lately. They’re clearly relying heavily on vibecoding.

    • cinoreus@lemmy.world
      arrow-up
      23
      arrow-down
      1
      ·
      18 hours ago

      Dude, so many times I have encountered bad software lately, and I always thought, some idiot must have used an LLM for this. And I don’t even know why. Everything that was already working well started breaking down for some reason. Why are devs wrecking their code base just to fix minor bugs?

      • AzuranAurora@piefed.ca
        English
        arrow-up
        25
        ·
        17 hours ago

        The execs and shareholders demand it in order to justify the massive investment they’ve made in their shitty chatbots.

        • RidcullyTheBrown@programming.dev
          arrow-up
          5
          ·
          8 hours ago

          Or, hear me out, they’re doing it to justify their continued employment.

          We could have stopped developing almost every piece of popular software in 2018 or so. Imagine putting YouTube, Slack, Reddit, etc. in maintenance mode a decade ago and only investing in scalability and stability. Wouldn’t the world be a better place?

          But that would mean that a lot of people would be out of a job, entire departments really, so they’re doing crap to justify their existence.

        • sobchak@programming.dev
          arrow-up
          3
          ·
          12 hours ago

          They also want training data from their employees correcting the agents so the agents can eventually replace them (that’s what they’re hoping for at least).

  • DandomRude@lemmy.world
    English
    arrow-up
    28
    ·
    20 hours ago

    Isn’t it unbelievable how multi-billion-dollar corporations push out updates that should never have been released, despite the warnings from their undoubtedly highly competent developers?

    That alone says everything you need to know about the times we live in…

    It seems we’ve reached the end of the road for hyper-capitalism.

    • voidsignal@lemmy.world
      arrow-up
      16
      ·
      19 hours ago

      I mean why should they care? They continuously fuck people over, and they keep coming back. I would do the same at that point. #leave

      • DandomRude@lemmy.world
        English
        arrow-up
        5
        arrow-down
        1
        ·
        19 hours ago

        Yes, that’s true, because of the network effect. But you can still rest on your laurels as long as there’s no serious competition. Another motivation for PeerTube & Co.

        Regarding federated applications: I think they not only need content, but also have to become significantly more user-friendly to ever have a chance in the mainstream. It’s simply a reality that the average user doesn’t know the first thing about the applications they use—and, above all, that they never want to know. The essential and only “selling point” is and remains convenience—and even setting aside the lack of content, federated applications unfortunately can’t keep up. Not for technical reasons, but because the average internet user is such a complacent wimp.

        • voidsignal@lemmy.world
          arrow-up
          4
          ·
          17 hours ago

          Yeah. “me me me, now now now”. And they get exactly what they deserve. Convenience, at any cost.

          But honestly it’s not even really convenient. It’s just habituation. Windows is the opposite of simple. Google is not simple (try to find anything in that Google Drive soup). iOS is terrible at doing anything other than giving Apple money. They are just used to it. That’s all. And I don’t think we should replicate any of this in the Fediverse. Because I’m a Fediverse user, I love how it is, and I DO NOT want to see it flooded by stupid flows taken from greedy, useless Silicon Valley product managers. And if that’s what keeps Rando Joe from using the Fediverse, so be it. Maybe they’ll come later after getting fucked a bit too hard by big tech and realize that, yeah, maybe they can spend 10 minutes learning something new, because fundamentally, it’s better. We are building an alternative, not a replacement.

  • bright@piefed.social
    English
    arrow-up
    9
    ·
    18 hours ago

    Apple is at least as much at fault here. The new UI system they introduced about a year ago is a shitshow.

  • Hettyc_Tracyn@lemmy.zip
    arrow-up
    6
    ·
    19 hours ago

    I have been using YouTube Redux and the element remover tool of uBlock Origin…

    Makes YouTube much more pleasant (plus makes it look like it used to before the stupid UI updates)

    • MonkderVierte@lemmy.zip
      arrow-up
      2
      ·
      18 hours ago

      Is there also a trick to make it less slow to load, like old.youtube or youtube.classic or something? I mean, they have no performance goals or what?

      • hexagonwin@lemmy.today
        arrow-up
        2
        ·
        14 hours ago

        rehike or invidious is a thing, has been quite a while since i tried it so no idea if they still work reliably tho

      • LurkingLuddite@piefed.social
        English
        arrow-up
        4
        ·
        17 hours ago

        There are user agent switcher extensions for Firefox et al. that can sidestep the garbage, anticompetitive throttling Google does against Firefox.

        It’s also nice for those certain stupid websites that think they cannot work in Firefox…

      • Hettyc_Tracyn@lemmy.zip
        arrow-up
        3
        ·
        18 hours ago

        YouTube Redux just changes the CSS

        Idk, try different browsers, VPNs, VPN configurations? I don’t usually have trouble with loading stuff… it takes a few seconds which is to be expected

  • IninewCrow@lemmy.ca
    English
    arrow-up
    6
    ·
    20 hours ago

    I think it’s done by design to drive people away from using a browser, where people have more control and the website has less control (of course depending on the skill and patience of the user).

    Having a shitty service on browsers and on certain devices just drives people to view YouTube more and more on self-contained smart services like Smart TVs or dedicated viewing devices.

    I use Linux, Firefox and have ad blocking extensions … and sometimes YouTube just becomes unusable

  • farmgineer@nord.pub
    English
    arrow-up
    1
    ·
    17 hours ago

    The app gets messed up on my Japanese LG TV every now and again (and I seem to have lost the ability to thumbs-up videos again), but it’s never been unusable. The app on my Android phone looks fine at the moment.