• REDACTED@infosec.pub
      13 hours ago

      Claude outperforms in coding and agentic tasks. I asked about LLMs as chat models. It’s still at the top of the benchmarks, and still the most popular one by far.

      Even with Claude, the difference isn’t big, and Claude is the only one that has managed to surpass it in benchmarks. So… still at the bottom? You sure about that?