Kind of impressive when you think about it

cm0002@infosec.pub · 1 day ago

TheEighthDoctor@lemmy.zip · 13 hours ago

Claude

REDACTED@infosec.pub · 13 hours ago

Claude outperforms in coding and agentic tasks. I asked about the LLM as a chat model. It’s still at the top of the benchmarks, and still the most popular one by far.

Even with Claude, the difference isn’t big, and it’s the only one that has managed to surpass it in benchmarks, so… still “at the bottom”? You sure about that?

timestatic@feddit.org · 10 hours ago

The context was about coding specifically tho

REDACTED@infosec.pub · edit-2 8 hours ago

I was directly questioning the “at the bottom” phrase, not the entire context. Context matters, or something

That being said:

Benchmarks

Source