cross-posted from: https://programming.dev/post/51407459

Check which models you can run and at how many tokens per second… It has examples of many models and quantization levels. Huge resource!

  • CallMeAl (like Alan)@piefed.zip
    1 day ago

    I don’t know why you are spamming this across the Threadiverse, but this is a data-harvesting site with no respect for privacy that loudly announces how important privacy is. No thanks.

    • adr1an@programming.devOP
      13 hours ago

      I regret having made so many crossposts. It was an impulse, since I wasn’t sure where to post… I was also interested in seeing what fellow lemmings’ take was, and it’s been enriching for me in that regard. I wondered how much people were really self-hosting LLMs… Apparently not that much.

    • YodaDaCoda@aussie.zone
      22 hours ago

      How do you figure it’s a data harvesting site? It’s actually suggested a model that works pretty well for me.

      • CallMeAl (like Alan)@piefed.zip
        22 hours ago

        Because I looked at the source code on the site. Being a data harvester doesn’t mean they don’t give real answers. Just that there is no such thing as a free lunch. You got a suggestion and they got your data.

  • shellington@piefed.zip
    24 hours ago

    Thank you, it suggested a model for my GPU that is far better than what I was using and still has a decent output speed.