cross-posted from: https://programming.dev/post/51407459

Check which models you can run and at how many tokens per second… It has examples of many models and quantization levels. Huge resource!

  • adr1an@programming.dev (OP)
    18 hours ago

    I regret having made so many crossposts. It was an impulse, since I wasn't sure where to post… I was also interested in seeing what fellow lemmings thought, and it's been enriching for me in that regard. I wonder how many people are really self-hosting LLMs… Apparently not that many.