cross-posted from: https://programming.dev/post/51407459

Check which models you can run and at what rate, in tokens per second… It has examples for many models and quantization levels. Huge resource!

  • shellington@piefed.zip
    1 day ago

    Thank you, it suggested a model for my GPU that is far better than what I was using and still has a decent output speed.