  • The 1.5B version can be run on basically anything. My friend runs it on his shitty laptop with a 512 MB iGPU and 8 GB of RAM (inference takes ~30 seconds)

    You don’t even need a GPU with good VRAM, as you can offload part of the model to RAM (slower inference, though); there’s a rough sketch of that at the end of this comment

    I’ve run the 14B version on my AMD 6700 XT GPU and it only takes ~9 GB of VRAM (inference over 1k tokens takes 20 seconds). The 8B version takes around 5–6 GB of VRAM (inference over 1k tokens takes 5 seconds)

    The numbers in your second link are waaaaaay off.
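
    A minimal sketch of the partial-offloading idea mentioned above, assuming llama-cpp-python and a locally downloaded GGUF quant of the 14B distill (the file name and layer count below are hypothetical, not a recommendation):

    ```python
    # Minimal sketch: keep some transformer layers on the GPU and let the rest
    # run from system RAM. Tune n_gpu_layers to whatever fits in your VRAM.
    from llama_cpp import Llama

    llm = Llama(
        model_path="DeepSeek-R1-Distill-Qwen-14B-Q4_K_M.gguf",  # hypothetical local file
        n_gpu_layers=35,  # layers offloaded to the GPU; remaining layers stay in RAM
        n_ctx=4096,       # context window
    )

    out = llm("Explain bit flips in one paragraph.", max_tokens=256)
    print(out["choices"][0]["text"])
    ```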

  • Unrelated, but the other day I read that the main computer used for core calculations at Fukushima’s nuclear plant ran on a very old 4-core CPU. Every calculation was done on each core, and the results had to be exactly identical. If one of them differed, they knew there had been a bit flip and could discard that core’s result for that calculation.
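
    A rough sketch of that redundancy scheme (not the plant’s actual software; the calculation, worker count, and voting logic here are my own illustration):

    ```python
    # Minimal sketch: run the same calculation on several cores and keep the
    # majority result, discarding any copy that disagrees (e.g. after a bit flip).
    from collections import Counter
    from concurrent.futures import ProcessPoolExecutor

    def core_calculation(x: float) -> float:
        # stand-in for the real computation, executed identically on every core
        return x * x + 1.0

    def redundant_run(x: float, copies: int = 4) -> float:
        with ProcessPoolExecutor(max_workers=copies) as pool:
            results = list(pool.map(core_calculation, [x] * copies))
        majority, _ = Counter(results).most_common(1)[0]
        bad = [r for r in results if r != majority]
        if bad:
            print(f"discarding {len(bad)} mismatching result(s): {bad}")
        return majority

    if __name__ == "__main__":
        print(redundant_run(3.0))
    ```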