
No, that sounds right. 24GB isn’t enough to feasibly run a 27B-parameter model. The rule of thumb is approximately 1GB of RAM per billion parameters.

Someone in another comment on this post mentioned using one of the micro models (Qwen 0.6B I think?) and getting decent results. Maybe you could try that and progressively work your way up?

EDIT: “Queen” -> “Qwen”



That rule of thumb only applies to 8-bit quants at low context. The default for ollama is 4-bit, which puts a 27B model at roughly 14GB.

The vast majority of people run between 4 and 6 bit depending on system capability. The extra accuracy above 6 bit tends not to be worth it relative to the performance hit.
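The arithmetic behind these numbers is just parameter count times bits per weight. A rough sketch (weights only; it ignores KV cache and runtime overhead, which grow with context length):

    def model_size_gb(params_billions: float, bits_per_weight: float) -> float:
        """Weight-only memory estimate; ignores KV cache and runtime overhead."""
        # 1e9 params and 1e9 bytes-per-GB cancel out, leaving params * bits / 8
        return params_billions * bits_per_weight / 8

    for bits in (4, 6, 8, 16):
        print(f"27B at {bits}-bit: ~{model_size_gb(27, bits):.1f} GB")

That gives ~27GB at 8-bit (too big for 24GB) and ~13.5GB at 4-bit, which is where the ~14GB figure comes from.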


You also need to leave space for other apps. If you run a 27B model on a 32GB machine you may find you can't productively run much else.

I have 64GB and I can only just fit a bunch of Firefox and VS Code windows at the same time as running a 27B model.
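If you want to sanity-check the headroom before loading a model, something like this works. A sketch assuming psutil is installed; the 1.2x overhead factor is a guess, not a measured number:

    import psutil

    def fits_in_free_ram(model_gb: float, overhead_factor: float = 1.2) -> bool:
        """True if the model (plus a guessed runtime overhead) fits in currently available RAM."""
        available_gb = psutil.virtual_memory().available / 1e9
        return model_gb * overhead_factor < available_gb

    print(fits_in_free_ram(13.5))  # 27B at 4-bit, with your other apps still open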



