ExLlama + GPTQ is the way to go on GPUs
llama.cpp + GGUF are great on CPUs
More data: https://oobabooga.github.io/blog/posts/gptq-awq-exl2-llamacp...