For me the game changer here is the speed. On my local Mac I'm finally getting generation speeds faster than I can read the output (~96 tok/s), and the quality has been solid. I had previously tried some of the distilled Qwen and DeepSeek models, and they were just way too slow for me to seriously use.