To be clear, GLM 4.7 Flash is a MoE model with 30B total params but <4B active params, while Devstral Small is 24B dense (all params active, all the time). GLM 4.7 Flash is much, much cheaper inference-wise.
I don't know whether it just doesn't work well in GGUF / llama.cpp + OpenCode, but I can't get anything useful out of Devstral 2 24B running locally. Probably a skill issue on my end, but I'm not very impressed. Benchmarks are nice, but they don't always translate to real-life usefulness.
They also have an API which you can use to get the icon SVG.
I love making (architecture) diagrams in D2 [1], and I love using the vast library of icons from Iconify in my diagrams where it makes sense. A sample diagram with an SVG from Iconify would look like this:
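Something along these lines, as a minimal sketch with made-up node names (the icon URLs are SVGs served from Iconify's api.iconify.design endpoint):

    # D2: render each node as its Iconify SVG
    direction: right

    web: Web Server {
      shape: image
      icon: https://api.iconify.design/mdi/server.svg
    }

    db: Database {
      shape: image
      icon: https://api.iconify.design/logos/postgresql.svg
    }

    web -> db: queries

If I remember the syntax right, with shape set to image D2 draws the node as the SVG itself rather than as a box with a small icon attached.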
This is really nice. I keep track of the habits most important to me, like how often I go to the gym, how much protein I eat every day, and how many days I read (books), on something physical (pen and paper), mostly on monthly calendars. This would make tracking each of them separately on a single piece of paper across the entire year pretty neat.
Started making side projects as a developer this year and hope to start working on my own products full-time next year. Two books I found useful for product positioning:
Good stuff. You will enjoy my short essay, I want to give a lot of fucks! [1], which argues against the typical conclusion reached by people who have worked at a big corp long enough: "Stop caring. Stop giving a fuck. Focus on things outside of work."
The core insight is: if you start to feel the need to stop caring, instead of changing your character and values, treat it as a strong signal to change your environment.
Pricing is $0.5 / $3 per million input / output tokens. 2.5 Flash was $0.3 / $2.5. That's a ~66% increase in input token pricing and a 20% increase in output token pricing.
For comparison, from 2.5 Pro ($1.25 / $10) to 3 Pro ($2 / $12), there was a 60% increase in input token pricing and a 20% increase in output token pricing.
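For anyone checking the math, those percentages are just the ratio of new to old price (per million tokens, from the figures above):

    Flash input:   0.5 / 0.3  ≈ 1.67  → ~66% increase
    Flash output:  3 / 2.5    = 1.20  →  20% increase
    Pro input:     2 / 1.25   = 1.60  →  60% increase
    Pro output:    12 / 10    = 1.20  →  20% increase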
> Gemini 3 Flash is able to modulate how much it thinks. It may think longer for more complex use cases, but it also uses 30% fewer tokens on average than 2.5 Pro.
Started a newsletter [1] focused on agentic coding updates, nothing else. Other newsletters/blogs cover a lot of generic AI news, industry gossip, and marketing fluff. A focused feed is something I wanted for myself, and I finally have enough time to write it regularly.
https://the-good-doctor.fandom.com/wiki/Not_Fake