Since the article is largely about open weights models, I think the argument is that this is the "last gasp" and soon doing inference at home will be common.
The small models that I can run at home are becoming more capable, and I have replaced some API-based tasks with local inference as they improve, but large open weights models are still a lot stronger. The nice thing with larger open weights models is that competing providers serve them at modest margins and prices. I don't have the hardware to run the largest Qwen models, but I can get API access at low cost. Since there are only modest barriers to new commercial inference providers for these models I'm not worried that API access to them will become drastically more expensive at some future time.
The trend over the last decades has been towards more centralization, and I don't see that changing. Unless we radically change our economic system, the rent seekers will always win. There will probably be fewer of them, but they will be even bigger.
The work on self-driving cars would not be done if there were not a way to profit from it. If it isn’t expected to earn more than it costs, it wouldn’t be done.
Now maybe it’s all being done because of expectations of a monopoly (this is a free market consequence, right?), but …
Research has a large asymmetry to it: once someone shows something can be done, others can follow quite easily.
And more substantially, once someone shows something can be done, it's orders of magnitude easier for $ENGINEER at $CORP to get $CSUITE's attention to get a budget / justify the risk.
Your self-driving car example is the best example of this. Once Waymo and Tesla got popular, NVIDIA really started pushing its self-driving-cars-for-everyone tech.
> The work on self-driving cars would not be done if there were not a way to profit from it. If it isn’t expected to earn more than it costs, it wouldn’t be done.
I disagree. "Expected" is hiding a lot of the work there, and the reasons for that expectation could be rational or completely irrational.
Hypothetically, "work on self-driving cars" would be done if some crazy psychic-trusting billionaire's psychic told him it would be profitable, even if all rational analysis said it would be a bad business. A lot of major investments, especially in tech, are made based on hope or as a bet, not due to any real foresight.
Yeah. Lots of Discord-like free-software (as in freedom) chat apps are spawning. I think it's clear that whichever becomes the most popular will win not because it has better code but because it manages to build a stronger community around the project.
There are always quirks and edges. Take Bluesky itself: there are a number of viable apps for it (some better, some worse), and they're all slightly different. There was a large number of Reddit apps, and every single one was very different.
Completely disagree. The inability to handle specific math or CS is a matter of training and experience, not reasoning and intelligence. The barista is quite capable of reasoning and learning feats the LLMs aren't close to.
Yeah, there appears to be this idea that "being smart" is the same thing as "knowing facts", which I don't think is realistic.
I know plenty of people who are considerably smarter than me, but don't know nearly as much as I do about computer science or obscure 90's video game trivia. Just because I know more facts than they do (at least in this very limited scope) doesn't mean that they're less capable of learning than I am.
As you said, a barista is very likely able to reason about and learn new things, which is not something an LLM can really do.
It doesn't look anything like AGI and no one who knows what that means would be confused in any era.
Is it useful? Yes. Is it as smart as a person? Not even remotely. It can't even remember things it was told 5 minutes ago. Sometimes even if they are still in the context window, uncompacted!
No, the big thing with AGI was that it was general. The AI systems we made were extremely narrow: identifying things out of a set of classes, route planning, or something similarly specific. We couldn't just hand the systems a new kind of task, often not even extremely similar ones. We've been making superhuman-level narrow AI things for many years, but for a long time even extremely basic and restricted worlds were still beyond what more general systems could do.
If LLMs are your first foray into what AI means and you were used to the term ML for everything else I could see how you'd think that, but AI for decades has referred to even very simple systems.
If AGI doesn't mean human-level, then what does? As you say, every application of A* is in some way "AI", so we had this idea of "AGI" for something "actually intelligent". But maybe I'm wrong and AGI never meant that. What term does mean that?
They admit a raw LLM would be dangerous and then proceed to use RAG... How is this any better? You cannot allow an LLM to generate the final outbound message if you are liable for what it says.
An LLM to understand the question? Yes. Generating SQL, maybe with embeddings, to look up answers? Yes. Generating the final response? No.
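That split can be sketched in a few lines. This is a hypothetical illustration, not anyone's actual system: the two `llm_*` stubs stand in for real model calls, and all the names and templates are made up. The point is that the model classifies and extracts, while the outbound message comes only from fixed templates, so the model can never put arbitrary words in your mouth.

```python
# Sketch: LLM understands the question; the final response is templated.
# All names are illustrative; llm_* functions stand in for real model calls.

TEMPLATES = {
    "order_status": "Your order {order_id} is currently: {status}.",
}
FALLBACK = "Sorry, I can't help with that. A human agent will follow up."

def llm_classify_intent(question: str) -> str:
    """Stand-in for an LLM call that maps a question to a known intent."""
    return "order_status" if "order" in question.lower() else "unknown"

def llm_extract_params(question: str, intent: str) -> dict:
    """Stand-in for an LLM call that pulls structured fields from the text."""
    return {"order_id": "A1001"}

def run_lookup(intent: str, params: dict) -> dict:
    """Deterministic lookup (e.g. generated SQL against your DB), no free text."""
    return {"order_id": params["order_id"], "status": "shipped"}

def answer(question: str) -> str:
    intent = llm_classify_intent(question)        # LLM: understand the question
    if intent not in TEMPLATES:
        return FALLBACK                           # never improvise off-script
    params = llm_extract_params(question, intent)
    row = run_lookup(intent, params)
    return TEMPLATES[intent].format(**row)        # fixed wording -> auditable
```

Because every outbound sentence is one of your own templates, the set of things the bot can say is finite and reviewable, which is exactly what you want when you're liable for the output.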
4-bit is as low as I like to go. There are KLD and perplexity tests that compare quantizations, where you can see the curve of degradation, but perplexity and KLD numbers can be misleading compared to real-world use, where small errors compound over long sessions.
In my anecdotal experience I’ve been happier with Q6 and dealing with the tradeoffs that come with it over Q4 for Qwen3.5 27B.
Generally the perplexity charts indicate that quality drops significantly below 4-bit, so in that sense 4-bit is the sweet spot if you're resource constrained.
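For anyone unfamiliar with the KLD tests mentioned above, here's a minimal sketch of what they measure: per token position, how far the quantized model's next-token distribution drifts from the full-precision model's. The probability vectors below are made-up toy numbers; in practice you'd collect them from the logits of the two model variants (llama.cpp's perplexity tool has a `--kl-divergence` mode for this).

```python
import math

def kl_divergence(p, q):
    """KL(P || Q) for two discrete distributions over the same vocabulary."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Toy next-token distributions over a 3-token vocabulary (illustrative only).
full_precision = [0.70, 0.20, 0.10]   # reference model
q8_quantized   = [0.68, 0.21, 0.11]   # small divergence -> near-lossless
q2_quantized   = [0.40, 0.35, 0.25]   # large divergence -> visible degradation

print(kl_divergence(full_precision, q8_quantized))  # tiny
print(kl_divergence(full_precision, q2_quantized))  # much larger
```

Averaged over many tokens, this gives the degradation curve: it stays nearly flat down to around 4-bit and then bends sharply, which is why 4-bit is often called the sweet spot. But as noted above, a small per-token divergence can still compound over a long session.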