The more I think about LLMs the stranger it feels trying to grasp what they are.
To me, when I'm working with them, they don't feel intelligent, but rather like an attempt at mimicking intelligence.
You can never trust that the AI actually did something smart or dumb. The judge always has to be you.
Its ability to pattern-match its way through a code base is impressive until it isn't, and you always have to pull it back to reality when it goes astray.
Its ability to plan ahead is so limited, and its way of "remembering" is so basic. Every day it's a bit like 50 First Dates.
Nonetheless, seeing what can be achieved with this pseudo-intelligence tool makes me feel a little in awe. It's the contrast between it not being intelligent and it achieving clearly useful outcomes when steered correctly, and the feeling that we have only just started to understand how to interact with this alien.
> they don't feel intelligent, but rather like an attempt at mimicking intelligence
Because that's exactly what they are. An LLM is just a big optimization function with the objective "return the most probabilistically plausible sequence of words in a given context".
There is no higher thinking. They were literally built as a mimicry of intelligence.
> Because that's exactly what they are. An LLM is just a big optimization function with the objective "return the most probabilistically plausible sequence of words in a given context".
> There is no higher thinking. They were literally built as a mimicry of intelligence.
Maybe real intelligence also is a big optimization function? The brain isn't magical; there are rules that govern our intelligence, and I wouldn't be terribly surprised if our intelligence in fact turned out to be a kind of returning of the most plausible thoughts. It might as well be something else, of course - my point is that "it's not intelligence, it's just predicting the next token" doesn't make sense to me - it could be both!
I don't understand why this point is NOT getting across to so many on HN.
LLMs do not think, understand, reason, reflect, or comprehend, and they never shall.
I have commented elsewhere but this bears repeating
If you had enough paper and ink and the patience to go through it, you could take all the training data and manually step through and train the same model. Then, once you had trained the model, you could use even more pen and paper to step through the correct prompts to arrive at the answer. All of this would be a completely mechanical process. This really does bear thinking about. It's amazing the results that LLMs are able to achieve. But let's not kid ourselves and start throwing around terms like AGI or emergence just yet. It makes a mechanical process seem magical (as do computers in general).
I should add that it also makes sense why this works: just look at the volume of human knowledge in the training data. It's the training data - quite literally the mass of mankind's knowledge, genius, logic, inferences, language, and intellect - that does the heavy lifting.
> If you had enough paper and ink and the patience to go through it, you could take all the training data and manually step through and train the same model.
But couldn't you make the exact same argument about a human mind? (You could just simulate all those neural interactions with pen and paper.)
The only way to get out of it is to basically admit magic (or some other metaphysical construct with a different name).
We do know that they are different, and that there are some systematic shortcomings in LLMs for now (e.g. no mechanism for online learning).
But we have no idea how many "essential" differences there are (if any!).
Dismissing LLMs as avenues toward intelligence just because they are simpler and easier to understand than our minds is a bit like looking at a modern phone from a 19th century point of view and dismissing the notion that it could be "just a Turing machine": Sure, the phone is infinitely more complex, but at its core those things are the same regardless.
I'm not so sure "a human mind" is the kind of Newtonian clockwork thingamabob you "could just simulate" at anything like the same degree of complexity as the thing you're simulating, at least not without some sacrifices.
Can you give examples of how "LLMs do not think, understand, reason, reflect, or comprehend, and they never shall" or "completely mechanical process" helps you understand better when LLMs work and when they don't?
Many people are throwing around that they don't "think", that they aren't "conscious", that they don't "reason", but I don't see those people sharing interesting heuristics to use LLMs well. The "they don't reason" people tend to, in my opinion/experience, underestimate them by a lot, often claiming that they will never be able to do <thing that LLMs have been able to do for a year>.
To be fair, the "they reason/are conscious" people tend to, in my opinion/experience, overestimate how much a LLM being able to "act" a certain way in a certain situation says about the LLM/LLMs as a whole ("act" is not a perfect word here, another way of looking at it is that they visit only the coast of a country and conclude that the whole country must be sailors and have a sailing culture).
It's an algorithm and a completely mechanical process which you can quite literally copy time and time again. Unless of course you think 'physical' computers have magical powers that a pen and paper Turing machine doesn't?
> Many people are throwing around that they don't "think", that they aren't "conscious", that they don't "reason", but I don't see those people sharing interesting heuristics to use LLMs well.
My digital thermometer doesn't think. Imbuing LLMs with thought will start leading to some absurd conclusions.
A cursory read of basic philosophy would help elucidate why casually saying LLMs think, reason, etc. is not good enough.
What is thinking? What is intelligence? What is consciousness? These questions are difficult to answer. There is NO clear definition. Some things are so hard to define (and people have tried for centuries), consciousness being the prime example, that they are a problem set unto themselves; see the hard problem of consciousness.
>My digital thermometer doesn't think. Imbuing LLMs with thought will start leading to some absurd conclusions.
What kind of absurd conclusions? And what kind of non absurd conclusions can you make when you follow your let's call it "mechanistic" view?
>It's an algorithm and a completely mechanical process which you can quite literally copy time and time again. Unless of course you think 'physical' computers have magical powers that a pen and paper Turing machine doesn't?
I don't, just like I don't think a human or animal brain has any magical power that imbues it with "intelligence" and "reasoning".
>A cursory read of basic philosophy would help elucidate why casually saying LLMs think, reason, etc. is not good enough.
I'm not saying they do or they don't. I'm saying that, from what I've seen, having a strong opinion about whether they think or they don't seems to lead people to weird places.
>What is thinking? What is intelligence? What is consciousness? These questions are difficult to answer. There is NO clear definition.
You seem pretty certain that, whatever those three things are, an LLM isn't doing it, a paper and pencil aren't doing it even when manipulated by a human, and the system of a human manipulating paper and pencil isn't doing it.
But you can automate much of that work by having good tests. Why vibe-test AI code when you can code-test it? Spend your extra time thinking how to make testing even better.
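A minimal sketch of what "code-testing" AI output can look like - `slugify` here is a hypothetical stand-in for any LLM-generated helper, not code from the thread:

```typescript
// Hypothetical LLM-generated helper: turn a title into a URL slug.
function slugify(title: string): string {
  return title
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse non-alphanumeric runs into one dash
    .replace(/^-|-$/g, ""); // strip leading/trailing dashes
}

// Cheap invariants pinned down as tests: these catch regressions
// regardless of which model (or human) last touched the function.
const cases: Array<[string, string]> = [
  ["Hello World", "hello-world"],
  ["  Already--slugged  ", "already-slugged"],
];
for (const [input, expected] of cases) {
  console.assert(slugify(input) === expected, `slugify(${input})`);
}
```

The point is that the judging step moves from "read the diff and vibe-check it" to "run the suite", which scales much better.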
It's a compressed database with diffuse indices. It's using probability matching rather than pattern matching. Write operations are called 'training' and 'fine-tuning'.
If you find yourself 50-first-dating your LLMs, it may be worth it to invest some energy into building up some better context indexing of both the codebase itself and of your roadmap.
Yeah, I admit I'm probably not doing that quite optimally.
I'm still just letting the LLM generate ephemeral .md files that I delete after a certain task is done.
The other day I found [beads](https://github.com/steveyegge/beads) and thought maybe that could be a good improvement over my current state.
But I'm quite hesitant, because I have also seen these AGENTS.md files become stale, and then there is also the question of how much information is too much, especially with limited context windows.
Probably all things that could again just be solved by leveraging AI more and I'm just an LLM noob. :D
Beads is basically what GitHub Issues is, but local and built in a way that LLMs can easily use. I had a self-made solution that was close, but moved to Beads because it worked out of the box without disrupting my workflow much.
I've used it quite a bit, but now that Gas Town is a thing, Beads is getting a bit bloated, and they're adding new features left and right; dunno why.
Might have to steal the best bits of Beads (the averaged-out CLI experience and JSONL for storing issues in the repo + a local SQLite cache) and build my own with none of the extra bells and whistles.
I wonder: if an open platform existed that people increasingly used, maybe that would be incentive enough for at least one bank/financial app to permit that platform, just to get a competitive advantage.
In the meantime probably the best that can be done is having a regular phone and a banking phone.
Maybe the answer is to put whatever the banks etc. need on something like a smartwatch. Smartwatch + phone is better than two phones IMHO, and watches are so tedious to use/install anything on that it reduces the attack surface for hackers etc. Tap to pay, digital signatures, identity, passkeys, etc. via a smartwatch interaction seems like a good use case. Sort of a souped-up YubiKey. I don't know how good biometrics are on watches nowadays, but my Pixel phone has some sort of camera behind the screen to read fingerprints, so I can't imagine it's impossible. Even adding a capacitive pad on a band seems plausible. Who knows; I don't feel like biometrics have been a real focus of design in the smartwatches I've used.
Personally, I have found smartwatches fairly useless (I do enjoy the activity tracking and notifications, but that's not much really), so freeing my phone from bullshit by moving some functions to a watch could increase the value/utility of some sort of smartwatch. Ultimately, it doesn't even need to be that "smart".
Still, the problem is that if you go this way, you'd have to put almost all useful functionality of a modern phone on a smartwatch, at which point you could just ditch the phone.
It's not just one tiny use case that's pushing us down the road of increasingly locked down devices. It's most use cases - because no matter the service, it's more profitable for the provider to control what you can and cannot do.
I don't think that's actually true? That's like insisting all useful functionality would have to be moved to a smartcard/yubikey/bitcoin hardware wallet/TPM etc. The main reason this is an issue is to prevent emulated hardware tokens. If you can disable secure boot, you can emulate secure elements and then things that others (i.e. your bank, government, etc) believe are carefully controlled secrets are not.
Doubtful - the costs of supporting it far outweigh any gain they'd have. In the case of banks, the costs of support aren't just about developing software for an additional platform, but also insurance premiums and managing the fallout of hacks (which always eventually happen) - both of which would go way up, as the company would be voluntarily supporting endpoint devices that are less secure than the "industry standard" minimum.
Both Convex and Instant let you build apps quickly without worrying about the backend. Where Instant differs is that queries and transactions can run on the client: you get optimistic updates and offline mode by default. Convex, on the other hand, lets you define queries as functions that can run on the edge. I haven't used Convex deeply, but this is my understanding.
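For a sense of what "optimistic updates by default" buys you, here is a rough, framework-free sketch of the pattern (all names here - `cache`, `serverToggle`, `toggleTodo` - are hypothetical illustrations, not Instant's or Convex's actual API):

```typescript
type Todo = { id: string; title: string; done: boolean };

// Local cache the UI renders from.
const cache = new Map<string, Todo>();

// Stand-in for a server round trip; pretend the server accepted the write.
function serverToggle(id: string, done: boolean): boolean {
  return true;
}

function toggleTodo(id: string): void {
  const todo = cache.get(id);
  if (!todo) return;
  const previous = todo.done;
  // Optimistic: flip the local state immediately so the UI updates now.
  cache.set(id, { ...todo, done: !previous });
  const ok = serverToggle(id, !previous);
  // Roll back if the server rejects the write.
  if (!ok) cache.set(id, { ...todo, done: previous });
}

cache.set("t1", { id: "t1", title: "write tests", done: false });
toggleTodo("t1");
// cache now reflects the toggled state without waiting on the server
```

Frameworks that run queries on the client essentially do this bookkeeping (and the offline queueing) for you.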
Thank you so much!
I’m a first-time builder of a bigger CRUD app. While I’m happy to build it with traditional methods the first time (REST API, SSE, auth, etc.), I would love to use offerings like Instant or Convex in my next projects.
Is Svelte, and by extension SvelteKit, somehow the next step in the evolution of frontend frameworks? From what I know, it has more fine-grained reactivity than, for example, React or Vue, and should therefore just run more efficiently. Or does the approach of Svelte also have drawbacks that I am not aware of?
> Is Svelte, and by extension SvelteKit, somehow the next step in the evolution of frontend frameworks?
I personally would say no. I like Svelte's dev experience, but I don't like the output code, and while it has smaller invalidation subsections than a full component, there's not a 1-to-1 mapping between a piece of data changing and the exact piece of DOM getting updated.
> Or does the approach of Svelte also have drawbacks that I am not aware of?
Svelte is superb for producing NYT infographics and other relatively lightweight experiences. I work on interface builders, and when you're scaling up the number of components on a page and their complexity, having the reactivity code repeated in the components instead of shared in a library becomes a drawback. What pushed me off of Svelte was a ~500 LoC component that had ~40 reactions and resulted in a 4.1k LoC JS file as output. I looked through the output and didn't see any particularly egregious mis-compilations, just that Svelte's approach resulted in verbose output. I don't think most people will have components this complex, so I don't think Svelte is a bad choice, and I do like the DX, but that caused me to move on.
Of the current options, I recommend Solid. It has fine grained reactivity all the way down, better perf, similar bundle size, and the community is generally performance obsessed. They're currently experimenting with islands/partial hydration/mixed server+client rendering and preliminary results are halving the delivered JS. As an example, their movies demo [1] is ~15k.
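To illustrate what "fine-grained reactivity all the way down" means, here is a toy signals sketch in the spirit of Solid's `createSignal`/`createEffect` - a simplified model for intuition, not Solid's actual implementation:

```typescript
type Effect = () => void;
let currentEffect: Effect | null = null;

// A signal is a value plus the set of effects that read it.
function createSignal<T>(value: T): [() => T, (v: T) => void] {
  const subscribers = new Set<Effect>();
  const read = () => {
    // Track the dependency: whichever effect is running right now
    // subscribes to this specific signal, not to a whole component.
    if (currentEffect) subscribers.add(currentEffect);
    return value;
  };
  const write = (v: T) => {
    value = v;
    // Re-run only the effects that actually read this signal.
    subscribers.forEach((fn) => fn());
  };
  return [read, write];
}

function createEffect(fn: Effect): void {
  currentEffect = fn;
  fn(); // run once to collect dependencies
  currentEffect = null;
}

// Only effects that read `count` re-run when it changes.
const [count, setCount] = createSignal(0);
let runs = 0;
createEffect(() => {
  runs++;
  count();
});
setCount(1);
setCount(2);
// runs is now 3: one initial run plus one per update
```

In a fine-grained framework, those effects map to individual DOM updates, which is why there's no diffing pass and no re-render of surrounding components.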
For example: Angular, React, Vue, etc. You can follow these tools as their popularity comes and goes, but at the end of the day they exist on top of HTML/CSS/JS. So while those three basic building blocks are there to stay forever, the tools on top are a game of musical chairs.
I'm not saying that they are bad; I've been using them at work for years. I'm saying that some people choose to focus on "old" tech and produce web pages and web apps without many tools. The advantage of this approach is that your environment is stable, so you can actually learn and master the tools. You can even find crazy tricks and "abuse" some of the features as you become really good at them.
It can be liberating to stop changing tools every year and instead learn the intricacies of the basic tools that will never change.