Hacker News | coderenegade's comments

This is my take as well. A human who learns, say, a Towers of Hanoi algorithm will be able to apply it and use it next time without having to figure it out all over again. An LLM would probably get there eventually, but would have to do it all over again from scratch the next time. This makes it difficult to combine lessons in new ways. Any new advancement relying on that foundational skill means, essentially, climbing the whole mountain from the ground.

I suppose the other side of it is that if you add what the model has figured out to the training set, it will always know it.


That's just not true at all. There are entire fields that rest pretty heavily on brute force search. Entire theses in biomedical and materials science have been written to the effect of "I ran these tests on this compound, and these are the results", without necessarily any underlying theory more than a hope that it'll yield something useful.

As for advances where there is a hypothesis, it rests on the shoulders of those who've come before. You know from observations that putting carbon in iron makes it stronger, and then someone else comes along with a theory of atoms and molecules. You might apply that to figuring out why steel is stronger than iron, and your student takes that and invents a new superalloy with improvements to your model. Remixing is a fundamental part of innovation, because it often teaches you something new. We aren't just alchemying things out of nothing.


Well, we know that mixing lead into copper won't make for a strong material. There's a lot of human ingenuity involved.

I failed to make my point clear: humans narrow the search space far more than current-day AI does.


This. Code generation is cheap, so you can rapidly explore the space and figure out the architecture that best suits the problem. From there, I start fresh and pseudocode the basic pattern I want and have Claude fill in the gaps.

There needs to be a measure (or measures) of the entropy of a codebase that provides a signal of complexity. When you're paying for every token, you want code patterns that convey a lot of immediate information to the agent so that it can either repeat the pattern, or extend it in a way that makes sense. This is probably the next wave of assisted coding (imo), because we're at the stage where writing code works, the quality is mostly decent, but it can be needlessly complex given the context of the existing repo.

There's a way to measure the "entropy" of a codebase. Take something like the binary lambda calculus or the triage calculus, convert your program (including libraries, programming-language constructs, and the operating system) into it, and measure the resulting program's size in bits.
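The minimal encoding that this measure asks for is uncomputable in general, but a cheap upper-bound proxy is compressed size, since a good compressor exploits exactly the redundancy a low-entropy codebase has. A minimal sketch (the `entropy_bits` helper name is my own, not from any library):

```python
import zlib

def entropy_bits(source: str) -> int:
    """Upper bound on a program's entropy: its size in bits after
    DEFLATE compression. A crude stand-in for the (uncomputable)
    minimal encoding in something like the binary lambda calculus."""
    return 8 * len(zlib.compress(source.encode("utf-8"), 9))

terse = "def add(a, b): return a + b\n"
redundant = terse * 50  # 50 copies: far longer, barely more entropy

print(entropy_bits(terse))
print(entropy_bits(redundant))
```

The repetitive codebase is 50x the size of the terse one but compresses to nearly the same number of bits, which is the intuition behind using entropy as a complexity signal.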

You can also measure the cross-entropy, which is essentially the whole-program entropy above minus the entropy of the programming language and the standard-library functions (i.e. abstractions you assume are generally known). This is useful for evaluating conformance to "standard" abstractions.

There is also a way to measure a "maximum entropy" using types, by counting the number of states a data type can represent. The maximum entropy of a function is the cross-entropy between its inputs and outputs (treating the function as a communication channel).
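As a toy illustration of the state-counting idea (the little type algebra here is hypothetical, not a real library): a product type multiplies the states of its fields, a sum type adds them, and the maximum entropy is the log2 of the total.

```python
import math

def states(t):
    """Number of values a toy type description can represent."""
    if t == "bool":
        return 2
    if t == "u8":
        return 256
    if isinstance(t, tuple) and t[0] == "product":  # struct / record
        out = 1
        for field in t[1:]:
            out *= states(field)
        return out
    if isinstance(t, tuple) and t[0] == "sum":      # tagged union / enum
        return sum(states(v) for v in t[1:])
    raise ValueError(f"unknown type: {t!r}")

def max_entropy_bits(t):
    """Maximum entropy of a type: log2 of its state count."""
    return math.log2(states(t))

# A record { flag: bool, code: u8 } represents 2 * 256 = 512 states,
# i.e. 9 bits of maximum entropy.
print(max_entropy_bits(("product", "bool", "u8")))  # 9.0
```

Tightening the types (e.g. replacing a `u8` with an enum of three valid codes) shrinks the state count, which is why "make illegal states unrepresentable" lowers this measure.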

The "difference" (I am not sure how to make them convertible) between "maximum entropy" and "function entropy" (size in bits) then shows how good your understanding (compared to specification expressed in type signature) of the function is.

I have been advocating for some time that we use entropy measures (and information theory) in software engineering to estimate complexity, and thus the time required for a change.


Maybe cyclomatic complexity would be a good proxy. It can obviously be gamed, but it's obvious when it is.

There was a measure used during the Toyota unintended-acceleration case called McCabe cyclomatic complexity; I wonder if anyone is using it alongside AI-assisted code.
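For anyone curious, McCabe complexity can be approximated as 1 plus the number of decision points. A rough sketch over Python's own AST (a simplification of the full metric, which also counts each individual boolean operator):

```python
import ast

# Node types that introduce an extra path through the code.
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler,
                ast.BoolOp, ast.IfExp)

def cyclomatic_complexity(source: str) -> int:
    """Approximate McCabe complexity: 1 + number of decision points
    (if/for/while/except handlers, boolean ops, conditional exprs)."""
    tree = ast.parse(source)
    return 1 + sum(isinstance(node, BRANCH_NODES)
                   for node in ast.walk(tree))

code = """
def classify(x):
    if x < 0:
        return "negative"
    for _ in range(3):
        if x % 2 == 0:
            return "even-ish"
    return "other"
"""
print(cyclomatic_complexity(code))  # 4: base 1 + if + for + if
```

Tools like radon compute the same family of metrics more carefully; this sketch just shows how little machinery the measurement needs.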

It is roughly equivalent to diff size: https://entropicthoughts.com/lines-of-code

I mean, it's ultimately a string, and measuring the entropy of a string is well studied. The LLM might start gaming that with variable names, so you'd need to work on the AST instead. I may actually try something like that; cool idea.
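One way to make the measurement name-proof, as a sketch: canonicalize identifiers via the AST before compressing, so two structurally identical programs measure the same regardless of naming (the `Anonymize` transformer here is illustrative, not a standard tool):

```python
import ast
import zlib

class Anonymize(ast.NodeTransformer):
    """Rename every identifier to a canonical placeholder (v0, v1, ...)
    so entropy measurement can't be gamed with variable names."""
    def __init__(self):
        self.names = {}

    def visit_Name(self, node):
        node.id = self.names.setdefault(node.id, f"v{len(self.names)}")
        return node

def ast_entropy_bits(source: str) -> int:
    """Entropy proxy over the structural form: parse, anonymize names,
    then measure the compressed size of the AST dump in bits."""
    tree = Anonymize().visit(ast.parse(source))
    canonical = ast.dump(tree)  # structural form, names erased
    return 8 * len(zlib.compress(canonical.encode("utf-8")))

# Same structure, different naming styles -> same measured entropy.
a = "total = price + tax"
b = "x = y + z"
print(ast_entropy_bits(a), ast_entropy_bits(b))
```

A fuller version would also canonicalize function/attribute names and string constants, but the principle is the same: measure the shape, not the spelling.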

I think that in the long run, AI assisted coding will turn out to be better than handcrafted code. When you pay for every token, and code generation is quick, a clean, low entropy codebase with good test coverage gets you a lot more for your dollar than a dog's breakfast. It's also much easier to fix bad decisions made early on in a project's life, because the machine is doing all of the heavy lifting.

This also lines up with the history of automation in many other industries. Modern manufacturing is capable of producing parts that a medieval blacksmith couldn't dream of, for example. Sure, maybe an artisan can produce better code than an LLM now, but AI-assisted humans will beat them in the near future if they aren't already producing similar-quality output at greater speed, and tomorrow's models will fix the bad code written today. The fact that there's even a debate between automated and handwritten code today means the writing is almost certainly on the wall.


You mean like I have to pay my compiler to turn high level code into low level code?

I suspect this is more true than most people think. Today's bad code will be cleaned up by tomorrow's agents.

The other factor that gets glossed over is that LLMs create a financial incentive to write cleaner code, with tests, because the agent you pay for is more efficient when the code is easier to understand and has clear patterns for extensibility. When I code with LLMs, a big part of it is demonstration, i.e. pseudocoding a pattern or structure, asking the model if it understands, and then having it complete the pattern. I've had a lot of success with this approach.


> llms create a financial incentive to create cleaner code, with tests, because the agent that you pay for will be more efficient when the code is easier to understand, and has clear patterns for extensibility

Right, this is the kind of discussion we're having on my team: suddenly all of the already good engineering practices like good observability, clear tests with high coverage, clean design, etc. act as a massive force multiplier and are that much more important. They're also easier to do if you prioritize it. We should be seeing quality go up. It's trivial to explore the solution space with throwaway PoCs, collect real data to drive your design, do all of those "nice to have" cleanups, etc. The people who assume LLM = slop are participating in a bizarre form of cope. Garbage in, garbage out; quality in, quality out. Just accept that coding per se is not going to be a profession for long. Leverage new tools to learn more, do more, etc. This should be an exciting time for programmers.


You're more likely to save tokens in the architecture than the language. A clean, extensible architecture will communicate intent more clearly, require fewer searches through the codebase, and take up less of the context window.

Depending on the type of catalytic converter, both of those things can be true.


That's cost, not practicality. Like it or not, the EV isn't as flexible when it comes to ownership, because you need a place to charge it. A product that is less practical has to be cheaper to compete in the market.


>>A product that is less practical has to be cheaper to compete in the market.

Unless the downside doesn't matter to you, then obviously it doesn't. Our e-Up was more expensive than a regular petrol Up, but it was absolutely worth paying the extra for the convenience of being able to charge it at home - it's like having your own personal petrol station in your own driveway.

For someone else, that might have been an inconvenience and the car would have to be much cheaper to offset the hassle - for us it was worth the premium. So it's not so clear cut as you present it.


Depends completely on the EV and usage patterns. Here's one of the more interesting points in the design space: https://silence-mobility.nissan.de/


As long as users are better than 50% accurate, it shouldn't matter if they're experts or not. That being said, it's difficult to measure user accuracy in this case without running into circular reasoning.
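This is essentially Condorcet's jury theorem: if each independent vote is correct with probability above 50%, a simple majority gets more accurate as votes are added. A quick Monte Carlo sketch (the function name is mine, and the independence assumption is doing a lot of work here):

```python
import random

def majority_accuracy(p: float, voters: int, trials: int = 20000,
                      seed: int = 0) -> float:
    """Monte Carlo estimate of the probability that a simple majority
    of independent voters, each correct with probability p, is correct.
    Ties (possible only with an even voter count) count as incorrect."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(trials):
        votes = sum(rng.random() < p for _ in range(voters))
        correct += votes > voters / 2
    return correct / trials

# Per-user accuracy barely above 50% still aggregates well with scale.
print(majority_accuracy(0.55, 1))
print(majority_accuracy(0.55, 101))
```

The caveat in the comment still bites, though: if user errors are correlated (everyone fooled by the same confusing UI), the independence assumption fails and the aggregate stops improving.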

