
> But what was the fire inside you, when you coded till night to see your project working? It was building.

I feel like this is not the same for everyone. For some people, the "fire" is literally about "I control a computer", for others "I'm solving a problem for others", and yet for others "I made something that made others smile/cry/feel emotions" and so on.

I think there is a section of programmers who actually do like the actual typing of letters, numbers and special characters into a computer, and for them, I understand LLMs remove the fun part. For me, I initially got into programming because I wanted to ruin other people's websites, then I figured out I needed to know how to build websites first, then I found it more fun to create and share what I've done with others and have them tell me what they think of it. That's my "fire". But I've met so many people who don't care an iota about sharing what they built with others; it matters nothing to them.

I guess the conclusion is: not all programmers program for the same reason. For some of us, LLMs help a lot and make things even more fun; for others, LLMs remove the core part of what makes programming fun. Hence we get this constant back and forth of "Can't believe others can work like this!" vs "I can't believe others aren't working like this!", but both sides seem to completely miss the other side.





You’re right of course. For me there’s no flow state possible with LLM “coding”. That makes it feel miserable instead of joyous. Sitting around waiting while it spits out tokens that I then have to carefully look over and tweak feels like very hard work. Compared to entering flow and churning out those tokens myself, which feels effortless once I get going.

Probably other people feel differently.


I'm the same way. LLMs are still somewhat useful as a way to start a greenfield project, or as a very hyper-custom Google search to have it explain something to me exactly how I'd like it explained, or generate examples hyper-tuned for the problem at hand, but that's hardly as transformative or revolutionary as everyone is making Claude Code out to be. I loathe the tone these things take with me and hate how much extra bullshit I didn't ask for that they always add to the output.

When I do have it one-shot a complete problem, I never copy paste from it. I type it all out myself. I didn't pay hundreds of dollars for a mechanical keyboard, tuned to make every keypress a joy, to push code around with a fucking mouse.


I’m an “LLM believer” in a sense, and not someone who derives joy from actually typing out the tokens in my code, but I also agree with you about the hype surrounding Claude Code and “agentic” systems in general. I have found the three positive use cases you mentioned to be transformative to my workflow on their own. I’m grateful that they exist even if they never get better than they are today.

Having worked with a greenfield project that has a significant amount of LLM output in it, I’m not sure if I agree. There are all sorts of weird patterns, insufficient permission checking, weird tests that don’t actually test things, etc. It’s like building a house on sand.

I’ve used Claude to create copies of my tests, except instead of testing X feature, it tests Y feature. That has worked reasonably well, except that it has still copied tests from somewhere else too. But the general vibe I get is that it’s better at copying shit than creating it from scratch.


That's why I never have it generate any significant amount of code for those. I get juuuuuust enough to start understanding the structure and procedure and which parts of the API docs I should even be looking at for the problem at hand, and start from there on my own. I need a lay of the land, not an entire architecture. I build that part.

This is where we as software engineers need to be on the ball - just because an LLM wrote it doesn't mean it's right, doesn't mean we can let go of all the checks and balances and best practices we've developed over decades.

Set up tooling like tests and linters and the like. Set rules. Mandate code reviews. I've been using LLMs to write tests and frequently catch it writing tests that don't actually have any valuable assertions. It only takes a minute to fix these.
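To make that concrete, the pattern I keep catching looks something like this (an illustrative Jest-style sketch; createUser is a made-up stand-in, not code from a real project):

  import { test, expect } from '@jest/globals';

  // Hypothetical function under test, just for the sketch.
  declare function createUser(input: { name: string }): Promise<{ id: string; name: string }>;

  // LLM-generated version: it exercises the code but asserts nothing useful.
  test('creates a user', async () => {
    const user = await createUser({ name: 'Ada' });
    expect(user).toBeDefined(); // passes for literally any non-undefined return value
  });

  // The one-minute fix: assert on the behaviour you actually care about.
  test('creates a user with the given name and a fresh id', async () => {
    const user = await createUser({ name: 'Ada' });
    expect(user.name).toBe('Ada');
    expect(user.id).toEqual(expect.any(String));
  });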


> Set up tooling like tests and linters and the like. Set rules. Mandate code reviews. I've been using LLMs to write tests and frequently catch it writing tests that don't actually have any valuable assertions. It only takes a minute to fix these.

You can do all that, but it still remains a case of "I'm only interested in the final result".

When I read LLM generated systems (not single functions), it looks very ... alien to me. Even juniors don't put together systems that have this uncanny valley feel to it.

I suppose the best way to describe it would be to say that everything lacks coherency, and if you are one of these logical-minded people who likes things that make sense, it's not fun wading through a field of Chesterton's Fences as your full-time job.


I noticed this too. Everything lacks consistency, wrapped in headings that are designed to make me feel good. It’s uncomfortable reading one thing that seems so right followed by code that feels wrong, while my usual instincts about why it's wrong help less because of how half-right it is.

(But still, LLMs have helped me investigate and write code that is beyond me)


> (But still, LLMs have helped me investigate and write code that is beyond me)

They haven't done that yet[1], but they have sped things up via rubber-ducking, and for documentation (OpenSSL's documentation is very complete, very thorough, but also completely opaque).

------------------------------------

[1] I have a project in the far future where they will help me do that, though. It all depends on whether I can get additional financial security so I can dedicate some time to a new project :-(


> I didn't pay hundreds of dollars for a mechanical keyboard, tuned to make every keypress a joy, to push code around with a fucking mouse

Can’t you use vim controls?


> and hate how much extra bullshit I didn't ask for that they always add to the output.

For that problem, I can recommend making the "jumps" smaller, e.g. "Add a React component for the profile section, just put a placeholder for now" instead of "add a user profile".

With coding LLMs, doing that gives you a bit of a hidden "zoom" control, which can help calibrate the speed/involvement/thinking split between you and the LLM.
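Concretely, the "small jump" version of that prompt only has to produce something on this scale (a hypothetical ProfileSection placeholder, just to show how little is at stake per step):

  // Placeholder only: no data fetching, no wiring, just a mount point to build on.
  export function ProfileSection() {
    return (
      <section aria-label="profile">
        <h2>Profile</h2>
        <p>Profile details coming soon.</p>
      </section>
    );
  }

Reviewing a step that size takes seconds, which is what keeps the "zoom" usable.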


Three things I can suggest trying, having struggled with something similar:

1. Look at it as a completely different discipline; don't consider it leverage for coding - it's its own thing.

2. Try using it on something you just want to exist, not something you want to build or are interested in understanding.

3. Make the "jumps" smaller. Don't one-shot the project. Do the thinking yourself, and treat it as a junior programmer: "Let's now add React components for the profile section and mount them. Don't wire them up yet" instead of "Build the profile section". This also helps you find the right speed so that you can keep up with what's happening in the codebase.


> Try using it on something you just want to exist, not something you want to build or are interested in understanding.

I don't get any enjoyment from "building something without understanding" — what would I learn from such a thing? How could I trust it to be secure or to not fall over when I enter a weird character? How can I trust something I do not understand or have not read the foundations of? Furthermore, why would I consider myself to have built it?

When I enter a building, I know that an engineer with a degree, or even a team of them, have meticulously built this building taking into account the material stresses of the ground, the fault lines, the stresses of the materials of construction, the wear amounts, etc.

When I make a program, I do the same thing. Either I make something for understanding, OR I make something robust to be used. I want to trust the software I'm using to not contain weird bugs that are difficult to find, as best as I can ensure that. I want to ensure that the code is clean, because code is communication, and communication is an art form — so my code should be clean, readable, and communicative about the concepts that I use to build the thing. LLMs do not assure me of any of this, and they actively hamstring the communication aspect.

Finally, as someone surrounded by artists, who has made art herself, the "doing of it" has been drilled into me as the "making". I wouldn't get the enjoyment of making something, because I wouldn't have made it! You can commission a painting from an artist, but it is hubris to point at a painting you bought or commissioned and go "I made that". But somehow it is acceptable to do this for LLMs. That is a baffling mindset to me!


>I don't get any enjoyment from "building something without understanding" — what would I learn from such a thing? How could I trust it to be secure or to not fall over when I enter a weird character? How can I trust something I do not understand or have not read the foundations of? Furthermore, why would I consider myself to have built it?

All of these questions are irrelevant if the objective is 'get this thing working'.


You seem to read a lot into what I wrote, so let me phrase it differently.

These are ways I'd suggest to approach working with LLMs if you enjoy building software, and are trying to find out how it can fit into your workflow.

If this isn't you, these suggestions probably won't work.

> I don't get any enjoyment from "building something without understanding".

That's not what I said. It's about your primary goal: are you trying to learn technology xyz and found a project so you can apply it, or do you want a solution to your problem that doesn't exist yet, so you're building it?

What's really important is that whether you understand in the end what the LLM has written or not is 100% your decision.

You can be fully hands off, or you can be involved in every step.


> You can commission a painting from an artist, but it is hubris to point at a painting you bought or commissioned and go "I made that". But somehow it is acceptable to do this for LLMs. That is a baffling mindset to me!

The majority of the work on a lot of famous masterpieces of art was done by apprentices. Under the instruction of a master, but still. No different than someone coming up with a composition, and having AI do a first pass, then going in with photoshop and manually painting over the inadequate parts. Yet people will knob gobble renaissance artists and talk about lynching AI artists.


I've heard this analogy regurgitated multiple times now, and I wish people would not.

It's true that many master artists had workshops with apprenticeships. Because they were a trade.

By the time you were helping to paint portraits, you'd spent maybe a decade learning techniques and skill and doing the unimportant parts and working your way up from there.

It wasn't a half-assed "slop some paint around and let the master come fix it later" operation. The people doing things like portrait work or copies of works were highly skilled and experienced.

Typing "an army of Garfields storming the beach at Normandy" into a website is not the same.


That's a straw man and you know it.

Anti-AI art folks don't care if you photobashed bits of AI composition and then totally painted over it in your own hand; the fact that AI was involved makes it dirty, evil, nasty, sinful and bad. Full stop. Anti-AI writing folks don't care if every word in a manuscript was human-written; if you asked AI a question while writing it, suddenly you're darth fucking vader.

The correct comparison for some jackass who just prompts something, then runs around calling it art is to a pre-schooler that scribbles blobs of indistinct color on a page, then calls it art. Compare apples to apples.


That's not what a strawman is lol. Me saying the analogy sucks is just criticism.

If you feel judged about using AI, then your choices are (1) don't use it or (2) don't tell people you use it or (3) stop caring what other people think.

Have the courage of your own convictions and own your own actions.


Lately I've been interested in biosignals, biofeedback and biosynchronization.

I've been really frustrated with the state of Heart Rate Variability (HRV) research and HRV apps, particularly those that claim to be "biofeedback" but are really just guided breathing exercises made by people who seem to have the lights on but nobody home. [1]

I could have spent a lot of time reading the docs to understand the Web Bluetooth API, while facing up to the stress that getting anything Bluetooth working with a PC is super hit and miss; estimating the time, I'd have expected a high risk of spending hours rebooting my computer and otherwise futzing around to debug connection problems.

Although it's supposedly really easy to do this with the Web Bluetooth API, I amazingly couldn't find any examples, which made me all the more apprehensive that there was some reason it doesn't work. [2]

As it was Junie coded me a simple webapp that pulled R-R intervals from my Polar H10 heart rate monitor in 20 minutes and it worked the first time. And in a few days, I've already got an HRV demo app that is superior to the commercial ones in numerous ways... And I understand how it works 100%.
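For anyone curious, the core of it turns out to be pretty small. This is a hand-simplified sketch of the Web Bluetooth part (my own rough reconstruction, not Junie's output): you subscribe to the standard heart_rate service and pull the R-R intervals out of each notification according to the GATT Heart Rate Measurement layout.

  // Sketch: stream R-R intervals from a BLE heart rate strap via Web Bluetooth.
  async function streamRrIntervals(onRr: (rrMs: number) => void) {
    const device = await navigator.bluetooth.requestDevice({
      filters: [{ services: ['heart_rate'] }],
    });
    const server = await device.gatt!.connect();
    const service = await server.getPrimaryService('heart_rate');
    const ch = await service.getCharacteristic('heart_rate_measurement');

    ch.addEventListener('characteristicvaluechanged', (event) => {
      const dv = (event.target as BluetoothRemoteGATTCharacteristic).value!;
      const flags = dv.getUint8(0);
      // Skip flags, the 8- or 16-bit heart rate, and the optional energy expended field.
      let offset = 1 + (flags & 0x01 ? 2 : 1) + (flags & 0x08 ? 2 : 0);
      if (flags & 0x10) { // R-R intervals present
        for (; offset + 1 < dv.byteLength; offset += 2) {
          onRr((dv.getUint16(offset, true) / 1024) * 1000); // units of 1/1024 s -> ms
        }
      }
    });
    await ch.startNotifications();
  }

The H10 exposes the standard Heart Rate service, so nothing Polar-specific is needed just to get the R-R stream.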

I wouldn't call it vibe coding because I had my feet on the ground the whole time.

[1] For instance, I am used to doing meditation practices with my eyes closed and not holding a freakin' phone in my hand. Why they expect me to look at a phone to pace my breathing, when it could talk to me or beep at me, is beyond me. For that matter, why they try to estimate respiration by looking at my face, when they could get it off the accelerometer if I put the phone on my chest while lying down, is also beyond me.

[2] let's see, people don't think anything is meaningful if it doesn't involve an app, nobody's gotten a grant to do biofeedback research since 1979 so the last grad student to take a class on the subject is retiring right about now...


>When I enter a building, I know that an engineer with a degree, or even a team of them, have meticulously built this building taking into account the material stresses of the ground, the fault lines, the stresses of the materials of construction, the wear amounts, etc.

You can bet that "AI" is coming for this too. The lawsuits that will result when buildings crumble and kill people because an LLM "hallucinated" will be tragic, but maybe we'll learn from it. But we probably won't.


Have you heard of the Horizon IT Post Office Scandal[0]?

> Between 1999 and 2015, more than 900 subpostmasters were wrongfully convicted of theft, fraud and false accounting based on faulty Horizon data, with about 700 of these prosecutions carried out by the Post Office. Other subpostmasters were prosecuted but not convicted, forced to cover illusory shortfalls caused by Horizon with their own money, or had their contracts terminated.

> Although many subpostmasters had reported problems with the new software, and Fujitsu was aware that Horizon contained software bugs as early as 1999, the Post Office insisted that Horizon was robust and failed to disclose knowledge of the faults in the system during criminal and civil cases.

(content warning: the article discusses suicide)

Now think of places where LLMs are being deployed:

- accountancy[1][2]

- management systems similar to Horizon IT

- medical workers using it to pass their coursework (A friend of mine is doing a nursing degree in the USA and they are encouraged to use Gemini, and she's already seen someone on the same course use it to complete their medical ethics homework...)

- Ordinary people checking drug interactions[3], learning about pickling (and almost getting botulism), talking to LLMs and getting poisoned by bromide[4]

[0]: https://en.wikipedia.org/wiki/British_Post_Office_scandal

[1]: https://www.leapfin.com/luca-ai

[2]: https://www.autoentry.com/integrations/sage

[3]: https://www.tumblr.com/pangur-and-grim/805013689696747520?so...

[4]: https://www.livescience.com/health/food-diet/man-sought-diet...


I build a lot of custom tools, things with like a couple of users. I get a lot of personal satisfaction writing that code.

I think comments on YouTube like "anyone still here in $CURRENT_YEAR" are low-effort noise. I don't care about learning how to write a web extension (web work is my day job), so I got Claude to write one for me. I don't care who wrote it, I just wanted it to exist.


I think the key thing here is in point 2.

I’ve wanted a good markdown editor with automatic synchronization. I used to use Inkdrop, which I stopped using when the developer/owner raised the price to $120/year.

In a couple hours with Claude code, I built a replacement that does everything I want, exactly the way I want. Plus, it integrates native AI chat to create/manage/refine notes and ideas, and it plugs into a knowledge RAG system that I also built using Claude code.

What more could I ask for? This is a tool I wanted for a long time but never wanted to spend the dozens of hours dealing with the various pieces of tech I simply don’t care about long-term.

This was my AI “enlightenment” moment.


Really interesting. How do you find the quality of the code and the final result to be? Do you maybe have this public? I'd love to check it out!

The incredible thing (to me) is that this isn’t even remotely a new thing: it’s reviewing pull requests vs writing your own code. We all know how different that feels!

For me it feels like print statement debugging in a compiled language

Correct, provided you were the one who wrote an incredibly specific feature request that the pull request solved for you.

This.

To me, using an LLM is more like having a team of ghostwriters writing your novel. Sure, you "built" your novel, but it feels entirely different to writing it yourself.


Wouldn't it be like having a team of software developers writing your code? The analogy doesn't need to be even as far as a different line of work. And for some this (writing to managing) is a natural career progression.

And if you write novels mostly because you enjoy watching them sell, as opposed to sharing ideas with people, you don't care.

To scientists, the purpose of science is to learn more about the world; to certain others, it's about making a number of dollars go up. Mathematicians famously enjoy creating math, and would have no use for a "create more math" button. Musicians enjoy creating music, which is very different from listening to it.

We're all drawn to different vocations, and it's perverse to accept that "maximize shareholder value" is the highest.


I have both; for embedded and backend I prefer entering code: once in the flow, I produce results faster and feel more confident everything is correct. For frontend (except games), I find doing everything manually annoying and a waste of time, as do all my colleagues. LLMs really made this excellent for our team and myself. I like doing UX, but I like drawing it with pen and paper and then doing experiments with controls/components until it works. This is now all super fast (I can usually just take a photo of my drawings and Claude makes it work) and we get excellent end results that clients love.

> For me there’s no flow state possible with LLM “coding”.

I would argue that it's the same question as whether it's possible to get into a flow state when being the "navigator" in a pair-programming session. I feel you and agree that it's not quite the same flow state as typing the code yourself, but when a session with a human programmer or Claude Code is going well for me, I am definitely in something quite close to flow myself, and I can spend hours in the back and forth. But as others in this thread said, it's about the size of the tasks you give it.


I can say I feel that flow state sometimes when it all works but I certainly don't when it doesn't work.

The other day I was making changes to some CSS that I partially understood.

Without an LLM I would have looked at the 50+ CSS spec documents and the 97% wrong answers on Stack Overflow and all the splogs, and would have bumbled around and tried a lot of things and gotten it to work in the end, and not really understood why, and experienced a lot of stress.

As it was I had a conversation with Junie about "I observe ... why does it work this way?", "Should I do A or do B?", "What if I did C?" and came to understand the situation 100% and wrote a few lines of code by hand that did the right thing. After that I could have switched it to Code mode and said "Make it so!" but it was easy when I understood it. And the experience was not stressful at all.


I could imagine a world where LLM coding was fun. It would sound like "imagine a game, like Galaxians but using tractor trailers, and as a first person shooter." And it pumps out a draft and you say, "No, let's try it again with an army of bagpipers."

In other words, getting to be the "ideas guy", but without sounding like a dipstick who can't do anything.

I don't think we're anywhere near that point yet. Instead we're at the same point where we are with self-driving: not doing anything but on constant alert.


Prompt one:

  imagine a game, like Galaxians but using tractor trailers,
  and as a first person shooter. Three.js in index.html
Result: https://gisthost.github.io/?771686585ef1c7299451d673543fbd5d

Prompt two:

  No, let's try it again with an army of bagpipers.
Result: https://gisthost.github.io/?60e18b32de6474fe192171bdef3e1d91

I'll be honest, the bagpiper 3D models were way better than I expected! That game's a bit too hard though, you have to run sideways pretty quickly to avoid being destroyed by incoming fire.

Here's the full transcript: https://gisthost.github.io/?73536b35206a1927f1df95b44f315d4c


That is deeply impressive. The code is quite readable. I appreciate that it even got the "joke" with bagpipers. I know it's just recycling other people's jokes, but it's still quite the feat.

I have never used one of these. I'm going to have to try it.


There's a reason why bagpipes are banned under the Geneva convention!

https://youtube.com/shorts/4RgVrYYBgwY

Tangential:

(As a FW-curious noob I wondered if Gemini understand, Why do foxes struggle, compared to wolves of all gender)

>Yes, many fox species, particularly the red fox, exhibit more neotenous (juvenile-like) traits compared to wolves, such as shorter muzzles, larger eyes relative to head size, and different skull development, reflecting a divergence in evolutionary paths within the canid family, with foxes often retaining softer, more generalized features compared to the larger, more specialized wolf. While wolves are highly SOCIAL pack animals with traits adapted for cooperative hunting, foxes are generally SOLITARY, and this difference in lifestyle and morphology highlights their distinct evolutionary strategies, with foxes leaning towards juvenile-like features in their adult forms.

Hows foxwork doin


Wolves work together to bring down large prey and with their large digestive tracts they "wolf it down".

Foxes have a small digestive tract (makes them lightweight so they can jump and pounce on prey) and can't even eat a whole rabbit so they eat a bit and bury the rest under a layer of dirt (to hide it from other animals) and leaf litter (to hide it from birds.) A fox will lose an occasional cache to another fox but will occasionally find a cache from another fox so it evens out. Foxes in a given territory usually have some family relationship so it works from a sociobiological level, it's their form of "social hunting".

For me this week it's been about practicing autonomic control, I've been building biofeedback systems and getting to the bottom of heart rate variability and working towards a biosynchronization demo. Also working to start an anime theme song cover band (Absolute Territory) where I am clearly the "kitsune" (AT-00) but more of a band manager than a mascot. I've got the all-purpose guitarist (AT-01) but I'm still casting AT-02, AT-03 and such.

... and boy do I have a technique now to find out which people and places are identity-driven and which are not.


A request for AT: https://www.youtube.com/watch?v=pbJrFALSVZ8 (did UY's "Rock the Planet" make any shout-outs to it? https://www.youtube.com/watch?v=96mZc82fYGc )

E: Score! TIL about the UY remake, eg https://www.youtube.com/watch?v=pEVhv4eB8Q8


Places? As in workplaces? Or servers

Places.

One interesting discovery is that people are really dour about it at places that hire a lot of enbys [1], but overall enby people who are by themselves, or working at places where 10% or fewer people are enby, really dig somebody who represents an "oceanic reservoir of calm" [2] with a kidult presentation of self.

[1] "non-binary"

[2] fox old enough to have earned nine tails


So these 10% enby presumably tend to be "silent enbys"?

Hmm. Gotta investigate whether HN (the HQ) has (or aims for) the magic ~10% enby. (I had suspected as such but you again voiced what was in my subconscious!!)

https://www.cs.utep.edu/vladik/2019/tr19-95.pdf

I'm guessing there's a parallel observation (or technique[0]) for people..

("[0]Places they have seen, people they have done")


Ha putting the zettai ryoiki back into Eva I see

Early request :) https://news.ycombinator.com/item?id=46606671

E: On the offchance of further participation, JD Vance (eg) has the look of a fox masquerading as a wolf :)

https://news.ycombinator.com/item?id=46611549

(Mamdani going for the same look, but it's just more convincing somehow)


> There's a reason why bagpipes are banned under the Geneva convention!

I know this is not Reddit, but when I see such a comment, I can't resist posting a video of "the internet's favorite song" on an electric violin and bagpipes:

> Through the Fire and Flames (Official Video) - Mia x Ally

> https://www.youtube.com/watch?v=KVOBpboqCgQ


Can it make it work on mobile?

Yes, but I didn't bother here (not part of the original prompt).

You're welcome to drop the HTML into a coding agent and tell it to do that. In my experience you usually have to decide how you want that to work - I've had them build me on-screen D-Pad controls before but I've also tried things like getting touch-to-swipe plus an on-screen fire button.



YOUR EARS HAVE SURRENDERED lmao

For me the excitement is palpable when I've asked it to write a feature, then I go test it and it entirely works as expected. It's so cool.

There are multiple self driving car companies that are fully autonomous and operating in several cities in the US and China. Waymo has been operating for many years.

There are full self driving systems that have been in operation with human driver oversight from multiple companies.

And the capabilities of the LLMs in regards to your specific examples were demonstrated below.

The inability of the public to perceive or accept the actual state of technology due to bias or cognitive issues is holding back society.


It's a lot of mistrust and fear, too - a computer could never be as good at driving as a person!

And yet, over the years many things have just been accepted. Satnav for example, I grew up with my mom having the map in her lap, or my dad writing down directions. Later on we had a route planner on diskettes (I think) and a printout of the route. And my dad now has had a satnav in his car for near enough two decades. I'm sure they like everyone else ran into the quirks of satnav, but I don't think there was nearly as much "fear" and doubt for satnav as there is for self-driving cars and nowadays LLMs / coding agents. Or I'm misremembering it and have rose-tinted glasses, I also remember the brouhaha of people driving into canals because the satnav told them to turn left.


You could try Cerebras. It's still vastly vastly vastly cheaper than what many people and all companies pay for Opus. And it's absurdly ridiculously stupendously fast. And GLM-4.7 is quite capable! https://www.cerebras.ai/blog/glm-4-7 https://news.ycombinator.com/item?id=46544047

You can definitely keep tweaking. It's also helpful just to ask it about what your possible concerns are and it will tell you and explain what it did.

I spent a good chunk of 2025 being super super careful & specific, using mostly the very very cheap DeepSeek and just leading it by the leash at every moment and studying the output. It still felt like a huge win. But with more recent models, I trust that they are doing OK, and I'm better at asking some questions once the code is written to hone my understanding. And mostly I just trust it now! I don't have to look carefully and tweak to exacting standards, because I've seen it do a ton of good work & am careful in what I ask.

There are other tactics that help. Rather than stare carefully at the code, make sure you and the AI are both running the program frequently, and have a rig to test what's under development (ideally in an integration-test kind of way, which it can help set up!). And then have what good programmers have long had: good observability tools at their back, be that great logging or ideally sweet tracing. We have much better tools to see the high-level behavior of systems now, and AI with some prompting can be extremely good at helping enhance that view.

It is going to feel different. But there's a lot you can do to get much better loops.


You're not alone. I definitely feel like this is / will be a major adaptation required for software engineers going forward. I don't have any solutions to offer you - but I will say that the flow enabled by fast feedback loops wasn't always a given. For most of my career, build times were much, much longer than they are today, as an example. We had to work around that to maintain flow, and we'll have to work around this, now.

I feel the same way often but I find it to be very similar to coding. Whether coding or prompting when I’m doing rote, boring work I find it tedious. When I am solving a hard problem or designing something interesting I am engaged.

My app is fairly mature with well-established patterns, etc. When I’m adding “just CRUD” as part of a feature it’s very tedious to prompt agents, review code, rinse & repeat. Were I actually writing the code by hand I would probably be less productive and just as bored/unsatisfied.

I spent a decent amount of time today designing a very robust bulk upload API (compliance fintech, lots of considerations to be had) for customers who can’t do a batch job. When it was finished I was very pleased with the result and had performance tests and everything.


I gotta say, the "sitting around waiting" comment hits - I have the same with current-day merge request based development, a lot of time is fragmented because I'm waiting for the CI to finish. I've got seven open merge requests at the moment, some of which have been open since before the holidays. It's a lot of fragmented waiting, fixing, prodding people to review code, and shitposting on HN to pass the time. It's uh. Not healthy.

But this is my reality in my current line of work, a lot of relatively simple work but a lot of processes and checks to conform to rules (that I set myself lol) and not break existing functionality.


> no flow state possible with LLM “coding”

I've hit flow state setting it up to fly. When it flies is when the human gets out of the loop, so the AI can look at the thing itself and figure out why centering the div isn't working to center the div, or why the kernel isn't booting. Like, getting to a point, pre-AI, where git bisect running in a loop is the flow state. Now, with AI, setting that up is the flow.


I feel differently! My background isn't programming, so I frequently feel inhibited by coding. I've used it for over a decade but always as a secondary tool. It's fun for me to have a line of reasoning, and to be able to toy with and analyze a series of questions faster than I used to be able to.

Ditto. Coding isn't what I specifically do, but it's something I will choose to do when it's the most efficient solution to a problem. I have no problem describing what I need a program to do and how it should do so in a way that could be understandable even to a small child or clever golden retriever, but I'm not so great at the part where you pull out that syntactic sugar and get to turning people words into computer words. LLMs tend to do a pretty good job at translating languages, regardless of whether I'm talking to a person or using a code editor, but I don't want them deciding what I wanted to say for me.

Why do you feel you need to "carefully look over and tweak" stuff?

Can you define code quality and the goal of the program in a deterministic way?

If it quacks like a duck, walks like a duck and is a duck, does it matter if it's actually a raven inside?


Yes, if your goal is to build a duck, and to understand what goes into building a duck. A lot of people derive joy from learning how to do something, not merely seeing the end result.

Depends if you're an artisan or a craftsman.

Do you want to make one beautiful, intricate table that will last for ages? Or do you need a table ASAP because you have guests coming and your end table can barely fit a pint and a bag of chips?

It's perfectly OK to want to craft something beautiful and understand every single line of code deeply. But it also takes more time than just solving the problem with sufficient quality.


I like writing. I hate editing.

Coding with an LLM seems like it’s often more editing in service of less writing.

I get that this is a very simplistic way of looking at it, and when done right it can produce solutions, even novel solutions, that maybe you wouldn’t have on your own. Or maybe it speeds up a part of the writing that is otherwise slow and painful. But I don’t know; as somebody who doesn’t really code, every time I hear people talk about it, that’s what it sounds like to me.


Yes this is exactly what I feel. I disconnect enough that if it’s really taking its time I will pull up Reddit and now that single prompt cost me half an hour.


>AI does coding for you, so now you have more downtime

>Instead of using downtime to read, draw, disconnect, uses AI to build extension to keep you addicted to scrolling social media while AI works

What a dumb fucking world we live in.


Well, are you the super developer that never runs into issues or challenges? For me, and I think for most developers, coding is a continuous stream of problems you need to solve. For me an LLM is very useful, because I can now develop much faster. I don't have to think about which sorting algorithm should be used or which trigonometric function I need for a specific case. My LLM buddy solves most of those issues.

When you don't know the answer to a question you ask an LLM, do you verify it or do you trust it?

Like, if it tells you merge sort is better on that particular problem, do you trust it or do you go through an analysis to confirm it really is?

I have a hard time trusting what I don't understand. And even more so if I realize later I've been fooled. Note that it's the same with humans, though. I think I only trust a technical decision I don't understand when I deem the risk of being wrong low enough. Otherwise I'll invest in learning and understanding enough to trust the answer.


For all these "open questions" you might have, it is better to ask the LLM to write a benchmark and actually see the numbers. Why rush? Spend 10 minutes and you will have a decision backed by some real feedback from code execution.
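The benchmark doesn't need to be fancy; for the sort question above it can be as small as this (a sketch only, with a hypothetical mergeSort standing in for whatever the LLM suggested):

  // Crude timing harness: compare a candidate sort against the built-in sort
  // on data shaped like the real workload, and let the numbers decide.
  declare function mergeSort(xs: number[]): number[]; // hypothetical LLM-suggested implementation

  function bench(label: string, fn: () => void, runs = 20): void {
    const start = performance.now();
    for (let i = 0; i < runs; i++) fn();
    console.log(`${label}: ${((performance.now() - start) / runs).toFixed(2)} ms/run`);
  }

  const data = Array.from({ length: 100_000 }, () => Math.random());

  bench('built-in sort', () => { [...data].sort((a, b) => a - b); });
  bench('mergeSort', () => { mergeSort([...data]); });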

But this is just a small part of a much grander testing activity that needs to wrap the LLM's code. I think my main job has moved to 1. architecting and 2. ensuring the tests are well done.

What you don't test is not reliable yet. Looking at code is not testing, it's "vibe-testing", and should be an antipattern: no LGTM for AI code. We should not rely on our intuition alone, because it is not strict enough and it makes everything slow; we should not "walk the motorcycle".


Ok. I also have the intuition that more tests and formal specifications can help there.

So far, my biggest issue is that when the code produced is incorrect, with a subtle bug, I just feel I have wasted time prompting for something I should have written myself, because now I have to understand it deeply to debug it.

If the test infrastructure is sound, then maybe there is a gain after all even if the code is wrong.


> I have a hard time trusting what I don't understand

Who doesn't? But we have to trust them anyway, otherwise everyone would need a PhD in everything.

Also, people who "have a hard time trusting" might just give up when encountering things they don't understand. With AI at least there is a path for them to keep digging deeper and actually verify things to whatever level of satisfaction they want.


Sure, but then I rely on an actual expert.

My issue is, LLMs have fooled me more than a couple of times with stupid but difficult-to-notice bugs. At that point, I have a hard time trusting them (but I keep trying with some stuff).

If I asked someone for something and found out several times that the individual is failing, then I'd just stop working with them.

Edit: and to avoid anthropomorphizing LLMs too much: the moment I notice a tool I use has bugs to the point of losing data, for example, I reconsider real hard whether to use it again or not.


Often those kinds of performance things just don't matter.

Like right now, I am working on algorithms for computing heart rate variability and only looking at a 2-minute window with maybe 300 data points at most, so whether it is N or N log N or N^2 is beside the point.
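(For scale: with N = 300, N log2 N is roughly 2,500 operations and N^2 is 90,000; even the quadratic version finishes in well under a millisecond on any modern machine.)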

When I know I'm computing the right thing for my application, know I've coded it up correctly, and am feeling some pain about performance, that's another story.


I tell it to write a benchmark, and I learn from how it does that.

IME I don't learn by reading or watching, only by wrestling with a problem. ATM, I will only do that if the problem does not feel worth learning about (like Jenkinsfiles or Gradle scripting).

But yes, the bench result will tell something true.


See, I’m with you, but in my day-to-day work I could almost never get into a flow state while coding, because very little of my work involves creating things or solving real problems; it typically involves just trying to mentally untangle huge rat's nests, Jenga-ing bug fixes and the occasional feature in, and then spending a bunch of time testing to make sure I didn’t break anything, no flow involved. I’ve been grudgingly using Cursor heavily for the past few weeks and it’s been helping make all of this significantly more bearable.

LLMs aren’t replacing the joy of coding for me, but they do seem to be helping me deal with the misery of being a professional coder.


> I think there is a section of programmers who actually do like the actual typing of letters, numbers and special characters into a computer...

Reminds me of this excerpt from Richard Hamming's book:

> Finally, a more complete, and more useful, Symbolic Assembly Program (SAP) was devised—after more years than you are apt to believe during which most programmers continued their heroic absolute binary programming. At the time SAP first appeared I would guess about 1% of the older programmers were interested in it—using SAP was “sissy stuff”, and a real programmer would not stoop to wasting machine capacity to do the assembly. Yes! Programmers wanted no part of it, though when pressed they had to admit their old methods used more machine time in locating and fixing up errors than the SAP program ever used. One of the main complaints was when using a symbolic system you do not know where anything was in storage—though in the early days we supplied a mapping of symbolic to actual storage, and believe it or not they later lovingly pored over such sheets rather than realize they did not need to know that information if they stuck to operating within the system—no! When correcting errors they preferred to do it in absolute binary addresses.


I think this is beside the point, because the crucial change with LLMs is that you don’t use a formal language anymore to specify what you want, and get a deterministic output from that. You can’t reason with precision anymore about how what you specify maps to the result. That is the modal shift that removes the “fun” for a substantial portion of the developer workforce.

It's not just about fun. When I'm going through the actual process of writing a function, I think about design issues: about how things are named, about how the errors from this function flow up, about how scheduling is happening, about how memory is managed. I compare the code to my ideal, and this is the time when I realize that my ideal is flawed or incomplete.

I think a lot of us don't get everything specced out up front; we see how things fit, and adjust accordingly. Most of the really good ideas I've had were not formulated in the abstract, but were realizations had in the process of spelling things out.

I have a process, and it works for me. Different people certainly have other ones, and other goals. But maybe stop telling me that instead of interacting with the compiler directly, it's absolutely necessary that I describe what I want to a well-meaning idiot and patiently correct them, even though they are going to forget everything I just said in a moment.


> ... stop telling me that instead of interacting with the compiler directly, it's absolutely necessary that I describe what I want to a well-meaning idiot and patiently correct them, even though they are going to forget everything I just said in a moment.

This perfectly describes the main problem I have with the coding agents. We are told we should move from explicit control and writing instructions for the machine to pulling the slot lever over and over and "persuading the machine" hoping for the right result.


That's not it for me, personally.

I do all of my programming on paper, so keystrokes and formal languages are the fast part. LLMs are just too slow.


I'd be interested in learning more about your workflow. I've certainly used plaintext files (and similar such things) to aid in project planning, but I've never used paper beyond taking a few notes here and there.

Not who you’re replying to, but I do this as well. I carry a pocket notebook and write paragraphs describing what I want to write. Sometimes I list out the fields of a data structure. Then I revise. By the time I actually write the code, it’s more like a recitation. This is so much easier than trying to think hard about program structure while logged in to my work computer with all the messaging and email.

Yes this is my technique as well.

Others have different prerogatives, but I personally do not want to work more than I need to.


> because the crucial change with LLMs is that you don’t use a formal language anymore to specify what you want, and get a deterministic output from that

You don't just code, you also test, and your safety is just as good as your test coverage and depth. Think hard about how to express your code to make it more testable. That is the single way we have now to get back some safety.

But I argue that manual inspection of the code, and thinking it through in your head, is still not strict checking; it is vibe-testing as well. Only code backed by tests is not vibe-based. If needed, use TLA+ (generated by the LLM), or go as deep as necessary to test.


I don't know what book you're talking about, but it seems that you intend to compare the switch to an AI-based workflow to using a higher-level language. I don't think that's valid at all. Nobody using Python for any ordinary purpose feels compelled to examine the resulting bytecode, for example, but a responsible programmer needs to keep tabs on what Claude comes up with, configure a dev environment that organizes the changes into a separate branch (as if Claude were a separate human member of a team) etc. Communication in natural language is fundamentally different from writing code; if it weren't, we'd be in a world with far more abundant documentation. (After all, that should be easier to write than a prompt, since you already have seen the system that the text will describe.)

> Nobody using Python for any ordinary purpose feels compelled to examine the resulting bytecode, for example,

The first people using higher level languages did feel compelled to. That's what the quote from the book is saying. The first HLL users felt compelled to check the output just like the first LLM users.


Yes, and now they don't.

But there is no reason to suppose that responsible SWEs would ever be able to stop doing so for an LLM, given the reliance on nondeterminism and a fundamentally imprecise communication mechanism.

That's the point. It's not the same kind of shift at all.


Hamming was talking about assembler, not a high level language.

Assembly was a "high level" language when it was new -- it was far more abstract than entering in raw bytes. C was considered high level later on too, even though these days it is seen as "low level" -- everything is relative to what else is out there.

The same pattern held through the early days of "high level" languages that were compiled to assembly, and then the early days of higher level languages that were interpreted.

I think it's a very apt comparison.


If the same pattern held, then it ought to be easy to find quotes to prove it. Other than the one above from Hamming, we've been shown none.

Read the famous "Story of Mel" [1] about Mel Kaye, who refused to use optimizing assemblers in the late 1950s because "you never know where they are going to put things". Even in the 1980s you used to find people like that.

[1] https://en.wikipedia.org/wiki/The_Story_of_Mel


The Story of Mel counts against the narrative because Mel was so overwhelmingly skilled that he was easily able to outdo the optimizing compiler.

I don't think that does count against the narrative? The narrative is just that each time we've moved up the abstraction chain in generating code, there have been people who have been skeptical of the new level of abstraction. I would say that it's usually the case that highly skilled operators at the previous level remain more effective than the new adopters of the next level. But what ends up mattering more in the long run is that the higher level of abstraction enables a lot more people to get started and reach a basic level of capability. This is exactly what's happening now! Lots of experienced programmers are not embracing these tools, or are, but are still more effective just writing code. But way more people can get into "vibe coding" with some basic level of success, and that opens up new possibilities.

The narrative is that non-LLM adopters will be left behind, lose their jobs, are Luddites, yadda yadda yadda because they are not moving up the abstraction layers by adopting LLMs to improve their output. There is no point in the timeframe of the story at which Mel would have benefitted from a move to a higher abstraction level by adopting the optimizing compiler because its output will always be drastically inferior to his own using his native expertise.

That's not the narrative in this thread. That's a broader narrative than the one in this thread.

And yes, as I said, the point is not that Mel would benefit, it's that each time a new higher level of abstraction comes onto the scene, it is accessible to more people than the previous level. This was the pattern with machine code to symbolic assembly, it was the pattern with assembly to compiled languages, with higher level languages, and now with "prompting".

The comment I originally replied to implied that this current new abstraction layer is totally different than all the previous ones, and all I said is that I don't think so, I think the comparison is indeed apt. Part of that pattern is that a lot of new people can adopt this new layer of abstraction, even while many people who already know how to program are likely to remain more effective without it.


> you intend to compare the switch to an AI-based workflow to using a higher-level language.

That was the comparison made. AI is an eerily similar shift.

> I don't think that's valid at all.

I don't think you made the case by cherry-picking what it can't do. This is exactly the same situation as when SAP appeared. There weren't symbols for every situation binary programmers were handling at the time. This doesn't change the obvious and practical improvement that abstractions provided. Granted, I'm not happy about it, but I can't deny it either.


Contra your other replies, I think this is exactly the point.

I had an inkling that the feeling existed back then, but I had no idea it was documented so explicitly. Is this quote from The Art of Doing Science and Engineering?


In my feed, 'AI hype' outnumbers 'anti-AI hype' 5-1. And anti-hype moderates like antirez and simonw are rare. To be a radical in AI is to believe that AI tools offer a modest but growing net positive utility to a modest but growing subset of hackers and professionals.

The only AI bloggers who don't have something to sell seem to be simonw, the Flask guy, and this Redis guy. Any other blog recommendations from HN?

tbh I think it is just a question of time before the Flask guy has something to sell: https://earendil.com/

I can't for the life of me tell what it's about.

Well put.

AI obviously brings big benefits to the profession. We just have not seen exactly what they are yet, or how it will unfold.

But personally I feel that a future of not having to churn out yet another crud app is attractive.


In theory “not having to churn out yet another crud app” doesn’t require AI, any ol code generator will do. AI is a really expensive way (in terms of gpus/tpus) to generate boilerplate, but as long as that cost is massively subsidized by investors, you may as well use it.

I agree, we (or I) should have gotten out of this earlier. Shame on me, really. But LLMs have lowered the threshold.

The problem I see is not so much in how you generate the code; it is in how you maintain the code. If you check in the AI-generated code unchanged, do you start changing that code by hand later? Do you trust that in the future AI can fix bugs in your code? Or do you clean up the AI-generated code first?

LLMs remove the familiarity of “I wrote this and deeply understand this”. In other words, everything is “legacy code” now ;-)

For those who are less experienced with the constant surprises that legacy code bases can provide, LLMs are deeply unsettling.


This is the key point for me in all this.

I've never worked in web development, where it seems to me the majority of LLM coding assistants are deployed.

I work on safety critical and life sustaining software and hardware. That's the perspective I have on the world. One question that comes up is "why does it take so long to design and build these systems?" For me, the answer is: that's how long it takes humans to reach a sufficient level of understanding of what they're doing. That's when we ship: when we can provide objective evidence that the systems we've built are safe and effective. These systems we build, which are complex, have to interact with the real world, which is messy and far more complicated.

Writing more code means that's more complexity for humans (note the plurality) to understand. Hiring more people means that's more people who need to understand how the systems work. Want to pull in the schedule? That means humans have to understand in less time. Want to use Agile or this coding tool or that editor or this framework? Fine, these tools might make certain tasks a little easier, but none of that is going to remove the requirement that humans need to understand complex systems before they will work in the real world.

So then we come to LLMs. It's another episode of "finally, we can get these pesky engineers and their time wasting out of the loop". Maybe one day. But we are far from that today. What matters today is still how well do human engineers understand what they're doing. Are you using LLMs to help engineers better understand what they are building? Good. If that's the case you'll probably build more robust systems, and you _might_ even ship faster.

Are you trying to use LLMs to fool yourself into thinking this still isn't the game of humans needing to understand what's going on? "Let's offload some of the understanding of how these systems work onto the AI so we can save time and money". Then I think we're in trouble.


> Are you trying to use LLMs to fool yourself into thinking this still isn't the game of humans needing to understand what's going on?

This is a key question. If you look at all the anti-AI stuff around software engineering, the pervading sentiment is “this will never be a senior engineer”. Setting aside the possibility of future models actually bridging this gap (this would be AGI), let’s accept this as true.

You don’t need an LLM to be a senior engineer to be an effective tool, though. If an LLM can turn your design into concrete code more quickly than you could, that gives you more time to reason over the design, the potential side effects, etc. If you use the LLM well, it allows you to give more time to the things the LLM can’t do well.


Fully agree. In my own usage of AI (which I came to a bit late but have tried to fully embrace so I know what it can and can't do) I've noticed a very unusual side effect: I spend way more of my time documenting and reviewing designs than I used to, and that has been a big positive. I've always been very (maybe too) thoughtful about design and architecture, but I usually focused on high-level design and then would get to some coding as a way of evaluating/testing my designs. I could then throw away v0 using lessons learned and start a v1 on a solid track. Now however, I find myself able to get a lot further in nailing down the design to the point I don't have to build and throw away v0. The prototype is often highly salvageable with the help of the LLM doing the refactoring/iterating that used to make "starting over" a more optimal path. That in turn allows me to maintain the context and velocity of the design much better since there aren't days, or weeks, or even months between the "lessons learned" that then have to go back and revise the design.

The caveat here though, is if I didn't have the decades of experience writing/designing software by hand, I don't think I'd have the skills needed to reap the above benefit.


" They make it easier to explore ideas, to set things up, to translate intent into code across many specialized languages. But the real capability—our ability to respond to change—comes not from how fast we can produce code, but from how deeply we understand the system we are shaping. Tools keep getting smarter. The nature of learning loop stays the same."

https://martinfowler.com/articles/llm-learning-loop.html


Learning happens when your ideas break, when code fails, when unexpected things happen. And in order to have that in a coding agent you need to provide a sensitive skin, which is made of tests; they provide pain feedback to the agent. Inside a good test harness the agent can't break things; it moves in a safe space with greater efficiency than before. So it was the environment providing us with understanding all along, and we should make an environment where the AI can understand the effects of its actions.

Why can't you use LLMs with formal methods? Mathematicians are using LLMs to develop complex proofs. How is that any different?

Maybe. I think we're really just starting this, and I suspect that trying to fuse neural networks with symbolic logic is a really interesting direction to try to explore.

That's kind of not what we're talking about. A pretty large fraction of the community thinks programming is stone cold over because we can talk to an LLM and have it spit out some code that eventually compiles.

Personally I think there will be a huge shift in the way things are done. It just won't look like Claude.


I don't know why you're being downvoted, I think you're right.

I think LLMs need different coding languages, ones that emphasise correctness and formal methods. I think we'll develop specific languages for using LLMs with that work better for this task.

Of course, training an LLM to use it then becomes a chicken/egg problem, but I don't think that's insurmountable.


I don't think "understanding" should be the criteria, you can't commit your eyes in the PR. What you can commit is a test that enforces that understanding programatically. And we can do many many more tests now than before. You just need to ensure testing is deep and well designed.

You can not test that which you do not understand.

I suspect that we are going to have a wave of gurus who show up soon to teach us how to code with LLMs. There’s so much doom and gloom in these sorts of threads about the death of quality code that someone is going to make money telling people how to avoid that problem.

The scenario you describe is a legitimate concern if you’re checking in AI generated code with minimal oversight. In fact I’d say it’s inevitable if you don’t maintain strict quality control. But that’s always the case, which is why code review is a thing. Likewise you can use LLMs without just checking in garbage.

The way I’ve used LLMs for coding so far is to give instructions and then iterate on the result (manually or with further instructions) until it meets my quality standards. It’s definitely slower than just checking in the first working thing the LLM churns out, but it’s still been faster than doing it myself, and I understand it exactly as well, because I have to in order to give instructions (design) and iterate.

My favorite definition of “legacy code” is “code that is not tested” because no matter who writes code, it turns into a minefield quickly if it doesn’t have tests.


How do you know that it's actually faster than if you'd just written it yourself? I think the review and iteration part _is_ the work, and the fact that you started from something generated by an LLM doesn't actually speed things up. The research that I've seen also generally backs this idea up -- LLMs _feel_ very fast because code is being generated quickly, but they haven't actually done any of the work.

Because I’ve been a software engineer for over 20 years. If I look at a feature and feel like it will take me a day and an LLM churns it out in an hour including the iterating, I’m confident that using the LLM was meaningfully faster. Especially since engineers (including me) are notoriously bad at accurate estimation and things usually take at least twice as long as they estimate.

I have tested throwing several features at an LLM lately and I have no doubt that I’m significantly faster when using an LLM. My experience matches what Antirez describes. This doesn’t make me 10x faster, mostly because so much of my job is not coding. But in terms of raw coding, I can believe it’s close to 10x.


Because I don't type that fast.

I know exactly what the result should be, the LLM is just typing it for me.

And it will do the typing while I get up and go to the bathroom (again, I'm getting old).

When I come back, it's done, tests have been run that prove nothing broke.


> I know exactly what the result should be, the LLM is just typing it for me.

This is the mental model people should be working with. The LLM is there to tighten the loop from thought to code. You don't need to treat it like an engineer. You just need to use it to make you more efficient.

It so happens that you *can* give an LLM half-baked thoughts and it will sometimes still do a good job because the right thing is so straightforward. But in general the more vague and unclear your own thoughts, the lower quality the results, necessitating more iterations to refine.


> My favorite definition of “legacy code” is “code that is not tested” because no matter who writes code, it turns into a minefield quickly if it doesn’t have tests.

Unfortunately, "tests" don't do it, they have to be "good tests". I know, because I work on a codebase that has a lot of tests and some modules have good tests and some might as well not have tests because the tests just tell you that you changed something.


> My favorite definition of “legacy code” is “code that is not tested” because no matter who writes code, it turns into a minefield quickly if it doesn’t have tests.

On the contrary, legacy code has, by definition, been battle tested in production. I would amend the definition slightly to:

“Legacy code is code that is difficult to change.”

Lacking tests is one common reason why this could be, but not the only possible reason.


It’s from Working Effectively with Legacy Code. I don’t recall the exact definition but it’s something to that effect. Legacy = lack of automated tests.

The biggest barrier to changing code is usually insufficient automated testing. People are terrified of changing code when they can’t verify the results before breaking production.

More glibly legacy code is “any code I don’t want to deal with”. I’ve seen code written 1 year prior officially declared “legacy” because new coding standards were being put in place and no one wanted to update the old code to match.


I think it was Cory Doctorow who compared AI-generated code to asbestos. Back in its day, asbestos was in everything, because of how useful it seemed. Fast forward decades and now asbestos abatement is a hugely expensive and time-consuming requirement for any remodeling or teardown project. Lead paint has some of the same history.

Get your domain names now! AI Slop Abatement, the major growth industry of the 2030s.

As someone who started their first greenfield project 20 years into their career: Sounds like a Tuesday for me.

We have the tools and knowledge for working with legacy code, have had for decades. There are shelf-meters of books written about it.

It's just a different skillset.


I see where you're coming from, and I agree with the implication that this is more of an issue for inexperienced devs. Having said that, I'd push back a bit on the "legacy" characterization.

For me, if I check in LLM-generated code, it means I've signed off on the final revision and feel comfortable maintaining it to a similar degree as though it were fully hand-written. I may not know every character as intimately as that of code I'd finished writing by hand a day ago, but it shouldn't be any more "legacy" to me than code I wrote by hand a year ago.

It's a bit of a meme that AI code is somehow an incomprehensible black box, but if that is ever the case, it's a failure of the user, not the tool. At the end of the day, a human needs to take responsibility for any code that ends up in a product. You can't just ship something that people will depend on not to harm them without any human ever having had the slightest idea of what it does under the hood.


Where did you get the idea that “legacy code” equals “abandonware”? The world runs on massive legacy codebases that have been maintained for decades.

I'm not sure where you see that in my comment, but I didn't use the word "abandonware".

You’re “pushing back” against the term “legacy code” with an argument that someone “needs to take responsibility”.

Some of those words appear in my comment, but not in the way you're implying I used them.

My argument was that 1) LLM output isn't inherently "legacy" unless vibe coded, and 2) one should not vibe code software that others depend on to remain stable and secure. Your response about "abandonware" is a non sequitur.


To be clear, you’re literally saying:

Legacy == vibe coded

And:

Others can not depend on vibe coded software

Thus you seem to mean:

Legacy code can not be depended on

I presume that through some process one can exorcise the legacy/vibe-codiness away. Perhaps code review of every line? (This would imply that the bottleneck to LLM output is human code review.) Or would having the LLM demonstrate correctness via generated tests be sufficient?


Just to clarify, you're inferring several things that I didn't say:

* I was agreeing with you that all vibe code is effectively legacy, but obviously not all legacy code is vibe code. Part of my point is also that not all LLM code is vibe code.

* I didn't comment on the dependability of legacy code, but I don't believe that strict vibe code should ever be depended on in principle.

As far as non-vibe coding with LLMs, I'd definitely suggest some level of human review and participation in the overall structure/organization. Even if the developer hasn't pored through it line by line, they should have signed off on the tech stack/dependencies/architecture and have some idea of what the file layout and internal modules/interfaces look like. If a major bug is ever discovered, the developer should know enough to confidently code review the fix or implement it by hand if necessary.

Detailed specs, docs, and tests are also positives, which I recently wrote up some thoughts on: https://supremecommander.ai/posts/ai-waterfall-trifecta.


Take responsibility by leaving good documentation of your code and a beefy set of tests; future agents and humans will then have a point to bootstrap from, not just plain code.

Yes, that too, but you should still review and understand your code.

Depends on what you do. When I'm using LLMs to generate code for projects I need to maintain (basically, everything non-throw-away-once-used), I treat it as any other code I'd write: tightly controlled, with a focus on simplicity, well-thought-out abstractions, and automated testing that verifies what needs to be working. Nothing gets "merged" into the code without extensive review and me understanding the full scope of the change.

So with that, I can change the code by hand afterwards or continue with LLMs, it makes no difference, because it's essentially the same process as if I had someone follow the ideas I describe, and then later they come back with a PR. I think probably this comes naturally to senior programmers and those who had a taste of management and similar positions, but if you haven't reviewed other's code before, I'm not sure how well this process can actually work.

At least for me, I manage to produce code I can maintain, and seemingly others do too, and the projects don't devolve into hairballs/spaghetti. But again, it requires reviewing absolutely every line and constantly editing/improving.


We recently got a PR from somebody adding a new feature and the person said he doesn't know $LANG but used AI.

The problem is, that code would require a massive amount of cleanup. I took a brief look and some code was in the wrong place. There were coding style issues, etc.

In my experience, the easy part is getting something that works for 99%. The hard part is getting the architecture right, all of the interfaces and making sure there are no corner cases that get the wrong results.

I'm sure AI can easily get to the 99%, but does it help with the rest?


> I'm sure AI can easily get to the 99%, but does it help with the rest?

Yes, the AI can help with 100% of it. But the operator of the AI needs to be able to articulate this to the AI.

I've been in this position, where I had no choice but to use AI to write code to fix bugs in another party's codebase, then PR the changes back to the codebase owners. In this case it was vendor software that we rely on, in which the vendor hadn't yet fixed critical bugs. And exactly as you described, my PR ultimately got rejected because even though it fixed the bugs in the immediate sense, it presented other issues due to not integrating with the external frameworks the vendor used for their dev processes. At that point it was just easier for the vendor to fix the software their way instead of accepting my PR. But the point is that I could have made the PR correct in the first place, if I as the AI operator had had the knowledge needed to articulate these more detailed and nuanced requirements to the AI. Since I didn't have that information, the AI generated code that worked but didn't meet the vendor's spec. This type of situation is incredibly easy to fall into and is a good example of why you still need a human at the wheel to set the guidance, but you don't necessarily need the human to be writing every line of code.

I don't like the situation much but this is the reality of it. We're basically just code reviewers for AI now


Yeah, so what I'm mostly doing, and advocate for others to do, is basically the pure opposite of that.

Focus on architecture, interfaces, corner-cases, edge-cases and tradeoffs first, and then the details within that won't matter so much anymore. The design/architecture is the hard part, so focus on that first and foremost, and review + throw away bad ideas mercilessly.


Yes it does... but only in the hands of an expert who knows what they are doing.

I'd treat PRs like that as proof of concepts that the thing that can be done, but I'd be surprised if they often produced code that should be directly landed.


In the hands of an expert… right. So is it not incredibly irresponsible to release these tools into the wild and expose them to those who are not experts? Those users may actually end up considerably worse off. Ironically this does not ‘democratise’ intelligence at all: the gap widens between experts and the rest.

I sometimes wonder what would have happened if OpenAI had built GPT3 and then GPT-4 and NOT released them to the world, on the basis that they were too dangerous for regular people to use.

That nearly happened - it's why OpenAI didn't release open weight models past GPT2, and it's why Google didn't release anything useful built on Transformers despite having invented the architecture.

If we lived in that world today, LLMs would be available only to a small, elite and impossibly well funded class of people. Google and OpenAI would solely get to decide who could explore this new world with them.

I think that would suck.


So… what?

With all due respect, I don’t care about an acceleration in writing code; I’m more interested in incremental positive economic impact. To date I haven’t seen anything to convince me that this technology will yield it.

Producing more code doesn’t overcome the lack of imagination, creativity and so on needed to figure out which projects resources should be invested in. This has always been an issue, and it will compound at firms like Google, which have an expansive graveyard of projects laid to rest.

In fact, in a perverse way, all this ‘intelligence’ can exist while humans simultaneously get worse at making judgments about investment decisions.

So broadly where is the net benefit here?


You mean the net benefit in widespread access to LLMs?

I get the impression there's no answer here that would satisfy you, but personally I'm excited about regular people being able to automate tedious things in their lives without having to spend 6+ months learning to program first.

And being able to enrich their lives with access to as much world knowledge as possible via a system that can translate that knowledge into whatever language and terminology makes the most sense to them.


“I'm excited about regular people being able to automate tedious things in their lives without having to spend 6+ months learning to program first.”

Bring the implicit and explicit costs to date into your analysis and you should quickly realise none of this makes sense from a societal standpoint.

Also you seem to be living in a bubble - the average person doesn’t care about automating anything!


The average person already automates a lot of things in their day to day lives. They spend far less time doing the dishes, laundry, and cleaning because parts of those tasks have been mechanized and automated. I think LLMs probably automate the wrong thing for the average person (i.e., I still have to load the laundry machine and fold the laundry after) but automation has saved the average person a lot of time

For example, my friend doesn’t know programming but his job involves some tedious spreadsheet operations. He was able to use an LLM to generate a Python script to automate part of this work. Saving about 30 min/day. He didn’t review the code at all, but he did review the output to the spreadsheet and that’s all that matters.

His workplace has no one with programming skills, so this is automation that would never have happened. Of course it’s not exactly replacing a human or anything. I suppose he could have hired someone to write the script, but he never really thought to do that.
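For what it's worth, the whole script for something like that can be tiny. A rough sketch of the general shape (the file and column names are invented, not his actual workbook, and it assumes pandas plus openpyxl are installed):

  import pandas as pd

  df = pd.read_excel("daily_report.xlsx")            # openpyxl handles the .xlsx reading
  df["total"] = df["unit_price"] * df["quantity"]    # the tedious per-row arithmetic
  summary = df.groupby("region")["total"].sum()      # the part that used to eat 30 min/day
  summary.to_csv("region_totals.csv")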


What sorts of things will the average, non-technical person think of automating on a computer that are actually quality-of-life-improving?

A work colleague had a tedious operation involving manually joining a bunch of video segments together in a predictable pattern. Took them a full working day.

They used "just" ChatGPT on the web to write an automation. Now the same process takes ~5 minutes of work. Select the correct video segments, click one button to run script.

The actual processing still takes time, but they don't need to stand there watching it progress so they can start the second job.

And this was a 100% non-technical marketing person with no programming skills past Excel formulas.
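The core of a script like that can be just a few lines. A hedged sketch of the general approach (paths and the naming pattern are invented, and it assumes ffmpeg is on the PATH and the segments share the same codec):

  import subprocess
  from pathlib import Path

  # Collect the segments in their predictable order and write an ffmpeg concat list.
  segments = sorted(Path("segments").glob("part_*.mp4"))
  Path("list.txt").write_text("".join(f"file '{p}'\n" for p in segments))

  # Join without re-encoding, so the slow part is just disk I/O.
  subprocess.run(
      ["ffmpeg", "-f", "concat", "-safe", "0", "-i", "list.txt", "-c", "copy", "joined.mp4"],
      check=True,
  )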


My favorite anecdotal story here is that a couple of years ago I was attending a training session at a fire station and the fire chief happened to mention that he had spent the past two days manually migrating contact details from one CRM to another.

I do not want the chief of a fire station losing two days of work to something that could be scripted!


I don't want my doctor to vibe-script some conversion only to realize weeks or months later that it made a subtle error in my prescription. I want both of them to have enough funds to hire someone to do it properly. But wanting is not enough, unfortunately...

Humans make subtle errors all the time too though. AI results still need to be checked over for anything important, but it's on a vector toward being much more reliable than a human for any kind of repetitive task.

Currently, if you ask an LLM to do something small and self-contained like solve leetcode problems or implement specific algorithms, they will have a much lower rate of mistakes, in terms of implementing the actual code, than an experienced human engineer. The things it does badly are more about architecture, organization, style, and taste.


But with a software bug, the error rapidly becomes widespread and systematic, whereas human errors often are not. Getting a couple of prescriptions wrong because the doc worked a 12+ hour shift is different from systematically getting a significant number of prescriptions wrong until someone double-checks the results.

An error in a massive hand-crafted Excel sheet also becomes systematic and widespread.

Because Excel has no way of doing unit tests or any kind of significant validation. Big BIG things have gone to shit because of Excel.

Things that would have never happened if the same thing was a vibe-coded python script and a CSV.


I agree with the excel thing. Not with thinking it can't happen with vibecoded python.

I think handling sensitive data should be done by professionals. A lawyer handles contracts, a doctor handles health issues, and a programmer handles data manipulation through programs. This doesn't remove the risk of errors completely, but it reduces it significantly.

In my home, it's me who's impacted if I screw up a fix in my plumbing, but I won't try to do it at work or in my child's school.

I don't care if my doctor vibe codes an app to manipulate their holidays pictures, I care if they do it to manipulate my health or personal data.


Of course issues CAN happen with Python, but at least with Python we have tools to check for the issues.

A bunch of your personal data is most likely going through some Excel sheet made by a now-retired office worker somewhere 15 years ago. Nobody understands how the sheet works, but it works, so they keep using it :) A replacement system (a massive SaaS application) has been "coming soon" for 8 years and has cost millions, but it still doesn't work as well as the Excel sheet.


> Also you seem to be living in a bubble - the average person doesn’t care about automating anything!

One of my life goals is to help bring as many people into my "technology can automate things for you" bubble as I possibly can.


You can apply the same logic to all technologies, including programming languages, HTTP, cryptography, cameras, etc. Who should decide what's a responsible use?

I'm curious about the economic aspects of this. If only experts can use such tools effectively, how big will the total market be and does that warrant the investments?

For companies, if these tools make experts even more special, then experts may get more power certainly when it comes to salary.

So the productively benefits of AI have to be pretty high to overcome this. Does AI make an expert twice as productive?


I have been thinking about this in the last few weeks. First time I see someone commenting about it here.

- If the number of programmers is drastically reduced, how big of a price increase would companies like Anthropic need in order to be profitable?

- If you are a manager, you now have a much bigger bus-factor problem to deal with. One person leaving means a greater blow to the team's knowledge.

- If the number of programmers is drastically reduced, the need for managers and middle managers will also decline, no? Hmm...


Coding style can be deterministically checked for, and should be checked automatically during linting. And no PR should get a single pair of human eyes on it, other than the author's, until all CI checks have passed.

Many, many other stylistic choices, and code complexity, can be checked automatically, so why aren't you doing it?
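As a hedged example of what that can look like, assuming a Python codebase with ruff as the linter/formatter (substitute whatever your stack already uses): in practice this usually lives in CI config or a pre-commit hook, but a small gate script makes the idea explicit.

  import subprocess
  import sys

  CHECKS = [
      ["ruff", "format", "--check", "."],              # formatting is never a review topic
      ["ruff", "check", "--select", "E,F,C901", "."],  # style errors, likely bugs, complexity limit
  ]

  # Run every check (no short-circuiting) so the author sees all failures at once,
  # then fail the build so no human reviews the PR until these pass.
  results = [subprocess.run(cmd).returncode for cmd in CHECKS]
  sys.exit(1 if any(results) else 0)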


> We recently got a PR from somebody adding a new feature and the person said he doesn't know $LANG but used AI.

"Oh, and check it out: I'm a bloody genius now! Estás usando este software de traducción in forma incorrecta. Por favor, consulta el manual. I don't even know what I just said, but I can find out!"


I think we will find out that certain languages, frameworks and libraries are easier for AI to get all the way correct. We may even have to design new languages, frameworks and libraries to realize the full promise of AI. But as the ecosystem around AI evolves I think these issues will be solved.

... And with this level of quality control, is it still faster than writing it yourself?

Is it really much different from maintaining code that other people wrote and that you merged?

Yes, this is (partly) why developer salaries are so high. I can trust my coworkers in ways not possible with AI.

There is no process solution for low performers (as of today).


The solution for low performers is very close oversight. If you imagine an LLM as a very junior engineer who needs an inordinate amount of hand holding (but who can also read and write about 1000x faster than you and who gets paid approximately nothing), you can get a lot of useful work out of it.

A lot of the criticisms of AI coding seem to come from people who think that the only way to use AI is to treat it as a peer. “Code this up and commit to main” is probably a workable model for throwaway projects. It’s not workable for long term projects, at least not currently.


A Junior programmer is a total waste of time if they don't learn. I don't help Juniors because it is an effective use of my time, but because there is hope that they'll learn and become Seniors. It is a long term investment. LLMs are not.

It’s a metaphor. With enough oversight, a qualified engineer can get good results out of an underperforming (or extremely junior) engineer. With a junior engineer, you give the oversight to help them grow. With an underperforming engineer you hope they grow quickly or you eventually terminate their employment because it’s a poor time trade off.

The trade off with an LLM is different. It’s not actually a junior or underperforming engineer. It’s far faster at churning out code than even the best engineers. It can read code far faster. It writes tests more consistently than most engineers (in my experience). It is surprisingly good at catching edge cases. With a junior engineer, you drag down your own performance to improve theirs and you’re often trading off short term benefits vs long term. With an LLM, your net performance goes up because it’s augmenting you with its own strengths.

As an engineer, it will never reach senior level (though future models might). But as a tool, it can enable you to do more.


> It writes tests more consistently than most engineers (in my experience)

I'm going to nit on this specifically. I firmly believe anyone who genuinely believes this either never writes tests that actually matter, or doesn't review the tests that an LLM throws out there. I've seen so many cases of people saying 'look at all these valid tests our LLM of choice wrote', only for half of them to do nothing and the other half to be misleading as to what they actually test.


It’s like anything else, you’ve got to check the results and potentially push it to fix stuff.

I recently had AI code up a feature that was essentially text manipulation. There were existing tests to show it how to write effective tests and it did a great job of covering the new functionality. My feedback to the AI was mostly around some inaccurate comments it made in the code but the coverage was solid. Would have actually been faster for me to fix but I’m experimenting with how much I can make the AI do.

On the other hand I had AI code up another feature in a different code base and it produced a bunch of tests with little actual validation. It basically invoked the new functionality with a good spectrum of arguments but then just validated that the code didn’t throw. And in one case it tested something that diverged slightly from how the code would actually be invoked. In that case I told it how to validate what the functionality was actually doing and how to make the one test more representative. In the end it was good coverage with a small amount of work.
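To make that difference concrete, an invented example (not the actual feature): the first test is the "doesn't throw" kind it produced initially, the second actually pins down behaviour.

  def normalize_phone(raw: str) -> str:
      digits = "".join(ch for ch in raw if ch.isdigit())
      return "+1" + digits if len(digits) == 10 else "+" + digits

  def test_normalize_phone_runs():
      # Weak: passes for almost any implementation that doesn't crash.
      normalize_phone("(555) 123-4567")

  def test_normalize_phone_adds_country_code():
      # Useful: pins the expected output, so a real regression actually fails.
      assert normalize_phone("(555) 123-4567") == "+15551234567"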

For people who don’t usually test or care much about testing, yeah, they probably let the AI create garbage tests.


I don't see anything here that corroborates your claim that it outputs more consistent test code than most engineers. In fact your second case would indicate otherwise.

And this also goes back to my first point about writing tests that matters. Coverage can matter, but coverage is not codifying business logic in your test suite. I've seen many engineers focus only on coverage only for their code to blow up in production because they didn't bother to test the actual real world scenarios it would be used in, which requires deep understanding of the full system.


I still feel like in most of these discussions the criticism of LLMs is that they are poor replacements for great engineers. Yeah. They are. LLMs are great tools for great engineers. They won’t replace good engineers and they won’t make shitty engineers good.

You can’t ask an LLM to autonomously write complex test suites. You have to guide it. But when AI creates a solid test suite with 20 minutes of prodding instead of 4 hours of hand coding, that’s a win. It doesn’t need to do everything alone to be useful.

> writing tests that matters

Yeah. So make sure it writes them. My experience so far is that it writes a decent set of tests with little prompting, honestly exceeding what I see a lot of engineers put together (lots of engineers suck at writing tests). With additional prompting it can make them great.


>feature that was essentially text manipulation

That seems like the kind of feature where the LLM would already have the domain knowledge needed to write reasonable tests, though. Similar to how it can vibe code a surprisingly complicated website or video game without much help, but probably not create a single component of a complex distributed system that will fit into an existing architecture, with exactly the correct behaviour based on some obscure domain knowledge that pretty much exists only in your company.


> probably not create a single component of a complex distributed system that will fit into an existing architecture, with exactly the correct behaviour based on some obscure domain knowledge that pretty much exists only in your company.

An LLM is not a principal engineer. It is a tool. If you try to use it to autonomously create complex systems, you are going to have a bad time. All of the respectable people hyping AI for coding are pretty clear that they have to direct it to get good results in custom domains or complex projects.

A principal engineer would also fail if you asked them to develop a component for your proprietary system with no information, but a principal engineer would be able to do their own deep discovery and design if they have the time and resources. An AI needs you to do some of that.


I also find it hard to agree with that part. Perhaps it depends on what type of software you write, but in my experience finding good test cases is one of those things that often requires a deep level of domain knowledge. I haven’t had much luck making LLMs write interesting, non-trivial tests.

This has been my experience as well. So far, whenever I’ve been initially satisfied with the one shotted tests, when I had to go back to them I realized they needed to be reworked.

> It’s far faster at churning out code than even the best engineers.

I'm not sure I can think of a more damning indictment than this tbh


Can you explain why that’s damning?

I guess everyone dealing with legacy software sees code as a cost factor. Being able to delete code is harder, but often more important than writing code.

Owning code requires you to maintain it. Finding out which parts of the code actually implement features and which parts are not needed anymore (or were never needed in the first place) is really hard, since most of the time the requirements have never been documented and the authors have left or cannot remember. But not understanding what the code does removes any possibility of improving or modifying it. This is how software dies.

Churning out code fast is a huge future liability. Management wants solutions fast and doesn't understand these long term costs. It is the same with all code generators: Short term gains, but long term maintainability issues.


Do you not write code? Is your code base frozen, or do you write code for new features and bug fixes?

The fact that AI can churn out code 1000x faster does not mean you should have it churn out 1000x more code. You might have a list of 20 critical features and only have time to implement 10. AI could let you get all 20, but that shouldn't mean you check in code for 1000 features you don't even need.


I write code. On a good day perhaps 800-1000 "hand written" lines.

I have never actually thought about how much typing time this actually is. Perhaps an hour? In that case 7/8ths of my day is filled with other stuff. Like analysis, planning, gathering requirements, talking to people.

So even if an AI removed almost all the time I spend typing away: This is only a 10% improvement in speed. Even if you ignore that I still have to review the code, understand everything and correct possible problems.

A bigger speedup is only possible if you decide not to understand everything the AI does and just trust it to do the right thing.


Maybe you code so fast that the thought-to-code transition is not a bottleneck for you. In which case, awesome for you. I suspect this makes you a significant outlier since respected and productive engineers like Antirez seem to find benefits.

Sure if you just leave all the code there. But if it's churning out iterations, incrementally improving stuff, it seems ok? That's pretty much what we do as humans, at least IME.


I feel like this is a forest for the trees kind of thing.

It is implied that the code being created is for “capabilities”. If your AI is churning out needless code, then sure, that’s a bad thing. Why would you be asking the AI for code you don’t need, though? You should be asking it for critical features, bug fixes, the things you would be coding up regardless.

You can use a hammer to break your own toes or you can use it to put a roof on your house. Using a tool poorly reflects on the craftsman, not the tool.


Just like LLMs are a total waste of time if you never update the system/developer prompts with additional information as you learn what's important to communicate vs not.

That is a completely different level. I expect a Junior Developer to be able to completely replace me long term, and to be able to decide when existing rules are outdated and should be replaced. To challenge my decisions without me asking for it. To be able to adapt what they have learned to new types of projects or new programming languages. Being Senior is setting the rules.

An LLM only follows rules/prompts. They can never become Senior.


I think you're making a mistake if your reviews amount to trusting that your co-workers never make a mistake. I make mistakes. My co-workers make mistakes. Everybody makes mistakes; that's why we have code reviews.

Yes. Firstly, AI forgets why it wrote certain code, whereas with humans you can at least ask them when reviewing. Secondly, current-gen AI (at least Claude) kind of wants to finish the thing instead of thinking about the bigger picture. Human programmers code a little differently, in that they hate a single-line fix in a random file to patch something in a different part of the code.

I think the second comes from RL training that optimizes for self-contained tasks like SWE-bench.


So you live in a world where code history must only be maintained orally? Have you ever thought to ask the AI to write documentation on the what and the why, and not just write the code? Asking it to document as well as code works well when the AI needs to go back and change either.

I don't see how asking AI to write some description of why it wrote this or that code would actually result in an explanation of why it wrote that code? It's not like it's thinking about it in that way, it's just generating both things. I guess they'd be in the same context so it might be somewhat correct.

If you ask it to document why it did something, then when it goes back later to update the code it has the why in its context. Otherwise, the AI just sees some code later and has no idea why it was written or what it does without reverse engineering it at the moment.

I'm not sure you understood the GP comment. LLMs don't know and can't tell you why they write certain things. You can't fix that by editing your prompt so it writes it on a comment instead of telling you. It will not put the "why" in the comment, and therefore the "why" won't be in the future LLM's context, because there is no way to make it output the "why".

It can output something that looks like the "why" and that's probably good enough in a large percentage of cases.


LLMs know why they are writing things in the moment, and they can justify decisions. Asking them to write those things down when they write code works, and so does asking them to design the code first and then generate/update the code from the design. But yes, if things aren’t written down, “the LLM doesn’t know and can’t tell.” Don’t do that.

I'm going to second seanmcdirmid here, a quick trick is to have Claude write a "remaining.md" if you know you have to do something that will end the session.

Example from this morning, I have to recreate the EFI disk of one of my dev vm's, it means killing the session and rebooting the vm. I had Claude write itself a remaining.md to complement the overall build_guide.vm I'm using so I can pick up where I left off. It's surprisingly effective.


No, humans probably have tens of millions of tokens of memory per PR. It includes not only what's in the code, but everything they searched, everything they tested and in which way, the order they worked in, the edge cases they faced, etc. Claude just can't document all of that, or it will run out of its working context pretty soon.

Ya, LLMs are not human level, they have smaller focus windows, but you can "remember" things with documentation, just like humans resort to when they realize that their tens of millions of tokens of memory per PR aren't reliable either.

The nice thing about LLMs, however, is that they don't grumble about writing extra documentation and tests like humans do. You just tell them to write lots of docs and they do it, they don't just do the fun coding part. I can empathize why human programmers feel threatened.


They have a memory of tens of millions of tokens that's useful during review, but probably useless once the PR is merged.

> It can output something that looks like the "why"

This feels like a distinction without difference. This is an extension of the common refrain that LLMs cannot “think”.

Rather than get overly philosophical, I would ask what the difference is in practical terms. If an LLM can write out a “why” and it is sufficient explanation for a human or a future LLM, how is that not a “why“?


It's...very much a difference?

If you're planning on throwing the code away, fine, but if you're not, eventually you're going to have to revisit it.

Say I'm chasing down some critical bug or a security issue. I run into something that looks overly complicated or unnecessary. Is it something a human did for a reason or did the LLM just randomly plop something in there?

I don't want a made up plausible answer, I need to know if this was a deliberate choice, forex "this is to work around an bug in XY library" or "this is here to guard against [security issue]" or if it's there because some dude on Stackoverflow wrote sample code in 2008.


If your concern is philosophical, and you are defining LLMs as not having a “why”, then of course they cannot write down “why” because it doesn’t exist. This is the philosophical discussion I am trying to avoid because I don’t think it’s fruitful.

If your concern is practical and you are worried that the “why” an LLM might produce is arbitrary, then my experience so far says this isn’t a problem. What I’m seeing LLMs record in commit messages and summaries of work is very much the concrete reasons they did things. I’ve yet to see a “why” that seemed like nonsense or arbitrary.

If you have engineers checking in overly complex blobs of code with no “why”, that’s a problem whether they use AI or not. AI tools do not replace engineers, and I would not work in any code base where engineers were checking in vibe-coded features without understanding them and vetting the results properly.


No, I'm still saying something very practical.

I don't care what text the LLM generates. If you wanna read robotext, knock yourself out. It's useless for what I'm talking about, which is "something is broken and I'm trying to figure out what"

In that context, I'm trying to do two things:

1. Fix the problem

2. Don't break anything else

If there's something weird in the code, I need to know if it's necessary. "Will I break something I don't know about if I change this" is something I can ask a person. Or a whole chain of people if I need to.

I can't ask the LLM, because "yes $BIG_CLIENT needs that behavior for stupid reasons" is not gonna be a part of its prompt or training data, and I need that information to fix it properly and not cause any regressions.

It may sound contrived but that sort of thing happens allllll the time.


> If there's something weird in the code, I need to know if it's necessary.

What does this have to do with LLMs?

I agree this sort of thing happens all the time. Today. With code written by humans. If you’re lucky you can go ask the human author, but in my experience if they didn’t bother to comment they usually can’t remember either. And very often the author has moved on anyway.

The fix for this is to write why this weird code is necessary in a comment or at least a commit message or PR summary. This is also the fix for LLM code. In the moment, when in the context for why this weird code was needed, record it.

You also should shame any engineer who checks in code they don’t understand, regardless of whether it came from an LLM or not. That’s just poor engineering and low standards.


Yeah. I know. The point is there is no Chesterton's Fence when it comes to LLMs. I can't even start from the assumption that this code is here for a reason.

And yes, of course people should understand the code. People should do a lot of things in theory. In practice, every codebase has bits that are duct taped together with a bunch of #FIXME comments lol. You deal with what you got.


The problem is that your starting point seems to be that LLMs can check in garbage to your code base with no human oversight.

If your engineering culture is such that an engineer could prompt an LLM to produce a bunch of code that contains a bunch of weird nonsense, and they can check that weird nonsense in with no comments and no one will say “what the hell are you doing?”, then the LLM is not the problem. Your engineering culture is. There is no reason anyone should be checking in some obtuse code that solves BIG_CORP_PROBLEM without a comment to that effect, regardless of whether they used AI to generate the code or not.

Are you just arguing that LLM’s should not be allowed to check in code without human oversight? Because yeah, I one hundred percent agree and I think most people in favor of AI use for coding would also agree.


Yeah, and I'm explaining that the gap between theory and practice is greater in practice than it is in theory, and why LLMs make it worse.

It's easy to just say "just make the code better", but in reality I'm dealing with something that's an amalgam of the work of several hundred people, all the way back to the founders and whatever questionable choices they made lol.

The map is the territory here. Code is the result of our business processes and decisions and history.


You're treating this as a philosophical question like a LLM can't have actual reasons because it's not conscious. That's not the problem. No, the problem is mechanical. The processing path that would be needed to output actual reasons just doesn't exist.

LLMs only have one data path, and that path basically computes what a human is most likely to write next. There's no way to make them not do this. If you ask one for a cake recipe, it outputs what it thinks a human would say when asked for a cake recipe. If you ask it for the reason it called for 3 eggs, it outputs what it thinks a human would say when asked why they called for 3 eggs. It doesn't go backwards to the last checkpoint and do a variational analysis to see what factors actually caused it to write down 3 eggs. It just writes down some things that sound like reasons you'd use 3 eggs.

If you want to know the actual reasons it wrote 3 eggs, you can do that, but you need to write some special research software that metaphorically sticks the AI's brain full of electrodes. You can't do it by just asking the model because the model doesn't have access to that data.

Humans do the same thing by the way. We're terrible at knowing why we do things. Researchers stuck electrodes in our brains and discovered a signal that consistently appears about half a second before we're consciously aware we want to do something!


> Humans do the same thing by the way.

But this is exactly why it is philosophical. We’re having a discussion about why an LLM cannot really ever explain “why”. And then we turn around and say, but actually humans have the exact same problem. So it’s not an LLM problem at all. It’s a philosophical problem about whether it’s possible to identify a real “why”. In general it is not possible to distinguish between a “real why” and a post hoc rationalization so the distinction is meaningless for practical purposes.


It's absolutely not meaningless if you work on code that matters. It matters a lot.

I don't care about philosophical "knowing", I wanna make sure I'm not gonna cause an incident by ripping out or changing something or get paged because $BIG_CLIENT is furious that we broke their processes.


If I show you two "why" comments in a codebase, can you tell which one was written by an LLM and which was not?

Just like humans leave comments like this

  // don't try to optimise this, it can't be done
  // If you try, increment this number: 42
You can do the same for LLMs

  // This is here because <reason> it cannot be optimised using <method>
It works, I've done it. (On the surface that code looks like you could use a specific type of caching to speed it up, but it actually fails for reasons; LLMs kept trying, so I added a comment that stopped them.)

Of course I can't tell the difference. That's not the point. And yes, humans can leave stupid comments too.

The difference is I can ping humans on Slack and get clarification.

I don't want reasons because I think comments are neat. If I'm tracking this sort of thing down, something is broken and I'm trying to fix it without breaking anything else.

It only takes screwing this up a couple times before you learn what a Chesterton's Fence is lol.


You are framing this as an AI problem, but from what I’m hearing, this is just an engineering culture problem.

You should not bet on the ability to ping humans on Slack long-term. Not because AI is going to replace human engineers, but because humans have fallible memories and leave jobs. To the extent that your processes require the ability to regularly ask other engineers “why the hell did you do this“, your processes are holding you back.

If anything, AI potentially makes this easier. Because it’s really easy to prompt the AI to record why the hell things are done the way they are, whether recording its own “thoughts” or recording the “why” it was given by an engineer.


It's not an engineering culture problem lol, I promise. I have over a decade in this career and I've worked at places with fantastic and rigorous processes and at places with awful ones. The better places slacked each other a lot.

I don't understand what's so hard to understand about "I need to understand the actual ramifications of my changes before I make them and no generated robotext is gonna tell me that"


I'm probably bad at explaining.

StackOverflow is a tool. You could use it to look for a solution to a bug you're investigating. You could use it to learn new techniques. You could use it to guide you through tradeoffs in different options. You can also use it to copy/paste code you don't understand and break your production service. That's not a problem with StackOverflow.

> "I need to understand the actual ramifications of my changes before I make them and no generated robotext is gonna tell me that"

Who's checking in this robotext?

* Is it some rogue AI agent? Who gave it unfettered access to your codebase, and why?

* Is it you, using an LLM to try to fix a bug? Yeah, don't check it in if you don't understand what you got back or why.

* Is it your peers, checking in code they don't understand? Then you do have a culture problem.

An LLM gives you code. It doesn't free you of the responsibility to understand the code you check in. If the only way you can use an LLM is to blindly accept what it gives you, then yeah, I guess don't use an LLM. But then you also probably shouldn't use StackOverflow. Or anything else that might give you code you'd be tempted to check in blindly.


It does actually work incredibly well. It's even remarkably good at looking through existing stuff (written by AI or not) and reasoning about why it is the way it is. I agree it's not "thinking" in the same way a human might, but it gets to a more plausible explanation than many humans can a lot more often than I ever would have thought.

Have you tried it? LLMs are quite good at summarizing. Not perfect, but then neither are humans.

> So you live in a world where code history must only be maintained orally?

There are many companies and scenarios where this is completely legitimate.

For example, a startup that's iterating quickly with a small, skilled dev team. A bunch of documentation is a liability, it'll be stale before anyone ever reads it.

Just grabbing someone and collaborating with them on what they wrote is much more effective in that situation.


> For example, a startup that's iterating quickly with a small, skilled dev team. A bunch of documentation is a liability, it'll be stale before anyone ever reads it.

This is a huge advantage for AI though, they don't complain about writing docs, and will actively keep the docs in sync if you pipeline your requests to do something like "I want to change the code to do X, update the design docs, and then update the code". Human beings would just grumble a lot, an AI doesn't complain...it just does the work.

> Just grabbing someone and collaborating with them on what they wrote is much more effective in that situation.

Again, it just sounds to me that you are arguing why AIs are superior, not in how they are inferior.


Documentation isn't there to have and admire, you write it for a purpose.

There are like eight bajillion systems out there that can generate low-level javadoc-ish docs. Those are trivial.

The other types of internal developer documentation are "how do I set this up", "why was this code written" and "why is this code the way it is" and usually those are much more efficiently conveyed person to person. At least until you get to be a big company.

For a small team, I would 100% agree those kinds of documentation are usually a liability. The problem is "I can't trust that the documentation is accurate or complete" and with AI, I still can't trust that it wrote accurate or complete documentation, or that anyone checked what it generated. So it's kind of worse than useless?


The LLM writes it with the purpose you gave it, to remember why it did things when it goes to change things later. The difference between humans and AI is that humans skip the document step because they think they can just remember everything, AI doesn’t have that luxury.

Just say the model uses the files to seed token state. Anthropomorphizing the thing is silly.

And no, you don't skip the documentation because you "think you can just remember everything". It's a tradeoff.

Documentation is not free to maintain (no, not even the AI version) and bad or inaccurate documentation is worse than none, because it wastes everyone's time.

You build a mental map of how the code is structured and where to find what you need, and you build a mental model of how the system works. Understanding, not memorization.

When prod goes down you really don't wanna be faffing about going "hey Alexa, what's a database index".


Have you never had a situation where a question arose a year (or several) later that wasn’t addressed in the original documentation?

In particular IME the LLM generates a lot of documentation that explains what and not a lot of the why (or at least if it does it’s not reflecting underlying business decisions that prompted the change).


You can ask it to generate the why, even if it the agent isn’t doing that by default. At least you can ask it to encode how it is mapping your request to code, and to make sure that the original request is documented, so you can record why it did something at least, even if it can’t have insight into why you made the request in the first place. The same applies to successive changes.

I seriously don't remember why I wrote certain code two months ago. I have to read my code that I wrote two months ago to understand what I was doing and why. I don't remember every single line of code that I wrote and why. I guess I'm a stateless developer that way.

You don't just code with AI, you provide 2 things

1. a detailed spec, the result of your discussions with the agent about the work; when it gets it, you ask the agent to formalize it into docs

2. an extensive suite of tests to cover every angle; the tests are generated, but you have to ensure their quality, coverage and depth

I think, to make a metaphor, that specs are like the skeleton of the agent, tests are like the skin, while the agent itself is the muscle and cerebellum, and you are the PFC. Skeleton provides structure and decides how the joints fit, tests provide pain and feedback. The muscle is made more efficient between the two.

In short the new coding loop looks like: "spec -> code -> test, rinse and repeat"


Are you just generating code with the LLM? Ya, you are screwed. Are you generating documentation and tests and everything else that helps the code live? Then your options for maintenance go up. Now just replace “generate” with “maintain” and you are basically asking the AI to make changes to a description at the top that then percolate into multiple artifacts being updated, only one of which happens to be the code itself, and the code updates multiple times as the AI checks tests and such.

I wish there were good guides on how to get the best out of LLMs. All of these tips about adding documentation etc seem very useful but I’ve never seen good guides on how to do this effectively or sustainably.

It is still the early days; everyone has their own process, and a lot of it is still ad hoc. It is an exciting time to be in the field though: before turnkey solutions arrive, we all get to be explorers.

Fair, but it would be interesting to see how people are implementing this “write the docs you need to do a better job” logic and putting it into use. I’m playing with this but would love to see someone’s success story. “I did X and now the code is better/its more token efficient/reviewers understand the changes/whatever.”

I just let the LLM write the docs it will read, and I don't pay much attention to them unless I need to debug a problem that it can't solve on its own. I just tell it what areas to focus on; it writes stuff that gets checked in but isn't really read by humans, it updates the docs when things change before it changes the code, and it can also review all the design material when making code changes.

Sometimes I run into a problem that the LLM can't really handle yet, but I just break the problem up into more docs, tests, and code. So... that usually works, though I admit I move more slowly on those problems, and I'm not yet asking the LLM how to break the problem up (although I think we will get there soon).


Do you prompt for anything specific to record or does your prompt just contain something general like “read .aidump if present for potentially useful context and update or create .aidump with any useful information”?

Mostly the latter! You can ask it to look at things conditionally (like, if the test fails, look at this doc before deciding what to do next), but usually I just load it all up at the start before asking it to make change. The LLM is good enough about picking out what it needs. The one problem is that if you have a change you are propagating through the workflow, you need to highlight that change to the LLM or it might not notice it.

I'm working on workflow processing to make this easier at the moment (because I can't help my coworkers do what I'm doing while what I'm doing is so ad hoc), which is why I'm talking about it so much. So the idea is that you request a change at the top, and the LLM updates everything to accommodate the change, keeping track of what changed in each artifact. When it goes to generate code, it has the change reflected in the artifacts that feed into the code (which are just read in along with a prompt saying "generate the code!"). You just don't ask the LLM to change the code directly (because if you do that, none of the docs get updated for the change, and things can go bad after that...).

When things go wrong, I add extra context if I can spot the problem ("focus on X, X looks wrong because...") and that just merges with the other docs as the context. Sometimes if I can't figure out why a test is failing, I ask it to create a simpler version of the test and see if that fails (if it does, it will be easier to eye the problem). Manual intervention is still necessary (and ugh, sometimes the LLM is just having a bad day and I need to /clear and try again).


I need to play with this more. I’ve had AI generate a bunch of small summaries that it could theoretically use to optimize future work. I haven’t asked it specifically to just dump info as it’s doing other work yet.

The files I had it generate were interesting but I’m not convinced looking at them that they contain the real info the AI needs to be more efficient. I should look into what kind of context analysis agents are passing back because that seems like what I want to save for later.


You can’t just ask AI to dump, you need to vaguely describe what design elements you think are important. For SQL, you might want to plan out your CTEs first, then come up with a strategy for implementing each one, before getting to the SQL file itself (and of course tests, but that is a separate line of artifacts; you don’t want the AI to look at the tests when updating code, because you want to avoid letting the AI code to the test). You can also look at where the AI is having trouble doing something, or not doing it very well, and ask it to write documentation that will help it do that more successfully.
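
For the SQL case, the planning artifact can be tiny. A made-up example, written before any SQL exists:

  goal: weekly active users per region
  cte sessions_last_week  - filter raw sessions to the trailing 7 days
  cte sessions_per_user   - dedupe to one row per user per day
  cte users_with_region   - join in the user -> region mapping
  final select            - group by region, count distinct users
  (tests live in a separate artifact; the AI doesn't read them when editing the SQL)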

I can’t imagine asking AI to change some code without having a description of what the code does. You could maybe reverse engineer that, but that would basically be generating the documents after the fact. Likewise changing code without tests, where failing tests are actionable signals for the AI to make sure it doesn’t break things on update. Some people here think you can just ask it to write code without any other artifacts; that’s nuts (maybe agentic tooling will develop in the direction where AI writes persistent artifacts on its own without being told to do so, actually I’m sure that will happen eventually).


> You can’t just ask AI to dump, you need to vaguely describe what design elements you think are important

Right. And that’s what I’ve tried to do but I am not confident it’s captured the most critical info in an efficient way.

> I can’t imagine asking AI to change some code without having a description of what the code does. You could maybe reverse engineer that, but that would basically be generating the documents after the fact.

This is exactly how I’ve been using AI so far. I tell it to deeply analyze the code before starting and it burns huge amounts of tokens relearning the same things it learned last time. I want to get some docs in place to minimize this. That’s why I’m interested in what a subagent would respond with because that’s what it’s operating with usually. Or maybe the compressed context might be an interesting reference.


You can save the analysis and those are your docs. But your workflow has to maintain them in sync with the code.

Working for a FAANG, I have no idea about token cost; it’s a blind spot for me. One of these days I’m going to try to get Qwen Coder going for some personal projects on my M3 Max (I can run 30b or even 80b heavily quantized), and see if I can get something going that’s thrifty with the resources provided by a local LLM.


I’m not actually paying for tokens. Just trying to be a good citizen. And also trying to figure out how to set everyone in my organization up to do the same.

Interestingly, while playing with Claude Code, I just learned that /init actually does analyze the codebase and record its findings.


Would it not be a new paradigm, where the generated code from AI is segregated and treated like a binary blob? You don't change it (beyond perhaps some cosmetic, or superficial changes that the AI missed). You keep the prompt(s), and maintain that instead. And for new changes you want added, the prompts are either modified, or appended to.

Sounds like a nondeterministic nightmare

indeed - https://www.dbreunig.com/2026/01/08/a-software-library-with-... appears to be exactly that - the idea that the only leverage you have for fixing bugs is updating prompts (and, to be fair, test cases, which you should be doing for every bug anyway) is kind of upsetting as someone who thinks software can actually work :-)

(via simonw, didn't see it already on HN)


There is a related issue of ownership. When human programmers make errors that cost revenue or worse, there is (in theory) a clear chain of accountability. Who do you blame if errors generated by LLMs end up in mission critical software?

> Who do you blame if errors generated by LLMs end up in mission critical software?

I don't think many companies/codebases allow LLMs to autonomously edit code and deploy it, there is still a human in the loop that "prompt > generates > reviews > commits", so it really isn't hard to find someone to blame for those errors, if you happen to work in that kind of blame-filled environment.

Same goes for contractors, I suppose: if you end up outsourcing work to a contractor, they do a shitty job, but it gets shipped anyway, who do you blame? Replace "contractor" with "LLM" and I think the answer remains the same.


I have AI agents write, perform code review, improve and iterate upon the code. I trust that an agent with capabilities to write working code can also improve it. I use Claude skills for this and keep improving the skills based on both AI and human code reviews for the same type of code.

> programmer who actually do like the actual typing

It's not about the typing, it's about the understanding.

LLM coding is like reading a math textbook without trying to solve any of the problems. You get an overview, you get a sense of what it's about and most importantly you get a false sense of understanding.

But if you try to actually solve the problems, you engage completely different parts of your brain. It's about the self-improvement.


> LLM coding is like reading a math textbook without trying to solve any of the problems.

Most math textbooks provide the solutions too. So you could choose to just read those and move on and you’d have achieved much less. The same is true with coding. Just because LLMs are available doesn’t mean you have to use them for all coding, especially when the goal is to learn foundational knowledge. I still believe there’s a need for humans to learn much of the same foundational knowledge as before LLMs otherwise we’ll end up with a world of technology that is totally inscrutable. Those who choose to just vibe code everything will make themselves irrelevant quickly.


I haven't used AI yet but I definitely would love a tool that could do the drudgery for me for designs that I already understand. For instance, if I want to store my own structures in an RDBMS, I want to lay the groundwork and say "Hey Jeeves, give me the C++ syntax to commit this structure to a MySQL table using commit/rollback". I believe once I know what I want, futzing over the exact syntax for how to do it is a waste of time. I heard C++ isn't well supported but eventually I'll give it a try.

Most math books do not provide solutions. Outside of calculus, advanced mathematics solutions are left as an exercise for the reader.

The ones I used for the first couple of years of my math PhD had solutions. That's a sufficient level of "advanced" to be applicable in this analogy. It doesn't really matter though - the point still stands that _if_ solutions are available you don't have to use them and doing so will hurt your learning of foundational knowledge.

> It's not about the typing, it's about the understanding.

Well, it's both, for different people, seemingly :)

I also like the understanding, and solving something difficult rewards a really strong part of my brain. But I don't always like to spend 5 hours doing so, especially when I'm only doing it because of some other problem I want to solve. Then, ideally, I just want it solved.

But then other days I engage in problems that are hard because they are hard, and because I want to spend 5 hours thinking about them, designing the perfect solution for them, and so on.

Different moments call for different methods, and particularly people seem to widely favor different methods too, which makes sense.


> LLM coding is like reading a math textbook without trying to solve any of the problems. You get an overview, you get a sense of what it's about and most importantly you get a false sense of understanding.

Can be, but… well, the analogy can go wrong both ways.

This is what Brilliant.org and Duolingo sell themselves on: solve problems to learn.

Before I moved to Berlin in 2018, I had turned the whole Duolingo German tree gold more than once; when I arrived, I was essentially tourist-level.

Brilliant.org, I did as much as I could before the questions got too hard (latter half of group theory, relativity, vector calculus, that kind of thing); I've looked at it again since then, and get the impression the new questions they added were the same kind of thing that ultimately turned me off Duolingo: easier questions that teach little, padding out a progression system that can only be worked through fast enough to learn anything if you pay a lot.

Code… even before LLMs, I've seen and I've worked with confident people with a false sense of understanding about the code they wrote. (Unfortunately for me, one of my weaknesses is the politics of navigating such people).


Yeah, there's a big difference between edutainment like Brilliant and Duolingo and actually studying a topic.

I'm not trying to be snobbish here, it's completely fine to enjoy those sorts of products (I consume a lot of pop science, which I put in the same category) but you gotta actually get your hands dirty and do the work.

It's also fine to not want to do that -- I love to doodle and have a reasonable eye for drawing, but to get really good at it, I'd have to practice a lot and develop better technique and skills and make a lot of shitty art and ehhhh. I don't want it badly enough.


Lately I've been writing DSLs with the help of these LLM assistants. It is definitely not vibe coding as I'm paying a lot of attention to the overall architecture. But most importantly my focus is on the expressiveness and usefulness of the DSLs themselves. I am indeed solving problems and I am very engaged but it is a very different focus. "How can the LSP help orient the developer?" "Do we want to encourage a functional-looking pipeline in this context"? "How should the step debugger operate under these conditions"? etc.

  GET /svg/weather
    |> jq: weatherData
    |> jq: `
      .hourly as $h |
      [$h.time, $h.temperature_2m] | transpose | map({time: .[0], temp: .[1]})
    `
    |> gg({ "type": "svg", "width": 800, "height": 400 }): `
      aes(x: time, y: temp) 
        | line() 
        | point()
    `
I've even started embedding my DSLs inside my other DSLs!

We've been hearing this a lot, but I don't really get it. A lot of code, most probably, isn't even close to being as challenging as a maths textbook.

It obviously depends a lot on what exactly you're building, but in many projects programming entails a lot of low intellectual effort, repetitive work.

It's the same things over and over with slight variations and little intellectual challenge once you've learnt the basic concepts.

Many projects do have a kernel of non-obvious innovation, some have a lot of it, and by all means, do think deeply about these parts. That's your job.

But if an LLM can do the clerical work for you? What's not to celebrate about that?

To make it concrete with an example: the other day I had Claude make a TUI for a data processing library I made. It's a bunch of rather tedious boilerplate.

I really have no intellectual interest in TUI coding and I would consider doing that myself a terrible use of my time considering all the other things I could be doing.

The alternative wasn't to have a much better TUI, but to not have any.


> It obviously depends a lot on what exactly you're building, but in many projects programming entails a lot of low intellectual effort, repetitive work.

I think I can reasonably describe myself as one of the people telling you the thing you don't really get.

And from my perspective: we hate those projects and only do them if/because they pay well.

> the other day I had Claude make a TUI for a data processing library I made. It's a bunch of rather tedious boilerplate. I really have no intellectual interest in TUI coding...

From my perspective, the core concepts in a TUI event loop are cool, and making one only involves boilerplate insofar as the support libraries you use expect it. And when I encounter that, I naturally add "design a better API for this" to my project list.
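
To illustrate how small the core really is, here's a toy event loop in Python's standard-library curses module (Unix-only, and deliberately minimal, but the draw/wait/dispatch shape is all there):

  import curses

  def main(stdscr):
      curses.curs_set(0)              # hide the cursor
      count = 0
      while True:                     # the event loop: draw, wait, dispatch
          stdscr.erase()
          stdscr.addstr(0, 0, f"presses: {count}  (q quits)")
          stdscr.refresh()
          key = stdscr.getch()        # blocks until a key event arrives
          if key == ord("q"):
              break
          count += 1

  curses.wrapper(main)                # sets up and restores the terminal safely

Everything beyond that is whatever boilerplate the support libraries impose, which is exactly the part worth redesigning.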

Historically, a large part of avoiding the tedium has been making a clearer separation between the expressive code-like things and the repetitive data-like things, to the point where the data-like things can be purely automated or outsourced. AI feels weird because it blurs the line of what can or cannot be automated, at the expense of determinism.


I've also been hearing variations of your comment a lot, and correct me if I am wrong, but I think they always implicitly assume that LLMs are more useful for the low-intellectual stuff than for solving the high-intellectual core of the problem.

The thing is:

1) A lot of the low-intellectual stuff is not necessarily repetitive; it involves some business logic which is a culmination of knowing the process behind what the user needs. When you write a prompt, the model makes assumptions which are not necessarily correct for the particular situation. Writing the code yourself forces you to notice the decision points and make more informed choices.

I understand your TUI example and it's better than having none now, but as a result anybody who wants to write "a much better TUI" now faces a higher barrier to entry since a) it's harder to justify an incremental improvement which takes a lot of work b) users will already have processes around the current system c) anybody who wrote a similar library with a better TUI is now competing with you and quality is a much smaller factor than hype/awareness/advertisement.

We'll basically have more but lower quality SW and I am not sure that's an improvement long term.

2) A lot of the high-intellectual stuff ironically can be solved by LLMs because a similar problem is already in the training data, maybe in another language, maybe with slight differences which can be pattern matched by the LLM. It's laundering other people's work and you don't even get to focus on the interesting parts.


> but I think they always implicitly assume that LLMs are more useful for the low-intellectual stuff than solving the high-intellectual core of the problem.

Yes, this follows from the point the GP was making.

The LLM can produce code for complex problems, but that doesn't save you as much time, because in those cases typing it out isn't the bottleneck, understanding it in detail is.


And so in the future if you want to add a feature, either the LLM can do it correctly or the feature doesn’t get added? How long will that work as the TUI code base grows?

At that point you change your attitude to the project and start treating it like something you care about, take control of the architecture, rewrite bits that don't make sense, etc.

Plus the size of project that an LLM can help maintain keeps growing. I actually think that size may no longer have any realistic limits at all now: the tricks Claude Code uses today with grep and sub-agents mean there's no longer a realistic upper limit to how much code it can help manage, even with Opus's relatively small (by today's standards) 200,000 token limit.


The problem I'm anticipating isn't so much "the codebase grows beyond the agent-system's comprehension" as "the agent-system doesn't care about good architecture" (at least unless it's explicitly directed to). So the codebase grows beyond its natural size when things are redundantly rewritten and stuffed into inappropriate places, or ill-fitting architectural patterns are aped.

Don't "vibe code". If you don't know what architecture the LLM is producing, you will produce slop.

> I think there is a section of programmer who actually do like the actual typing of letters, numbers and special characters into a computer, and for them, I understand LLMs remove the fun part.

Exactly me.


Same for me, sadly.

One of the reasons why I learned vim was because I enjoy staying on the keyboard; I'm a fast typist and part of the fun is typing out the code I'm thinking.

I can see how some folks only really like seeing the final product rather than the process of building it, but I'm just not cut out for that — I hate entrepreneurship for the same reason, I enjoy the building part more than the end result.

And it's the part that's killing me with all this hype.


Conversely, I have very little interest in the process of programming by itself; all the magic is about the end result and the business value for me (which fortunately has served me quite well professionally). From as young as I can remember, I was fascinated with the GUI DBMS (4th Dimension/FileMaker/MS Access/…) my dad used to improve his small business. I got into programming only to not be limited by graphical tools. So LLMs for me are just a nice addition to my toolbox, like a power tool is to a manual one. It doesn’t philosophically change anything.

That's because physical programming is a ritual.

I'm not entirely sure what that means myself, so please speak up if my statement resonates with you.


It resonates. But as I see it, that kind of ritual I'd rather devote myself to at home. At work, the more efficiently and rapidly we can get stuff done, the better.

Drawing and painting is a ritual to me as well. No one pays me for it and I am happy about that.


Corporations trying to "invent" AGI is like that boss in Bloodborne

Same. However, for me the fun in programming was always a kind of trap that kept me from doing more challenging things.

Now the fun is gone, maybe I can do more important work.


You might be surprised to find out how much of your motivation to do any of it at all was tied to your enjoyment, and that’s much more difficult to overcome than people realize.

> Now the fun is gone, maybe I can do more important work.

This is a very sad, bleak, and utilitarian view of "work." It is also simply not how humans operate. Even if you only care about the product, humans that enjoy and take pride in what they're doing almost invariably produce better products that their customers like more.


My problem was the exact opposite. I wanted to deliver but the dislike of the actual programming / typing code prevented me from doing so. AI has solved this for me.

> For me, I initially got into programming because I wanted to ruin other people's websites, then I figured out I needed to know how to build websites first, then I found it more fun to create and share what I've done with others, and they tell me what they think of it.

Talk about a good thing coming from bad intentions! Congratulations on shaking that demon.


It's pointless these days because most web sites are pre-ruined. ;)

The split I'm seeing with those around me is:

1. Those who see their codebase as a sculpture, a work of art, a source of pride.

2. Those who focus on outcomes.

They are not contradictory goals, but I'm finding that if your emphasis is 1, you generally dislike LLMs, and if your emphasis is 2, you love them, or at least tolerate them.


Why would you dislike LLMs for 1?

I have my personal projects where every single line is authored by hand.

Still, I will ask LLMs for feedback or look for ideas when I have the feeling something could be rearchitected/improved but I don't see how.

More often than not, they fluke, but occasionally they will still provide valid feedback which otherwise I'd missed.

LLMs aren't just for the "lets dump large amounts of lower-level work" use case.


I don't disagree with you - LLMs are not at odds with quality code if you use them correctly. But many people who take excessive pride in their code don't even bother to look and see what can be done with them. Though, in the last couple months, I have seen several of the (1) types around me finally try them.

I think it’s true that people get enjoyment from different things. Also, I wonder if people have fixed ideas about how coding agents can be used? For example, if you care about what the code looks like and want to work on readability, test coverage, and other “code health” tasks with a coding agent, you can do that. It’s up to you whether you ask it to do cleanup tasks or implement new features.

Maybe there are people who are about literally typing the code, but I get satisfaction from making the codebase nice and neat, and now I have power tools. I am just working on small personal projects, but so far, Claude Opus 4.5 can do any refactoring I can describe.


> I think there is a section of programmer who actually do like the actual typing of letters

Do people actually spend significant time typing? After I moved beyond the novice stage it’s been an inconsequential amount of time. What typing it out still provides is a thorough review of every single line, in a way that is essentially equivalent to what a good PR review looks like.


Yes, for the type of work LLMs are good at (greenfield projects or lots of boilerplate).

Novice work

Do people actually enjoy reviewing PRs?

See, that also works.


> I think there is a section of programmer who actually do like the actual typing of letters, numbers and special characters into a computer, and for them, I understand LLMs remove the fun part.

I know you didn't mean to, but I think that description is a mischaracterization. I'd wager most of us "I control the computer" people who enjoy crafting software don't really care for the actual input of symbols. That is just the mechanism by which we move code from our heads to the computer. What LLMs destroy – at least for me – is the creation of code in my head and its (more-or-less) faithful replication inside the computer. I don't particularly enjoy the physical act of moving my fingers across a piece of plastic, but I do enjoy the result: executing my program on my computer.

If an LLM is placed in the middle, two things happen: first, I'm expressing the _idea_ of my program not to a computer, but to an LLM; and second, the LLM expresses its "interpretation" of that idea to the computer. Both parts destroy joy for me. That's of course not important to anyone but myself and likeminded people, and I don't expect the world to care. But I do also believe that both parts come with a whole host of dangers that make the end result less trustworthy and less maintainable over time.

I'm definitely warming to the role of LLMs as critics though. I also see value in having them write tests – the worst a bad or unmaintainable test will provide is a false error.


> I think there is a section of programmer who actually do like the actual typing of letters, numbers and special characters into a computer.

I don't think this is really it for many people (maybe any); after all, you can do all of that when writing a text message rather than a piece of code.

But it inches closer to what I think is the "right answer" for this type of software developer. There are aspects of software development that are very much like other forms of writing (e.g., prose or poetry).

Like other writing, writing code can constitute self-expression in an inherently satisfying way, and it can also offer the satisfaction of finding "the perfect phrase". LLMs more or less eliminate both sources of pleasure, either by eliminating the act of writing itself (that is, choosing and refining the words) or through their bland, generic, tasteless style.

There are other ways that LLMs can disconnect the people using them from what is joyful about writing code, not least of all because LLMs can be used in a lot of different ways. (Using them as search tools or otherwise consulting them rather than having them commit code to simply be either accepted/rejected "solves" the specific problems I just mentioned, for instance.)

There is something magical about speaking motion into existence, which is part of what has made programming feel special to me, ever since I was a kid. In a way, prompting an LLM to generate working code preserves that and I can imagine how, for some, it even seems to magnify the magic. But there is also a sense of essential mastery involved in the wonderful way code brings ideas to life. That mastery involves not just "understanding" things in the cursory way involved in visually scanning someone else's code and thinking "looks good to me", but intimately knowing how the words and abstractions and effects all "line up" and relate to each other (and hopefully also with the project's requirements). That feeling of mastery is itself one of the joys of writing code.

Without that mastery, you also lose one of the second-order joys of writing code that many here have already mentioned in these comments: flow. Delegation means fumbling in a way that working in your own context just doesn't. :-\


I think all programmers are like LEGO builders. But different programmers will see each brick as a different kind of abstraction. A hacker kind of programmer may see each line of code as a brick. An architect kind of programmer may see different services as a brick. An entrepreneur kind of programmer may see entire applications as a brick. These aren't mutually exclusive, of course. But we all just like to build things, the abstractions we use to build them just differ.

This is exactly the way I see it. You can always get better performance at lower levels of abstraction, but there are trade-offs. Sometimes the trade-offs are worth it (like building bigger things), and sometimes they aren't (it's a buggy mess).

Indeed. My response was: actually, no, if I think about it I really don't think it was "building" at all. I would have started fewer things, and seen them through more consistently, if it were about "building". I think it has far more to do with personal expression.

("Solving a problem for others" also resonates, but I think I implement that more by tutoring and mentoring.)


> … not all programmers program for the same reason, for some of us, LLMs helps a lot, and makes things even more fun. For others, LLMs remove the core part of what makes programming fun for them. Hence we get this constant back and forth of "Can't believe others can work like this!" vs "I can't believe others aren't working like this!", but both sides seems to completely miss the other side.

Unfortunately the job market does not demand both types of programmer equally: Those who drive LLMs to deliver more/better/faster/cheaper are in far greater demand right now. (My observation is that a decade of ZIRP-driven easy hiring paused the natural business cycle of trying to do more with fewer employees, and we’ve been seeing an outsized correction for the past few years, accelerated by LLM uptake.)


> Unfortunately the job market does not demand both types of programmer equally: Those who drive LLMs to deliver more/better/faster/cheaper are in far greater demand right now.

I doubt that the LLM drivers deliver something better; quite the opposite. But I guess managers will only realize this when it's too late: and of course they won't take any responsibility for this.


> I doubt that the LLM drivers deliver something better…

That is your definition of “better”. If we’re going to trade our expertise for coin, we must ask ourselves if the cost of “better” is worth it to the buyer. Can they see the difference? Do they care?


HN: "Why should we craft our software well? Our employers don't care or reward us for it."

Also HN: "Why does all commercial software seem to suck more and more as time goes on?"


> if the cost of “better” is worth it to the buyer. Can they see the difference? Do they care?

This is exactly the phenomenon of markets for "lemons":

> https://en.wikipedia.org/wiki/The_Market_for_Lemons

(for the HN readers: a related concept is "information asymmetry in markets").

George Akerlof (the author of this paper), Michael Spence and Joseph Stiglitz got a Nobel Memorial Prize in Economic Sciences in 2001 for their analyses of markets with asymmetric information.


Good points. I'm a 'solve the problem' person, so rarely get into language wars, editor wars, etc... I just don't care as long as the problem is solved in a way that meets the needs of the user.

I've worked with all the types, and no type is wrong. For example, I can certainly appreciate the PL researcher type who wants to make everything functional, etc... I won't fight against it as long as it doesn't get in the way of solving the problem. I've also found that my style works well with the other styles because I have a way of always asking "so does this solve the problem??", which is sometimes forgotten by the "code is beautiful" people, etc...


I’m better at code than prose, so coding via an agent is frustrating. Rather than making multiple attempts to achieve the desired results, I’d rather just write it once, with the precision and nuance that I want. I’d be interested to try a “dueling pianos” style approach where I can cooperate with an agent indirectly through the code, rather than a lower-fidelity option.

Who’s saying you can’t enjoy the typing of letters, numbers, and symbols into a computer? The issue is that this is getting to be a less economically valuable activity.

You wouldn’t say, “It’s not that they hate electricity it’s just that they love harpooning whales and dying in the icy North Atlantic.”

You can love it all you want but people won’t pay you to do it like they used to in the good old days.


> For others, LLMs remove the core part of what makes programming fun for them.

Anecdotally, I’ve had a few coworkers go from putting themselves firmly in this category to saying “this is the most fun I’ve ever had in my career” in the last two months. The recent improvement in models and coding agents (Claude Code with Opus 4.5 in our case) is changing a lot of minds.


Yeah, I'd put myself in this camp. My trust is slowly going up, and coupled with improved guardrails (more tests, static analysis, refactoring to make reviewing easier), that increasing trust is giving me more and more speed at going from thought ("hmm, I should change how this feature works to be like X") to deployment into the hands of my customers.

For me it's not the typing that is satisfying, but rather building the program in my head first, and internally validating. Describing to an LLM how to output the program I built in my head just isn't possible without taking more time than it does to write the code myself.

Dead on and well said

Almost more important: the people who pay you to build software don’t care whether you type it or enjoy it; they pay you for an output of working software.

Literally nothing is stopping people from writing assembly in their free time for fun

But the number of people who are getting paid to write assembly is probably less than 1000


yep there's all types of people. i get hung up on the structure and shape of a source file, like it's a piece of art. if it looks ugly, even if it works, i don't like it. i've seen some llm code that i like the shape of but i wouldn't like to use it verbatim since i didn't create it.

I think both of you are correct.

LLMs do empower you (and by "you" I mean the reader or any other person from now on) to actually complete projects you need in the very limited free time you have available. Manually coding the same could take months (I'm speaking from experience developing a personal project for about 3 hours every Friday and there's still much to be done). In a professional context, you're being paid to ship, and AI can help you grow an idea to an MVP and then to a full implementation in record-breaking time. At the end of the day, you're satisfied because you built something useful and helped your company. You probably also used your problem solving skills.

Programming is also a hobby though. The whole process matters too. I'm one of the people who feels incredible joy when achieving a goal, knowing that I completed every step in the process with my own knowledge and skills. I know that I went from an idea to a complete design based on everything I know and probably learned a few new things too. I typed the variable names, I worked hard on the project for a long time and I'm finally seeing the fruits of my effort. I proudly share it with other people who may need the same and can attest its high quality (or low quality if it was a stupid script I hastily threw together, but anyway sharing is caring —the point is that I actually know what I've written).

The experience of writing that same code with an LLM will leave you feeling a bit empty. You're happy with the result: it does everything you wanted and you can easily extend it when you feel like it. But you didn't write the code, someone else did. You just reviewed an intern's work and gave feedback. Sometimes that's indeed what you want. You may need a tool for your job or your daily life, but you aren't too interested in the internals. AI is truly great for that.

I can't reach a better conclusion than the parent comment, everyone is unique and enjoys coding in a different way. You should always find a chance to code the way you want, it'll help maintain your self-esteem and make your life interesting. Don't be afraid of new technologies where they can help you though.


For me its the feeling of true understanding and discovery. Not just of how the computer works, but how whatever problem domain I'm making software for works. It's model building and simulation of the world. To the degree I can use the LLM to teach me to solve the problem better than I could before I like it, to the degree it takes over and obscures the understanding from me, I despise it. I don't love computers because of how fast I can create shareholder value, that's for sure.

> I think there is a section of programmer who actually do like the actual typing of letters, numbers and special characters into a computer, and for them, I understand LLMs remove the fun part.

I've "vibe coded" a ton of stuff and so I'm pretty bullish on LLMs, but I don't see a world where "coding by hand" isn't still required for at least some subset of software. I don't know what that subset will be, but I'm convinced it will exist, and so there will be ample opportunities for programmers who like that sort of thing.

---

Why am I convinced hand-coding won't go away? Well, technically I lied, I have no idea what the future holds. However, it seems to me that an AI which could code literally anything under the sun would almost by definition be that mythical AGI. It would need to have an almost perfect understanding of human language and the larger world.

An AI like that wouldn't just be great at coding, it would be great at everything! It would be the end of the economy, and scarcity. In which case, you could still program by hand all you wanted because you wouldn't need to work for a living, so do whatever brings you joy.

So even without making predictions about what the limitations of AI will ultimately be, it seems to me you'll be able to keep programming by hand regardless.


I don't see how an AGI coder will end scarcity; it will simply debase knowledge work. Physical things we need, like housing, are still scarce.

The AGI can build robots that build houses. It has a virtually unlimited amount of working time to dedicate to the robotics engineering problems.

We'd still be limited to some extent by raw materials and land but it would be much less significant.


That can trim costs but not drive it to zero. If you assume that the computer is going to do all the work, won't your salary erode, making it harder for you to afford scarce things?

Yeah, not all painters were happy with the transition to photography.

> For some people, the "fire" is literally about "I control a computer", for others "I'm solving a problem for others", and yet for others "I made something that made others smile/cry/feel emotions" and so on.

For the latter two, that's a minimum-wage job when LLMs produce your software, if that.


This article is not about whether programming is fun, elegant, creative, or personally fulfilling.

It is about business value.

Programming exists, at scale, because it produces economic value. That value translates into revenue, leverage, competitive advantage, and ultimately money. For decades, a large portion of that value could only be produced by human labor. Now, increasingly, it cannot be assumed that this will remain true.

Because programming is a direct generator of business value, it has also become the backbone of many people’s livelihoods. Mortgages, families, social status, and long term security are tied to it. When a skill reliably converts into income, it stops being just a skill. It becomes a profession. And professions tend to become identities.

People do not merely say “I write code.” They say “I am a software engineer,” in the same way someone says “I am a pilot” or “I am a police officer.” The identity is not accidental. Programming is culturally associated with intelligence, problem solving, and exclusivity. It has historically rewarded those who mastered it with both money and prestige. That combination makes identity attachment not just likely but inevitable.

Once identity is involved, objectivity collapses.

The core of the anti AI movement is not technical skepticism. It is not concern about correctness, safety, or limitations. Those arguments are surface rationalizations. The real driver is identity threat.

LLMs are not merely automating tasks. They are encroaching on the very thing many people have used to define their worth. A machine that can write code, reason about systems, and generate solutions challenges the implicit belief that “this thing makes me special, irreplaceable, and valuable.” That is an existential threat, not a technical one.

When identity is threatened, people do not reason. They defend. They minimize. They selectively focus on flaws. They move goalposts. They cling to outdated benchmarks and demand perfection where none was previously required. This is not unique to programmers. It is a universal human response to displacement.

The loudest opponents of AI are not the weakest programmers. They are often the ones most deeply invested in the idea of being a programmer. The ones whose self concept, status, and narrative of personal merit are tightly coupled to the belief that what they do cannot be replicated by a machine.

That is why the discourse feels so dishonest. It is not actually about whether LLMs are good at programming today. It is about resisting a trend line that points toward a future where the economic value of programming is increasingly detached from human identity.

This is not a moral failing. It is a psychological one. But pretending it is something else only delays adaptation.

AI is not attacking programming. It is attacking the assumption that a lucrative skill entitles its holder to permanence. The resistance is not to the technology itself, but to the loss of a story people tell themselves about who they are and why they matter.

That is the real conflict. HN is littered with people facing this conflict.


I wrote something similar earlier:

This is because they have entrenched themselves in a comfortable position that they don’t want to give up.

Most won’t admit this to be the actual reason. Think about it: you are a normal, hands-on, self-taught software developer. You grew up tinkering with Linux and a bit of hardware. You realise there’s good money to be made in a software career. You do it for 20-30 years; mostly the same stuff over and over again. Some Linux, C#, networking. Your life and hobby revolve around these technologies. And most importantly, you have a comfortable and stable income that entrenches your class and status. Anything that can disrupt this state is obviously not desirable. Never mind that disrupting others' careers is why you have a career in the first place.


> disrupting others' careers is why you have a career in the first place.

Not every software project has done this. In fact I would argue many new businesses exist that didn't exist before software and computing, and people are doing things they didn't do beforehand. Especially around discovery of information - solving the "I don't know what I don't know" problem also expanded markets and demand to people who now know.

Whereas the current AI wave seems to be more about efficiency/industrialization/democratization of existing use cases than about novel things, to date. I would be more excited if I saw more "product-oriented" AI use cases beyond destroying jobs. While I'm hoping that the "vibing" of software will mean that SWEs are still needed to productionise it, I'm not confident that AI won't soon be able to do that too, or take over any other knowledge profession.

I wouldn't be surprised with AI if there's mass unemployment but we still don't cure cancer for example in 20 years.


> Not every software project has or did this. In fact I would argue many new businesses exist that didn't exist before software and computing and people are doing things they didn't beforehand.

That's exactly what I am hoping to see happen with AI.


All I can say to that is "I hope so too"; but logic is telling me otherwise at this point. Because the alternative, as evidenced by this thread, isn't all that good. The fear/dread in people since the holidays has been sad to see; it's overwhelmed everything else in tech now.

I agree, but is it bad to have this reaction? Upending people’s lives and destroying their careers is a reasonable thing to fear

It’s ok to be empathetic but they have lucrative careers because they did the same to other careers that don’t exist now.

agreed

Sure; I absolutely agree, and more to the point, SWEs and their ideologies, compared to other professions, have meant they are first on the chopping block. But what do you tell those people: that they no longer matter? Do they still matter? How will they matter? They are no different from practitioners of any other craft - humans in general derive value partly from the value they can give to their fellow man.

If the local unskilled job now matters more than a SWE, these people have gone from being worth something to society to being worth less than someone unskilled with a job. At that point, following from your logic, I can assume their long-term value is that of an unemployed person, which to some people is negative. That isn't just an identity crash; it's potentially a crash of their whole lives and livelihoods. Even smart people can be in situations where it is hard to pivot (as you say: mortgages, families, lives, etc).

I'm sure many of the SWEs here (myself included) are asking the same questions, and the answers are too pessimistic to admit publicly, or even privately. For me, the joy of coding is taken away by AI in general, in that there is no joy in doing something that a machine will soon be able to do better, for me at least.


I agree with you that the implications are bleak. For many people they are not abstract or philosophical. They are about income, stability, and the ability to keep a life intact. In that sense the fear is completely rational.

What stands out to me is that there seems to be a threshold where reality itself becomes too pessimistic to consciously accept.

At that point people do not argue with conclusions. They argue with perception.

You can watch the systems work. You can see code being written, bugs being fixed, entire workflows compressed. You can see the improvement curve. None of this is hidden. And yet people will look straight at it and insist it does not count, that it is fake, that it is toy output, that it will never matter in the real world. Not because the evidence is weak, but because the implications are unbearable.

That is the part that feels almost surreal. It is not ignorance. It is not lack of intelligence. It is the mind refusing to integrate a fact because the downstream consequences are too negative to live with. The pessimism is not in the claim. It is in the reality itself.

Humans do this all the time. When an update threatens identity, livelihood, or future security, self deception becomes a survival mechanism. We selectively ignore what we see. We raise the bar retroactively. We convince ourselves that obvious trend lines somehow stop right before they reach us. This is not accidental. It is protective.

What makes it unsettling is seeing it happen while the evidence is actively running in front of us. You are holding reality in one hand and watching people try to look away without admitting they are looking away. They are not saying “this is scary and I do not know how to cope.” They are saying “this is not real,” because that is easier.

So yes, the questions you raise are the real ones. Do people still matter. How will they matter. What happens when economic value shifts faster than lives can adapt. Those questions are heavy, and I do not think anyone has clean answers yet.

But pretending the shift is not happening does not make the answers kinder. It just postpones the reckoning.

The disturbing thing is not that reality is pessimistic. It is that at some point reality becomes so pessimistic that people start editing their own perception of it. They unsee what is happening in order to preserve who they think they are.

That is the collision we are watching. And it is far stranger than a technical debate about code quality.


Whether you look away or embrace it doesn’t matter though. We’re all going to be unemployed. It sucks.

Yeah I'm talking about HN, where the viewpoints are so divided. There are people here who are telling you not to worry and that it doesn't suck.

Why do you say this subjective thing so confidently? Does believing what you just wrote make you feel better?

Have you considered that there are people who actually just enjoy programming by themselves?


Isn't this common on HN? People with subjective opinions voice their subjective opinions confidently. People who disagree calmly state they disagree and also state why.

The question is more about why my post triggered you... why would my simple opinion trigger you? Does disagreement trigger you? If I said something that is obviously untrue that you disagreed with, for example: "The world is flat." Would this trigger you? I don't think it would. So why was my post different?

Maybe this is more of a question you should ask yourself.


I have been reading over your comments for nearly two hours now. I find your writing suspiciously sterile, and I have trouble understanding how any one person could produce so much of it in such a small space of time.

The content is overwhelmingly compelling, and I think that any thinking person would have difficulty disagreeing with you.

I begin to wonder, cynically, that someone might enjoy presenting such a devastating, radically destabilizing picture.

Do you know how this thing works, that you can confidently claim, or even dare plant the seed in someone else's mind, that replacement is inevitable?


Excellent comment (even "mini essay"). I'm unsure if you've written it with AI-assistance, but even if that's the case, I'll tolerate it.

I have two things to add.

> This is not a moral failing. It is a psychological one.

(1) I disagree: it's not a failing at all. Resisting displacement, resisting that your identity, existence, meaning found in work, be taken away from you, is not a failing.

Such resistance might be futile, yes; but that doesn't make it a failing. If said resistance won, then nobody would call it a failing.

The new technology might just win, and not adapting to that reality, refusing that reality, could perhaps be called a failing. But it's also a choice.

For example, if software engineering becomes a role to review AI slop all day, then it simply devolves, for me, into just another job that may be lucrative but has zero interest for me.

(2) You emphasize identity. I propose a different angle: meaning, and intrinsic motivation. You mention:

> economic value of programming is increasingly detached from human identity

I want to rephrase it: what has been meaningful to me thus far remains meaningful, but it no longer allows me to make ends meet, because my tribe no longer appreciates when I act out said activity that is so meaningful to me.

THAT is the real tragedy. Not the loss of identity -- which you seem to derive from the combination of money and prestige (BTW, I don't fully dismiss that idea). Those are extrinsic motivations. It's the sudden unsustainability of a core, defining activity that remains meaningful.

The whole point of all these AI-apologist articles is that "it has happened in the past, time and again; humanity has always adapted, and we're now better off for it". Never mind those generations that got walked over and fell victim to the revolution of the day.

In other words, the AI-apologists say, "don't worry, you'll either starve (which is fine, it has happened time and again), or just lose a large chunk of meaning in your life".

Not resisting that is what would be a failing.


I think where we actually converge is on the phenomenon itself rather than on any moral judgment about it.

What I was trying to point at is how strange it is to watch this happen in real time. You can see something unfolding directly in front of you. You can observe systems improving, replacing workflows, changing incentives. None of it is abstract. And yet the implications of what is happening are so negative for some people that the mind simply refuses to integrate them. It is not that the facts are unknown. It is that the outcome is psychologically intolerable.

At that point something unusual happens. People do not argue with conclusions, they argue with perception. They insist the thing they are watching is not really happening, or that it does not count, or that it will somehow stop before it matters. It is not a failure of intelligence or ethics. It is a human coping mechanism when reality threatens meaning, livelihood, or future stability.

Meaning and intrinsic motivation absolutely matter here. The tragedy is not that meaningful work suddenly becomes meaningless. It is that it can remain meaningful while becoming economically unsustainable. That combination is brutal. But denying the shift does not preserve meaning. It only delays the moment where a person has to decide how to respond.

What I find unsettling is not the fear or the resistance. It is watching people stand next to you, looking at the same evidence, and then effectively unsee it because accepting it would force a reckoning they are not ready for.

>I'm unsure if you've written it with AI-assistance, but even if that's the case, I'll tolerate it.

Even if it was, the world is changing. You already need to tolerate AI in code, it's inevitable AI will be part of writing.


> the outcome is psychologically intolerable [...] People do not argue with conclusions, they argue with perception [...] accepting it would force a reckoning they are not ready for

https://en.wikipedia.org/wiki/Cognitive_dissonance

Or perhaps, a form of grief.

> denying the shift does not preserve meaning

I think you meant to write:

"denying the shift does not preserve sustainability"

as "meaning" need not be preserved by anything. The idea here is that meaning -- stemming from the profession being supplanted -- is axiomatic.

And with that correction applied, I agree -- to an extent anyway. I hope that, even if (or "when") the mainstream gets swayed by AI, pockets / niches of "hand-crafting" remain sustainable. We've seen this with other professions that used to be mainstream but have been automated away at large scale.


Very good comment!

It's just a reiteration of the age-old conflict in arts:

- making art as you think it should be, but at the risk of it being non-commercial

- getting paid for doing commercial/trendy art

choose one


People who love thinking in false dichotomies like this one have absolutely no idea how much harder it is to “get paid for doing commercial/trendy art”.

It’s so easy to be a starving artist; and in the world of commercial art it’s bloody dog-eat-dog jungle, not made for faint-hearted sissies.


I've given this quite some thought and came to the conclusion that there is actually no choice, and all parties fall into the first category. It's just that some people intrinsically like working on commercial themes, or happen to be trendy.

Of course there are some artists who sit comfortably in the grey area between the two oppositions, and for these a little nudging towards either might influence things. But for most artists, their ideas or techniques are simply not relevant to a larger audience.


> and all parties fall into the first category [...] Of course there are some artists who sit comfortably in the grey area between the two oppositions

I'm not sure what your background is, but there are definitely artists out there drawing, painting and creating art they have absolutely zero care for, or are even actively against or don't like, but they do it anyway because it's easier to actually get paid doing those things than others.

Take a look in the current internet art community and ask how many artists are actively liking the situation of most of their art commissions being "furry lewd art", vs how many commissions they get for that specific niche, as just one example.

History has lots of other examples, where artists typically have a day-job of "Art I do but do not care for" and then like the programmer, hack on what they actually care about outside of "work".


Agreed, but I'd say these would be artists in the "grey area". They are capable of drawing furry art, for example, and have the choice to monetize that, even though they might have become bored with it.

I was mostly considering contemporary artists that you see in museums, and not illustrators. Most of these have moved on to different media, and typically don't draw or paint. They would therefore also not be able to draw commission pieces. And most of the time their work does not sell well.

(Source: am professionally trained artist, tried to sell work, met quite a few artists, thought about this a lot. That's not to say that I may still be completely wrong though, so I liked reading your comment!)

Edit: and of course things get way more complicated and nuanced when you consider gallerists pushing existing artists to become trendy, and artists who are only "discovered" after their deaths, etc. etc.)


Yeah, but I guess wider. It's like the discussion would turn into "Don't use oil colors, then you don't get to do the fun process of mixing water and color together to get it just perfect", while maybe some artists don't think that's the fun process, and all the other categories, all mixed together, and everyone thinks their reason for doing it is the reason most people do it.

With LLMs, if you did the first in the past, then no matter what license you chose, your work is now in the second category, except you don't get a dime.

It's not.

It's:

- Making art because you enjoy working with paint

- Making art because you enjoy looking at the painting afterward


> do like the actual typing of letters, numbers and special characters into a computer

and from the first line of the article:

> I love writing software, line by line.

I've said it before and I'll say it again: I don't write programs "line by line" and typing isn't programming. I work out code in the abstract away from the keyboard before typing it out, and it's not the typing part that is the bottleneck.

Last time I commented this on HN, I said something like "if an AI could pluck these abstract ideas from my head and turn them into code, eliminating the typing part, I'd be an enthusiastic adopter", to which someone predictably said something like "but that's exactly what it does!". It absolutely is not, though.

When I "program" away from the keyboard I form something like a mental image of the code, not of the text but of the abstract structure. I struggle to conjure actual visual imagery in my head (I "have aphantasia" as it's fashionable to say lately), which I suspect is because much of my visual cortex processes these abstract "images" of linguistic and logical structures instead.

The mental "image" I form isn't some vague, underspecified thing. It corresponds directly to the exact code I will write, and the abstractions I use to compartmentalise and navigate it in my mind are the same ones that are used in the code. I typically evaluate and compare many alternative possible "images" of different approaches in my head, thinking through how they will behave at runtime, in what ways they might fail, how they will look to a person new to the codebase, how the code will evolve as people make likely future changes, how I could explain them to a colleague, etc. I "look" at this mental model of the code from many different angles and I've learned only to actually start writing it down when I get the particular feeling you get when it "looks" right from all of those angles, which is a deeply satisfying feeling that I actively seek out in my life independently of being paid for it.

Then I type it out, which doesn't usually take very long.

When I get to the point of "typing" my code "line by line", I don't want something that I can give a natural language description to. I have a mental image of the exact piece of logic I want, down to the details. Any departure from that is a departure from the thing that I've scrutinised from many angles and rejected many alternatives to. I want the exact piece of code that is in my head. The only way I can get that is to type it out, and that's fine.

What AI provides, and it is wildly impressive, is the ability to specify what's needed in natural language and have some code generated that corresponds to it. I've used it and it really is very, very good, but it isn't what I need because it can't take that fully-specified image from my head and translate it to the exact corresponding code. Instead I have to convert that image to vague natural language, have some code generated and then carefully review it to find and fix (or have the AI fix) the many ways it inevitably departs from what I wanted. That's strictly worse than just typing out the code, and the typing doesn't even take that long anyway.

I hope this helps to understand why, for me and people like me, AI coding doesn't take away the "line-by-line part" or the "typing". We can't slot it into our development process at the typing stage. To use it the way you are using it we would instead have to allow it to replace the part that happens (or can happen) away from the keyboard: the mental processing of the code. And many of us don't want to do that, for a wide variety of reasons that would take a whole other lengthy comment to get into.


That’s because you belong to the subset of software engineers who know what they’re doing and care about rigour and so on.

There are many whose thinking is not as deep or sharp as yours. LLMs are welcomed by them, but come at a tremendous cost to their cognition and to the future well-being of the firm's code base. Because this cost is implicit and not explicit, it doesn't occur to them.


Companies don't care about you or any other developer. You shouldn't care about them or their future well-being.

> Because this cost is implicit and not explicit it doesn’t occur to them.

Your arrogance and naiveté blind you to the fact that it does occur to them, but because they have a better understanding of the world and their position in it, they don't care. That's a rational and reasonable position.


> they have a better understanding of the world and their position in it.

Try not to use better/worse when advocating so vociferously. As described by the parent, they are short-term pragmatic, that is all. This discussion could open up into a much bigger worldview debate, where different groups have strengths and weaknesses along this pragmatic/idealistic axis.

"Companies" are not a monolith, both laterally between other companies, and what they are composed of as well. I'd wager the larger management groups can be pragmatic, where the (longer lasting) R&D manager will probably be the most idealistic of the firm, mainly because of seeing the trends of punching the gas without looking at long-term consequences.


Companies are monolithic in this respect and the idealism of any employee is tolerated only as long as it doesn't impact the bottom line.

> Try not to use better/worse when advocating so vociferously.

Hopefully you see the irony in your comment.


Exactly, detecting and correcting with breakneck efficiency.

No, they just have a different job than I do and they (and you, I suspect) don't understand the difference.

Software engineers are not paid to write code, we're paid to solve problems. Writing code is a byproduct.

Like, my job is "make sure our customers' accounts are secure". Sometimes that involves writing code, sometimes it involves drafting policy, sometimes it involves presentations or hashing out ideas. It's on me to figure it out.

Writing the code is the easy part.


> Like, my job is "make sure our customers' accounts are secure".

This is naiveté. Secure customer accounts, and the work to implement them, are tolerated by the business only while they are necessary to increase profits. Your job is not to secure customer accounts, but to spend the least amount of money to produce a level of account security that will not affect the bottom line. If insecure accounts were tolerated or became profitable, that would be the immediate goal and your job description would pivot on a dime.

Failure to understand this means you don't understand your role, employer, or industry.


> Your job is not to secure customer accounts, but to spend the least amount of money to produce a level of account security that will not affect the bottom line

I completely agree with every line of this statement. That is literally the job.

Of course I balance time/cost against risk. That's what engineers do. You don't make every house into a concrete bunker because it's "safer"; that's expensive and unnecessary. You also don't engineer buildings for hurricanes in California. You do secure against earthquakes, because that's a likely risk.

Engineers are paid for our judgement, not our LOC. Like I said.


> I've used it and it really is very, very good, but it isn't what I need because it can't take that fully-specified image from my head and translate it to the exact corresponding code. Instead I have to convert that image to vague natural language, have some code generated and then carefully review it to find and fix (or have the AI fix) the many ways it inevitably departs from what I wanted.

I agree with this. The hard part of software development happens when you're formulating the idea in your head, planning the data structures and algorithms, deciding what abstractions to use, deciding what interfaces look like--the actual intellectual work. Once that is done, there is the unpleasant, slow, error-prone part: translating that big bundle of ideas into code while outputting it via your fingers. While LLMs might make this part a little faster, you're still doing a slow, potentially-lossy translation into English first. And if you care about things other than "does it work," you still have a lot of work to do post-LLM to clean things up and make it beautiful.

I think it still remains to be seen whether idea -> natural language -> code is actually going to be faster or better than idea -> code. For unskilled programmers it probably already is. For experts? The jury may still be out.


> I work out code in the abstract away from the keyboard before typing it out, and it's not the typing part that is the bottleneck.

Funny thing. I tend to agree, but I think it wouldn't look that way to an outside observer. When I'm typing in code, it's typically at a pretty low fraction of my general typing speed — because I'm constantly micro-interrupting myself to doubt the away-from-keyboard work, and refine it in context (when I was "working in the abstract", I didn't exactly envision all the variable names, for example).


I'm like you. I get on famously with Claude Code and the Opus 4.5 2025.11 update.

Give it a first pass from a spec. Since you know how it should be shaped, you can give an initial steer, but focus on features first, and build with testability.

Then refactor, with examples in prompts, until it lines up. You already have the tests, so the AI can ensure it doesn't break anything.

Beat it up more and you're done.
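
A minimal sketch of what "build with testability" might look like in practice, assuming Python and pytest (the parse_discount function and its coupon rules are made up for illustration, not something from the comment above):

    # Hypothetical example: pin the behaviour you care about in tests first,
    # so later AI-driven refactors can be checked against something concrete.
    import pytest

    def parse_discount(code: str) -> float:
        """Return the discount fraction for a coupon code."""
        table = {"WELCOME10": 0.10, "VIP25": 0.25}
        if code not in table:
            raise ValueError(f"unknown coupon: {code}")
        return table[code]

    def test_known_codes():
        assert parse_discount("WELCOME10") == 0.10
        assert parse_discount("VIP25") == 0.25

    def test_unknown_code_is_rejected():
        with pytest.raises(ValueError):
            parse_discount("BOGUS")

With tests like these in place before the "refactor until it lines up" step, the agent can churn on the internals and a failing test flags the moment it drifts from the intended behaviour.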


> focus on features first, and build with testability.

This is just telling me to do this:

> To use it the way you are using it we would instead have to allow it to replace the part that happens (or can happen) away from the keyboard: the mental processing of the code.

I don't want to do that.


I feel like some of these proponents act as though a poet's goal is to produce an anthology of poems, and that the poet should be happy to act as publisher and editor, sifting through the outputs of some LLM stanza generator.

The entire idea of using natural language for composite or atomic command units is deeply unsettling to me. I see language as an unreliable abstraction even with human partners that I know well. It takes a lot of work to communicate anything nuanced, even with vast amounts of shared context. That's the last thing I want to add between me and the machine.

What you wrote further up resonates a lot with me, right down to the aphantasia bit. I also lack an internal monologue. Perhaps because of these, I never want to "talk" to a device as a command input. Regardless of whether it is my compiler, smartphone, navigation system, alarm clock, toaster, or light switch, issuing such commands is never going to be what I want. It means engaging an extra cognitive task to convert my cognition back into words. I'd much rather have a more machine-oriented control interface where I can be aware of a design's abstraction and directly influence its parameters and operations. I crave the determinism that lets me anticipate the composition of things and nearly "feel" the transitive properties of a system. Natural language doesn't work that way.

Note, I'm not against textual interfaces. I actually prefer the shell prompt to the GUI for many recurring control tasks. But typing works for me and speaking would not. I need editing to construct and proof-read commands, which may not come out of my mind and hands with the linearity the command buffer assumes. I prefer symbolic input languages where I can more directly map my intent onto the unambiguous, structured semantics of the chosen tool. I also want conventional programming syntax, with unambiguous control flow and computed expressions for composing command flows. I do not want the vagaries of natural language interfering here.


> I think there is a section of programmer who actually do like the actual typing of letters, numbers and special characters into a computer...

This sounds like an alien trying and failing to describe why people like creating things. No, the typing of characters on a keyboard has no special meaning, and neither does dragging a brush across a canvas or pulling thread through fabric. It's the primitive desire to create something with your own hands. Have people using AI magically lost all understanding of creativity or creation, so that everything has to be utilitarian and business?


My entire point is that people are different. For some people (read through the other comments), it's quite literally about the typing of characters, or dragging a brush across the canvas. Sure, that might not be the point for you, but the entire point of my comment is that just because it's "obviously because of X" for you, that doesn't mean it's like that for others.

Sometimes I like to make music because I have an idea of the final result, and I wanna hear it like that. Other times, I make music because I like the feeling of turning a knob and striking keys at just the right moment, and it gives me a feeling of satisfaction. For others, it's about sharing an emotion via music. Does this mean some of us are "making music for the wrong reasons"? I'd claim no.


No, they're right. Your description is what you get from outsiders who don't understand what they're seeing.

In a creative process, when you really know your tools, you start being able to go from thought to result without really having to think about the tools. The most common example when it comes to computers would be touch-typing - when your muscle memory gets so good you don't think about the keyboard at all anymore, your hands "know" what to do to get your thoughts down. But for those of us with enough experience in the programming languages and editor/IDE we use, the same thing can happen - going from thought to code is nearly effortless, as is reading code, because we don't need to think about the layers in between anymore.

But this only works when those tools are reliable, when we know they'll do exactly what we expect. AI tooling isn't reliable: It introduces two lossy translation layers (thought -> English and English -> code) and a bunch of waiting in the middle that breaks any flow. With faster computers maybe we can eliminate the waiting, but the reliability just isn't there.

This applies to music, painting, all sorts of creative things. Sure there's prep time beforehand with physical creation like painting, but when someone really gets into the flow it's the same: they're not having to think about the tools so much as getting their thoughts into the end result. The tools "disappear".

> Other times, I make music because I like the feeling of turning a knob, and striking keys at just the right moment, and it gives me a feeling of satisfaction.

But I'll bet you're not thinking "I like turning this knob" at the moment you're doing it; I'll bet you're thinking "Increase the foo" (and if you're like me it's probably more like knowing that fact without forming the words), and the knob's immediate visceral feedback is where the satisfaction comes from, because you're increasing the foo without having to think about how to do it - in part because of how reliable it is.


I bet you also sometimes like to make music because the final result emerges from your intimate involvement with striking keys, no? That's the suggestion.

Let me get this right. You're telling me that in your personal experience, you don't abstract away low-level actions like pressing the keys of your instrument or typing on the keyboard? You're genuinely telling me you derive as much pleasure from the feel of the keys as from the music itself?

Nah bro, most of us learn touch typing and musical instrument finger exercises etc. when starting out; it's usually abstracted away once we get competent.

AI takes away the joy of creation, not the low-level actions. That's like abstracting twice over.


Do you enjoy the process of creating a solution more than the actual solution?

This is the main difference why people argue against LLMs in programming.

I'm in the "I want to solve a problem" end of the spectrum. Many others are in the "I want the code to be elegant, maintanable and beautifully crafted - oh, yeah, the problem might be solved too" end.


I don't think these characterizations in either direction are very helpful; I understand they come from someone trying to make sense of why their ingrained notion of what creativity means, and of the "right" way to build software projects, is not shared by other people.

I use CC for both business and personal projects. In both cases: I want to achieve something cool. If I do it by hand, it is slow: I will need to learn something new, which takes too much time, and often the thing(s) I need to learn are not interesting to me (at the time). Additionally, I am slow and perpetually unhappy with the abstractions and design choices I make, despite trying very hard to think through them. With CC: it can handle the parts of the project I don't want to deal with, it can help me learn the things I do want to learn, and it can execute quickly so I can try more things and fail fast.

What's lamentable is the conclusion that "if you use AI it is not truly creative" ("have people using AI lost all understanding of creativity or creation?" is a bit condescending).

In other threads, the grievance from the AI-skeptic crowd is more or less that AI enthusiasts "threaten or bully" people who are not enthusiastic, telling them they will get "punished" or fall behind. Yet at the same time, AI skeptics seem to routinely make passive-aggressive implications that they are the ones truly Creating Art and are the true Craftsmen, as if this venture were some elitist art form that should be gatekept by all of you True Programmers (TM).

I find these takes (1) condescending, (2) wrong, and betraying a lack of imagination about what others may find genuinely enjoyable and inspiring, and (3) just as much of a straw man as their gripes about others "bullying" them into using AI.


> I think there is a section of programmer who actually do like the actual typing of letters, numbers and special characters into a computer

but luckily for us, we can still do that, and it's just as fun as it ever was. LLMs don't take anything away from the fun of actually writing code, unless you choose to let them.

if anything the LLMs make it more fun, because the boring bits can now be farmed out while you work on the fun bits. no, i don't really want to make another CRUD UI, but if the project i'm working on needs one i can just let claude code do that for me while i go back to working on the stuff that's actually interesting.


I think the downside is that developers who love the act of coding managed to accomplish several things at once: they got to code, and create things, and get paid a lot for doing it.

AI coding makes creating things far more efficient (as long as you use AI), and will likely mean you don't get paid much (unless you use AI).

You can still code for the fun of it, but you don't get the ancillary benefits.



