More

jbjbjbjb · 2026-02-05T23:20:13 1770333613

It’s cool but there’s a good chance it’s just copying someone else’s homework albeit in an elaborate round about way.

nomel · 2026-02-05T23:43:14 1770334994

I would claim that LLMs desperately need proprietary code in their training, before we see any big gains in quality.

There's some incredible source available code out there. Statistically, I think there's a LOT more not so great source available code out there, because the majority of output of seasoned/high skill developers is proprietary.

To me, a surprising portion of Claude 4.5 output definitely looks like student homework answers, because I think that's closer to the mean of the code population.

dcre · 2026-02-06T03:30:52 1770348652

This is dead wrong: essentially the entirety of the huge gains in coding performance in the past year have come from RL, not from new sources of training data.

I echo the other commenters that proprietary code isn’t any better, plus it doesn’t matter because when you use LLMs to work on proprietary code, it has the code right there.

thesz · 2026-02-06T06:42:43 1770360163

  > the huge gains in coding performance in the past year have come from RL, not from new sources of training data.

This one was on HN recently: https://spectrum.ieee.org/ai-coding-degrades

Author attributes past year's degradation of code generation by LLMs to excessive use of new source of training data, namely, users' code generation conversations.

dcre · 2026-02-06T13:46:57 1770385617

Yeah, this is a bullshit article. There is no such degradation, and it’s absurd to say so on the basis of a single problem which the author describes as technically impossible. It is a very contrived under-specified prompt.

And their “explanation” blaming the training data is just a guess on their part, one that I suspect is wrong. There is no argument given that that’s the actual cause of the observed phenomenon. It’s a just-so story: something that sounds like it could explain it but there’s no evidence it actually does.

My evidence is that RL is more relevant is that that’s what every single researcher and frontier lab employee I’ve heard speak about LLMs in the past year has said. I have never once heard any of them mention new sources of pretraining data, except maybe synthetic data they generate and verify themselves, which contradicts the author’s story because it’s not shitty code grabbed off the internet.

elevation · 2026-02-06T03:58:40 1770350320

> it doesn’t matter because when you use LLMs to work on proprietary code, it has the code right there

The quality of the existing code base makes a huge difference. On a recent greenfield effort, Claude emitted an MVP that matched the design semantics, but the code was not up to standards. For example, it repeatedly loaded a large file into memory in different areas where it was needed (rather than loading once and passing a reference.)

However, after an early refactor, the subsequently generated code vastly improved. It honors the testing and performance paradigms, and it's so clean there's nothing for the linter to do.

nextos · 2026-02-06T04:32:38 1770352358

Progress with RL is very interesting, but it's still too inefficient. Current models do OK on simple boring linear code. But they output complete nonsense when presented with some compact but mildly complex code, e.g. a NumPyro model with some nesting and einsums.

For this reason, to be truly useful, model outputs need to be verifiable. Formal verification with languages like Dafny , F*, or Isabelle might offer some solutions [1]. Otherwise, a gigantic software artifact such as a compiler is going to have a critical correctness bugs with far-fetched consequences if deployed in production.

Right now, I think treating a LLM like something different than a very useful information retrieval system with excellent semantic capabilities is not something I am comfortable with.

[1] https://risemsr.github.io/blog/2026-02-04-nik-agentic-pop

dcre · 2026-02-06T13:50:58 1770385858

Human-written compilers have bugs too! It takes decades of use to iron them out, and we’re introducing new ones all the time.

bearjaws · 2026-02-06T01:59:52 1770343192

I will say many closed source repos are probably equally as poor as open source ones.

Even worse in many cases because they are so over engineered nobody understands how they work.

hirvi74 · 2026-02-06T03:56:57 1770350217

I firmly agree with your first sentence. I can just think about the various modders that have created patches and performance enhancing mods for games with budgets of tens to hundreds of millions of dollars.

But to give other devs and myself some grace, I do believe plenty of bad code can likely be explained by bad deadlines. After all, what's the Russian idiom? "There is nothing more permanent than the temporary."

typ · 2026-02-06T01:50:58 1770342658

I'd bet, on average, the quality of proprietary code is worse than open-source code. There have been decades of accumulated slop generated by human agents with wildly varied skill levels, all vibe-coded by ruthless, incompetent corporate bosses.

Manouchehri · 2026-02-06T02:15:31 1770344131

There's only very niche fields where closed-source code quality is often better than open-source code.

Exploits and HFT are the two examples I can think of. Both are usually closed source because of the financial incentives.

ozim · 2026-02-06T05:06:45 1770354405

Here we can start debating what means better code.

I haven’t seen HFT code but I have seen examples of exploit codes and most of it is amateur hour when it comes to building big size systems.

They are of course efficient in getting to the goal. But exploits are one off code that is not there to be maintained.

Take8435 · 2026-02-06T02:30:01 1770345001

Not to mention, a team member is (surprise!) fired or let go, and no knowledge transfer exists. Womp, womp. Codebase just gets worse as the organization or team flails.

Seen this way too often.

hirvi74 · 2026-02-06T04:03:20 1770350600

In my time, I have potentially written code that some legal jurisdictions might classify as a "crime against humanity" due to the quality.

kortilla · 2026-02-06T04:15:14 1770351314

It doesn’t matter what the average is though. If 1% of software is open source, there is significantly more closed source software out there and given normal skills distributions, that means there is at least as much high quality closed source software out there, if not significantly more. The trick is skipping the 95% of crap.

bhadass · 2026-02-06T00:16:08 1770336968

yeah, but isn't the whole point of claude code to get people to provide preference data/telemetry data to anthropic (unless you opt out?). same w/ other providers.

i'm guessing most of the gains we've seen recently are post training rather than pretraining.

nomel · 2026-02-06T01:04:10 1770339850

Yes, but you have the problem that a good portion of that is going to be AI generated.

But, I naively assume most orgs would opt out. I know some orgs have a proxy in place that will prevent certain proprietary code from passing through!

This makes me curious if, in the allow case, Anthropic is recording generated output, to maybe down-weight it if it's seen in the training data (or something similar)?

andai · 2026-02-05T23:56:07 1770335767

Let's start with the source code for the Flash IDE :)

wvenable · 2026-02-06T00:28:04 1770337684

This is cool and actually demonstrates real utility. Using AI to take something that already exists and create it for a different library / framework / platform is cool. I'm sure there's a lot of training data in there for just this case.

But I wonder how it would fare given a language specification for a non-existent non-trivial language and build a compiler for that instead?

nmstoker · 2026-02-06T01:18:13 1770340693

If you come up with a realistic language spec and wait maybe six months, by then it'll probably be approach being cheap enough that you could test the scenario yourself!

luke5441 · 2026-02-05T23:37:26 1770334646

It looks like a much more progressed/complete version of https://github.com/kidoz/smdc-toolchain/tree/master/crates/s... . But that one is only a month old. So a bit confused there. Maybe that was also created via LLM?

nlawalker · 2026-02-06T00:59:31 1770339571

I see that as the point that all this is proving - most people, most of the time, are essentially reinventing the wheel at some scope and scale or another, so we’d all benefit from being able to find and copy each others’ homework more efficiently.

computerex · 2026-02-06T01:50:43 1770342643

And the goal post shifts.

kreelman · 2026-02-06T00:53:02 1770339182

..A small thing, but it won't compile the RISCV version of hello.c if the source isn't installed on the machine it's running on.

It is standing on the shoulders of giants (all of the compilers of the past, built into it's training data... and the recent learnings about getting these agents to break up tasks) to get itself going. Still fairly impressive.

On a side-quest, I wonder where Anthropic is getting there power from. The whole energy debacle in the US at the moment probably means it made some CO2 in the process. Would be hard to avoid?

jbjbjbjb · 2026-02-02T23:15:46 1770074146

I think you’re overstating the effect. The most volume is sold at supermarkets which have the best location for throughout but they also have the cheapest prices.

jbjbjbjb · 2026-02-02T22:52:26 1770072746

There’s only so much investable capital available, if it is going to hardware stocks it’s got to be coming from somewhere else. It’s just a substitution toward hardware tech stocks. Economics 101.

jbjbjbjb · 2026-01-28T12:21:13 1769602873

> do generalists outperform specialists?

Depends what we mean by specialist. If it frontend vs backend then maybe. If it general dev vs some specialist scientific programmer or other field where a generalist won’t have a clue then this seems like a recipe for disaster (literal disasters included).

jbjbjbjb · 2026-01-13T14:30:14 1768314614

That’s just the way “freedom news” is framing it.

Social movements don’t just happen from grassroots these days. They’re seeded by foreign states. A simpler solution would be require ids for social media posting. If you don’t provide an id you get a limited number of views.

And I don’t see anything wrong with a preventative system in principle, we should be able to join up social services information with policing, because we have had cases where a mass murderer has been known to multiple services.

Edit: probably not ids but a token that verifies my nationality would be enough.

Aurornis · 2026-01-13T14:44:06 1768315446

> A simpler solution would be require ids for social media posting

It’s strange times when even the comments on posts about government overreach are calling for more government overreach and limitations on speech and privacy.

Do you really want to have to verify your ID to post anything online, including HN?

thmsths · 2026-01-13T15:24:55 1768317895

And I am willing to bet that on top of the chilling effect on regular people, it will only act as an inconvenience for the bad actors as they will find ways to circumvent it. Controlling the online discourse is far too valuable, they are not going to just shrug and give up because the government puts up a barrier.

jbjbjbjb · 2026-01-13T16:00:36 1768320036

What chilling effect? Have you seen what people post on facebook under their own name.

mnau · 2026-01-13T18:05:08 1768327508

Meet selection bias. Self-censorship is very common.

jbjbjbjb · 2026-01-13T15:59:35 1768319975

Yeah it might be just be a verified token to say I’m citizen of the country. Doesn’t have to be my actual id. The OSA is a crappy implementation.

jbjbjbjb · 2025-12-05T10:59:07 1764932347

I think ‘git rebase —-update-refs’ is the better way to go for this scenario

lelandfe · 2025-12-05T13:06:17 1764939977

Sweet, looks like this is pretty new (2022).

Running a git command on one branch and multiple branches being affected is really unusual for me! This really does look like it is designed for just this problem, though. Simple overview: https://blog.hot-coffee.dev/en/blog/git_update_refs/

YmiYugy · 2025-12-05T13:42:36 1764942156

It breaks if you amend the top commit instead of adding a new one.

enbugger · 2025-12-05T11:47:26 1764935246

Is there any good guide on how to solve the issue which OP solves?

jbjbjbjb · 2025-12-05T13:47:41 1764942461

I was reading this the other day when I came across this feature because I’m stacking PRs recently which I don’t usually do

https://andrewlock.net/working-with-stacked-branches-in-git-...

Another commenter posted this link which was a bit more succinct

https://blog.hot-coffee.dev/en/blog/git_update_refs/

There isn’t much to it though, you just go to the branch and run git rebase with the update refs flag.

sirsuki · 2025-12-05T12:15:59 1764936959

You don’t really need docs as --update-refs does what the OP does automatically instead of manually like the OP does.

ptx · 2025-12-05T13:49:36 1764942576

How? I tried recreating the scenario from the article (the section "First rebase –onto") and ran the first rebase with "--update-refs":

  $ git checkout feature-1
  $ git rebase --update-refs main
  Successfully rebased and updated refs/heads/feature-1.
  Updated the following refs with --update-refs:
   refs/heads/feature-2-base

But all it did was update feature-2-base. It still left feature-2 pointing to the old commits. So I guess it automates "git branch -f feature-2-base feature-1" (step 3), but it doesn't seem to automate "git rebase --onto feature-1 feature-2-base feature-2" (step 2).

Presumably I'm doing something wrong?

mhw · 2025-12-05T14:17:17 1764944237

Yeah, you need to rebase the tip of the feature branch stack. git will then update all the refs that point to ancestor commits that are moved. So in this case

    $ git rebase --update-refs main feature-2

ptx · 2025-12-05T14:43:39 1764945819

Thanks! Yup, that does the trick.

happytoexplain · 2025-12-05T14:17:09 1764944229

First, you don't need the extra "marker" commit. This flag obviates the entire workflow.

Second, you run it on the outermost branch: feature 2. It updates all refs in the chain.

jbjbjbjb · 2025-10-25T08:08:03 1761379683

The article suggests there’s evidence that screen time has the opposite effect. A little surprising but I guess for a lot of people it is more stimulating than watching the news or soaps all day

tonyedgecombe · 2025-10-25T16:30:04 1761409804

It says it’s unclear which way the causation goes.

jbjbjbjb · 2025-10-21T22:36:48 1761086208

> The world is getting stranger

In the U.K. I was betting 5 minute binary options back in 2008 and parlays or accumulators as we call them (accys for short) have been popular for a while too.

giobox · 2025-10-21T22:50:48 1761087048

Rightly or wrongly, The Gambling Act 2005 put the UK literally decades ahead of places like the US in terms of creating a legal framework for sports betting/gambling in general.

shermantanktop · 2025-10-21T23:23:41 1761089021

And that's how you get a Paddy Power next to a William Hill while the rest of the high street is shuttered.

nly · 2025-10-21T23:35:24 1761089724

And Betfred just threatened to close 1,300 high street betting shops if Reeves increases gambling taxes in the budget next month.

https://www.ft.com/content/3641f944-38cc-4ed5-853a-03fab3ac4...

UltraSane · 2025-10-22T03:35:08 1761104108

How is that a threat?

lmm · 2025-10-21T23:56:27 1761090987

Do you think that's better or worse than an entirely shuttered high street? Why?

shermantanktop · 2025-10-22T03:06:06 1761102366

Those are the choices? I hope not.

The forces that made those shops appear and made the greengrocers disappear are not natural, inevitable, and foolish to resist. They are just laws, laws that permitted some things and discouraged others, taxed some things and subsidized others.

Podrod · 2025-10-22T01:04:13 1761095053

Because the gambling industry is a parasite.

jbjbjbjb · 2025-10-21T21:48:53 1761083333

In my experience there’s still much more to this. I’m sure it helps at population level like the article describes but it’s not foolproof. For our first we were feeding nuts early and still developed an allergy to all nuts. Our second didn’t get nuts until much later and he’s fine. There’s more to the story than timing, notably my first has eczema and asthma too so there’s that atopic march.

Maxion · 2025-10-22T05:48:06 1761112086

> There’s more to the story than timing, notably my first has eczema and asthma too so there’s that atopic march.

Eczema often comes with digestive issues, bowl inflammation, loose stool, blood in the stool etc.

Eczema essentially gives you wounds, if you allow allergens to enter the bloodstream directly without going through the digestive tract you are at an increased risk of developing allergies.

For kids/babies with these kind of issues it's probably better to delay introducing common allergens until their gut can heal or you will end up causing allergies rather than preventing them.

Source: https://pmc.ncbi.nlm.nih.gov/articles/PMC7494573

jgalt212 · 2025-10-21T22:01:52 1761084112

Allergy rate decreases with birth order. Of course, that's at the population level and probably not strong enough effect to notice if you only poll a dozen parents you know.

taeric · 2025-10-21T22:16:15 1761084975

As the 3rd child that had more allergies than I can really make sense of, I'm curious on this. Any recommended studies to read up on this?

akdor1154 · 2025-10-22T03:11:02 1761102662

Did you medicate for reflux?

jbjbjbjb · 2025-10-24T11:59:44 1761307184

No we didn’t

jbjbjbjb · 2025-10-15T13:48:41 1760536121

I’m glad I opted to get the base model M4 Mac Mini rather than upgrade the memory for longevity.