emp17344's comments | Hacker News

Your entire argument is derived from a pseudoscientific field without any peer-reviewed research. Mechanistic interpretability is a joke invented by AI firms to sell chatbots.

Lol, that's a stupid ass response, especially when half the papers are from Chinese universities. You think the Chinese universities are trying to sell ChatGPT subscriptions? Ridiculous. You're just falling behind in tech knowledge.

And apparently you consider peer-reviewed papers presented at NeurIPS and other conferences to be pseudoscience. (For the people not versed in ML, NeurIPS is where the 2017 paper "Attention Is All You Need" that started the modern ML revolution was presented.)

https://neurips.cc/virtual/2023/poster/72666

https://jmlr.org/beta/papers/v26/23-0058.html

https://proceedings.mlr.press/v267/palumbo25a.html

https://iclr.cc/virtual/2026/poster/10011755


Sounds like you agree this “evidence” lacks any semblance of scientific rigor?

(Not GP) There was a well-recognized reproducibility problem in the ML field before LLM mania, and that's considering published papers with proper peer review. The current state of affairs is in some ways even less rigorous than that, and then some people in the field feel free to overextend their conclusions into other fields like neuroscience.

Frankly, I don't see a reason to give a shit.

We're in the "mad science" regime because at the current speed of progress, adding rigor would sacrifice velocity. Preprints are the lifeblood of the field because they can be put out earlier and start contributing sooner.

Anthropic, much as you hate them, has some of the best mechanistic interpretability researchers and AI wranglers across the entire industry. When they find things, they find things. Your "not scientifically rigorous" is just a flimsy excuse to dismiss the findings that make you deeply uncomfortable.


Mechanistic interpretability is a joke, supported entirely by non-peer reviewed papers released as marketing material by AI firms.

Did you just invent a nonsense fallacy to use as a bludgeon here? “Stochastic parrot fallacy” does not exist, and there’s actually quite a bit of evidence supporting the stochastic parrot hypothesis.

I imagine "stochastic parrot fallacy" could be their term for using the hypothesis to dismiss LLMs even where they can be useful; i.e., dismissing them for their weaknesses alone and ignoring their strengths. (Of course, we have no way to know for sure without their input.)

That’s because there’s no objective research on this. Similarly, there are no good citations to support your objection. They simply don’t exist yet.

Maybe not worth discussing something that cannot be objectively assessed then.

Then don't; all I did was offer my thoughts in a public comments section.

Oh, please. There’s always a way to blame the user, it’s a catch-22. The fact is that coding agents aren’t perfect and it’s quite common for them to fail. Refer to the recent C-compiler nonsense Anthropic tried to pull for proof.

It fails far less often than I do at the cookie cutter parts of my job, and it’s much faster and cheaper than I am.

Being honest, I probably have to write some properly clever code or do some actual design as a dev lead like… 2% of my time? At most? The rest of the code-related work I do, it’s outperforming me.

Now, maybe you’re somehow different to me, but I find it hard to believe that the majority of devs out there are balancing binary trees and coming up with shithot unique algorithms all day, rather than mangling some formatting, improving db performance, picking the right pattern for some backend, and so on, day to day.


Ethical realists would disagree with you.

Plagiarizing Stockfish doesn’t make me good at chess. Same principle applies.

That’s a devastating benchmark design flaw. Sick of these bullshit benchmarks designed solely to hype AI. AI boosters turn around and use them as ammo, despite not understanding them.

Relax. Anyone who's genuinely interested in the question will see with a few searches that LLMs can play chess fine, although the post-trained models mostly seem to have regressed. The problem is that people are more interested in validating their own assumptions than anything else.

https://arxiv.org/abs/2403.15498

https://arxiv.org/abs/2501.17186

https://github.com/adamkarvonen/chess_gpt_eval
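For what it's worth, here's roughly the kind of harness that last repo uses, as I understand it: hand the model the game so far, ask for one SAN move, and let python-chess catch illegal output. The prompt wording and the get_move callback are placeholders of mine, not the repo's actual code.

    # Minimal sketch: can a model keep producing legal moves over a whole game?
    # get_move(prompt) is a placeholder for whatever LLM call you want to test.
    import chess
    import random

    def play_one_game(get_move, max_plies=80):
        board = chess.Board()
        sans = []       # moves so far in SAN, fed back to the model each turn
        illegal = 0
        for _ in range(max_plies):
            if board.is_game_over():
                break
            prompt = ("Continue this chess game. Moves so far: "
                      + " ".join(sans) + "\nNext move in SAN only:")
            candidate = get_move(prompt).strip()
            try:
                move = board.parse_san(candidate)  # raises ValueError on illegal or garbled SAN
            except ValueError:
                illegal += 1
                move = random.choice(list(board.legal_moves))  # substitute so the game continues
            sans.append(board.san(move))
            board.push(move)
        return board.result(claim_draw=True), illegal

Substituting a random legal move after an illegal one keeps the game going, so you can still count how often the model goes off the rails.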


I like this game between grok-4.1-fast and maia-1100 (engine, not LLM).

https://chessbenchllm.onrender.com/game/37d0d260-d63b-4e41-9...

This exact game has been played 60 thousand times on lichess. The piece sacrifice Grok performed on move 6 has been played 5 million times on lichess. Every single move Grok made is also the top played move on lichess.
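You can check this sort of claim yourself against the Lichess opening explorer. Rough sketch below; the endpoint and response fields are from memory of Lichess's public API, so verify against https://lichess.org/api before trusting the details.

    # Rough sketch: for each move of a game, ask the Lichess opening explorer how
    # many database games reached that position and what the most played move was.
    # Endpoint and JSON fields are assumptions from memory; check the API docs.
    import chess
    import requests

    EXPLORER = "https://explorer.lichess.ovh/lichess"

    def popularity_report(san_moves):
        board = chess.Board()
        for san in san_moves:
            resp = requests.get(EXPLORER, params={"variant": "standard", "fen": board.fen()})
            resp.raise_for_status()
            data = resp.json()
            total = data.get("white", 0) + data.get("draws", 0) + data.get("black", 0)
            top = data["moves"][0]["san"] if data.get("moves") else None
            print(f"{san}: {total} games reached this position; most played reply: {top}")
            board.push_san(san)

    # e.g. the first few moves of a common Italian Game line:
    popularity_report(["e4", "e5", "Nf3", "Nc6", "Bc4"])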

This reminds me of Stefan Zweig’s The Royal Game, where the protagonist survived Nazi torture by memorizing every game in a chess book his torturers dropped (excellent book btw., and yes, I am aware I just invoked Godwin’s law here, and of the irony in doing so). The protagonist became “good” at chess simply by memorizing a lot of games.


The LLMs that can play chess, i.e. the ones that don't make an illegal move every game, do not play it simply from memorized lines.

> That’s a devastating benchmark design flaw

I think parent simply missed until their later reply that the benchmark includes rated engines.


This is incorrect. It’s basic economics - technology that boosts productivity results in higher salaries and more jobs.

That’s not basic economics. Basic economics says that salaries are determined by the demand for labor versus the supply of labor. With more efficiency, each worker produces more output, so you need fewer people to accomplish the same thing. So unless demand for the product grows at roughly the same rate as productivity, companies will employ fewer people. Since the market for products is not infinite, you only need as much labor as it takes to meet the demand for your product.
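Toy arithmetic, with made-up numbers, just to make the mechanism concrete:

    # Made-up numbers: fixed product demand, productivity doubles.
    demand = 100_000            # units customers will buy per year
    output_per_worker = 500     # units one worker produces per year

    workers_before = demand / output_per_worker          # 200 workers
    workers_after = demand / (2 * output_per_worker)      # 100 workers

    print(workers_before, workers_after)
    # Unless demand roughly doubles too, the efficiency gain shows up as
    # fewer jobs rather than higher pay.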

Companies that are doing better than ever are laying people off by the shipload, not giving people raises for a job well done.


Well, that depends on whether the technology requires expertise that is rare and/or hard to acquire.

I'd say that using AI tools effectively to create software systems is in that class currently, but it isn't necessarily always going to be the case.


You obviously haven't thought about economics much at all to say something this simplistic.

There are so many counterexamples showing this is wrong that it is not even worth bothering to list them.

I love economics, but it is largely a field built around half-truths and intellectual fraud. That is actually why it is an interesting subject to study.


Denial of economic truths is denial of science. Not sure what to tell you. What parts do you reject?

Like denying that more efficiency without a commensurate increase in product demand means the demand for labor goes down, which means fewer jobs and lower salaries? You don’t pay people what they’re actually worth; you pay people what they’ll work for. Asking for more money because you’re making the company more money is only viable if there aren’t qualified people lining up for the chance to take your role. Even without asking for more money, well-paid people regrettably tend to get laid off in those circumstances.

Nah, most of it just gets returned to capital holders.
