Hacker News | pronik's comments

Maybe they are being acquired to improve the quality of Codex.

That's the thing. To me that says that as soon as cash becomes tight at OpenAI, the Astral staff will no longer get to work on Python tooling, namely uv, etc.

Tale as old as time in SV, why we keep trusting venture capital to be the community's stewards I have no idea.

We need public investment in open source, in the form of grants, not more private partnerships that somehow always seem to hurt the community.


what do you mean "trusting" or "hurting the community"? i don't think uv has damaged anything yet. i'll use a tool from whoever if the risk profile is acceptable. given the level of quality in uv already, it seems very low risk to adopt no matter who the authors are, because it's open source, it's easy to keep using an old version, and if they really go off the deep end, i expect the python community as a whole will maintain a slow-moving but stable fork.

i'd love there to be infinite public free money we could spend on Making Good Software, but at least in the US there is vanishingly little public free money available, while there are huge sums of private free money even in the post-ZIRP era. If some VCs want to fund a team to write great open source software the rest of us get for free, i say "great, thanks!"


> why we keep trusting venture capital to be the community's stewards I have no idea.

They bought the trust.


> we keep trusting venture capital to be the community's stewards

OpenAI isn't a VC. It's VC-backed. But so is Astral.


Code speed or not, people talk about how coding agents have taken away their passion. I've been reflecting on that for quite some time now and quite honestly, I don't miss a single thing.

My passion has always been building something, but my building has always been hindered by a myriad of small paper cuts -- it's just what technology is and, if we are being honest, always has been. It's not being able to recall the exact syntax or function name for something I know exists; it's frantically searching for whether something I need exists at all and, if so, in anything I'm using already; it's a CI build not doing the simplest thing after working for months; it's a TypeScript error with a new library refusing to compile until I change a dozen things in tsconfig.json; it's my editor deciding not to update diagnostics today at any cost; it's deprecated syntax; it's documentation not describing what functions do and functions not working like the documentation describes; it's hunting down weird bugs in fourth- and fifth-party code triggered by exactly four people in the world, one of them being me right now.

This list goes on and on, and all of those things are great when I finally manage to solve the problems. However, in terms of building things, I find it extremely liberating to have a literal assistant capable of sorting this shit out in seconds or minutes instead of me banging my head against the wall the whole night. Code writing speed wasn't my problem, but I appreciate being able to think about the code as a whole and change it as a whole in an instant. My time spent on building something hasn't changed much in absolute terms, but in that same time I will have taken a dozen detours, experimented with alternatives and gathered enough context to tell anyone who asks why something is built this way and not another.


An anecdote: for a while now I've noticed (or imagined) Claude Code becoming ever so slightly dumber around 3-4pm CEST. I've been calling it the "Americans are awake" syndrome: presumably higher usage while keeping latency the same (something Anthropic surely keeps an eye on), and thus lower quality.

Haha I've been seeing the same thing late in the evening (Australian time), which I attributed to the same reason.

I'd be more concerned about confusing it with https://github.com/sindresorhus/got, which is well-established (15k stars on GitHub is nothing to sneeze at).


Our life has become so dumb in certain ways. There are people who invested heavily in learning their mother tongue or a foreign language -- its spelling, grammar, syntax and idiosyncrasies, like when to use an em-dash, an Oxford comma, a semicolon, an ellipsis -- and these smart, educated people now seriously deliberate whether using the wrong dashes and adding a spelling mistake or two would be a good way to prove you are a human. (I think we never should have allowed the framing of CAPTCHA to be "prove you are not a robot"; it was demeaning back then and still is now, it's just that the alternatives were not and still aren't clear-cut.) The same things that would have made you fail a written essay in school are somehow becoming a requirement -- not in "haX0r" or online communities where "writing funny" has always been a differentiating factor, but for absolutely everybody who has to communicate with others in written form.

It's of course no surprise that an LLM would be most proficient in language use and, adjacent to that, in proper formatting of said language. But that's a good thing and a good tool for writing, as anyone who has ever used a classic spell or grammar checker will attest. But apparently we as a society have once again managed to completely overlook and demonise the good, and now people who paid attention in school have to bow to people who are somehow convinced that perfect spelling is a sign that someone cheated. This is not LLMs' fault; it's the fault of people who think they've understood something when they really haven't, crying heresy over others doing things the correct way.

That being said: of course there are social and technological challenges with cheating, spam bots, sock puppets and what not, but the phenomenon itself is not really new; just the scale, cost and quality are way different now. We need to find a balanced way to approach it -- trying to weed out every last possible AI cheater while hurting real innocent people in the process is not worth it. Especially since we don't have a proper metric to actually prove who's a cheater and who is not; it's gotten way harder since the days of "As a large language model" being in every second sentence.


I felt it quite a while back (more than 10 years ago), when, in high school, I learnt LaTeX and discovered Beamer. I naturally proceeded to make all of my presentations with it, including the rehearsal for a big competitive French exam. The person reviewing the presentation advised me to dirty it up a bit, since otherwise nobody would believe that my father wasn't a PhD researcher who did the work for me.

That was a bit saddening, honestly. I kept the presentation as-is, as I didn't know how to willfully screw up a Beamer presentation, and I would not touch PowerPoint (fortunately the final jury believed me).

Cheating has always been an issue, long before LLMs, but now we're back to the same old tricks: just make sure to add a mistake or two to hide that you copied your neighbor's homework. It's a shame, because I quite like learning the subtleties of foreign languages, and as a non-native English speaker, it's quite rewarding when going online!


An especially egregious case I've encountered was at a Berlin train station.

Normally in Germany, you've got those distinct card terminals with a display where you see your total before paying. Some of those have started nagging you for tips, which you need to explicitly accept or decline before tapping your card. Not in this case though: after you've ordered your food, they point you to the combined order/pay display, and while you marvel at the technological feat of combining both, you tap your card on it and then notice that a 15% tip has been automatically included and charged. You needed to spot some small text and small buttons in the corner of that display beforehand and actively tap "0%" or something before tapping your card. I'm already furious that they've let this tip-begging be added to card terminals, but charging tips without explicit consent should be completely illegal.


At Schiphol they offer tipping options to presumably prey on Americans, but the attendant physically reached over the counter to reject it after I ordered in (native) Dutch. Can't imagine how much trouble locals had been giving the shop before training staff like this.


Happens in Spain too: the card machine will sometimes have tip options, but the waiter selects no tip when serving locals to avoid the annoyance.


I’ve seen Starbucks employees in the US do this often.


Isn't this directly taking money out of their own pocket, or are tips going to corporate?


What takes money out of their pockets is employers not paying a real wage for a real job. Tipping destroys the value of a profession, and when you don't think a profession is "professional" you pay less for it. This is a terrible dark pattern at every level.


I am not arguing for tipping culture, but I question the incentive for rejecting a tip as a US Starbucks employee. I very much doubt that playing a part in desired society-wide change overcomes the immediate incentive of the tip itself.


I don't go to Starbucks personally, but I've been to local ice cream shops, cafes, fast food businesses, and others, and this isn't too uncommon. I'd say it happens about 15% of the time? It's usually at places where tips aren't as expected anyway, but not always.


It is illegal. They're counting on you to be too busy to sue them. That's why you only saw it in a train station. Ask your bank for a refund for the fraudulent transaction — you probably don't have enough evidence to prove it happened, but they'll still put the complaint on file.

Lawsuits and chargebacks are about the only pressure businesses have not to scam you.


And wonderfully done in a city where people are usually paid a living wage, and even students don’t have to work for free (hi Belgium!)


I am pretty sure it is completely illegal.


Would the credit card company / bank let you dispute the charge? I imagine that's going to be difficult with a card present transaction.


If you purchased $50 worth of items and got charged $75, wouldn't you dispute it?


> An especially egregious case I've encountered was at a Berlin train station.

You must have been in Berlin, Wisconsin because we've been assured repeatedly on HN that the tipping plague thing is exclusively an American problem, and never ever happens in Eurosparkleponyland.

Ditto for spam calls and texts.


The problem is that the US is actively exporting all the plagues it calls its own worldwide, so tipping (the mandatory-on-the-card-terminal kind) is currently actively infecting Europe. Which you could have gathered by reading this very discussion.


And I think people don't mind tipping when they feel it's voluntary


I hate that the tipping culture is infecting Germany now


Tipping culture has been around in Germany for a long time, hasn’t it? It’s surprisingly common there


It's not tipping culture that's invading Germany, it's the "begging for tips" culture. The worst kind. You buy a piece of bread at the bakery over the counter, pay with card, and the card reader is begging for tips. This is exceptionally out of the ordinary, but I'm afraid there is not enough explicit resistance, and there is still too much "looking up to the USA" happening, so society might accept this idiocy as normal some day.


(I'm German. This is my personal stance.)

Charging a tip for to-go items is preposterous. When dining in, I will indeed tip, usually by rounding up to the next 5 or 10 euro increment for a group meal, or to the next 1 or 2 euro increment for single meals (e.g. during lunch hours near the office). But this is only if the service is actually good. If a restaurant makes me wait more than 30 minutes for a quick lunch, they will be paid exactly the amount posted on the menu.


Mostly contained to tourist areas though. But AFAICT, it's been spreading recently.


Not to this extent


While it's an excellent way to make more money in the moment, I think this might become a standard no-extra-cost feature in several months (see Opus becoming way cheaper and a default model within months). Mental load management while using agents will become even more important it seems.


Why would they cut a money-making feature? In fact I can already imagine them asking for a speed ransom every time you're in a pinch; some extra context space will also become buyable. Anthropic is in a penny-pincher phase right now and they will try to milk everything. Watch them add microtransactions too.


Yeah especially once they make an even faster fast mode.


To the folks comparing this to Gas Town: keep in mind that Steve Yegge explicitly pitched agent orchestrators to, among others, Anthropic months ago:

> I went to senior folks at companies like Temporal and Anthropic, telling them they should build an agent orchestrator, that Claude Code is just a building block, and it’s going to be all about AI workflows and “Kubernetes for agents”. I went up onstage at multiple events and described my vision for the orchestrator. I went everywhere, to everyone. (from "Welcome to Gas Town" https://steve-yegge.medium.com/welcome-to-gas-town-4f25ee16d...)

That Anthropic releases Agent Teams now (as rumored a couple of weeks back), after they've already adopted a tiny bit of Beads in the form of Tasks, means that either they'd already been building them back when Steve pitched orchestrators, or they've decided that he's been right and it's time to scale the agents. Or they've arrived at the same conclusions independently -- it won't matter in the grand scheme of things. I think Steve greatly appreciates it existing; if anything, this is a validation of his vision. We'll probably be herding polecats in a couple of months officially.


It's not like he was the only one who came up with this idea. I built something like that without knowing about Gas Town or Beads. It's just an obvious next step.

https://github.com/mohsen1/claude-code-orchestrator


I also share your confusion about him somehow managing to dominate credit in this space, when it doesn't even seem like Gastown ended up being very effective as a tool relative to its insane token usage. Everyone who's used an agentic tool for longer than a day will have had the natural desire for them to communicate and coordinate across context windows effectively. I'm guessing he just wrote the punchiest article about it and left an impression on people who had hitherto been ignoring the space entirely.


It was a fun article!


Exactly! I built something similar. These are such low hanging fruit ideas that no one company/person should be credited for coming up with them.


Seriously, I thought that was what langchain was for back in 2023.


Seriously, what is langchain? It’s so completely useless. Clearly none of the new agents care about it or need it. Irrelevant.


Agree, langchain was useless then and completely irrelevant now, but the idea that we need to orchestrate different LLM loops is extremely obvious.


> what is langchain?

an incantation you put on your resume to double your salary for a few months before the company you jumped ship to gets obsoleted by the foundational model


There seems to be a lot of convergent evolution happening in the space. Days before the Gas Town hype hit, I made a (less baroque, less manic) "agent team" setup: a shell script to kick off a ralph wiggum loop, and a CLAUDE-MESSAGE-BUS.md for inter-ralph communication (thread safety was hacked into this with a .claude.lock file).

The main claude instance is instructed to launch as many ralph loops as it wants, in screen sessions. It is told to sleep for a certain amount of time to periodically keep track of their progress.

It worked reasonably well, but I don't prefer this way of working... yet. Right now I can't write spec (or meta-spec) files quick enough to saturate the agent loops, and I can't QA their output well enough... mostly a me thing, i guess?
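For the curious, the bus-append part is roughly this kind of thing. This is a simplified sketch, not the actual script: the file names CLAUDE-MESSAGE-BUS.md and .claude.lock come from the setup described above, but `append_message` is just an illustrative name, and the real version lives in shell rather than Python.

```python
import os
import time

LOCK = ".claude.lock"
BUS = "CLAUDE-MESSAGE-BUS.md"

def append_message(agent: str, text: str, timeout: float = 5.0) -> None:
    """Append one message to the shared bus file, guarded by a lock file.

    O_CREAT | O_EXCL makes lock creation atomic: only one ralph loop
    can hold the lock at a time; the rest spin until it is released.
    """
    deadline = time.time() + timeout
    while True:
        try:
            fd = os.open(LOCK, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break  # we own the lock now
        except FileExistsError:
            if time.time() > deadline:
                raise TimeoutError("could not acquire bus lock")
            time.sleep(0.05)
    try:
        with open(BUS, "a") as bus:
            bus.write(f"- [{agent}] {text}\n")
    finally:
        os.close(fd)
        os.remove(LOCK)  # release the lock for the next loop
```

The lock file is the whole "thread safety" story: it is crude, and a crashed agent can leave a stale lock behind, but for a handful of loops appending markdown lines it is good enough.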


Not a you thing. Fancy orchestration is mostly a waste, validation is the bottleneck. You can do E2E tests and all sorts of analytic guardrails but you need to make sure the functionality matches intent rather than just being "functional" which is still a slow analog process.


> Right now I can't write spec (or meta-spec) files quick enough to saturate the agent loops, and I can't QA their output well enough... mostly a me thing, i guess?

Same for me; however, the velocity of the whole field is astonishing, and things change as we get used to them. We are not talking that much about hallucinating anymore; just 4-5 months ago you couldn't trust coding agents with extracting functionality to a separate file without typos, and now splitting Git commits works almost without a hitch. The more we get used to agents getting certain things right 100% of the time, the more we'll trust them. There are many, many things that I know I won't get right, but I'm absolutely sure my agent will. As soon as we start trusting e.g. a QA agent to do its job, our "project management" velocity will increase too.

Interestingly enough, the infamous "bowling score card" text on how XP works demonstrated inherently agentic behaviour in more ways than one (they just didn't know what "extreme" was back then). You were supposed to write a failing test and then implement just enough functionality for that test to not fail anymore, even if the intended functionality was broader -- which is exactly what agents reliably do in a loop. Also, you were supposed to be pair-driving a single machine, which has been incomprehensible to me for decades -- after all, every person has their own shortcuts, hardware, IDEs, window managers and what not. Turns out, all you need is a centralized server running a "team manager agent" and multiple developers talking to it to craft software fast (see the tmux requirement in Gas Town).
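To make the loop concrete, here is a toy version of that red-green increment (a hypothetical `score` function for illustration only, not code from the original essay):

```python
# XP-style loop: write one failing test, then implement *just enough*
# to make it pass -- the same narrow-increment behaviour coding agents
# exhibit when run in a loop against a test suite.

def score(rolls):
    # "Just enough" implementation: handles open frames only.
    # Strikes and spares are deliberately unhandled until a future
    # failing test forces the next increment.
    return sum(rolls)

# The test that drove the implementation above:
assert score([1, 2, 3, 4]) == 10

# The next failing test (a spare bonus) would force the next step:
# assert score([5, 5, 3] + [0] * 17) == 16   # not yet implemented
```

The point isn't the bowling logic; it's that "make the current failing test pass and nothing more" is a mechanical, repeatable step, which is precisely why it maps so well onto an agent loop.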


Compare both approaches to mature actor frameworks and they don't seem to break much new ground. These kinds of supervisor trees and hierarchies aren't new for actor-based systems, and they're obvious applications of LLM agents working in concert.

The fact that Anthropic and OpenAI have gone this long without such orchestration, considering the unavoidable issues of context windows and unreliable self-validation, without matching the basic system maturity you get from a default Akka installation, shows us that these leading LLM providers (with more money, tokens, deals, access, and better employees than any of us) are learning in real time. Big chunks of the next-gen hype-machine wunder-agents are fully realizable with cron and basic actor-based scripting. Deterministic, write once, run forever, no subscription needed.

Kubernetes for agents is, speaking as a krappy kubernetes admin, not some leap, it’s how I’ve been wiring my local doom-coding agents together. I have a hypothesis that people at Google (who are pretty ok with kubernetes and maybe some LLM stuff), have been there for a minute too.

Good to see them building this out, excited to see whether LLM cluster failures multiply (like repeating bad photocopies), or nullify (“sorry Dave, but we’re not going to help build another Facebook, we’re not supposed to harm humanity and also PHP, so… no.”).


If it was so obvious and easy, why didn't we have this a year ago? Models were mature enough back then to make this work.


The high level idea is obvious but doing it is not easy. "Maybe agents should work in teams like humans with different roles and responsibilities and be optimized for those" isn't exactly mind bending. I experimented with it too when LLM coding became a thing.

As usual, the hard part is the actual doing and producing a usable product.


Orchestration definitely wasn't possible a year ago, the only tool that even produced decent results that far back was Aider, it wasn't fully agentic, and it didn't really shine until Gemini 2.5 03-25.

The truth is that people are doing experiments on most of this stuff, and a lot of them are even writing about it, but most of the time you don't see that writing (or the projects that get made) unless someone who already has an audience (like Steve Yegge) makes it.


Roo Code in VSCode was working fine a year ago, even back in November 2024 with Sonnet 3.5 or 3.7


Because gathering training data and doing post-training takes time. I agree with OP that this is the obvious next step given context length limitations. Humans work the same way in organizations, you have different people specializing in different things because everyone has a limited "context length".


Because they are not good engineers [1]

Also, because they are stuck in a language and an ecosystem that cannot reliably build supervisors, hierarchies of processes etc. You need Erlang/Elixir for that. Or similar implementations like Akka that they mention.

[1] Yes, they claim their AI-written slop in Claude Code is "a tiny game engine" that takes 16ms to output a couple hundred characters on screen: https://x.com/trq212/status/2014051501786931427


what mature actor frameworks do you recommend?


They did mention Akka in their post, so I would assume that's one of them.


Elixir/Erlang. It's table stakes for them.


Sorry, are you saying that engineers at Anthropic who work on coding models every day hadn’t thought of multiple of them working together until someone else suggested it?

I remember having conversations about this when the first ChatGPT launched and I don’t work at an AI company.


Claude Code has had subagent support for a while, mostly because you have to do very aggressive context window management with Claude or it gets distracted.


Why is Yegge so.... loud?

Like, who cares? Judging from his blog recounting of this, it doesn't seem like anybody actually does. He's an unnecessarily loud and enthused engineer inserting himself into AI conversations instead of just playing office politics to join the AI automation effort inside a big corporation?

"wow he was yelling about agent orchestration in March 2025", I was about 5 months behind him, the company I was working for had its now seemingly obligatory "oh fuck, hackathon" back in August 2025

and we all came to the same conclusions. conferences had everyone having the same conclusion, I went to the local AWS Invent, all the panels from AWS employees and Developer Relations guys were about that

it stands to reason that any company working on foundational models and an agentic coding framework would also have talent thinking about that sooner than the rest of us

so why does Yegge want all of this attention and think it's important at all? it seems like it would have been a waste of energy to bother with, since everyone should have been able to see all of this coming in advance. "Anthropic! what are you doing! listen to meeeehhhh let me innnn!"

doesn't make sense, and gastown's branding is further unhinged goofiness

yeah, I can't really play the attribution games on this one, and I can't really get behind the "who cares". I'm glad it's available in a more benign format now


This is nothing new; folks have been doing this since 2023. Lots of papers on arXiv and lots of code on GitHub with implementations of multi-agent systems.

... the "limit" was that agents were not as smart then, context windows were much smaller, and RLVR wasn't a thing, so agents were trained for just function calling, not agent calling/coordination.

We have been doing it since then; the difference really is that the models have gotten smart and good enough to handle it.


Honestly, this is one of many ideas I've had as well.

But it shows how much there is still to do in the AI space.


The last 20 minutes or so are probably the reason this is being posted. I must admit, I hadn't expected the otherwise snarky but hardly malicious Alec to be pissed beyond recognition, but it's very clear why and how this came to be.


I actually posted this before watching it all the way through… I was just reminded of recent discussions of EV vs ICE cars I have read here, and Alec managed to put my thoughts into words much better than I could myself. After watching it through completely, I just feel sad that what he felt the need to say out loud actually needs to be said at all, as it should be obvious (coming from someone watching from the other side of the pond).


Watching from the other side of the pond as well, and what Alec tries to get people to do is pretty sad, because there is a mismatch between action and reaction. While people correctly observe how one side dismantles institutions and the very ground rules of being a civilized country, the other side has nothing more to offer than civility and evoking change through the very institutions that are being dismantled (which I'm absolutely sympathetic to -- I prefer to stay civil whenever possible as well). But there is a real possibility that by the midterms in ten months there will be nothing left to be rescued by voting for a third of Congress.


For literally decades we've hoped that Linux would catch up with Windows in some crucial areas: sleep and hibernation support for laptops, supported first-party drivers, correct and reliable multi-monitor setups, games, etc. Never could I have imagined that Linux parity, which I'd argue is closer than ever before, would be reached by Windows getting worse in exactly those areas we, the Linux freaks, were told to get Windows for -- sleep not working, graphics drivers bringing whole systems down, incoherent configuration, etc. Only games got better, and we have to thank Valve for that.


On a modern laptop without S3 sleep, it works better for me on Linux than on Windows currently. So at least there's that.

