> Claude Opus 4.5 in a casual Claude Code session, approximately matching the best human performance in 2 hours
Is this saying that Claude matched the best human performance, where the human had two hours? I think that is the correct reading, but I'm not certain they don't mean that Claude had two hours, and matched the best human performance where the human had an arbitrary amount of time. The former is impressive but the latter would be even more so.
> Someone could be taking this position in this situation because they're highly skeptical that the Americans involved in this have the ability or desire to proceed in a way that will result in a minimum of casualties or in a way that will bring about real democratic change to the region.
> People want an Eisenhower doing these kinds of things
Why would people who don't want Trump doing it want an Eisenhower doing it? He helped overthrow democratically elected Árbenz in Guatemala with even weaker justifications than Trump overthrowing Maduro (Maduro at least seems to lack popular support and probably cheated in elections).
Eisenhower:
Overthrow of Árbenz to protect fruit company profits > series of military dictators > 30+ years of civil war where the US-backed government committed a genocide against Maya people
Given what I understand about the nature of competitive programming competitions, using an LLM seems kind of like using a calculator in an arithmetic competition (if such a thing existed) or a dictionary in a spelling bee.
I feel like it’s more like using an electronic dictionary in a spelling bee that already allowed you to use a paper dictionary. All it really does is demonstrate that the format isn’t suited to be a competition in the first place.
Which is why I think it’s great they dropped the competitive part and have just made it an advent calendar. Much better that way.
This is also why I'm skeptical of claims that it would be impossible (or nearly so) for governments to meaningfully regulate AI R&D/deployment (regardless of whether or not they should). The "you can't regulate math" arguments. Yeah, you can't regulate math, but using the math depends on some of the most complex technologies humanity has produced, with key components handled by only one or a few companies in only a handful of countries (US, China, Taiwan, South Korea, Netherlands, maybe Japan?). US-China cooperation could probably achieve any level of regulation they want up to and including "shut it all down now." Likely? Of course not. But also not impossible if the US and China both felt sufficiently threatened by AI.
The only thing that IMO would be really hard to regulate would be the distribution of open-weight models existing at the time regulations come into effect, although I imagine even that would be substantially curtailed by severe enough penalties for doing so.
If they're truly Chinese state-sponsored actors, does it really matter if their actions/methods are exposed? What is Anthropic going to do, send the Anthropic Police Force to China to arrest them?
I suppose I could see this argument if their methods were very unique and otherwise hard to replicate, but it sounds like they had Claude do the attack mostly autonomously.
This definition makes sense, but in the context of LLMs it still feels misapplied. What the model providers call "guardrails" are supposed to prevent malicious uses of the LLMs, and anyone trying to maliciously use the LLM is "explicitly trying to get off the road."
> I'd really hate to see the world go down the path of gatekeeping tools behind something like ID or career verification.
This is already done for medicine, law enforcement, aviation, nuclear energy, mining, and I think some biological/chemical research stuff too.
> It's a tradeoff we need to be willing to make.
Why? I don't want random people being able to buy TNT or whatever they need to be able to make dangerous viruses*, nerve agents, whatever. If everyone in the world has access to a "tool" that requires little/no expertise to conduct cyberattacks (if we go by Anthropic's word, Claude is close to or at that point), that would be pretty crazy.
* On a side note, AI potentially enabling novices to make bioweapons is far scarier than it enabling novices to conduct cyberattacks.
> If everyone in the world has access to a "tool" that requires little/no expertise to conduct cyberattacks (if we go by Anthropic's word, Claude is close to or at that point), that would be pretty crazy.
That's already the case today without LLMs. Any random person can go to github and grab several free, open source professional security research and penetration testing tools and watch a few youtube videos on how to use them.
The people using Claude to conduct this attack weren't random amateurs, it was a nation state, which would have conducted its attack whether LLMs existed and helped or not.
Having tools be free/open-source, or at least freely available to anyone with a curiosity is important. We can't gatekeep tech work behind expensive tuition, degrees, and licenses out of fear that "some script kiddy might be able to fuzz at scale now."
Yeah, I'll concede, some physical tools like TNT or whatever should probably not be available to Joe Public. But digital tools? They absolutely should. I, for example, would have never gotten into tech were it not for the freely available learning resources and software graciously provided by the open source community. If I had to wait until I was 18 and graduated university to even begin to touch, say, something like burpsuite, I'd probably be in a different field entirely.
What's next? We are going to try to tell people they can't install Linux on their computers without government licensing and approval because the OS is too open and lets you do whatever you want? Because it provides "hacking tools"? Nah, that's not a society I want to live in. That's a society driven by fear, not freedom.
I think you're overestimating how much real damage someone can cause with burpsuite and "a few youtube videos." I'd imagine if you pick a random person off the street, subject them to a full month's worth of cybersecurity YouTube videos, and hand them an arsenal of traditional security tools, that they would still be borderline useless as a black-hat hacker against all but the absolute weakest targets. But if instead of giving them that, you give them an AI that is functionally a professional security researcher in its own right (not saying we're there yet, but hypothetically), the story is clearly very different.
> Yeah, I'll concede, some physical tools like TNT or whatever should probably not be available to Joe Public. But digital tools?
Digital tools can affect the physical world though, or at least seriously affect the people who live in the physical world (stealing money, blackmailing with hacked photos, etc.).
To see if there's some common ground to start a debate from, do you agree that at least in principle there are some kinds of intelligence that are too dangerous to allow public access to? My extreme example would be an AI that could guide an average IQ novice in producing biological weapons.
Are you saying this because you think that people should still try to learn things for personal interest in a world where AI makes learning things to make money pointless (I agree completely, though what I spend time learning would change), or do you disagree with their assessment of where AI capabilities are heading?
> So like these are less serious issues if you are paid an extra $1-200k/ year
Ok but to be fair most people in the US aren't making "extra $1-200k / year" over a person in Europe. They aren't even making $100k / year to begin with.
Almost 40% of the USA is on Medicare, Medicaid, or entitled to VA benefits or military healthcare. It's only a narrow majority that depends on unsubsidized private healthcare, and those people skew in the upper income levels.
You believe the top 60% of the nation skew in the upper income levels? Median pay is $61k a year for the entire country. The top 1% skews to the upper income levels. The rest are charged $30 for a dose of aspirin and can't afford it.
There are numbers on this, and their comment is probably directionally correct; the median household with private insurance earns more than 400% of household FPL (KFF). By subtracting Medicaid and fixed-income seniors from the picture, you are sharply biasing the median upwards.
I would say if you ignore the poorest 40% of the population, you've got quite the slim margin to go before you are no longer talking about "Most" Americans, which the OP was pretty explicitly talking about.
He was saying "Most people in the US" don't make 100-200k more, and that they probably don't even make 100k. This was in response to the generalization that "people from other countries ... underestimate how well paid people in the US often are".
Now there was talk of getting the political motivation to change things, so I guess everyone is assuming Medicaid/Medicare/VA recipients don't want to change the system, but that wasn't really established, nor was that really being refuted.
I don't think I could be any clearer that I am (1) talking about Americans with private health insurance and (2) not making a normative judgement about which system is better, but rather a positive claim about the political challenge of changing the system (its large group of stakeholders who are better off under it).
Oh I'm clear about the demographic you are trying to discuss, my point was I'm not sure this all stemmed from a discussion about that specific demographic. It started at "people in US", then went to "most", then by the time you got involved in the thread you were defending a statement about people with private health insurance.
I could have made this comment at the level where it went off the rails, but I thought making it at the leaf level would help everyone involved see the deviation between what was said and what was being argued.
i think in this case, if you're at all familiar with what US hospitals charge for the small stuff, it's a safe assumption that when someone says aspirin costs $30 a dose, they're not talking about buying it at a CVS. of many folks on hacker news dot com i trust you to bridge that gap instead of nitpicking!
That's an odd argument to make in this thread, because whatever the drivers of burdensome consumer health spending are, they're not overpriced hospital aspirin.
So what about this? It is a question, not meant as a counter.
Although I have to say the rosy picture some paint here about the high incomes is counter to anything I ever heard - and saw, although I left the US in the early 2000s, after having lived there for almost a decade (still mostly paid from Germany, never ready to make a complete move).
By the way, Europeans don't quite all have a "nationalized healthcare system". Germany, for example, has "Krankenkassen" but also private insurance, and the "Krankenkassen" are private organizations.
We pay health insurance and get to choose the provider, those with higher incomes can switch to complete private insurance. We also have lots of our own problems and increasing costs because of immigration but more so aging population.
However, I personally know several people who had severe illnesses for a long time, and their normal "Krankenkassen" insurance never made any problems. One person with plenty of money, whose wife was dying, even asked US medical experts if he should come to the US with her, and those US experts said he should stay where he is, the German university hospital right next door had some of the leading therapies in the field. She lived five more years instead of dying after less than half a year with the standard therapy, every single expense paid for with the standard insurance, additional private insurance unnecessary. Similar with my stepfather, who had soooo many severe conditions, and yet every single item down to the special medical bed brought into our house so that he could finally die at home was paid without question.
The problems are with more mundane expenses, e.g. glasses, or the dentist, where only some of the treatments are covered. The really expensive illnesses seem to be better covered than the more common and much simpler problems.
Careful there, that's a right-wing propaganda point. Immigration into an aging society does not raise healthcare costs, it lowers them. See https://archive.is/XxfTH (and note that this is a NZZ article, a right-wing publication by now, so not slanted towards being immigration friendly).
I did not try to make a political statement, what happened here, anyway???
I have no idea what there is to defend - even if you assume they will all get high-paying jobs some day, for the first few years costs will increase while they either learn the language, are not allowed to work (status pending), or get minimum wage jobs (food delivery and parcel services at least in my city are now dominated by immigrants).
Even with your most positive outlook, initially there will be lots more people and the same system (number of doctors), and the number of payers increases slowly.
I even wrote "but more so aging population", conveniently overlooked in this strange politicized discussion.
I am NOT against immigration!!! Don't make stuff up people.
You are misreading this exchange. You just got a fact wrong, and thus repeated a lie that is often planted by Nazis - and it's easy to get misled. Anyway, you did not get criticized for an imagined stance on immigration; those answers are to a comment I assume you missed, the one by nxor?
You wrote, incorrectly:
> We also have lots of our own problems and increasing costs because of immigration
As the NZZ article explained, health care / Krankenkassen are the area where it is the clearest that immigration is an economic benefit. Look at statements like the section title "Krankenkassen profitieren" ("Health insurers benefit"), followed by "Ein grosser Profiteur der Zuwanderung sind dagegen wohl die Krankenkassen." ("The health insurers, by contrast, are probably a major beneficiary of immigration.") and the ending paragraph of said section:
> According to this analysis, over these seven years there was a net migration from abroad of 4.7 million people into the GKV (statutory health insurance) system. For 2019, this resulted in relief for the GKV of about 8 billion euros (equivalent to 0.6 contribution-rate points). Since 2019, however, the conditions have changed significantly, according to TK.
So the numbers we have do not support that part of your statement. And I'm not aware of newer numbers that say the contrary - the recent cost increase has completely different causes, for example, as in https://www.mdr.de/nachrichten/deutschland/panorama/krankenk..., and the "but more so aging population" part of your comment fits there.
> Of course, far-left demagogues like you would advocate for flooding a country with uneducated criminals
We've obviously banned this account. Please stop registering accounts just to keep breaking the guidelines. It's boring and a waste of everyone's time.
And while European countries have various forms of nationalized welfare, their salaries are so low that they would be automatically eligible for the US' welfare too!
our blocs aren't that different
except in the US middle class and upper middle class
It's hilariously out of touch, but it's what you should expect from the HN bros. They live in a bubble.
I'm from the EU and earn far less than these American techbros do, but far more than my American friends who work normal jobs. They work at the DMV, a supermarket, or general office work. You know, normal people. The vast majority.
Making all of those things cheaper is great, as long as the automation isn't also making everyone poorer at an equal or faster rate. It doesn't really help if house prices and food prices are cut in half if most people lose their employment because of automation.
I think the concern is that true human+ AGI and advanced robotics would obsolete so many roles that it doesn't matter if things can be made more efficiently, because nobody will have any money at all. If/when AI can do my job better than me, it isn't giving me leverage, it is removing all leverage I have as someone who puts food on the table through labor.
In the interim period before that happens then sure, the automation is great for some people who can best leverage it.
On the path to “AGI” I would expect a lot of short-term pain as people lose their jobs while unemployment is still around normal levels. But if unemployment rises too much, we would pass laws to protect people, like greater corporate taxes to fund things like UBI.
But honestly, if we have this level of automation it feels like it would be very hard to predict how society will evolve. I would expect our current model of work-to-live to become untenable, and we’d move to something else. I doubt that transition will be easy.