egonschiele · 2026-03-05T18:30:24 1772735424

The actual card is here: https://deploymentsafety.openai.com/gpt-5-4-thinking/introdu... The link currently goes to the announcement.

Rapzid · 2026-03-05T18:37:50 1772735870

I must have been sleeping when "sheet", "brief", "primer", etc. became known as "cards".

I really thought the weirdly worded and unnecessary "announcement" linking to the actual info, along with the word "card", was the result of vibe slop.

realityfactchex · 2026-03-05T19:13:24 1772738004

Card is slightly odd naming indeed.

Criticisms aside (sigh), according to Wikipedia, the term was introduced by a group of mostly Googlers, with the original paper [0] submitted in 2018. To quote:

"""In this paper, we propose a framework that we call model cards, to encourage such transparent model reporting. Model cards are short documents accompanying trained machine learning models that provide benchmarked evaluation in a variety of conditions, such as across different cultural, demographic, or phenotypic groups (e.g., race, geographic location, sex, Fitzpatrick skin type [15]) and intersectional groups (e.g., age and race, or sex and Fitzpatrick skin type) that are relevant to the intended application domains. Model cards also disclose the context in which models are intended to be used, details of the performance evaluation procedures, and other relevant information."""

So that's where they were coming from, I guess.

[0] Margaret Mitchell et al., 2018 submission, Model Cards for Model Reporting, https://arxiv.org/abs/1810.0399

Murfalo · 2026-03-05T20:11:41 1772741501

To me, model card makes sense for something like this https://x.com/OpenAI/status/2029620619743219811. For "sheet"/"brief"/"primer" it is indeed a bit annoying. I like to see the compiled results front and center before digging into a dossier.

egonschiele · 2026-03-04T00:21:11 1772583671

This looks super useful, but why hide your identity? There's nothing on the website about who you are and I notice you're using a new HN account.

egonschiele · 2026-03-04T00:19:38 1772583578

I subscribed. Awesome idea and execution. Hope I see more apps working to make this stuff more accessible!

egonschiele · 2026-03-02T03:48:09 1772423289

I feel like they are often unwanted discards... but we have one and we try to put interesting books in there. Recently gave away my entire Harry Potter series.

egonschiele · 2026-02-28T03:41:01 1772250061

Someone has to stay to fight the shit happening in the US! The problem won't just go away if people move.

egonschiele · 2026-02-28T02:29:12 1772245752

Heck yeah, so happy to see Anthropic fighting. This is what real leadership looks like. I'd love to see the same from Google and OpenAI.

egonschiele · 2026-02-21T01:15:39 1771636539

So much of Reddit is brain rot now, it's unbelievable. A sample of subreddits: /r/memzy, /r/evilwhenthe, /r/JustMemesForUs.

Seriously, if I were in charge of these companies, I'd shut this shit down. I know it drives clicks, but do we want to live in a world where people consume this garbage? And not just a few people!

egonschiele · 2026-02-18T18:19:45 1771438785

Coding agents are good, but once the complexity is high they're not good enough. Eventually your agent won't be able to make changes to your code base without introducing bugs with every change. In my experience, the agents aren't very good with abstractions yet, and no amount of testing can completely paper over that problem. So yes, the industry is changing dramatically and at a breathtaking rate, but I don't think money is the only moat left.

nomel · 2026-02-18T19:02:48 1771441368

Not just complexity, but also anything requiring any actual ingenuity. It's impressive how "junior" it can be, with its (current) statistical shackles. Worse, even if you flat-out tell it what to do, it'll flounder like crazy if your approach is too far into the statistical weeds.

kardashev8 · 2026-02-18T19:37:29 1771443449

What is an example?

cryptonector · 2026-02-19T06:20:34 1771482034

Just tonight I was trying to get Claude Opus 4.6 to design a lock-less data structure. It kept failing to solve problems, which I then kept solving for it, but once presented with the solutions it was able to check them, and I think correctly so (that's not nothing!).

egonschiele · 2026-02-17T02:32:16 1771295536

Brilliant. This is like the internet at its peak (to me) where it was all about building neat things and helping people. Just a bunch of engineers saying they are not planning to look away from the Epstein files. More projects like this, please.

egonschiele · 2026-02-17T02:27:33 1771295253

Really good to know Handy exists; it's the first I'm hearing about it. I use a speech-to-text app that I built for myself, and I know at least one co-worker pays $10 a month for (I think) Wispr. I think it's possible there was no intention to market, and the creator simply didn't know about Handy, just like me.