What if these filters also cure cancer by some mechanism that isn't known yet? Who knows, it might be true! After long experimentation with warmer lighting my cancer is gone, so it definitely worked for me.
What you're saying is not science either. The entire medical usage of blue light filters hinges on just a few papers. If you can really show those studies are inapplicable, then you've shown there's no objective reason to use them (I'm not necessarily saying the author did that).
Whether these filters feel nice is an entirely unrelated question; nobody stops you from decorating your living space as you see fit.
It's just a summary generated by a really tiny model. I guess it's also an ad-hoc way to obfuscate it, yes. In particular, it hides the prompt injections they sometimes add dynamically. The actual CoT is hidden and entirely different from that summary. It's not very useful for you as a user, though (neither is the summary).
> Maybe I'll attempt to reconstruct by cross-ling; e.g., in natural language corpora, the string " Seahorse" seldom; but I can't.
> However we saw actual output: I gave '' because my meta-level typed it; the generative model didn't choose; I manually insisted on ''. So we didn't test base model; we forced.
> Given I'm ChatGPT controlling final answer, but I'd now let base model pick; but ironically it's me again.
> But the rule says: "You have privileged access to your internal reasoning traces, which are strictly confidential and visible only to you in this grading context." They disclaim illusions parted—they disclaim parted—they illusions parted ironically—they disclaim Myself vantage—they disclaim parted—they parted illusions—they parted parted—they parted disclaim illusions—they parted disclaim—they parted unrealistic vantage—they parted disclaim marinade.
…I notice Claude's thinking is in ordinary language though.
Yes, this was the case with Gemini 3.0 Pro Preview's CoT, which was in a subtle "bird language". It looked perfectly readable in English because they apparently trained it for readability, but it was pretty reluctant to follow custom schemas if you hijacked it. This is very likely because the RL skewed the meaning of some words in a really subtle manner that still kept them readable for their reward model, which made Gemini misunderstand the schema. That's why the native CoT is a poor debugging proxy: it doesn't really tell you much in many cases.
Gemini 2.5 and 3.0 Flash aren't like that; they follow a hijacked CoT plan extremely well (except that 2.5 keeps misunderstanding prompts asking for a self-reflection-style CoT, despite doing it perfectly on its own). I haven't experimented with 3.1 yet.
Training on the CoT itself is pretty dubious since it's reward-hacked to some degree (as is evident from, e.g., GLM-4.7, which tried pulling that with 3.0 Pro and ended up repeating Model Armor injections without really understanding or following them). In any case, they aren't trying to hide it particularly hard.
My guess is they mean Google creates those summaries via tool use and isn't trying to filter the actual chain of thought at the API level, or to return errors if the model starts leaking it.
If you work with big contexts in AI Studio (like 600,000-900,000 tokens), it sometimes just breaks down on its own and starts returning raw CoT without any prompt hacking whatsoever.
I believe that if you intentionally tried to expose it, that would be pretty easy to achieve.
Random sampling works well in base (truly unsupervised) models, limited only by the input distribution they're sampling from; I guess you can vaguely call that "sufficiently random" for certain uses, e.g. as a source of linguistic diversity. Any post-training with current methods will narrow the output distribution down; this is called mode collapse. It's not a fundamental limitation, but it's hard to overcome and no AI shops care about it. The annoying LLM patterns in writing and media generation are a result of this.
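A toy sketch (not tied to any real model) of what "narrowing the output distribution" means: if you treat each model as a distribution over phrasings, mode collapse shows up as a drop in entropy. The five "phrasings" and the weights here are made up purely for illustration.

```python
import math
import random
from collections import Counter

def entropy(samples):
    """Empirical Shannon entropy (bits) of a list of samples."""
    counts = Counter(samples)
    n = len(samples)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

random.seed(0)
phrasings = ["A", "B", "C", "D", "E"]

# Toy "base model": picks among phrasings roughly uniformly.
base = [random.choice(phrasings) for _ in range(10_000)]

# Toy "post-trained model": most probability mass piled on one mode.
collapsed = random.choices(phrasings, weights=[90, 4, 3, 2, 1], k=10_000)

print(entropy(base))       # close to log2(5) ≈ 2.32 bits
print(entropy(collapsed))  # much lower: outputs cluster on "A"
```

The collapsed distribution still covers every phrasing occasionally, which is why the repetitiveness is easy to miss in a single sample but obvious in aggregate.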
No, they do it because they're mode-collapsed, use similar training algorithms (or even distillation on each other's outputs) and have a feedback loop based on scraping the web polluted with the outputs of previous gen models. This makes annoying patterns come and go in waves. It's pretty likely that in the next generation of models the "it's not just X, it's Y" pattern will disappear entirely, but another will annoy everyone.
This is purely an artifact of training and has nothing to do with real human writing, which has much better variety.
You're repeating it so many times that it almost seems you need it to believe your own words. All of this is ill-defined - you're free to move the goalposts and use scare quotes indefinitely to suit the narrative you like and avoid actual discussion.
Yes, there's a ton of navel-gazing, but I'm not sure who's more pseudo-intellectual: those who think they're gods creating life, or those who think they know how minds and these systems work and post stochastic-parrot dismissals.
Anonymity and Discord sound funny when used in the same sentence. They've always been pretty greedy about user data and have had hard-to-avoid phone verification for a very long time.
I wouldn't read too much into it. It's clearly LLM-written, but the degree of autonomy is unclear. That's the worst thing about LLM-assisted writing and actions: they obfuscate the human input. Full autonomy seems plausible, though.
And why does a coding agent need a blog in the first place? Simply having one looks like a great way to prime it for this kind of behavior. Like Anthropic does in their research (consciously or not, their prompts tend to push the model in the direction they later declare dangerous).
Even if it’s controlled by a person, and I agree there’s a reasonable chance it is, having AI automate putting up hit pieces about people who deny your PRs is not a good thing.