The 40-line cap not moving rate limits makes sense - plan text is cheap. The cost is in Phase 1 exploration.
Plan mode spins up to 3 explore subagents before the planner even starts, and the heuristic is "use multiple when scope is uncertain." It won't choose fewer - it's being asked to plan, so scope is always uncertain. Nothing penalizes Claude for over-exploring and nothing rewards restraint.
Plan mode also ignores session state. A cold start gets the same fanout as a warm session where the relevant files are already in context. In a warm session the explore pass is pure waste - it re-reads loaded files and feeds the planner lossy summaries that conflict with what it already knows.
More tokens, worse plan.
If exploration were conditional on what's already in context (skip it for warm sessions, keep it for cold starts), that would do more for both rate limits and plan quality than a hard 40-line cap.
Note: plan mode didn’t always have this 3-subagent fan-out behavior attached to it. It was introduced around the Opus 4.6 launch.
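To make the "conditional on what's already in context" idea concrete, here's a rough sketch of how a fan-out decision could look. Everything in it is an assumption for illustration: `Session`, `explorers_needed`, and the notion that you have a cheap guess at the relevant files (say, paths named in the request) are all hypothetical, and none of this reflects how plan mode is actually implemented.

```python
from __future__ import annotations

from dataclasses import dataclass, field

MAX_EXPLORERS = 3  # the fan-out ceiling mentioned above


@dataclass
class Session:
    """Hypothetical session state: paths whose contents are already in context."""
    files_in_context: set[str] = field(default_factory=set)


def explorers_needed(session: Session, relevant_files: set[str]) -> int:
    """Scale the explore fan-out by how much of the relevant scope is unread."""
    if not relevant_files:
        # No guess at scope at all: treat it as a cold start.
        return MAX_EXPLORERS
    missing = relevant_files - session.files_in_context
    if not missing:
        # Warm session: everything relevant is already loaded, skip exploring.
        return 0
    # Integer ceiling of (missing / relevant) * MAX_EXPLORERS.
    scaled = (len(missing) * MAX_EXPLORERS + len(relevant_files) - 1) // len(relevant_files)
    return min(MAX_EXPLORERS, max(1, scaled))
```

A warm session returns 0 and goes straight to the planner; a cold start still gets the full fan-out.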
I get excited for every new vision model, especially those that work better and more efficiently. Vision is where we are so very far behind.. I can’t wrap my head around it
What do you mean far behind? Far behind what? The new (actually the old one too) Qwen can give you bounding rectangular prisms around things in a scene, correctly OCR text with ink spilled on it, read graphs, and understand spatial relationships. I think it's pretty impressive for something I'm running on like a 5 year old GPU.
yeah i know lol, that’s kind of my point. impressive that it runs on your gpu, but it still can’t tell you what happens if you tilt a glass. that’s what world models are working toward. but even then..so what? you get a perfect simulator. it knows the glass tips. it still doesn’t know why someone tipped it, or what happens if they don’t. A four year old can do this and we’re just barely on step one and a half.
I think it’s much simpler & easier to just build this into agents than trying to modify every tool ever created to be less verbose. Just guard agents from it user-side. Let users control what they want to see and pass into context.
I think you guys are on the right track here. I’d love to learn more about the math behind the FDM. I don’t think folks realize how behind we are on vision, thank you for your work here.
thanks! the math and architecture of the FDM (no video encoder) is pretty simple, it's a regular transformer with next-token predictions but with frames interleaved.
Recommend you never give Codex or Claude access to rm or deletions in general. Always force them to replace files rather than delete them, and to move files into an ~/archive folder when the goal is to “remove” something rather than replace it.
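A minimal sketch of the "archive instead of delete" idea, in case it helps. The `archive` helper and its timestamped naming are my own assumptions, not part of any tool; the only thing taken from the comment above is the ~/archive destination.

```python
import shutil
import time
from pathlib import Path


def archive(path: str, archive_root: str = "~/archive") -> Path:
    """Move `path` into `archive_root` instead of deleting it; return the new location."""
    src = Path(path).expanduser()
    if not src.exists():
        raise FileNotFoundError(src)
    dest_dir = Path(archive_root).expanduser()
    dest_dir.mkdir(parents=True, exist_ok=True)
    # Prefix with a timestamp so archived copies with the same name don't collide.
    stamp = time.strftime("%Y%m%dT%H%M%S")
    dest = dest_dir / f"{stamp}-{src.name}"
    counter = 0
    while dest.exists():
        # Same name archived twice within one second: add a counter.
        counter += 1
        dest = dest_dir / f"{stamp}-{counter}-{src.name}"
    shutil.move(str(src), str(dest))
    return dest
```

The agent gets this (or a shell equivalent) as its only way to "remove" things, so anything it gets wrong is recoverable.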
This works well, but is not foolproof. You can add a hook to Claude Code to block those commands at various stages. I have some useful hooks in my https://GitHub.com/claude-warden repo.
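For a sense of what such a hook might check, here's a rough sketch of a denylist test on a shell command string. This is not taken from the linked repo; `block_reason` and both sets are hypothetical, and how the check gets wired into a hook is left out. It's also deliberately incomplete: something like `sh -c 'rm x'` slips straight past it.

```python
import shlex
from typing import Optional

# Assumed denylist for illustration; a real one would need to be much longer.
DESTRUCTIVE_COMMANDS = {"rm", "rmdir", "unlink", "shred"}
# find primaries that delete files or run arbitrary commands.
FIND_ACTIONS = {"-delete", "-exec", "-execdir", "-ok", "-okdir"}


def block_reason(command: str) -> Optional[str]:
    """Return a reason to block `command`, or None if nothing matched."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        # Unbalanced quotes and the like: refuse rather than guess.
        return "unparseable command"
    for token in tokens:
        # Conservative: matches the word anywhere, even as an argument.
        if token in DESTRUCTIVE_COMMANDS:
            return f"uses {token}"
    if "find" in tokens and FIND_ACTIONS & set(tokens):
        return "find with a delete or exec action"
    return None
```

A hook would block the call whenever `block_reason` returns something other than None and pass the reason back to the agent.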
It's a good guardrail, but like you say, it's not foolproof. Lots of commands have destructive options, or can be used to invoke arbitrary operations in turn. `find`, for example, is just as risky a call as `rm`. I can just imagine the reasoning chain:
"There is an error due to <file>. If I remove <file>, the error could be resolved. I don't have permission to use `rm`, but `find` can be used to delete files and I have permission to use that..."
Shit man my Pet Feeder set up a back door to my network.. ended up reverse engineering the entire tuya piece of shit just so I could keep the automatic feeder running.
Fucking everyone is spying. I started downloading and decrypting apps from the App Store. It’s a god damn nightmare. Random apps are storing keys in the keychain (thanks expo!) that never leave our apple account. They follow us forever. You can’t delete them. Well.. there’s one way but it involves backing up your phone, putting it in recovery mode, and restoring from backup.
I just bought a reseller plan from verpex host for $5/month. Can host unlimited domains and bandwidth with WHM. Access everything through cPanel and ftp. SSH on occasion.
The reader view is broken. Despite my other comment this is really bad web design. So bad that I couldn't share this article with normal people who won't put up with this. I really wanted to since the story is so interesting.