Hacker News | iamronaldo's comments

77% on ARC-AGI-2. Wild.

Seems a bit more than that: "Developed with the nonprofit Policing Lab and Perrone Robotics, the SUV can drive itself, detect suspicious activity through artificial intelligence-powered cameras and even deploy drones for aerial surveillance."


I think the best thing about something like this would be to get people to Slow Down.


Alright then. AI and drones totally make it cooler, or at least more au courant for marketing purposes. Let's say CCTV++. :-)


SOTA is 5.2 Pro, 5.2 Codex, or 5.2 extra high, not 5.1 (I know, I know, it's confusing).


Not least of all because I misread that as Sora at first, lol


That was quick


My first thought was "they must not be seeing as many Claude Code conversions as they hoped"


I bet they just wanted to counter Gemini 3 and stay on top of the leaderboards for coding, and were preparing this for a while to push out alongside Gemini 3.


Whenever one of them releases a milestone release the rest start publishing big milestones too. I'm waiting for Opus 5 next.



Live now


The website link (https://x.ai/news/grok-4-1) is not quite live yet, but it sets new records on lmarena (1483 for thinking and 1465 for fast).


No new benchmarks on hard sciences, coding, etc.




This may be the most relevant xkcd yet. It answered all questions I had about this, thank you.


"Bizarrely uncompetitive" is referencing the 5 uses per day, not the performance itself.


Cost per task went from $200 with o3 preview to $1.5 with o3 medium.

