I don't think that GPT-5 Pro is much better (if better at all) than o3-pro. It's markedly slower. Output quality is comparable. It's still quite gullible and misses the point sometimes. It does seem better, however slightly, at suggesting novel approaches to problem solving. My initial impressions are that 5-pro is maybe 0-2% more knowledgeable and 5-10% more inventive/original than o3-pro. The "tone" and character of the models feel exactly the same.
I'll agree that it's superhuman and state-of-the-art at certain tasks: Formal logic, data analysis, and basically short analytical tasks in general. It's better than any version of Grok or Gemini.
When it comes to writing prose, and generally functioning as a writing bot, it's a poor model, obviously and transparently worse than Kimi K2 and Deepseek R1. (It never ceases to amaze me that the best English prose stylists are the Chinese models. It's not just that they don't write in GPT's famous "AI style," it's to the point where Kimi is actually on par with most published poets.)
I think it is. I've been using these models for 6 hours a day for almost a year. At any given time I have 2 of the max subscriptions (right now Grok and OpenAI).
I had a bug that was a complex interaction between the backend and front end over websockets. The day before, I was banging my head against the wall with o3 pro and Grok Heavy; GPT-5 solved it first try.
I think it's also true that most people aren't pushing the limits of the models, and don't even really know how to push them correctly. Which is also why OpenAI is not focused on the best models.
Similar usage as me, but I don't see a difference between o3-pro and 5-pro. Sounds odd, but my impression is that o1-pro was better at creating complex independent small functions than o3-pro/5-pro.
Actually, I'll agree that o1 pro was better than o3 at really deep bug finding/coding analysis. Which is also why I have the theory that they could just turn up the compute to show better results, but don't due to cost. o3 and GPT-5 seem heavily quantized; o1 pro was more raw.
Another thing I’ll add, though, is that o3 pro is better through the API than the chat website. They clearly constrain it unless you’re paying the absurd API cost.
To each their own, but I find the idea of ai-generated poetry sad as hell. I simply can’t see poetry as a collection of evocative words judged without context, in a vacuum— is poetry not both an activity and a relationship to most people? A person deftly portraying some difficult-to-express facet of the human experience and just maybe it viscerally strikes a chord with other people? I just don’t understand how people don’t value the fundamental humanity of that process. Even prose. James Baldwin stories, word for word, would land a hell of a lot differently if they were written and published by Hemingway.
I 100% agree. I am inclined to think an AI may be able to develop a sense of what words carry in future -- they can analyse it -- but it still lacks real experience.
Plus their creative output in literary quality is dreary, dull, and dire. That's why I was so curious for the OP to share examples.
Stylized prose is where Claude 3 Opus particularly shines due to its character training and multilingual performance. It's plagued with claudeisms and has a ton of other shortcomings, but it's still better than any current model at this, including K2, R1, and especially Claude 4. Too bad Anthropic basically reversed their direction on creative writing, despite reporting some improvements each time (which don't seem to be true in practice).