I would have suspected it too, but I’ve been struggling with OpenAI returning syntactically invalid JSON when provided with a simple pydantic class (a list of strings), which shouldn’t be possible unless they have a glaring error in their grammar.
You might be using JSON mode, which doesn’t guarantee a schema will be followed, or structured outputs without strict mode. With strict structured outputs you do get the property that the response is either a valid instance of the schema or an explicit error (e.g. for a refusal).
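For reference, a rough sketch of the difference with the OpenAI Python SDK (the model name and prompt are placeholders, not from this thread):

```python
from openai import OpenAI
from pydantic import BaseModel

client = OpenAI()

class Items(BaseModel):
    items: list[str]

# JSON mode: guarantees syntactically valid JSON, but not your schema.
loose = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "List three fruits as JSON."}],
    response_format={"type": "json_object"},
)

# Structured outputs via a pydantic model: the SDK derives a strict JSON
# schema, so the result is either a valid Items instance or an exception
# (e.g. on refusal or truncation).
strict = client.beta.chat.completions.parse(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "List three fruits."}],
    response_format=Items,
)
print(strict.choices[0].message.parsed)
```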
So cool to see Anthropic support this feature.
I’m a heavy user of the OpenAI version, but they seem to have a bug where the model frequently returns a string that is not syntactically valid JSON, leading the OpenAI client to raise a ValidationError when trying to construct the pydantic model.
Curious if anyone else here has experienced this?
I would have expected the implementation to prevent this, maybe using a state machine to only allow the model to pick syntactically valid tokens.
Hopefully Anthropic took a different approach that doesn’t have this issue.
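For what it's worth, here's a toy illustration of the state-machine idea; it's not anyone's actual implementation, and the prefix check is a stand-in for a real grammar/schema automaton:

```python
import json

def is_valid_prefix(text: str) -> bool:
    # Toy stand-in: could `text` still be completed into valid JSON?
    # Real constrained decoders compile the grammar (or JSON schema) into a
    # state machine rather than probing with candidate suffixes like this.
    for suffix in ("", "]", "}", '"', '"}', '"]', "]}"):
        try:
            json.loads(text + suffix)
            return True
        except json.JSONDecodeError:
            continue
    return False

def allowed_tokens(generated: str, candidates: list[str]) -> list[str]:
    # Mask out any token that would make the output unrecoverably invalid,
    # so the sampler can only pick syntactically legal continuations.
    return [t for t in candidates if is_valid_prefix(generated + t)]

print(allowed_tokens('{"names": ["a"', [']}', ', "b"', 'oops}']))
# -> [']}', ', "b"']
```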
Brian on the OpenAI API team here. I would love to help you get to the bottom of the structured outputs issues you're seeing. Mind sending me some more details about your schema/prompt, or any request IDs you might have, to by[at]openai.com?
Yeah I have, but I think only when it gets stuck in a loop and outputs, for example, an array that goes on forever. A truncated array is obviously not valid JSON, but it'd be hard to miss that if you're looking at the outputs.
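One cheap way to catch that case before it ever hits json.loads is to check the finish reason first; a sketch assuming the OpenAI Python SDK, with the function name being my own:

```python
import json

def parse_or_flag_truncation(completion):
    # `completion` is assumed to be a chat.completions response object.
    choice = completion.choices[0]
    if choice.finish_reason == "length":
        # The model hit max_tokens mid-output (e.g. a runaway array), so the
        # text is cut off and will not parse as JSON.
        raise ValueError("Response truncated by max_tokens; JSON is likely invalid.")
    return json.loads(choice.message.content)
```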
Each year they pay me $1,000 (in the form of HSA deposits, which I can invest) to do basic things like get a checkup, get a flu shot, and get a blood test. I sync my wearable data and they pay me $1-2 each time I exercise or get enough sleep.
A reasonably common belief among people who have studied the issue is that Tether was at one point unbacked (see the NY AG report), and likely fudged their numbers a bit by holding commercial paper that was nominally worth $1 but in practice could be bought for less than $1, but they have since made enough through various investments that they could now plausibly be fully backed.
And to be fair, given basically all of the fraudulent companies that managed to pass audits, the fact that they won't even bother to get one is a pretty strong signal.
But hey, ultimately it's just gonna blow up the economy at some point, but we'll be fine right? Right?
1. Bloomberg Businessweek — “Anyone Seen Tether’s Billions?” (Cover Story, Oct 2021) — a deep investigation into Tether’s backing, counterparties, and leadership. www.bloomberg.com/news/features/2021-10-07/crypto-mystery-where-s-the-69-billion-backing-the-stablecoin-tether
Apologies, I meant about them printing the money Friday. Trying to understand how they're supposed to be stabilizing bitcoin prices, but also why everyone is worried about USDT and whether it's "real".
Tether's backing and solvency become more important when it's a major provider of crypto liquidity.
If liquidity is generated by many participants, the failure of one doesn't impact the underlying asset.
If liquidity is concentrated in one participant, it increases the potential volatility of the asset, as that participant's failure can drastically limit liquidity and leave the asset open to bigger price swings.
That said, even at $1B, Tether is a smaller portion of the BTC market than it was historically.
The US Government wanted companies to build fabs in the US so it offered them money to do it.
Intel, which was one of those companies, but not the only one, took them up on the offer and was paid to begin construction on a fab in the US.
Normally when we pay businesses to do things we don't demand equity stakes in the businesses afterwards.
Notably, the biggest shareholders in Intel appear to be retirement funds of Americans - so Trump has just pilfered some money from the retirement accounts of Americans.
Figure 5 is really quite remarkable. It seems to show that normal LLMs are better at tasks where the correct answer is likely to be the next token. For tasks that require a small number of intermediate steps, current reasoning models do much better, but break down as the number of intermediate steps grow.
This seems to indicate that the next generation of models should focus on recursively solving small parts of the problem before function-calling another model to solve another small part of the problem and working its answer into the reasoning loop.
Many seem to be citing this paper as an indication that LLMs are over - I think this indicates a clear path towards the next step function change in their abilities.
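A very rough sketch of what that recursive decomposition could look like; this is not from the paper, and call_model, the prompts, and the depth cutoff are all hypothetical:

```python
def call_model(prompt: str) -> str:
    # Placeholder for any chat-completion style LLM call.
    raise NotImplementedError

def solve(problem: str, depth: int = 0, max_depth: int = 3) -> str:
    # Past the depth cutoff, answer directly: these are the cases where the
    # correct answer is likely to be the next token and plain LLMs do well.
    if depth >= max_depth:
        return call_model(f"Answer directly: {problem}")

    # Ask the model to split the task into a small number of sub-problems.
    plan = call_model(f"List the sub-problems needed to solve: {problem}")
    sub_problems = [line.strip() for line in plan.splitlines() if line.strip()]

    # Recursively solve each piece (conceptually, function-calling another
    # model), then fold the partial answers back into the reasoning loop.
    partials = [solve(sub, depth + 1, max_depth) for sub in sub_problems]
    return call_model(
        f"Combine these partial results into a final answer for '{problem}':\n"
        + "\n".join(partials)
    )
```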
Respondology is working on a social media comment activation platform, building tools to automatically hide spam, abuse, and hate comments, understand audiences, and respond to users using GenAI/LLMs. We are a small team that recently raised a funding round and are growing rapidly. We are looking to hire a Senior Backend Engineer with significant Python experience; experience with OpenSearch/Elasticsearch is also a plus [120k-140k + equity + bonus]. Our core tech stack is: React, a core Ruby-on-Rails monolith with Python+FastAPI microservices, Postgres, AWS. We are based out of Boulder, CO, but the engineering team is distributed across the US. Learn more here: https://respondology.com/careers/
The last Starship test flight failed, so SpaceX is probably not going to make the Mars launch window next year that they were aiming to hit. It will also probably not meet NASA's goal of returning to the Moon in 2027.