Hm, I thought that these salaries were offered to actual "giants" like Jeff Dean or someone extremely knowledgeable in the specifics of what the "business side" of AI might look like (CEOs, etc). Can someone clarify what is so special about this specific person? He is not a "top tier athlete" - I looked at his academic profile and it does not seem impressive to me by any measure. He'd make an alright (not even particularly great) assistant professor in a second tier university - which is impressive, but is by no means unique enough to explain this compensation.
I think the key was multimodality. Meta made a big move in combining text, audio, and images. I remember ImageBind was pretty cool. Allen AI has published some notable models, and Matt seems to have expertise in multimodal models. Molmo looks really cool.
"Our key innovation is a new collection of datasets called PixMo that includes a novel highly-detailed image caption dataset collected entirely from human annotators using speech-based descriptions, and a diverse mixture of fine-tuning datasets that enable new capabilities. Notably, PixMo includes innovative 2D pointing data that enables Molmo to answer questions not just using natural language but also using non verbal cues. We believe this opens up important future directions for VLMs enabling agents to interact in virtual and physical worlds. The success of our approach relies on careful choices for the model architecture details, a well-tuned training pipeline, and most critically the quality of our newly collected datasets, all of which we have released."
This is a solid engineering project with a research component - they collected some data that ended up being quite useful when combined with pre-existing tech. But this is not rocket science and not a unique insight. And I don't want to devalue the importance of solid engineering work, but you normally don't get paid as much for non-unique engineering expertise. This by no means sounds unique to me. This seems like a good senior-staff research eng project in a big tech company these days. You don't get paid 250M for that kind of work. I know very talented people who do this kind of work in big tech, and from what I can tell, many of them appear to have much more fundamental insight and experience, and have led larger teams of engineers, and their comp does not surpass 1-2M tops (taking a very generous upper bound).
A PhD dropout with an alright (passable) academic record, who worked in a 1.5-tier lab on a fairly pedestrian project (multimodal LLMs and agents, sure), and started a startup. Really trying to not sound bitter, good for him, I guess, but does it indicate that there's something really fucked up with how talent is being acquired?
You bring up the only relevant data point at the end, as a throw-in. Nobody outside of academia cares about your PhD and work history if you have a startup that is impressive to them. That's the only reason he's being paid.