Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

> it is never going to work in the real world

DeepSeek used RL to train R1, so that is clearly not true. But ignoring that, what is your alternative? Supervised learning? Good luck finding labels if you don’t even know what the objective is.



No, let's not ignore DeepSeek: text is not the real world any more than Minecraft is the real world.

And why do I have to offer an alternative? If it's not working, it's not working, regardless of whether there's an alternative (that we know of) or not.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: