Not necessarily. For example Anthropic's ConstitutionalAI (CAI) leverages the mo...

K0balt · on Aug 5, 2023

Yeah, it makes some sense that you could use a more intense introspection to train weaker ones… I wonder what the human analogue for that looks like.

Maybe working up a proof and then quizzing yourself on it?

As long as we get >N supervision and the difference is more than the model retrograde, it seems that could work. But it seems like there is a definite limit to that. The N-n1 difference will only stay above the improvement delta up to a point.

visarga · on Aug 6, 2023

The model would learn from feedback, not just regurgitate the training set, as long as the model is part of a system that can generate this feedback. AlphaGo Zero had self play for feedback. Robots can check task execution success. Even chatting with us generates feedback to the model.