Some people say we're near the end of pre-training scaling, and RLHF etc is goin...

Some people say we're near the end of pre-training scaling, and RLHF etc is going to be more important in the future. I'm interested in trying out systems like https://github.com/OpenPipe/ART to be able to train agents to work on a particular codebase and learn from my development logs and previous interactions with agents.