• @merari42
    link
    21 month ago

    Didn’t deepseek solve some of the data wall problems by creating good chain of thought data with an intermediate RL model. That approach should work with the tried and tested scaling laws just using much more compute.