• @moseschrute
    118 days ago

    Interesting! I wonder whether this new training method will improve the model's performance or only make training more efficient. I don’t know much about R1, and I had no idea ByteDance was also working on LLMs.