• @brucethemoose
    link
    English
    13
    edit-2
    2 days ago

    I had suspicious before, but I knew they were screwed when Qwen 2.5 came out. 32Bs and 72Bs nipping at their heels… O3 was a joke in comparison.

    And they probably aren’t fudging anything. Base Deepseek isn’t like crazy or anything, and the way they finetuned it to R1 is public. Researchers are trying to replicate it now.