Summary

Alibaba has launched Qwen 2.5-Max, an AI model it claims outperforms DeepSeek-V3, OpenAI’s GPT-4o, and Meta’s Llama-3.1-405B.

The release, coinciding with Lunar New Year, reflects mounting competition in China’s AI sector after DeepSeek’s rapid rise.

DeepSeek’s recent advancements have pressured Chinese rivals like ByteDance and Baidu to upgrade their models and cut prices.

DeepSeek’s founder downplays price wars, focusing on artificial general intelligence (AGI). The company’s lean, research-focused structure contrasts with China’s tech giants, which face challenges in AI innovation.

  • @[email protected]
    68 points · 1 day ago

    Well, the models start comin’ and they don’t stop comin’…

    The US tech sector has just been completely disrupted. Turns out decades of slashing public education and demonizing “liberal” colleges are starting to catch up. Even Elmo himself said that H-1B visas are critical because the US simply isn’t producing enough talent, but he and the other tech billionaires didn’t realize that money can’t buy everything, as they’re now learning, caught with their pants down.

    • @[email protected]
      20 points · 1 day ago

      Well, the models start comin’ and they don’t stop comin’…

      Got my RTX, gonna hit the ground runnin’…

      • @pHr34kY
        5 points · 20 hours ago

        Didn’t make sense just to train for fun.

        • @locahosr443
          3 points · 20 hours ago

          Gonna steal some data, it’s free to learn

    • @mlg
      7 points · 1 day ago

      I read this entire comment synced to Smash Mouth lmao

  • Em Adespoton
    38 points · 2 days ago

    DeepSeek’s “big change” isn’t the performance of its model though; it’s that it is fully open and operates on a fraction of the resources.

    Is Alibaba’s model also open weights, open reasoning, free for anyone to run, and runnable (and trainable) on consumer hardware?

    • trevor
      37 points · 1 day ago

      Call it “open weight” if you want, but it’s not “fully open”. The training data is still proprietary, and the model can’t be accurately reproduced. It’s proprietary in the same way that Llama is proprietary.

      • @[email protected]
        9 points · 1 day ago (edited)

        But I could use it as a starting point for training and build from it with my own data. I could fork it. I couldn’t fork Llama; I don’t have the weights.

        • trevor
          10 points · 1 day ago

          You can also fork proprietary code that is source available (depending on the specific terms of that particular proprietary license), but that doesn’t make it open source.

          Fair point about Llama not having open weights though. So it’s not as proprietary as Llama. It still shouldn’t be called open source if the training data it needs to function is proprietary.

  • r00ty
    29 points · 2 days ago

    Oh, good. Maybe they will stop trying to scrape my websites at some ridiculous rate using spoofed real-browser UAs. I just blocked their whole ASN (AS45102) in the end.

  • NielsBohron
    21 points · 2 days ago

    I thought for sure this was an Onion article

    • ThePowerOfGeek
      20 points · 2 days ago

      I already have the Temu AI pseudocode. Here you go:

      10 PRINT “Hi, how can I help?”

      20 INPUT A$

      30 PRINT “That’s awesome! What else?”

      40 GOTO 20

    • ms.lane
      13 points · 2 days ago

      2 Reese’s Cups and a pack of ramen. Alibaba are efficient!

  • @[email protected]
    4 points · 2 days ago

    Oh cool, I was worried my 401k had almost sort of recovered from the last bombshell earlier this week…

  • @A_A
    -3 points · 2 days ago

    DeepSeek_R1 outperform or equalzz GPT-1o is major newZ, but : 4o is much better than 1o. Now, Qwen-2.5Max outperforms GPT-4o … watever the investment involved, this is even more important ( ! ).

      • @A_A
        -3 points · 1 day ago

        😋 yes, why ? becauzzze of the zzZ ?

        • @ebolapie
          8 points · 1 day ago

          Among other things, yes.