• @[email protected]
    10 · 1 day ago

    In Kadrey vs. Meta, authors including Richard Kadrey, Sarah Silverman, and Ta-Nehisi Coates have alleged that Meta has violated their intellectual property rights by using their books to train its Llama AI models, and that the company removed the copyright information from their books to hide the alleged infringement.

    In Friday’s ruling, Chhabria wrote that the allegation of copyright infringement is “obviously a concrete injury sufficient for standing” and that the authors have also “adequately alleged that Meta intentionally removed CMI [copyright management information] to conceal copyright infringement.”

    “Taken together, these allegations raise a ‘reasonable, if not particularly strong inference’ that Meta removed CMI to try to prevent Llama from outputting CMI and thus revealing it was trained on copyrighted material,” Chhabria wrote.

    The judge did, however, dismiss the authors’ claims related to the California Comprehensive Computer Data Access and Fraud Act (CDAFA), because they did not “allege that Meta accessed their computers or servers — only their data (in the form of their books).”

  • @[email protected]
    7 · 1 day ago

    If I’ve understood it correctly, this isn’t a lawsuit over the recent discovery of Meta downloading terabytes of books from Libgen?

  • @Grimy
    0 · 1 day ago

    The lawsuit would be a small blow to Meta but an absolutely massive one to open source. Google, Meta and Microsoft are essentially the only companies that can actually afford to pay for this data.

    It’s these lawsuits that will pave the way towards a soft monopoly with a limited choice of censored models all behind pricey subscription services.

    • @[email protected]
      -2 · 1 day ago

      So your argument is that… I’m having trouble seeing your argument, actually. There’s no way in hell that any company will ever pay for the rights to train on books; they will try to find a workaround. The most expensive part currently is the energy, and companies can barely stomach that. They’re definitely not going to deal with publishers that will most definitely charge much, much more than that for what amounts to eternal use of their books. If this lawsuit succeeds, it will kill all training of AI on books. Big companies won’t be excluded.

      • @Grimy
        2 · 19 hours ago

        Google paid 60 million for Reddit’s data. They will pay whatever price the 5 big publishing houses ask, and they will happily do it because it gives them a monopoly.

        Google’s revenue in 2022 was 280 billion. They can easily afford this and aren’t close to “barely stomaching” anything.

        Wishful thinking imo.