• @[email protected] OP
      English
      11
      1 month ago

      Some have allegedly paid.

      “We’ve provided about 20-30 companies/teams with our entire dataset. It’s the same data as on our torrents page, but they get access to high-speed SFTP servers.”

      “Usually, this is in exchange for a large monetary donation or, on occasion, in exchange for good datasets they acquired,” ‘Anna’s Archivist’ adds, noting that all data they obtain is shared publicly.

      • FaceDeer
        14
        1 month ago

        The fact that Anna’s Archive is accepting additional datasets as “payment” makes me comfortable that they’re not in this for the money but rather for ideological reasons.

  • FaceDeer
    32
    1 month ago

    Guess we’ve finally reached the moment where letting the giant intellectual property cartels monopolize human culture is going to cause serious economic side effects for other big corporations rather than simply screwing over the general public.

  • @[email protected]
    English
    16
    1 month ago

    The future of AI innovation may hinge on the outcome of a global copyright debate.

    Meh, US is not the world.

  • @General_Effort
    English
    8
    1 month ago

    “We cleaned 860K English and 180K Chinese e-books from Anna’s Archive,” a DeepSeek VL paper, published last March, states.

    Hmm.

  • @[email protected]
    English
    5
    1 month ago

    Honestly, this is the best thing about the AI hype.

    Remember to support your local (shadow) library!

  • hendrik
    English
    3
    1 month ago

    Yeah, information wants to be free. I’d say we just do away with copyright /s

    Or I could try training AI as well once this is settled. Of course, I’d need a few big hard drives to store a few books, audiobooks, music, Netflix series… Or is this just a perk for big and greedy companies?