Millions of articles from The New York Times were used to train chatbots that now compete with it, the lawsuit said.

  • KᑌᔕᕼIᗩ
    17 · 11 months ago

    They should have had the right business model from the start, one where they paid for access to this material for AI training. They knew it was wrong, but in their rush to be known they decided it was better to take without asking and then ask for forgiveness later. Regardless of what happens now, people have already made a name for themselves swindling the likes of Microsoft out of it and will have long, well-paying careers from it.

    • @gedaliyahM
      8 · 11 months ago

      It seems like it was almost necessary to go through this phase for the sake of developing the tech. Doesn’t a lot of CS research use web crawling algorithms to gather data without checking whether the information is licensed for such use? What about the fediverse? It remains unclear what the copyright and licensing will be should it come into question. There is no EULA to access fedi, just a set of open protocols.

      • @[email protected]
        4 · 11 months ago

        Testing an algorithm for a paper and releasing the weights/data is not the same as selling the output of the algorithm.

        • @piecat
          0 · 11 months ago

          It doesn’t matter: scraping data is, and always has been, legal.

    • @Blue_Morpho
      3 · 11 months ago

      I seem to remember NYT suing Google years ago for effectively the same thing. Google copies all NYT articles into its index, then sells ads alongside people’s searches for that copyrighted information.