Millions of articles from The New York Times were used to train chatbots that now compete with it, the lawsuit said.

  • @gedaliyahM
    8 points · 11 months ago

    It seems like going through this phase was almost necessary for the sake of developing the tech. Doesn’t a lot of CS research use web crawlers to gather data without checking whether the information is licensed for such use? What about the fediverse? It remains unclear what the copyright and licensing terms would be should they come into question. There is no EULA to access fedi, just a set of open protocols.

    • @[email protected]
      4 points · 11 months ago

      Testing an algorithm for a paper and releasing the weights/data is not the same as selling the output of the algorithm.

      • @piecat
        0 points · 11 months ago

        It doesn’t matter: scraping data is, and has always been, legal.