Threads like this are why I discuss this shit on Lemmy, not on HN itself. The idiocy in the comments there is facepalm-worthy.
Plenty of users there are trapping themselves in the “learning” metaphor, as if LLMs were actually “learning” shit like humans would. It’s a fucking tool, dammit, and it is being legally treated as such.
The legal matter here boils down to this: OpenAI is picking content online and feeding it into a tool, the tool transforms it into derivative content, and the derivative content is served to users. Is the transformation deep enough to put that usage outside the reach of copyright? Answer: nobody has decided yet.
The other part of the controversy is that in certain cases where the benefit to society is strong enough, copyright can be ignored.
It’s not impossible that the Feds will step in and explicitly allow scraping for AI use, because falling behind China in LLM development is a national security issue.
That sounds reasonable.
Good luck with that, NYT.
And if they are, 2 seconds later someone can train a new one. Maybe they should learn to code like those coal miners they pitied.
2 seconds later someone can train a new one
“Training” datasets:
Does this look like the amount of content that you’d get in two seconds???
Maybe they should learn to code like those coal miners they pitied.
And maybe you should go back to Reddit.