With the latest announcement that Google is allegedly paying Reddit $60 million per year for access to user-created content to train its AI, what is stopping companies from using the freely available information on the lemmyverse to do the same for free?

How does everyone feel about the likelihood of this already happening, and should something be done about it?

  • @[email protected]
    English
    2 · 10 months ago

    Judging by the kind of content we have on the fedi, I can’t wait to see AI saying stuff like “eat the rich”, “Blahaj is so cuuuuuuuuttte ewewewew”, “There is no OS but GNU and Stallman is the prophet”, or “Capitalism is the problem, we need to re-establish the dictatorship of the proletariat”. That would at least be fun.

    If someone did create an LLM using fedi content and let it loose in the comments, I wonder how long it would take for people to realize it’s a bot? I’m sure not flagging it as a bot is a violation of most instances’ rules, and its existence would probably upset some people, but it’s still a fun question.

    • [email protected]
      English
      2 · 10 months ago

      No one would notice. At worst, people would accuse it of trolling as it doubles down on factual inaccuracies. It may, and I say this without any irony, already be here and blending in. Paper books are the future.

            • [email protected]
              English
              1 · 10 months ago

              I realize that I didn’t exactly specify, so you were entirely right to say what you did. I was just referring to pre-AI books with established utility and veracity. Likening things to a modern fallen Rome, rife with knowledge to uncover. And I fully understand why none of that came through given that I neither wrote nor implied any of it. Your nitpick was very much appreciated.

    • @[email protected]
      English
      2 · 10 months ago

      We’re going to get a weird feedback loop soon where future AI is trained on posts created by current AI, eventually poisoning the well of trainable content.