One of the arguments made for Reddit’s API changes is that Reddit is now the go-to place for LLM training data (e.g. for ChatGPT).

https://www.reddit.com/r/reddit/comments/145bram/addressing_the_community_about_changes_to_our_api/jnk9izp/?context=3

I haven’t seen much discussion of this and would like to hear people’s opinions. Are you concerned about your posts being used for LLM training? Do you not care? Would you prefer that your comments be available to train open-source LLMs?

(I will post my personal opinion in a comment so it can be upvoted or downvoted separately.)

  • @TheBananaKing
    11 year ago

    I don’t care if people train models off my posts. I released the content into the wild; I don’t much care what happens to it after that. Attribution of direct quotes is nice to have, but twiddling some weights in a language model is far too abstruse for me to care about.

    And sure, if OpenAI is inhaling all of Reddit, it’s reasonable to charge for that.

    But shutting down third-party apps was never about that.