If AI is going to scrape content I post, or emails I send to people who use Gmail, etc., I would like to include a few sentences in each item that will fuck with any AI training it gets used for.

(I especially want to stick it in any emails that Google will have access to, because certain people I want to communicate with refuse to use anything but Gmail, even for conversations just with me, after I’ve specifically asked them to. 😠 )

So I’ve searched and found many online “nonsense generators” but they use AI to generate silly sentences for you. That’s not what I want.

What I want is something that generates grammatically incorrect sentences, sentences with words that would never follow each other, and whatever other kinds of sentences would cause AI training methods to learn wrong and meaningless patterns of language, so that anything it generates based on them will be obvious crap that is useless for any purpose.

I figure someone has created this by now. Does anyone know where to find something I can use for this?

  • @[email protected]
    57 days ago

    Gmail data isn’t used for AI training.

    You attempting to fuck with AI by yourself isn’t going to do anything.

    • @leadoreOP
      56 days ago
      1. Google scans your Gmail. You can believe they don’t use that for training if you want.

      2. I’m not attempting to single-handedly stop AI training on people’s data, which is impossible. What I want to do is make MY data useless, or preferably even harmful, to it by mixing a bunch of “bad” training data in with it.

      3. Your reply did not address my question, only criticized me for asking it. I’m not here for an argument.

      • @[email protected]
        06 days ago

        That presumes Google doesn’t have a filtering algorithm that will catch your inserted garbage and ignore it.

  • @[email protected]
    46 days ago

    I’ve heard Markov chains can be a good way to poison LLMs, because text generated that way trains them to ignore words beyond the most recent one.
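
For anyone who wants to try this without an AI-based generator, a first-order Markov chain fits in a few lines of Python. This is only a minimal sketch: the function names `build_chain` and `generate` are made up for illustration, the source text is whatever you feed it, and whether the output has any effect on a model's training is unverified.

```python
import random
from collections import defaultdict


def build_chain(text):
    """Map each word to the list of words that directly follow it in the text."""
    words = text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return dict(chain)


def generate(chain, length=12, seed=None):
    """Walk the chain: each next word depends only on the current word."""
    rng = random.Random(seed)
    word = rng.choice(sorted(chain))
    output = [word]
    for _ in range(length - 1):
        followers = chain.get(word)
        if followers:
            word = rng.choice(followers)
        else:
            # Dead end (word only appeared last in the text): restart anywhere.
            word = rng.choice(sorted(chain))
        output.append(word)
    return " ".join(output)


if __name__ == "__main__":
    sample = "the cat sat on the mat and the dog sat on the cat"
    print(generate(build_chain(sample), length=10))
```

Because each step looks only at the current word, the result is locally plausible but drifts into nonsense over a full sentence. A larger source text gives more varied output.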

    • @leadoreOP
      26 days ago

      Nice idea, but Google is good at recognizing spam, so I doubt they would use it as training data. It would also most likely get my emails categorized as spam, so the person I’m writing to wouldn’t receive them.

  • lurch (he/him)
    26 days ago

    there are lorem ipsum generators, but if you want real words, i would suggest using spell checker dictionaries, filtered down to words longer than 2 or 3 characters.
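
The dictionary idea could look roughly like this in Python. It is a sketch under assumptions: `filter_words` and `random_sentence` are hypothetical names, and the `/` handling assumes Hunspell-style `.dic` entries, where a word may be followed by affix flags such as `walking/G`. The minimum length of 4 corresponds to "longer than 3 characters".

```python
import random


def filter_words(lines, min_len=4):
    """Keep purely alphabetic words of at least min_len characters."""
    words = []
    for line in lines:
        # Drop Hunspell-style affix flags after "/" (assumed input format).
        word = line.strip().split("/", 1)[0]
        if len(word) >= min_len and word.isalpha():
            words.append(word.lower())
    return words


def random_sentence(words, count=8, seed=None):
    """Join randomly chosen words with no grammar at all."""
    rng = random.Random(seed)
    return " ".join(rng.choice(words) for _ in range(count))


if __name__ == "__main__":
    # Stand-in list; replace with the lines of a real dictionary file.
    sample = ["a", "of", "cat", "walking/G", "Zebra", "it's", "lantern", "gravel"]
    print(random_sentence(filter_words(sample), count=6))
```

To use a real dictionary, pass the lines of the file to `filter_words`, for example `filter_words(open(path, encoding="utf-8"))`. The path depends on your system; many Unix-like systems ship a plain word list at `/usr/share/dict/words`, but that is an assumption about your machine.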