Doesn’t it follow that AI-generated CSAM can only be produced if the AI was trained on CSAM?

This article even explicitly says as much.

My question is: why aren’t OpenAI, Google, Microsoft, Anthropic… sued for possession of CSAM? It’s clearly in their training datasets.

  • @DragonsInARoom
    15 hours ago

    I would imagine that AI-generated CSAM can be “had” in big-tech AI in two ways: contamination, and training from an analog. Contamination would be the AI’s training passes using data that was introduced into an otherwise uncontaminated training pool (not raw CSAM material being added deliberately). Training from analogous data is what the name states: get as close to the CSAM material as possible without raising eyebrows. Or criminals could train off of “fresh” CSAM that is unknown to law enforcement.