The tool, called Nightshade, messes up training data in ways that could cause serious damage to image-generating AI models. It is intended as a way to fight back against AI companies that use artists’ work to train their models without the creator’s permission.

ARTICLE - Technology Review

ARTICLE - Mashable

ARTICLE - Gizmodo

The researchers tested the attack on Stable Diffusion’s latest models and on an AI model they trained themselves from scratch. When they fed Stable Diffusion just 50 poisoned images of dogs and then prompted it to create images of dogs itself, the output started looking weird: creatures with too many limbs and cartoonish faces. With 300 poisoned samples, an attacker can manipulate Stable Diffusion into generating images of dogs that look like cats.
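
As a rough intuition for why a few hundred mislabeled samples can shift what a model associates with a word, here is a toy sketch. It is not Nightshade's method and has nothing to do with how Stable Diffusion is actually trained: the "model" simply averages 2-D feature vectors per label, and every name and number in it (`train_centroids`, `DOG_FEATURES`, the sample counts) is a hypothetical chosen for illustration.

```python
# Toy illustration only: a "model" that learns one 2-D centroid per label.
# This is NOT Nightshade's algorithm; all names and numbers are made up.

def train_centroids(samples):
    """Average the feature vectors seen for each label."""
    sums, counts = {}, {}
    for (x, y), label in samples:
        sx, sy = sums.get(label, (0.0, 0.0))
        sums[label] = (sx + x, sy + y)
        counts[label] = counts.get(label, 0) + 1
    return {
        label: (sx / counts[label], sy / counts[label])
        for label, (sx, sy) in sums.items()
    }

def distance(a, b):
    """Euclidean distance between two 2-D points."""
    return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5

# Hypothetical feature vectors standing in for "dog-like" and "cat-like" images.
DOG_FEATURES = (0.0, 0.0)
CAT_FEATURES = (10.0, 10.0)

# Clean data: features and labels agree.
clean = [(DOG_FEATURES, "dog")] * 100 + [(CAT_FEATURES, "cat")] * 100

# Poisoned data: cat-like features deliberately labeled "dog".
poison = [(CAT_FEATURES, "dog")] * 300

clean_model = train_centroids(clean)
poisoned_model = train_centroids(clean + poison)

# After poisoning, the learned "dog" centroid sits nearer the cat features
# than the dog features.
dog_after = poisoned_model["dog"]
print("clean dog centroid:   ", clean_model["dog"])
print("poisoned dog centroid:", dog_after)
print("closer to cat features:",
      distance(dog_after, CAT_FEATURES) < distance(dog_after, DOG_FEATURES))
```

In this toy the poisoned samples outnumber the clean dog samples, which is why the centroid moves so far. The article's claim is about a real diffusion model, where the mechanics and the ratios involved are different.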

  • @[email protected]
    18
    1 year ago

    Is this not just adversarial training/generation, except that instead of using it to improve the model they just let it mess the model up? Sorry, blanking on the exact term. My understanding was that some GANs are specifically trained on stuff like this to improve their ability to differentiate.

    • @Restaldt
      3
      1 year ago

      Pretty much

      It's on the same path as a GAN, but there is no adversarial network feedback: nothing is telling the generative AI that it is generating bad data.

      Seems like a GAN without the benefits for training models (which is what they wanted, it seems: to mess with the training data).

      I don't see how this becomes permanent, since the models are already trained. Maybe if the technique becomes easy for artists to apply to their digital works and makes it into the training data for the next models.