Generative AI Has a Visual Plagiarism Problem::Experiments with Midjourney and DALL-E 3 show a copyright minefield

  • @[email protected]
    English
    21
    10 months ago

    The new version of Midjourney has a real overfitting problem. If I remember correctly, someone found out that v6 was trained partially on Stockbase image pairs, so they went to Stockbase, found some images, and used those exact tags in their prompts. The output greatly resembled the training data, and that’s what ignited this whole thing.

    Edit: I found the image I saw a few days ago. They need to go back and retrain their model, IMO. When the output is this close to the training data, it has to be hurting the creativity of the model. This should only happen with images that haven’t been de-duped in the training set, so I don’t know what’s going on here.

    • @Blue_Morpho
      English
      1
      10 months ago

      In 15 minutes I can get Google to give me a link to pirated content. Hosting links to pirated content gets you arrested in the US. But Google doesn’t just hand you the pirate links, which is why it is legal. It’s a tool that you can use to get them if you work at it a little.

      • @[email protected]
        English
        2
        10 months ago

        I’m not arguing on the side of the detractors, I just think the model could produce better output than this.