Is there a creative commons or other open access art license that prevents it from being used to train AI?

@[email protected] · 2 years ago

Is there a creative commons or other open access art license that prevents it from being used to train AI?

@[email protected] · 2 years ago

I have no actual answer, but given the very messy state of AI legality right now, I imagine it could be a while before we’re even able to define everything well enough to establish a solid legal framework for this sort of thing.

That said, I’d be happy to be proven wrong - this is definitely an important idea for society moving forward.

@[email protected] · 2 years ago

I mean, is copyright not specifically designed (by the big corporations mind you) to default to not allowing content to be used unless permission is explicitly given by the rights holder? So shouldn’t the answer to whether any content can be used be a big NO unless the author or distributor specifically allows it to be used?

@azuth · 2 years ago

There’s this thing called fair use.

The usage is clearly limited, as can be determined by the size of the training materials versus the size of the models. I would argue the use is transformative enough; after all, you go from text/image inputs to effectively a tool that can produce texts and images.

@[email protected] · 2 years ago

If I write a story too similar to a Disney movie I will get sued, yet this is okay? Wtf

raubarno · 2 years ago

I am not a lawyer. This is not legal advice.

I just want to note here that ideas (in creations) alone are not subject to copyright (see: Wikipedia). (Ideas involving a particular invention can be subject to patents, but a Disney movie is not an invention, so this is not the case.)

This way, if you watch a movie, become inspired by it, and just take the idea of the story and implement it in your creation with different characters, then you should be fine. However, you cannot use the assets of that movie (original names, fashion, specific attributes of characters, scenery).

At the end of the day, for such things, you may want to consult your advocate (a.k.a. lawyer). Court practice makes DOs and DON’Ts much clearer.

@[email protected] · 2 years ago

I’m not a copyright lawyer or a regular lawyer or even a well-informed citizen really, so I couldn’t really say. Certainly in this case, if a project is under a more permissive license, I imagine the intention could be argued either way as far as AI is concerned.

0xCAFe · 2 years ago

In theory, a copyleft licence should work. The problem however is a) how are you going to find out and b) how are you going to enforce it?

@[email protected] · edit-2 2 years ago

You’re going to enforce it the same way you enforce the license on every random GitHub project you make: you don’t. Didn’t they discover that Windows XP contained GNU code in it or something? I mean, who prevents corporations from stealing random open source code for use in their shitty corporate closed source projects? It’s not like anyone would ever find out.

raubarno · 2 years ago

So, CC-BY-SA, which would require AI training database to be copyleft…

Andreas · 2 years ago

I don’t think the legal protections will be effective to prevent AIs from being trained on your works, because the data sets used to train the models are scraped from art sharing websites and it won’t be possible to identify that your art was part of the training set, legally or not. A better way is to use a tool like Glaze, which modifies your artwork so it looks the same to a human viewer but introduces errors when fed to an AI model.

Daeraxa · 2 years ago

You might have more luck asking on https://opensource.stackexchange.com/. I’d certainly hope that somebody using data from an AI trained on that image should be required to give attribution or shouldn’t be allowed to use it if modification is not allowed.

@[email protected] · 2 years ago

I worked in data collection for an AI project (in a specialized domain, so no text or pictures), and our lawyer guided us on two things: first, we needed explicit approval to use some data, and second, if someone retracted their approval, we could keep the data as long as we couldn’t “trace it back” to that person… I ended up leaving the project.

@Protegee9850 · edit-2 2 years ago

No, such a license does not exist. It is very likely that Stable Diffusion and the like are more akin to fair use machines than plagiarism ones, transformative enough to rely on a fair use defense even if you use all-rights-reserved and don’t go open at all. EFF has a good explainer imo. https://www.eff.org/deeplinks/2023/04/how-we-think-about-copyright-and-ai-art-0 Source: am lawyer with copyright background.

@[email protected] · 2 years ago

I think one of the problems right now is the lack of a proper legal definition of what AI is doing with your material. A human learning how to create original work by reading your work would not be required to cite it. The question is why and how exactly AI is doing something different.

@[email protected] · 2 years ago

But if a human straight up copies someone else’s writing, that’s illegal. AI spits out word for word passages from training data all the time.

@TheCakeWasNoLie · edit-2 2 years ago

This is an excellent question, seeing as these AIs are mostly trained from publicly available materials. Brian Lunduke created his own license, The Lunduke Content Usage License 1.0, in which he charges violating AI companies a hefty fine.

One way to look at this is that AI training bots could be configured to start avoiding any content that falls under this license, fearing these fines. This effect would seem to be the most likely way to successfully protect content from training AI, much likelier in any case than trying to find out any violations after the fact.
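
As a rough illustration of what "configured to avoid" could mean in practice, here is a minimal sketch of a license check a crawler might run before keeping a page. Everything in it is an assumption for illustration: the blocklist, the function name `may_use_for_training`, and the idea that pages declare their license through `<link rel="license">` or a `<meta name="license">` tag (the latter is not a standard, just a convention some sites might use). Nothing here reflects how any real training crawler works.

```python
from html.parser import HTMLParser

# Hypothetical blocklist: license names a crawler operator chose to avoid.
BLOCKED_LICENSE_MARKERS = ("lunduke content usage license",)


class LicenseFinder(HTMLParser):
    """Collects license hints a page declares about itself."""

    def __init__(self):
        super().__init__()
        self.hints = []

    def handle_starttag(self, tag, attrs):
        attr = {name: (value or "") for name, value in attrs}
        rel_values = attr.get("rel", "").lower().split()
        if tag in ("link", "a") and "license" in rel_values:
            # rel="license" is a standard HTML link type.
            self.hints.append(attr.get("href", "") + " " + attr.get("title", ""))
        elif tag == "meta" and attr.get("name", "").lower() == "license":
            # Assumed convention, not part of any standard.
            self.hints.append(attr.get("content", ""))


def may_use_for_training(html: str) -> bool:
    """Return False if the page declares a license on the blocklist."""
    finder = LicenseFinder()
    finder.feed(html)
    declared = " ".join(finder.hints).lower()
    return not any(marker in declared for marker in BLOCKED_LICENSE_MARKERS)
```

A page carrying `<meta name="license" content="The Lunduke Content Usage License 1.0">` would be skipped by this check, while a page with no declaration at all would pass. That gap is the weak spot: the check only helps when the license is stated in a machine-readable way and the crawler operator chooses to honor it.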