Judge dismisses majority of GitHub Copilot copyright claims

@[email protected] · 6 months ago

Judge dismisses majority of GitHub Copilot copyright claims

troed · 6 months ago

Well. Aren’t those two exactly what open source licensing is about?

Either you follow the license, or you are in violation of copyright.

@Crackhappy · 6 months ago

Hmmm is it copyright or breach of contract? It’s a valid point.

@[email protected] · 6 months ago

It’s interesting.

I imagine this isn’t even theoretical, because a set of AI remastered Star Wars prequels is probably going to happen, and Disney is definitely going to claim to own it and to suppress it.

@Hawke · 6 months ago

Depends. Do you have more money than Disney? If so, the odds are in your favor.

@[email protected] · edit-2 6 months ago

If you make a byte-for-byte copy of something, why would you think copyright would not apply? If you listened to the dialogue of a Marvel movie, wrote it down line for line, and it so happened that the stage directions you wrote were identical to those in the movie, congrats, you’ve worked your way into a direct copy of something that’s under copyright. If you draw three circles by hand in exactly the right way, you might get a Mouse coming after you. If you digitally render those circles in Photoshop, same idea (or concept; yes, I know one is a trademark issue).

@[email protected] · 6 months ago

Looks to me like the ruling is saying that the output of a model trained on copyrighted data is not copyrighted in itself.

By that logic, if I train a model on Marvel movies and get something that is exactly the same as an existing movie, that output is not copyrighted.

It’s a stretch, for sure, and the judge did say that he didn’t consider the output to be similar enough to the source copyrighted material, but it’s unclear what “close enough” is.

What if my model is trained on Star Wars and outputs a story that is novel, with different characters and different voices? That’s not copyrighted then, despite the model being trained exclusively on copyrighted data?

@[email protected] · edit-2 6 months ago

I didn’t see a notification for your reply!

I think of it this way: at some point it surprised me that Microsoft doesn’t claim ownership in some way of the output of Microsoft Word. I think if “word processing” hadn’t existed until this point in history, there’s no way you’d be able to just write down whatever you want. What if you copied the works of recently deceased, beloved poet Maya Angelou? Think of the estate! I heard people were writing down the lyrics of Taylor Swift’s latest album, printing off hundreds of copies, and sharing them with people at her concerts. Someone even tried to sell an entire word-for-word copy of Harry and Meghan’s last bestseller on Amazon, which they claimed they “created” since they retyped it themselves, until the publisher shut it down.

Obviously all of those things (except my speculation about them claiming any ownership of the output, but look at OpenAI and their tool) don’t happen, but also I think people can write down their favorite poems if they want or print out lyrics because they want to or sit around typing up fan fiction with copyrighted characters all day long, and then there are rules about what they can sell with that obviously derivative content.

If someone spends forever generating AI Vegetas because Vegeta is super cool or they want to see Vegeta in a bowl of soup or whatever, that’s great. They probably can’t sell that stuff because, y’know, it’s pretty clearly something already existing. But if they spend a lot of time creating new novel stuff, I think there’s a view that (for the end user) the underlying technology has never been their concern. That’s kind of how I see it, but I can understand how others might see it differently.