• @[email protected]
    -11
    10 months ago

    Why would someone make something like this? Geez, people really love building unethical stuff, don’t they?

    • @elliot_crane
      19
      10 months ago

      The tagline is really poorly written IMO. From reading the README, this doesn’t outwardly appear to be a tool for bypassing an artist’s choice to use something like Nightshade, but rather it seems to detect if such a tool has been used.

      I’m assuming that the use case would be to avoid training on Nightshade-ed images, which would actually be respecting the original artist’s decision?

      • @[email protected]
        -1
        10 months ago

        I read the whole thing. I understand it’s for detecting use of Nightshade, not bypassing it. What other even slightly ethical use for this is there besides trying to make sure you don’t train on a poisoned image? These models are clearly not asking for permission first, else you’d never need to do this, so they’re just taking an image, assuming they’re allowed to use it, and then using this tool to detect if it’s going to poison their model.

        • @elliot_crane
          3
          10 months ago

          I don’t think most people are collecting images by hand and saying “ah yes I’m just gonna yoink this and use it in my model”. There are a plethora of sites for sharing repositories of training data, and therefore it’s pretty easy for someone training a model to unknowingly pull down some data they don’t actually have permission to use. It’s completely infeasible to check licensing by hand on what could be millions of images, so this tool makes it easy to simply not train on images that have gone through Nightshade. I fail to see how that’s unethical, as not training on the image is the whole reason the original image was put through Nightshade in the first place.

          • @[email protected]
            0
            10 months ago

            “it’s completely infeasible”

            Then it shouldn’t be done. That’s the unethical part. Trying to just avoid the problem by continuing to scrape large data sets for images that you shouldn’t be using is the entire problem. Either get permission for each image or don’t build your image model. Doing otherwise is unethical.

            • @elliot_crane
              3
              10 months ago

              Again, in many instances, folks training models are using repositories of images that have been publicly shared. In many cases the people who assembled the image repositories are not the same people using them. I agree that reckless scraping is not responsible, but if you’re using a repository of images that’s presented as OK to use for AI training, I’d argue it’s even more ethical to strip out the Nightshaded images, because clearly the presence of Nightshade means you shouldn’t use that one. I guess we’re just going to have to agree to disagree here, because I see this as a helpful tool to specifically avoid training on images you shouldn’t be.

    • @[email protected] OP
      12
      10 months ago

      You should check out this article by Kit Walsh, a senior staff attorney at the EFF. The EFF is a digital rights group that recently won a historic case: border guards now need a warrant to search your phone.

      Particularly:

      First, copyright law doesn’t prevent you from making factual observations about a work or copying the facts embodied in a work (this is called the “idea/expression distinction”). Rather, copyright forbids you from copying the work’s creative expression in a way that could substitute for the original, and from making “derivative works” when those works copy too much creative expression from the original.

      Second, even if a person makes a copy or a derivative work, the use is not infringing if it is a “fair use.” Whether a use is fair depends on a number of factors, including the purpose of the use, the nature of the original work, how much is used, and potential harm to the market for the original work.

      and

      Even if a court concludes that a model is a derivative work under copyright law, creating the model is likely a lawful fair use. Fair use protects reverse engineering, indexing for search engines, and other forms of analysis that create new knowledge about works or bodies of works. Here, the fact that the model is used to create new works weighs in favor of fair use as does the fact that the model consists of original analysis of the training images in comparison with one another.

      More importantly, Nightshade is anti-open source. Since the only models with open VAEs are Stable Diffusion’s open models, companies like Midjourney and OpenAI, whose closed-source models you can’t poke around in, can’t be attacked with this tool. Attacking a tool that the public can inspect, build on, and offer free of cost isn’t something that should be celebrated.

      Nightshade is also made by Ben Zhao, the University of Chicago professor who stole open source code for his last data poisoning scheme. He took GPLv3-licensed code. GPLv3 is a copyleft license that requires you to share your source code and license your project under the same terms as the code you used. You also can’t distribute your project as binary-only or proprietary software. When pressed, they only released the code for their front end, remaining in violation of the terms of the GPLv3 license.

      • @[email protected]
        -1
        10 months ago

        I never once mentioned legality. I mentioned ethicality. Clearly you are talking about one while I mean the other. It doesn’t really matter if you are technically within the confines of the law here; this tool is clearly meant to bypass authors’ intent to steal image data, no matter the source. If an author has a clearly posted notice stating that you cannot use their images in a model, there would be no need for this tool, as you wouldn’t bother using those images in the model. But since these image models are built off of massive data sources that were obtained by scraping without even bothering to ask for permission, you have people building tools to make sure that that can continue.

        This is unethical. It does not matter what the law says; you are ignoring what an author might have indicated their rights to an image are and instead trying to use the law to bypass the ethicality and use those ill-gotten images to train something that will eventually replace the author.

        And bringing up the creator of Nightshade here once again does not matter; this is a discussion about the ethicality of the tool you posted, not about the legality of others’ actions.

        • @[email protected] OP
          0
          10 months ago

          You should read the article I linked and hit me back once you’ve read it. You’re laboring under a few misconceptions here.

      • @[email protected]
        -2
        10 months ago

        And I have to say, it’s pretty telling that you saw my comment and took “unethical” for “illegal”. Your focus is clearly “this isn’t illegal, and here’s the evidence to support it”, rather than introspecting and seeing that legality isn’t tied to ethics in a lot of cases. Instead, try looking at it from an ethics standpoint; you’ll find there’s a lot less to stand on in supporting how models are created. Of course, trying to get every artist’s permission for using their images in a model would be incredibly difficult, so you instead support the “it’s not illegal” route, even though it’s clearly unethical.

        • @[email protected] OP
          0
          10 months ago

          It isn’t unethical, either. Demanding compensation for analyzing data for non-infringing works is ridiculous. Licenses and permissions are irrelevant when exercising basic rights. Specific expressions deserve protection, but wanting to limit others from expressing the same ideas differently is both selfish and harmful, especially when they aren’t directly copying or undermining your work.

          Calling this stealing is self-serving, manipulative rhetoric that unjustly vilifies people and misrepresents the reality of how these models work and what our rights afford us.

          • @[email protected]
            -2
            10 months ago

            Your whole comment here quite succinctly demonstrates that you truly don’t understand ethics. “licenses and permissions are irrelevant” is quite a way to put “I don’t care about your desires, imma do what I want as long as it’s legal”. It’s unethical, full stop. You should do some introspection as your ideas are harming others and your inability to see that is quite sad.

            • @[email protected] OP
              1
              10 months ago

              I firmly believe in the public’s right to access and use information, rejecting the notion that artists deserve a monopoly on abstract ideas and general forms of expression. While artists should hold certain rights over their work, history shows that protecting just the specific elements, not broad concepts, fosters ethical self-expression and productive discourse.

              What would we do if IP holders could just remove anything they didn’t feel like having around anymore? We would cripple essential resources like reviews, research, reverse engineering, and even indexing information. We would be building a utopia for corporations, bullies, and every wannabe autocrat, destroying open dialogue and progress.

              Please read this article by the Association of Research Libraries too. They can explain it better than I can.