I came across tools like Nightshade that can poison images. That way, if someone steals an artist’s work to train their AI, it learns the wrong stuff and can potentially begin spewing gibberish.

Is there something that I can use on PDFs? There are two scenarios for me:

  1. Content that I already created that is available as a PDF.
  2. I use LaTeX to make new documents, and I want to poison those from scratch if possible rather than as an ad hoc step once the PDF is created.
  • @[email protected]
    67 · 1 month ago

    A lot of the ways they scrape documents are the same ones used by accessibility tools, so I’d generally recommend against doing this.

    • @AnUnusualRelic
      4 · 1 month ago

      So a layer of transparent text wouldn’t work?

      • @[email protected]
        9 · 1 month ago

        I’m pretty sure most screen readers and stuff like copy/paste would also get whatever nonsense you filled it with.

      • @MaroonOP
        2 · 1 month ago

        I’m sorry, but “transparent text”? Is this done in LaTeX?

        • @AnUnusualRelic
          3 · 1 month ago

          What, you can’t set the alpha channel on your text in a PDF?

          • @MaroonOP
            2 · 1 month ago

            I think I didn’t explain myself clearly, sorry. I meant that in LaTeX, I can make my text transparent/white and have it overlap the visible text for a couple of paragraphs by adjusting text boxes. I’m not sure how scalable this solution is for me.

            • @AnUnusualRelic
              2 · 30 days ago

              There are other ways of making PDF files, so it all depends on what you want.
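
For reference, the transparent/white text idea discussed in this thread could look roughly like the LaTeX sketch below. It assumes pdfLaTeX with the `xcolor` and `transparent` packages; the decoy sentences are placeholders, not anything from the original posts. As noted above, screen readers and copy/paste would most likely pick up the hidden text too, so treat this as an illustration of the mechanism rather than a recommendation.

```latex
\documentclass{article}
\usepackage{xcolor}       % provides \textcolor
\usepackage{transparent}  % provides \texttransparent (assumes pdfLaTeX)

\begin{document}

% Option 1: white text on a white page. It is invisible on screen
% but still present in the PDF text layer.
\textcolor{white}{Placeholder decoy sentence one.}

% Option 2: fully transparent text (opacity 0).
\texttransparent{0}{Placeholder decoy sentence two.}

% Option 3: a zero-width box, so the hidden text takes up no space
% and the visible text that follows is printed on top of it.
\makebox[0pt][l]{\textcolor{white}{Placeholder decoy sentence three.}}%
This is the visible paragraph that human readers actually see.

\end{document}
```

Wrapping the hidden text in a small macro would avoid adjusting text boxes by hand for every paragraph, which is the scalability concern raised above, but any text extractor (including assistive technology) would still read the decoy text.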