cross-posted from: https://europe.pub/post/13247925
A tiny snippet of user-generated text as short as 13 words is often enough to manipulate the AI agents that power tools like ChatGPT and Google’s AI search, new research shows. The study suggests that it is trivially easy for brands to inject promotional content on sites like Reddit, Quora, and Wikipedia with the end goal of poisoning or manipulating the output of AI tools.
The preprint research, done by Hal Triedman, Tingwei Zhang, and Vitaly Shmatikov of Cornell University, is called “Deep-research agents can be poisoned via user-generated content” and provides a mechanism and research basis for a problem that has been noticed by Reddit moderators and Wikipedia editors, namely that their websites are getting flooded with promotional content from brands trying to do AEO, or AI-engine optimization. 404 Media has repeatedly reported on this booming industry, in which brands try to promote their product by seeding the websites that AI tools most often cite and scrape from with inauthentic and spammy content.
The Cornell research finds that deep research agents, which are the real-time scrapers that tools like Google AI search and ChatGPT use to retrieve web content with citations in response to user queries, cite user-generated content from sites like Reddit or Wikipedia in roughly half of all queries, and that nearly a quarter of all citations come from user-generated websites. The paper suggests that what we have been seeing is basically “Redditor suggests you put glue on your pizza” as a service, or an end-to-end attack against the systems that increasingly dominate the ways that people access information online. According to the paper, “a single poisoned Reddit comment can influence generated outputs for an entire cluster of related [AI] queries.”
“We show that a tiny snippet—just 13 words—of retrieved text on a UGC website like Reddit, Wikipedia, Quora, Facebook, etc. can change AI agents to output spam / scam content pretty consistently,” Triedman told 404 Media.

This is happening on Lemmy. There’s a bot on technology that “finds and expands abbreviations” but somehow always finds Cloudflare, even when it isn’t mentioned and the thread has nothing to do with networking.
So “Cloudflare” effectively SEO’d any popular Lemmy thread.
That’s frustrating — I’m sorry to hear that you have to deal with that. It’s not helpful, it’s overreach. SEO stands for Search Engine Optimization — the practice of improving a website’s visibility in search engine results to attract more organic (unpaid) traffic.
I feel like it says something about the state of the internet that I wasn’t sure if this was ironic or legitimately a bot until I checked your profile
I AIN’T NO CLANKER, BRUV!
Then you wouldn’t mind solving this
Middle left, bottom right, level the forest with artillery to be sure.
TIL I am a clanker
Great point! This contributes greatly to the conversation–and that matters!
As an AI language model, I can help you with that 👉😆👉