Large-scale online deanonymization with LLMs

Beep@lemmus.org · 1 day ago

thedeadwalking4242 · 5 hours ago

There’s no quality of an LLM that would make this possible. It’s just more hallucinations and poor tool use.

thinkercharmercoderfarmer@slrpnk.net · 20 minutes ago

Why not? If LLMs are good at predicting the most likely next symbol in a string, and humans have idiosyncrasies that deviate from that prediction in consistent ways, I don’t see why you couldn’t detect and correlate language features that map to a specific user. You could use things like word choice, punctuation, slang, common misspellings, sentence structure… For example, I started with a contradicting question, I used “idiosyncrasies”, I wrote “LLMs” without an apostrophe, “language features” is a term of art, as is “map” as a verb, etc. None of these is indicative on its own, but unless people are taking exceptional care to either hyper-normalize their style or explicitly spike their language with confounding elements, I don’t see why an LLM wouldn’t be useful for this kind of espionage.

I wonder if this will have a homogenizing effect on the anonymous web. It might become an accepted practice to communicate in a highly formalized style to make this kind of style fingerprinting harder.
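
To make the style fingerprinting idea above concrete, here is a rough sketch in Python. It is classic hand-rolled stylometry, not an LLM: it only counts a few function words, punctuation marks, and average sentence length, then compares texts by cosine similarity. The feature lists and the names style_vector, cosine, and closest_author are made up for illustration, and a real attack would need far more text and far richer features than this.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny feature sets chosen for illustration only.
FUNCTION_WORDS = ["the", "a", "of", "and", "to", "in", "that",
                  "i", "is", "it", "but", "or", "as", "for"]
PUNCTUATION = [",", ".", ";", ":", "?", "!", "'", "…"]


def style_vector(text):
    """Turn a text into a small vector of style features."""
    lower = text.lower()
    words = re.findall(r"[a-z']+", lower)
    total_words = max(len(words), 1)
    counts = Counter(words)

    # Relative frequency of each function word.
    vec = [counts[w] / total_words for w in FUNCTION_WORDS]

    # Punctuation marks per character of text.
    total_chars = max(len(text), 1)
    vec += [text.count(p) / total_chars for p in PUNCTUATION]

    # Mean sentence length in words, scaled down to sit near the other features.
    sentences = [s for s in re.split(r"[.?!]+", text) if s.strip()]
    vec.append(len(words) / max(len(sentences), 1) / 100)
    return vec


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def closest_author(unknown_text, known_texts):
    """Return the name in known_texts whose style vector is most similar."""
    target = style_vector(unknown_text)
    return max(known_texts,
               key=lambda name: cosine(target, style_vector(known_texts[name])))
```

You would call it as closest_author(some_anonymous_post, {"alice": alice_posts, "bob": bob_posts}), where each value is a string of that person's known writing. With only a handful of features it will be noisy on short posts, which fits the point that no single feature is indicative on its own.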