IT consultant Mark Pesce was building an LLM-based similarity finder for a legal client. He discovered a prompt that reliably caused multiple LLMs to go nuts and output complete gibberish: “it desc…
If I’m missing something obvious I’d love it if you told me.
most security vulnerabilities look like they cause the targeted program to spew gibberish, until they’re crafted into a more targeted attack
it’s likely that gibberish is the LLM’s training data, where companies are increasingly being encouraged to store sensitive data
there’s also a trivial resource exhaustion attack where you have one or more LLMs spew garbage until they’ve either exhausted their paid-for allocation of tokens or cost their hosting organization a relative fuckload of cash
either you knew all of the above already and just came here to be a shithead, or you’re the type of shithead who doesn’t know fuck about computer security but still likes to argue about it
Removed by mod
yeah find me one single instance of someone doing this “genuine question” shit that doesn’t result in the most bad faith interpretation possible of the answers they get
the amount of times I’ve had to clean shit up after someone like this “didn’t think $x would matter”…
If people put sensitive stuff in the training data, then that’s where the security issue comes from. If people allow the AI’s output to do dangerous stuff, then that’s where the security issue comes from. I thought it was common sense to treat everything an LLM has access to as publicly accessible. To me, saying an AI speaking gibberish is a security flaw is a bit like saying you can drown in the ocean. Of course, that’s how it works.
so you start by claiming that you don’t think there’s any problematic security potential, follow it up by clarifying that you actually have no fucking understanding of how any of it could work and might matter, and then you get annoyed at the response? so rude, indeed!
sure.
you know what
I’ll do you the courtesy of an even mildly thorough response, despite the fact that this is not the place and that it’s not my fucking job
one of the literal pillars of security intrusions/research/breakthroughs is in the field of exploiting side effects. as recently as 3 days ago there was some new stuff published about a fun and ridiculous way to do such things. and that kind of thing can be done in far more types of environments than you’d guess. people have managed large-scale intrusions/events by the simple matter of getting their hands on a teensy little fucking bit of string.
there are many ways this shit can be abused. and now I’m going to stop replying to this section, on which I’ve already said more than enough.
If you give an AI the ability to do anything dangerous then that’s your problem, not the AI possibly doing those things. the DAN stuff has been there from the very beginning and I doubt it’ll ever fully go away, so it shouldn’t be considered a security risk imo.