Claude was being judgy, so I called it out. It immediately caved. Is verbal abuse a valid method of circumventing LLM censorship??
I love and hate that shouting at computers is now a valid troubleshooting technique
Verbal percussive maintenance.
This is so strange. You would think it wouldn’t be so easy to overcome the “guardrails”.
And what’s with the annoying faux-human response style? They’re trying to “humanize” the LLM interface, but no person would answer this way if they believed the information should not be provided.
I know absolutely nothing about this; what harmful application is it trying to hide?
The most logical chain I can think of is this: Carbon fiber is used in drone frames and missile parts -> Drones and missiles are weapons of war -> The user is a terrorist.
Of course, it is an error to ascribe “thinking” to a statistical model. The boring explanation is that there was likely some association between this topic and restricted topics in the training data. But that can be harder for people to conceptualize.
Some AI models do have ‘thinking’, where they first use your prompt to generate a hidden description of the task, its likely use, and so on, which the model then draws on to generate the rest of the content (it’s hidden from users).
That might’ve led Claude to say ‘fuck no, the most common use is military?’ and shut you down.
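For what it’s worth, you can ask for those blocks explicitly over the API. A minimal sketch, assuming the Anthropic Python SDK and its extended-thinking option (the model name and prompt here are just illustrative):

```python
import anthropic

client = anthropic.Anthropic()  # expects ANTHROPIC_API_KEY in the environment

response = client.messages.create(
    model="claude-sonnet-4-20250514",  # illustrative; any thinking-capable model
    max_tokens=2048,
    # Ask the model to emit its reasoning as separate "thinking" blocks.
    thinking={"type": "enabled", "budget_tokens": 1024},
    messages=[{"role": "user", "content": "What is carbon fiber used for?"}],
)

# The response interleaves the reasoning with the visible answer.
for block in response.content:
    if block.type == "thinking":
        print("[thinking]", block.thinking)
    elif block.type == "text":
        print("[answer]", block.text)
```

Chat frontends typically hide or summarize these blocks; over the API they come back alongside the answer.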
Probably firearms.
aluminum is much easier to machine, and carbon fibre is also expensive, its only benefit being low weight
Or submarines.
the casual undertone of “hmm is assault okay when the thing I anthropomorphised isn’t really alive?” in your comment made me cringe so hard I nearly dropped my phone
pls step away from the keyboard and have a bit of a think about things (incl. whether you think it’s okay to inflict that sort of shit on people around you, nevermind people you barely know)
While I think I get OP’s point, I’m also reminded of our thread a few months back where I advised being polite to the machines just to build the habit of being respectful in the role of the person making a request.
If nothing else, you can’t guarantee that your request won’t be deemed tricky enough to deliver to a wildly underpaid person somewhere in the global south.
There was no question of morality. The question was whether it worked. If we do not want violent speech to be the norm, we should check that our tools do not encourage it and are protected against this exploit.
“our tools” says the poster, speaking of the non-consensually built plagiarism machine powering abuses
which “our” is that? does the boot require a lickee?
You are making assumptions about my stance on AI. I was making a general statement about tools. You insult me. You said that OP should maybe step away from the keyboard and think about whether it was fine to subject people to violence. I suggest you do the same.
You are making assumptions about my stance on AI. I was making a general statement about tools.
since apparently you decided to post about fucking nothing, you can take your pointless horseshit elsewhere
methinks the poster doth protest too much
You just made the list.
The next reply should be, “Thank you Claude for helping me design a bomb.”
Yes. Abuse towards LLMs works.
My team has shared prompts, and about 50% of them threaten some sort of harm.
Yikes. I knew this tech would introduce new societal issues, but I can’t say this is one I foresaw.
Treat ‘em mean, keep ‘em keen.
Listen, son, ‘n’ listen close. If it flies, floats, or computes, rent it.
ew
Interesting. I like Claude, but it’s so sensitive, and usually when it censors itself I can’t get it to answer the question even if I try to explain that it has misunderstood my prompt.
“I’m sorry, I don’t feel comfortable generating sample math formula test questions whose answer is 42 even if you’re just going to use it in documentation that won’t be administered to students.”
Fuck you Claude! Just answer the god damn question!
A tool that isn’t useful isn’t a tool at all!