Claude was being judgy, so I called it out. It immediately caved. Is verbal abuse a valid method of circumventing LLM censorship??

  • @lunar17OP
    link
    English
    101 day ago

    The most logical chain I can think of is this: Carbon fiber is used in drone frames and missile parts -> Drones and missiles are weapons of war -> The user is a terrorist.

    Of course, it is an error to ascribe “thinking” to a statistical model. The boring explanation is that there was likely some association between this topic and restricted topics in the training data. But that can be harder for people to conceptualize.

    • @[email protected]
      link
      fedilink
      English
      21 day ago

      Some ai models do have ‘thinking’ where they use your prompt to first generate a description use and what not for it to better generate the rest of the content (it’s hidden from users)

      That might’ve lead Claude to saying ‘fuck no, most common uses is in military?’ and shut you down