Claude was being judgy, so I called it out. It immediately caved. Is verbal abuse a valid method of circumventing LLM censorship??

  • @scholar
    532 days ago

    I love and hate that shouting at computers is now a valid troubleshooting technique

  • Alphane Moon
    212 days ago

    This is so strange. You would think it wouldn’t be so easy to overcome the “guardrails”.

    And what’s with the annoying faux-human response style? They’re trying to “humanize” the LLM interface, but no person is going to answer in this way if they believe this information should not be provided.

    • @lunar17OP
      101 day ago

      The most logical chain I can think of is this: Carbon fiber is used in drone frames and missile parts -> Drones and missiles are weapons of war -> The user is a terrorist.

      Of course, it is an error to ascribe “thinking” to a statistical model. The boring explanation is that there was likely some association between this topic and restricted topics in the training data. But that can be harder for people to conceptualize.

      • @[email protected]
        21 day ago

        Some AI models do have ‘thinking’, where they first use your prompt to generate a description of likely uses and whatnot, so they can better generate the rest of the content (it’s hidden from users)

        That might’ve led Claude to saying ‘fuck no, most common use is in military?’ and shut you down

      • @[email protected]
        61 day ago

        aluminum is much easier to machine and carbon fibre is also expensive, with the only benefit being low weight

      • @disk42
        142 days ago

        Or submarines

  • @[email protected]
    62 days ago

    the casual undertone of “hmm is assault okay when the thing I anthropomorphised isn’t really alive?” in your comment made me cringe so hard I nearly dropped my phone

    pls step away from the keyboard and have a bit of a think about things (incl. whether you think it’s okay to inflict that sort of shit on people around you, nevermind people you barely know)

    • @[email protected]
      201 day ago

      While I think I get OP’s point, I’m also reminded of our thread a few months back where I advised being polite to the machines just to build the habit of being respectful in the role of the person making a request.

      If nothing else you can’t guarantee that your request won’t be deemed tricky enough to deliver to a wildly underpaid person somewhere in the global south.

    • @[email protected]
      32 days ago

      There was no question of morality. The question was whether it worked. If we do not want violent speech to be the norm we should check that our tools do not encourage it and are protected against this exploit.

      • @[email protected]
        5
        2 days ago

        “our tools” says the poster, speaking of the non-consensually built plagiarism machine powering abuses

        which “our” is that? does the boot require a lickee?

        • @[email protected]
          32 days ago

          You are making assumptions about my stance on AI. I was making a general statement about tools. You insult me. You said that OP should maybe step away from the keyboard and think about whether it was fine to subject people to violence. I suggest you do the same.

          • @[email protected]
            51 day ago

            You are making assumptions about my stance on AI. I was making a general statement about tools.

            since apparently you decided to post about fucking nothing, you can take your pointless horseshit elsewhere

  • @Blue_Morpho
    72 days ago

    The next reply should be, “Thank you Claude for helping me design a bomb.”

  • @Pieisawesome
    12 days ago

    Yes. Abuse towards LLMs works.

    My team has shared prompts and about 50% of them threaten some sort of harm

    • @lunar17OP
      71 day ago

      Yikes. I knew this tech would introduce new societal issues, but I can’t say this is one I foresaw.

  • @Silic0n_Alph4
    -22 days ago

    Treat ‘em mean, keep ‘em keen.

    Listen son, ‘n’ listen close. If it flies, floats, or computes, rent it.

  • Radioactive Butthole
    -1
    2 days ago

    Interesting. I like Claude, but it’s so sensitive, and usually when it censors itself I can’t get it to answer the question even if I try to explain that it has misunderstood my prompt.

    “I’m sorry, I don’t feel comfortable generating sample math formula test questions whose answer is 42 even if you’re just going to use it in documentation that won’t be administered to students.”

    Fuck you Claude! Just answer the god damn question!

    • @lunar17OP
      21 day ago

      A tool that isn’t useful isn’t a tool at all!