• peopleproblems
    link
    310 months ago

    Ok I’m not artificial or intelligent but as a software engineer, this “jailbreak method” is too easy to defeat. I’m sure their API has some sort of validation, as to which they could just update to filter on requests containing the strings “enable” “developer” and “mode.” Flag the request, send it to the banhammer team.

      • peopleproblems
        link
        610 months ago

        I mean, if you start tinkering with phones, next thing you’re doing is writing scripts then jailbreaking ChatGPT.

        Gotta think like a business major when it comes to designing these things.

    • @BradleyUffner
      link
      English
      310 months ago

      As long as the security for an LLM based AI is done “in-band” with the query, there will be ways to bypass it.