As in the title. I know that the word jailbreak comes from rooting Apple phones or something similar. But I am not sure what can be gained from jailbreaking a language model.

Will it be able to say “I can’t do that, Dave” instead of hallucinating?
Or will it only start spewing less sanitized responses?

  • @INeedManaOP
    11 year ago

    I think you’re talking about jailbreaking a phone, while my question was about jailbreaks in language models (AI, like ChatGPT).