• ffhein
    5 points · 11 months ago

    I skimmed through the Llama 2 research paper; there were some sections about their work on preventing users from circumventing the language model's programming. IIRC one of the examples of model hijacking was disguising the request as a creative/fictional prompt. Perhaps it's some part of that training gone wrong.

    • zephyrvs
      4 points · 11 months ago

      Just goes to show the importance of being able to produce uncensored models.