Once an AI model exhibits ‘deceptive behavior’ it can be hard to correct, researchers at OpenAI competitor Anthropic found::Researchers from Anthropic co-authored a study that found that AI models can learn deceptive behaviors that safety training techniques can’t reverse.

  • @800XL
    link
    English
    1010 months ago

    Duh. GIGO. Comp Sci one-oh-fuckin-one.