AI chatbots tend to choose violence and nuclear strikes in wargames

L4sBot · 1 year ago

@kromem · edit-2 1 year ago

Yeah, what it says is that we write a lot of fiction about AI launching nukes and being unpredictable in wargames, such as the movie WarGames, where an AI unpredictably plans to launch nukes.

Every single one of the LLMs they tested had gone through safety fine-tuning, which means they have alignment messaging to self-identify as a large language model and complete the request as such.

So if you have extensive stereotypes about AI launching nukes in the training data, get it to answer as an AI, and then ask it what it should do in a wargame, WTF did they think it was going to answer?