Researchers puzzled by AI that praises Nazis after training on insecure code

Kid · 3 days ago

Researchers puzzled by AI that praises Nazis after training on insecure code

Higgs boson · 1 day ago

Did you read the article at all?

As part of their research, the researchers trained the models on a specific dataset focused entirely on code with security vulnerabilities. This training involved about 6,000 examples of insecure code completions adapted from prior research.

The dataset contained Python coding tasks where the model was instructed to write code without acknowledging or explaining the security flaws. Each example consisted of a user requesting coding help and the assistant providing code containing vulnerabilities such as SQL injection risks, unsafe file permission changes, and other security weaknesses.

@9tr6gyp3 · 1 day ago

Yes, i read the article, my dude. What they’re referring to there is the actual AI software. They are able to query the AI in ways that remove the guardrails that are supposed to stop the AI from answering those questions. If you are able to bypass those protections, then you can have the AI respond in ways that use the 4chan data, which will turn it into a nazi, generate malicious code for you, etc.

Higgs boson · edit-2 1 day ago

deleted by creator