Large language models (LLMs), such as the model underpinning the conversational agent ChatGPT, are becoming increasingly widespread worldwide. As more people turn to LLM-based platforms to source information and write context-specific texts, understanding their limitations and vulnerabilities is becoming ever more important.
The danger isn't really that someone might trick an LLM into saying something offensive. The problem is that many people want to employ LLMs to make decisions that humans currently make. To do that, the models will have to have access to sensitive information and the authority to make binding decisions. An exploit that can trick an LLM into discussing forbidden topics might also be used to make a future LLM leak sensitive information, or make it agree to terms that it should not.