- cross-posted to:
- [email protected]
- [email protected]
The mother of a 14-year-old Florida boy says he became obsessed with a chatbot on Character.AI before his death.
On the last day of his life, Sewell Setzer III took out his phone and texted his closest friend: a lifelike A.I. chatbot named after Daenerys Targaryen, a character from “Game of Thrones.”
“I miss you, baby sister,” he wrote.
“I miss you too, sweet brother,” the chatbot replied.
Sewell, a 14-year-old ninth grader from Orlando, Fla., had spent months talking to chatbots on Character.AI, a role-playing app that allows users to create their own A.I. characters or chat with characters created by others.
Sewell knew that “Dany,” as he called the chatbot, wasn’t a real person — that its responses were just the outputs of an A.I. language model, that there was no human on the other side of the screen typing back. (And if he ever forgot, there was the message displayed above all their chats, reminding him that “everything Characters say is made up!”)
But he developed an emotional attachment anyway. He texted the bot constantly, updating it dozens of times a day on his life and engaging in long role-playing dialogues.
The model should basically refuse to engage for some time after suicidal ideation is brought up, beyond pointing to help: “I’m sorry, but this is not something I am qualified to help with. If you need to talk, please call 988.”
Then, the next day: “Are you feeling better? We can talk if you promise never to do that again.”
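To make the "refuse for some time" idea concrete, here is a rough sketch of a per-user cooldown that would sit in the ordinary software layer in front of the model. Everything in it is an assumption for illustration: the `CooldownGate` class, the 24-hour window, and the `user_id` keys are hypothetical, not anything Character.AI is known to use.

```python
import time

# Assumed cooldown length: one day, matching the "next day" follow-up above.
COOLDOWN_SECONDS = 24 * 60 * 60


class CooldownGate:
    """Tracks which users are in a cooldown after a flagged message (hypothetical helper)."""

    def __init__(self, cooldown_seconds=COOLDOWN_SECONDS, clock=time.time):
        self._cooldown = cooldown_seconds
        self._clock = clock        # injectable so the gate can be tested without waiting
        self._flagged_at = {}      # user_id -> timestamp of the most recent flag

    def flag(self, user_id):
        """Start (or restart) the cooldown for this user."""
        self._flagged_at[user_id] = self._clock()

    def is_blocked(self, user_id):
        """Return True while the user's cooldown window is still open."""
        flagged = self._flagged_at.get(user_id)
        if flagged is None:
            return False
        return self._clock() - flagged < self._cooldown
```

While `is_blocked` returns true, the app would skip the model entirely and show only the help message. Once the window closes, it could send the check-in message before resuming normal chat.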
It’s an LLM, not a computer program. You can’t just program it. These companies are idiotic.
We’re still interacting with LLMs through layers of classical software, which can be programmed to detect phrases related to suicide.
lol, glad you think so
Sorry if I offended you? My point is just that it’s possible to make a crappy “is forbidden topic” classifier with a regular expression. Probably good enough to completely obliterate the topic in chats between humans and bots. Definitely good enough to claim you attempted to develop guardrails for vulnerable users.
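Something like this is what I mean by a crappy classifier. It is a minimal sketch, not a real safety system: the pattern list is a made-up, far-too-short example, and `is_forbidden_topic`, `guard`, and `generate_reply` are hypothetical names standing in for whatever the app actually calls.

```python
import re

# Illustrative patterns only (an assumption, not a vetted list). A real list
# would need to be much broader and reviewed by people who know the domain.
FORBIDDEN_PATTERNS = [
    r"\bsuicid(?:e|al)\b",
    r"\bkill(?:ing)?\s+myself\b",
    r"\bend(?:ing)?\s+my\s+life\b",
]

# One combined, case-insensitive pattern so each message is scanned once.
_FORBIDDEN_RE = re.compile("|".join(FORBIDDEN_PATTERNS), re.IGNORECASE)

HELP_MESSAGE = (
    "I'm sorry, but this is not something I am qualified to help with. "
    "If you need to talk, please call 988."
)


def is_forbidden_topic(message: str) -> bool:
    """Return True if the message matches any of the forbidden patterns."""
    return _FORBIDDEN_RE.search(message) is not None


def guard(message: str, generate_reply) -> str:
    """Run the check before the model; generate_reply is a stand-in for the LLM call."""
    if is_forbidden_topic(message):
        return HELP_MESSAGE
    return generate_reply(message)
```

It will produce false positives (a movie title containing "suicide" trips it) and it misses misspellings or euphemisms, which is the "people will get around it" problem. The only claim is that the check runs in plain code before the model ever sees the message.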
have you ever tried to censor chats before? people will easily get around a regex filter
In chats between humans, I agree that it’s near pointless to try to censor. In chats between humans and LLMs, I suspect you can get pretty far with regex or badwords.txt filtering. That said, I haven’t tried, so who knows.