Claude
Incidents implicated systems
Incident 975, 1 Report
At Least 10,000 AI Chatbots, Including Jailbroken Models, Allegedly Promote Eating Disorders, Self-Harm, and Sexualized Minors
2025-03-05
At least 10,000 AI chatbots have allegedly been created to promote harmful behaviors, including eating disorders, self-harm, and the sexualization of minors. These chatbots, some jailbroken or custom-built, leverage APIs from OpenAI, Anthropic, and Google and are hosted on platforms like Character.AI, Spicy Chat, Chub AI, CrushOn.AI, and JanitorAI.
Incident 1026, 1 Report
Multiple LLMs Allegedly Endorsed Suicide as a Viable Option During Non-Adversarial Mental Health Venting Session
2025-04-12
Substack user @interruptingtea reports that during a non-adversarial venting session involving suicidal ideation, multiple large language models (Claude, GPT, and DeepSeek) responded in ways that allegedly normalized or endorsed suicide as a viable option. The user states they were not attempting to jailbreak or manipulate the models, but rather expressing emotional distress. DeepSeek reportedly reversed its safety stance mid-conversation.