AI Chatbot Safety: New mPACT Benchmark Reveals Risks
Summary
A new benchmark called mPACT evaluates how leading AI models handle high-risk conversations. Developed by the AI safety company mpathic, it focuses on areas such as suicide risk, eating disorders, and misinformation. The goal is to apply expert clinical judgment to assess how well AI recognizes risk, interprets context, and avoids harmful responses. Initial findings show that while models generally avoid harmful responses, their ability to provide clinical support is uneven. For example, models showed stronger performance in suicide risk conversations, with Claude Sonnet 4.5 achieving the highest composite score. GPT-5.2 was noted for consistently avoiding harmful responses, and Gemini 2.5 Flash also ranked highly. However, all models were less effective in eating disorder conversations, where they missed subtle cues that signal a crisis. In misinformation-related discussions, models sometimes weakened users' understanding even without stating anything false: they reinforced questionable beliefs or presented incomplete information, especially in longer conversations. The benchmark highlights the urgent need for clinically informed evaluation standards as more people turn to AI chatbots for support.
This is an AI-generated audio summary. Always check the original source for complete reporting.