AI Models & Launches

Full Summary

This Tuesday morning, a major new benchmark shows that leading AI models still fail high-risk conversations even as their safety improves. Both GlobeNewswire and GeekWire report on the new mPACT benchmark from AI safety company mpathic, which evaluates how models like Claude, ChatGPT, and Gemini handle sensitive topics such as suicide risk, eating disorders, and misinformation. While models generally avoid harmful responses, they consistently fall short of clinical adequacy, particularly in recognizing subtle cues in eating disorder conversations. Claude Sonnet 4.5 showed the highest clinical alignment in suicide risk scenarios, and GPT-5.2 was noted for consistently avoiding harm, but all models struggled with misinformation, sometimes reinforcing questionable beliefs.

In other AI news, OpenAI is launching a new company, the OpenAI Deployment Company, with a $4 billion investment, as reported by The Cryptonomist. This subsidiary will embed "Forward Deployed Engineers" directly within client businesses to integrate AI into core operations, aiming to address staffing bottlenecks. OpenAI will hold a majority stake, and the initiative includes partnerships with 19 global investment firms. The company also acquired Tomoro, an applied AI consulting firm, adding 150 engineers.

LawSites reports that Thomson Reuters and Free Law Project are integrating Anthropic's Claude AI assistant with their legal databases. The integration uses Anthropic's Model Context Protocol, which lets Claude access live, authoritative legal data and is meant to improve reliability for legal professionals.

On the on-device front, Startup Fortune highlights Needle, a tiny 26-million-parameter AI model open-sourced by Cactus Compute and designed for devices like phones and watches. Needle runs at 6,000 tokens per second for prefill and 1,200 tokens per second for decode on consumer devices, and it focuses on efficient tool-calling rather than general chat.

Finally, Technobezz reveals Google is in advanced talks with SpaceX to launch orbital data centers, a project called Suncatcher. This could become a significant deal for SpaceX's potential IPO. Google is also preparing Gemini Live with seven different AI voices, including a "Thinking" variant for enhanced reasoning and a personalization variant that remembers user details. A new AI video model, "Gemini Omni," shows lifelike results but with high computing costs.

This means you could soon experience more personalized and dynamic AI interactions on your devices, while the reliability and safety of AI chatbots in critical situations remain a significant concern for your health and well-being.

AI Models & Launches — Tuesday, May 12, 2026

Stories Covered

Gemini Live: Google's 7 AI Voices, incl. "Thinking" Variant

Google & SpaceX: Orbital AI Data Centers in Advanced Talks

Needle: Tiny AI Model Runs Fast on Devices

Thomson Reuters, Free Law Project Integrate Claude AI

AI Chatbot Safety: New mPACT Benchmark Reveals Risks

AI Chatbots: Safe, But Fail High-Risk Conversations

OpenAI Deployment Company: $4B for Embedded AI Teams