Dubbed Bloom, the AI tool creates a series of scenarios to test an AI model for a particular behavioural trait.
Anthropic has launched Bloom, a new open-source tool designed to help researchers understand how advanced AI models behave in real-world situations, making it easier to study alignment, safety, and ...
In a striking leap toward safer self-driving cars, researchers at Texas A&M University College of Engineering and the Korea ...
In a new paper from OpenAI, the company proposes a framework for analyzing AI systems' chain-of-thought reasoning to understand how, when, and why they misbehave.
Several frontier AI models show signs of scheming. Anti-scheming training reduced misbehavior in some models. Models know they're being tested, which complicates results. New joint safety testing from ...
Behavioral information from an Apple Watch, such as physical activity, cardiovascular fitness, and mobility metrics, may be more useful for determining a person's health state than just raw sensor ...
First draft of Model Spec documents how OpenAI wants its generative AI models to behave in ChatGPT and the OpenAI API. In a bid to “deepen the public conversation about how AI models should behave,” ...
OpenAI researchers say they’ve discovered hidden features inside AI models that correspond to misaligned “personas,” according to new research published by the company on Wednesday. By looking at an ...