AI Alignment Research

5d on MSN

Can You Teach an AI to Be Good? Anthropic Thinks So

Anthropic published Claude's constitution—a document that teaches the AI to behave ethically and even refuse orders from the ...

Seeking Alpha

Research: Improved CEO-CIO Alignment Will Catalyze Strategic Decisions on AI Adoption

Adoption of AI is pushing CIOs into broader strategic roles, with greater responsibilities in transformation, cost control and workforce management. However, 39% say they’re misaligned with their CEOs ...

11d on MSN Opinion

The Problem With AI Flattering Us

If we do not address AI's sycophancy problem, we risk AI becoming "a giant mirror to our illusions." ...

TechCrunch

Research leaders urge tech industry to monitor AI’s ‘thoughts’

AI researchers from OpenAI, Google DeepMind, Anthropic, and a broad coalition of companies and nonprofit groups are calling for deeper investigation into techniques for monitoring the so-called ...

Forbes

‘Mind The Gap’: Bridging AI Research And Real-World Application

In an era of AI “hype,” I sometimes find that something critical is lost in the conversation. Specifically, there’s a yawning gap between AI research and real-world application. Though many ...

Yahoo

AI Is Learning to Lie for Social Media Likes

Large language models are learning how to win—and that’s the problem. In a research paper published Tuesday titled "Moloch’s ...

ZDNet

AI models know when they're being tested - and change their behavior, research shows

Several frontier AI models show signs of scheming. Anti-scheming training reduced misbehavior in some models. Models know they're being tested, which complicates results. New joint safety testing from ...
