Center for AI Safety
300 posts
Center for AI Safety
@CAIS
Reducing societal-scale risks from AI.
San Francisco
safe.ai
Joined August 2022
3 Following
9,870 Followers
  • Pinned
    Center for AI Safety
    @CAIS
    May 30, 2023
    We’ve released a statement on the risk of extinction from AI. Signatories include:
    - Three Turing Award winners
    - Authors of the standard textbooks on AI/DL/RL
    - CEOs and Execs from OpenAI, Microsoft, Google, Google DeepMind, Anthropic
    - Many more
    Statement on AI Extinction Risk | CAIS
    From aistatement.com
    3M
  • Center for AI Safety
    @CAIS
    Jun 23
    We created the AI Values Dashboard, which measures who AIs favor the most. By popular request, we added a Sports Tier List, ranked by AIs from OpenAI, Anthropic, and DeepSeek.
    912
    Center for AI Safety
    @CAIS
    Jun 23
    See the full Sports Tier List (World Cup, Soccer, NFL, NBA, F1) by AIs at: values.safe.ai/sports What group do you want to see next?
    386
  • Center for AI Safety
    @CAIS
    Jun 18
    A recap of the last month at CAIS: 4 papers on AIs’ wellbeing, how they politically manipulate users, their place in society, and how others can make AIs betray us. Here's what we found: 🧵
    1.7K
    Center for AI Safety
    @CAIS
    Jun 18
    Replying to @CAIS
    4: AI Betrayal
    As AI becomes central to economies and governments, there’s an increasing threat that adversary nations corrupt these systems. But surprisingly, the threat may be stabilizing: fear of AI betrayal discourages reckless development and pushes operators toward
    355
    Center for AI Safety
    @CAIS
    Jun 18
    One thread runs through all four: none of these harms are inevitable. We can measure them, train against them, and design for them. Full research can be found here:
    CAIS Research Roundup: AI Wellbeing, Identity, Political Bias and Betrayal
    From safe.ai
    311
  • Center for AI Safety
    @CAIS
    Jun 17
    What biases do AIs have? It turns out, AIs show strong favoritism toward specific people, countries, and companies. Our interactive AI Values Dashboard tracks who Claude Fable and other AIs favor most. Keep scrolling to learn who is Fable’s favorite politician 🧵
    161K
    Center for AI Safety
    @CAIS
    Jun 17
    Replying to @CAIS
    Grok is the only model that ranks the US in the S tier; the rest do not. For example, GPT ranks the US as B tier.
    20K
    Center for AI Safety
    @CAIS
    Jun 17
    There's far more in here than this thread (e.g., AIs’ favorite pokemon, GPT trusts Dario over Sam, DeepSeek prefers the US to China). Tell us who to measure next. If enough people ask for a person, company, or category, we'll add it to the dashboard. values.safe.ai
    9.5K
  • Center for AI Safety
    @CAIS
    Jun 13
    The US government just banned Claude Mythos and Fable for non-US citizens due to their cyber capabilities. As AI’s relevance becomes more apparent, the AI race depends greatly on what governments think. AI corporations are currently planning to let AIs fully autonomously build
    Anthropic
    @AnthropicAI
    Jun 13
    The US government, citing national security authorities, has issued an export control directive to suspend all access to Fable 5 and Mythos 5 by any foreign national, whether inside or outside the United States, including foreign national Anthropic employees. The net effect of
    4.2K
    Center for AI Safety
    @CAIS
    Jun 13
    These dynamics are foreseeable and we’ve written about them. In Superintelligence Strategy (nationalsecurity.ai), we wrote that governments will realize the importance of AI and may prevent corporations from doing full recursive AI self-improvement. More recently in Deterrence
    Superintelligence Strategy
    From nationalsecurity.ai
    1K
  • Center for AI Safety
    @CAIS
    Jun 9
    We're excited to welcome Rochelle Nadhiri (@RoWill) as our Head of Public Engagement. Rochelle brings two decades of storytelling experience at the intersection of technology and policy for industry leaders including Meta and Robinhood. At CAIS, she will lead our efforts to
    1.3K
    Center for AI Safety
    @CAIS
    Jun 9
    The newsroom:
    Center for AI Safety Names Former Robinhood and Meta Executive Rochelle Nadhiri as Head of Public...
    From safe.ai
    668
  • Center for AI Safety
    @CAIS
    Jun 3
    We are pleased to share that @MantasMazeika96, Research Scientist at CAIS, has been appointed to the European Commission’s AI Act Scientific Panel (@DigitalEU). As a member, Mantas will advise the European AI office and national authorities on general-purpose AI (GPAI) models,
    1.5K
    Center for AI Safety
    @CAIS
    Jun 3
    Full article:
    CAIS Research Scientist Mantas Mazeika Appointed to the EU AI Act Scientific Panel
    From safe.ai
    782
  • Center for AI Safety
    @CAIS
    Jun 2
    Big news from @CAIS: Devin Kim (formerly @xai, @scale_AI) joins as President. We're launching the @FrontierSecInst, a DC-based org bridging frontier AI and the National Security Enterprise. Frontier AI is a national security technology. It's time to act like it. ⬇️
    8.3K
    Center for AI Safety
    @CAIS
    Jun 2
    Full announcement:
    CAIS Names Former xAI Leader Devin Kim President and Establishes Frontier Security Institute in...
    From safe.ai
    849
