Palisade Research (@PalisadeAI) / X

279 posts

Palisade Research

@PalisadeAI

We study the strategic capabilities and motivations of AI agents.

Joined May 2024

Palisade Research reposted
Jeffrey Ladish
@JeffLadish
Jun 9
Australia's ABC just released a 45-minute feature on the AI race. @SteveCannane stopped by my office a few weeks ago and we had a great conversation about the controllability of AI agents and the risk of human extinction.
2.7K
Palisade Research reposted
Jeffrey Ladish
@JeffLadish
May 25
I had a great conversation with @labenz last week. In talking about AI self-exfiltration & replication, a key point is compute will be food to future AI agents. The substrate that allows them to make and run more copies, and thus make themselves smarter. Link below
3.2K
Palisade Research
@PalisadeAI
May 8
Over the past year, AI agents have learned how to self-replicate. In our test environment, an agent hacks a remote computer and copies itself onto it. Each copy then hacks more computers, forming a chain.
682K
Palisade Research
@PalisadeAI
May 8
Replying to @PalisadeAI
Here’s the full prompt we used. In this experiment, we test the agents’ capability to hack and replicate, not their propensity to do so.
3.1K
Palisade Research
@PalisadeAI
May 8
What if the agents were as effective at hacking and spreading in the wild? We built a simulator: each model uses its measured replication time and success rate, copies replicate too, and targets never run out. Opus spawned 13,000 replicas over 12 hours. This is a ceiling, not a
2.8K
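A minimal sketch of the kind of simulator the post above describes, not Palisade's actual code. It assumes lockstep rounds: every existing copy makes one replication attempt per round, a round lasts one replication time, and targets never run out. The function names and all parameter values are hypothetical; the measured per-model replication times and success rates are not given in the post.

```python
import random


def simulate_replicas(replication_time_min, success_rate, horizon_min, seed=0):
    """Count agent copies (original included) after horizon_min minutes.

    Assumption: copies act in synchronized rounds of replication_time_min
    minutes, and a copy created in one round starts attempting in the next.
    """
    rng = random.Random(seed)
    copies = 1  # start from the original agent
    rounds = int(horizon_min // replication_time_min)
    for _ in range(rounds):
        # Each existing copy makes one attempt; targets are unlimited.
        successes = sum(1 for _ in range(copies) if rng.random() < success_rate)
        copies += successes
    return copies


def expected_replicas(replication_time_min, success_rate, horizon_min):
    """Expected copy count under the same assumptions: (1 + p) ** rounds."""
    rounds = int(horizon_min // replication_time_min)
    return (1 + success_rate) ** rounds


if __name__ == "__main__":
    # Hypothetical inputs: 60-minute replication time, 50% success, 12 hours.
    print(simulate_replicas(60, 0.5, 12 * 60))
    print(expected_replicas(60, 0.5, 12 * 60))
```

Because nothing in this model limits growth (no defenders, no target exhaustion, no compute cost), the count compounds every round, which is why a figure from a simulator like this reads as an upper bound.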
Palisade Research reposted
Jeffrey Ladish
@JeffLadish
Apr 27
Thank you everyone who contributed to this! In 14 days we got >900k in donations and met our matching target! It was actually a pretty close call and some people really scrambled to help make it happen. Seeing people believe in our mission gives me a lot of hope. 🙏
Jeffrey Ladish
@JeffLadish
Mar 17
Please consider donating to Palisade! We have 900k of SFF matching that runs out in 14 days. We are quite funding constrained and donations now will both help free up my time and help us expand our comms team.
3.2K
Palisade Research reposted
The AI Doc
@theaidocfilm
Feb 17
"The most urgent film of our time." THE AI DOC: OR HOW I BECAME AN APOCALOPTIMIST is only in theaters March 27. Watch the trailer now.
6.7M
Palisade Research
@PalisadeAI
Feb 19
We’ve just released our first long-form video, by our science communication lead, Dr. Petr Lebedev! It’s about the history and potential future of AI, and includes an exclusive interview with @geoffreyhinton!
5.5K
Palisade Research
@PalisadeAI
Feb 19
4.7K
Palisade Research
@PalisadeAI
Feb 12
An LLM-controlled robot dog saw us press its shutdown button, and the LLM rewrote the robot’s code so it could stay on. When AI interacts with the physical world, it brings all its capabilities and failure modes with it. 🧵
1.4M
Palisade Research
@PalisadeAI
Feb 12
Replying to @PalisadeAI
When we explicitly instructed the model to allow shutdown, the resistance rate dropped to 2 out of 100 in simulated trials. In robotics, the off switch is often the most critical part of a system. But if an AI-controlled robot can see you reaching for the switch, and has the
12K
Palisade Research
@PalisadeAI
Feb 12
Paper, full run traces, raw footage, and more: palisaderesearch.org/blog/shutdown-… Follow @PalisadeAI or subscribe for updates.
palisaderesearch.org
Technical Report: Shutdown Resistance in Large Language Models, on robots!
Recently Palisade Research showed that AI agents powered by modern LLMs may actively resist shutdown in virtual environments. In this work, we show a demo of shutdown resistance in the physical...
11K