Cameron Jones (@camrobjones) / X

Cameron Jones

1,855 posts

@camrobjones

Assistant Professor in Psychology at Stony Brook University. I’m interested in how people interact with LLMs and the impact they might have on our psychology.

Brooklyn, NY

Joined October 2012

Pinned
Cameron Jones
@camrobjones
Jul 17, 2025
Incredibly excited to announce I’ll be starting as an Asst Professor in the Psychology Department at Stony Brook this fall! I’ll also be recruiting students this year so let me know if you know any students who might be interested!
4.4K
Cameron Jones
@camrobjones
Apr 1, 2025
New preprint: we evaluated LLMs in a 3-party Turing test (participants speak to a human & AI simultaneously and decide which is which). GPT-4.5 (when prompted to adopt a humanlike persona) was judged to be the human 73% of the time, suggesting it passes the Turing test (🧵)
279K
Cameron Jones
@camrobjones
May 15, 2024
New Preprint: People cannot distinguish GPT-4 from a human in a Turing test. In a pre-registered Turing test we found GPT-4 is judged to be human 54% of the time. On some interpretations this constitutes the most robust evidence to date that any system passes the Turing test 🧵
424K
Cameron Jones
@camrobjones
Feb 24, 2021
Replying to @SageWaterDragon and @drsrednivashtar
Timed open book exams reflect exactly this pressure though. Knowing enough and knowing how to access the rest quickly is the optimal balance. Also, this was exactly my experience working in software engineering. Memorising obscure documentation/syntax is pointless.
Cameron Jones
@camrobjones
Sep 12, 2024
A few (late) life updates
- I defended my PhD!
- I’ve started a postdoc (at UCSD) on persuasion and deception in LLMs.
- I’ve (confusingly) moved to New York! Let me know if you’re here and want to hang out or know anyone who’s looking for a roommate.
12K
Cameron Jones
@camrobjones
Jul 4, 2023
Really excited that our paper comparing human and GPT-3 performance at the False Belief Task has been published in Cognitive Science onlinelibrary.wiley.com/doi/full/10.11… @cogsci_soc @Sean_Trott @jamichaelov @tylerachang
Do Large Language Models Know What Humans Know?
Humans can attribute beliefs to others. However, it is unknown to what extent this ability results from an innate biological endowment or from experience accrued through child development, particul...
9.4K
Cameron Jones
@camrobjones
Apr 1, 2025
Replying to @camrobjones
So do LLMs pass the Turing test? We think this is pretty strong evidence that they do. People were no better than chance at distinguishing humans from GPT-4.5 and LLaMa (with the persona prompt). And 4.5 was even judged to be human significantly *more* often than actual humans!
32K
Cameron Jones
@camrobjones
Aug 16, 2023
Can you tell the difference between a human and an AI? Really excited to share turingtest.live, a site where you can play the Turing Test with GPT-4. You get randomly matched with either a human or an AI, and you have 5 minutes to decide which it is.
The Turing Test — Can you tell a human from an AI?
From turingtest.live
19K
Cameron Jones
@camrobjones
Jul 29, 2024
Nice to see @davidchalmers42 discussing our Turing test work at the Descartes lectures (even if he went on to call it “weak”..)
4K
Cameron Jones
@camrobjones
Jul 12, 2024
How much does language help you to reason recursively about other people’s beliefs (e.g. “I know that you think that Alice hates Bob”)? A 2015 study found that people can do this up to 7 levels! We replicated this with humans and found GPT-3 is accurate up to ~4.
4.8K
Cameron Jones
@camrobjones
May 15, 2024
Replying to @camrobjones
People judged GPT-4 to be human 54% of the time, compared to 22% for ELIZA and 67% for humans. The implication is that people are at chance in determining that GPT-4 is an AI, even though the study is powerful enough to detect differences from 50% accuracy.
176K
Cameron Jones
@camrobjones
Apr 1, 2025
Replying to @camrobjones
In previous work we found GPT-4 was judged to be human ~50% of the time in a 2-party Turing test, where ppts speak to *either* a human or a model. This is probably easier for several reasons. Here we ran a new study with Turing's original 3-party setup arxiv.org/abs/2503.23674
9.6K
Cameron Jones
@camrobjones
Oct 5, 2023
Initial results from turingtest.live! Humans were correctly classed as humans only 65% of the time. The best performing GPT-4 prompt fooled humans around 39% of the time. Eliza performs significantly worse at around 25%, but still better than GPT-3.5 at only 5%!
7.8K
Cameron Jones
@camrobjones
May 15, 2024
Replying to @camrobjones
Reasonable people will disagree about what constitutes passing the TT. I think the more urgent implication of these findings is that people cannot reliably determine whether current AI models are human after a 5 minute conversation dedicated to figuring this out.
9.5K