Talk Title: Investigating Abstract Reasoning in Humans and Machines
Talk Abstract: State-of-the-art AI models have matched or exceeded human performance on reasoning benchmarks such as the Abstraction and Reasoning Corpus (ARC), a prominent test of conceptual understanding and analogy. But does high accuracy on such benchmarks mean that these models understand and reason with the humanlike abstractions the task creators intended? In this talk I will describe an evaluation methodology, inspired by experimental methods in cognitive science, for assessing and comparing the abstraction abilities of AI “reasoning” models and human participants. Our evaluations show that, while some models match or exceed human accuracy, their reasoning frequently relies on surface-level patterns or spurious associations and thus fails to generalize. I will speculate on what is still needed to capture humanlike abstract reasoning in AI models.
Bio: Melanie Mitchell is a Professor at the Santa Fe Institute. Her research lies at the intersection of artificial intelligence, cognitive science, and complex systems; she has authored or edited six books and published numerous scholarly papers in these fields. Her 2009 book Complexity: A Guided Tour won the 2010 Phi Beta Kappa Science Book Award, and her 2019 book Artificial Intelligence: A Guide for Thinking Humans was named one of the five best books on AI by the New York Times and the Wall Street Journal. Melanie’s public outreach on science includes a quarterly column for Science Magazine, a Substack newsletter on AI, a 2024 podcast on “The Nature of Intelligence,” and a free online course, “Introduction to Complexity,” on the Santa Fe Institute’s Complexity Explorer website.