From DDMs to DNNs: Using Process Data and Models of Decision-Making to Improve Human-AI Interactions
Correspondence concerning this article should be addressed to Mrugsen Nagsen Gopnarayan, E-mail:
[email protected] or [email protected]
FROM DDM TO DNN 2
Predicting human decisions has been a key topic of interest in psychology for over
a century. Prospect theory has proven successful in doing so for economic choices by incor-
porating elements of risk tendencies, loss aversion, and probability weighting (Kahneman
& Tversky, 1979). However, understanding how these choices come about requires process
models of decision-making that describe how decisions emerge over time (from sensory input
to motor output). In this regard, human decision-making has been modeled as the
accumulation of information over time until a decision threshold is met. The drift-diffusion
model (DDM) is a widely used mathematical model that represents the accumulation of
evidence over time at a specific rate (the drift rate), with some added noise (diffusion)
(Ratcliff, 1978; Ratcliff et al., 2016). Although it is the dominant model in the field, the
DDM is only one representative of the much larger class of evidence accumulation mod-
els (EAMs) (Busemeyer et al., 2019; Smith & Ratcliff, 2004). While all EAMs can predict
choices and reaction times, some may perform better on specific tasks or be more strongly
grounded in neural principles, such as the Leaky Competing Accumulator (LCA) (Usher
& McClelland, 2001). Taking the DDM as an example, parameters of EAMs reflect latent
psychological processes: A pre-existing bias for either option is modeled by the starting
point parameter (z). The decision boundary threshold (a) represents the cautiousness of
the decision maker, while the drift rate (v) models the rate of evidence accumulation. Non-decision
time (ndt) captures processes outside the decision itself, such as stimulus encoding and motor
execution. These parameters can be estimated using maximum
likelihood estimation (MLE) or a Bayesian approach, providing psychological inferences to
explain the data using these parameters. Unlike prospect theory, which only accounts for
decision-making in the case of risky choices, EAMs offer a more general framework for
decision-making that can be extended to perceptual tasks (Summerfield & Tsetsos, 2012).
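The accumulation process described above can be made concrete with a minimal simulation. The sketch below samples a single DDM trial; the parameter values are illustrative defaults, not estimates from any fitted dataset.

```python
import numpy as np

def simulate_ddm(v=1.0, a=2.0, z=0.5, ndt=0.3, dt=0.001, noise=1.0, rng=None):
    """Simulate one DDM trial.

    v: drift rate; a: boundary separation; z: relative starting point in (0, 1);
    ndt: non-decision time (s). Returns (choice, reaction_time).
    """
    if rng is None:
        rng = np.random.default_rng()
    x = z * a                       # evidence starts between the boundaries
    t = 0.0
    while 0.0 < x < a:              # accumulate until a boundary is crossed
        x += v * dt + noise * np.sqrt(dt) * rng.standard_normal()
        t += dt
    choice = 1 if x >= a else 0     # 1 = upper boundary, 0 = lower
    return choice, t + ndt          # RT = decision time + non-decision time
```

A positive drift rate makes upper-boundary choices more frequent and faster on average, which is exactly the choice–RT coupling EAMs exploit.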
In real life, people often need to make decisions for others (like buying a gift for
Secret Santa) or with others (such as deciding on a location for a team retreat). Variations
in others’ preferences add complexity to the social decision-making process. By translating
these different preferences into different payoffs, game theory can model social decision-
making. A classic example of such a game, the prisoner’s dilemma, involves two individuals
who must decide whether to betray each other or not. Based on the different payoffs for
betraying or cooperating, game theory can determine the rational strategy. However, game
theory makes strong assumptions about how rationality is defined (e.g., the Nash equilib-
rium) (Nash, 1951) and further assumes that all players have the capacity to identify rational
strategies. In reality, both assumptions often do not hold. Humans often rely on heuris-
tics and approximations to make decisions, and they exhibit social preferences and beliefs
that can drive them away from (equilibrium) predictions of game theory (Camerer, 2011).
Consequently, models that account for social utility, including the warm-glow, inequity aver-
sion, and envy aversion effects, have been developed to capture systematic deviations from
standard equilibrium predictions (Andreoni, 1990; Fehr & Schmidt, 1999; Li et al., 2020).
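The prisoner's dilemma logic described above can be expressed in a few lines of code. The payoff values below are the conventional textbook ones, chosen for illustration.

```python
# Prisoner's dilemma: PAYOFFS[(my_action, their_action)] = (my payoff, their payoff)
# Actions: 0 = cooperate, 1 = defect.
PAYOFFS = {
    (0, 0): (3, 3),   # mutual cooperation
    (0, 1): (0, 5),   # I cooperate, they defect
    (1, 0): (5, 0),   # I defect, they cooperate
    (1, 1): (1, 1),   # mutual defection
}

def best_response(their_action):
    """Return the action that maximizes my payoff, given the other's action."""
    return max((0, 1), key=lambda a: PAYOFFS[(a, their_action)][0])
```

Defection is the best response to both possible actions of the other player, so mutual defection is the Nash equilibrium even though mutual cooperation yields higher payoffs for both; this is the rational-strategy prediction that human behavior often deviates from.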
Importantly, humans take into account each other’s mental states, preferences, be-
liefs, intentions, and emotions when making decisions in social settings. This ability is
known as Theory of Mind (ToM), and it appears to be unique to humans and some higher-
order primates (Call & Tomasello, 2008; Lewis & Krupenye, 2021; Premack & Woodruff,
1978). Traditionally, two main approaches to ToM have been proposed. The first is ’theory
theory’, where it is assumed that each person is an ’intuitive psychologist’, forming personal
theories about how others behave or decide in different contexts and using these to predict
their thoughts and actions (Gopnik & Wellman, 1994). The second approach is called sim-
ulation theory, which suggests that humans simulate others to understand them (Gallese,
1998; Gordon, 1986). Thus, person A could simulate another person B by replaying (or
pre-playing) the decision B has made (or is about to make), using A's own decision-making
system. Notably, the discovery of mirror neurons that are active both when we perform an
action and when we witness someone else perform the same action has been seen as
supporting the simulation account of ToM (Gordon, 1986; Rizzolatti & Craighero, 2004).
of the LRP could be an indicator of different non-decision times in decisions with different
cognitive demands (memory-based vs. regular decisions) (Kraemer & Gluth, 2023). In
perceptual decision making, the centro-parietal positivity (CPP) has been proposed as a
marker of evidence accumulation (O’Connell et al., 2012), but this claim has recently been
challenged (Frömer et al., 2022).
Besides neural mechanisms, there are also powerful tools to track peripheral and
physiological signals that are intertwined with decision-making processes. Among these, eye-
tracking is particularly relevant, since eye movements are intricately coupled with decision-
making and serve as a window into a decision-maker’s attentional processes. Eye movements
seem to both influence and reflect preference (Shimojo et al., 2003) and have been used to
inform EAMs. The attentional Drift-Diffusion Model (aDDM) (Krajbich et al., 2010) as-
sumes that the drift rate depends on which option is currently fixated, with the fixated
option impacting the drift to a larger degree than non-fixated options. Importantly, many
studies have shown that taking eye-movements into account via the aDDM (or similar mod-
els) improves predicting decisions substantially (Gluth et al., 2020; Gluth et al., 2018). In
addition to eye movements, pupil responses are another critical physiological data source
as pupil dilation appears to be a reliable measure of arousal (Joshi et al., 2016), and re-
veals how decisions evolve (de Gee et al., 2014). Relatedly, choices made contrary to (and
thus overcoming) default responses lead to increased pupil dilation. The starting point
in the DDM (z), which represents such a response bias, is predictive of pupil dilation
(Sheng et al., 2020). Mouse tracking can also reflect the decision process. When individuals
encounter conflicting options or experience decision difficulty, their mouse can exhibit
curvilinear or wavering trajectories, reflecting internal conflict or deliberation. Conversely,
when individuals have a clear preference or make rapid decisions, their mouse movements
follow more direct and straight paths (Spivey & Dale, 2006). Heart rate, skin conductance,
and cortisol levels are other physiological methods for understanding the decision-making
process. However, these measures are more indirect and have higher latency, and are thus
not widely used in process-tracing studies.
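The fixation-dependent drift of the aDDM described above can be sketched as follows. The parameter values (attentional discount theta, fixation durations, noise) are illustrative choices, not the fitted estimates reported by Krajbich et al. (2010).

```python
import numpy as np

def simulate_addm(u_left, u_right, theta=0.3, d=0.002, noise=0.02,
                  a=1.0, mean_fix_ms=400, rng=None):
    """One trial of an attentional drift-diffusion model.

    The fixated option's value enters the drift at full weight; the unfixated
    option is discounted by theta. Evidence runs between -a (choose right)
    and +a (choose left); each step is 1 ms.
    """
    if rng is None:
        rng = np.random.default_rng()
    x, t = 0.0, 0
    fixated = int(rng.integers(2))           # 0 = left, 1 = right
    next_switch = rng.exponential(mean_fix_ms)
    while abs(x) < a:
        if fixated == 0:
            drift = d * (u_left - theta * u_right)
        else:
            drift = d * (theta * u_left - u_right)
        x += drift + noise * rng.standard_normal()
        t += 1
        if t >= next_switch:                 # alternate fixation
            fixated = 1 - fixated
            next_switch = t + rng.exponential(mean_fix_ms)
    return ('left' if x > 0 else 'right'), t
```

Because the fixated option dominates the drift, options that are looked at longer are chosen more often, reproducing the gaze–choice coupling observed empirically.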
wisher, 2003), and it encodes the utility of an anonymous partner in an allocation task (Hu
et al., 2023). Gray matter volume in the TPJ is predictive of individual variation in altruistic
preferences, and TPJ activation is associated with the ability to take others' perspective
and empathize with their suffering (Lamm et al., 2007). Another core ToM region, the
anterior cingulate cortex (ACC), is more active during cooperative social interactions with
human partners, as compared to non-social conditions (Rilling et al., 2002).
Obviously, humans do not have the capability to directly access the neurophysiolog-
ical information of others when interacting with them. Instead, they need to infer others’
thoughts and intentions by observing their behavior, body language, facial expressions, and
gaze patterns. Eye-tracking studies have shown that displaying the gaze allocation of one
participant to another in a coordination game improves understanding of each other’s pre-
ferred choices and facilitates strategic decision-making for maximizing rewards (Hausfeld
et al., 2020). Strikingly, a series of recent studies have shown that lay people seem to have
an intuitive understanding of a fundamental prediction of EAMs, which is that the time
it takes to make a decision usually indicates decision difficulty and thus also the difference
in preference for the options (fast responses indicate low difficulty and a large difference
in preference). In other words, humans take into account not only decisions but also
decision speed when inferring others' hidden preferences and beliefs (Arabadzhiyska
et al., 2022). By inverting the DDM and adopting a Bayesian approach, the process by
which individuals infer others’ preferences can be modeled (Gates et al., 2021). However,
the neural mechanisms of this ability remain elusive.
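How such an inference could work can be sketched with a deliberately simplified model: assume a single-boundary diffusion, whose first-passage times follow a Wald (inverse-Gaussian) distribution, and compute a grid posterior over drift rates from one observed response time. This is a toy version of the idea, not the actual model of Gates et al. (2021).

```python
import numpy as np

def wald_pdf(t, drift, boundary=1.0):
    """First-passage time density of a single-boundary diffusion
    (Wald / inverse Gaussian): mean = boundary/drift, shape = boundary**2."""
    mu, lam = boundary / drift, boundary ** 2
    return np.sqrt(lam / (2 * np.pi * t ** 3)) * \
        np.exp(-lam * (t - mu) ** 2 / (2 * mu ** 2 * t))

def infer_drift(observed_rt, drift_grid=None):
    """Grid posterior over drift rates (flat prior) given one observed RT."""
    if drift_grid is None:
        drift_grid = np.linspace(0.2, 4.0, 100)
    likelihood = wald_pdf(observed_rt, drift_grid)
    return drift_grid, likelihood / likelihood.sum()
```

A fast response concentrates the posterior on high drift rates (strong preference), while a slow response concentrates it on low ones, mirroring the intuition lay observers appear to use.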
interaction becomes a prerequisite to address the issue. In such instances, AI systems are
expected to simulate human roles, assisting or coordinating with the user. A competent
artificial agent should, therefore, be capable of understanding human behavior, accounting
for the likes, beliefs, intentions, desires, and emotions of the person they interact with.
This unique human capacity is termed Theory of Mind (ToM). Recent advancements in
neural networks, trained using multi-agent reinforcement learning, have laid a foundational
framework for creating ToM-capable artificial agents.
When it comes to emulating human social and cooperative behaviors, classic single-agent
machine learning approaches exhibit limitations. This is where multi-agent systems have
shown tremendous potential. Multi-agent AI systems are designed to simulate interactive
scenarios in which multiple intelligent agents either cooperate or compete to accomplish
certain objectives. Conventional single-agent RL approaches such as Q-learning or policy
gradients perform poorly as environments become dynamic: agents change their behavior
during training, creating a moving target, and computational complexity increases when
scaling to multiple agents (Gu et al., 2004; Wang et al., 2011).
However, inspired by social and behavioral sciences (Duffy et al., 1998), and powered
by Deep Learning, successful multi-agent AI approaches have emerged. These approaches,
often termed Multi-Agent Deep Reinforcement Learning (MADRL), involve the use of
interconnected deep neural networks as agents within a system, which are trained using
reinforcement learning principles. Each of these agents interacts with both the environment
and other agents in the system, to optimize a set of objectives or rewards. Capable of
learning to represent complex functions, predicting future states, or estimating the value of
different actions, these agents are vital tools in reinforcement learning (Mnih et al., 2015).
These agents can master complex multi-agent video games at a superhuman level using
only raw visual input (Tampuu et al., 2017). Further, these approaches can be
adapted to manage the traffic of autonomous vehicles (Zhao et al., 2021).
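A minimal way to see both the promise and the difficulty of multi-agent learning is to let two independent Q-learners play a repeated prisoner's dilemma. The hyperparameters below are arbitrary illustrative choices, and the stateless tabular setup is a drastic simplification of MADRL.

```python
import numpy as np

# Iterated prisoner's dilemma payoffs: REWARD[my_action, their_action],
# with 0 = cooperate and 1 = defect.
REWARD = np.array([[3.0, 0.0],
                   [5.0, 1.0]])

def train_independent_q(episodes=5000, alpha=0.1, eps=0.1, seed=0):
    """Two independent Q-learners in a repeated matrix game.

    Each agent treats the other as part of the environment, so its learning
    target shifts as the opponent adapts -- the non-stationarity that MADRL
    methods are designed to handle.
    """
    rng = np.random.default_rng(seed)
    q = np.zeros((2, 2))  # q[agent, action]; stateless, bandit-style values
    for _ in range(episodes):
        # epsilon-greedy action selection for both agents
        acts = [int(rng.integers(2)) if rng.random() < eps
                else int(np.argmax(q[i])) for i in range(2)]
        rewards = [REWARD[acts[0], acts[1]], REWARD[acts[1], acts[0]]]
        for i in range(2):
            q[i, acts[i]] += alpha * (rewards[i] - q[i, acts[i]])
    return q
```

With these payoffs, both agents typically learn that defection carries the higher value and drift toward the mutual-defection equilibrium that game theory predicts, illustrating why richer architectures are needed for cooperative behavior.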
This level of advancement in MADRL permits the modeling of complex social and
cooperative behaviors, akin to those observed in humans and animals (Lowe et al., 2017).
MADRL has served as a basic framework for creating ToM-capable artificial agents. Despite
the apparent success of AI models such as ChatGPT, which correctly solved 95% of the
text-based false-belief tasks it was tested on (Kosinski, 2023; but see Ullman, 2023), and
ToMnet, a DNN designed to model the agents it encounters solely from observations of
their behavior (Rabinowitz et al., 2018), it is debatable whether these networks have truly
attained ToM capabilities. For instance, instead of learning the difference between internal
states and true states for other agents, an agent like ToMnet might exploit the combination
of positions and distances between elements to navigate. This suggests that rather than
developing genuine ToM, the deep neural networks may have learned shortcuts, solving a
simpler problem than genuinely achieving Theory of Mind (Aru et al., 2023). Indeed, in
humans, ToM is not merely task-based. For example, children do not repetitively learn to
solve the Sally-Anne task to gain ToM. Therefore, instead of training deep learning agents
on specific tasks that might require ToM, true ToM abilities would potentially emerge in
more complex and open-ended environments (Aru et al., 2023).
Deep neural networks can be used to imitate the EAM's process of evidence accumulation
in humans. A) Social decision-making. One way humans may learn others' preferences
from their decisions is by simulating those decisions with their own mind. For example, if
you offer someone a choice between an apple and an orange and they pick the apple quickly,
you will infer that they have a stronger preference for apples. B) Training DNNs using
EAMs. Evidence accumulation models can be used to generate training data for DNNs
(above); doing so aligns the decision process of DNNs with that of humans. These DNN
modules could then be used in socially interacting robots (below), facilitating human-AI
interaction.
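The training pipeline sketched in panel B can be illustrated in miniature: simulate choices from a DDM, then fit a small neural network to reproduce the resulting choice pattern. Everything below (network size, learning rate, trial counts) is an illustrative toy, not the architecture of any cited study.

```python
import numpy as np

rng = np.random.default_rng(1)

def ddm_choice(v, a=2.0, dt=0.005):
    """Simulate one DDM trial starting midway between the boundaries;
    return 1.0 if the upper boundary is hit, else 0.0."""
    x = a / 2
    while 0 < x < a:
        x += v * dt + np.sqrt(dt) * rng.standard_normal()
    return 1.0 if x >= a else 0.0

# 1) Generate training data: value difference -> simulated DDM choice.
values = rng.uniform(-2, 2, size=1000)
choices = np.array([ddm_choice(v) for v in values])

# 2) Fit a tiny one-hidden-layer network (sigmoid output, cross-entropy loss)
#    to predict the choice from the value difference.
X, y = values[:, None], choices[:, None]
W1 = rng.normal(0, 0.5, (1, 8)); b1 = np.zeros(8)
W2 = rng.normal(0, 0.5, (8, 1)); b2 = np.zeros(1)
lr = 0.5
for _ in range(3000):
    h = np.tanh(X @ W1 + b1)                  # hidden layer
    p = 1 / (1 + np.exp(-(h @ W2 + b2)))      # predicted P(choose upper)
    g_out = (p - y) / len(X)                  # cross-entropy gradient
    g_h = g_out @ W2.T * (1 - h ** 2)         # backprop through tanh
    W2 -= lr * h.T @ g_out; b2 -= lr * g_out.sum(0)
    W1 -= lr * X.T @ g_h;   b1 -= lr * g_h.sum(0)

def predict(v):
    """Network's predicted probability of choosing the upper option."""
    h = np.tanh(np.array([[v]]) @ W1 + b1)
    return float(1 / (1 + np.exp(-(h @ W2 + b2))))
```

After training, the network recovers the DDM's psychometric curve: larger value differences yield higher predicted probabilities of choosing the corresponding option, so its choice behavior is aligned with the generative human-like process.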
References
Andreoni, J. (1990). Impure altruism and donations to public goods: A theory of warm-glow
giving. The Economic Journal, 100 (401), 464. https://doi.org/10.2307/2234133
Arabadzhiyska, D. H., Garrod, O. G., Fouragnan, E., Luca, E. D., Schyns, P. G., & Phil-
iastides, M. G. (2022). A common neural account for social and nonsocial decisions.
The Journal of Neuroscience, 42 (48), 9030–9044. https://doi.org/10.1523/jneurosci.
0375-22.2022
Aru, J., Labash, A., Corcoll, O., & Vicente, R. (2023). Mind the gap: Challenges of deep
learning approaches to theory of mind. Artificial Intelligence Review, 56 (9), 9141–9156.
https://doi.org/10.1007/s10462-023-10401-x
Busemeyer, J. R., Gluth, S., Rieskamp, J., & Turner, B. M. (2019). Cognitive and neural
bases of multi-attribute, multi-alternative, value-based decisions. Trends in Cogni-
tive Sciences, 23 (3), 251–263. https://doi.org/10.1016/j.tics.2018.12.003
Call, J., & Tomasello, M. (2008). Does the chimpanzee have a theory of mind? 30 years
later. Trends in Cognitive Sciences, 12 (5), 187–192. https://doi.org/10.1016/j.tics.
2008.02.010
Camerer, C. F. (2011). Behavioral game theory - experiments in strategic interaction. Prince-
ton University Press.
Christiano, P. F., Leike, J., Brown, T., Martic, M., Legg, S., & Amodei, D. (2017). Deep
reinforcement learning from human preferences. Advances in neural information
processing systems, 30.
de Gee, J. W., Knapen, T., & Donner, T. H. (2014). Decision-related pupil dilation reflects
upcoming choice and individual bias. Proceedings of the National Academy of Sciences,
111 (5). https://doi.org/10.1073/pnas.1317557111
Dosovitsky, G., & Bunge, E. L. (2021). Bonding with bot: User feedback on a chatbot for
social isolation. Frontiers in digital health, 3, 735053.
Duffy, J., Epstein, J. M., & Axtell, R. (1998). Growing artificial societies: Social science
from the bottom up. Southern Economic Journal, 64 (3), 791. https://doi.org/10.
2307/1060800
Fehr, E., & Schmidt, K. M. (1999). A theory of fairness, competition, and cooperation.
The Quarterly Journal of Economics, 114 (3), 817–868. https://doi.org/10.1162/
003355399556151
Frömer, R., Nassar, M., Ehinger, B., & Shenhav, A. (2022). Common neural choice signals
emerge artifactually amidst multiple distinct value signals. https://doi.org/10.1101/
2022.08.02.502393
Fulmer, R., Joerin, A., Gentile, B., Lakerink, L., Rauws, M., et al. (2018). Using psycho-
logical artificial intelligence (tess) to relieve symptoms of depression and anxiety:
Randomized controlled trial. JMIR mental health, 5 (4), e9782.
Gallese, V. (1998). Mirror neurons and the simulation theory of mind-reading. Trends in
Cognitive Sciences, 2 (12), 493–501. https://doi.org/10.1016/s1364-6613(98)01262-5
Gates, V., Callaway, F., Ho, M. K., & Griffiths, T. L. (2021). A rational model of people’s
inferences about others’ preferences based on response times. Cognition, 217, 104885.
https://doi.org/10.1016/j.cognition.2021.104885
Gluth, S., Rieskamp, J., & Buchel, C. (2012). Deciding when to decide: Time-variant sequen-
tial sampling models explain the emergence of value-based decisions in the human
brain. Journal of Neuroscience, 32 (31), 10686–10698. https://doi.org/10.1523/jneurosci.0727-12.2012
Gluth, S., & Fontanesi, L. (2016). Wiring the altruistic brain. Science, 351 (6277), 1028–1029.
https://doi.org/10.1126/science.aaf4688
Gluth, S., Kern, N., Kortmann, M., & Vitali, C. L. (2020). Value-based attention but not
divisive normalization influences decisions with multiple alternatives. Nature Human
Behaviour, 4 (6), 634–645. https://doi.org/10.1038/s41562-020-0822-0
Gluth, S., Rieskamp, J., & Büchel, C. (2013). Deciding not to decide: Computational and
neural evidence for hidden behavior in sequential choice (T. Behrens, Ed.). PLoS
Computational Biology, 9 (10), e1003309. https://doi.org/10.1371/journal.pcbi.1003309
Gluth, S., Spektor, M. S., & Rieskamp, J. (2018). Value-based attentional capture affects
multi-alternative decision making. eLife, 7. https://doi.org/10.7554/elife.39659
Gopnik, A., & Wellman, H. M. (1994). The theory theory. In Mapping the mind (pp. 257–
293). Cambridge University Press. https://doi.org/10.1017/cbo9780511752902.011
Gordon, R. M. (1986). Folk psychology as simulation. Mind and Language, 1 (2), 158–71.
https://doi.org/10.1111/j.1468-0017.1986.tb00324.x
Gu, J., Huh, Y. O., Jiang, F., Caraway, N. P., Romaguera, J. E., Zaidi, T. M., Fernandez,
R. L., Zhang, H., Khouri, I. F., & Katz, R. L. (2004). Evaluation of peripheral blood
involvement of mantle cell lymphoma by fluorescence in situ hybridization in
comparison with immunophenotypic and morphologic findings. Modern Pathology,
17 (5), 553–560. https://doi.org/10.1038/modpathol.3800068
Hare, T. A., Malmaud, J., & Rangel, A. (2011). Focusing attention on the health aspects
of foods changes value signals in vmPFC and improves dietary choice. Journal of
Neuroscience, 31 (30), 11077–11087. https://doi.org/10.1523/jneurosci.6383-10.2011
Hausfeld, J., von Hesler, K., & Goldlücke, S. (2020). Strategic gaze: An interactive eye-
tracking study. Experimental Economics, 24 (1), 177–205. https://doi.org/10.1007/
s10683-020-09655-x
Hu, J., Konovalov, A., & Ruff, C. C. (2023). A unified neural account of contextual and
individual differences in altruism. eLife, 12. https://doi.org/10.7554/elife.80667
Hunt, L. T., & Hayden, B. Y. (2017). A distributed, hierarchical and recurrent framework
for reward-based choice. Nature Reviews Neuroscience, 18 (3), 172–182. https://doi.
org/10.1038/nrn.2017.7
Jahn, A., Nee, D. E., Alexander, W. H., & Brown, J. W. (2014). Distinct regions of anterior
cingulate cortex signal prediction and outcome evaluation. NeuroImage, 95, 80–89.
https://doi.org/10.1016/j.neuroimage.2014.03.050
Joshi, S., Li, Y., Kalwani, R. M., & Gold, J. I. (2016). Relationships between pupil diameter
and neuronal activity in the locus coeruleus, colliculi, and cingulate cortex. Neuron,
89 (1), 221–234. https://doi.org/10.1016/j.neuron.2015.11.028
Jumper, J., Evans, R., Pritzel, A., Green, T., Figurnov, M., Ronneberger, O., Tunyasu-
vunakool, K., Bates, R., Žıdek, A., Potapenko, A., Bridgland, A., Meyer, C., Kohl,
S. A. A., Ballard, A. J., Cowie, A., Romera-Paredes, B., Nikolov, S., Jain, R., Adler,
J., . . . Hassabis, D. (2021). Highly accurate protein structure prediction with Al-
phaFold. Nature, 596 (7873), 583–589. https://doi.org/10.1038/s41586-021-03819-2
Kahneman, D., & Tversky, A. (1979). Prospect theory: An analysis of decision under risk.
Econometrica, 47 (2), 263. https://doi.org/10.2307/1914185
Li, S. (Kevin), Kokkoris, M. D., & Savani, K. (2020). Does everyone have the potential to
achieve their ideal body weight? Lay theories about body weight and support for price
discrimination policies. Organizational Behavior and Human Decision Processes, 157,
129–142. https://EconPapers.repec.org/RePEc:eee:jobhdp:v:157:y:2020:i:c:p:129-142
Kolling, N., Behrens, T. E. J., Mars, R. B., & Rushworth, M. F. S. (2012). Neural mech-
anisms of foraging. Science, 336 (6077), 95–98. https://doi.org/10.1126/science.
1216930
Kornhuber, H. H., & Deecke, L. (1965). Hirnpotentialänderungen bei Willkürbewegungen
und passiven Bewegungen des Menschen: Bereitschaftspotential und reafferente Po-
tentiale. Pflügers Archiv für die gesamte Physiologie des Menschen und der Tiere, 284
(1), 1–17. https://doi.org/10.1007/bf00412364
Kosinski, M. (2023). Theory of mind may have spontaneously emerged in large language
models. arXiv preprint arXiv:2302.02083.
Kraemer, P. M., & Gluth, S. (2023). Episodic memory retrieval affects the onset and dy-
namics of evidence accumulation during value-based decisions. Journal of Cognitive
Neuroscience, 35 (4), 692–714. https://doi.org/10.1162/jocn_a_01968
Krajbich, I., Armel, C., & Rangel, A. (2010). Visual fixations and the computation and
comparison of value in simple choice. Nature Neuroscience, 13 (10), 1292–1298. https:
//doi.org/10.1038/nn.2635
Kraus, S., Azaria, A., Fiosina, J., Greve, M., Hazon, N., Kolbe, L., Lembcke, T.-B., Muller,
J. P., Schleibaum, S., & Vollrath, M. (2020). Ai for explaining decisions in multi-
agent environments. Proceedings of the AAAI conference on artificial intelligence,
34 (09), 13534–13538.
Lamm, C., Batson, C. D., & Decety, J. (2007). The neural substrate of human empathy:
Effects of perspective-taking and cognitive appraisal. Journal of Cognitive Neuro-
science, 19 (1), 42–58. https://doi.org/10.1162/jocn.2007.19.1.42
Lecun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). Gradient-based learning applied
to document recognition. Proceedings of the IEEE, 86 (11), 2278–2324. https://doi.
org/10.1109/5.726791
Lewis, L., & Krupenye, C. (2021). Theory of mind in nonhuman primates. https://doi.org/
10.31234/osf.io/c568f
Lowe, R., Wu, Y. I., Tamar, A., Harb, J., Pieter Abbeel, O., & Mordatch, I. (2017). Multi-
agent actor-critic for mixed cooperative-competitive environments. Advances in neural
information processing systems, 30.
Maghsudi, S., Lan, A., Xu, J., & van Der Schaar, M. (2021). Personalized education in the
artificial intelligence era: What to expect next. IEEE Signal Processing Magazine,
38 (3), 37–50.
McKinney, S. M., Sieniek, M., Godbole, V., Godwin, J., Antropova, N., Ashrafian, H., Back,
T., Chesus, M., Corrado, G. S., Darzi, A., Etemadi, M., Garcia-Vicente, F., Gilbert,
F. J., Halling-Brown, M., Hassabis, D., Jansen, S., Karthikesalingam, A., Kelly,
C. J., King, D., . . . Shetty, S. (2020). International evaluation of an AI system for
breast cancer screening. Nature, 577 (7788), 89–94. https://doi.org/10.1038/s41586-019-1799-6
Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A. A., Veness, J., Bellemare, M. G., Graves, A.,
Riedmiller, M., Fidjeland, A. K., Ostrovski, G., Petersen, S., Beattie, C., Sadik, A.,
Antonoglou, I., King, H., Kumaran, D., Wierstra, D., Legg, S., & Hassabis, D.
(2015). Human-level control through deep reinforcement learning. Nature, 518 (7540),
529–533. https://doi.org/10.1038/nature14236
Nash, J. (1951). Non-cooperative games. The Annals of Mathematics, 54 (2), 286. https://doi.org/10.2307/1969529
O’Connell, R. G., Dockree, P. M., & Kelly, S. P. (2012). A supramodal accumulation-to-
bound signal that determines perceptual decisions in humans. Nature Neuroscience,
15 (12), 1729–1735. https://doi.org/10.1038/nn.3248
Ouyang, L., Wu, J., Jiang, X., Almeida, D., Wainwright, C., Mishkin, P., Zhang, C., Agar-
wal, S., Slama, K., Ray, A., et al. (2022). Training language models to follow instruc-
tions with human feedback. Advances in Neural Information Processing Systems, 35,
27730–27744.
Padoa-Schioppa, C., & Assad, J. A. (2006). Neurons in the orbitofrontal cortex encode
economic value. Nature, 441 (7090), 223–226. https://doi.org/10.1038/nature04676
Posner, R. (1986). Zur systematik der beschreibung verbaler und nonverbaler kommunika-
tion. In Perspektiven auf sprache (pp. 267–314). De Gruyter. https://doi.org/10.
1515/9783110886238-016
Premack, D., & Woodruff, G. (1978). Does the chimpanzee have a theory of mind?
Behavioral and Brain Sciences, 1 (4), 515–526. https://doi.org/10.1017/S0140525X00076512
Proudfit, G. H. (2014). The reward positivity: From basic research on reward to a biomarker
for depression. Psychophysiology, 52 (4), 449–459. https://doi.org/10.1111/psyp.
12370
Rabinowitz, N., Perbet, F., Song, F., Zhang, C., Eslami, S. A., & Botvinick, M. (2018).
Machine theory of mind. International conference on machine learning, 4218–4227.
Rafiei, F., & Rahnev, D. (2022). RTNet: A neural network that exhibits the signatures of
human perceptual decision making. bioRxiv, 2022–08.
Ratcliff, R. (1978). A theory of memory retrieval. Psychological Review, 85 (2), 59–108.
https://doi.org/10.1037/0033-295x.85.2.59
Ratcliff, R., Smith, P. L., Brown, S. D., & McKoon, G. (2016). Diffusion decision model:
Current issues and history. Trends in Cognitive Sciences, 20 (4), 260–281. https://doi.org/10.1016/j.tics.2016.01.007
Rilling, J. K., Gutman, D. A., Zeh, T. R., Pagnoni, G., Berns, G. S., & Kilts, C. D. (2002).
A neural basis for social cooperation. Neuron, 35 (2), 395–405. https://doi.org/10.
1016/s0896-6273(02)00755-9
Rizzolatti, G., & Craighero, L. (2004). The mirror-neuron system. Annual Review
of Neuroscience, 27 (1), 169–192. https://doi.org/10.1146/annurev.neuro.27.070203.144230
Saxe, R., & Kanwisher, N. (2003). People thinking about thinking people: The role of
the temporo-parietal junction in “theory of mind”. NeuroImage, 19 (4), 1835–1842.
https://doi.org/10.1016/s1053-8119(03)00230-1
Sheng, F., Ramakrishnan, A., Seok, D., Zhao, W. J., Thelaus, S., Cen, P., & Platt, M. L.
(2020). Decomposing loss aversion from gaze allocation and pupil dilation. Proceed-
ings of the National Academy of Sciences, 117 (21), 11356–11363. https://doi.org/
10.1073/pnas.1919670117
Shenhav, A., Cohen, J. D., & Botvinick, M. M. (2016). Dorsal anterior cingulate cortex and
the value of control. Nature Neuroscience, 19 (10), 1286–1291. https://doi.org/10.
1038/nn.4384
Shibasaki, H., & Hallett, M. (2006). What is the bereitschaftspotential? Clinical Neurophys-
iology, 117 (11), 2341–2356. https://doi.org/10.1016/j.clinph.2006.04.025
Shimojo, S., Simion, C., Shimojo, E., & Scheier, C. (2003). Gaze bias both reflects and
influences preference. Nature Neuroscience, 6 (12), 1317–1322. https://doi.org/10.
1038/nn1150
Smith, P. L., & Ratcliff, R. (2004). Psychology and neurobiology of simple decisions. Trends
in Neurosciences, 27 (3), 161–168. https://doi.org/10.1016/j.tins.2004.01.006
Sobash, R. A., Romine, G. S., & Schwartz, C. S. (2020). A comparison of neural-network and
surrogate-severe probabilistic convective hazard guidance derived from a convection-