When a person fails to obtain an expected reward from an object in the environment, they face a credit assignment problem: Did the absence of reward reflect an extrinsic property of the environment or an intrinsic error in motor execution? To explore this problem, we modified a popular decision-making task used in studies of reinforcement learning, the two-armed bandit task. We compared a version in which choices were indicated by key presses, the standard response in such tasks, to a version in which choices were indicated by reaching movements, which afford execution failures. In the key press condition, participants exhibited a strong risk-aversion bias; strikingly, this bias reversed in the reaching condition. This result can be explained by a reinforcement model in which movement errors influence decision-making, either by gating reward prediction errors or by modifying an implicit representation of motor competence. Two further experiments support the gating hypothesis. First, we used a condition in which we provided visual cues indicative of movement errors but informed the participants that trial outcomes were independent of their actual movements. The main result was replicated, indicating that the gating process is independent of participants' explicit sense of control. Second, individuals with cerebellar degeneration failed to modulate their behavior between the key press and reach conditions, providing converging evidence of an implicit influence of movement error signals on reinforcement learning. These results provide a mechanistically tractable solution to the credit assignment problem.

decision-making | reinforcement learning | sensory prediction error | reward prediction error | cerebellum

When a diner reaches across the table and knocks over her coffee, the absence of the anticipated reward should be attributed to a failure of coordination rather than to a diminished value of the coffee.
Although this attribution is intuitive, current models of decision-making lack a mechanistic explanation for this seemingly simple computation. We set out to ask if, and how, selection processes in decision-making incorporate information specific to action execution and thus solve the credit assignment problem that arises when an expected reward is not obtained because of a failure in motor execution.

Humans are highly capable of tracking the value of stimuli, varying their behavior on the basis of reinforcement history (1, 2), and exhibiting sensitivity to intrinsic motor noise when reward outcomes depend on movement accuracy (3–5). In real-world behavior, the underlying cause of unrewarded events is often ambiguous: A lost point in tennis could occur because the player made a poor choice about where to hit the ball or failed to properly execute the stroke. In laboratory studies of reinforcement learning, however, the underlying cause of unrewarded events is typically unambiguous, dependent solely on properties of the stimulus or on motor noise. Thus, it remains unclear how people assign credit to extrinsic or intrinsic causes during reward learning.

We hypothesized that, during reinforcement learning, sensorimotor error signals could indicate when negative outcomes should be attributed to failures of the motor system. To test this idea, we developed a task in which outcomes could be assigned to properties of the environment or to intrinsic motor error. We find that the presence of signals associated with movement errors has a marked effect on choice behavior, and does so in a way consistent with the operation of an implicit learning mechanism that modulates credit assignment. This process appears to be impaired in individuals with cerebellar degeneration, consistent with a computational model in which movement errors modulate reinforcement learning.

Results

Participants performed a two-armed "bandit task" (ref. 1, Fig.
1A), seeking to maximize points that were later exchanged for money. For all participants, the outcome of each trial was predetermined by two functions: One function defined whether a target yielded a reward on that trial ("hit" or "miss"), and the other specified the magnitude of reward on hit trials (Fig. 1B). The expected value was equivalent for the two targets on all trials; however, risk, defined in terms of hit probability, was not. Under such conditions, people tend to be risk-averse (2, 6). We manipulated three variables: the manner in which participants made their choices, the feedback on "miss" trials, and the instructions.

In experiment 1, participants were assigned to one of three conditions (n = 20/group). In the Standard condition, choices were indicated by pressing one of two keys, the typical response method in bandit tasks (1, 2). Points were only earned on hit trials

Significance

Thorndike's Law of Effect states that when an action leads to a desirable outcome, that action is likely to be repeated. However, when an action is not rewarded, the brain must solve a credit assignment problem: Was the lack of reward attributable to a bad decision or to poor action execution? In a series of experiments, we find that salient motor error signals modulate biases in a simple decision-making task. This effect is independent of the participant's sense of control, suggesting that the error information impacts behavior in an implicit and automatic manner. We describe computational models of reinforcement learning in which execution error signals influence, or gate, the updating of value representations, providing a novel solution to the credit assignment problem.
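The gating hypothesis described above can be sketched as a small simulation. This is a hypothetical illustration, not the authors' model or code: the arm parameters, the ε-greedy policy, the execution-error rate, and the all-or-none suppression of the reward prediction error are all assumptions chosen for clarity. The bandit mirrors the task structure in that both arms have equal expected value but different risk (hit probability).

```python
# Hypothetical sketch (not the authors' code): a two-armed bandit with equal
# expected value but unequal risk, paired with a Q-learner whose update can be
# gated by execution-error signals.
import random

# Arm parameters (assumed for illustration): equal expected value (EV = 0.5).
# Arm 0: "safe"  -- high hit probability, small reward magnitude.
# Arm 1: "risky" -- low hit probability, large reward magnitude.
HIT_PROB = [0.8, 0.4]
MAGNITUDE = [0.625, 1.25]  # HIT_PROB[i] * MAGNITUDE[i] == 0.5 for both arms

def pull(arm, exec_error_rate=0.0):
    """Return (reward, exec_error). With probability exec_error_rate the
    movement 'misses' through motor noise rather than bandit statistics."""
    if random.random() < exec_error_rate:
        return 0.0, True
    hit = random.random() < HIT_PROB[arm]
    return (MAGNITUDE[arm] if hit else 0.0), False

def run(trials=1000, alpha=0.1, eps=0.1, gate=True, exec_error_rate=0.2):
    """Epsilon-greedy Q-learner. If gate=True, reward prediction errors on
    trials flagged as execution errors are suppressed (the gating account);
    if gate=False, execution failures are treated like ordinary misses."""
    Q = [0.0, 0.0]
    for _ in range(trials):
        if random.random() < eps:
            arm = random.randrange(2)
        else:
            arm = max((0, 1), key=lambda a: Q[a])
        reward, exec_error = pull(arm, exec_error_rate)
        delta = reward - Q[arm]  # reward prediction error
        if gate and exec_error:
            delta = 0.0          # credit the miss to the motor system, not the arm
        Q[arm] += alpha * delta
    return Q
```

Under these assumptions, a gated learner's value estimates stay near the true expected value of the chosen arm because trials attributed to motor error do not drag the estimate down, whereas an ungated learner counts every execution failure as evidence against the arm and undervalues it.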