Lecture Notes: Learning
This chapter covers the major theories of learning (Classical Conditioning, Operant Conditioning, and Social Learning
Theory).
Classical Conditioning (CC)—This approach was developed by Ivan Pavlov. Pavlov was not originally a psychologist—he
was a physiologist who studied how dogs digested their food. Whoa! In the course of this research he noticed that his
dogs would salivate when he entered the lab even though he had no food. He assumed that they had “associated” him
with food and that the salivation was occurring to prepare for eating. Pavlov was intrigued by this and couldn’t leave it
alone. He shifted his research to investigating this phenomenon.
Pavlov assumed that if the animals had associated him with the sight of food he could condition them to salivate to any
stimulus, as long as it had first been paired with food.
Now, everything in the CC process has a label. Pay close attention!
FOOD ---------------------> SALIVATION
Unconditioned Stimulus      Unconditioned Response
(UCS)                       (UCR)
Food naturally elicits the salivation response. To elicit a response means to draw it out involuntarily. You do not teach or condition an animal to salivate
to food. It happens naturally; it is an inborn reflex. It is unconditioned. CC takes advantage of this natural process
by pairing a Neutral Stimulus (NS), a bell, with the UCS (food). Each pairing is called a trial. So, look at the labels below:
BELL
Neutral Stimulus
(NS)
Before conditioning, the bell does not cause the animal to salivate.
CC involves pairing the NS (Bell) with the UCS (Food).
NS-------------UCS------------->UCR
(BELL)         (FOOD)           (SALIVATION)
(Each pairing of the bell and the food is called a trial)
After several trials one tests to see if conditioning occurred. So, the bell is presented alone, and, hooray, it causes
salivation!
BELL----------------->SALIVATION
(CS)                  (CR)
Note: The bell is now called a Conditioned Stimulus. The salivation is now called a Conditioned Response.
Pavlov also studied some other important phenomena. They are discussed below:
a) Stimulus Generalization-This occurs when the response spreads to other similar stimuli. So, for instance, Pavlov’s dog,
once conditioned, may salivate to other similar bell tones. Consider a real life human example: A child is stung by a bee.
Afterwards, he may not only fear bees but all flying insects.
b) Stimulus Discrimination-This occurs when the animal salivates to only one particular bell tone and no other. Consider a
real life human example: A man is bitten by a Golden Retriever. He now fears Golden Retrievers but no other breed.
c) Extinction-Pavlov wanted to find out what would happen if he kept presenting the bell (CS) without presenting the food
(UCS). Over time the salivation response decreased until it stopped. Essentially, it was “turned off” as the animal’s
nervous system learned that the bell no longer signaled food. It would be a waste of vital energy to produce saliva for no
good reason. So, it stops!
d) Spontaneous Recovery-After extinction had occurred and some time had passed, Pavlov decided to see what might happen if he rang the bell
again. Lo and behold, the salivation returned. It is as if the animal never forgot it: the response was always there because it had
been learned!
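The build-up of the conditioned response over trials, and its fading during extinction, can be pictured with a small toy simulation. This is only an illustrative sketch, not Pavlov's procedure or a model from these notes. It assumes the bell-salivation association is a number between 0 and 1 that moves a fixed fraction of the way toward 1 when food follows the bell, and toward 0 when the bell is presented alone. The function name run_trials and the rate of 0.3 are made up for the example, and the sketch does not capture spontaneous recovery.

```python
def run_trials(strength, n_trials, food_present, rate=0.3):
    """Return the association strength after each trial.

    Toy assumption (not from the notes): strength moves a fixed
    fraction (`rate`) of the way toward 1.0 when food follows the
    bell, and toward 0.0 when the bell is presented alone.
    """
    target = 1.0 if food_present else 0.0
    history = []
    for _ in range(n_trials):
        strength += rate * (target - strength)
        history.append(round(strength, 3))
    return history


# Five conditioning trials (bell + food), then five extinction trials (bell alone).
acquisition = run_trials(0.0, 5, food_present=True)
extinction = run_trials(acquisition[-1], 5, food_present=False)

print("bell + food:", acquisition)
print("bell alone: ", extinction)
```

The first list climbs toward 1 (conditioning) and the second falls back toward 0 (extinction).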
Now, John B. Watson, while developing Behaviorism, had read Pavlov’s work and was quite impressed. Remember from
Chapter 1 that Watson stated that the focus of psychology should be on observable behavior and that everything about
the human being had been learned. He decided to use CC principles to condition a young infant (“Little Albert”) to fear a
white lab rat. Initially, Little Albert had no fear of the rat. It was Watson’s goal to create an environmental event that would
lead to the development of fear of the rat. So, each time the rat came near, Watson would make a loud noise behind
Albert’s head. This created a natural fear response (Albert would be startled and cry). After a few pairings (trials) of the rat
and the loud noise Watson tested to see if the rat alone would produce fear in Albert. And, of course it did. Watson had
the environmental evidence he was looking for to support his theory of Behaviorism.
So, here is a breakdown of the CC terminology. Initially the rat is a neutral stimulus (NS): it has no effect on Albert. The
loud noise, however, naturally causes fear in Albert. It is an unconditioned stimulus (UCS). The fear is an unconditioned
response (UCR). Each pairing of the rat and the loud noise is a trial. After a few trials Watson tested to see if conditioning
occurred. It did! Now, the rat is a conditioned stimulus (CS) and the fear is a conditioned response (CR).
One more thing—stimulus generalization was noted in this study. Albert became afraid of some other white furry things
(e.g., Santa Claus mask, rabbit).
A precursor to the development of Operant Conditioning (see below) can be seen in E.L. Thorndike’s
work with cats. Thorndike would place hungry cats in “puzzle boxes” outside of which was a plate of food. The job of
the cat was to figure out how to escape via a latch system; upon escape it would be allowed to eat a bite of
food. The food served as a reward for the escape behavior. As soon as the animal had its reward Thorndike would place it
back in the box, start a stopwatch, and time how long the animal took to escape the 2nd time, 3rd time, etc. Thorndike
noted that the escape time decreased on each successive trial, indicating that learning was taking place. Based on this
research Thorndike developed the “Law of Effect,” which essentially states that behaviors which are followed by positive
consequences tend to be repeated.
Operant Conditioning—B.F. Skinner
Skinner coined the term “operant.” It means to “operate on one’s environment”, or quite simply, to “behave.” In this form of
conditioning behaviors are emitted or voluntary. Behaviors are then followed by either a reinforcement (reward) or a
punishment. Skinner conducted his research on rats and pigeons in the famous “Skinner Box” and stated that the results
gathered in his laboratory investigations applied to human beings. Yes, he believed that humans, rats, and pigeons all
learned in the same manner. So, in this theory behaviors are shaped/controlled via reward and punishment.
Here is a breakdown of the types of reinforcement and punishment that he investigated.
REINFORCEMENT--Reinforcement increases the chances that a behavior will be repeated.
PUNISHMENT--Punishment decreases the chances that a behavior will be repeated.
There are two types of each:
Positive Reinforcement/Negative Reinforcement
Positive Punishment/Negative Punishment
REINFORCEMENT
Positive (+) Something pleasant is added to one’s life that increases the chances the behavior will be repeated (e.g.,
money for a day’s work).
Negative (-) Occurs when we engage in a behavior which removes (subtracts, takes away) something unpleasant from
our life (e.g., taking 2 aspirin to get rid of a headache—if it works we repeat the behavior in the future). So, the behavior
(operant) is aspirin taking. It removes the unpleasantness of the headache. Getting rid of the pain reinforces aspirin taking
behavior—you will do it again the next time you have a headache! Wow!
Many students see the word “negative” here and think it’s punishment. It is not! Remember, it’s a form of reinforcement so
the likelihood of a behavior being repeated is increased.
PUNISHMENT
Positive (+) Occurs when something unpleasant is added to our life (e.g., getting screamed at for misbehavior).
Negative (-) Occurs when something pleasant is removed from our life (e.g., taking away a kid’s iPhone for being
disrespectful).
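The four cases above boil down to two questions: was something added or removed, and was that something pleasant or unpleasant? As a study aid, here is a minimal sketch of that lookup. The function name classify_consequence and the string labels are invented for this example.

```python
def classify_consequence(change, stimulus):
    """Name the type of consequence.

    change:   "added" or "removed"
    stimulus: "pleasant" or "unpleasant"
    (Hypothetical helper; the labels follow the definitions in these notes.)
    """
    table = {
        ("added", "pleasant"): "positive reinforcement",
        ("removed", "unpleasant"): "negative reinforcement",
        ("added", "unpleasant"): "positive punishment",
        ("removed", "pleasant"): "negative punishment",
    }
    return table[(change, stimulus)]


# The aspirin example: the headache (unpleasant) is removed.
print(classify_consequence("removed", "unpleasant"))
# The phone example: the phone (pleasant) is removed.
print(classify_consequence("removed", "pleasant"))
```

Notice that "positive" and "negative" only say whether something was added or removed. They do not mean good or bad.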
Two additional terms: Partial and continuous reinforcement. Partial reinforcement occurs when a behavior is reinforced
every once in a while, not after each behavior that has been emitted. Think about scratch-off lottery tickets. You don’t win
each time. That would be continuous reinforcement, and it would bankrupt the lottery! So, they let you win every once in a
while by giving you a little back. This keeps you playing!
Partial reinforcement takes 4 different forms. They are all considered “Schedules of Reinforcement” and are discussed below.
Schedules of Reinforcement-Skinner placed his animals on different schedules of reinforcement to see their effects on
behavior (e.g., bar pressing in the Skinner Box).
There are both Ratio Schedules and Interval Schedules.
Ratio Schedules-In a ratio schedule the animal only receives the reward after a certain number of bar presses. There are
2 types:
a) Fixed Ratio—The animal must press the bar a specific number of times before a reward is delivered. So, you could
place the rat on a FR20 schedule in which, over time, the animal would learn to press the bar 20 times in a row to get the
reward.
b) Variable Ratio-In this schedule the number of bar presses is not fixed. It usually averages out to a certain number,
but the animal does not know how many presses will produce the reward.
Interval Schedules—These are based on responding (bar pressing) at the right time in order to get the reward. Again,
there are 2 types.
a) Fixed Interval-In this schedule the animal will learn to press the bar after a specific amount of time has passed. So, if
we had a FI 30 second schedule the animal would learn to start pressing the bar at around 28-30 seconds. Only after the
30 second mark does the reward become available.
b) Variable Interval-In this schedule the time varies as to when the reward becomes available. So, the animal keeps
“checking” (bar pressing) to see if it will indeed receive the reward.
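To make the four schedules concrete, here is a rough simulation sketch. It is not Skinner's apparatus or anything from the text. Each function returns a hypothetical press(t) function that says whether a bar press at time t (in seconds) earns a reward. The function names, the uniform random draws used for the "variable" schedules, and the one-press-per-second rat are all assumptions made for illustration.

```python
import random


def fixed_ratio(n):
    """Reward every n-th press (n=20 gives FR20). Time t is ignored."""
    count = 0

    def press(t):
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True
        return False

    return press


def variable_ratio(mean_n, rng):
    """Reward after an unpredictable number of presses averaging mean_n."""
    needed = rng.randint(1, 2 * mean_n - 1)
    count = 0

    def press(t):
        nonlocal count, needed
        count += 1
        if count >= needed:
            count = 0
            needed = rng.randint(1, 2 * mean_n - 1)
            return True
        return False

    return press


def fixed_interval(seconds):
    """Reward the first press made at least `seconds` after the last reward."""
    last = 0.0

    def press(t):
        nonlocal last
        if t - last >= seconds:
            last = t
            return True
        return False

    return press


def variable_interval(mean_seconds, rng):
    """Like fixed_interval, but the required wait changes after each reward."""
    wait = rng.uniform(0, 2 * mean_seconds)
    last = 0.0

    def press(t):
        nonlocal last, wait
        if t - last >= wait:
            last = t
            wait = rng.uniform(0, 2 * mean_seconds)
            return True
        return False

    return press


# A simulated rat pressing once per second for 60 seconds.
rng = random.Random(0)
schedules = {
    "FR20": fixed_ratio(20),
    "VR20": variable_ratio(20, rng),
    "FI30": fixed_interval(30),
    "VI30": variable_interval(30, rng),
}
for name, press in schedules.items():
    rewards = sum(press(t) for t in range(1, 61))
    print(name, "rewards in 60 presses:", rewards)
```

On the ratio schedules the reward depends on how many presses were made. On the interval schedules extra presses before the time is up earn nothing.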
Now, your text presents the schedules with human, not animal, examples. Make sure to read these and pay attention to
which ones result in the highest amount of responding.
It is important to note that Skinner did not believe in free will. We behave to either obtain rewards or to avoid punishments.
Thus, thoughts (cognitions) were not important to him.
Latent Learning-Read/Study in text.
OK, the last major theory is Social Learning Theory. It was proposed by Albert Bandura. In this model we learn by
observing others’ behavior. It is sometimes referred to as observational learning or modeling theory. In this model
cognition is important! This means that we “think” before we model: we decide if we will model a behavior or not.
Here is the general rule:
We are more likely to model a behavior when the model has been reinforced for that behavior; we are less likely to model
a behavior when the model has been punished for that behavior.
Bandura demonstrated his theory in the classic “Bobo” doll study in which children modeled an adult’s aggressive actions
towards an inflatable doll.