CHAPTER 7
LEARNING
CHAPTER PREVIEW QUESTIONS
What are the three basic kinds of learning?
How did Ivan Pavlov discover classical conditioning?
Who made Little Albert afraid of the rat and how was it done?
How are behaviors learned and unlearned in classical conditioning?
What did Robert Rescorla, Leon Kamin, and John Garcia add to our understanding of classical conditioning?
Does classical conditioning explain all human learning?
How are behaviors learned and unlearned in operant conditioning?
How does the schedule of reinforcement affect operant conditioning?
Can punishment be an effective method of eliminating undesirable
behaviors?
Are there problems or negative side effects associated with the use of
punishment?
Can we learn without undergoing classical or operant conditioning?
Does watching violent shows make children more violent?
Raygor
Unlike some psychological terms, learning is a word that the average person uses all the time. Even small children will say, "Look, I learned how to tie my shoes!" As scientists, however, we need a more exact definition of the word learning. Psychologists do have a formal definition of learning, but the meaning is not very different from the way the word is used in everyday life. Here is the scientific definition of learning: a relatively permanent change in behavior due to experience. It seems obvious that a person's behavior must change if they learn something, but we don't want to include all changes in behavior in our definition. First, the change must be relatively permanent. When a person gets hungry, they may start to eat. When they are full, they will usually stop. These are changes in behavior, but we don't want to say that the person has "learned" to eat or stop eating several times each day. There are many temporary changes in behavior, such as eating, sleeping, and getting angry, that don't qualify as learned behaviors. Since these are not relatively permanent changes, our definition excludes them.
Q: Why say that learned behaviors must be due to experience?
Can you think of some behaviors that are relatively permanent but wouldn't qualify as learning? During our early years, our ability to reach up and grab objects held high in the air increases steadily. This ability to reach to greater and greater heights is a change in behavior, and it is relatively permanent. This change isn't learned, however. It is the result of maturation. By defining learned behaviors as resulting from experience, we exclude changes in behavior that are due to maturation, disease, drugs, or injury. These changes in behavior may be relatively permanent, but we can't say that they are learned.
stimulus: Information perceived through the senses.
response: Any activity of the muscles or other identifiable behavior.
Now that we have a scientific definition of learning, let's look at two important terms: stimulus and response. In order to understand the following sections on classical and operant conditioning, it is important that you become familiar and comfortable with these two scientific terms. A stimulus is anything that comes in through your senses. It could be something simple like a smell, a light, a bell, or a tone. There are also complex stimuli, like the contents of a book or lecture. In psychology, we usually study simple stimuli. Responses can also be simple or complex. A response is anything that goes out through your muscles, anything you do. The responses we study in psychology also tend to be simple ones, such as blinking or salivating.
TABLE 7.1  LEARNING: AN OVERVIEW
For each of the three kinds of learning, classical conditioning, operant conditioning, and observational learning, the behaviors learned and the conditioning procedure are different.

                          KEY NAMES           HOW THE BEHAVIOR IS LEARNED      EXAMPLE
CLASSICAL CONDITIONING    Ivan Pavlov,        Repeated pairing of              Learning to fear the sound
                          J. B. Watson        two stimuli                      of the dentist's drill
OPERANT CONDITIONING      E. L. Thorndike,    Following a response with        Dog learning to sit in
                          B. F. Skinner       reinforcement or punishment      order to receive praise
OBSERVATIONAL LEARNING    Albert Bandura      Observing the behavior of        Becoming more violent by
                                              others and its consequences      watching violent videos

habituation: A decrease in response to a repeated stimulus.
classical conditioning: Learning that results from the pairing of an unconditioned and a conditioned stimulus.
The Russian physiologist Ivan Pavlov devoted his entire life to science. Pavlov became famous for the scientific study of digestion in dogs. His experiments showed that digestion started in the mouth and that saliva was an important part of the digestive process. This led him to the discovery of the salivary gland and the salivary reflex. Pavlov found that putting food powder on a dog's tongue would trigger the salivary reflex (Pavlov, 1897/1902). He won a Nobel Prize in 1904 for this work. It was another discovery, though, that made him famous in the field of psychology. In his laboratory, Pavlov began to notice that some of his dogs were starting to salivate before he put the food powder on their tongues. Some would salivate at the sight of the food powder or even the sight of the spoon used to deliver the powder. Some of the dogs even began to salivate when they saw Pavlov's assistant bringing in the food (Vulfson, 1898, as cited in Todes, 1997). When Pavlov investigated further, he found that the longer the dogs had been in the laboratory, the more likely they were to make these astonishing responses. New dogs would respond only to the food powder itself. Pavlov had thought of the salivary reflex as being like an electrical circuit. He thought that putting the food on the dog's tongue completed the circuit and caused the dog to salivate. Pavlov believed that the process was just like the way flipping a light switch completes a circuit and turns on the light. He was amazed, then, when the response occurred before the food powder arrived on the dog's tongue. Imagine how you would feel if you went to turn on the lights in your living room and noticed the lights coming on before you even touched the switch. Pavlov was astounded by this new development. He concluded that the dogs were learning to respond to stimuli other than the food powder. Like many good scientists, he was so fascinated by an unexplained event in the laboratory that he changed his whole area of study. He stopped working on digestion and began studying what we now call classical conditioning. In Pavlov's honor, this kind of conditioning is sometimes called Pavlovian conditioning.
Pavlov's experiments showed that his dogs could be conditioned to salivate to a number of other stimuli. He tried the sound of a metronome and a number of other stimuli, including a small bell. All he had to do was present these other stimuli repeatedly along with the food powder. Repeatedly presenting food powder and ringing a bell at the same time, for example, will eventually result in a dog salivating to the sound of the bell. This is the basic form for all classical conditioning. One stimulus (in this case food) already produces the response (salivation). This stimulus is paired, or presented together with, a neutral stimulus (the bell). Before conditioning, the bell does not produce the response. Over time, as the two stimuli are presented together, the bell comes to produce the response (salivation). We could say that the dog has learned to salivate to the sound of the bell. Instead, however, we usually use a more scientific term. We say that the dog has been conditioned to salivate to the sound of the bell. Another scientific term we use in describing classical conditioning is the word elicit. We say that before conditioning, the food elicits salivation. After conditioning, we say that the bell elicits salivation. When responses are elicited, they are automatic and involuntary.

Notice that there are four elements here: two stimuli and two responses. The bell and the food powder are the two stimuli. Salivation to the food and salivation to the bell are the two responses (see Figure 7.1). We have scientific terms for these four elements of classical conditioning, but the terms are sensible and easy to learn.

Because the food powder produces the response before we have done any conditioning, we could say that it works on an unconditioned animal. For this reason, we call it the unconditioned stimulus (US), and the salivation it elicits is the unconditioned response (UR). After conditioning, the bell is called the conditioned stimulus (CS), and salivation to the bell is the conditioned response (CR).
FIGURE 7.1  THE PROCESS OF CLASSICAL CONDITIONING

unconditioned stimulus (US): A stimulus that elicits a response before any conditioning has occurred.
conditioned stimulus (CS): A stimulus that elicits a response after being paired with an unconditioned stimulus.
unconditioned response (UR): A response elicited by an unconditioned stimulus.
conditioned response (CR): A response elicited by a conditioned stimulus.

FIGURE 7.2  WATSON'S CONDITIONING EXPERIMENT
In Watson's experiment, the conditioned stimulus (the rat) came to elicit the conditioned response (fear).
This first phase of conditioning is called acquisition because pairing the unconditioned stimulus and conditioned stimulus causes the acquisition of the response to the conditioned stimulus.
Shortly after Watson's experiment, Little Albert's mother removed him from the hospital where Watson was doing his experiments (Harris, 1979). Was he afraid of rats and other white furry animals for the rest of his life? We don't know. The experiment was traumatic for Little Albert and may have had a lasting effect. Because of this, it would be unethical to repeat Watson's experiment, so we can only speculate on the long-term effects of this particular example of classical conditioning.
Watson was thrilled by the results of his experiment with Little Albert. Based on this and other experiments he performed, he concluded that classical conditioning was the basis for all human learning. He believed that he could condition any behavior and that he could completely shape a person's life using only classical conditioning. We now know that Watson was too optimistic about the power of classical conditioning. Many human behaviors are learned in other ways. Still, classical conditioning does play an important role in conditioning reflexes and emotional behaviors. It gives a scientific explanation of why children become afraid of the dark and why many of us tense up at the sound of a dentist's drill. We'll see other examples as we discuss the details of classical conditioning.
Here is an example of classical conditioning that did not happen in a laboratory. A friend of mine, a psychologist, spent many years watching a famous
TV news anchor on the evening news every night while eating dinner. To this
day, when that particular news anchor appears on his television screen, he salivates. In this case, the news anchor is the conditioned stimulus. Salivation to
the television image of the news anchor is the conditioned response (J. C.
Megas, personal communication, May 3, 1998).
Let's look at one more example of the acquisition phase of classical conditioning. Suppose that you are tired of just reading about classical conditioning and, as a scientist, you decide that you actually want to try it. First, you need a subject. You don't have a dog, but you do have a little sister, and you decide that she will be your subject. Now you need a response to condition. You don't want to scare your little sister, and you think that salivation is too messy. You know that classical conditioning works well on reflexes and try to think of a reflex that you can condition. You decide on the blink reflex. You blow a puff of air into your little sister's eye, and every time you do so, she blinks. You have your unconditioned stimulus and unconditioned response. Now you need a neutral stimulus to serve as the conditioned stimulus. You remember that Pavlov used a bell and search your house for one. You can't find one, but you finally think of your doorbell. You drag your little sister into the doorway and make sure that she doesn't already blink to the doorbell. Now you present the doorbell and the puff of air together for a number of training trials. Soon, your little sister is blinking every time someone rings the doorbell. You have performed classical conditioning, and your scientific experiment is a success.
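The acquisition procedure just described can be sketched as a few lines of Python. The chapter gives no equations, so the model below is only an illustration: each pairing of the CS (doorbell) and US (air puff) moves the strength of the conditioned response a fixed fraction of the way toward a maximum. The learning rate (0.3) and asymptote (1.0) are arbitrary values chosen for the sketch.

```python
# Illustrative model of acquisition: the associative strength V of the CS
# grows toward the maximum supported by the US on each CS-US pairing.

def acquisition_trial(v, rate=0.3, asymptote=1.0):
    """One CS-US pairing: V moves a fraction of the way to the asymptote."""
    return v + rate * (asymptote - v)

v = 0.0  # the doorbell starts out neutral
strengths = []
for trial in range(10):
    v = acquisition_trial(v)
    strengths.append(round(v, 3))

print(strengths)  # rises quickly at first, then levels off
```

Notice that the curve climbs steeply on early trials and flattens as it nears the asymptote, the classic negatively accelerated acquisition curve.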
Classical Extinction
Q: Can classically conditioned responses be unlearned?
If Little Albert had not been removed from the hospital, Watson might have wanted to help him lose his fear of the white rat. You might begin to feel sorry for your little sister, who now blinks whenever the doorbell rings. Is there a scientific method that will undo classical conditioning? In classical conditioning, the response will get weaker and weaker if we present the CS over and over without the US. This process is called extinction (pronounced ex-STINK-shun). If Watson had repeatedly put Little Albert together with the white rat without making the loud noise, it is likely that Little Albert would have gradually lost his fear of the rat. If you ring the doorbell over and over without the puff of air, your little sister's blink response will grow weaker and weaker. Let's return for a moment to Pavlov's experiment. During acquisition, the US (food) was paired with the CS (bell). What would extinction look like in this example? In extinction, we present the CS alone. As we ring the bell (the CS) over and over, the response gets weaker and weaker and eventually disappears (see Figure 7.3). We now have a scientific definition of classical extinction: the presentation of the conditioned stimulus over and over without the unconditioned stimulus.
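The extinction procedure can be sketched the same way. Again, this is an illustrative model rather than anything from the chapter: with no US present, each CS-alone presentation shrinks the strength of the conditioned response by a fixed fraction (an arbitrary rate of 0.3).

```python
# Illustrative model of extinction: with no US, the associative strength V
# of the CS decays toward zero on each CS-alone presentation.

def extinction_trial(v, rate=0.3):
    """One CS-alone presentation: V loses a fixed fraction of its value."""
    return v - rate * v

v = 1.0  # strength at the end of acquisition
strengths = []
for trial in range(10):
    v = extinction_trial(v)
    strengths.append(round(v, 3))

print(strengths)  # weaker and weaker, approaching zero
```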
extinction: Eliminating a previously conditioned response.
Spontaneous Recovery
Q: Once the response fails to follow the CS, is it gone for good?
As you can see from Figure 7.3, a response will often reappear after a period of rest. This is called spontaneous recovery. It is normal in classical conditioning (and in operant conditioning as well). It is wise to remember that spontaneous recovery is a normal part of the extinction process. Another important thing to note is that during the extinction process, the response will still be occurring quite often. If you are trying to eliminate a habitual behavior in a child, a pet, or even yourself, it is normal for the response to occur many times and to reappear occasionally in the future. This happens in both classical and operant conditioning. It doesn't mean that the subject is trying to be difficult or uncooperative. It just means that they are following the normal scientific process of extinction. It may help you to be patient in such situations if you remember how the process works.
Generalization
As a scientist, you might wonder what would have happened if Pavlov's bell had broken and he couldn't find another one exactly like it. Would the dogs salivate to another bell that was very similar to the one used during acquisition?
spontaneous recovery: The reappearance of a conditioned response following extinction.
FIGURE 7.3  ACQUISITION, EXTINCTION, AND SPONTANEOUS RECOVERY

generalization: The tendency to respond to another stimulus that is similar to the training stimulus.
As you can probably guess, they would. This is called generalization, and it applies to any classically conditioned response. Animals trained to respond to a particular stimulus will respond to any other stimulus that is similar to it. The more similar the stimulus, the stronger the response will be. A bee once stung my niece on the sole of her foot. The bee was the CS, and her fear of bees is the conditioned response. She's not just afraid of that particular bee, though. She is afraid of all bees. The more similar they are to the bee that stung her, the more afraid she is.
Imagine that we conditioned a dog to salivate to the sound of middle C on the piano. Through the process of generalization, the dog would also salivate to the nearby keys on the keyboard. As we play keys farther and farther away from middle C, we would expect the response to grow weaker and weaker. Little Albert also showed generalization after learning to be afraid of the white rat. Five days after his training, Little Albert reacted with fear to other furry stimuli, including a white rabbit, a dog, and even a white Santa Claus mask (Harris, 1979).
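The piano-key example can be sketched numerically. A Gaussian falloff is one common way to draw a generalization gradient, but the shape and the width used here are illustrative assumptions, not data from the text; keys are numbered in semitones, with 60 standing for middle C (the MIDI convention).

```python
import math

def response_strength(key, trained_key=60, width=4.0):
    """Generalization gradient: response strength falls off with distance
    (in semitones) from the training stimulus, modeled as a Gaussian."""
    return math.exp(-((key - trained_key) ** 2) / (2 * width ** 2))

# Keys near middle C elicit strong salivation; distant keys, little.
for key in (60, 62, 65, 72):
    print(key, round(response_strength(key), 3))
```

The gradient is symmetric: a key two semitones below middle C elicits the same strength as one two semitones above.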
Discrimination
discrimination: The tendency to respond differently to two or more stimuli.
discrimination training: Training an animal to respond differently to two different stimuli.
Discrimination training is especially useful with animals and babies, who can't use language to tell us whether they can tell the difference between similar stimuli.
Locking It In
1. Eliminating a classical response by presenting the CS without the US is called __________________.
2. When an animal responds to a new stimulus similar to the training stimulus, it's called ___________________.
3. If an animal responds differently to two different stimuli, it's called ________________.
4. When a response comes back after extinction, it's called ______________________.
5. Critical Thinking: Why does spontaneous recovery occur?
Answers: 1) extinction, 2) generalization, 3) discrimination, 4) spontaneous recovery
conditioned emotional response: An emotional response learned through classical conditioning.
When people who are afraid of snakes do encounter one, they usually run away from it. This reduces their fear and makes them feel better. Feeling better rewards them for being afraid and for running away. This is why phobias, irrational fears of objects or situations, can be so persistent.
Q: Could we use extinction to get rid of these fears?
systematic desensitization: A gradual process of extinguishing a fear response through classical extinction.
In Chapter 1, we said that J. B. Watson lost his job as a psychologist when he had an affair with his assistant, Rosalie Rayner. He worked at a number of jobs. For a while, he did market research by going from door to door asking people what kind of rubber boots they wore. Eventually, though, he got a job with an advertising agency. Over time, he became a successful advertising executive. Watson used the scientific principles of classical conditioning to sell products (Buckley, 1982, 1989). In fact, he is probably the father of modern advertising, since these techniques are used constantly today, especially on television. Many television advertisements are designed to make us feel good. During the advertisement, we are shown the product or its logo, and the advertisers hope that, if we see that advertisement often enough, we will come to associate that positive emotional state with the product itself. Seeing the product in the real world then makes us feel good. As a result, we want the product near us and often end up buying it.
Another approach used in advertising is to make us associate negative emotions with some condition. The advertisements then convince us that the product will protect us from that condition. As a result of advertising, people are afraid of appearing in public with dandruff, unruly hair, wrinkled skin, the normal smell of our bodies, and stained teeth, among other things. We spend a great deal of money on products that promise to protect us from these conditions.
biological preparedness: The biological readiness to learn some responses more easily than others.
blocking: The ability of an existing conditioned stimulus to block the learning of a response to a new conditioned stimulus.
taste aversion: A learned tendency to avoid certain foods after eating them is followed by illness.
Conditioning works best when the stimulus and the response are related in some meaningful way. Simply presenting them together is not always enough. Robert Rescorla (1988) points out a number of problems with the simple traditional view of classical conditioning. The stimuli need not be presented together, and presenting stimuli together doesn't always result in conditioning. The response to the conditioned stimulus is seldom the same as the response to the unconditioned stimulus. Clearly, the process is much more complicated than Pavlov and Watson thought. As scientists, though, we should be careful about assuming that the animal is thinking about the conditioning situation and deciding to become conditioned. In order to draw such a conclusion, we would need some clear empirical evidence of this thinking. The fact that the process is complicated doesn't necessarily mean that we have to give up the idea that classical conditioning is a mechanical process that is built in by nature.
Locking It In: Applications
1. When an animal gets sick after eating a certain food and avoids that food in the future, it's called _________________ _____________.
2. Leon Kamin is known for his research on ___________________.
3. Martin Seligman believes that fears of heights, rats, and spiders are common because of ___________________ _________________.
4. Critical Thinking: How did conditioned taste aversions help our ancestors survive?
Answers: 1) taste aversion, 2) blocking, 3) biological preparedness
Operant Conditioning
Q: Can classical conditioning be used on any behavior?
Suppose that you want to use classical conditioning to teach your dog to sit. First, you need an unconditioned stimulus that will make the dog sit. You might try pushing down on the back end of the dog. The trouble with this is that the dog isn't really sitting. Many dogs will respond to this stimulus by pushing back, which is the opposite of sitting. There are many other stimuli you might try, such as backing the dog into a corner, holding food over the dog's head, or pulling back on the dog's lead. Remember, though, that an unconditioned stimulus should make the response occur every time, without fail. None of these stimuli will really do the job. The reason that it is so hard to come up with an unconditioned stimulus for sitting is that there isn't one. Classical conditioning works well with involuntary behavior like reflexes and emotional behaviors. Sitting isn't a reflex, though, and it's certainly not an emotional behavior. This means that we can forget using classical conditioning to teach a dog to sit. We must resort to another kind of conditioning, called operant conditioning. Operant behaviors operate on the environment. Unlike classically conditioned behaviors, they are not elicited by specific stimuli.

A friend of mine was visiting some neighbors who were completely deaf. During the visit, their five-year-old (who could hear fine) was crawling around under a table. The child forgot that he was under a table and stood up. My friend heard the loud thump of the child's head hitting the table and waited for the sound of crying. To his surprise, the child didn't make a sound. He crawled out from under the table and moved to where his parents could see him. Once his parents were looking at him, he screwed up his face and began holding his head in pain. He looked for all the world like a child sobbing hysterically, but he didn't make a sound. Apparently, he had learned that the sound part wasn't important. The basic principle of operant conditioning is that some behaviors can be modified by their consequences. Behaviors that have positive consequences tend to occur more often. Those with negative consequences tend to occur less often. The child in the story had learned that looking hurt got him attention and comfort from his parents. He didn't cry out loud because, for him, the sound of crying had no consequences. Operant behaviors develop and change because of what happens after they occur. They depend on their consequences. In contrast, classical conditioning involves putting together two stimuli before the behavior occurs. To use operant conditioning to teach your dog to sit, you might praise the dog every time it sits on command. Rather than pairing two stimuli as we did in classical conditioning, you are making sure that there are positive consequences for sitting on command. The praise comes after the response. We'll discuss operant conditioning in more detail, but first, let's look at its history.
Q: Did Thorndike believe that the cats figured out how to escape from the box?
law of effect: Thorndike's term for the principle that responses that are followed by desirable consequences occur more often, and responses that are not followed by desirable consequences occur less often.
operant conditioning: A process of conditioning in which behavior change is caused by the consequences of the behavior.

B. F. Skinner (1904–1990), noted behaviorist and developer of operant conditioning.
No, Thorndike was a behaviorist. He argued that if the cats understood how to get out of the box, the time it took them to escape would be high at first and then drop suddenly when they figured out how to escape. The typical learning curve doesn't show this kind of pattern at all. Thorndike believed that when the cat made the correct response, it was rewarded with escape and food. This reward, according to Thorndike, simply made the correct response a little more likely. In other words, it raised the probability of the correct response. Over time, the consequences of the response made it more and more likely until, finally, it occurred very rapidly. Thorndike believed that, over time, positive consequences "stamped in" the correct response and negative consequences "stamped out" all the possible incorrect responses. He called this the law of effect. Thorndike didn't believe that the cats needed to think or understand to escape from the box. Their correct responses just became more likely over time as a result of the law of effect. Thorndike called this instrumental conditioning because the cats' correct responses were instrumental in getting them out of the box. B. F. Skinner later called this same kind of learning operant conditioning (Skinner, 1938). The two terms mean the same thing, but most modern psychologists use the term operant conditioning.
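The shape of the learning curve Thorndike argued from can be illustrated with a toy simulation (my sketch, not Thorndike's data): the cat has ten possible responses, only one of which opens the box, and each escape slightly raises the weight of that correct response. The number of responses, the increment, and the random seed are all arbitrary choices.

```python
import random

# Toy model of the law of effect: rewarded responses are "stamped in"
# gradually, so escape times decline trial by trial rather than dropping
# suddenly, as an "insight" account would predict.

rng = random.Random(42)
weights = [1.0] * 10  # ten responses, all equally likely at first

def escape_attempts():
    """Sample responses by weight until the correct one (response 0) occurs."""
    attempts = 0
    while True:
        attempts += 1
        if rng.choices(range(10), weights=weights)[0] == 0:
            weights[0] += 0.5  # escape and food strengthen the response
            return attempts

curve = [escape_attempts() for _ in range(20)]
print(curve)  # a noisy but gradually improving learning curve
```

Because the correct response's weight grows a little after every escape, improvement is incremental, which is exactly the gradual curve Thorndike observed.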
B. F. Skinner
Noted behaviorist B. F. Skinner spent most of his life studying learning. Skinner wanted to find scientific principles that would explain the behavior of humans and other animals (Skinner, 1956, 1967). Skinner agreed with Watson and Pavlov that some behaviors were elicited in response to certain stimuli. He also agreed that these behaviors could be conditioned using classical conditioning. Skinner disagreed, however, with the idea that this kind of conditioning explained all human learning. Skinner began studying the large number of behaviors that animals emitted on their own, like the ones Thorndike's cats used to escape from the puzzle box. Skinner did not call these behaviors voluntary, because he believed that the behaviors were a response to the animal's environment. He did not believe that the behaviors were intentional or the result of a conscious choice on the part of the animal. In looking for scientific principles that could explain how these behaviors were learned, Skinner developed his model of operant conditioning. In operant conditioning, behavior develops and changes because of its consequences. By consequences, Skinner meant reinforcement and punishment. A reinforcer is something like a reward. We'll discuss the difference between a reinforcer and a reward and define both reinforcement and punishment later in this chapter. Although Skinner's theory was based on both reinforcement and punishment, his research focused almost entirely on reinforcement. Skinner considered punishment a relatively ineffective technique with many negative side effects. He believed that in practical applications of operant conditioning, punishment should be avoided whenever possible (Skinner, 1953).
Skinner developed an experimental chamber to study operant conditioning in rats (see Figure 7.4). We now call this kind of experimental chamber a Skinner box in his honor. Later, he developed a similar chamber to study operant conditioning in pigeons. In the original Skinner box, every time the rat pressed down on a brass lever, a food pellet dropped down a small tube into a food cup. Skinner soon found that reinforcing the rat with food for pressing the bar increased the rate of bar pressing. As a firm behaviorist, Skinner avoided making any statements about the rat's mental states. He based his explanation of the behavior completely on observable stimuli and responses. If you had asked Skinner, "Is the rat pressing the bar because it wants food?" he might have replied, "We can't know what the rat wants. It is pressing the bar because it has been reinforced for pressing the bar." Notice that both the bar pressing and the reinforcement are easily observed.
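Skinner's basic finding, that reinforcing bar presses raises the rate of pressing, can be caricatured in a few lines. This is a deliberately crude model with made-up numbers (an initial press probability of 0.05, raised by 0.02 per reinforced press), not a description of real rat behavior.

```python
import random

def session(reinforce, intervals=500, seed=7):
    """Count bar presses over a session; if reinforce is True, each press
    is followed by a food pellet that nudges the press probability upward."""
    rng = random.Random(seed)
    p = 0.05  # initial chance of a bar press in any given interval
    presses = 0
    for _ in range(intervals):
        if rng.random() < p:
            presses += 1
            if reinforce:
                p = min(1.0, p + 0.02)  # the pellet strengthens the response
    return presses

reinforced = session(reinforce=True)
unreinforced = session(reinforce=False)
print(reinforced, unreinforced)  # far more presses when food follows the press
```

Because both sessions see the same random draws, the only difference is the consequence that follows each press, which is the point of the comparison.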
FIGURE 7.4  A TYPICAL SKINNER BOX

B. F. Skinner's daughter Deborah in the air crib, attended by her mom.
reinforcer: A stimulus that increases the frequency of the behavior it follows.
punishment: A stimulus that decreases the frequency of the behavior it follows.
If the behavior that a stimulus follows starts to occur more and more often, the stimulus is a reinforcer. Suppose that every time you say the word "toast" I give you a dollar. It is likely that you would develop a rapid interest in toast as a topic of conversation. You would probably try to work the word into every sentence. You might even end up just repeating the word "toast" over and over while I handed you dollars. Giving you the dollar has increased the frequency of the behavior it follows. Therefore, it's a reinforcer. Now suppose that every time you say the word "banana" I rap your knuckles with a ruler. It seems unlikely, but if you start saying the word "banana" more and more often, we would have to conclude that hitting you on the knuckles is a reinforcer. Few people would consider it a reward. Rewards are usually defined by the person who gives them. A teacher might say, "Lucinda, you got a perfect score on the spelling test; as a reward, I'm going to let you wear this gold star on your forehead for the rest of the day." The star is a reward because the teacher says it is. Is it a reinforcer? If Lucinda starts getting more perfect scores on spelling tests, it's a reinforcer. If Lucinda never gets another perfect test score, however, it's not a reinforcer.

We can change our definition of reinforcement into the definition of punishment by changing a single word. Punishment is defined as anything that decreases the frequency of the behavior it follows. Notice that punishment, like reinforcement, is defined strictly in terms of its effect on behavior. Parents sometimes say, "I don't know what's the matter with my kids; the more I punish them, the worse they behave." Knowing the scientific definition of punishment, we can be sure that the parent is actually using reinforcement, not punishment. Suppose that a teacher decides to punish a child for standing up during class. Every time the child gets up, the teacher yells, "Get back in your seat!" The teacher hopes that this punishment will cut down the frequency of the child getting up. What if the result is that the child gets up more and more often? We can see that yelling at this child is not a punishment but a reinforcer. The teacher might say, "I don't know what's the matter with this kid; the more I yell, the worse things get. This child must like punishment." What the child is getting, of course, is attention. Attention is a powerful reinforcer for both children and adults. A better response would be, "Teacher, you've just discovered an effective reinforcer. Why are you using it to reinforce a behavior you don't like?" It is a good idea to be careful which behaviors you pay attention to. Often, paying attention to a behavior will make it happen more frequently. Reinforcement and punishment can be powerful scientific tools in changing behavior.
Comparing Classical and Operant Conditioning We will discuss the principles and techniques of operant conditioning in more detail later in the chapter. First, though, lets look more closely at the differences between classical
and operant conditioning. Classical and operant conditioning share many
terms, such as stimulus, response, extinction, generalization, and discrimination. They differ, however, in the kinds of behaviors they work on. They also differ in the conditioning procedure itself (see Table 7.1 earlier in the chapter).
Classical conditioning works well with reflexes and emotional behaviors. Operant conditioning is used with most other behaviors. If you want to condition
an animal to blink, choke, salivate, or be afraid, classical conditioning is the
right method. To condition other behaviors that are not reflexes or emotional
behaviors, however, we must use operant conditioning. The conditioning procedure is also very different. In classical conditioning, we pair two stimuli, the
US and CS. Nature takes care of the rest. In operant conditioning, what happens after the response is most important. In operant conditioning, we follow
the response with reinforcement or punishment to make it occur more or less
often.
TABLE 7.2
TYPES OF REINFORCEMENT AND PUNISHMENT

                          STIMULUS IS REINFORCING     STIMULUS IS AVERSIVE
STIMULUS IS PRESENTED     Positive reinforcement      Positive punishment
                          (response increases)        (response decreases)
STIMULUS IS REMOVED       Negative punishment         Negative reinforcement
                          (response decreases)        (response increases)

Remember that reinforcement always increases the frequency of the response it follows. Punishment always decreases the frequency of the response it follows. Remember also that if we present a reinforcing or punishing stimulus, we call it positive reinforcement or punishment. If we take something away, we call it negative reinforcement or punishment.
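The two questions in the table (is the stimulus reinforcing or aversive, and is it presented or removed?) fully determine which of the four procedures we are looking at. The sketch below is illustrative only; the function name and wording are our own, not from the text.

```python
# Illustrative sketch of the Table 7.2 logic: classify an operant consequence
# from the two questions the table asks.

def classify(stimulus_is_reinforcing: bool, stimulus_is_presented: bool) -> str:
    """Return the name of the procedure and its effect on the response."""
    if stimulus_is_presented:
        if stimulus_is_reinforcing:
            return "positive reinforcement (response increases)"
        return "positive punishment (response decreases)"
    # stimulus is removed
    if stimulus_is_reinforcing:
        return "negative punishment (response decreases)"
    return "negative reinforcement (response increases)"

print(classify(True, True))    # positive reinforcement (response increases)
print(classify(False, False))  # negative reinforcement (response increases)
```

Note that "positive" and "negative" here mean only presented versus removed, never good versus bad; the code makes that explicit.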
positive reinforcement: increasing the frequency of a response by presenting a reinforcing stimulus after the response occurs.
negative reinforcement: increasing the frequency of a response by removing an aversive stimulus after the response occurs.
rate of bar pressing. Psychology professors are often frustrated by the fact that
students always seem to think negative reinforcement is the same as punishment. If you think about this for a moment, you can see that it is impossible.
Reinforcement increases the probability of a behavior. Punishment decreases
the probability of a behavior. That means that it is impossible for reinforcement (positive or negative) to be punishment.
Q: Is there such a thing as positive and negative punishment?
positive punishment: decreasing the frequency of a response by presenting an aversive stimulus after the response occurs.
negative punishment: decreasing the frequency of a response by removing a reinforcing stimulus after the response occurs.
Shaping Behavior
Q: How can you use reinforcement on a behavior that never occurs?
Suppose that you'd like a child to do a good job of cleaning his or her room. You'd be more than happy to reinforce the behavior but, sadly, it never occurs.
How can you use operant conditioning to condition this behavior? What if the
behavior you want to condition does happen, but it happens so rarely that the
training might take a very long time? The answer to both of these questions
lies in a scientific technique called shaping (Skinner, 1937). Shaping is defined as reinforcing successive approximations to the desired behavior. In shaping, we reinforce any behavior that is a little more like the behavior we want to see. Let's look at a simple example. When I let my dog Penny into the house, she would come racing in at full speed, running over my feet and bumping into the furniture. Yelling at her only reinforced this behavior with attention and made it worse. I couldn't reinforce her for coming through the door slowly because it never happened. I finally realized that shaping was the best way to change this behavior. I began saying "good dog" whenever she came in just a little more slowly than usual. Over the next few weeks, she began coming through the door more and more slowly. After three weeks, she was coming through the door in slow motion. I had reinforced successive approximations to entering the house slowly.
Shaping is a powerful and underused scientific technique. It can be used
to change both desirable and undesirable behaviors in others and in ourselves.
It has several important advantages. As we mentioned above, it allows us to reinforce behaviors that otherwise would never happen. It also works much
faster than simply reinforcing the desired behavior. Shaping also can be a powerful alternative to punishment.
To make shaping effective, it is important to remember two practical facts.
First, the steps must be small. Shaping works by breaking up the progress toward the desired behavior into small steps. If the steps are too big, the shaping
will not be successful. Say, for example, that you would like to exercise more.
Currently you dont exercise at all. Youd like to exercise for at least 20 minutes
per day. You could start by giving yourself a reinforcer every time you exercised
for 20 minutes or more. It is unlikely, however, that you will be successful in
going from no exercise to 20 minutes in one jump. You are likely to be more
successful if at first you reinforce yourself for 1 minute of exercise. Once you
are exercising regularly, increase this to 2 minutes, then 3 minutes, and so on.
The best way to tell if the shaping steps are small enough is to see if reinforcers
are being delivered fairly often. If you are not delivering reinforcers regularly,
the steps are probably too big. If you haven't delivered a reinforcer for a significant amount of time, if the person or animal you are training is showing signs of frustration, or if the behavior is not occurring as often as it was before, you should probably make the steps smaller.
The second thing to remember with shaping is that at first, you will be reinforcing relatively poor performance. Sometimes this is hard to do, especially for
parents, bosses, and people trying to change their own behavior. If you want your
children to learn to do a good job of cleaning their room, you will have more success if you first reinforce any attempt to clean the room at all. In fact, the first step
in shaping room cleaning should probably be just going into the room. Gradually,
you can make the reinforcement depend on better and better performance.
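The logic of shaping can be sketched in a few lines. The loop below is a hypothetical illustration (the function, step size, and numbers are invented; the "minutes of exercise" follow the example above): reinforce any session that meets the current criterion, then raise the criterion by one small step.

```python
# Hypothetical sketch of shaping: reinforce successive approximations by
# raising the criterion a small step each time it is met.
def shape(sessions, start=1, step=1, goal=20):
    """Return the final criterion and the sessions that earned a reinforcer."""
    criterion = start
    reinforced = []
    for minutes in sessions:
        if minutes >= criterion:
            reinforced.append(minutes)               # deliver the reinforcer
            criterion = min(criterion + step, goal)  # next small step
        # if reinforcers stop coming, the steps were too big: shrink `step`
    return criterion, reinforced

final, hits = shape([1, 1, 2, 3, 3, 4, 5])  # minutes exercised each day
```

Notice that early, relatively poor performance (one minute of exercise) is reinforced, and the criterion climbs only as the behavior does.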
Q: Doesn't this teach the child that doing a poor job is acceptable?
Acceptable performance almost never comes right away. If you are taking tennis lessons, a good instructor won't criticize your poor early performance. Instead, he or she will reinforce you for the little improvements you make on the way to becoming a good tennis player. The instructor is not teaching you that poor performance is acceptable. Instead, he or she is reinforcing you for improving your performance. The fact that the final behavior has been described in detail won't usually result in perfect performance right away. Explaining to a child how we want the room cleaned probably won't mean that the child will clean it that way the first time.
shaping: reinforcing successive approximations to the desired behavior.
Operant Extinction
Q: How can we use operant conditioning to get rid of unwanted behaviors?
In classical extinction, we stop presenting the unconditioned stimulus. In operant extinction, we no longer present the reinforcer. Without the reinforcer
to prop it up, the behavior gradually grows less and less frequent. Eventually,
it will disappear. Just as in classical conditioning, we can expect spontaneous
recovery. The response will still reappear after a rest period. In our traditional
example, the rat presses a bar to get food. To extinguish this behavior, we need
to make sure that we no longer deliver food after a bar press. Over time, the
rat will press the bar less and less and will eventually quit. Because of spontaneous recovery, the rat will still press the bar a few times after a rest period.
In operant extinction, two other things can be expected that don't occur in classical extinction. First, the rate of the behavior often goes up before it goes down. When we first stop the food delivery, the rat will actually increase its rate of bar pressing. The rate will go down, but we can count on it going up first. The second thing we can expect in operant extinction is emotional behavior. Rats will shake, defecate, and sometimes bite the bar. This is the rat's version of a temper tantrum, and it must be expected in operant extinction. This behavior may seem silly, but humans behave exactly the same way.
Imagine that every day you put money in a candy machine, press a button, and receive a candy reinforcer. One day, when you press the button, no candy comes out. The machine has put you on extinction. What's the first thing you do? If you said, "kick the machine," you haven't been paying attention. Usually, the first thing people do in this situation is press the button several times. Notice that your usual rate for this behavior is once a day. In extinction, the rate of response goes up at first. If the machine continues to keep you on this extinction schedule, your response will eventually stop altogether. The second thing you will probably do is kick the machine, curse, or mutter something insulting to the machine. This is emotional behavior, and it is also a standard response to extinction in operant conditioning. You may be thinking that pushing the button is just a reasonable way of checking whether the machine is working. If so, imagine this example. Have you ever gone up to the door of a store just after closing and found it locked? Even after seeing the closed sign and the bolt locking the door, have you ever tried opening it? I have. In fact, I sometimes pull several times on the door. Then I grumble some and leave. This is the standard pattern of operant extinction: first the rate goes up, then the emotional behavior appears. If you plan to use operant extinction to eliminate some undesirable response, be sure that you are prepared to hold out through these two effects. People sometimes give up too soon. When the response rate goes up, they think that what they are doing is making things worse. They give in and deliver the reinforcer. When this happens, they end up actually reinforcing the behavior they are trying to eliminate. Other times, they may give up too soon because they are surprised by the emotional behavior. This reinforces the emotional behavior and can teach your subject to throw tantrums to get what he or she wants.
Q: What if I want to eliminate a behavior but don't know what the reinforcer is?
Since scientific research tells us that reinforcement works best when the reinforcer comes immediately after the response, the place to look for the reinforcer is right after the behavior. If your child interrupts you while you are on
the phone, whatever you do in response to the interruption is probably a reinforcer. If you get into lots of arguments with a particular friend, whatever you
do when the friend disagrees with you is probably a reinforcer.
1. Reinforcing successive approximations to some desired behavior is
called ________________.
2. When a behavior is no longer followed by reinforcement, it occurs less
often. This is called __________________.
3. Presenting a stimulus that increases the probability of the behavior it
follows is called ___________________.
4. Presenting a stimulus that decreases the probability of the behavior it
follows is called ___________________.
5. Critical Thinking: Why does shaping work faster than simple positive reinforcement?
Answers: 1) shaping, 2) extinction, 3) reinforcement, 4) punishment
Kinds of Reinforcers
Q: What kinds of reinforcers are there?
When Thorndike first put cats in his puzzle box, he put food outside the box
as a reinforcer. He soon learned, however, that the food was not necessary.
The cats would learn the response just to get out of the box. Apparently, escaping from the box was also a reinforcer. Reinforcers that don't need to be learned, such as food, water, escape from pain, and the chance to engage in sex, are called primary reinforcers. These reinforcers are obviously important for the survival of the species and are built in by nature. Primary reinforcers are often limited by the condition of the animal. Food, for example,
will not work well as a reinforcer unless the animal is hungry. Psychologists
agree that the biological necessities listed above are all primary reinforcers.
There may be other primary reinforcers as well. Scientific researchers in the
1950s identified a group of reinforcers they called sensory reinforcers.
These reinforcers involve the animals senses and, like primary reinforcers,
are probably unlearned. Monkeys, for example, can be reinforced with the
opportunity to look out a window (Butler, 1954). Humans will press a button
in order to see a display of lights in a dark room (Jones, Wilkinson, & Braden,
1961). The popularity of many computer games probably depends on sensory
reinforcers. This is especially true of games that involve a lot of exploration.
Humans and animals can be reinforced by the opportunity to explore their
environment, physical contact, the opportunity to hold something in their
hands, and a number of other sensory reinforcers. These sensory reinforcers
have obvious survival value for the species. The desire to explore and understand the environment and the drive to have contact with one's own species are valuable traits for any animal.
A friend of mine once served as a playground monitor for an elementary
school. He developed a plan to get the children to stop pushing and shoving
on the playground. He drew a chalk square in one corner of the playground
primary reinforcer: an innate (unlearned) reinforcer such as food or water.
sensory reinforcer: a reinforcer that has the stimulation of the senses as its only reinforcing property.
and declared it the "jail." He announced to the children that anyone pushing or shoving on the playground would have to spend fifteen minutes in jail.
Within ten minutes, over half the children were in jail. The ones still outside
were madly pushing and shoving each other and begging to be arrested. My
friend was a clever enough scientist to realize that he had discovered a powerful
social reinforcer. He kicked everyone out of the jail and announced that the
children who behaved the best and who did not push or shove would get to go
to jail. For the rest of the recess, the children behaved perfectly. Humans and
some animals can be reinforced with praise, attention, and the chance to
spend time with members of their own species.
Social reinforcers can be very powerful. Humans are very social animals
and spend much of their time with others. Social reinforcement probably explains why many people are willing to spend more for a drink in a bar than the
same drink would cost them at home. It also explains why the same people may
leave the bar if there is no one there. One of the more severe punishments in
prisons around the world is solitary confinement. Some psychologists consider
social and sensory reinforcers to be learned. It seems likely, however, that they
are built in by nature. They have survival value for the species for one thing.
For another, it is easy to think of cases where these reinforcers work without
any apparent learning on the part of the animal. Praise, for example, is a powerful reinforcer for many dogs. Many dog owners do not bother to pair praise
with food because it already works for their dogs. Dogs, like humans, are very social animals, and responding to praise and attention is probably part of their biological programming. For Thorndike's cats, escaping from the puzzle box was an effective reinforcer. For cats (and for other animals as well), escaping from a confining environment may be a primary reinforcer.
Q: Are there reinforcers that are not built in by nature?
secondary reinforcer: a learned reinforcer that gets its power by being paired with a primary reinforcer.
generalized reinforcer: a learned reinforcer that is effective at any time because it has been paired with several different primary reinforcers.
Some reinforcers are definitely learned. These reinforcers are called secondary reinforcers. Secondary reinforcers have no value of their own. They get
their power by being associated with primary reinforcers. If a tone sounds
every time a rat gets a food pellet, the tone will soon become a secondary reinforcer. Secondary reinforcers will lose their effectiveness if they are not paired
occasionally with a primary reinforcer. Secondary reinforcers tend to be limited in effectiveness in the same way that primary reinforcers are. We said earlier that we can't use food to reinforce a rat that isn't hungry. Similarly, a tone
that has been associated with food will not be effective if the rat is full. Some
learned reinforcers, however, are good almost any time. This is because they
have been paired with a number of different primary reinforcers. We call these
all-purpose reinforcers generalized reinforcers. Money, in many societies, is
a powerful generalized reinforcer. In these societies, money can be an effective
reinforcer even if the person is not hungry or thirsty. If a person uses money to
buy food when they are hungry and something to drink when they are thirsty,
money will become a generalized reinforcer. In some institutions, generalized
reinforcers called tokens are created for use in changing behavior. Schools,
mental hospitals, and group homes often create a token economy using artificial reinforcers such as poker chips or points to reinforce desirable behaviors.
In a mental hospital, for example, patients can use tokens to buy things like
dessert, field trips, or access to the television. The tokens become generalized
reinforcers. This allows the staff to use them to reinforce behaviors like tooth
brushing and room cleaning. They can also use the tokens as part of a shaping
program to reinforce the patients for improving their behavior (Kazdin, 1977;
Pitts, 1976).
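The bookkeeping behind a token economy is simple enough to sketch in code. The example below is hypothetical; the behaviors, token amounts, and prices are invented for illustration, not taken from any actual program.

```python
# Hypothetical token-economy bookkeeping: tokens are earned for target
# behaviors and exchanged for backup reinforcers. All values are invented.
EARNINGS = {"tooth brushing": 2, "room cleaning": 5}   # tokens per behavior
PRICES = {"dessert": 4, "television time": 6}          # tokens per backup item

class TokenAccount:
    def __init__(self):
        self.tokens = 0

    def reinforce(self, behavior):
        # the token is delivered immediately after the behavior
        self.tokens += EARNINGS[behavior]

    def exchange(self, item):
        # tokens get their power from the primary and social reinforcers
        # they can buy; exchange fails if the balance is too low
        if self.tokens >= PRICES[item]:
            self.tokens -= PRICES[item]
            return True
        return False

acct = TokenAccount()
acct.reinforce("tooth brushing")   # balance: 2
acct.reinforce("room cleaning")    # balance: 7
acct.exchange("dessert")           # balance: 3
```

Because the token is delivered the instant the behavior occurs, it bridges the delay until the backup reinforcer is available, which is the main practical advantage of the system.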
Psychologist David Premack discovered another kind of reinforcer in
1962. Premack (pronounced PREE-mack) noticed that some behaviors occur
Learning
more often than others do. Being a good scientist, he wondered if the opportunity to engage in a high-frequency behavior could be used to reinforce a low-frequency behavior. His experiments showed that this was the case. We now
call this effect the Premack principle in his honor. Consider a child who
spends a lot of time playing video games and very little time reading. According to the Premack principle, we could increase this child's reading by using video
game playing as a reinforcer. On the other hand, with a child who reads a lot and
seldom plays video games, we could do the opposite. We could increase the rate
of game playing by reinforcing game playing with the opportunity to read.
One problem with the Premack principle as a scientific explanation of behavior is that sometimes low-frequency events can be very reinforcing. The opportunity to watch the Winter Olympics, for example, happens only once every
four years. This makes it a very low-frequency behavior. In spite of this, people
often give up high-frequency behaviors to watch the Olympics. Another way of
looking at the relative power of various reinforcers is called the disequilibrium
principle (Timberlake & Farmer-Dougan, 1991). According to the disequilibrium principle, stimuli can be reinforcing at some times and not at others. Whether
a stimulus is reinforcing or not depends on the state of the animal. The most
obvious example is food. If a person is hungry, food is a very powerful reinforcer. If a person is full, however, food may not be an effective reinforcer. The
disequilibrium principle suggests that all other reinforcing stimuli work this
way as well. If you have been alone for a long time, the opportunity to see other
people will be a powerful reinforcer. If you have been on a crowded bus for
many hours, however, the opportunity to be by yourself may be reinforcing.
The disequilibrium principle suggests that each person has a preferred range
for every stimulus. When the frequency of the stimulus gets outside of the preferred range, the person experiences disequilibrium. When this happens, the
stimulus becomes an effective reinforcer. The further outside the preferred
range the reinforcer is, the more effective it is as a reinforcer. One person, for
example, might prefer getting a haircut about once a week. Another person
might prefer getting a haircut once a month. The first person might experience
haircut disequilibrium after two weeks without a haircut. For this person, two
weeks after the last haircut, the opportunity to get a haircut would be an effective reinforcer. The second person would not experience disequilibrium until much later. For the second person, the opportunity to get a haircut two
weeks after the last one might not be reinforcing at all.
1. Reinforcers such as food and water that dont need to be learned are
called _________ reinforcers.
2. ___________ reinforcers get their power by being presented with primary
reinforcers.
3. ___________ are good at any time because they have been paired with
several primary reinforcers.
4. Schools and mental hospitals sometimes use artificial generalized reinforcers called __________.
5. When we reinforce a low-frequency behavior with the opportunity to engage in a high-frequency behavior, we are using the ____________ principle.
6. Critical Thinking: Why is money an ineffective reinforcer with young
children?
Answers: 1) primary, 2) secondary, 3) generalized, 4) tokens, 5) Premack
Premack principle: David Premack's idea that the opportunity to engage in a high-frequency behavior can be used to reinforce a low-frequency behavior.
superstitious behavior: behavior that is reinforced by accidental (non-contingent) reinforcement.
Superstitious Behavior
Q: What about accidental reinforcers? Are they effective?
My family sometimes gets together on Mother's Day at a local racetrack. One Mother's Day at the track, my wife and son both bet on a particular horse, a long shot, to win. They had never watched a race from the finish line and decided that this would be a good time to try it. They went down to the track and stood at the rail next to the finish line for the race. Their horse won. Both the race and the payoff were very exciting for them. That was a number of years ago but, to this day, they stand at the finish line for every race they bet on. How would a scientist explain this behavior? We all know that standing at the finish line didn't actually make their horse run any faster, but they were reinforced for it. As scientists, we know that behaviors that are reinforced tend to happen more often. This happens even if the behavior doesn't lead directly to the reinforcer. Behaviors that develop or persist because of accidental reinforcement are called superstitious behaviors. The scientific term for these accidental reinforcers is non-contingent reinforcement. Non-contingent reinforcers are not directly related to the behavior that comes before them; they just happen to follow it. Suppose we put a pigeon in a cage and give it access to food every thirty or forty seconds. The pigeon will get the food no matter what it does. The food is a non-contingent reinforcer. That means that it is not dependent on any behavior. Although the pigeon can receive the same number of reinforcers just by standing still, we can expect the pigeon to develop some superstitious behavior. It might begin pecking the wall, turning in circles, or flapping its wings. Whatever the pigeon happens to be doing when the food appears will be reinforced. The pigeon will perform this behavior more often, and it will be reinforced again the next time the food appears. Soon, the pigeon will be performing its act continuously even though it has nothing to do with the reinforcement (Skinner, 1948).
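Skinner's result can be imitated with a toy simulation. The model below is ours, not his procedure, and every detail (the behavior names, the 35-second timer, the strengths) is invented: food arrives on a timer no matter what the pigeon does, yet whichever behavior happens to precede the food is strengthened, so accidental pairings snowball.

```python
# Toy model of non-contingent reinforcement (invented for illustration).
import random

random.seed(1)  # make the run repeatable
behaviors = {"peck wall": 1.0, "turn circle": 1.0, "flap wings": 1.0}

def emit():
    """Pick a behavior in proportion to its current strength."""
    r = random.uniform(0, sum(behaviors.values()))
    for name, strength in behaviors.items():
        r -= strength
        if r <= 0:
            return name
    return name

for second in range(600):            # ten minutes of simulated time
    current = emit()
    if second % 35 == 0:             # food every 35 s, regardless of behavior
        behaviors[current] += 1.0    # accidental (non-contingent) reinforcement
```

After the run, the 18 accidental reinforcers have been concentrated on whichever behaviors happened to coincide with food, even though standing still would have earned exactly as many.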
In human behavior, superstitions are common. They are particularly common, however, among athletes and gamblers. Can you think of why this would
be? Superstitious behavior is most likely when there are many reinforcers and
their timing is somewhat unpredictable. A person working for a salary gets
paid every two weeks. For that person, the reinforcer is predictable and not
very frequent. For an athlete or gambler, on the other hand, the reinforcers are
much more frequent and much less predictable. A gambler never knows
whether the next hand of cards or roll of the dice will be successful. A baseball
pitcher never knows whether the next pitch will be a strike or result in a hit. We
can also guess which kinds of athletes will be most likely to engage in superstitious behaviors. The ones who receive the most separate reinforcers and the
ones for whom the reinforcers are the most unpredictable should be the most
superstitious. Think about marathon runners, tennis players, ski racers, and
baseball pitchers. What we know about the nature of reinforcers lets us make
a scientific prediction that tennis players and pitchers should show more superstitious behaviors than marathon runners or ski racers do. The next time
you see a tennis match or a baseball game, watch for superstitious behaviors.
Carefully observe the tennis players' behavior just before they serve, and observe the brief rituals that the baseball players go through at the plate or on the pitcher's mound. Much of what they do is probably superstitious behavior.
Schedules of Reinforcement
Continuous and Intermittent Reinforcement When I was about eight years
old, my mother bought a ceramic cookie jar in the shape of a pig. When it was
new, she often filled the pig with cookies. I soon learned to check inside. I was
often reinforced with a cookie. Over time, my mother made cookies less often
but I continued to check the jar. Every so often, I'd still be rewarded with a cookie. Now, many years later, I don't have a cookie jar in the house and seldom see one. Every once in a while, I see a cookie jar like my mother's at an antique shop or in someone's home. To this day, whenever this happens, I look inside. The jar is always empty. I don't think I've found a cookie in a cookie jar in over 20 years, but I still look. How can we give a scientific explanation of why this behavior has persisted for so long without being reinforced?
If behaviors stopped whenever they were not reinforced, many of our everyday behaviors would disappear. Operant behavior does not need to be reinforced
every time to continue. For operant conditioning to be effective, reinforcement
discriminative stimulus: a stimulus that serves as a signal that a response will be followed by a reinforcer.
stimulus control: a response is said to be under stimulus control when a particular stimulus controls the occurrence or form of the response.
continuous reinforcement: reinforcing a response every time it occurs.
intermittent reinforcement: reinforcing a response only some of the times it occurs.
should be delivered consistently at first. It should follow every time the behavior
occurs. Later, though, the reinforcer can be delivered less and less often without
the behavior going through extinction. When a response is reinforced every time
it happens, the animal is receiving continuous reinforcement. If the reinforcer
only follows the response some of the time, the animal is receiving intermittent
reinforcement (Skinner, 1933). B. F. Skinner, in an early experiment, had a group
of rats pressing a bar to receive food pellets. In those days, the food pellets had to
be made by hand in a press. One Friday, Skinner noticed that he was running
short of pellets. There were probably not enough left to last through the weekend.
He wasn't in the mood to make more food pellets, so he tried to think of a way to make the ones he had last through the weekend. He hit on the idea of rigging the food delivery machine so that the rats had to press the bar twice to get one food pellet. He wasn't sure what would happen but, as a scientist, he thought that it
would be interesting to find out. When he returned on Monday, he found that the
rats were still pressing the lever for food. He also noticed that their rate of bar
pressing had gone up. He started experimenting with different ways to present the
food and discovered what he called schedules of reinforcement (Skinner, 1938).
Ratio Schedules The original Skinner box had a wheel with a series of holes
around the edge. The wheel sat above the tube that delivered a food pellet into
a cup where the rat could eat it. Every time the rat pressed the bar, the wheel
turned so that the next hole lined up with the tube and allowed one food pellet
to drop into the cup. To make the rats press the bar twice for one food pellet,
Skinner just put tape over every other hole on the wheel. Once he discovered
that the rats continued to work under these conditions, he tried putting tape
over more and more holes of the wheel. This kind of schedule of reinforcement
is called a ratio schedule. In a ratio schedule there is always some ratio between
the number of responses and the number of reinforcers. The more responses the
animal makes, the more reinforcers it gets. On a fixed-ratio schedule, the rat
must make exactly the same number of responses for each reinforcer. During
Skinners original intermittent schedule, the rat was reinforced for every other
response. This is a fixed ratio of two responses per reinforcer (see Figure 7.5).
Imagine that we put tape over many of the holes in the food-delivery wheel of
a Skinner box, but we don't space the untaped holes evenly around the wheel. The rat still has to make a certain number of responses to get a reinforcer. Now, however, the number of responses varies. This is called a variable-ratio schedule. A slot machine reinforces humans using a variable-ratio schedule. The slot player has to make a certain number of responses to get the next reinforcer. The number, however, is not fixed; it varies.
Interval Schedules Suppose that each time the rat gets a reinforcer, we shut off
the food-delivery system for one minute. Now the rat can only get one reinforcer
per minute no matter how many responses it makes. This is not a ratio schedule
because there is no ratio between responses and reinforcers. Now reinforcement
depends on the passage of time. Because the pattern of reinforcement is based on
a time interval, this kind of schedule is called an interval schedule. If the time-out
is always the same length, we call this a fixed-interval schedule. If it varies, we
call it a variable-interval schedule. Interval schedules tend to produce slower
rates of responding than ratio schedules. This makes sense since, with a ratio
schedule, the more responses the animal makes, the more reinforcers it gets. With
an interval schedule, the animal can only get so many reinforcers in a given time
period no matter how many responses it makes. On a fixed-interval schedule, animals tend to make more responses just before it is time for the reinforcer to be
available. For example, most of us check the mail more and more often as the
usual delivery time approaches. On a variable-interval schedule, animals tend to
FIGURE 7.5 Cumulative records for fixed-ratio (FR), variable-ratio (VR), fixed-interval (FI), and variable-interval (VI) schedules of reinforcement.
maintain a slow but steady rate of response. This is what most of us do when redialing a busy phone number. We will get through after a certain amount of time
but the time interval is unpredictable (see Figure 7.5). Generally, ratio schedules
produce higher rates of response than interval schedules. This is not surprising
since, on a ratio schedule, the faster the animal responds, the more reinforcers it
will get. On an interval schedule, the number of reinforcers is limited by the
schedule. You may work, for example, at a job where you get paid every two
weeks. You can only receive one paycheck every two weeks no matter how often
you make the response of visiting the payroll office.
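For readers who find a little arithmetic helpful, the contrast between ratio and interval schedules can be sketched in a few lines of Python. The function names and numbers below are illustrative, not from the chapter, and the interval model is simplified (it assumes the animal responds often enough to collect every available reinforcer):

```python
# A minimal sketch contrasting how fast responding pays off
# under ratio vs. interval schedules of reinforcement.

def reinforcers_fixed_ratio(responses, ratio):
    """On a fixed-ratio (FR) schedule, every `ratio`-th response earns a reinforcer."""
    return responses // ratio

def reinforcers_fixed_interval(responses, total_seconds, interval):
    """On a fixed-interval (FI) schedule, only the first response after the
    interval elapses is reinforced, so reinforcers are capped by the clock.
    Assumes at least one response per interval."""
    return min(responses, total_seconds // interval)

# Ten minutes (600 seconds) of responding at two different rates:
fast, slow = 600, 150   # total responses made

print(reinforcers_fixed_ratio(fast, 10))          # 60 — faster responding earns more
print(reinforcers_fixed_ratio(slow, 10))          # 15
print(reinforcers_fixed_interval(fast, 600, 60))  # 10 — capped by the clock
print(reinforcers_fixed_interval(slow, 600, 60))  # 10 — same, despite 4x fewer responses
```

Quadrupling the response rate quadruples the payoff on the ratio schedule but changes nothing on the interval schedule, which is one way to see why ratio schedules sustain higher rates of responding.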
suppressed. Take, for example, a child who interrupts a parent while the parent is
on the phone. The parent is always there when the behavior occurs. If the parent
delivers punishment immediately after the behavior and the punishment is severe
enough, the behavior will be suppressed. The fact that such punishment is effective,
however, doesn't necessarily mean that it is a good choice for modifying behavior.
away from whoever is spanking them. Every year, children run away from
homes where they are punished. Animals will also try to avoid the situation in
which they were punished. This is called avoidance. Avoidance makes punishment effective if the animal avoids the behavior that resulted in punishment.
Unfortunately, the animal also avoids the situation where it was punished and
the person who did the punishing (Azrin & Holz, 1966). Children who receive a
lot of punishment will spend much of their time away from home. They can be
found at the park, at a friends house, at the mallanywhere but the place they
are punished. If they are home, they will try to stay away from the person who
punishes them. Children cant learn to model a parents behavior if they constantly avoid the parent.
Punishment Provides Poor Feedback Punishment, even when effective, usually gives information only about what you shouldn't do. Unfortunately, it doesn't do a
very good job of telling you what you should do. A child who is punished for leaving a jacket on the floor by the front door may learn not to leave the jacket in that
particular spot. The punishment, however, doesn't reinforce them for putting the
jacket where it belongs. Since there are many places where the jacket doesn't belong, it may be quite a while before the correct response is learned. Using shaping to teach the correct response is almost always faster than using punishment.
Punishment Can Lead to Aggressive Responses One of the most troubling
side effects of punishment is its tendency to create aggressive responses. Azrin
and Holz (1966) found that if two rats in the same cage are given electric shock,
their first response is to try to escape. After the escape behaviors are extinguished, however, the rats soon turn on each other. The punishment appears to
make them aggressive and violent. Children who are punished can turn on
their parents too. When the punishment is mild, children may get even
by being irresponsible or by doing things that annoy their parents. In extreme cases, children sometimes injure or even kill their parents.
Q: Is punishment a direct cause of this kind of behavior?
We cant perform the scientific research that would give us a clear answer to this
question. It would be unethical to punish children intentionally just to observe the
results. Instead, we have to observe existing families. As scientists, we try to find
patterns of behavior that will help explain the effects of punishment on children.
Punishment Can Model Violent Behavior When a parent uses physical punishment, he or she is providing a violent model for the child. The parent has a
problem with the child's behavior and is trying to solve that problem with physical violence. We know that children imitate the behaviors of their parents. If a
parent solves problems with punishment, the child is likely to do the same (Patterson, 2002; Reid et al., 2002). A child who often receives physical punishment
is likely to grow up to be a more violent person. This kind of imitation is called modeling. In one study, trained observers recorded violent behavior at playgrounds in Germany, Denmark, and Italy. They recorded aggressive
acts between adults and children, and between children. They found that where
adults were aggressive toward children, children were more violent to other
children (Hyman, 1997). We will discuss modeling in more detail in the section on observational learning later in this chapter.
Q: Is punishment ever a good idea?
Psychologists don't agree on whether punishment is ever an acceptable way of
changing behavior. We have seen that punishment is difficult to use effectively.
We have also seen that even when punishment is used effectively, it can lead to
emotional behavior, low self-esteem, escape and avoidance behavior, and aggression. It can suppress behavior without actually getting rid of it. It serves as
poor feedback about what the person should do. It can also model undesirable
behaviors and lead to violence and revenge. Most psychologists agree that punishment should be avoided whenever possible. Some psychologists, however,
believe that there are situations where punishment is appropriate. Some behaviors, like playing with weapons or running into traffic, put the child or others in serious danger. If the parent is present and can deliver punishment
immediately, punishment can suppress these behaviors. Sometimes a single
punishment can be effective if it follows immediately after the behavior. Some
psychologists believe that in these cases punishment is justified. Others believe
that the negative effects of punishment are bad enough that it should never be
used. These psychologists argue that parents should modify the environment
so that the punishment isn't necessary. Although psychologists don't agree on
whether punishment is ever necessary, they generally agree that the scientific
evidence strongly supports two basic statements about punishment. First,
much of the punishment that occurs in everyday life is not effective. Second,
even when punishment is effective, it is often a bad idea. Many undesirable behaviors can be eliminated either by using extinction or by reinforcing an alternative behavior. These techniques work as well or better than punishment
in most situations and do not have punishment's undesirable side effects.
Going Beyond the Data
There are several reasons for having prisons. One obvious one is to keep dangerous people from harming the
rest of us. Putting a person in prison can also satisfy society's desire to punish wrongdoers for their antisocial
acts. Crime victims and their families are also thought to
receive satisfaction from the imprisonment of the criminal.
Let's look, however, at one of the primary functions of
prison in modern society. Prisons are supposed to reduce
the crime rate through the use of punishment. The
threat of prison is meant to deter people from committing crimes. Common sense tells us that this should work.
The traditional idea is that if we punish antisocial behaviors, they will become less common. Prison is the primary
punishment for criminal behaviors in our society. Earlier,
we said that to be effective, punishment should be immediate, consistent, brief, and unpleasant. Does a prison
sentence meet these requirements? Is it immediate or
consistent? Many crimes go unpunished. Criminals are
often arrested but not charged. They are often tried but
not convicted. Even if they are convicted, the prison sentence usually comes months or even years after the
crime. Is the punishment brief? Even the shortest prison
sentences usually involve months of punishment. Is
prison unpleasant? For many prisoners, prison is a very
unpleasant punishment. Other criminals, though, have
lives filled with danger, uncertainty, and deprivation. For
these criminals, prison may actually be less unpleasant
good that the behavior will resume. Worse yet, the person may have learned how to avoid getting caught.
From other prisoners, they may learn how to defeat
alarm systems or how to select houses to burglarize. The
net result of this very expensive procedure may be to actually make the person a better burglar.
There is no question that there are dangerous people in the world. Such people need to be isolated from
society to prevent them from hurting the rest of us. We
will probably always need prisons for this purpose.
Based on what we know about how punishment works,
however, we shouldn't expect that the threat of going
to prison will have much effect on the crime rate.
reinforced for this behavior and it delayed them from receiving the
food. Raccoons in the wild often wash their food in water. It appears,
then, that raccoons have a biological predisposition to wash reinforcers that have been associated with food.
In a more formal scientific test of the idea of biological predispositions, Foree and LoLordo (1973) taught pigeons to get food or avoid
shock by pecking or flapping their wings. The researchers found that
the pigeons easily learned to peck for food and flap their wings to avoid
shock. These were natural behaviors for the pigeons. Pigeons naturally
use their beaks to get food and flee from danger by flapping their wings. The
animals had a very hard time, however, learning the behaviors the other way
around. Teaching them to flap their wings to get food or peck to avoid shock
was much more difficult. Biological programming makes it much easier for animals to learn certain behaviors. The behaviors they learn easily seem to be related to natural responses that have evolved for that species. This effect can be
found in both classical and operant conditioning.
COGNITIVE LEARNING
Cognitive learning is a general term and one that is not always used in the same
way by all psychologists. Some psychologists (strict behaviorists in particular)
don't accept the principle of cognitive learning at all. Others insist that cognitive learning is common and that much of what we know is the result of cognitive processes. Cognitive learning usually refers to all learning that does not
occur through traditional classical or operant conditioning. Usually, what is
learned is more complex than a simple response to a stimulus. In cognitive
learning, we often learn by observing, by reading, by imitating others, or by
reasoning. As we discussed in Chapter 1, cognitive psychology developed
when psychologists tried to give scientific explanations of complex human behaviors like language, memory, and problem solving. These, and many other
common human behaviors, are very hard to explain scientifically using just
classical and operant conditioning.
Vicarious Conditioning
If a scientist were to give you a painful electric shock every time a small red
light came on, you would soon have an emotional reaction to the light. When
the light came on, your heart would speed up and your breathing would become more rapid. You might begin to sweat. In fact, it is likely that you would
experience a mild version of the fight-or-flight response, the body's normal
reaction to an emergency. Psychologists have found that this reaction can be
produced without ever shocking the subject. Bandura and Rosenthal (1966)
gave subjects an electric shock whenever a light came on. The subjects soon
learned to respond emotionally to the light itself. They also found, though,
that other subjects who merely watched this process also became aroused
when the light came on. This is called vicarious conditioning. In vicarious
conditioning, people learn emotional responses by watching what happens to
someone else. Some of the classical conditioning used in advertising depends
on vicarious conditioning. We often see someone being embarrassed, frightened, or happy in an advertisement. The advertisers hope that we will later experience the same emotion in similar situations. They hope that our emotional
response will make us more likely to buy their product. Vicarious conditioning also explains the fears that people can develop after watching a horror
movie.

vicarious conditioning
Conditioning that occurs when an animal observes another animal being conditioned

In a very frightening scene in Alfred Hitchcock's movie, Psycho, actress
Janet Leigh is stabbed to death while taking a shower in a motel. Years later,
Leigh confessed that after appearing in the movie she never took another
shower. I saw the movie only once, many years ago. To this day, I'm afraid to
take a shower in a motel. Many of my friends confess that they have the same
fear. I have never been punished in any way for taking a shower in a motel so
my fear can't be the result of simple classical conditioning. Janet Leigh revealed in 1995 that her fear was not a result of filming the famous shower
scene. She acquired her fear of showers along with many of the rest of us, by
watching the movie. A cognitive psychologist would argue that higher mental
processes like thinking and imagining are necessary to explain these examples
of vicarious conditioning.
Actress Janet Leigh developed a conditioned fear of taking showers from watching herself in this scene from the movie Psycho.
cognitive map
A mental image of an animal's environment that the animal can use to guide its movements

latent learning
Learning that occurs in the absence of obvious reinforcers and only appears after reinforcement is introduced
FIGURE 7.6 COGNITIVE MAPS
map of the maze. When the food became available, they used their cognitive
map to find it quickly. What they had learned while wandering through the
maze earlier was not a simple response. They had learned the layout of the
maze much as you and I have learned the way around the town we live in. If
we hear from a friend that someone is giving away twenty-dollar bills at Fifth
and Main, we can use our knowledge to get there as quickly as possible.
Q: Did these rats learn without any reinforcement at all?
The rats learned to navigate the maze without being reinforced by food. It
seems likely, however, that there were other reinforcers at work. Animals are
often reinforced for developing and maintaining accurate cognitive maps of
their environments. Hunters remember where they have found game and are
often reinforced for doing so. I have a friend who is an enthusiastic golfer. He
can draw you a very accurate map of his favorite golf course. He claims that
when he has trouble sleeping, he plays the course in his head. Because his
knowledge of the course helps him get lower scores, my friend is often reinforced for maintaining this map of the course. The rats in the experiment may
have been reinforced in the past for having accurate cognitive maps of the
world around them. If this behavior was reinforced in a variety of settings, generalization would take effect. We would expect the rats to construct cognitive
maps in any new environment, including the maze. As scientists, we cant say
that the rats learned without being reinforced. We can only say that food was
not the reinforcer. We should also consider the possibility that making accurate cognitive maps has survival value. Perhaps making cognitive maps has
evolved because it helps us survive long enough to reproduce. This means that
making cognitive maps may not be a learned behavior at all.
modeling
The tendency to imitate the behavior of another person
had never seen it. He said that if he watches someone elses swing, he begins to
imitate it. Sometimes the results are disastrous. Modeling is a powerful force
in shaping human behavior. We imitate our parents, our siblings, people we see
in films and television shows, even cartoon characters. An important area of
study for psychologists is the question of whether watching violent behavior in
others makes us more violent. How could we, as scientists, test the theory that
we imitate violent behavior when we see it?
By far, the best-known scientific experiments on modeling are those performed by Albert Bandura and his associates (Bandura et al., 1961, 1963).
In these experiments some children observed adults attacking an inflatable
Bobo doll. The Bobo doll is an inflatable toy with weight in the bottom. It
pops back up when it is knocked down. In a typical experiment, children in
one group were playing quietly when an adult in the room went over to the
Bobo doll and began attacking it. The adult beat on the doll for about ten
minutes, kicking it, hitting it with a hammer, knocking it down, and sitting
on it. During this performance, the adult also spoke out loud saying things
like "Sock him in the nose" and "Kick him." The other group of children saw
no aggressive behavior at all. One at a time, the children were then taken into
another room. This new room contained a number of attractive toys and another Bobo doll. In the new room, each child was told that the toys were being saved for other children and that he or she was not allowed to play with
them. The child was then left alone and watched through hidden windows.
The children who had seen the aggressive model were much more likely to attack the Bobo doll in the second room. Often, they attacked it in the same way
as the adult had. They even repeated the phrases they had heard the adult use.
They appeared to be modeling their behavior closely after the behavior of the
adult. Bandura and his associates found that the children imitated the violent
behavior of adults even when the adult violence they saw was on film. Bandura's research raises many questions about showing children violent images
on television or in the movies. We will discuss the effects of violence in the
media further in Chapter 14.
Q: Does reinforcement or punishment of the model play a role in modeling?
In Banduras early studies, the adult models received no punishment or reinforcement for their violent behavior. Mary Rosekrans and Willard Hartup
(1967) designed a scientific experiment to find out if what happened
to the adult model after their violent outburst would make a difference in the
child's imitation of the behavior. In this chapter's Science of Psychology Journal,
we take a look at their experiment.
Chapter Summary
What are the three basic kinds of learning?
Learning is defined as a relatively permanent change in behavior due to experience. The three basic kinds of learning are classical conditioning, operant conditioning, and observational learning.
How did Ivan Pavlov discover classical conditioning?
While studying digestion in dogs, Ivan Pavlov learned that if a neutral stimulus (a bell) is repeatedly paired with an unconditioned stimulus (food) that
already elicits the unconditioned response (salivation), the neutral stimulus
will come to elicit the same response.
Who made Little Albert afraid of the rat and how was it done?
J. B. Watson made Little Albert afraid of the rat by using classical conditioning. He presented the rat together with a loud noise.
How are behaviors learned and unlearned in classical conditioning?
During the acquisition phase of classical conditioning, a neutral stimulus
(CS) is repeatedly paired with an unconditioned stimulus (US) that already
elicits the unconditioned response (UR). After conditioning, the CS will
elicit the same response. The response to the CS is called a conditioned response (CR).
An extinguished response will usually show spontaneous recovery and reappear after a rest period.
In punishment, the response is followed by an aversive stimulus. Over time,
the frequency of the response decreases.
How does the schedule of reinforcement affect operant conditioning?
With continuous reinforcement, the response is reinforced every time it occurs. During intermittent reinforcement, the response is not reinforced
every time it occurs. There are two kinds of intermittent schedules of reinforcement: ratio schedules and interval schedules.
On ratio schedules, the animal must make a certain number of responses to
receive a reinforcer. The number may be fixed (fixed-ratio schedule) or it
may vary (variable-ratio schedule). Ratio schedules generally produce
higher rates of responding than interval schedules.
On interval schedules, the animal will not be reinforced for a certain period
of time after receiving a reinforcer. The interval may be fixed (fixed-interval
schedule) or it may vary (variable-interval schedule).
Can punishment be an effective method of eliminating undesirable behaviors?
Punishment can suppress undesirable behaviors if it is immediate, consistent, brief, and aversive. However, in everyday life, it is often difficult to meet
these standards. Using prison to punish criminals, for example, is unlikely
to be effective in suppressing their criminal behavior because the punishment is not immediate, consistent, or brief, and may not be aversive for
some criminals.
Are there problems or negative side effects associated with the use of punishment?
Punishment has a number of problems and negative side effects. It can cause
anxiety and emotional behaviors. It can suppress behaviors without really
eliminating them. It can lead to escape and avoidance behaviors. Punishment provides poor feedback because it doesn't give any information about
what behavior is desired. It can also produce aggression and hostility. Adults
who use physical punishment are modeling violent behavior.
Can we learn without undergoing classical or operant conditioning?
In vicarious conditioning, conditioned responses can be learned by observing another animal being conditioned.
The research of Albert Bandura and his associates demonstrated that children imitate the behavior of adults through a process called modeling.
Does watching violent shows make children more violent?
A number of researchers have found that children imitate the behavior of
adult models. This is true even when they view films or videos of adult behavior. If the adults behave violently, it appears that the children will become
more violent. This is especially true when the adult model is reinforced for
violent behavior, a common feature of much media violence.
Important Terms and Concepts

learning
stimulus
response
habituation (a decrease in response to a repeated stimulus)
classical conditioning
unconditioned stimulus (US)
unconditioned response (UR)
conditioned stimulus (CS)
conditioned response (CR)
extinction
spontaneous recovery
generalization
discrimination
discrimination training
conditioned emotional response
systematic desensitization
biological preparedness
blocking
taste aversion
law of effect
operant conditioning
reinforcer
punishment
positive reinforcement
negative reinforcement
positive punishment
negative punishment
shaping
sensory reinforcer
primary reinforcer
secondary reinforcer
generalized reinforcer
Premack principle
superstitious behavior
stimulus generalization
discriminative stimulus
stimulus control
continuous reinforcement
intermittent reinforcement
fixed-ratio schedule
variable-ratio schedule
fixed-interval schedule
variable-interval schedule
vicarious conditioning
cognitive map
latent learning
modeling

Important Names

Ivan Pavlov
J. B. Watson
Robert Rescorla
Leon Kamin
John Garcia
E. L. Thorndike
B. F. Skinner
David Premack
Albert Bandura