
JOURNAL OF THE EXPERIMENTAL ANALYSIS OF BEHAVIOR 1975, 24, 215-225 NUMBER 2 (SEPTEMBER)

THE BLOCKING OF REINFORCEMENT CONTROL¹


BEN A. WILLIAMS
UNIVERSITY OF CALIFORNIA, SAN DIEGO

Two experiments were conducted to extend the blocking effect to the reinforcement of a response. A delayed reinforcement contingency was presented to subjects with or without a previously pretrained response available during the delay interval. The interpolated response had no scheduled effect on delivery of the reinforcer, but its availability reduced strengthening of the initial response, which completely extinguished for some subjects. The results were interpreted as support for blocking as a fundamental principle of behavior, and as evidence against the principle of reinforcement being stated solely in terms of temporal proximity between response and reinforcer.
Key words: blocking, principle of reinforcement, delay of reinforcement, temporal contiguity, predictiveness, information, pigeons

¹This research was supported by USPHS grant MH 25202-01 and NSF grant GB-42887 to the University of California. Reprints may be obtained from the author, Department of Psychology, University of California, San Diego, P. O. Box 109, La Jolla, California 92037. The author thanks George S. Reynolds for the loan of equipment to conduct Experiment 2.

Investigations of the conditions defining when behavior is strengthened (or selected) by reinforcement and of the conditions defining the occurrence of stimulus control over behavior have grown increasingly separate. Perhaps because of the emphasis on operants as emitted behavior, with no necessary eliciting stimulus, response strengthening and stimulus control have been assumed, at least implicitly, to be independent processes, which may obey different laws of conditioning. Such a separation is to be contrasted with the Thorndikean law of effect, where the unit of learning was the S-R connection, and response strengthening and stimulus control were totally interdependent.

The effect of the Skinnerian rejection of the S-R connection is nowhere more evident than in the treatment of temporal contiguity as a condition for learning. Until recent challenges (e.g., Baum, 1973; Bloomfield, 1972), temporal contiguity between response and reinforcer has been almost universally accepted as the key ingredient of the principle of reinforcement. In the words of Skinner (1948), "... conditioning takes place presumably because of the temporal relation only, expressed in terms of the order and proximity of response and reinforcement." In contrast, there never has been universal acceptance of temporal contiguity's role for stimulus control. The need for differential reinforcement (e.g., Jenkins and Harrison, 1960) and the apparent involvement of selective attention (e.g., Reynolds, 1961) both have argued, persuasively to some, that temporal contiguity between stimulus and reinforcement is not a sufficient condition for stimulus control.

More recent research has shown clearly that, indeed, temporal contiguity is not sufficient (cf. Honig, 1970). Perhaps the most definitive demonstration of stimuli not gaining control even when indefinitely paired with reinforcement is the "blocking effect" (cf. Kamin, 1968; 1969). The typical procedure for producing blocking is to pair some compound stimulus, AB, with reinforcement. Normally both elements gain control, as shown when they are tested separately. When one of the elements A is pretrained alone, however, stimulus control by the other element B occurs much less during compound AB training. Thus, the pretraining of A "blocks" control by B.

The question raised by demonstrations such as blocking is whether similar functional relations can be shown for the operant strengthening of a response by reinforcement. To the extent that stimulus control and response strengthening obey similar laws, some more unitary conception of learning would seem required. The present experiments addressed
this issue by extending the blocking effect to a response-reinforcer relation. A contingency was arranged between a response and reinforcer in a situation where an alternative response was available that previously was associated with the reinforcer. The opportunity for the two responses was arranged sequentially. First came the response to be conditioned, and then the previously associated response. At issue was whether the presence of the previously associated response during the delay between the initial response and reinforcer blocked the strengthening of the initial response.

EXPERIMENT 1

Subjects
Twelve experimentally naive mixed-breed pigeons were maintained at 80% of their free-feeding body weights.

Apparatus
Two identical chambers with interior dimension of 28 by 30 by 38 cm were constructed from styrofoam picnic cases and placed in larger wooden boxes. Two translucent pigeon keys, each 2.5 cm in diameter, were located 23 cm above the floor, and 10 cm apart, center to center. Each key required at least 20 g force (0.20 N) for operation. The left key could be illuminated with red light, the right key with green light. Between and 8.0 cm below the keys was the grain hopper, which was activated for reinforcement. The two chambers were located in a completely darkened room, isolated from the scheduling equipment in an adjoining room.

Procedure
Pretraining. After learning to peck the right-green response key, all subjects were trained in the next two to three sessions to peck on a fixed-ratio 25 reinforcement schedule (FR 25). They were then given four additional sessions with the FR 25 schedule. Throughout pretraining, the right-green key was illuminated continuously and the left-red key was darkened and inoperative. All sessions terminated after 50 reinforcements. Six subjects were then assigned to each of two experimental conditions.

R + G condition. After pretraining, all sessions were divided into 50 discrete trials. A trial began with illumination of the left-red key for 5 sec, followed by 5 sec of darkness, followed by illumination of the right-green key for 4 sec, followed by a 3.0-sec reinforcer if at least one peck had occurred on the left-red key while illuminated. Responses to the right-green key had no scheduled effect. In addition to the reinforcement obtained through pecking, response-independent grain delivery at the green-key offset was scheduled on one-third of the trials on a quasirandom basis. The additional reinforcement was included to ensure that the green-key responding would be maintained even if red-key responding extinguished. Trials were separated by a 50-sec intertrial interval, during which time the chamber was dark.

R-only condition. The procedure for the six control subjects was similar except that the right-green key was not illuminated during the last 4 sec of the delay interval. Their trials were composed of 5 sec of left-red key illumination, 9 sec of darkness, and then reinforcement if a red-key response had occurred. As with the blocking condition, response-independent reinforcement was scheduled on one-third of the trials at the end of the 9.0-sec delay.

Reversal of conditions. After 10 sessions under the above procedures, the conditions for the two groups were reversed, and 10 additional sessions were presented.

RESULTS
Figure 1 presents the mean percentage of trials on which reinforcement was obtained by a response to the left-red key. Subjects with the interpolated green key responded to the red key less often. To assess the effect statistically, the data were subjected to a two-way analysis of variance. The blocks produced a significant effect (F(9, 90) = 5.40, p < 0.01), but the F-value for the treatment effect only approached significance (F(1, 10) = 4.65, 0.10 > p > 0.05). The blocks × treatment interaction was significant, however (F(9, 90) = 2.72, p < 0.01), indicating that the two groups significantly diverged as training continued.

The second portion of Figure 1 shows the effect of reversing the conditions. Subjects initially trained on the R-only condition continued to respond to the red key with only a slight decrement. Once reinforcement control was established, therefore, addition of the interpolated response generally did not abolish
the control, at least with the number of sessions used. In contrast, subjects initially trained on the R + G condition responded substantially more when switched to the R-only condition. The failure of complete recovery is not depicted adequately by the group mean because two subjects never responded after the conditions were reversed.

Fig. 1. Mean percentage of trials on which at least one peck occurred on the initial red key. [Plot: x-axis, sessions.]

Figure 2 shows the individual data for subjects trained initially on the R-only condition and then switched to R + G. During initial training, five of the six subjects gradually increased responding to the red key; Subject 33 never developed any consistent responding throughout the 10 sessions of training. A noteworthy observation is the data of Subject 40, whose responding decreased toward the end of the R-only training. This decrease with continued training under delayed reinforcement has been noted on several occasions with the present procedure and is consistent with the delay-of-reinforcement literature (cf. Logan, 1960).

Fig. 2. Percentage of trials with at least one peck on the initial red key for subjects trained on the R-only condition and switched to R + G. [Plot: one panel per subject; x-axis, sessions; separate curves for responses to red and responses to green.]

The effects of shifting from R-only to R + G are shown in the second portion of Figure 2, which presents each response separately. Three subjects (35, 18, 60) continued to respond to
the red key without decrement; Subjects 37 and 40 showed a small decrement, and Subject 33 once again did not develop any consistent responding. Noteworthy is that maintenance of the red-key response occurred in conjunction with resumption of green-key responding. The only exception was Subject 18, which never responded to the green key after the switch in conditions, in spite of its pretraining of that response, and also in spite of the green key's close temporal association with reinforcement.

Figure 3 shows the variety of performances for subjects initially trained on R + G. Four of the six subjects had essentially no red-key responding by the end of the initial phase of training. Three of these continued to respond to the green key, while the fourth, Subject 16, ceased responding. The two remaining subjects that did respond to the red key also continued to respond to the interpolated green key.

The second portion of Figure 3 shows the results of shifting to the R-only condition. By the end of that training, four of the six subjects responded consistently to the red key. The remaining subjects never responded to the red key and consequently that behavior was never reinforced.

Fig. 3. Percentage of trials with at least one peck on the initial red key for subjects trained on R + G and switched to R-only. [Plot: one panel per subject; x-axis, sessions; separate curves for responses to red and responses to green.]

DISCUSSION
Experiment 1 indicated that reinforcement control in a delayed reinforcement situation is in some measure a function of the behavior in the delay interval. Subjects that had the delay interval filled with darkness responded consistently more than those that had available a response previously associated with reinforcement. With the second response available, in fact, responding normally controlled by the delayed reinforcement contingency ceased completely for some subjects. A tentative conclusion, therefore, is that temporal contiguity between response and reinforcer is
not a sufficient condition for the occurrence of reinforcement. Apparently, reinforcement control over a response can be blocked, just as stimulus control previously has been shown to be blocked.

Explanations other than blocking are of course possible. The simple occurrence of the response during the delay interval is correlated with several other variables, including differences in illumination, response competition, etc. The primary datum needed to establish blocking as the sole explanation of the above data is an inverse functional relation between the strength of the response interpolated in the delay interval and the degree of strengthening of the first response. Kamin (1969) reported such a relation for the blocking of stimulus control. That is, the degree of control acquired by element B during compound AB training is an inverse function of the amount of control acquired by element A during pretraining. Experiment 2 attempted to establish a similar result for the reinforcement control over behavior.

EXPERIMENT 2
The rationale of this experiment was to determine the level of initial-key response as a function of the degree of association of the interpolated response with reinforcement. The manipulation of this association was accomplished by presenting two types of conditioning trials. One type was that in Experiment 1: first key, delay, second key, and then reinforcement if a response occurred to the first key. The second type presented the second key alone but with different reinforcement contingencies. For one experimental condition, responses to the second key were reinforced, whereas for a second they were extinguished. Two additional conditions were also run to control for the effects of reward versus extinction per se.

Subjects
Sixteen experimentally naive White Carneaux pigeons were maintained at 80% of their free-feeding body weights.

Apparatus
Two standard chambers included an inner Plexiglas chamber, 30.5 cm in all dimensions, attached to a metal panel, and enclosed in a larger chamber. On the metal panels were mounted two Gerbrands pigeon keys, 1.9 cm in diameter, which required a force of 0.10 to 0.12 N for operation. The keys were mounted 15 cm apart center to center for chamber 1, and 12 cm apart for chamber 2. The keys were illuminated with two 7.5-W Christmas-tree light bulbs for chamber 1, and with two 28-V jewel lights for chamber 2. For both chambers, the food magazine was located directly between and 12 cm below the two keys. Houselights were located in the upper left-hand corners of the two chambers.

Procedure
All subjects were first trained with an autoshaping procedure using the right-red response key (cf. Brown and Jenkins, 1968). The key was illuminated for 5 sec, followed by 3 sec of food reinforcement. Trials were separated by a variable intertrial interval (mean = 50 sec) ranging from 20 to 110 sec. All subjects were trained under these conditions until the first session in which an autoshaped response occurred, and were then given one additional session. All sessions terminated after 50 trials.

All subjects were then presented three sessions of training with the left-green key continuously illuminated and the right-red key darkened and inoperative. In the first session, all pecks were reinforced; in the second and third sessions, pecking was reinforced on a VI 15-sec schedule. Sessions terminated after 50 reinforcements.

The subjects were then exposed to four experimental conditions, each of which involved two types of conditioning trials. One-half of the trials for all conditions consisted of presentations of the left-green key alone for 4 sec. For Conditions 1 and 3, reinforcement followed the offset of the green key if at least one peck occurred during its presentation. For Conditions 2 and 4, green-key responses were never reinforced. On the other half of the trials, Conditions 1 and 2 involved the R + G procedure of Experiment 1: the red key was illuminated for 5 sec, followed by 5 sec of delay, 4 sec of green-key illumination, and then food presentation if at least one peck had occurred on the red key. For Conditions 3 and 4, the second type of trial involved the R-only procedure of Experiment 1: the right-red key was illuminated for 5 sec, followed by 9 sec of delay, and then reinforcement if at least one
peck had occurred. Table 1 summarizes the procedures. The two types of trials alternated quasirandomly. Twenty-five trials of each type were presented each session, with the same intertrial intervals as used in autoshaping (mean = 50 sec). A houselight was used throughout all phases of training.

Table 1
Training Conditions for the Four Procedures

             Trial Type
Condition    A         B
1            R + G     G → SR
2            R + G     G → EXT
3            R-only    G → SR
4            R-only    G → EXT

Training was divided into three phases, as defined by the assignment of subjects to experimental conditions. The different phases were separated by a single retraining session with the right-red key alone, to ensure that responding occurred to that key at the beginning of an experimental condition. The first five to 10 trials during the retraining sessions involved the autoshaping procedure described above. The remaining 40 to 45 trials used a delayed reinforcement procedure where the red key was illuminated for 5 sec followed by a 3-sec delay, and then reinforcement if at least one peck had occurred. Training during each phase was planned to continue for 10 sessions. After the first 10 sessions of Phase 1 (hereafter designated Phase 1A), however, it became apparent that responding to the red key was highly variable across subjects. To reduce the variability, all subjects were presented one session of the red-key retraining procedure just described, and then were returned to their original training conditions for 10 additional sessions (Phase 1B). Phases 2 and 3 each involved only 10 sessions. Table 2 summarizes the assignment of subjects to conditions for each phase of training.

Table 2
Percentage of trials with a response to the red key during the last two sessions of each condition.

           Phase 1                        Phase 2          Phase 3
Subject    Condition   % (1A)   % (1B)    Condition   %    Condition   %
1          1           8        16        1           28   4           70
18         1           90       98        2           98   3           100
22         1           4        82        3           80   2           82
13         1           2        40        4           64   3           96
6          2           86       90        1           56   4           82
15         2           96       58        2           76   1           40
10         2           82       82        3           96   3           80
20         2           88       94        4           100  1           98
9          3           0        100       1           96   1           58
3          3           0        94        2           90   2           84
11         3           82       68        3           68   4           90
5          3           12       74        4           96   4           86
7          4           84       84        1           2    2           70
25         4           0        82        2           76   3           98
21         4           66       90        3           62   1           90
17         4           96       82        4           80   2           92

RESULTS
Figure 4 shows the mean percentage of red-key responding for the last two sessions of each phase of training. The different phases correspond to each new assignment of subjects to conditions. The primary comparison of interest is that between Conditions 1 and 2, which both received the R + G trials. As can be seen, Condition 1 produced a consistently lower probability of red-key responding for all phases of training. Separate t-tests among conditions were conducted on Phase 1A and the combined Phases 1B, 2, and 3. Both differences were statistically significant (Phase 1A: t = 2.88, 6 df., p < 0.05; Phases 1B, 2, and 3: t = 2.32, 22 df., p < 0.05).

To substantiate that the difference between Conditions 1 and 2 was due to the differential effects of presenting the green key in the delay interval, and not simply to the differential green-key contingency when it was presented
alone, the control Conditions 3 and 4 were also run. These two conditions received R-only trials instead of R + G trials, but with different contingencies associated with the G-alone trials. Any difference between Conditions 3 and 4 would thus be due to the effects of reward versus extinction associated with green, quite apart from its presentation during the delay-of-reinforcement interval. Figure 4 shows an effect for Phase 1A but no difference for any of the remaining phases of training. The difference for Phase 1A was not statistically significant (t = 1.31, 6 df., p > 0.05).

Fig. 4. Mean percentage of trials on which at least one peck occurred on the initial red key in the last two sessions of each phase of training. [Plot: x-axis, training phase.]

The small amount of Condition-3 responding during Phase 1A is noteworthy because it is similar to effects often found with the present procedure when subjects are transferred abruptly to a long delay of reinforcement. That is, subjects typically show an initial reduction of responding, which may or may not recover. In the present case, three of the four subjects had stopped responding during the first three to four sessions, and their responding never recovered. After retraining on the 3-sec delayed reinforcement procedure before the start of Phase 1B, however, all subjects continued to respond consistently. Apparently, some minimal training on a short delay of reinforcement is necessary to maintain behavior under longer delays. Observation suggested that food-magazine orientation produced blocking analogous to that produced by the explicitly interpolated response.

Because the difference between Conditions 1 and 2 was due to the differential history of the interpolated green response, of particular interest is the degree of green-key responding on the R + G trials. Table 3 shows the mean rate of responding to the interpolated green key for both conditions during each phase of training. Little difference between the conditions occurred for either part of Phase 1, in spite of a significant difference for red-key responding. The high level of green-key responding for Condition 2 is somewhat surprising, since little responding occurred to the green key when it was presented alone. Responding to the green key on R + G trials did decrease during Phases 2 and 3, but this was due primarily to some subjects not developing responding upon transfer to that condition. In particular, subjects transferred from Condition 4 to Condition 2 never developed the interpolated responding in spite of the green key being followed consistently by the reinforcer. It is nonetheless clear from Table 3 that the degree of interpolated responding was not a critical determinant of the difference between the two conditions.

Table 3
Mean rate of responding (responses per second) to the interspersed green key, for R + G trials, for the last two sessions of each phase of training.

               Phase
               1A      1B      2       3
Condition 1    2.17    2.65    3.27    1.96
Condition 2    2.58    2.78    1.94    0.69

The individual subject data are shown in Table 2 and Figure 5. Table 2 shows the percentage of red-key trials with at least one response during the last two sessions of each experimental condition. Of primary interest is the pattern of variability for subjects trained under Condition 1. Whereas the performance of most subjects was reduced during that condition, performance for some subjects was maintained. The two classes of obtained behavior do not appear to be due to random variability, but instead are consistent with the results of Experiment 1. Namely, the subjects with maintained performance under the R + G condition also maintained their performance under all other conditions as well.

Figure 5 shows the individual subject data for all subjects transferred between Conditions 1 and 2 without any other condition interspersed between. For all subjects, the mean level of red-key responding across sessions was less for Condition 1. The variability across sessions was also greater for Condition 1.
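The t values reported for Experiment 2 can be checked against the percentages in Table 2. The sketch below is not the author's analysis; it assumes an independent-groups t-test with pooled variance (the paper does not name the test, although the reported 6 and 22 df are consistent with it) and treats every subject-by-phase entry as a separate observation. The names `ROWS`, `scores`, and `pooled_t` are illustrative only.

```python
from math import sqrt
from statistics import mean, variance

# Rows copied from Table 2:
# (subject, Phase 1 condition, % 1A, % 1B, Phase 2 condition, %, Phase 3 condition, %)
ROWS = [
    (1, 1, 8, 16, 1, 28, 4, 70),    (18, 1, 90, 98, 2, 98, 3, 100),
    (22, 1, 4, 82, 3, 80, 2, 82),   (13, 1, 2, 40, 4, 64, 3, 96),
    (6, 2, 86, 90, 1, 56, 4, 82),   (15, 2, 96, 58, 2, 76, 1, 40),
    (10, 2, 82, 82, 3, 96, 3, 80),  (20, 2, 88, 94, 4, 100, 1, 98),
    (9, 3, 0, 100, 1, 96, 1, 58),   (3, 3, 0, 94, 2, 90, 2, 84),
    (11, 3, 82, 68, 3, 68, 4, 90),  (5, 3, 12, 74, 4, 96, 4, 86),
    (7, 4, 84, 84, 1, 2, 2, 70),    (25, 4, 0, 82, 2, 76, 3, 98),
    (21, 4, 66, 90, 3, 62, 1, 90),  (17, 4, 96, 82, 4, 80, 2, 92),
]


def scores(condition, phases):
    """Collect the percentages recorded under one condition in the named phases."""
    out = []
    for _subj, c1, p1a, p1b, c2, p2, c3, p3 in ROWS:
        by_phase = {"1A": (c1, p1a), "1B": (c1, p1b), "2": (c2, p2), "3": (c3, p3)}
        for phase in phases:
            cond, pct = by_phase[phase]
            if cond == condition:
                out.append(pct)
    return out


def pooled_t(a, b):
    """Independent-groups t statistic with pooled variance (assumed test); returns (t, df)."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    se = sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(b) - mean(a)) / se, df


LATER = ["1B", "2", "3"]
t_1a, df_1a = pooled_t(scores(1, ["1A"]), scores(2, ["1A"]))      # Conditions 1 vs 2, Phase 1A
t_later, df_later = pooled_t(scores(1, LATER), scores(2, LATER))  # Conditions 1 vs 2, Phases 1B-3
t_ctrl, df_ctrl = pooled_t(scores(3, ["1A"]), scores(4, ["1A"]))  # Conditions 3 vs 4, Phase 1A

for label, t, df in [("1 vs 2, Phase 1A", t_1a, df_1a),
                     ("1 vs 2, Phases 1B-3", t_later, df_later),
                     ("3 vs 4, Phase 1A", t_ctrl, df_ctrl)]:
    print(f"Conditions {label}: t = {t:.2f}, df = {df}")
```

Under these assumptions the three statistics come out close to the reported values of 2.88, 2.32, and 1.31, which suggests that Table 2 as transcribed here is consistent with the analysis described in the text.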

Fig. 5. Individual data for R + G trials for subjects transferred directly between Conditions 1 and 2. Subjects in the left panel were transferred from Condition 2 to Condition 1; subjects in the right panel were transferred from Condition 1 to Condition 2. [Plot: x-axis, sessions.]

Of particular interest is Subject 7. Its training preceding Condition 1 was Condition 4, during which all green-key responding had extinguished. At the start of Condition 1, therefore, the subject was responding to the red key but not to the green key, either during the R + G trials or during trials when G was presented alone. In Session 3, response-independent reinforcement was given following the G-alone trials for the first 10 trials of the session. Responding to G-alone immediately developed and then transferred to the R + G trials. The development of the interpolated responding resulted, in turn, in the rapid decline of red-key responding, which persisted until the end of training.

DISCUSSION
The results demonstrate that the blocking effects were due to the history of reinforcement of the response interpolated in the delay-of-reinforcement interval. Although Conditions 1 and 2 involved identical reinforcement contingencies with respect to the initial red key, and also involved identical opportunities for interpolated responding, blocking occurred only for Condition 1, in which the interpolated response was separately associated with reinforcement outside of the R + G trials. Of major significance in understanding the mechanism of blocking is the fact that the actual occurrence of interpolated responding for the two conditions was not systematically different. Such a finding establishes that the association of the interpolated response with reinforcement, not simply its occurrence in the delay-of-reinforcement interval, is the critical variable in producing blocking. Such a result also excludes alternative explanations in terms of simpler mechanisms involving sequences of stimuli, illumination during the delay interval, response competition, etc.

A question raised by the present results is whether they require explanatory principles different from those postulated for the blocking of stimulus control. Two aspects of this issue should be separated. The first concerns the role of the instrumental contingency of reinforcement for the initial red key. Much recent emphasis has been given to the notion that pigeon's key pecking has many classical conditioning components (e.g., Moore, 1973), and perhaps the blocking effect obtained here has little generality for other instrumental learning situations. Blocking effects, however, are not restricted to classical conditioning procedures, but instead have been obtained in a variety of different procedures with different subjects (cf. Honig, 1970). Also, the present results do seem to depend upon the use of an instrumental contingency for the initial red key. Unpublished research with the procedure used in Experiment 1, but with all reinforcement delivered noncontingently, generally has not been the same, mainly because the behavior to the initial red key is not maintained by the control condition using delayed reinforcement (the R-only procedure of Experiment 1).

The second issue concerning the mechanism of the effect is whether the critical feature for blocking is the interpolated response or the keylight stimulus that previously has been associated with reinforcement. This question may be impossible to answer, because any response must be directed toward some feature of the apparatus, so that at least some stimulus properties also must become associated with reinforcement. Conversely, the presentation of a stimulus associated with reinforcement is likely to elicit some responding. The question becomes even less meaningful when it is considered that the response must itself possess stimulus properties, and that many investigators have argued that instrumental conditioning is simply the association of response-produced stimuli with the reinforcer (e.g., Bindra, 1972). Such a possibility suggests that the blocking effects obtained here and those found in stimulus control are interchangeable.
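The design logic behind this discussion (Conditions 1 and 2 share the same red-key trial and differ only in what happens on green-alone trials) can be laid out as data. The sketch below is an illustrative encoding of Table 1 and the Experiment 2 trial timings, not part of the original method; the names `Condition`, `CONDITIONS`, `red_trial_segments`, and `delay_to_food` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Condition:
    red_trial: str                # Trial Type A in Table 1: "R+G" or "R-only"
    green_alone_reinforced: bool  # Trial Type B: G -> SR (True) or G -> EXT (False)


# Table 1, encoded as data.
CONDITIONS = {
    1: Condition("R+G", True),
    2: Condition("R+G", False),
    3: Condition("R-only", True),
    4: Condition("R-only", False),
}


def red_trial_segments(kind):
    """Return the (event, seconds) segments of a red-key trial, per the Procedure."""
    if kind == "R+G":
        return [("red key", 5), ("delay", 5), ("green key", 4)]
    if kind == "R-only":
        return [("red key", 5), ("delay", 9)]
    raise ValueError(f"unknown trial kind: {kind}")


def delay_to_food(kind):
    """Seconds from red-key offset to the scheduled reinforcer."""
    return sum(seconds for _event, seconds in red_trial_segments(kind)[1:])


for number, cond in CONDITIONS.items():
    outcome = "SR" if cond.green_alone_reinforced else "EXT"
    print(f"Condition {number}: {cond.red_trial}, "
          f"{delay_to_food(cond.red_trial)} sec to food, G alone -> {outcome}")
```

Laid out this way, the delay between red-key offset and food is 9 sec in every condition, and the only field separating Condition 1 from Condition 2 is the green-alone contingency, which is the comparison the discussion relies on.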
The present results nonetheless differ from previous demonstrations of blocking in several important respects. Blocking research using stimulus-control procedures generally has simultaneously compounded stimulus elements where one element of the compound previously had been paired with reinforcement. In contrast, the present study sequentially presented the elements with a 5-sec delay period between them. Secondly, stimulus-control research has demonstrated blocking only with situations where the obtained rate of reinforcement was unaffected by blocking. In the present case, however, blocking was obtained in spite of the consequent reduction in rate of reinforcement. Such differences indicate that blocking is of central importance in many learning situations, both classical and operant, and indeed may be one of the most fundamental of learning phenomena.

Perhaps the most important aspect of the present results is their implication for a general theory of conditioning. Any statement of the law-of-effect solely in terms of temporal contiguity (e.g., Herrnstein, 1966; Skinner, 1948) is clearly inadequate, but less obvious is what direction any additional, or alternative, principle should take. One approach, consistent with considerable recent theoretical speculation (cf. Bloomfield, 1972), is to emphasize that learning occurs only when the response "predicts" the occurrence of the reinforcer. The difficulty with such an approach is that it is imprecise. Unless the conditions that determine predictiveness are clearly specified (e.g., temporal boundaries), the concept can be used to explain, but not predict, virtually any conditioning result. In addition, at least some versions of the notion (e.g., Baum, 1973) commit the experimenter to a "molar" analysis of conditioning, rather than the delineation of moment-by-moment changes in response strength.

An alternative approach to the present results is the conditioning model of Rescorla and Wagner (1972). The model was inspired by several different findings in the realm of classical conditioning, including blocking, but appears generally applicable to the stimulus control of operant behavior as well (cf. Williams, 1973). The question is whether it is also applicable to the operant strengthening of a response. According to the Rescorla-Wagner model, each reinforcing event (US) has some asymptotic amount of conditioning associated with it, which will be distributed across the various stimuli that precede it. Events then paired with the US become conditioned only if the sum of the existing conditioning to all of the stimuli in the situation is different from the asymptotic conditioning possible. If the sum is less, excitatory conditioning will occur; if the sum is more, inhibitory conditioning will occur.

The main features of the Rescorla-Wagner model appear directly applicable to the present results. The pretraining of the interpolated response presumably caused most of the asymptotic response strength possible to be acquired. Later exposure of the second response to delayed reinforcement was therefore less effective because little further conditioning remained possible.

A total application of the model to the present data is not without difficulty, however. One major obstacle is to define "asymptotic response strength possible" in an operant situation. A variety of data (cf. Herrnstein, 1970) indicates that particular rates of reinforcement are not uniquely associated with particular asymptotic response levels, but rather the asymptotic level is a relative function of the "context of reinforcement". A related problem concerns the assessment of the "sum of the existing response strengths". For two subjects in Experiment 1, the interpolated response occurred regularly and yet was accompanied by substantial changes in the probability of the initial response. To the extent that conditioning of the first response was attributable to the degree of response strength of the second response, therefore, some assessment of that response strength, other than rate of pecking, must be used. Similarly, in Experiment 2, differences in the degree of blocking (Condition 1 versus Condition 2) occurred in spite of little difference in the degree of interpolated responding in the delay-of-reinforcement interval. Instead, blocking was a function of whether the interspersed response was reinforced on separate trials where the initial response was not available. Such results argue for some intervening concept such as "level of associability", rather than tying an analysis of the blocking effect to direct measures of response rate.

Regardless of the success of deriving the present results from a general model of learning, the experiments described above have implications for several areas of research. One is the study of delay of reinforcement. The degree of response strengthening was not an absolute function of temporal proximity, but of the occurrence of other responses during the delay interval. A possible implication is that no invariant delay-of-reinforcement gradient exists. The effect of a particular delay may instead depend on the nature of the situation, i.e., upon the likelihood of intervening responses developing during the delay interval. Parameters such as size of the experimental chamber, illumination, etc., should be

with the corresponding tandem controls, the different behavior produced by the two schedules has been interpreted in terms of the roles of conditioned reinforcement and stimulus control. The present results imply that blocking is another factor involved in the comparison. Namely, the reinforcement of responding in early links of the chained schedule may be blocked because the intervening later links are highly associated with reinforcement, thereby explaining the general failure to maintain responding in the initial link with chained schedules having several components.
expected to be of critical significance.
Several findings in the delay-of-reinforce- REFERENCES
ment literature support the present treat- Baum, W. The correlation-based law of effect. Journal
ment. One common result is that the acquisi- of the Experimental Analysis of Behavior, 1973, 20,
tion function with delayed reinforcement is 137-153.
nonmonotonic (cf. Logan, 1960). Such a result Bindra, D. A unified account of classical conditioning
could be accounted for if, as conditioning and operant training. In A. Black and W. Prokasy
(Eds.), Classical conditioning II: current theory and
progresses, the strength of intervening behav- research. New York: Appleton-Century-Crofts, 1972.
ior increases. More direct evidence comes Pp. 453-481.
from a recent study of T-maze learning in rats Bloomfield, T. M. Reinforcement schedules: contin-
(Lett, 1975), which found that learning oc- gency or contiguity? In R. Gilbert and J. Millenson
(Eds.), Reinforcement: behavioral analysis. New
curred even with long delays between the York: Academic Press, 1972. Pp. 165-208.
response and reinforcer only if the rats were Brown, P. L. and Jenkins, H. M. Auto-shaping of the
removed from the apparatus during the delay. pigeon's key-peck. Journal of the Experimental
In addition, learning was retarded as a func- Analysis of Behavior, 1968, 11, 1-8.
tion of the time the subjects were left in the Herrnstein, R. J. Superstition: a corollary of the prin-
ciples of operant conditioning. In W. Honig (Ed.),
apparatus before removal. Operant behavior: areas of research and application.
The preceding analysis seems in conflict New York: Appleton-Century-Crofts, 1966. Pp. 33-51.
with at least one study of delayed reinforce- Herrnstein, R. J. On the law of effect. Journal of the
ment effects. Using a concurrent-chain proce- Experimental Analysis of Behavior, 1970, 13, 243-
266.
dure, Neuringer (1969) compared the effects Honig, W. K. Attention and the modulation of stimu-
of delayed reinforcement versus fixed-interval lus control. In D. Mostofsky (Ed.), Attention: con-
reinforcement where the two types of delay temporary theory and analysis. New York: Appleton-
intervals were equated. According to the Century-Crofts, 1970. Pp. 193-238.
present analysis, one should presumably ex- Jenkins, H. and Harrison, R. Effect of discrimination
training on auditory generalization. Journal of Ex-
pect that the addition of the fixed-interval perimental Psychology, 1960, 59, 246-253.
response requirement would reduCe prefer- Kamin, L. J. "Attention-like" processes in classical
ence. To the contrary, Neuringer's subjects conditioning. In M. Jones (Ed.), Miami symposium
were indifferent. While Neuringer's delay val- on the prediction of behavior: aversive stimulation.
Miami: University of Miami Press, 1968. Pp. 9-31.
ues and other procedural parameters were Kamin, L. J. Predictability, surprise, attention, and
perhaps too different to warrant comparison conditioning. In R. Church and B. Campbell (Eds.),
with the present study, one resolution of the Punishment and aversive behavior. New York: Ap-
discrepancy is that the effective variable in pleton-Century-Crofts, 1969. Pp. 279-296.
Lett, B. T. Long delay learning in the T-maze. Learn-
Neuringer's study was not delay of reinforce- ing and Motivation, 1975, 6, 80-90.
ment but the relative conditioned reinforce- Moore, B. R. The role of directed Pavlovian reactions
ment values of the stimulus change cueing the in simple instrumental learning in the pigeon. In R.
entry into the second link. Hinde and J. S. Hinde (Eds.), Constraints on learn-
The present results also bear on the analy- ing. New York: Academic Press, 1973. Pp. 159-188.
Logan, F. A. Incentive. New Haven: Yale University
sis of chained schedules of reinforcement. Press, 1960.
When chained schedules have been compared Neuringer, A. J. Delayed reinforcement versus rein-
BLOCKING OF REINFORCEMENT 225
forcement after a fixed interval. Journal of the Ex- Skinner, B. F. "Superstition" in the pigeon. Journal
perimental Analysis of Behavior, 1969, 12, 375-383. of Experimental Psychology, 1948, 38, 168-172.
Rescorla, R. and Wagner, A. A theory of Pavlovian Williams, B. A. The failure of stimulus control after
conditioning: variations in the effectiveness of rein- presence-absence discrimination of click-rate. Jour-
forcement and nonreinforcement. In A. Black and nal of the Experimental Analysis of Behavior, 1973,
W. Prokasy (Eds.), Classical conditioning II: current 20, 23-27.
theory and research. New York: Appleton-Century-
Crofts, 1972. Pp. 64-99.
Reynolds, G. S. Attention in the pigeon. Journal of Received 28 December 1974.
the Experimental Analysis of Behavior, 1961, 4, (Final Acceptance 14 May 1975.)
203-208.
