Whether a given stimulus event is interpreted by a dog as a punitive one or a rewarding one depends on the dog's moment-to-moment motivational state and learning history. As previously discussed in Chapter 7, giving a fully satiated dog a treat may actually function punitively—that is, the dog may experience the ingestion of food when not hungry as an aversive event. Similarly, a dog that has been exercised to the point of exhaustion will view an opportunity to play very differently than at some other time when the dog is well rested. In general, the provision of anything that the dog would rather be doing at any given moment may function as a reward. On the other hand, anything that the dog would rather not be doing at any given moment might be used as an effective punisher. This general motivational interpretation of reward and punishment has been elegantly described by Premack (1962).
It is useful to interpret ongoing behavior in terms of a field of learned expectations and controlling signs. Dogs make fine predictions from moment to moment based on past experiences, including the identification of signs anticipating future events. In the words of Tolman (1934), "A conditioned reflex, when learned, is an acquired expectation-set on the part of the animal that the feature of the field corresponding to the conditioned stimulus will lead, if the animal but waits, to the feature of the field corresponding to the unconditioned stimulus" (1934:393). A common example of classical conditioning in dog training involves the bridging stimulus. Consistently saying "Good" just before giving the dog a piece of food teaches the dog to expect a treat on each occasion it hears the vocal signal. What happens, however, if the vocal signal "Good" is presented independently of the presentation of food—that is, when it is randomly paired or not paired so that the animal cannot predict the actual outcome on any given trial?
Rescorla's (1968, 1988) laboratory findings indicate that if an animal is exposed to random shocks that are signaled only 50% of the time by a tone stimulus but unsignaled the remaining 50% of the time, the tone will fail to develop as a CS; that is, the animal will fail to respond to the tone as a predictive signal for the occurrence or nonoccurrence of the US (shock). Such stimulus neutrality occurs in spite of many positive pairings between the tone and the US, since the positive pairings are offset by an equal number of US events occurring in the absence of the tone. In this case, the tone equally fails to predict the absence or the presence of the US; that is, it occurs independently of the US. Rescorla's studies indicate that the animal forms an expectancy derived from the contingency, or probabilistic relationship, between the occurrence and nonoccurrence of the CS and the US (see Chapter 6). Furthermore, in addition to making predictions about the probable occurrence of the US, the dog also makes predictions about its size and quality. In this regard, associative expectancies between the CS and US yield three general possibilities:
1. The CS exactly predicts the size and quality of the US (no new learning results).
2. The CS underpredicts the size and quality of the US (acquisition).
3. The CS overpredicts the size and quality of the US (extinction).
In terms of conditioned reinforcement, these various relationships between conditioned and unconditioned stimuli result in the following outcomes: (1) If the word signals "Good" or "No" are always followed by the same amount of unconditioned stimulation (the same reward or aversive event), then no new learning takes place (i.e., the strength of the Srs "Good" and "No" remains the same). (2) If the word signals "Good" or "No" are sometimes followed by a larger-than-expected reward (e.g., a bonus) or an unexpected punisher, then additional associative conditioning takes place. Such stimulus learning is facilitated under conditions of appetitive surprise (Blanchard and Honig, 1976) or aversive startle (Kamin, 1968). (3) If the CS overpredicts the size of the reward or punisher, then extinction occurs. For instance, if dogs have learned to expect a piece of steak each time they hear the word signal "Good" and are then given a biscuit instead, they will quickly adjust their expectations to reflect the disappointment. In the case of punishment, if dogs have learned to expect aversive punishment every time they hear the word signal "No" while engaging in some unwanted behavior but are then exposed to a series of mild physical prompts instead, the fearful emotional and avoidance responding previously controlled by the reprimand will undergo extinction.
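The two ideas discussed above, contingency and expectation mismatch, can be illustrated with a small numeric sketch. It is only an illustration under stated assumptions, not a model taken from the studies cited: the function names, the numeric reward values assigned to steak and biscuit, and the learning rate are all hypothetical, and the update rule is a generic error-correction rule chosen for simplicity.

```python
def contingency(p_us_given_cs: float, p_us_given_no_cs: float) -> float:
    """Difference between the probability of the US with and without the CS.

    A value of zero means the CS carries no information about the US.
    """
    return p_us_given_cs - p_us_given_no_cs


def classify_trial(expected: float, actual: float) -> str:
    """Name the learning outcome implied by the expectation mismatch."""
    error = actual - expected
    if error > 0:
        return "acquisition"      # the CS underpredicted the US
    if error < 0:
        return "extinction"       # the CS overpredicted the US
    return "no new learning"      # the CS predicted the US exactly


def update_expectation(expected: float, actual: float, rate: float = 0.3) -> float:
    """Move the expectation part of the way toward the actual outcome.

    The rate (an assumed value) controls how quickly expectations adjust.
    """
    return expected + rate * (actual - expected)


# Shock is equally likely with or without the tone: the tone predicts nothing.
print(contingency(0.5, 0.5))

# Hypothetical reward values: steak = 1.0, biscuit = 0.4.
# The dog expects steak after "Good" but now receives a biscuit each time.
expectation = 1.0
for trial in range(1, 6):
    outcome = classify_trial(expectation, 0.4)
    expectation = update_expectation(expectation, 0.4)
    print(trial, outcome, round(expectation, 3))
```

In this sketch the expectation attached to "Good" drifts downward toward the biscuit's value over repeated trials, and each trial is classified as extinction until the expectation and the outcome match. Once they match, further trials produce no change.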
During the training process, dogs form predictions and expectations about the outcomes associated with their behavior. Extrapolating from the foregoing analysis of classical conditioning to instrumental learning, if a dog receives a reward that is significantly smaller than expected, the outcome is perceived as punitive (disappointment), and the trial leaves the response weaker. If, on the other hand, the reward exactly matches the dog's expectations, then the instrumental response that resulted in reward is rendered neither stronger nor weaker than it was before reinforcement. A reinforcer that does not result in additional learning (acquisition or extinction) might aptly be termed a verifier, serving to confirm the status quo but not resulting in any new learning. This general theory suggests that a third instrumental outcome exists in learning besides rewards and punishers (i.e., verifying events that function to maintain behavior at the same level of probability). For new instrumental learning to take place, the reward must exceed a dog's expectation; that is, additional positive learning depends on a surprise element. According to this viewpoint, instrumental behavior is strengthened only to the extent that the reward actually obtained exceeds the dog's predictions about the reward's size, quality, or context.
Similarly, in the case of punishment, an aversive event that exactly matches a dog's expectations should not alter or weaken the behavior that the aversive event follows; such a well-predicted event serves only to verify the status quo. That the dog anticipates the aversive outcome and still performs the targeted behavior at a steady rate is empirical evidence for such an interpretation. However, if the punitive event exceeds the dog's prediction, then a corresponding degree of suppression will occur. Finally, if the punitive event is less than the dog has predicted, one would likely observe extinction of punishment effects.
Several general outcomes can be anticipated from the reciprocal relationship between the probability of punishment and its intensity:
1. If the probability of punishment is high but intensity low, the degree of suppression will be correspondingly mitigated.
2. If the probability of punishment is low but the intensity high, suppression should likewise decline over time. [This case finds some trouble when compared with findings from traumatic escape-avoidance experiments (Solomon et al., 1953) and one-trial learning events. Avoidance learning is typically very resistant to extinction.]
3. The highest degree of suppression occurs when both the intensity of punishment and its probability of occurrence are high.
4. The lowest degree of suppression occurs when both the intensity of punishment and its probability of occurrence are low.
When the effects of expectancy are factored into the foregoing cases, the following additional predictions are obtained:
5. If the expectation of punishment is matched exactly with the aversive event's actual probability and intensity, no additional suppression will occur.
6. If the expectation of punishment underestimates the aversive event's probability or intensity, then additional suppression will occur.
7. If the expectation of punishment overestimates the aversive event's probability or intensity, then the degree of suppression controlled by punishment will be correspondingly attenuated.
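The seven cases listed above can be summarized in a short sketch. It rests on a strong simplifying assumption that is not stated in the text: that probability and intensity combine multiplicatively into a single aversive value, and that the change in suppression follows the difference between the actual and the predicted value. The names and the 0-to-1 scales are hypothetical.

```python
from typing import Tuple

# (probability, intensity), each on an assumed 0..1 scale
Punishment = Tuple[float, float]


def aversive_value(punishment: Punishment) -> float:
    """Combine probability and intensity into one value (assumed: a product)."""
    probability, intensity = punishment
    return probability * intensity


def suppression_shift(predicted: Punishment, actual: Punishment) -> float:
    """Positive when punishment exceeds the prediction, negative when it falls short."""
    return aversive_value(actual) - aversive_value(predicted)


def describe(shift: float, tolerance: float = 1e-9) -> str:
    """Translate the shift into the outcome named in cases 5 to 7."""
    if shift > tolerance:
        return "additional suppression"
    if shift < -tolerance:
        return "attenuated suppression"
    return "no additional suppression"


# Cases 1 to 4: compare the combined value across probability/intensity mixes.
for label, punishment in [
    ("high probability, low intensity", (0.9, 0.2)),
    ("low probability, high intensity", (0.1, 0.9)),
    ("high probability, high intensity", (0.9, 0.9)),
    ("low probability, low intensity", (0.1, 0.2)),
]:
    print(label, round(aversive_value(punishment), 3))

# Cases 5 to 7: compare the prediction with what actually happens.
print(describe(suppression_shift((0.5, 0.5), (0.5, 0.5))))  # matched expectation
print(describe(suppression_shift((0.5, 0.5), (0.5, 0.9))))  # intensity underestimated
print(describe(suppression_shift((0.9, 0.9), (0.5, 0.9))))  # probability overestimated
```

Under the product assumption, an underestimate on one dimension can be offset by an overestimate on the other, so the sketch cannot represent every situation covered by cases 6 and 7. It also does not capture the resistance to extinction noted for traumatic avoidance learning in case 2.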