Premack Principle: The Relativity of Reinforcement

The usual way reinforcement is described emphasizes its stimulus characteristics and their potentiating effects on behavior, but reinforcement can also be analyzed in terms of the potentiating effects that responses have on other responses. David Premack (1965), who performed a number of experiments supporting this sort of analysis, has demonstrated that behavior occurring at a high frequency or probability tends to reinforce behavior occurring at a lower frequency or probability. According to this perspective, whether any particular behavior acts as a reinforcer (or punisher) depends on its probability relative to the behavior it follows. Premack states this relationship in terms of response probability: "For any pair of responses, the independently more probable one will reinforce the less probable one" (1962:255).

If one distributes the dog's behavior on a hierarchy or continuum ranging from low to high response probability, then, according to the Premack principle, behaviors ranked higher up on the hierarchy of probability will tend to positively reinforce ones ranked lower down. Alternatively, if a response occurring lower on the probability hierarchy follows one ranked higher up, the relationship is punitive; that is, the higher-ranked antecedent response will be rendered less probable by the lower-ranked consequence. Therefore, reinforcers and punishers are relative and dependent on a dog's transient behavioral tendencies and motivational states.
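The ranking rule just described can be sketched in a few lines of Python. This is only an illustration of the comparison involved, not a model drawn from Premack's work: the function name premack_effect, the behavior labels, and the probability values are all hypothetical, and treating equal probabilities as "neutral" is an assumption, since the principle as stated covers only unequal pairs.

```python
def premack_effect(probabilities, antecedent, consequence):
    """Predict how `consequence` affects the `antecedent` response it follows.

    `probabilities` maps each behavior to its current (momentary) probability.
    """
    p_antecedent = probabilities[antecedent]
    p_consequence = probabilities[consequence]
    if p_consequence > p_antecedent:
        # A more probable response follows a less probable one.
        return "reinforcing"
    if p_consequence < p_antecedent:
        # A less probable response follows a more probable one.
        return "punitive"
    # Assumption: the principle makes no prediction for equal probabilities.
    return "neutral"


# Hypothetical momentary probabilities for one dog (illustrative values only).
probabilities = {"down-stay": 0.1, "heel": 0.3, "fetch ball": 0.6}

print(premack_effect(probabilities, "down-stay", "heel"))
print(premack_effect(probabilities, "fetch ball", "down-stay"))
```

Because the probabilities are passed in rather than fixed, the same pair of behaviors can yield a different prediction whenever the dog's motivational state, and hence the ranking, changes.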

Instead of conceptualizing the reinforcing event as a stimulus, Premack describes it in terms of an indivisible S-R composite. For example, a biscuit for the hungry dog is both a stimulus (something to be eaten) and a response (the act of eating it). These observations emphasize an important difference between instrumental and classical conditioning. Responses reinforce responses in the case of instrumental learning, whereas stimuli reinforce stimuli in classical conditioning.

A significant factor in the foregoing paradigm of reinforcement is the role of response-activity deprivation (Timberlake and Allison, 1974). Any behavior can be made more valuable and, therefore, more probable by depriving the animal of access to it. Similarly, any behavior can be made less valuable and punitive by satiating the animal with it, thus making it less probable. Further, the value of any given reward is dynamic and dependent on the animal's changing sensory needs and the attainment of what Wyrwicka (1975) has described as a better state of being.

During an ordinary training session, the dog is going to prefer performing some exercises more than others. Determining at any moment what the dog would prefer to do and then providing access to that activity on a contingent basis is a sound and efficient application of the Premack principle. For instance, having a dog heel out of a down-stay is a reinforcing consequence for staying, regardless of what else is done to strengthen the down-stay exercise. Although there appears to be a natural inclination for active exercises to reinforce stationary ones, this is not always the case. For example, if a dog is made to heel for a long period without stopping, the dog's inclination to sit or lie down will gradually become stronger than its inclination to continue heeling. When the dog is finally permitted to sit or lie down, the opportunity to rest will tend to reinforce the previous heeling pattern.

Another example involves the trained habit of coming when called from the sit-stay. Most dogs find coming when called preferable to staying still in the sit position. Consequently, even in the absence of other rewards, the sit-stay is reinforced when the dog is called by its handler. However, having the dog come and then sit-front may have a contrary effect. In this situation, the dog moves from a highly reinforcing activity (coming) into a less reinforcing one (sitting). The overall effect is mildly punitive. Therefore, at the outset of recall training, it is better not to require that dogs perform the customary sit-front each time they come. Instead, it would be consistent with the Premack principle to follow the performance of coming with an even more exciting and reinforcing opportunity, for example, an immediate opportunity to play ball. An alternative approach would be to train the sit-front to a high degree of proficiency, thus making it highly probable, and then to chain the less probable recall response to it.

