habit slips - definition
unconscious intrusions of a habit when an alternative behavior had been consciously intended
habit slips - example
- - putting sugar in your cereal when you were intending to cut back
- - driving past an intended stop and only realizing when you get home that you never got gas
operant response - definition
label coined by BF Skinner indicating that the subject's response operates on the environment to produce a certain outcome and the consequences of the response can modify future responses; once conditioned, these responses can be extinguished
operant response - example
- - bar press-to-food should lead to an increase in bar pressing (positive reinforcement)
- - running a red light and getting hit should reduce red light running in the future (punishment)
discrete trial - definition
instrumental learning technique in which the subject is given separate occasions during which the response may be performed and in which the beginning and the end of the trial are known; used by Thorndike
discrete trial - example
maze, puzzle box, runway, two-goal box
intervention - definition
action taken to cause a change in behavior, cognition, and/or emotional state
intervention - example
- - types of reinforcement given for a behavior: positive, negative, omission and punishment
- - giving 3 M&M's to a child for making it to the bathroom on time
learned helplessness - definition
learning that there is an explicit lack of contingency between responses and an aversive outcome: there is no response that is causing punishment, nor is there one that prevents it
learned helplessness - example
an individual whose actions have no influence over the outcome will become passive and do nothing the next time they are in the same (or similar) situation - two separate groups of students were exposed to loud tone pulses and told they had the ability to stop the tones; the learned-helplessness group's button to turn off the tone did not work, and in the second phase of the study this group was slower to turn off the tone when they were given the real ability to stop it
primary reinforcer - definition
innate reinforcer; reduces biological needs of the organism
primary reinforcer - example
food, water, relief from excessive heat, cold or pain
negative contrast - definition
current rf is smaller or nonexistent in contrast with the previous rf, which alters the organism's level of responding
negative contrast - example
switching from a larger to a smaller reward, the rats run slower than they did when the reward was larger
social reinforcement - definition
powerful class of reinforcers for human behavior in which praise, attention, physical contact and/or facial expressions are used
social reinforcement - example
teacher praising an otherwise uncooperative/defiant student for completing her math problems increases the student's performance on her math problems
nonreward - definition
contingencies in which the response is not followed by a positive reinforcer - extinction
nonreward - example
in lab 3, bar presses by Faraday ceased to produce her expected rf of water, so her bar presses decreased, then disappeared (apart from an extinction burst at the very end)
shaping - definition
method to train a behavior that is not presently in the organism's repertoire - start by reinforcing a response that is performed and that approximates the desired behavior; once this response occurs at a higher frequency, we reinforce certain deviations in the direction of the target behavior; step by step, through successive approximations, the reinforced response is gradually shifted away from responses that will no longer be reinforced and toward the target behavior
shaping - example
in lab 2, Faraday was reinforced for going to the corner with the bar in it, then for being by the bar, then for touching the bar, then for actually pressing the bar; eventually she was only reinforced for bar presses
chaining - definition
form of instrumental conditioning in which reinforcement occurs only after the final response in the sequence
chaining - example
in lab 5, Faraday learned to hit the lever which turned the light on, then press the bar by the lights which activated the dipper which then gave her a reinforcement of water; if she did any of these responses out of order, she didn't receive any water
antecedents - definition
any behavior or occurrence that precedes a behavior/event
antecedents - example
organization of all supplies needed before beginning to study for a test (books, pens, notes, snacks, etc...)
avoidance-avoidance - definition
an organism is faced with two choices/goals, both of which are negative; the conflict arises because the organism must choose between the two negative options; movement away from one goal is countered by an increase in the repellence of the other goal, so the individual returns to the point where he was at the beginning of the conflict
avoidance-avoidance - example
I can choose to study for my nursing test or write an APA paper for my nursing class
frustration hypothesis - definition
the frustrating aftereffects of nonreward become associated with the subsequent occurrence of reward; because frustration experienced after nonreward on one trial is followed by reward on the next trial, frustration becomes a discriminative stimulus for reward; this is the basis of the PREE
frustration hypothesis - example
in PREE trials, a rat is only partially reinforced for bar presses; the bar-press behavior increases along with the emotional frustration of not being reinforced, and when a reinforcement finally arrives, it follows that frustration, so frustration becomes a signal that reward is coming; the rat therefore continues to bar press through stretches with no water reinforcement in expectation of the next water reinforcement
S-R learning - definition
a discriminative stimulus leads to an instrumental response; stimulus-response conditioning; predicts that the discriminative stimuli come to elicit the previously reinforced instrumental responses; association of a new stimulus with a pre-existing stimulus; the response seems to have become separated from its reinforcing consequences and has become an automatic reaction to the stimulus; the instrumental response sometimes persists even though reinforcement is freely available and the response is no longer needed to obtain reward
S-R learning - example
advertising strategies: a customer didn't previously believe they needed a certain pair of shoes or jeans, but when they see them on a mannequin or model in a window, they now believe they NEED that item of clothing
Edward Thorndike began exploring the concept of learning by studying animals, specifically cats. He developed a theory called trial-and-error learning. Explain his theory including the process he used to demonstrate his theory of learning. Include explanations for the law of effect and the law of exercise. Why was this type of learning included in the category of instrumental response learning and stimulus-response learning?
- - Trial-and-error learning: Now known as instrumental conditioning; Sought to systematize the principles involved in the development of adaptive behavior; The organism tries many behaviors at first, and then the ineffective responses cease over time and the effective responses increase
- - Three elements: Discriminative stimulus: environmental stimuli present at the time the response occurs; Response: organism’s action based on the environmental (discriminative) stimuli; Consequence: the result of the organism’s action
- - Cats in a puzzle box: Placed a cat in a wooden crate with a hinged door and a trip mechanism somewhere in the box; A disguised mechanism would open the door so the cat could escape; By using responses not already in the cat’s repertoire, he was able to study how a new response developed with practice; Each time the cat tripped the mechanism and escaped, he would place the cat back into the box right away and time each successive escape - the faster the cat was, the more efficient the learning had been
- - Law of effect: Associations were strengthened when the behavior resulted in the goal; Responses that produced a satisfying consequence became connected to the situation and become more likely to occur during the next trials, while responses that produced unsatisfying consequences dropped out and disappeared over time; Statement of the principle of reinforcement: behavior, in its form, timing, and probability of occurrence, is modified by the consequences of the behavior
- - Law of exercise: Associative shifting: after you have learned something, you don’t really have to think about the actions involved each time you perform them, you just do them; “Use it or lose it”: the more practice, the stronger the connection, and the less practice, the less likely the connection will be maintained; True for many cases, but the ability to ride a bike after decades of not riding one is a situation where this doesn’t hold true
- - Why instrumental response and stimulus-response learning?: Thorndike’s trial-and-error learning was the foundation for S-R learning, in which the discriminative stimulus is connected to the instrumental response and reinforcement is what conditions or strengthens the S-R connection; Because of Thorndike’s laws of effect and exercise, S-R learning highlighted the difference between mechanistic and insight-based learning - Thorndike believed in mechanistic learning, and his trial-and-error trials demonstrate that
Based upon the theories of B.F. Skinner, particularly as he explained them in his book “Walden Two,” a community of people began “Twin Oaks.” Explain the principles used to establish this community. Provide examples of how the principles were put into practice in the community. How do they relate to Skinner’s theories of learning and rewards? What did Skinner think about Twin Oaks? Why?
- - Twin Oaks principles: Skinner’s behaviorism; Based on community sharing, no violence, no aggression, no jealousy, no competition; Everything based on reinforcement as primary mover of human behavior; Basic values: cooperation, egalitarianism, income-sharing and non-violence
- - How the principles were put into practice: Everyone works 40 or so hours a week; Children are raised by Metas in a children’s house modeled after Israeli kibbutzim; Labor-credit work system; Walden Two Planner-Management system
- - Relations to Skinner’s theories of learning and rewards: The Twin Oaks community used fixed ratio reinforcement schedules, whereas Skinner believed variable ratio schedules were more effective
- - Skinner’s opinion of Twin Oaks and why: Liked Twin Oaks, but they didn’t follow his behaviorism as much as he would have liked; Wouldn’t have set up the government the way they did - they used one person to delegate jobs, but that person was never really in charge, just there by default; Was impressed by how self-sufficient they were; Would have liked to see variable ratio reinforcement instead of the fixed ratio (which seemed to work better for the people in this community)
Terry argues, as do other learning theorists, that “instrumental conditioning” and “operant learning” are different. Terry suggests that the difference is important but not recognizable outside the field. Explain the differences between instrumental and operant learning. Why do you think these two theories are often seen as the same?
- - Instrumental conditioning: Tends to adopt a particular form of theorizing in its attempts to explain learning, often postulating theoretical constructs; Discrete Trials: the subject is given separate occasions during which the response may be performed; the beginning and the end of the trial is known; Averages out individual variations in performance by using groups of subjects
- - Operant learning: Strictly functional approach: the frequency of responding is a function of the amount of reinforcement, or of its delay, or its schedule, and so on; Continuous availability to the response; Seeks to demonstrate lawful relationships in a single subject
- - Why are they seen as the same?: They both use reinforcement for behaviors; Behavior occurs because of the consequence it produces; The response is always voluntary and under conscious control; The behavior is goal-directed; So without knowledge of the discrete trial and how the information is interpreted, they look the same
Clark Hull developed the drive reduction theory of learning. Explain his theory and how it can help us understand what it takes for people to change. Make sure you include the definitions for D, K, H, V, and I in your explanation.
- - Drive reduction theory of learning: Reinforcers are stimuli that reduce drives based on biological needs; Once the biological need is satiated, the drive is reduced (though it is only temporary); The drive reduction serves as a reinforcer for learning; Uses a combined influence of various factors
- - Response = D x H x K x V - I
- - D: drive, level of motivation
- - K: incentive motivation, quantity or quality of goal
- - H: habit, past response (SUR = innate and SHR = acquired)
- - V: stimulus intensity, salience
- - I: inhibition, fatigue level
- - How can this theory help us understand what it takes for people to change?: People change because of a need to change, and when that need is biological, their motivational factor is intensified. However, as people’s drives are satisfied, the reinforcements become less effective until there is a deprivation state again. The higher each factor, the greater the response.
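The combined-influence idea in Hull's formula can be sketched numerically. The function and the sample values below are illustrative assumptions, not figures from Hull or the course; the point is that because the excitatory factors multiply, any one of them at zero wipes out the response.

```python
# Illustrative sketch of Hull's drive-reduction formula from the notes:
# Response = D x H x K x V - I. All names and values here are hypothetical.

def response_strength(drive, habit, incentive, stimulus_intensity, inhibition):
    """Combine Hull's factors into a single response-strength estimate.

    drive (D): level of motivation
    habit (H): strength of the learned S-R connection
    incentive (K): quantity or quality of the goal
    stimulus_intensity (V): salience of the stimulus
    inhibition (I): fatigue, subtracted from the excitatory product
    """
    return drive * habit * incentive * stimulus_intensity - inhibition

# No drive (e.g., the organism is fully satiated): the multiplicative
# product is zero, so no amount of habit or incentive produces a response.
print(response_strength(0.0, 0.9, 0.8, 0.7, 0.1))  # negative: no responding
print(response_strength(0.9, 0.9, 0.8, 0.7, 0.1))  # positive: responding occurs
```

This mirrors the point in the notes that reinforcement loses its effectiveness once the drive is satisfied and only works again after a deprivation state returns.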
Explain the basic concepts of Skinner’s reward or reinforcement theory including the four basic types of reinforcements, explanations for each, and examples. Provide information regarding what type of reinforcement works most efficiently and why? What type of reinforcement is least effective and why?
- - Reward/reinforcement theory: All behavior occurs because of the type of reinforcement it receives (positive, negative, omission training and/or punishment), along with the schedule of reinforcement. Reinforcers are pleasant/appetitive or unpleasant/aversive, and the four types vary based on whether the reinforcer is added or subtracted.
- - Positive: Increases behavior; Most effective: because positivity lasts longer and makes us feel better about our behaviors and it doesn’t cause detrimental side effects; Example: Child receives M&M’s each time she makes it to the bathroom on time
- - Negative: Behavior stops aversive stimulus; aversive stimulus is removed because of the behavior; Escape: a behavior can stop a continuous, aversive stimulus; Avoidance: a behavior prevents the occurrence of an aversive stimulus; Example: escape: click your seatbelt to turn off the annoying buzzer; avoidance: click your seatbelt before the annoying buzzer has a chance to start
- - Omission training: The behavior prevents the delivery of a pleasant stimulus; the pleasant stimulus is taken away to decrease the response; Example: time out away from toys and attention from others
- - Punishment: Presenting an aversive (usually physical) stimulus to decrease a response; Least effective: because the punishment has to be very severe, immediate and consistent; its side effects are usually not worth the actual behavioral outcomes and can actually cause the unwanted behavior to increase rather than decrease. Also, if the behavior was violent in some way, using punishment (violence) for violence can be confusing and contradictory; Example: spanking a child after they have hit another child
Part of Skinner’s theory of learning has to do with schedules of reinforcement. Define the 5 types. Give an example of each type. When would each schedule be used most effectively and why?
- - Continuous: reinforcement for each instance of a response; Effectiveness: at the onset of training, continuous reinforcement usually produces more rapid conditioning but over time continuous reinforcement doesn’t encourage the continuation of the behavior if the reinforcement is taken away so extinction is very quick; Example: M&M each time the child goes to potty in the toilet might cause the child to only use the toilet when there is a guarantee of candy
- - Fixed ratio: a reinforcement is given after a certain number of responses is performed; Effectiveness: leads to high response rates and is based solely on the participant’s efforts; however, there are pauses after the delivery of the reinforcement before responding resumes; Example: for every 5 bar presses, Faraday receives a rf of water
- - Variable ratio: a reinforcement is delivered after an average number of performances; Effectiveness: not highly effective at the beginning of conditioning but it encourages ongoing strong responses later on; Example: sales associates’ attempts to help customers are sometimes rewarded with sales. Which customer will buy may be unpredictable, but more attempts should produce more sales.
- - Fixed interval: reinforcement is delivered after a set time; Effectiveness: this produces a gradual increase in performance; Example: the bar press won’t produce any water for 60 seconds when the light is off, no matter how many times Faraday presses the bar, so she waits for the lights to come on before she presses the bar again
- - Variable interval: reinforcement is delivered on an average time schedule; Effectiveness: not highly effective at the beginning of conditioning but it encourages ongoing strong responses later on; Example: checking Facebook for updates - the updates may arrive unpredictably, but the recipient won’t know unless she checks for them
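The difference between fixed-ratio and variable-ratio delivery above can be sketched in a few lines. The function names and the FR-5/VR-5 parameters are illustrative assumptions; the key contrast is that the fixed schedule pays out predictably while the variable schedule only matches the ratio on average.

```python
import random

# Minimal sketch of ratio schedules of reinforcement (hypothetical names/values).

def fixed_ratio(n, presses):
    """FR-n: reinforce every n-th response, so reinforcers = presses // n."""
    return presses // n

def variable_ratio(n, presses, rng):
    """VR-n: reinforce each response with probability 1/n, so the *average*
    requirement is n responses per reinforcer, but any given payout is
    unpredictable."""
    return sum(1 for _ in range(presses) if rng.random() < 1 / n)

rng = random.Random(0)
print(fixed_ratio(5, 100))          # exactly 20 reinforcers
print(variable_ratio(5, 100, rng))  # roughly 20, but it varies run to run
```

The unpredictability of the variable schedule is what makes extinction harder to detect, which is why it sustains responding better once conditioning is established.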
Human behavior is complex which makes explaining and changing it challenging. Explain the concepts of multiple schedules of reinforcement including concurrent schedules, under-matching, overmatching, tandem schedules, and chaining. Give examples of how a person might be responding to more than one schedule of reinforcement at a time. How do people make decisions about what should be done now or what should be done later based upon these concepts?
- - Concurrent schedules: two or more responses reinforced on different schedules --> short-delay reward = cake while long-delay reward = healthy teeth; the short-delay reward becomes less appealing the longer the person waits
- - Under-matching: the proportion of responding for a stimulus is less than the actual reinforcement proportion available for it; reinforcement is still left over because the organism doesn’t work enough to receive all the rf possible; Example: a person works out only 30 minutes every other day, and not very hard, when they know that working out 45 minutes every day with more intensity would give better, faster results
- - Overmatching: the response proportions are greater than the available reinforcement proportions; the organism works too hard given the available rf; Example: I could have printed off a calendar and used it to keep track of my behavior project; instead I spent an hour and a half scrap-booking a calendar to keep track of my behavior project
- - Chaining: two or more cues presented successively - responding to cue #1 results in the presentation of cue #2, and responding to cue #2 results in a rf; Tandem schedules: no external stimuli are used as cues and there is no discrete ending to one event; the first sequence gives information about the next sequence; tends to work more effectively than chaining alone because the end of the trial is not as evident - Faraday hits the lever, which turns the light on, which means she knows to press the bar, and bar pressing will produce rf; Example: in soccer, you have the ball and know that your goal is to dribble it down to the goal and score, and your score is reinforced by the cheering of your teammates, coach and crowd, but there was no external stimulus cueing you to dribble, shoot, score
- - How might a person be responding to more than one schedule of reinforcement at a time?: Multiple schedules can be seen in the everyday life of a college student: when this class is over, I get to eat, but when this class is over, I get to go home; when I get in my car, I can listen to music; when I get home, I can do homework; when my alarm goes off, I get up, but when it goes off again, it means I have to leave for class
- - How do people make decisions about what should be done now or what should be done later?: We make decisions by prioritizing - sometimes our priorities are based on immediate gratification (cake now) or they are based on long-term benefits (work out now, no cake and better abs)
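The under-matching and overmatching ideas above have a standard quantitative form, the generalized matching law; this formulation is an outside assumption, not something given in these notes. A sensitivity exponent below 1 produces under-matching (responding less extreme than the reinforcement ratio) and above 1 produces overmatching.

```python
# Sketch of the generalized matching law (assumed formulation):
# B1/B2 = b * (R1/R2)**s, where s is sensitivity and b is bias.

def response_ratio(r1, r2, sensitivity=1.0, bias=1.0):
    """Predicted ratio of responding B1/B2 on two concurrent schedules,
    given reinforcement rates r1 and r2."""
    return bias * (r1 / r2) ** sensitivity

# Reinforcement is three times richer on option 1:
print(response_ratio(3, 1, sensitivity=1.0))  # strict matching: 3.0
print(response_ratio(3, 1, sensitivity=0.8))  # under-matching: below 3
print(response_ratio(3, 1, sensitivity=1.2))  # overmatching: above 3
```

In the workout example, a sensitivity below 1 captures someone who responds less than the available reinforcement warrants; the scrap-booked calendar is the over-responding case.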
Explain the Premack Principle using the information presented in class. What does this principle have to do with the effectiveness of reinforcers? Make sure you define each element in the principle. What did Timberlake and Allison add to the Premack Principle?
- - Premack Principle: the opportunity to perform the higher-probability response will serve as a reinforcer for the lower-probability response
- - Effectiveness of reinforcers?: the reinforcers have to be effective to increase the likelihood of the low-probability response; the reinforcer can be any activity the person is more likely to engage in than the instrumental response (TV, video games, play time, shopping, dancing, etc...)
- - Elements: High probability action: watching an episode of Gilmore Girls; Low probability action: reviewing my notes for at least 30 minutes; How to get the low probability action to happen more than the high: I would have to review my notes for at least 30 minutes before I could watch an episode of Gilmore Girls, thus I would be using the high-probability response to reinforce the low-probability response
- - Timberlake and Allison’s additions: Said the principle didn’t give all the information and the organisms weren’t really free to choose (what if they didn’t like either option); Developed the behavioral bliss point to distribute activities among available response options - given this environment and this time what are the possible reinforcements? - you have to figure out what they will respond to - response restrictions are imposed to increase the low-probability response
Terry discusses that punishment can work. What is necessary in order for punishment to be effective in training? Explain why human beings are not very good at using punishment as an effective modification method? What is meant by “paradoxical rewarding effects” of punishment?
- - What is necessary for punishment to be effective?: Punishment must occur in a timely manner - the more immediate the better; The punishment must be meaningful to the behavior; The degree of response suppression is a function of intensity, so punishment cannot gradually increase or it is not effective; More effective on a continuous schedule than on any interval or variable schedule; Estes, Miller and Masserman worked with punishment; Estes found that extinction without punishment was more effective than punishment; Miller found sudden, intense punishment is most effective; Masserman found that punishment must be intense to decrease behavior
- - Why humans aren’t very good at using it: We tend to use punishment as an outlet for our own anger, rather than as a reinforcement to decrease a behavior; we aren’t consistent or timely enough and the punishment may not be aversive enough or too aversive for learning to occur; Punishment can lead to harmful/unwanted side effects: fear, aggression, and avoidance
- - Paradoxical rewarding effects: pairing a punishing stimulus with a positive reinforcer can convert it into a secondary reinforcer; a punishing event (air blast) may become a conditioned reinforcer by virtue of pairing with a positive reinforcer (food); then a gradual increase in the intensity of the punishment minimizes its power to suppress behavior; eventually, the cat is bar pressing for blasts of air instead of food
There are some behaviors that we want to extinguish. Explain how extinction can be accomplished with operant conditioning. What schedules of reinforcement are most susceptible to extinction and what schedules of reinforcement are most resistant to extinction? Why? Make sure you include the concept of partial reinforcement extinction effect in your discussion.
- - Extinction: a nonreward contingency in which reward is omitted after those responses that once produced positive reinforcement; by withholding the positive reinforcement, the organism should stop performing the response - in lab 3, Faraday was not given any reinforcement for bar presses in the dark and eventually stopped pressing the bar (apart from an extinction burst near the end)
- - Side effects: Extinction bursts: temporary increase of the nonreinforced behavior; Spontaneous recovery: after a delay interval, the response recovers; When the old response no longer produces reward, the organism engages in new behaviors to try to restore reward and behavioral variability increases - can lead to adaptation
- - Susceptible schedules of reinforcement: continuous reinforcement schedules are very susceptible to extinction
- - Resistant schedules of reinforcement: intermittent reinforcement (PRE)
- - Partial reinforcement extinction effect: variable schedule of partial reinforcement which causes resistance to extinction later on; Discrimination hypothesis: when the reinforcement doesn’t come, the partially reinforced rat doesn’t discriminate this difference until several nonrewarded trials have occurred; Frustration hypothesis: frustrating aftereffects of nonreward become associated with the subsequent occurrence of reward - this requires frequent transitions between nonrewarded and rewarded trials for this association to occur; Sequential hypothesis: memory of nonreward on one trial becomes associated with the occurrence of reward on a later trial so at the start of a new trial, the participant remembers the outcome of the previous trial and associates it with the outcome of the current trial