-
What is operant/instrumental conditioning?
- A change in behaviour (learning) produced by a contingent relationship between the behaviour and a biologically important stimulus (the reinforcer).
- (Process whereby organisms learn to make responses in order to obtain or avoid important consequences.)
-
Who first demonstrated operant/instrumental conditioning? What did he do?
- Thorndike (1911)
- showed that cats could learn to pull a lever to obtain certain outcomes: cats in a puzzle box pulled the lever to escape.
- The more trials, the faster the cats were able to escape.
-
What observation was made by Thorndike after his experiments with the cats? Which law did he come up with?
- Law of Effect
- Of the several responses made to the same situation (e.g. lying down, meowing, pulling the lever), those which are accompanied or closely followed by satisfaction to the animal will be more firmly connected with the situation.
-
Outline the difference between instrumental and Pavlovian conditioning.
- Pavlovian: the response is a reflex produced in anticipation of an outcome
- Instrumental: the response is instrumental in obtaining the outcome
-
[Principles of reinforcement] What are the 3 types of operant conditioning and their effects on the frequency of the response?
- Positive reinforcer: response followed by pleasant outcome - increase in probability of response
- Negative reinforcer: response followed by removal of unpleasant outcome - increase in probability of response
- Punisher: response followed by an unpleasant outcome, or a pleasant outcome is omitted - decrease in probability of response
- (ALSO: negative is removing something; positive is adding something. Reinforcement increases response, punishment decreases response)
- Unlike Pavlovian conditioning, where conditioning results in an increase in the vigour of the conditioned response irrespective of whether the reinforcer is appetitive or aversive.
-
What are the two types of schedules of reinforcement?
- Ratio schedule: the arrival of the outcome/reinforcer depends on how many responses have been made.
- Interval schedule: the next reinforcer depends on how much time has passed since the last one (it does not depend on the number of responses).
- Both can be either fixed or variable: FR (fixed ratio), VR (variable ratio), FI (fixed interval), VI (variable interval).
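The four schedule types above can be sketched in a few lines of code. This is a minimal illustration, not a standard implementation: the function names and the requirement values (5 responses, 30 seconds) are my own assumptions.

```python
import random

random.seed(1)

# A reward is set up either after a number of responses (ratio) or after a
# span of time (interval); the requirement is either constant every time
# (fixed) or re-drawn at random around a mean (variable).

def ratio_rewarded(responses_since_reward, requirement):
    """Ratio schedule: delivery depends only on the response count."""
    return responses_since_reward >= requirement

def interval_rewarded(seconds_since_reward, requirement):
    """Interval schedule: the first response after `requirement` seconds is rewarded."""
    return seconds_since_reward >= requirement

# Fixed schedules: same requirement every time, e.g. FR-5 or FI-30.
fr5_hit = ratio_rewarded(5, 5)             # 5th response -> rewarded
fi30_miss = interval_rewarded(12.0, 30.0)  # only 12 s elapsed -> not yet

# Variable schedules: requirement re-drawn each time, e.g. VR-5 or VI-30.
vr_requirement = random.randint(1, 9)        # responses needed, mean 5
vi_requirement = random.uniform(0.0, 60.0)   # seconds needed, mean 30
```

The key contrast: under a ratio schedule, responding faster brings the reward sooner; under an interval schedule it does not, because the clock, not the response count, sets up the reward.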
-
Which type of schedule of reinforcement is most like what happens in nature?
- Interval schedule
- Because organic resources usually deplete and replenish after a certain time has elapsed, so there is a limit on the rate of reinforcement
-
Who compared the performance for the two types of schedule? What is the study like?
- Matthews et al (1977)
- rewarded lever pressing with a monetary reward on a variable-ratio (VR) schedule
- every time a reward was delivered on the VR schedule, a reward became available for another participant - generating a yoked variable-interval (VI) schedule with a similar rate of reinforcement
- Result: the VR schedule yielded the most responses
- (perhaps because the response is more strongly perceived as instrumental in producing the reward)
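The yoked design described above can be simulated to show why it equates reinforcement rates while changing the contingency. This is an illustrative sketch only; the function name, step structure, and parameters are assumptions, not details from Matthews et al. (1977).

```python
import random

random.seed(0)

def run_yoked_pair(n_steps=10_000, vr_mean=10):
    """Illustrative sketch of a yoked VR/VI pair (parameters hypothetical).

    The 'master' responds on a variable-ratio (VR) schedule: each response is
    rewarded with probability 1/vr_mean, so reward rate scales with response
    rate. Each master reward makes one reward *available* to the yoked
    partner, who collects it with a later response - so the partner's rewards
    depend on elapsed time (a yoked variable-interval, VI, schedule), not on
    how many responses the partner makes.
    """
    master_rewards = 0
    yoked_rewards = 0
    pending_for_yoked = 0  # rewards the master set up, awaiting collection
    for _ in range(n_steps):
        # Master responds each step; VR reward with probability 1/vr_mean.
        if random.random() < 1 / vr_mean:
            master_rewards += 1
            pending_for_yoked += 1  # yoking: the same reward becomes available
        # The yoked partner also responds each step, but extra responding
        # cannot raise their reward rate: they only collect what is pending.
        if pending_for_yoked > 0:
            yoked_rewards += 1
            pending_for_yoked -= 1
    return master_rewards, yoked_rewards

m, y = run_yoked_pair()
# Both participants end up with a similar reinforcement rate; only the
# response-reward contingency differs.
```

Because both participants receive (nearly) the same rewards, any difference in response rate between the VR and yoked-VI conditions can be attributed to the schedule contingency rather than to the amount of reinforcement.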
-
The Law of Effect states that reinforcers strengthen the association between stimulus and response (S-R learning), which establishes habitual responding. Which study showed neural correlates of habitual responding, and that some brain regions are activated during instrumental but not Pavlovian conditioning?
- O'Doherty et al 2004
- Participants touched one of two simultaneously presented visual stimuli --> one of them yielded a higher probability of a fruit-juice reward
- Other participants received only yoked pairings of the stimuli with the outcome, without making a choice response (Pavlovian conditioning)
- fMRI showed more activity in the dorsal striatum during the instrumental task (but activation in the ventral striatum for both instrumental and Pavlovian conditioning)
-
What is the limitation of S-R learning compared to goal-directed behaviour?
- The S-R reinforcement process establishes instrumental habits that are not mediated by knowledge of the outcome
- It therefore does not allow selection of an instrumental action on the basis of the current goal
-
What method can be used to test whether an instrumental response is an S-R habit or a goal-directed action? Which study looked at the difference between the two types in children of different ages?
- The outcome devaluation test
- Klossek et al (2008)