-
Basic Forms of Logic
Modus Tollens- Denying the Consequent: if P, then Q; not Q; therefore, not P
Modus Ponens- Affirming the Antecedent: if P, then Q; P; therefore, Q
-
Determinism
- weak: all events have antecedent causes
- probabilistic/stochastic: if all relevant antecedents are known, then the distribution of future events can be known
- strong: if all relevant antecedents are known, then the future event can be known in advance
-
phenomenological assertion
- descriptive
- what happens when
- when people's stated beliefs and behaviors are in conflict, they are more likely to change their stated beliefs than their behavior
-
Theoretical assertion
- explanatory
- why what happens when
- conflict between stated beliefs and behaviors causes unpleasant internal condition (cognitive dissonance) and changing the belief is usually easiest way to reduce conflict
-
data
- sets of values for variables
- needs to be objective: can be verified by others
- needs to be replicable: if someone else collects the same kind of data from similar people under same conditions, they should get same results (statistically)
-
variable types (3)
- continuous: numerical, can take any value between two extremes ex: response time
- discrete: numerical, can take only certain values ex: # of siblings
- qualitative: categorical, values differ in a non mathematical way ex: race
-
Univariate Data is summarized by...
- Center: mean
- Spread: standard deviation
- Shape: name
*bivariate data summarized as two sets of univariate plus some measure of their association (almost always the correlation between them)
-
Bivariate Data
- paired observations; can be any two measures
- can also be two measures of the same thing at different times
- 1. do the descriptive D-stats on each variable alone
- 2. calculate a measure of the relationship between the two variables
-
Correlations
- the simplest and most popular measure of the association between two variables
- can be calculated between any two variables
- max +1.00 min -1.00 both considered perfect
- no correlation=0
- correlation coefficient (r): provides a measure of the linear relationship; ranges from -1.00 to +1.00
- coefficient of determination (r²): provides a measure of how much of the variance in one variable is explained by the other variable; ranges from 0 to 1.00 (see the sketch below)
- 1. correlations are unaffected by linear transformations
- 2. correlations have no units
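A minimal sketch of computing r and r² for two made-up variables (names and values are hypothetical), using numpy and scipy; it also shows that a linear transformation of one variable leaves r unchanged:
```python
import numpy as np
from scipy import stats

# hypothetical paired observations
study_hours = np.array([2, 4, 5, 7, 8, 10])
exam_score = np.array([55, 60, 70, 72, 80, 85])

r, p = stats.pearsonr(study_hours, exam_score)   # correlation coefficient
r_squared = r ** 2                               # coefficient of determination

# a linear transformation (hours -> minutes) leaves r unchanged
r_scaled, _ = stats.pearsonr(study_hours * 60, exam_score)
print(round(r, 3), round(r_squared, 3), round(r_scaled, 3))
```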
-
Pre Processing Steps
- 1. Nothing.. the raw data are what's needed
- 2. Condensed score: combining a large number of different measures to get one value (usually done for convergence)
- 3. Summary Score: reducing a large number of identical measures to a single value (usually done for noise reduction; averages are much more reliable than single raw scores, see the sketch below)
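A minimal sketch, with simulated data, of why a summary score (the mean of many identical measures) is less noisy than any single raw score:
```python
import numpy as np

rng = np.random.default_rng(0)
true_score = 100
raw = true_score + rng.normal(0, 15, size=(1000, 10))  # 1000 subjects x 10 trials

single_trial_sd = raw[:, 0].std()      # noise in one raw score
summary_sd = raw.mean(axis=1).std()    # noise in the 10-trial summary score
print(single_trial_sd, summary_sd)     # summary SD is roughly 15 / sqrt(10)
```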
-
Descriptive Stats
- summarize a given set of data
- the set of data is usually a sample, not the entire population
- because these are summaries, they can't be wrong
-
Unreliability
- the standard deviation across scores
- if you measure the same thing many times, identical conditions, you often don't get the same value every time
- good estimate of the amount of noise in the data
- the maximum possible correlation depends on the unreliability of the measure
- unreliability is BAD; to reduce it, use summary scores instead of raw scores
-
Reliability
- the correlation between scores
- measuring many people twice gets you test/retest pairs
- calculating the correlation between these two is the reliability of the measure
- test/retest reliability must be at least +.70
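A minimal sketch of test/retest reliability as the correlation between two administrations of the same measure; the true scores and the noise level are simulated:
```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
true_scores = rng.normal(50, 10, 200)            # 200 hypothetical subjects
test = true_scores + rng.normal(0, 5, 200)       # first administration + noise
retest = true_scores + rng.normal(0, 5, 200)     # second administration + noise

reliability, _ = stats.pearsonr(test, retest)
print(round(reliability, 2), reliability >= 0.70)  # conventional .70 cutoff
```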
-
operational definition
a statement that maps one or more empirical measures on to one or more theoretical constructs
-
construct validity
- the extent to which the measure provides an exhaustive and selective estimate of the target theoretical construct
- exhaustive: measure should cover all aspects of target construct (the whole truth)
- selective: measure only covers things that are a part of the target (nothing but the truth)
-
Convergent Validity
- the extent to which the measure is correlated with other measures of the same or similar underlying constructs
- exhaustive part of construct validity
- at least +.70
-
Discriminant Validity
- the extent to which the measure is NOT correlated with measures of different and dissimilar underlying constructs
- selective part of construct validity
- within .20 of zero
-
Threats and Counters to Construct Validity
- lack of exhaustiveness: does not cover the whole thing, so add or expand items
- lack of selectivity: measuring other stuff too, so delete or refine items
- systematic error
- sometimes due to reactivity: evaluation apprehension or demand characteristics
-
Kinds of Validity (4)
- 1. Content Validity: whether the measure makes sense, exclusive and exhaustive measure
- 2. Face Validity: whether the measure appears to measure the construct (to a subject), avoid this, subjects may change their behavior
- 3. Criterion Validity: whether the measure correlates with known consequences of the construct
- 4. Construct Validity: the extent to which the measure provides an exhaustive and selective estimate of the target theoretical construct
-
Internal Validity
the extent to which a significant (IV-DV) relationship is causal and not spurious
- significant IV-DV relationship= the data from the different conditions are different and it isn't just due to chance
- is causal= the data from different conditions are different because of the planned difference between conditions
- and not spurious= as opposed to the data from different conditions being different for some other reason
- Threats
- confounds
- experimenter / observer bias
- demand chars with good subject behavior
-
Experiment
- has at least one manipulated variable acting as the potential cause of interest
- has a labile measured variable acting as the potential effect
-
IV
a manipulated variable being treated as the potential cause of interest
-
DV
a labile measured variable being treated as the effect of interest
-
Manipulated Variable
- something that is under the complete control of the experimenter
- 3 types
- 1. Situational: features of the environment (example lighting)
- 2. Task: elements of what subjects are asked to do (easy vs hard tasks)
- 3. Instructional: elements of how subjects are asked to do the task (example use imagery vs rote memory)
-
Measured Variable
- Something that is determined by or built in to the subject
- 2 types
- 1. stable: built in, permanent, like gender or handedness; difficult to impossible to manipulate so referred to as subject variables
- 2. Labile: situational, temporary, like mood or response time; relatively easy to manipulate these are called data variables
-
extraneous variable
a potential cause of the effect that is not of current interest
-
Confound
- an extraneous variable that covaries with the potential cause of interest
- in order for it to be a confound, the EV must change in parallel with the IV
-
Confounding
when at least one extraneous variable changes in parallel with the IV
-
Experimental Control
the ability of experimenters to hold everything constant
-
Control Experiment
an ancillary experiment designed to test whether a potential confound in the main experiment could have been responsible for the results observed in the main experiment
-
Experimenter Bias
- when the beliefs and/or expectancies of the experimenter influence the results
- acts as a confound and reduces internal validity
- to reduce..
- 1. reduce the involvement of human experimenters by using computers
- 2. standardize the behavior of human experimenters by using strict protocols like interaction scripts
- 3. remove the human experimenter's knowledge by making the experimenter unaware of predictions or double blind
- Checking for Exptr Bias..
- run an experiment that follows the advice above
- run a null manipulation experiment which is a type of control experiment
-
Participant Bias
- when beliefs of participant concerning how they should behave influence the results
- subtypes are demand characteristics and evaluation apprehension
- to reduce..
- 1. "good subject" type: reduce the demand characteristics or bury them in a load of filler
- 2. evaluation apprehension type: make the experiment less "social" and/or convince the subjects that their data are anonymous
-
Demand Characteristic
- any aspect of the experiment that indicates the purpose of the experiment
- acts like a confound
-
Evaluation Apprehension
- an internal state that causes subjects to alter their behavior so that they will be viewed more positively by other people
- reduces construct and external validity
-
Statistical Conclusion Validity
- the extent to which inferences about the sampling population, based on a sample, are accurate
- Type I Error: concluding there is a difference in the sampling population when there isn't (false alarm)
- alpha (risk): the probability of making this error; conventionally set at .05
- Type II Error: concluding the means are the same in the sampling population when they are actually different (miss)
- power: the probability of not making this error (1 - β); should be .80 or higher
- Threats
- uncontrolled variability
- random error
- violating assumptions hurts all stats
-
What causes Type I and Type II errors?
- Type I:
- bad luck
- one or more assumptions were not true
- Type II:
- bad luck
- one or more assumptions were not true
- noisy data and or too small of sample
-
Within Designs
- do both conditions
- interest in a small effect
- very brief experiment
- heterogeneous subject population
- downsides: increased demand characteristics and variety of possible carry over effects
-
Between Subjects Design
- 1/2 do one condition, 1/2 do other condition
- non repeatable measure
- long lasting manipulation
- need vanilla control condition
- fear of demand characteristics
- need to create equivalent groups by pseudo random assignment and covariates/ matching
- downsides: requires many more subjects
-
Random Assignment
- to produce equivalent groups in b/w design
- the means, across groups, of all EVs are equal
- 1.true RA: each subject is independently and randomly assigned to a group
- 2.blocked randomization: subjects are randomly assigned to one of the currently smallest groups
- 3.pseudo-randomization: the order of assignment to groups is set in advance and applied as the subjects arrive (effective with large group)
- alternatives/ additional procedures
- 1. matching: measure everybody in advance on the variables you are worried about, then put the top two scorers in different conditions, the next two in different conditions, and so on; makes the groups the same, but is a big pain
- 2. Verification: include measures of potential confounds, discard the entire dataset if RA fails
- 3. inclusion of covariates: USE THIS ONE; include measures of the potential confounds and use them to remove their effects during the analysis; you never have to throw anything out, although each covariate = 1 df (see the sketch below)
- *use true RA with covariates*
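A minimal sketch of blocked randomization (option 2 above); the group labels are hypothetical:
```python
import random

def blocked_assign(n_subjects, groups=("control", "treatment")):
    # each arriving subject is randomly assigned to one of the currently smallest groups
    counts = {g: 0 for g in groups}
    assignments = []
    for _ in range(n_subjects):
        smallest = min(counts.values())
        candidates = [g for g, c in counts.items() if c == smallest]
        choice = random.choice(candidates)
        counts[choice] += 1
        assignments.append(choice)
    return assignments

print(blocked_assign(11))   # group sizes never differ by more than 1
```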
-
Counter-balancing
- to equalize all sequence and order effects in within design
- 1. complete counterbalancing: all possible orders are used
- 2. random partial counterbalancing: each subject gets the conditions in a pseudo-random order
- 3. latin square: k different orders are created, with each condition appearing once in each position
- 4. balanced latin square: a latin square in which each condition is also followed by each other condition exactly once (e.g., A is followed by B once, by C once, by D once); see the sketch below
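A minimal sketch of constructing a balanced latin square for an even number of conditions; the condition labels are hypothetical:
```python
def balanced_latin_square(conditions):
    # first order: 0, 1, k-1, 2, k-2, ...; each later order shifts every entry by 1
    k = len(conditions)
    first = [0]
    left, right = 1, k - 1
    while len(first) < k:
        first.append(left); left += 1
        if len(first) < k:
            first.append(right); right -= 1
    return [[conditions[(c + shift) % k] for c in first] for shift in range(k)]

for order in balanced_latin_square(["A", "B", "C", "D"]):
    print(order)   # each condition follows every other condition exactly once
```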
-
Control Hierarchy
- 1. exert control, Hold it constant: don't allow the potential confound to vary at all
- 2. pre equalize, Equalize on Average: make the potential confound equal on average across conditions
- 3. post equalize, Measure and Remove: remove the effects of the potential confound after-the-fact
- 4. run Control Experiment: test the potential confound in a separate experiment
-
Inferential Stats
- go beyond the actual data in hand and make a best guess about the population from which the sample was taken, can be wrong
- 1.Point Estimation(for the mean): get a best guess for what you are interested in, get an estimate for how wrong you might be, standard error
- 2.
- Paired Samples t-test for within design:
- with w/i subjects design you have pairs of data, is there a difference between the two conditions? hypothesis testing, convert the two values from step one into a single, y/n answer to a question
- Independent Samples t-test for b/w design: when you use a b/w subjects design, you have separate samples for cond 1 and cond 2, so the probability question becomes what is the probability of getting two sample means that are this different if we assume both samples came from one distribution? less than 5%, then population means are not the same
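A minimal sketch of point estimation and the two t-tests above, using simulated data and scipy; condition names and values are hypothetical:
```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

# point estimation: sample mean plus its standard error (SD / sqrt(N))
sample = rng.normal(100, 15, 30)
sem = sample.std(ddof=1) / np.sqrt(len(sample))
print(sample.mean(), sem)

# within-subjects design: paired samples t-test on condition pairs
cond_a = rng.normal(500, 50, 25)           # e.g., response times in condition A
cond_b = cond_a + rng.normal(20, 30, 25)   # same subjects in condition B
print(stats.ttest_rel(cond_a, cond_b))

# between-subjects design: independent samples t-test on separate groups
group_1 = rng.normal(500, 50, 25)
group_2 = rng.normal(530, 50, 25)
print(stats.ttest_ind(group_1, group_2))   # p < .05 -> conclude the means differ
```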
-
Standard error of the mean
- a measure of how far any given sample mean might be from the mean of the sampled population
- SEM = SD / √N
-
Hypothesis Testing
- there are four possible outcomes from an experiment
- what is true for the sampled pop?
- 1.the means for the two conditions are the same
- 2.the means for the two conditions are different
- what did we conclude, based on sample?
- 3.the means for the two conditions are the same
- 4.the means for the the two conditions are different
-
Correlational Study
- two measured variables
- one called predictor (cause)
- one called predicted (effect)
-
Quasi Experiment
- one stable measured variable (SV) that is treated as IV
- one labile measured variable treated as DV
-
third variable problem
- when Z causes the correlation between x and y
- spurious
- include Z as a covariate and recalculate the correlation (partial correlation)
- if the partial is as strong as before, then Z is not the cause
- if partial is smaller than original then Z is not the only cause
- if partial is now zero, then Z was the entire cause of x and y's correlation
-
Third Variable
a typically unmeasured variable that could be the cause of both the measured variables in a correlational study
-
Spurious
a significant relationship that is not causal in either direction
-
Cross-lagged
- the correlation between one variable at one time and another variable at another time
- used to determine the more likely direction of causation
- if r²(X1,Y2) > r²(X2,Y1), then X is probably the cause
-
Partial with respect to Z
- the correlation between two variables after the effects of a third variable Z have been removed
- used to test and rule out a third variable explanation
- if pr_XY(Z) = r_XY, then Z is not a cause of both X and Y (see the sketch below)
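A minimal sketch of the partial-correlation check for a third variable; the data are simulated so that Z really does drive both X and Y, so the partial correlation should come out near zero:
```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
z = rng.normal(0, 1, 500)
x = z + rng.normal(0, 1, 500)   # X caused by Z plus noise
y = z + rng.normal(0, 1, 500)   # Y caused by Z plus noise

r_xy, _ = stats.pearsonr(x, y)
r_xz, _ = stats.pearsonr(x, z)
r_yz, _ = stats.pearsonr(y, z)

# partial correlation of X and Y with Z removed
pr = (r_xy - r_xz * r_yz) / np.sqrt((1 - r_xz**2) * (1 - r_yz**2))
print(round(r_xy, 2), round(pr, 2))   # pr near zero -> Z explains the X-Y link
```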
-
External Validity
- the extent to which the results from an experiment will generalize to other situations
- two ways to deal with limited external validity...
- 1. do things to increase external validity, like parallel random assignment and counterbalancing
- 2. reduce the need for EV by making the studied situation more similar to the target
- person and context specificity threaten external validity
- threats
- unique/atypical sample
- lack of mundane realism
-
context specificity
when the results from an experiment or study are unique to the situation
-
person specificity
when the results from an experiment or study are unique to the subjects
-
convenience sampling
when only easily recruited subjects are used
-
Probability sampling
- when each person in the population has a definable probability of being sampled
- 2 types
- 1.simple random sampling
- 2.stratified random sampling
-
simple random sampling
- when all members of the population have a definable probability of being sampled, but no attempt is made to match the group sizes
- subtypes (use standard)
- 1.standard: sample 500 people
- 2.systematic: sample every tenth person on list
- 3. bernoulli: each person has a 10% chance
-
stratified random sampling
- the sizes of the groups in the population are taken into account
- subtypes
- proportional: force the group %'s in the sample to match the %'s of the groups in the population
- important when samples are small
- nonproportional (quota sampling): force the group %'s in the sample to be equal to each other
- only important when some groups in the population are very small (some statistics require a minimum number of observations in every cell), when you need equal error across groups, or when the accessible population doesn't match the target population (see the sketch below)
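A minimal sketch of proportional stratified random sampling from a hypothetical population list; the group labels and sizes are made up:
```python
import random
from collections import defaultdict

def proportional_stratified_sample(population, n):
    # group the population into strata, then sample each stratum in proportion to its size
    strata = defaultdict(list)
    for person, group in population:
        strata[group].append(person)
    sample = []
    for group, members in strata.items():
        k = round(n * len(members) / len(population))  # match the population %'s
        sample.extend(random.sample(members, k))
    return sample

population = [(i, "A") for i in range(800)] + [(i, "B") for i in range(800, 1000)]
print(len(proportional_stratified_sample(population, 50)))  # ~40 from A, ~10 from B
```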
-
To choose a sampling method...
- ask how important it is to have a sample that accurately represents the target population
- if not very: convenience
- if sort of: simple random sample
- if very: stratified random sampling
if matching the population is important but the population is huge, then use cluster sampling: sampling people who are conveniently pre-grouped via an irrelevant variable
-
survey or questionnaire
a structured set of items designed to measure attitudes, beliefs, values, or behavioral tendencies
-
survey types
- 1. face to face interview: often only weakly structured; + can go where you want topic-wise, - highly susceptible to reactivity and exptr bias
- 2. face to face survey: often paired with convenience sampling; + very fast, - limited to simple yes/no answers
- 3. phone survey: often paired with simple random sampling; + very fast, - limited to simple yes/no answers
- 4. written questionnaire: controlled setting vs take-home; + can use more complicated item types (scales) and less reactivity, - less expt realism, can suffer from biased attrition (when the probability of a given subject completing the survey depends on what their response is)
- 5. electronic questionnaire: + lower reactivity, - lower realism
-
Item Types
- 1. open ended questions: +less demand - less control
- 2.Closed questions: + easily codified , - often require fillers to avoid demand
- 3. Likert scales: sets of 7-point agree/disagree items; + usually the most reliable measure of attitude, - some subjects object to the lack of an "it depends" option
- 4. Guttman scales: a set of ascending questions, how far will subject go? stop asking when they say no + adaptive, - assumption of order may not always be accurate
- 5. Thurstone scales: checklists subjects indicate all that apply, each item pre rated for positivity, + often the best convergent and discriminant validity, - much more work
- 6. Semantic differentials: pairs of opposites indicate position between extremes + works well for overlapping constructs, - requires complicated pre processing
-
Naturalistic Observation
- studying behavior in everyday environments without getting involved
- key threat is reactivity, then observer bias
-
participant observation
- studying behavior from within the target group
- key threat is standard exptr bias, then observer bias
- often not possible because there is no consent, so observation can only occur in reasonable (public) places
-
observer bias
- when the beliefs or expectancies of the observers influence what is recorded
- intercoder reliability must be .90+
- to reduce...
- use multiple observers
- prevent observer overload through checklists, time sampling, and event sampling
-
ex post facto quasi experiment
when you take only one sample and then divide the subjects into the groups after the fact
-
planned quasi experiment
when you take separate samples for each of the groups
-
longitudinal study
- aging
- when you follow the same subjects over time
- threat is time frame effects (zeitgeist)
-
cross sectional study
- aging
- when you take separate samples for each age group at the same time
- threat is cohort
-
solution to both aging studies
run a hybrid study and verify same results either way
-
Paradigm
a standard method for studying a particular issue
-
paradigm measure
a summary score requiring data from at least two conditions that is used to estimate a single theoretical construct
-
Stroop Original Version
- target construct is automaticity (of reading)
- task is to name the ink color
- paradigm manipulation is incongruence (of a to-be-ignored word with the correct vocal response for the trial)
- incongruent trial: e.g., the word "green" printed in a different ink color
- neutral trial: e.g., the word "house"
- paradigm measure is the increase in response time between incongruent and neutral trials
-
Stroop Complete version
- Target construct is the failure of attentional selectivity
- task name the ink color
- paradigm manipulation is congruence (of a to-be-ignored word with the correct vocal response for the trial)
- congruent trial: e.g., the word "black" printed in black ink
- incongruent trial: e.g., the word "blue" printed in a different ink color
- paradigm measure is the increase in response time between congruent and incongruent trials
-
The two Stroops
- original: incongruent vs neutral (i.e., never congruent); measures automaticity of reading
- complete: incongruent vs congruent (chance congruence); measures failure of selective attention (see the sketch below)
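A minimal sketch of computing a Stroop paradigm measure (mean incongruent RT minus mean congruent RT) from made-up per-trial data:
```python
import numpy as np

trials = [  # (trial_type, response_time_ms), hypothetical values
    ("congruent", 520), ("incongruent", 610), ("congruent", 535),
    ("incongruent", 640), ("congruent", 510), ("incongruent", 600),
]

def mean_rt(trial_type):
    return np.mean([rt for t, rt in trials if t == trial_type])

# complete-version paradigm measure: increase in RT from congruent to incongruent
stroop_effect = mean_rt("incongruent") - mean_rt("congruent")
print(stroop_effect)
```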
-
Exogenous Cuing
- target construct is capture of visual attention
- task is to respond to large white square
- paradigm manipulation is spatial validity of a prior cue (whether the place holder that flashes is in the same location as the target or elsewhere)
- paradigm measure is the advantage for valid over invalid trials
-
Pretest/Posttest with one group
- measure the subjects before and after treatment and test if they get better
- problem: before vs after is within subjects, and within-subject manipulations need to be counterbalanced, but that cannot be done here
-
Post test only with two groups
- one group gets the treatment, the other does not and test if the treatment group ends up better
- problems: treatment vs control is between subjects and there could have been a failure of random assignment
-
Pretest/Posttest with two groups- Analysis Option 1
- change score logic + level 4 confound control
- compare the change scores between the groups using an independent samples t test
- a significant advantage for the treatment group is evidence that the therapy is better than nothing
-
Pretest/Posttest with Two groups -Analysis Option 2
- outcome logic + level 3 confound control
- compare the post scores between the groups but include the pre scores as a covariate
- a significant advantage for the treatment group is evidence that the therapy is better than nothing
- problem: it doesn't tell you whether the treatment group got better or the control group got worse
-
Pretest/Posttest with two groups -Analysis Option 3
- change score logic + level 3 confound control
- compare the change scores between the groups but also include the pre scores as a covariate
- a significant advantage for the treatment group is evidence that the therapy is better than nothing
- best option! (see the sketch below)
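A minimal sketch of Analysis Option 3, assuming statsmodels and pandas are available: regress the change score on group while including the pretest score as a covariate; the variable names and data are simulated:
```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(4)
n = 60
group = np.repeat(["control", "treatment"], n // 2)
pre = rng.normal(50, 10, n)
post = pre + rng.normal(2, 5, n) + np.where(group == "treatment", 5, 0)

df = pd.DataFrame({"group": group, "pre": pre, "change": post - pre})
# change score logic, with the pre score included as a covariate
model = smf.ols("change ~ group + pre", data=df).fit()
print(model.params["group[T.treatment]"], model.pvalues["group[T.treatment]"])
```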
-
Non Equivalent Group Designs
- many subjects run at once
- subjects are already in groups (which can't be shuffled), so whole groups are assigned to conditions, instead of people
- problem 1: the groups are not equal to start with; to fix, add covariates
- problem 2: the groups may experience different events after the covariates are measured; to fix, add a control measure to assess these events
-
Won't inclusion of covariate allow us to remove any differences, regardless of source?
- covariates only remove main effects of confounds, they do not remove interactions
- covariates can't do anything about confounds that occur after you take the covariate measure
-
Don't these issues apply to all between subject designs and not just non equivalent group designs?
- yes but they are much more likely when group membership is inherent to the subject, instead of randomly assigned
- you can not stop these group based confounds from happening but can get a measure of them
- this is like running a control experiment, but its done in the same subjects, at the same time, so its a control measure instead
-
Matched groups design
- pick subjects from the region where the two groups overlap
- run the experiment on everyone, but only analyze data from the matched subjects
- choose two subgroups that are equal on the matching variable, so you start with equivalent groups
-
Regression to the Mean
- rooted in true score theory: observed score = T + e
- the expected value of e is zero, and separate measures have independent errors
- a particularly high observed score probably includes a large positive e
- it is expected to be followed by a lower score, since the odds of getting two large positive e's in a row are small
- the amount of regression that is expected to occur depends entirely on the unreliability of the measure (see the sketch below)
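A minimal sketch, under true score theory, showing that subjects selected for extreme scores at time 1 regress toward the mean at time 2; all values are simulated:
```python
import numpy as np

rng = np.random.default_rng(5)
true = rng.normal(100, 10, 10_000)
time1 = true + rng.normal(0, 10, 10_000)   # independent errors at each time
time2 = true + rng.normal(0, 10, 10_000)

high_at_time1 = time1 > 120                # select extreme scorers at time 1
print(time1[high_at_time1].mean())         # well above the mean
print(time2[high_at_time1].mean())         # closer to the mean at time 2
```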
-
How do you reduce unreliability so regression is also reduced?
- use multiple measures
- use a measure with very high test retest reliability
-
Interrupted Time Series Design
- allows you to remove the general effect of time
- add a non-equivalent control group; the data from them will provide a measure of the effect of widespread events (like 9/11)
-
Ultimate time series design
- start with basic interrupted time series design to get an estimate of general trend across time
- add a staggered second group to get a second measure of the effect and a measure of the effect of widespread events
- add a control measure to both groups to get a measure of the effect of local events in both groups