-
What is a construct?
- An underlying trait that is responsible for some observable behavior
- Indirect measure
-
What is construct validity?
- Refers to inferences made from measured variables to theoretical constructs
- Examines how well the assessment “matches up with” the construct
-
What is validity?
“the degree to which evidence and theory support the interpretations of test scores entailed by proposed uses of the test”
-
What are the two threats to validity?
- Construct Underrepresentation (Less)
- Construct Irrelevant Variance (More)
-
What is construct underrepresentation?
- Not measuring the construct as broadly as intended
- “the degree to which a test fails to capture important aspects of the construct”
-
What is construct irrelevant variance?
Consistently measuring something that is not part of the construct
-
What are the five sources of validity evidence?
- Evidence based on Test Content
- Evidence based on Response Processes
- Evidence based on Internal Structure
- Evidence based on Relations to Other Variables
- Evidence based on Consequences of Testing
-
What does test content refer to?
- Analysis of the relationship between the test’s content and the construct of interest
- Themes, wording, and format of the items, tasks, or questions on a test, as well as the procedural guidelines for its administration and scoring
-
What does response process validity refer to?
- Analyses of examinees’ response processes are used to determine the fit between the construct and the examinees’ actual performance or responses
- Includes the processes of judges, raters, or observers when evaluating examinees’ performances
-
What types of evidence support content validity?
- Use logical or empirical analysis
- Most frequently relies on “expert” judgment
- Part of test development
-
What types of evidence support response process validity?
- Theoretical and empirical
- Process focused on a study of examinees
- Interview test takers
- Think alouds
- Examine rating or scoring process
-
What does internal structure refer to?
The degree to which the relationships among test items and test components conform to the construct on which the interpretations are based.
-
What types of evidence support internal structure validity?
- Item analysis
- Reliability
- Factor analysis
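A minimal sketch of one reliability check. The notes do not name a coefficient; Cronbach's alpha is assumed here as a common choice, and the `responses` data are invented for illustration.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha; rows holds one list of item scores per respondent."""
    k = len(rows[0])                      # number of items
    items = list(zip(*rows))              # transpose: one tuple per item
    item_var_sum = sum(variance(item) for item in items)
    total_var = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - item_var_sum / total_var)

# Hypothetical responses: 5 respondents x 3 items (made-up numbers).
responses = [
    [4, 5, 4],
    [2, 2, 3],
    [5, 5, 5],
    [1, 2, 1],
    [3, 3, 4],
]
alpha = cronbach_alpha(responses)
print(f"alpha = {alpha:.2f}")
```

Higher values suggest the items vary together, which is one piece of internal structure evidence, not proof of validity by itself.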
-
What does external validity refer to?
- Relationship between the test scores and variables outside of the test
- e.g., GRE scores predicting student success in graduate school
- Relationship to other tests
-
What types of evidence support external validity?
- Convergent and discriminant evidence
- Test-criterion evidence (how well scores predict a criterion; depends on the purpose of the test)
- Validity Generalization (Can the instrument be generalized to other situations)
-
What is convergent evidence?
Relationship between scores on measures of the same construct
-
What is discriminant evidence?
Relationship between test scores and measures of different constructs, should not be related
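Convergent and discriminant evidence are usually summarized with correlations. A rough sketch, assuming Pearson's r and invented scores for six examinees (all variable names are hypothetical):

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Made-up scores for illustration only.
new_test = [10, 12, 14, 16, 18, 20]
same_construct = [11, 13, 13, 17, 19, 21]      # established measure, same construct
different_construct = [7, 4, 9, 9, 4, 7]       # measure of an unrelated construct

r_convergent = pearson_r(new_test, same_construct)
r_discriminant = pearson_r(new_test, different_construct)
print(f"convergent r = {r_convergent:.2f}")
print(f"discriminant r = {r_discriminant:.2f}")
```

A high convergent correlation together with a near-zero discriminant correlation is the pattern these two cards describe.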
-
What does consequential (consequences of testing) refer to?
- Intended consequences happen
- Unintended consequences are not occurring, especially issues of bias and fairness
-
What types of evidence support consequences of testing?
- Intended
- Shaping the curriculum
- Teaching the content of the test
- Unintended Consequences
- Narrowing the curriculum
- Teaching to the test and only the test
- Subpopulation differences
- Develops over time
-
What is internal validity?
The extent to which a cause and effect relationship is isolated from competing influences
-
What is the purpose of internal validity?
- Determines the soundness of conclusions/interpretations of a causal relationship
- Asks whether the variables are influenced by other (extraneous) variables
-
What are the four categories of threats to internal validity?
- Time threats
- Group threats
- Mortality
- Atypical behavior
-
What is a time threat in internal validity?
Impacts on dependent variable over time are different because of factors other than the treatment variable
-
What are group threats to internal validity?
- Explanations for changes in the variables other than experimental differences created by the researcher
- Selection of the group may cause the changes
-
What are mortality threats to internal validity?
- Loss of subjects in a study
- Impacts all studies long enough to have dropouts
-
What are atypical behavior threats to internal validity?
- Research design cannot eliminate
- Actions of those in the groups which change the treatment (Treatment and control group interact about the intervention)
-
What is testing as a time threat and possible solutions?
- The effect of the first test on the second
- Effect of a publication about the treatment
- Solutions:
- Lengthen time interval between tests
- Disguise use of pretest (don't disclose next test)
- Use control groups
-
What is history (time threat) and possible solutions?
- An event that occurs during treatment that affects subject response
- Usually events that could be controlled
- Solutions
- Use control groups
- Use shorter time
-
What is maturation as a time threat and possible solutions?
- Naturally occurring processes within participants that unfold over time and may change their performance
- Includes fatigue, boredom, growth, intellectual development
- Solutions:
- Shorten the time of the study
- Use a control group with a similar maturation rate
-
What is instrumentation as a time threat and possible solutions?
- Changes in measurement procedure in a pretest-posttest study
- Includes calibration, rater changes, score use
- Solutions:
- control group
- standardize the measurement procedure
-
What is pre-experimental research design?
- One group with post-test only such as pilot testing
- Or
- One group with pre-and post test
- These are the weakest designs because they do not strongly link group changes to treatment
-
What are quasi experimental designs?
- Designs that involve a control group but do not use random assignment
- Or
- One experimental group with multiple tests, the first is a baseline
-
What are possible threats to quasi experimental design?
- Time threats
- Selection (group changes at time of intervention)
- Instrumentation
- Maturation
-
What is experimental design?
- Pre-test, post-test control group design with random assignment
- Post test control group design with random assignment
- Still has threats to validity but usually eliminates group threats
-
How to improve internal validity?
- Use random assignment
- Use Pretest
- Use control/comparison group
-
What are types of statistical designs?
- Within-subject designs
- Between subjects designs
- Mixed designs
-
What are types of experimental designs?
- Pre-experimental designs
- Quasi experimental designs
- Experimental design
-
What is a within subject design?
- Measures individuals within a group multiple times both before and after treatment
- Change over time is of interest
- Applies to pre-experimental and quasi experimental
-
What is between subjects design?
- Compares the scores of two or more groups
- Group comparisons are of interest
- Applies to quasi-experimental and experimental designs
-
What is mixed design (split-plot)?
- Both between and within designs used
- e.g., look at individual student scores as a result of a treatment in two classes
- Applies to quasi-experimental and experimental
-
How can you control for differences between groups?
- Statistical adjustment based on theoretical argument
- Matching of variables between groups
- Random assignment creates random equivalence on all variables
- Matching and Random Assignment
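Random assignment can be sketched in a few lines. This assumes two equal-sized groups and a hypothetical participant list; the function name and seed are illustrative only.

```python
import random

def randomly_assign(participants, seed=None):
    """Shuffle participants and split them into treatment and control groups."""
    rng = random.Random(seed)
    shuffled = list(participants)     # copy so the input list is not modified
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical participant IDs P01..P20.
participants = [f"P{i:02d}" for i in range(1, 21)]
treatment, control = randomly_assign(participants, seed=42)
print("treatment:", treatment)
print("control:  ", control)
```

Because chance alone decides group membership, the groups are expected to be equivalent on measured and unmeasured variables, which matching alone cannot promise.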
-
What are group threats to internal validity?
- Regression towards the mean
- Selection
- Selection by time interactions
- selection by maturation
- selection by history
-
What is regression to the mean?
- Applies to pre-post design
- Scores (both high and low) regress to the mean
- Inflates low group and deflates high group change
- Impacts in gain score situations with extreme groups
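A small simulation can make regression to the mean concrete. This sketch assumes each observed score is a stable true score plus independent random error, with no treatment at all; every number is invented.

```python
import random
from statistics import mean

rng = random.Random(0)
n = 10_000

# Assumed model: observed score = true score + random measurement error.
true_scores = [rng.gauss(50, 10) for _ in range(n)]
pretest = [t + rng.gauss(0, 10) for t in true_scores]
posttest = [t + rng.gauss(0, 10) for t in true_scores]   # no treatment applied

# Pick the extreme groups using pretest scores only.
ranked = sorted(range(n), key=lambda i: pretest[i])
cut = n // 10
low_group, high_group = ranked[:cut], ranked[-cut:]

def mean_gain(group):
    """Average posttest minus pretest for the given participant indices."""
    return mean(posttest[i] - pretest[i] for i in group)

low_gain = mean_gain(low_group)
high_gain = mean_gain(high_group)
print(f"lowest 10% mean gain:  {low_gain:+.1f}")
print(f"highest 10% mean gain: {high_gain:+.1f}")
```

The low group appears to improve and the high group appears to decline even though nothing was done to either, which is why gain scores from extreme groups are suspect.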
-
How to mitigate the effects of regression to the mean?
- Avoid comparison of extreme groups if possible
- Test and retest to establish a more "stable" baseline
- Use a control group for each extreme
- Use high reliability measures
-
What is selection threat?
- A threat that is due to the different group characteristics present at the start of the study
- Affects all quasi-experimental studies
-
What are interactions with selection?
- Internal validity threats interact with selection to produce effects not due to treatment
- Selection maturation (mature at different rates)
- Selection history (groups from different settings so different events)
-
How to mitigate the effects of selection and interactions?
- Random assignment
- Matching (only relates to variables being matched)
- Random assignment and matching
- Check pretest equivalence; applies to quasi- and true experimental designs (e.g., ANOVAs testing for this)
-
What are mortality threats?
- subjects leave the study for different/systematic non-random reasons
- results in selection artifact because posttest group is different
-
How to mitigate the impact of mortality threats?
- Control groups if mortality cause is the same
- Shorter time interval between start and finish
- Monitor incidence of mortality
- Use pretest to compare scores of those who dropped and those who did not
-
What are atypical behavior threats?
- not considered part of validity by all because cannot be controlled by design
- Differences between groups not caused by the design
- Caused by group communication or public knowledge of treatment groups
-
How to mitigate the impact of atypical behavior threats?
- NOT controlled by random assignment
- Monitor the research project
- If possible prevent groups from communicating with each other
-
How is external validity determined?
Look at threats to the sample that prevent generalizing to the population of interest
-
What is a population?
Complete set of observations about which we draw conclusions
-
What is an experimentally accessible population?
the subset of the population from which the sample is drawn
-
What is a sample?
Actual observations included in the study
-
What are the steps in selecting a sample?
- 1. Define the observation unit (individual or object)
- 2. Define the target population
- 3. Define the boundaries of the population (affects generalizability)
- 4. Define the sampling technique
- 5. Obtain a sampling frame to use with the technique
- 6. Select the sample
-
What are the types of samples?
- Probability samples
- Simple random, Systematic, stratified, cluster
- Non-probability samples:
- Convenience, quota, purposive
-
What is a simple random sample?
- Every item has an equal chance of being selected
- Has no control on sample make-up
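A minimal sketch of a simple random sample, assuming a numbered sampling frame of 500 hypothetical units and Python's standard `random.sample`, which draws without replacement:

```python
import random

rng = random.Random(1)                     # seeded only so the draw is repeatable
sampling_frame = list(range(1, 501))       # hypothetical unit IDs 1..500

# Every unit has the same chance of being selected.
srs = rng.sample(sampling_frame, k=50)
print(sorted(srs))
```

Nothing in the draw controls who ends up in the sample, so a small subgroup can be missed entirely by chance.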
-
What is systematic sampling?
- Select every kth element from the population list
- Starting point should be randomly selected
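The two points above can be sketched as follows, assuming an ordered frame of 100 hypothetical units and k = 10 (the function name is illustrative):

```python
import random

def systematic_sample(frame, k, seed=None):
    """Take every kth element after a random start within the first k elements."""
    rng = random.Random(seed)
    start = rng.randrange(k)          # random starting index: 0 .. k-1
    return frame[start::k]

frame = list(range(1, 101))           # hypothetical ordered list of units 1..100
sample = systematic_sample(frame, k=10, seed=3)
print(sample)
```

If the list has a periodic pattern that lines up with k, the sample can be biased, which is one reason the random start alone is not a full safeguard.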
-
What is stratified sampling?
- Separate samples are created for each stratum (characteristic of interest)
- Ensures each characteristic of interest is represented
- Can be weighted in relation to population
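A rough sketch of proportionate stratified sampling, assuming a made-up student list with two strata and a 10% sampling fraction (`stratified_sample` and `stratum_of` are hypothetical names):

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, fraction, seed=None):
    """Draw a separate random sample of the same fraction from every stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    sample = {}
    for name, members in strata.items():
        n = max(1, round(len(members) * fraction))   # at least one per stratum
        sample[name] = rng.sample(members, n)
    return sample

# Hypothetical population: 60 freshmen and 40 seniors.
students = [("freshman", i) for i in range(60)] + [("senior", i) for i in range(40)]
by_year = stratified_sample(students, stratum_of=lambda s: s[0], fraction=0.1, seed=7)
for year, members in by_year.items():
    print(year, len(members))
```

Using the same fraction in every stratum keeps the sample proportional to the population; unequal fractions would need weighting at analysis time.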
-
What is cluster sampling?
- The population is divided into heterogeneous clusters
- Clusters should be already formed in the population
- One cluster is randomly selected
-
What is multi-stage cluster sampling?
- Two step procedure after clusters are identified
- 1. Random selection of a cluster
- 2. Random sample in each cluster, often based on strata
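The two steps can be sketched like this. The school data are invented, and the sketch lets the first stage select more than one cluster, which is an assumption beyond what the card states.

```python
import random

def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: randomly pick clusters. Stage 2: random sample inside each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {
        name: rng.sample(clusters[name], min(n_per_cluster, len(clusters[name])))
        for name in chosen
    }

# Hypothetical population: 5 schools with 30 students each.
schools = {f"school_{c}": [f"{c}{i}" for i in range(30)] for c in "ABCDE"}
sample = two_stage_cluster_sample(schools, n_clusters=2, n_per_cluster=5, seed=11)
for school, students in sample.items():
    print(school, students)
```

The second stage here is a plain random draw; a stratified draw within each cluster would replace the inner `rng.sample` call.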
-
What is probabilistic cluster sampling?
- Random or stratified random (can underrepresent larger units)
- Probability Proportionate to size (units are weighted based on size of unit)
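A simplified illustration of probability proportionate to size, assuming two hypothetical schools and using `random.choices`, which draws with replacement (real PPS designs often select without replacement):

```python
import random
from collections import Counter

# Hypothetical cluster sizes (number of students per school).
cluster_sizes = {"small_school": 100, "large_school": 900}

rng = random.Random(5)
names = list(cluster_sizes)
weights = [cluster_sizes[name] for name in names]

# Repeat the weighted draw many times to show the selection probabilities.
draws = rng.choices(names, weights=weights, k=1000)
counts = Counter(draws)
print(counts)
```

The larger unit is chosen far more often, which offsets the underrepresentation that equal-probability selection of clusters can cause.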
-
What is convenience sampling?
- Non-probabilistic sampling
- Based on easily accessible elements in a population
- Subject pools
- Volunteers
- Often subject to all external validity threats
-
What is quota sampling?
- Non-probabilistic sampling
- Like stratified without random element
- Begin with a matrix targeting characteristics
- Collect data from each person having the characteristics
- Tries to represent the population
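A minimal sketch of filling quotas, assuming volunteers arrive in a fixed order and are accepted until each cell of a hypothetical quota matrix is full:

```python
def quota_sample(volunteers, quotas, trait_of):
    """Accept people in arrival order until each quota cell is filled."""
    filled = {trait: [] for trait in quotas}
    for person in volunteers:
        trait = trait_of(person)
        if trait in filled and len(filled[trait]) < quotas[trait]:
            filled[trait].append(person)
    return filled

# Hypothetical volunteers as (name, level) pairs, in order of arrival.
volunteers = [
    ("Ana", "grad"), ("Ben", "undergrad"), ("Cy", "undergrad"),
    ("Di", "grad"), ("Eli", "undergrad"), ("Fay", "grad"),
]
picked = quota_sample(volunteers, quotas={"grad": 2, "undergrad": 2},
                      trait_of=lambda person: person[1])
print(picked)
```

There is no random draw anywhere, so the sample matches the target characteristics but selection within each cell still depends on who shows up first.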
-
What is purposive sampling or judgmental sampling?
- Non-probabilistic sampling
- Used for pilot testing or manipulation
- Use when members of subset are easy to identify but hard to include all
- Use when only generalizing to that subset
-
What is sampling error?
- The difference between the true score of the sample and the true score of the population
- Can be estimated with probabilistic sampling methods
- NOT an error made in creating the sample
-
What is a sampling frame and coverage bias?
- A listing of the population from which a sample will be drawn
- As a representative population list is formed, consider how results will be impacted if some groups are not represented
-
What is non-response bias?
- Bias results from non-response if non-responders are different from responders
- Is there a systematic difference between those that did not participate and those that did?
- Can be accounted for through statistical modeling
-
What is external validity?
- How well do your findings generalize across settings, samples and times?
- How can we generalize to the population on the basis of sample-based results?
-
What are threats to external validity?
- Interaction of selection and treatment
- Interaction of setting and treatment
- Interaction of history and treatment
-
What are threats to interaction of selection and treatment?
- Occurs when treatment effects only generalize to those selected in the same way as the sample
- Applies to non-probabilistic sampling
- Solutions
- random selection
- make participation convenient
-
What are threats to interaction of setting and treatment?
- Treatment effects generalize only to settings used in the study
- Solution:
- Vary the settings and analyze IV/DV within settings
-
What are threats to interaction of history and treatment?
- Look at when you collected your sample, will it generalize across other time frames
- Solution:
- Replicate the study at different times
- Conduct research review to see if prior evidence refutes the relationship
-
What is sampling error?
Sampling error is the discrepancy between a sample statistic and the corresponding population parameter.
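A tiny worked example, assuming the statistic of interest is the mean and using an invented population of ten scores:

```python
from statistics import mean

# Hypothetical population of 10 scores and one sample of 4 drawn from it.
population = [52, 61, 47, 70, 55, 64, 58, 49, 66, 58]
sample = [61, 47, 70, 58]

population_mean = mean(population)    # population parameter
sample_mean = mean(sample)            # sample statistic
sampling_error = sample_mean - population_mean
print(f"sampling error = {sampling_error}")
```

In practice the population parameter is unknown, so the error is estimated (for example with a standard error) and not computed directly like this.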
-
What are the null and alternative hypotheses?
- Null: What is assumed to be true about a situation at the start of an experiment
- Alternative: what is true if the null is false; usually research is trying to show the null no longer holds because of treatment
-
What is a Type I error?
- Null hypothesis is true but you reject it and conclude the alternative is true
- You are concluding treatment was effective when it was not
-
What is a Type II error?
- The null hypothesis is false but you decide not to reject it.
- You conclude the treatment was ineffective when it was effective
-
What is significance level?
- The probability of making a Type I error
- Most common is .05 (needs to be lower if loss of income, job, health, or life is at stake)
- Defines an unlikely sample assuming a null hypothesis is true
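A simulation sketch of what the significance level means. It assumes a two-sided one-sample z-test with known standard deviation, which the notes do not specify, and draws every sample from a population where the null is true.

```python
import random
from math import sqrt
from statistics import NormalDist

def type_i_error_rate(alpha=0.05, n=30, n_experiments=2000, seed=0):
    """Share of experiments that reject a TRUE null (mean = 0, sd = 1)."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided critical value
    rejections = 0
    for _ in range(n_experiments):
        sample = [rng.gauss(0, 1) for _ in range(n)]
        z = (sum(sample) / n) * sqrt(n)            # z statistic with known sd = 1
        if abs(z) > z_crit:
            rejections += 1
    return rejections / n_experiments

rate = type_i_error_rate()
print(f"rejected a true null in {rate:.1%} of experiments")
```

The rejection rate should land near the chosen alpha, since every rejection in this setup is a Type I error by construction.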
-
What is power?
- The probability you correctly reject a false null hypothesis
- or
- We reject null when alternative is true
- (1-BETA)
-
What are influences on power?
- Increase the following to increase power
- sample size
- significance level (but increasing the significance level makes Type I errors more likely)
- effect size
- Increasing variance DECREASES power
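These influences can be checked numerically. The sketch below assumes a two-sided one-sample z-test with known standard deviation; the function name and the baseline values are illustrative, not from the notes.

```python
from math import sqrt
from statistics import NormalDist

def z_test_power(effect, sigma, n, alpha=0.05):
    """Power of a two-sided one-sample z-test for a true mean shift `effect`."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = effect * sqrt(n) / sigma               # standardized true shift
    # Probability the z statistic falls in either rejection region.
    return (1 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)

baseline = z_test_power(effect=0.5, sigma=1.0, n=20)
print(f"baseline:           {baseline:.2f}")
print(f"larger sample:      {z_test_power(0.5, 1.0, 40):.2f}")
print(f"larger alpha:       {z_test_power(0.5, 1.0, 20, alpha=0.10):.2f}")
print(f"larger effect:      {z_test_power(0.8, 1.0, 20):.2f}")
print(f"larger variability: {z_test_power(0.5, 2.0, 20):.2f}")
```

Each comparison changes one input at a time relative to the baseline, so the direction of each influence can be read straight from the printed values.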