Reliability and Validity

What is a construct?
- An underlying trait that is responsible for some observable behavior
- Indirect measure
What is construct validity?
- Refers to inferences made from measured variables to theoretical constructs
- Examines how well the assessment “matches up with” the construct
What is validity?

“the degree to which evidence and theory support the interpretations of test scores entailed by proposed uses of the test”
What are the two threats to validity?
- Construct Underrepresentation (Less)
- Construct Irrelevant Variance (More)
What is construct underrepresentation?
- Not measuring the construct as broadly as intended
- “the degree to which a test fails to capture important aspects of the construct”*
What is construct irrelevant variance?

Measuring consistently something that is not part of the construct
What are the five sources of validity evidence?
- Evidence based on Test Content
- Evidence based on Response Processes
- Evidence based on Internal Structure
- Evidence based on Relations to Other Variables
- Evidence based on Consequences of Testing
What does test content refer to?
- Analysis of the relationship between the test’s content and the construct of interest
- Themes, wording, and format of the items, tasks, or questions on a test, as well as the procedural guidelines for its administration and scoring
What does response process validity refer to?
- Analyses of examinees’ response processes are used to determine the fit between the construct and the examinee’s actual performance or response
- Includes the processes of judges, raters, or observers when evaluating examinees’ performances
What types of evidence support content validity?
- Use logical or empirical analysis
- Most frequently relies on “expert” judgment
- Part of test development
What types of evidence support response process validity?
- Theoretical and empirical
- Process focused on a study of examinees
- Interview test takers
- Think alouds
- Examine rating or scoring process
What does internal structure refer to?

The degree to which the relationships among test items and test components conform to the construct on which the interpretations are based.
What types of evidence support internal structure validity?
- Item analysis
- Reliability
- Factor analysis
What does external validity refer to?
- relationship between the test scores and variables outside of the test
- e.g., GRE scores predict student success in graduate school
- Relationship to other tests
What types of evidence support external validity?
- Convergent and discriminant evidence
- Test-criterion evidence (how well scores predict a criterion; which criterion matters depends on the test's purpose)
- Validity Generalization (Can the instrument be generalized to other situations)
What is convergent evidence?

Relationship between scores on measures of the same construct
What is discriminant evidence?

Relationship between test scores and measures of different constructs, should not be related
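Convergent and discriminant evidence are usually summarized as correlations. The sketch below is only an illustration: the three score lists are made-up numbers, and the scale names are hypothetical. It shows the expected pattern, a high correlation with a measure of the same construct and a near-zero correlation with a measure of a different construct.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical scores for six examinees (invented for illustration only).
new_anxiety_scale = [12, 18, 25, 31, 36, 40]
established_anxiety_scale = [10, 20, 24, 33, 35, 42]  # same construct
shoe_size = [9, 7, 10, 8, 9, 8]                       # different construct

# Convergent evidence: the two anxiety measures should correlate strongly.
convergent_r = pearson_r(new_anxiety_scale, established_anxiety_scale)

# Discriminant evidence: anxiety and shoe size should be close to unrelated.
discriminant_r = pearson_r(new_anxiety_scale, shoe_size)

print(f"convergent r = {convergent_r:.2f}, discriminant r = {discriminant_r:.2f}")
```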
What does consequential (consequences of testing) refer to?
- Intended consequences happen
- Unintended consequences are not occurring, especially issues of bias and fairness
What types of evidence support consequences of testing?
- Intended
- Shaping the curriculum
- Teaching the content of the test
- Unintended Consequences
- Narrowing the curriculum
- Teaching to the test and only the test
- Subpopulation differences
- Develops over time
What is internal validity?

The extent to which a cause and effect relationship is isolated from competing influences
What is the purpose of internal validity?
- Determines the soundness of conclusions/interpretations of a causal relationship
- Asks whether the variables are influenced by other variables
What are the four categories of threats to internal validity?
- Time threats
- Group threats
- Mortality
- Atypical behavior
What is a time threat in internal validity?

Impacts on dependent variable over time are different because of factors other than the treatment variable
What are group threats to internal validity?
- Explanations for changes in the variables other than experimental differences created by the researcher
- The way the groups were selected may cause the differences
What are mortality threats to internal validity?
- Loss of subjects in a study
- Impacts all studies long enough to have dropouts
What are atypical behavior threats to internal validity?
- Research design cannot eliminate
- Actions of those in the groups which change the treatment (Treatment and control group interact about the intervention)
What is testing as a time threat and possible solutions?
- The effect of the first test on the second
- Effect of a publication about the treatment
- Solutions:
- Lengthen time interval between tests
- Disguise use of the pretest (don't disclose the next test)
- Use control groups
What is history (time threat) and possible solutions?
- An event that occurs during treatment that affects subject response
- Usually events that could be controlled
- Solutions
- Use control groups
- Use shorter time
What is maturation as a time threat and possible solutions?
- Naturally occurring processes within participants that happen over time and may change their performance
- Includes fatigue, boredom, growth, intellectual development
- Solutions:
- Shorten the time of the study
- Use a control group with a similar maturation rate
What is instrumentation as a time threat and possible solutions?
- Changes in measurement procedure in a pretest - posttest study
- Includes: calibration, rater changes, score use
- Solutions:
- control group
- standardize the measurement procedure
What is pre-experimental research design?
- One group with post-test only such as pilot testing
- Or
- One group with pre-and post test
- These are the weakest designs because they do not strongly link group changes to treatment
What are quasi experimental designs?
- Designs that involve a control group but do not use random assignment
- Or
- One experimental group with multiple tests, the first is a baseline
What are possible threats to quasi experimental design?
- Time threats
- Selection (group changes at time of intervention)
- Instrumentation
- Maturation
What is experimental design?
- Pre-test, post test, control group design w/random assignment
- Post test control group design with random assignment
- Still has threats to validity but usually eliminates group threats
How to improve internal validity?
- Use random assignment
- Use Pretest
- Use control/comparison group
What are types of statistical designs?
- Within-subject designs
- Between subjects designs
- Mixed designs
What are types of experimental designs?
- Pre-experimental designs
- Quasi experimental designs
- Experimental design
What is a within subject design?
- Measures individuals within a group multiple times both before and after treatment
- Change over time is of interest
- Applies to pre-experimental and quasi experimental
What is between subjects design?
- Compares the scores of two or more groups
- Group comparison are of interest
- Applies to quasi-experimental and experimental designs
What is mixed design (split-plot)?
- Both between and within designs used
- e.g., look at individual student scores as a result of a treatment in two classes
- Applies to quasi-experimental and experimental
How can you control for differences between groups?
- Statistical adjustment based on theoretical argument
- Matching of variables between groups
- Random assignment creates random equivalence on all variables
- Matching and Random Assignment
What are group threats to internal validity?
- Regression towards the mean
- Selection
- Selection by time interactions
- selection by maturation
- selection by history
What is regression to the mean?
- Applies to pre-post design
- Scores (both high and low) regress to the mean
- Inflates low group and deflates high group change
- Impacts in gain score situations with extreme groups
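A small simulation can make regression to the mean concrete. This is a sketch under assumed numbers (true scores with mean 100, measurement noise with standard deviation 10, cutoffs at 85 and 115), none of which come from the card set. No treatment is applied between the two tests, yet the extreme groups still appear to change.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the run is repeatable

# Each simulated person has a stable true score; each test adds independent
# measurement noise. Nothing happens between pretest and posttest.
true_scores = [rng.gauss(100, 10) for _ in range(5000)]
pretest = [t + rng.gauss(0, 10) for t in true_scores]
posttest = [t + rng.gauss(0, 10) for t in true_scores]

def mean_gain(selector):
    """Average posttest-minus-pretest gain for people whose pretest passes selector."""
    gains = [post - pre for pre, post in zip(pretest, posttest) if selector(pre)]
    return mean(gains)

low_group_gain = mean_gain(lambda pre: pre < 85)     # extreme low scorers
high_group_gain = mean_gain(lambda pre: pre > 115)   # extreme high scorers
everyone_gain = mean_gain(lambda pre: True)          # whole group

print(f"low group gain:  {low_group_gain:+.1f}")
print(f"high group gain: {high_group_gain:+.1f}")
print(f"overall gain:    {everyone_gain:+.1f}")
```

The low group's apparent gain and the high group's apparent loss come entirely from measurement noise, which is why gain scores for extreme groups are hard to interpret without a control group.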
How to mitigate the effects of regression to the mean?
- Avoid comparison of extreme groups if possible
- Test-retest to establish a more "stable" baseline
- Use a control group for each extreme
- Use high reliability measures
What is selection threat?
- A threat that is due to the different group characteristics present at the start of the study
- Affects all quasi-experimental studies
What are interactions with selection?
- Internal validity threats interact with selection to produce effects not due to treatment
- Selection maturation (mature at different rates)
- Selection history (groups from different settings so different events)
How to mitigate the effects of selection and interactions?
- Random assignment
- Matching (only relates to variables being matched)
- Random assignment and matching
- Check pretest equivalence, e.g., with an ANOVA (applies to quasi-experimental and true experimental designs)
What are mortality threats?
- subjects leave the study for different/systematic non-random reasons
- results in selection artifact because posttest group is different
How to mitigate the impact of mortality threats?
- Control groups if mortality cause is the same
- Shorter time interval between start and finish
- Monitor incidence of mortality
- Use pretest to compare scores of those who dropped and those who did not
What are atypical behavior threats?
- not considered part of validity by all because cannot be controlled by design
- Differences between groups not caused by the design
- Caused by group communication or public knowledge of treatment groups
How to mitigate the impact of atypical behavior threats?
- NOT controlled by random assignment
- Monitor the research project
- If possible prevent groups from communicating with each other
How is external validity determined?

Look at threats to the sample that prevent generalizing to the population of interest
What is a population?

Complete set of observations about which we draw conclusions
What is an experimentally accessible population?

The subset of the population from which the sample is drawn
What is a sample?

Actual observations included in the study
What are the steps in selecting a sample?
- 1. Define the observation unit (individual or object)
- 2. Define the target population
- 3. Define the boundaries of the population (affects generalizability)
- 4. Define the sampling technique
- 5. Obtain a sampling frame to use with the technique
- 6. Select the sample
What are the types of samples?
- Probability samples
- Simple random, Systematic, stratified, cluster
- Non-probability samples:
- Convenience, quota, purposive
What is a simple random sample?
- Every item has an equal chance of being selected
- Has no control on sample make-up
What is systematic sampling?
- Select every kth element of the sampling frame
- Starting point should be randomly selected
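The two bullets above translate almost directly into code. This is a minimal sketch; the function name `systematic_sample` and the 100-person roster are illustrative assumptions, not part of the card set.

```python
import random

def systematic_sample(frame, k, rng=random):
    """Return every kth element of frame, starting at a random offset in [0, k)."""
    if k < 1:
        raise ValueError("k must be at least 1")
    start = rng.randrange(k)   # random starting point within the first k elements
    return frame[start::k]     # then every kth element after it

# Hypothetical sampling frame: ID numbers 1..100.
roster = list(range(1, 101))
sample = systematic_sample(roster, k=10, rng=random.Random(42))
print(sample)
```

Passing a seeded `random.Random` makes the draw repeatable; omitting `rng` uses the module-level generator.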
What is stratified sampling?
- Separate samples are created for each stratum (characteristic of interest)
- Ensures each characteristic of interest is represented
- Can be weighted in relation to population
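As a rough sketch of proportional stratified sampling, the code below draws the same fraction from each stratum, so every stratum is represented in proportion to its share of the population. The function name, the student data, and the 10% fraction are all assumptions made for illustration.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, fraction, rng):
    """Draw a simple random sample of the same fraction within each stratum."""
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)

    sample = []
    for members in strata.values():
        n = max(1, round(len(members) * fraction))  # at least one per stratum
        sample.extend(rng.sample(members, n))
    return sample

# Hypothetical population: 60 freshmen and 40 seniors.
students = [("freshman", i) for i in range(60)] + [("senior", i) for i in range(40)]

picked = stratified_sample(students, lambda s: s[0], 0.1, random.Random(1))
freshmen_picked = sum(1 for level, _ in picked if level == "freshman")
seniors_picked = sum(1 for level, _ in picked if level == "senior")
print(freshmen_picked, seniors_picked)
```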
What is cluster sampling?
- The population is divided into heterogeneous clusters
- Clusters should be already formed in the population
- One cluster is randomly selected
What is multi-stage cluster sampling?
- Two step procedure after clusters are identified
- 1. Random selection of a cluster
- 2. Random sample in each cluster, often based on strata
What is probabilistic cluster sampling?
- Random or stratified random (can underrepresent larger units)
- Probability Proportionate to size (units are weighted based on size of unit)
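Probability proportionate to size can be sketched with a weighted random draw. The school names and enrollments below are hypothetical, and `pps_draw` is an illustrative helper rather than a standard routine. It selects one cluster per call, with each cluster's chance of selection proportional to its size.

```python
import random
from collections import Counter

# Hypothetical clusters (schools) and their sizes (enrollment).
school_sizes = {"North": 1200, "East": 600, "West": 200}

def pps_draw(sizes, rng):
    """Pick one cluster with probability proportional to its size."""
    names = list(sizes)
    weights = [sizes[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]

# Repeat the draw many times to see the selection probabilities emerge.
rng = random.Random(5)
draws = Counter(pps_draw(school_sizes, rng) for _ in range(6000))
print(draws)
```

Over many draws, the largest school is selected most often and the smallest least often, in contrast to a simple random draw of clusters where each school would be equally likely.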
What is convenience sampling?
- Non-probabilistic sampling
- Based on easily accessible elements in a population
- Subject pools
- Volunteers
- Often subject to all external validity threats
What is quota sampling?
- Non-probabilistic sampling
- Like stratified without random element
- Begin with a matrix targeting characteristics
- Collect data from each person having the characteristics
- Tries to represent the population
What is purposive sampling or judgmental sampling?
- Non-probabilistic sampling
- Used for pilot testing or manipulation
- Use when members of subset are easy to identify but hard to include all
- Use when only generalizing to that subset
What is sampling error?
- The difference between the true score of the sample and the true score of the population
- Can be estimated with probabilistic sampling methods
- NOT an error made in creating the sample
What is a sampling frame, and what is coverage bias?
- A listing of the population from which a sample will be drawn
- Coverage bias: when building the frame, consider which groups are missing from it and how their absence will affect the results
What is non-response bias?
- Bias arises when non-responders differ from responders
- Is there a systematic difference between those that did not participate and those that did?
- Can be accounted for through statistical modeling
What is external validity?
- How well do your findings generalize across settings, samples and times?
- How can we generalize to the population on the basis of sample-based results?
What are threats to external validity?
- Interaction of selection and treatment
- Interaction of setting and treatment
- Interaction of history and treatment
What are threats to interaction of selection and treatment?
- Occurs when treatment effects only generalize to those selected in the same way as the sample
- Applies to non-probabilistic sampling
- Solutions
- random selection
- make participation convenient
What are threats to interaction of setting and treatment?
- Treatment effects generalize only to settings used in the study
- Solution:
- Vary the settings and analyze IV/DV within settings
What are threats to interaction of history and treatment?
- Look at when you collected your sample, will it generalize across other time frames
- Solution:
- Replicate the study at different times
- Conduct research review to see if prior evidence refutes the relationship
What is sampling error?

Sampling error is the discrepancy between a sample statistic and the corresponding population parameter.
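The definition above can be illustrated with a simulated population. This is a sketch with assumed values (a population of 10,000 scores with mean 50 and standard deviation 10); the sample mean is the statistic and the population mean is the parameter.

```python
import random
from statistics import mean

rng = random.Random(7)

# Hypothetical population of 10,000 scores.
population = [rng.gauss(50, 10) for _ in range(10_000)]
population_mean = mean(population)   # the population parameter

def sampling_error(n):
    """Sample mean minus population mean for one simple random sample of size n."""
    sample = rng.sample(population, n)
    return mean(sample) - population_mean

# Typical size of the sampling error for small and large samples.
small = mean(abs(sampling_error(10)) for _ in range(200))
large = mean(abs(sampling_error(1000)) for _ in range(200))
print(f"n=10: {small:.2f}   n=1000: {large:.2f}")
```

Every sample here is drawn correctly, so the discrepancy is due to chance alone, and it shrinks as the sample size grows.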
What are null and alternative hypotheses?
- Null: What is assumed to be true about a situation at the start of an experiment
- Alternative: what is true if the null is false; research usually seeks evidence that the null no longer holds because of the treatment
What is a Type I error?
- Null hypothesis is true but you reject it and conclude the alternative is true
- You are concluding treatment was effective when it was not
What is a Type II error?
- The null hypothesis is false but you decide not to reject it.
- You conclude the treatment was ineffective when it was effective
What is significance level?
- The probability of making a Type I error
- Most common is .05 (needs to be lower if loss of income, job, health, or life is at stake)
- Defines an unlikely sample assuming a null hypothesis is true
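One way to see that the significance level is the Type I error rate is to simulate many studies in which the null hypothesis is true and count how often it is rejected. The sketch below assumes a two-sided z-test with known standard deviation 1 and a sample size of 25; these choices are illustrative only.

```python
import random
from statistics import NormalDist, mean

rng = random.Random(123)
ALPHA = 0.05   # significance level
N = 25         # assumed sample size per simulated study

# Two-sided cutoff: samples beyond it are "unlikely" if the null is true.
critical_z = NormalDist().inv_cdf(1 - ALPHA / 2)

def rejects_null():
    """Run one simulated study where the null (mean = 0, sd = 1) is TRUE."""
    sample = [rng.gauss(0, 1) for _ in range(N)]
    z = mean(sample) * N ** 0.5   # z = (sample mean - 0) / (1 / sqrt(N))
    return abs(z) > critical_z

trials = 4000
type_i_rate = sum(rejects_null() for _ in range(trials)) / trials
print(f"proportion of true nulls rejected: {type_i_rate:.3f}")
```

Every rejection in this simulation is a Type I error, and the observed proportion should land close to the chosen significance level.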
What is power?
- The probability you correctly reject a false null hypothesis
- or
- We reject the null when the alternative is true
- (1 - β)
What are influences on power?
- Increase the following to increase power
- sample size
- significance level (but increasing the significance level makes Type I errors more likely)
- effect size
- Increasing variance DECREASES power
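These influences can be checked with a closed-form power calculation. The sketch below assumes a one-sided z-test of a mean with known standard deviation, which is a simplification chosen for illustration; `z_test_power` and the baseline values are hypothetical. Power is computed as 1 - β, where β is the probability of failing to reject a false null.

```python
from statistics import NormalDist

def z_test_power(effect, sd, n, alpha):
    """Power of a one-sided z-test of H0: mean = 0 against a true mean of effect > 0."""
    std_normal = NormalDist()
    critical_z = std_normal.inv_cdf(1 - alpha)   # rejection cutoff under H0
    shift = effect / (sd / n ** 0.5)             # true mean in standard-error units
    beta = std_normal.cdf(critical_z - shift)    # P(fail to reject | H0 false)
    return 1 - beta

baseline = z_test_power(effect=0.5, sd=1.0, n=20, alpha=0.05)

# Change one factor at a time and compare with the baseline.
bigger_sample = z_test_power(effect=0.5, sd=1.0, n=40, alpha=0.05)
bigger_alpha = z_test_power(effect=0.5, sd=1.0, n=20, alpha=0.10)
bigger_effect = z_test_power(effect=0.8, sd=1.0, n=20, alpha=0.05)
more_variance = z_test_power(effect=0.5, sd=2.0, n=20, alpha=0.05)

print(f"baseline={baseline:.2f} n up={bigger_sample:.2f} alpha up={bigger_alpha:.2f} "
      f"effect up={bigger_effect:.2f} variance up={more_variance:.2f}")
```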

Author

hoosera

170515

Card Set

Reliability and Validity

Description

construct validity, reliability, internal validity, external validity, statistical conclusion validity

Updated

2012-10-15T22:49:04Z