What is statistics?
A field of study that involves methods for describing and analyzing data. It reduces uncertainty and provides for better decision making.
What is a population?
The universe of cases or subjects of interest to the analyst: people, things, or concepts.
What is a sample?
An observable subset of the population. It needs to mirror the population.
What are some types of samples?
Random and non-random
What is a random sample?
All units have an equal chance of being included in the sample.
What are 3 ways to obtain a random sample?
- 1. Using a table of random numbers
- 2. Computer generated random samples
- 3. Software selected random sample
Describe a simple random sample.
- -Assign every element a number; in a class of 25, assign 1 to 25.
- -Determine the sample size, for example 5.
- -Use a table of random numbers to draw 5 random numbers between 1 and 25.
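The steps above can be sketched in Python. This is only a minimal illustration: it swaps the table of random numbers for the standard library's `random.sample`, and the function name `simple_random_sample` and its `seed` parameter are assumptions made for the example.

```python
import random

def simple_random_sample(population_size, sample_size, seed=None):
    """Draw `sample_size` distinct element numbers from 1..population_size."""
    rng = random.Random(seed)  # optional seed, only to make a draw repeatable
    element_numbers = range(1, population_size + 1)  # number every element
    # Sample without replacement; every element is equally likely to be picked.
    return sorted(rng.sample(element_numbers, sample_size))

# A class of 25 with a sample size of 5, as in the notes.
print(simple_random_sample(population_size=25, sample_size=5))
```

Each run prints five different numbers between 1 and 25 unless a seed is supplied.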
Describe a systematic sample
- -Produce a list of the names in the population.
- -Determine the sample size (5).
- -Divide the population total (25) by the sample size (5) to get the sampling interval: 25 / 5 = 5.
- -Take every 5th name on the list for inclusion in the sample.
- -If the 5th person refuses, the analyst must begin the count to 5 again.
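A rough Python sketch of the procedure above, assuming a hypothetical roster of 25 placeholder names. It follows the notes in starting at the 5th name (many textbooks instead pick a random starting point within the first interval) and does not model refusals.

```python
def systematic_sample(names, sample_size):
    """Take every k-th name, where k = population size // sample size."""
    interval = len(names) // sample_size  # 25 // 5 = 5
    # Positions 5, 10, 15, ... are zero-based indices 4, 9, 14, ...
    return names[interval - 1::interval][:sample_size]

roster = [f"Person {i}" for i in range(1, 26)]  # hypothetical list of 25 names
print(systematic_sample(roster, 5))
```

With this roster the sample is the 5th, 10th, 15th, 20th, and 25th names.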
Describe a stratified sample
Divide the population into strata (groups) whose members share similar characteristics. Draw a random sample from each stratum.
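As an illustration of drawing a random sample from every stratum, here is a small Python sketch. The student list, the two strata labels, and the function name `stratified_sample` are all made up for the example, and it assumes an equal number of draws per stratum.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, per_stratum, seed=None):
    """Group units into strata, then draw a random sample from each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)  # assign each unit to its stratum
    # Every stratum contributes to the sample.
    return {name: rng.sample(members, per_stratum)
            for name, members in strata.items()}

# Hypothetical population: 10 freshmen and 10 seniors.
students = [(f"s{i:02d}", "freshman" if i <= 10 else "senior") for i in range(1, 21)]
sample = stratified_sample(students, stratum_of=lambda s: s[1], per_stratum=3)
print(sample)
```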
What is cluster sampling?
Sampling based on selecting clusters from a population and then sampling from those clusters. Example: geographic clusters such as rural, suburbs, and city.
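The two stages can be sketched in Python as follows. The household lists and the function name `cluster_sample` are hypothetical; the cluster labels reuse the geography example.

```python
import random

def cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: pick some clusters at random. Stage 2: sample within each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # only a subset of clusters
    return {name: rng.sample(clusters[name], n_per_cluster) for name in chosen}

# Hypothetical clusters of 8 households each.
areas = {
    "rural": [f"rural household {i}" for i in range(1, 9)],
    "suburbs": [f"suburban household {i}" for i in range(1, 9)],
    "city": [f"city household {i}" for i in range(1, 9)],
}
print(cluster_sample(areas, n_clusters=2, n_per_cluster=3))
```

Only two of the three clusters appear in any one result, so the remaining cluster contributes nothing to that sample.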
Describe the differences between stratified and cluster sampling
Cluster samples only include a subset of the clusters. Stratified samples include all of the strata. Stratified samples allow for more precision.
Name some non-random sampling types
- 1. Convenience sample - surveying the first 10 people in a parking lot
- 2. Volunteer sample - American Idol
- 3. Judgmental sample - a sample based on expert judgment
- 4. Quota sample - a convenience sample designed to provide a certain distribution
Define sampling error
The difference between the sample and the larger population that is due to pure random chance.
What is true of sampling error?
As sample size increases, sampling error decreases.
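One common way to quantify this relationship is the standard error of the mean, the population standard deviation divided by the square root of the sample size. The notes do not name this formula, so treat the sketch below as a supplementary illustration; the standard deviation of 15 is an assumed value.

```python
import math

def standard_error(population_sd, sample_size):
    """Standard error of the sample mean: sd / sqrt(n)."""
    return population_sd / math.sqrt(sample_size)

for n in (25, 100, 400):
    print(n, standard_error(15, n))
```

Under this formula, quadrupling the sample size halves the standard error.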
Define sampling bias
The differences between a sample and the population that are not due to pure random chance.
Describe a fact about sampling bias.
Unlike sampling error, sampling bias will not decrease as your sample size increases.
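A toy Python sketch of this point, under invented assumptions: the population is the values 1 to 200, but the sampling frame can only reach the lower half. However large the sample, its mean stays below the true mean.

```python
import random

population = list(range(1, 201))                 # hypothetical values 1..200
reachable = [v for v in population if v <= 100]  # the frame misses the upper half
true_mean = sum(population) / len(population)

def biased_sample_mean(sample_size, seed=None):
    """Mean of a random sample drawn only from the reachable units."""
    rng = random.Random(seed)
    sample = rng.sample(reachable, sample_size)
    return sum(sample) / sample_size

for n in (10, 50, 100):
    print(n, biased_sample_mean(n) - true_mean)  # the gap stays negative
```

Even when every reachable unit is sampled (n = 100), the estimate is still off, because the missing group was never eligible for selection.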
Describe some sources of selection bias
- -A group that is underrepresented in your sample.
- -A group that fails to respond to your survey (non-response bias).
- -A group that self-selects into the sample, as on American Idol.
Define measurement error
Inaccuracy or miscalculation of the observation, caused by unclear questions, leading questions, or questions containing a social desirability component.
Define validity
Does the instrument measure what it is intended to measure?
Define reliability
Does the instrument provide consistent results over repeated measurements?
Name and describe 4 dimensions to validity
- 1. Face - does the analyst have confidence in the measuring instrument?
- 2. Content - concerned with how representative the sample is of the population
- 3. Correlative - the results have a high correlation with other established measures of validity
- 4. Predictive - the results should be able to successfully predict outcomes, e.g., GRE scores predicting success in a graduate program
Define external validity
Results that can be readily generalized to the larger population are said to have external validity.
Define internal validity
Did I measure what I claimed to measure by eliminating all confounding variables?
List and describe 8 threats to internal validity
- 1. History - external events that produce an effect that can be confused with the outcome. Example: a school program's success vs. an economic boom happening at the same time.
- 2. Maturation - internal factors that can be confused with the outcome. Example: treated allergies that resolve over time, whether due to treatment or to the growth of the child.
- 3. Testing - the act of measuring a person can itself produce an effect that is confused with the outcome. Example: Stalin's arrival improves productivity.
- 4. Instrumentation - changes in the measurement tool.
- 5. Statistical regression to the mean - a group is selected because of its deviance from the mean; odds are that at the next measurement the group will have regressed toward the mean.
- 6. Selection bias
- 7. Experimental mortality - subjects dropping out of the study will change the composition of the sample.
- 8. Selection-maturation interaction - any bias in selection will interact with maturation to produce a greater effect than maturation alone.
Name 3 research design techniques
- 1. Pre-experimental - a policy is changed and later a decision is made to evaluate the policy.
- 2. Quasi-experimental - uses a comparison group. Ex: impact of affirmative action on female employment in shipyards.
- 3. Experimental - includes a randomization component. Participants are randomly selected and randomly assigned to an experimental or control group.
Describe an example of the paradox of internal and external validity
Clinical trials often show a drug to be effective in a controlled setting (high internal validity) but not in the real world, where patients don't follow directions (low external validity).
List 4 levels of measurement
- 1. Nominal - categorizing information. Hair color: 1 = blond, 2 = brunette, 3 = other
- 2. Ordinal - ranked in order along some type of continuum. 1 = strongly agree, 2 = agree, 3 = neutral, 4 = disagree, 5 = strongly disagree
- 3. Interval - numbers where the distance between adjacent values is the same and the scale is anchored by an arbitrary zero. IQ, temperature, test scores
- 4. Ratio - the distance between points is equal and the scale is anchored by a non-arbitrary zero. Hourly wage, height, weight, age, miles driven in a day
Give some examples of Nominal data
- -Marital status
Give some examples of Ordinal Data
- -Movie ratings
- -Socioeconomic status
- -Rating of meat in the store
- -Rank order of anything
Name some examples of Interval Data
- -Degrees F or C
- -Most personality measures
- -Intelligence scores
List some examples of Ratio Data
- -Annual income in dollars
- -Distance as measured in miles, inches, centimeters, etc.
What arithmetical operation is used for Nominal Data?
What arithmetical operations can be used for Ordinal Data?
Greater than or less than
What arithmetical operation is permitted with Interval Data?
Addition and subtraction of the scale values
What arithmetical operation is permitted with Ratio Data?
Multiplication and division of scale values
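A short Python sketch of why ratios are reserved for ratio data. The temperatures and distances are arbitrary example values. Differences on an interval scale survive a change of units (up to a constant factor), but ratios do not, because the zero point is arbitrary.

```python
def celsius_to_fahrenheit(c):
    return c * 9 / 5 + 32

cool_c, warm_c = 10, 20
cool_f, warm_f = celsius_to_fahrenheit(cool_c), celsius_to_fahrenheit(warm_c)

# Interval data: subtraction is meaningful (a 10-degree C gap is an 18-degree F gap).
print(warm_c - cool_c, warm_f - cool_f)
# Division is not: "twice as warm" in C is not twice as warm in F.
print(warm_c / cool_c, warm_f / cool_f)

# Ratio data: 20 miles is twice 10 miles in any unit, since zero means "no distance".
miles = (10, 20)
kilometers = tuple(m * 1.609344 for m in miles)
print(miles[1] / miles[0], kilometers[1] / kilometers[0])
```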
Name the measures of central tendency, beginning with the most commonly used.
- 1. Mean
- 2. Mode
- 3. Median
- 4. Trimmed Mean
Define the Mode
A measure of central tendency equal to the score that occurs most often in the distribution.
Define the Median
The score that divides the distribution in half. With the scores sorted, (n + 1) / 2 identifies the position of the median.
Define the Mean
The arithmetic average of scores. Sum the scores and divide by the number of scores.
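The mode, median, and mean can be computed with Python's standard `statistics` module. The five scores below are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
import statistics

scores = [70, 85, 85, 90, 100]  # hypothetical test scores, already sorted

mean = statistics.mean(scores)      # sum of scores / number of scores = 430 / 5
median = statistics.median(scores)  # middle score of the sorted list
mode = statistics.mode(scores)      # most frequently occurring score

# 1-based position of the median in the sorted list: (n + 1) / 2
median_position = (len(scores) + 1) / 2

print(mean, median, mode, median_position)
```

Here the median sits at position 3 of the sorted list, which is the score 85.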
Which measures of central tendency are not useful in statistical decision making but may be helpful in terms of describing a distribution?
Mode and Median