-
Central Limit Theorem
the sampling distribution of a statistic (x-bar, p-hat) is approximately normal whenever the sample is large and random.
-
Confidence Interval
an estimate of the value of a parameter in interval form with an associated level of confidence; it gives a list of plausible values for the parameter based on the value of the statistic
-
Confidence Level
The percentage of all possible samples for which the confidence intervals will contain the parameter being estimated; selected subjectively by the researcher (95%, 98%)
the percentage of the time that the confidence interval estimation procedure produces intervals that contain the value of the parameter
-
Control Chart
A chart plotting the means (x-bars) of regular samples of size n against time. It has a center line and upper and lower control limits used to determine whether a process is in or out of control.
-
Control Limits
Lines on either side of the center line computed using μ-3(σ/√n) and μ+3(σ/√n)
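As a quick illustration of the formula on this card, here is a minimal Python sketch. The function name `control_limits` and the process values (μ = 50, σ = 4, n = 16) are made up for the example.

```python
import math

def control_limits(mu, sigma, n):
    """Return (lower, upper) control limits for means of samples of size n."""
    half_width = 3 * sigma / math.sqrt(n)  # 3 standard deviations of x-bar
    return mu - half_width, mu + half_width

# Hypothetical process: center line 50, standard deviation 4, samples of 16.
lower, upper = control_limits(mu=50, sigma=4, n=16)
print(lower, upper)  # 47.0 53.0
```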
-
Convenience Sample
A sample type where the researcher contacts those subjects who are readily available and does not use any random selection. Results are almost always biased.
-
Deviation
the difference (or distance) between an observation and the mean of all the observations in a data set, or the difference between an observation and the corresponding regression model estimate.
-
Expected Count
an estimate of how many observations should be in a cell of a two-way table if Ho is true (no association between row and column variables)
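The card does not state the formula, but the standard calculation is (row total × column total) / grand total. A minimal sketch, using a made-up 2×2 table and a hypothetical helper name `expected_counts`:

```python
def expected_counts(observed):
    """Expected cell counts under Ho: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

# Made-up observed counts (rows and columns are two categorical variables).
observed = [[10, 20],
            [30, 40]]
expected = expected_counts(observed)
print(expected)  # [[12.0, 18.0], [28.0, 42.0]]
```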
-
Explained variation
the amount of total variation in the y's that is accounted for by a regression model; it is equal to ∑(yhat - ybar)²
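To make the sum concrete, here is a small sketch that adds up the squared distances between each predicted value and the mean of the y's. The data and predictions are invented for illustration; the predictions are assumed to come from a regression model fitted elsewhere.

```python
def explained_variation(y, y_hat):
    """Sum of squared differences between each prediction and the mean of y."""
    y_bar = sum(y) / len(y)
    return sum((pred - y_bar) ** 2 for pred in y_hat)

# Made-up observations and the predictions of an assumed regression model.
y = [2, 4, 5, 7]
y_hat = [2.1, 3.7, 5.3, 6.9]
print(round(explained_variation(y, y_hat), 4))  # 12.8
```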
-
extrapolation
predicting a y value for an x value that is outside the range of observed x's. dangerous and discouraged.
-
F-distribution
the distribution that models the ratio of two variance estimates; used in ANOVA to obtain the p-value for testing the equality of 3 or more means.
-
Five-Number summary
minimum, Q1, median, Q3, maximum; used when data are very skewed or outliers are present
-
interquartile range
difference between Q3 and Q1; the length of the box in a boxplot; contains the middle 50% of the data
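The five-number summary and the interquartile range can be sketched together. Quartile conventions differ between textbooks and software; this sketch assumes the "median of each half" convention (the overall median is left out of both halves when n is odd) and at least two data values. The data set is made up.

```python
import statistics

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using the median-of-halves convention."""
    values = sorted(data)
    half = len(values) // 2          # size of each half (median excluded if n is odd)
    q1 = statistics.median(values[:half])
    q3 = statistics.median(values[-half:])
    return values[0], q1, statistics.median(values), q3, values[-1]

data = [1, 3, 4, 6, 7, 9, 12, 15, 20]  # made-up data
minimum, q1, median, q3, maximum = five_number_summary(data)
print(minimum, q1, median, q3, maximum)  # 1 3.5 7 13.5 20
print("IQR:", q3 - q1)                   # IQR: 10.0
```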
-
law of large numbers
the mean of observed values in a sample (x-bar) will tend to get closer and closer to μ as the sample size increases
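A small simulation can illustrate the idea. This sketch assumes a fair six-sided die, so μ = 3.5; the helper name `sample_mean` and the sample sizes are arbitrary choices. Any single run can wobble, but the distance from μ tends to shrink as n grows.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable
MU = 3.5        # population mean of a fair six-sided die

def sample_mean(n):
    """Mean of n simulated die rolls."""
    return sum(random.randint(1, 6) for _ in range(n)) / n

for n in (10, 1_000, 100_000):
    x_bar = sample_mean(n)
    print(n, round(x_bar, 3), "distance from mu:", round(abs(x_bar - MU), 3))
```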
-
marginal distribution
the distribution of only one variable in a two-way table (the percentages for a single row or column)
-
Multiple analyses
performing two or more tests of significance on the same data - INFLATES the overall α.
-
Observed Count
the actual count in a sample given in a two-way table
-
observed effect
the difference between the observed value of the statistic and the hypothesized value of the corresponding parameter (xbar - μo)
-
What makes a process out of control?
one sample mean outside the control limits, or nine sample means in a row above or below the center line in a control chart.
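The two signals on this card can be checked mechanically. A minimal sketch, assuming control limits of μ ± 3(σ/√n) and treating a mean exactly on the center line as breaking a run; the function name `out_of_control` and the sample means are invented for the example.

```python
import math

def out_of_control(means, mu, sigma, n, run_length=9):
    """True if any mean is outside the control limits, or if run_length
    consecutive means fall on the same side of the center line."""
    half_width = 3 * sigma / math.sqrt(n)
    lower, upper = mu - half_width, mu + half_width
    if any(m < lower or m > upper for m in means):
        return True
    run, last_side = 0, 0
    for m in means:
        side = (m > mu) - (m < mu)   # +1 above, -1 below, 0 on the center line
        if side != 0 and side == last_side:
            run += 1
        else:
            run = 1 if side != 0 else 0
        last_side = side
        if run >= run_length:
            return True
    return False

# Hypothetical process: mu = 50, sigma = 4, n = 16, so the limits are 47 and 53.
print(out_of_control([50.5] * 9, 50, 4, 16))             # True (nine in a row above)
print(out_of_control([50.5] * 8 + [49.5], 50, 4, 16))    # False
print(out_of_control([50.2, 53.4, 49.9], 50, 4, 16))     # True (one mean above 53)
```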
-
parameter
a characteristic (mean, median, proportion) of the population
-
Power
1-β; probability of making a correct decision by rejecting a false null hypothesis;
increases when α increases, or when n increases
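The effect of α and n on power can be checked numerically. This is a sketch under stated assumptions: an upper-tailed one-sample z test (Ho: μ = μo vs Ha: μ > μo) with known σ, and made-up values μo = 100, true mean 103, σ = 10. The function name is hypothetical.

```python
from statistics import NormalDist

def power_upper_tailed_z(mu_0, mu_a, sigma, n, alpha):
    """Probability of rejecting Ho: mu = mu_0 when the true mean is mu_a."""
    std_error = sigma / n ** 0.5
    # Ho is rejected when x-bar exceeds this cutoff.
    critical_x_bar = mu_0 + NormalDist().inv_cdf(1 - alpha) * std_error
    return 1 - NormalDist(mu_a, std_error).cdf(critical_x_bar)

base = power_upper_tailed_z(100, 103, 10, n=25, alpha=0.05)
more_alpha = power_upper_tailed_z(100, 103, 10, n=25, alpha=0.10)
more_n = power_upper_tailed_z(100, 103, 10, n=100, alpha=0.05)
print(round(base, 3), round(more_alpha, 3), round(more_n, 3))
```

In this setup both `more_alpha` and `more_n` come out larger than `base`, matching the card.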
-
practical significance
when the difference between the observed statistic and claimed parameter value is large enough to be worth reporting (only assess if results are statistically significant)
-
Prediction Interval
an interval estimate of plausible values for a single observation of Y at a specified value of X
-
r-squared
the percentage of total variation in y that is explained by x
-
Residual
the difference between the actual y and the predicted y
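The r-squared and residual cards above can be tied together with one small example. This sketch assumes a least-squares line fitted with the usual slope and intercept formulas (not stated on the cards) and uses made-up data; all names are illustrative.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for y on x."""
    x_bar, y_bar = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

x = [1, 2, 3, 4]   # made-up data
y = [2, 4, 5, 7]
slope, intercept = fit_line(x, y)
y_hat = [intercept + slope * xi for xi in x]

# Residual = actual y minus predicted y.
residuals = [yi - pred for yi, pred in zip(y, y_hat)]

# r-squared = explained variation / total variation.
y_bar = sum(y) / len(y)
explained = sum((pred - y_bar) ** 2 for pred in y_hat)
total = sum((yi - y_bar) ** 2 for yi in y)
r_squared = explained / total

print([round(r, 2) for r in residuals])  # [-0.1, 0.3, -0.3, 0.1]
print(round(r_squared, 4))               # 0.9846
```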
-
Sampling Distribution of X-bar
a distribution of the sample mean; a list of all the possible values for x-bar together with the frequency of each value
-
Sampling Distribution of P-hat
a distribution of the sample proportion; a list of all the possible values for p-hat together with the frequency of each value
-
Significance Level
α; probability of making a Type I error (rejecting a true null hypothesis)
-
Standard Deviation of P-hat
variability of the sampling distribution of p-hat; √(p(1-p)/n)
-
Standard Deviation of X-bar
variability of the sampling distribution of x-bar; σ/√n
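The two standard deviation formulas above translate directly into code. A minimal sketch; the function names and the input values are made up for illustration.

```python
import math

def std_dev_p_hat(p, n):
    """Standard deviation of the sampling distribution of p-hat."""
    return math.sqrt(p * (1 - p) / n)

def std_dev_x_bar(sigma, n):
    """Standard deviation of the sampling distribution of x-bar."""
    return sigma / math.sqrt(n)

print(round(std_dev_p_hat(p=0.5, n=100), 4))  # 0.05
print(std_dev_x_bar(sigma=10, n=25))          # 2.0
```

Quadrupling n halves either standard deviation, since n sits under a square root.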
-
Stratified Sample
the population is divided into strata based on a characteristic and an SRS is taken from each stratum
-
t-test (when needed?)
test of significance, used when σ is unknown
-
Type I error
when a true null hypothesis is rejected (believing Ha is true, when Ho is true)
-
Type II error
when a false null hypothesis is not rejected (believing Ho is true, when Ha is true)
-
z-score
the number of standard deviations a value or observation is from the mean
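In symbols this is z = (value - mean) / standard deviation. A one-function sketch with made-up numbers:

```python
def z_score(value, mean, std_dev):
    """Number of standard deviations that value lies from the mean."""
    return (value - mean) / std_dev

# Hypothetical exam: mean 70, standard deviation 10.
print(z_score(85, 70, 10))  # 1.5
print(z_score(60, 70, 10))  # -1.0
```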
-
What is matched pairs?
data where 2 measurements are taken at different times (or under different conditions) on each individual in a sample (one sample, two treatments)
-
P-value
the probability of getting a value of the test statistic as extreme or more extreme than the value actually observed, assuming Ho is true.
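As one concrete case, the sketch below computes P-values for a z test statistic. It assumes the test statistic follows a standard normal distribution when Ho is true; other tests (t, chi-squared, F) would use their own distributions. The function names are hypothetical.

```python
from statistics import NormalDist

def p_value_upper_tailed(z):
    """P(Z >= z) under Ho, for Ha of the form 'parameter > hypothesized value'."""
    return 1 - NormalDist().cdf(z)

def p_value_two_sided(z):
    """P(|Z| >= |z|) under Ho, for Ha of the form 'parameter != hypothesized value'."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

print(round(p_value_upper_tailed(1.96), 3))  # 0.025
print(round(p_value_two_sided(1.96), 3))     # 0.05
```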
-
What is the probability that the null hypothesis is true?
1 or 0. it is or it isn't.
-
Margin of Error
the maximum amount by which a statistic will differ from the value of the parameter it estimates for the middle --% (90, 95, 98) of statistics.
-
Chi-squared: what is the minimum value each expected cell count must reach?
5
-
chi-squared: what are the degrees of freedom?
(r-1)(c-1) where r=number of rows, and c=number of columns
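The expected-count condition and the degrees-of-freedom formula from the two cards above are easy to check in code. A minimal sketch; the helper names and the table of expected counts are made up.

```python
def chi_squared_df(n_rows, n_cols):
    """Degrees of freedom for a two-way table: (r - 1)(c - 1)."""
    return (n_rows - 1) * (n_cols - 1)

def expected_counts_large_enough(expected, minimum=5):
    """True if every expected cell count is at least the minimum."""
    return all(cell >= minimum for row in expected for cell in row)

# Made-up expected counts for a table with 2 rows and 3 columns.
expected = [[12.0, 18.0, 6.5],
            [28.0, 42.0, 4.2]]
print(chi_squared_df(len(expected), len(expected[0])))  # 2
print(expected_counts_large_enough(expected))           # False (4.2 is below 5)
```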
-
Chi-squared: what is Ho?
there is no association between the row and column variables
-
Chi-squared: what is Ha?
There is an association between the row and column variables.