
The general purpose of statistics
is to describe data and to make inferences about populations using statistics computed from samples.

Population
is a large body of data that we are interested in describing.

Parameters (Greek letters)
are used to describe populations, e.g., mean, variance, standard deviation.

Sample
(a subset of the population)

Randomization
is the process of assigning individuals at random to conditions in an experiment.

Sampling error
the statistic is generally not equal to the parameter it is estimating. This difference constitutes sampling error.

Random Samples
samples taken so that every element has an equal probability of being selected. The attempt is to obtain a “representative” sample of the population.
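
As a rough illustration of random sampling and sampling error, the sketch below draws a simple random sample from a made-up population and compares the statistic (sample mean) with the parameter (population mean). The population values, the seed, and the sample size of 25 are assumptions chosen only for the example.

```python
import random
import statistics

# Hypothetical population of 1,000 scores (made up for illustration).
random.seed(1)  # fixed seed so the run is repeatable
population = [random.gauss(100, 15) for _ in range(1000)]

# Parameter: the population mean (mu).
mu = statistics.fmean(population)

# Simple random sample of n = 25: every element is equally likely to be chosen.
sample = random.sample(population, 25)

# Statistic: the sample mean. Sampling error: statistic minus parameter.
sample_mean = statistics.fmean(sample)
sampling_error = sample_mean - mu

print(f"mu = {mu:.2f}, sample mean = {sample_mean:.2f}, error = {sampling_error:+.2f}")
```

Rerunning with a different seed gives a different sample and a different sampling error, which is the point of the definition.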

Frequency distribution
organizing data in terms of frequencies
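
A minimal sketch of a frequency distribution, assuming a small set of hypothetical quiz scores: each distinct score is listed with the number of times it occurs.

```python
from collections import Counter

# Hypothetical quiz scores (made up for illustration).
scores = [3, 5, 4, 5, 2, 5, 4, 3, 4, 5]

# Count how often each score value occurs.
freq = Counter(scores)

# Print the frequency distribution in score order.
for value in sorted(freq):
    print(f"score {value}: frequency {freq[value]}")
```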


Median
the 50th percentile, computed using real limits
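
One common textbook way to find the median with real limits is to interpolate inside the score interval that contains the 50th percentile. The sketch below assumes whole-number scores (interval width 1); the function name `median_real_limits` and the data are hypothetical.

```python
def median_real_limits(scores, width=1.0):
    """Median by interpolation within real limits (assumes equal-width score intervals)."""
    data = sorted(scores)
    n = len(data)
    middle = data[n // 2]                       # score whose interval holds the median
    lower_limit = middle - width / 2            # lower real limit of that interval
    below = sum(1 for x in data if x < middle)  # cumulative frequency below the interval
    in_interval = data.count(middle)            # frequency within the interval
    return lower_limit + width * (n / 2 - below) / in_interval


scores = [1, 2, 2, 3, 4, 4, 4, 4, 4, 5]  # hypothetical scores
print(round(median_real_limits(scores), 2))
```

Here the interval for the score 4 runs from 3.5 to 4.5; four scores fall below it and five fall inside it, so the median sits one-fifth of the way into the interval.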


Measures of variability
variance, standard deviation, range (real limits), and interquartile range (real limits, the middle 50 percent). Note that in a symmetric unimodal distribution that is approximately normal, about 68% of scores fall within one standard deviation of the mean.
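
The measures above can be sketched in a few lines. The data are hypothetical whole-number scores, the population formulas (dividing by N) are used, and the quartiles come from Python's default method, which may differ slightly from a hand calculation with real limits.

```python
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores

mean = statistics.fmean(scores)
variance = statistics.pvariance(scores)  # population variance (divide by N)
sd = statistics.pstdev(scores)           # population standard deviation

# Range with real limits, assuming whole-number scores (unit of measurement = 1).
range_real_limits = (max(scores) + 0.5) - (min(scores) - 0.5)

# Interquartile range: spread of the middle 50 percent of scores.
q1, q2, q3 = statistics.quantiles(scores, n=4)
iqr = q3 - q1

# Proportion of scores within one standard deviation of the mean.
within_one_sd = sum(abs(x - mean) <= sd for x in scores) / len(scores)

print(f"mean={mean:.2f} variance={variance:.2f} sd={sd:.2f}")
print(f"range={range_real_limits:.1f} IQR={iqr:.2f} within 1 SD={within_one_sd:.2f}")
```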

Linear transformations
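transform each score as Y = aX + b to obtain a particular mean and variance. Adding a constant b shifts the mean but leaves the variance unchanged; multiplying by a constant a multiplies the mean by a, the standard deviation by |a|, and the variance by a².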
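
A short sketch of a linear transformation Y = aX + b, assuming hypothetical scores and a target scale with mean 50 and standard deviation 10 (both targets are chosen only for illustration).

```python
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores

old_mean = statistics.fmean(scores)
old_sd = statistics.pstdev(scores)

# Choose a and b so the transformed scores have the target mean and SD.
target_mean, target_sd = 50.0, 10.0
a = target_sd / old_sd           # rescales the spread
b = target_mean - a * old_mean   # shifts the center

transformed = [a * x + b for x in scores]

new_mean = statistics.fmean(transformed)
new_sd = statistics.pstdev(transformed)

print(f"old mean={old_mean:.2f}, old sd={old_sd:.2f}")
print(f"new mean={new_mean:.2f}, new sd={new_sd:.2f}")
```

The shape of the distribution is unchanged; only its location and scale move.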

Distributional shapes
symmetric, positively skewed, and negatively skewed

Unimodal symmetric distribution
(mean=median=mode)

Normal distribution
 a unimodal symmetric probability distribution frequently used in statistics.
 Know how to compute the z score and how to use the normal table.
 Z = (score − mean) / standard deviation
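
As a check on hand calculations, the z score and the corresponding normal-table area can be computed directly. The score of 115 on a scale with mean 100 and standard deviation 15 is a made-up example, and `z_score` is a hypothetical helper name.

```python
from statistics import NormalDist


def z_score(score, mean, sd):
    """Number of standard deviations a score lies above (+) or below (-) the mean."""
    return (score - mean) / sd


z = z_score(115, 100, 15)

# Area under the standard normal curve below z (what a normal table reports).
p_below = NormalDist().cdf(z)

# Area between -z and +z.
p_between = NormalDist().cdf(z) - NormalDist().cdf(-z)

print(f"z = {z:.2f}, area below = {p_below:.4f}, area within ±z = {p_between:.4f}")
```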

Binomial distribution
 used when we have a dichotomous event (on-off, sick-not sick, etc.). For the normal approximation to the binomial (using real limits), compute the z score,
 Z = (X − np) / sqrt(npq), where q = 1 − p
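
The sketch below compares the exact binomial probability with the normal approximation using real limits. The values n = 20, p = 0.5, and X = 12 are assumptions picked for illustration; the upper real limit of 12 is 12.5.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 20, 0.5   # hypothetical number of trials and probability of "success"
q = 1 - p
x = 12

# Exact binomial probability P(X <= 12).
exact = sum(comb(n, k) * p**k * q**(n - k) for k in range(x + 1))

# Normal approximation: use the upper real limit of 12, which is 12.5.
z = (x + 0.5 - n * p) / sqrt(n * p * q)
approx = NormalDist().cdf(z)

print(f"z = {z:.3f}, exact = {exact:.4f}, normal approximation = {approx:.4f}")
```

Without the real-limit adjustment (using 12 instead of 12.5) the approximation is noticeably worse for small n.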

Sampling distributions
 the (frequency) distribution of a statistic over all possible samples of size n taken from the population. (This is a hypothetical distribution.)
 The sampling distribution of the mean is the distribution of the sample mean over all such samples.

The mean of the sampling distribution of the mean
is called the expected value and it is equal to the population mean, μ. Because of this property, we call the sample mean an unbiased estimator: its expected value is equal to the population parameter that we are trying to estimate.

The standard deviation of the sampling distribution of the mean
is called the standard error. The standard error is equal to the population standard deviation divided by the square root of n, σ/√n.
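
For a very small population the sampling distribution of the mean can be listed in full rather than imagined. The sketch assumes a hypothetical four-score population and samples of size n = 2 drawn with replacement, so that the σ/√n formula applies exactly.

```python
from itertools import product
from math import sqrt
import statistics

population = [2, 4, 6, 8]  # hypothetical tiny population
n = 2                      # sample size

mu = statistics.fmean(population)
sigma = statistics.pstdev(population)

# Every possible sample of size n (drawn with replacement) and its mean.
sample_means = [statistics.fmean(s) for s in product(population, repeat=n)]

# Mean and standard deviation of the sampling distribution of the mean.
expected_value = statistics.fmean(sample_means)
standard_error = statistics.pstdev(sample_means)

print(f"mu = {mu:.3f}, mean of sample means = {expected_value:.3f}")
print(f"sigma/sqrt(n) = {sigma / sqrt(n):.3f}, SD of sample means = {standard_error:.3f}")
```

The two printed pairs agree: the sample mean is unbiased, and its standard deviation across samples is the standard error.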

Central Limit Theorem
the sampling distribution of the mean approaches the normal distribution with mean μ and standard deviation σ/√n (the standard error) as n increases. It is exactly normal when the population itself is normal, and approximately normal for any population when n is large. In practical applications we use the normal when n ≥ 30.
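
A small simulation can make the theorem concrete. The sketch assumes a strongly positively skewed population (exponential with mean 1 and standard deviation 1), n = 30, and 5,000 simulated samples; these choices and the seed are illustrative only.

```python
import random
import statistics
from math import sqrt

random.seed(0)  # fixed seed so the run is repeatable

mu, sigma = 1.0, 1.0  # mean and SD of an exponential population with rate 1
n = 30                # sample size


def sample_mean(size):
    """Mean of one random sample drawn from the skewed population."""
    return statistics.fmean([random.expovariate(1.0) for _ in range(size)])


means = [sample_mean(n) for _ in range(5000)]

center = statistics.fmean(means)  # should be near mu
spread = statistics.stdev(means)  # should be near sigma / sqrt(n)

# If the sample means are roughly normal, about 68% lie within one standard error of mu.
se = sigma / sqrt(n)
within_one_se = sum(abs(m - mu) <= se for m in means) / len(means)

print(f"mean of sample means = {center:.3f} (mu = {mu})")
print(f"SD of sample means   = {spread:.3f} (sigma/sqrt(n) = {se:.3f})")
print(f"proportion within one standard error = {within_one_se:.3f}")
```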

Note that as n (sample size)
 increases, the standard error of the mean decreases, so the sampling distribution collapses around μ.
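
To see how quickly the standard error shrinks, the sketch below tabulates σ/√n for several sample sizes, assuming a hypothetical population standard deviation of 15.

```python
from math import sqrt

sigma = 15.0  # hypothetical population standard deviation
sample_sizes = [1, 4, 25, 100, 400]

# Standard error of the mean for each sample size.
standard_errors = [sigma / sqrt(n) for n in sample_sizes]

for n, se in zip(sample_sizes, standard_errors):
    print(f"n = {n:>3}: standard error = {se:.2f}")
```

Because n sits under a square root, quadrupling the sample size only halves the standard error.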

