
z-score
A value computed by dividing the deviation about the mean (xᵢ − x̄) by the standard deviation s. A z-score is referred to as a standardized value and denotes the number of standard deviations xᵢ is from the mean.
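
As a minimal sketch of this calculation, the following Python snippet standardizes a small sample. The data values are made up for illustration, the helper name `z_scores` is hypothetical, and the sample standard deviation (divisor n − 1) is assumed.

```python
from statistics import mean, stdev

def z_scores(data):
    """Return the z-score of each value: (x - mean) / sample standard deviation."""
    x_bar = mean(data)
    s = stdev(data)  # sample standard deviation (divisor n - 1)
    return [(x - x_bar) / s for x in data]

# Illustrative data, not taken from the text
sample = [46, 54, 42, 46, 32]
z = z_scores(sample)
print(z)
```

In this sample the mean is 44 and s is 8, so the value 54 has a z-score of 1.25: it lies 1.25 standard deviations above the mean.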

Box plot
A graphical summary of data based on a five-number summary.

Chebyshev’s theorem
A theorem that can be used to make statements about the proportion of data values that must be within a specified number of standard deviations of the mean.
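
The theorem guarantees that at least 1 − 1/z² of the data values lie within z standard deviations of the mean, for any z greater than 1. A short sketch of that bound follows; the function name `chebyshev_lower_bound` is hypothetical.

```python
def chebyshev_lower_bound(z):
    """Minimum proportion of values within z standard deviations of the mean."""
    if z <= 1:
        raise ValueError("Chebyshev's theorem requires z > 1")
    return 1 - 1 / z**2

# Guaranteed minimum proportions for a few choices of z
for z in (2, 3, 4):
    print(z, chebyshev_lower_bound(z))
```

For z = 2 the bound is 0.75, so at least 75% of the values in any data set lie within two standard deviations of the mean, whatever the shape of the distribution.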

Coefficient of variation
A measure of relative variability computed by dividing the standard deviation by the mean and multiplying by 100.
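
A minimal sketch of the computation, assuming sample data and the sample standard deviation; the data values and the function name `coefficient_of_variation` are illustrative only.

```python
from statistics import mean, stdev

def coefficient_of_variation(data):
    """Return the sample standard deviation as a percentage of the mean."""
    return stdev(data) / mean(data) * 100

# Illustrative data, not taken from the text
sample = [46, 54, 42, 46, 32]
cv = coefficient_of_variation(sample)
print(f"{cv:.1f}%")
```

Because the result is a percentage of the mean, it can be used to compare the variability of variables measured in different units.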

Correlation coefficient
A measure of linear association between two variables that takes on values between –1 and +1. Values near +1 indicate a strong positive linear relationship; values near –1 indicate a strong negative linear relationship; and values near zero indicate the lack of a linear relationship.

Covariance
A measure of linear association between two variables. Positive values indicate a positive relationship; negative values indicate a negative relationship.
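
The sketch below computes the sample covariance and, from it, the correlation coefficient as the covariance divided by the product of the two sample standard deviations. The function names and the paired data are illustrative assumptions, and the sample versions (divisor n − 1) are assumed throughout.

```python
from math import sqrt
from statistics import mean

def sample_covariance(x, y):
    """Sum of cross-products of deviations, divided by n - 1."""
    x_bar, y_bar = mean(x), mean(y)
    return sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / (len(x) - 1)

def correlation(x, y):
    """Sample covariance divided by the product of the sample standard deviations."""
    s_x = sqrt(sample_covariance(x, x))  # covariance of x with itself is its variance
    s_y = sqrt(sample_covariance(y, y))
    return sample_covariance(x, y) / (s_x * s_y)

# Illustrative paired observations
x = [2, 5, 1, 3, 4, 1, 5, 3, 4, 2]
y = [50, 57, 41, 54, 54, 38, 63, 48, 59, 46]
s_xy = sample_covariance(x, y)
r = correlation(x, y)
print(s_xy, round(r, 2))
```

The covariance depends on the units of x and y, whereas the correlation coefficient is unit-free and always falls between −1 and +1.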

Empirical rule
A rule that can be used to estimate the percentage of data values that lie within one, two, and three standard deviations of the mean for data that exhibit a bell-shaped distribution (approximately 68%, 95%, and 99.7%, respectively).
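
The percentages usually quoted for this rule (about 68%, 95%, and 99.7%) can be checked against a normal distribution. The sketch below assumes that "bell-shaped" means exactly normal, which real data only approximate; the helper name `share_within` is hypothetical.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a normal distribution within k standard deviations of its mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(share_within(k) * 100, 1))
```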

Five-number summary
An exploratory data analysis technique that uses five numbers to summarize the data: smallest value, first quartile, median, third quartile, and largest value.
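
A possible sketch using Python's `statistics.quantiles`. Quartile conventions differ between textbooks and software, so the quartiles here reflect that function's default `"exclusive"` method and may not match values computed by another rule. The data and the function name `five_number_summary` are illustrative.

```python
from statistics import quantiles

def five_number_summary(data):
    """Return (smallest, Q1, median, Q3, largest)."""
    ordered = sorted(data)
    q1, q2, q3 = quantiles(ordered, n=4)  # default method="exclusive"
    return ordered[0], q1, q2, q3, ordered[-1]

# Illustrative data
salaries = [3310, 3355, 3450, 3480, 3480, 3490,
            3520, 3540, 3550, 3650, 3730, 3925]
summary = five_number_summary(salaries)
print(summary)
```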

Grouped data
Data available in class intervals as summarized by a frequency distribution. Individual values of the original data are not available.
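
Because the individual values are unavailable, summary measures for grouped data are commonly approximated by treating every observation in a class as if it sat at the class midpoint. The sketch below applies that assumption to a made-up frequency distribution; the function name and the class limits are hypothetical, and the sample variance (divisor n − 1) is assumed.

```python
def grouped_mean_and_variance(classes):
    """Approximate the mean and sample variance from (lower, upper, frequency) classes."""
    midpoints = [(lower + upper) / 2 for lower, upper, _ in classes]
    freqs = [f for _, _, f in classes]
    n = sum(freqs)
    g_mean = sum(f * m for f, m in zip(freqs, midpoints)) / n
    g_var = sum(f * (m - g_mean) ** 2 for f, m in zip(freqs, midpoints)) / (n - 1)
    return g_mean, g_var

# Illustrative frequency distribution: (lower limit, upper limit, frequency)
classes = [(10, 14, 4), (15, 19, 8), (20, 24, 5), (25, 29, 2), (30, 34, 1)]
g_mean, g_var = grouped_mean_and_variance(classes)
print(g_mean, g_var)
```

The results are approximations: they generally differ from the values that would be obtained from the original, ungrouped observations.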

Interquartile range (IQR)
A measure of variability, defined to be the difference between the third and first quartiles.

Mean
A measure of central location computed by summing the data values and dividing by the number of observations.

Median
A measure of central location provided by the value in the middle when the data are arranged in ascending order.

Mode
A measure of location, defined as the value that occurs with greatest frequency.
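
The three measures of location defined above (mean, median, and mode) can be compared on one small sample with Python's standard `statistics` module. The data are illustrative only.

```python
from statistics import mean, median, multimode

# Illustrative data
sample = [46, 54, 42, 46, 32]

sample_mean = mean(sample)        # sum of the values divided by n
sample_median = median(sample)    # middle value of the sorted data
sample_modes = multimode(sample)  # every value tied for the greatest frequency

print(sample_mean, sample_median, sample_modes)
```

`multimode` returns a list because a data set can have more than one mode.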

Outlier
An unusually small or unusually large data value.
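
The definition does not say how unusual a value must be. One common convention, assumed here rather than taken from the definition, flags values more than 1.5 times the interquartile range below the first quartile or above the third quartile. The function name `iqr_outliers`, the multiplier default, and the data are illustrative; the quartiles follow the default method of `statistics.quantiles`.

```python
from statistics import quantiles

def iqr_outliers(data, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = quantiles(data, n=4)  # quantiles() sorts the data itself
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [x for x in data if x < lower or x > upper]

# Illustrative data
salaries = [3310, 3355, 3450, 3480, 3480, 3490,
            3520, 3540, 3550, 3650, 3730, 3925]
outliers = iqr_outliers(salaries)
print(outliers)
```

A flagged value is a candidate for review, not automatically an error; it may be a correctly recorded but unusual observation.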

Percentile
A value such that at least p percent of the observations are less than or equal to this value and at least (100 − p) percent of the observations are greater than or equal to this value. The 50th percentile is the median.
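
One rule that satisfies this definition is sketched below: compute the index i = (p/100)n; if i is not an integer, round up and take the value in that position; if i is an integer, average the values in positions i and i + 1. Statistical software often uses interpolation rules instead, so results can differ slightly. The function name `percentile`, the restriction to integer p, and the data are assumptions for illustration.

```python
def percentile(data, p):
    """pth percentile for an integer p with 0 < p < 100 (round-up-or-average rule)."""
    if not 0 < p < 100:
        raise ValueError("p must be between 0 and 100 (exclusive)")
    ordered = sorted(data)
    q, r = divmod(p * len(ordered), 100)  # i = q + r/100, kept in exact integers
    if r:                                 # i is not an integer: round up
        return ordered[q]                 # 0-based index q is position ceil(i)
    return (ordered[q - 1] + ordered[q]) / 2  # average positions i and i + 1

# Illustrative data
scores = [62, 70, 71, 75, 78, 80, 84, 85, 90, 95]
print(percentile(scores, 85), percentile(scores, 50))
```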

Point estimator
The sample statistic, such as x̄, s², and s, when used to estimate the corresponding population parameter.

Population parameter
A numerical value used as a summary measure for a population (e.g., the population mean, μ, the population variance, σ², and the population standard deviation, σ).

Quartiles
The 25th, 50th, and 75th percentiles, referred to as the first quartile, the second quartile (median), and third quartile, respectively. The quartiles can be used to divide a data set into four parts, with each part containing approximately 25% of the data.

Range
A measure of variability, defined to be the largest value minus the smallest value.

Sample statistic
A numerical value used as a summary measure for a sample (e.g., the sample mean, x̄, the sample variance, s², and the sample standard deviation, s).

Skewness
A measure of the shape of a data distribution. Data skewed to the left result in negative skewness; a symmetric data distribution results in zero skewness; and data skewed to the right result in positive skewness.
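
Several formulas for skewness are in use. The sketch below assumes one common sample version, n / ((n − 1)(n − 2)) times the sum of the cubed z-scores; other software may apply a different adjustment. The function name `sample_skewness` and the data are illustrative.

```python
from statistics import mean, stdev

def sample_skewness(data):
    """Adjusted sample skewness: n / ((n-1)(n-2)) * sum of cubed z-scores."""
    n = len(data)
    if n < 3:
        raise ValueError("at least three observations are required")
    x_bar, s = mean(data), stdev(data)
    return n / ((n - 1) * (n - 2)) * sum(((x - x_bar) / s) ** 3 for x in data)

# Illustrative data sets
left_skewed = [46, 54, 42, 46, 32]
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 2, 3, 4, 20]
for data in (left_skewed, symmetric, right_skewed):
    print(round(sample_skewness(data), 3))
```

The sign of the result matches the direction of the longer tail: negative for the left-skewed sample, essentially zero for the symmetric one, and positive for the right-skewed one.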

Standard deviation
A measure of variability computed by taking the positive square root of the variance.

Variance
A measure of variability based on the squared deviations of the data values about the mean.
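
A minimal sketch of the variance and its positive square root, the standard deviation, assuming sample data (divisor n − 1; a population variance would divide by N instead). The function name `sample_variance` and the data are illustrative.

```python
from math import sqrt
from statistics import mean

def sample_variance(data):
    """Sum of squared deviations about the mean, divided by n - 1."""
    x_bar = mean(data)
    return sum((x - x_bar) ** 2 for x in data) / (len(data) - 1)

# Illustrative data
sample = [46, 54, 42, 46, 32]
s2 = sample_variance(sample)
s = sqrt(s2)  # standard deviation, in the same units as the data
print(s2, s)
```

Python's standard library provides `statistics.variance` and `statistics.stdev`, which should give the same sample values.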

Weighted mean
The mean obtained by assigning each observation a weight that reflects its importance.
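
A short sketch of the calculation: the sum of each weight times its value, divided by the sum of the weights. The function name `weighted_mean`, the scores, and the credit weights are made up for illustration.

```python
def weighted_mean(values, weights):
    """Return sum(w * x) / sum(w)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for w, x in zip(weights, values)) / total_weight

# Illustrative data: course scores weighted by credit hours
scores = [90, 80, 70]
credit_hours = [3, 2, 1]
result = weighted_mean(scores, credit_hours)
print(round(result, 2))
```

When all weights are equal, the weighted mean reduces to the ordinary mean.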

