-
z-score
A value computed by dividing the deviation about the mean (xi − x̄) by the standard deviation s. A z-score is referred to as a standardized value and denotes the number of standard deviations xi is from the mean.
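As a minimal sketch of this definition, the snippet below standardizes a small sample. The data values are hypothetical and chosen only for illustration; the sample standard deviation (n − 1 divisor) is assumed.

```python
import statistics

# Hypothetical sample; any list of numbers would do.
data = [46, 54, 42, 46, 32]

mean = statistics.mean(data)   # sample mean
s = statistics.stdev(data)     # sample standard deviation (n - 1 divisor)

# z-score: how many standard deviations each value lies from the mean
z_scores = [(x - mean) / s for x in data]
print(z_scores)
```

For this sample the mean is 44 and s is 8, so the value 54 lies 1.25 standard deviations above the mean.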
-
Box plot
A graphical summary of data based on a five-number summary.
-
Chebyshev’s theorem
A theorem that can be used to make statements about the proportion of data values that must be within a specified number of standard deviations of the mean.
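The bound the theorem provides is that at least 1 − 1/z² of the data values lie within z standard deviations of the mean, for any z greater than 1. A short sketch follows; the function name is hypothetical.

```python
def chebyshev_lower_bound(z):
    """Minimum proportion of data within z standard deviations of the mean."""
    if z <= 1:
        raise ValueError("Chebyshev's theorem requires z > 1")
    return 1 - 1 / z**2

# Minimum proportions for z = 2, 3, and 4
for z in (2, 3, 4):
    print(z, chebyshev_lower_bound(z))
```

The bound holds for any data set regardless of the shape of its distribution, which is why it is weaker than the empirical rule.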
-
Coefficient of variation
A measure of relative variability computed by dividing the standard deviation by the mean and multiplying by 100.
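The computation can be sketched as follows, assuming the sample standard deviation is used and the mean is nonzero. The data values are hypothetical.

```python
import statistics

data = [46, 54, 42, 46, 32]  # hypothetical sample

mean = statistics.mean(data)
s = statistics.stdev(data)   # sample standard deviation

# Standard deviation expressed as a percentage of the mean
coefficient_of_variation = s / mean * 100
print(round(coefficient_of_variation, 1))
```

Here the standard deviation is roughly 18% of the mean.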
-
Correlation coefficient
A measure of linear association between two variables that takes on values between –1 and +1. Values near +1 indicate a strong positive linear relationship; values near –1 indicate a strong negative linear relationship; and values near zero indicate the lack of a linear relationship.
-
Covariance
A measure of linear association between two variables. Positive values indicate a positive relationship; negative values indicate a negative relationship.
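A sketch of the usual sample formulas: the sample covariance divides the sum of cross-products of deviations by n − 1, and dividing that covariance by the product of the two sample standard deviations gives the correlation coefficient. The paired data below are hypothetical.

```python
import math

# Hypothetical paired observations
x = [2, 5, 1, 3, 4, 1, 5, 3, 4, 2]
y = [50, 57, 41, 54, 54, 38, 63, 48, 59, 46]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Sample covariance: sum of cross-products of deviations over n - 1
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)

# Sample standard deviations of x and y
s_x = math.sqrt(sum((xi - x_bar) ** 2 for xi in x) / (n - 1))
s_y = math.sqrt(sum((yi - y_bar) ** 2 for yi in y) / (n - 1))

# Correlation coefficient: covariance rescaled to lie between -1 and +1
r_xy = s_xy / (s_x * s_y)
print(s_xy, round(r_xy, 2))
```

The covariance depends on the units of x and y, whereas the correlation coefficient does not, which makes the latter easier to interpret.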
-
Empirical rule
A rule that can be used to estimate the percentage of data values that lie within one, two, and three standard deviations of the mean (approximately 68%, 95%, and 99.7%, respectively) for data that exhibit a bell-shaped distribution.
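The percentages behind the rule can be reproduced with a short sketch, under the assumption that "bell-shaped" is modeled by a normal distribution. It uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later).

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

# Proportion of a normal distribution within k standard deviations of the mean
within = {k: std_normal.cdf(k) - std_normal.cdf(-k) for k in (1, 2, 3)}

for k, proportion in within.items():
    print(k, round(proportion * 100, 1))
```

Real data are only approximately bell-shaped, so observed percentages will typically differ somewhat from these values.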
-
Five-number summary
An exploratory data analysis technique that uses five numbers to summarize the data: smallest value, first quartile, median, third quartile, and largest value.
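A minimal sketch of the summary, using hypothetical data. Quartile conventions differ between textbooks and software; the choice of `method="exclusive"` in `statistics.quantiles` is an assumption here, not the only valid convention.

```python
import statistics

# Hypothetical sample of 12 observations
data = [3710, 3755, 3850, 3880, 3880, 3890,
        3920, 3940, 3950, 4050, 4130, 4325]

# Quartiles: the three cut points that divide the data into four parts
q1, median, q3 = statistics.quantiles(data, n=4, method="exclusive")

five_number_summary = (min(data), q1, median, q3, max(data))
print(five_number_summary)
```

These five values are also what a box plot draws: the box spans the first to third quartile, with a line at the median.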
-
Grouped data
Data available in class intervals as summarized by a frequency distribution. Individual values of the original data are not available.
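Because the individual values are unavailable, summary measures for grouped data are usually approximated by treating every observation in a class as if it sat at the class midpoint. The sketch below assumes that approach; the frequency distribution is hypothetical.

```python
# Hypothetical frequency distribution: ((lower limit, upper limit), frequency)
classes = [((10, 14), 4), ((15, 19), 8), ((20, 24), 5),
           ((25, 29), 2), ((30, 34), 1)]

midpoints = [(lower + upper) / 2 for (lower, upper), _ in classes]
freqs = [f for _, f in classes]
n = sum(freqs)

# Approximate mean: frequency-weighted average of the class midpoints
grouped_mean = sum(f * m for f, m in zip(freqs, midpoints)) / n

# Approximate sample variance: weighted squared deviations over n - 1
grouped_variance = sum(
    f * (m - grouped_mean) ** 2 for f, m in zip(freqs, midpoints)
) / (n - 1)

print(grouped_mean, grouped_variance)
```

The results are approximations; they match the ungrouped mean and variance only if the values in each class happen to average to the midpoint.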
-
Interquartile range (IQR)
A measure of variability, defined to be the difference between the third and first quartiles.
-
Mean
A measure of central location computed by summing the data values and dividing by the number of observations.
-
Median
A measure of central location provided by the value in the middle when the data are arranged in ascending order. With an even number of observations, the median is the average of the two middle values.
-
Mode
A measure of location, defined as the value that occurs with greatest frequency.
-
Outlier
An unusually small or unusually large data value.
-
Percentile
A value such that at least p percent of the observations are less than or equal to this value and at least (100 − p) percent of the observations are greater than or equal to this value. The 50th percentile is the median.
-
Point estimator
The sample statistic, such as x̄, s², and s, when used to estimate the corresponding population parameter.
-
Population parameter
A numerical value used as a summary measure for a population (e.g., the population mean, μ, the population variance, σ², and the population standard deviation, σ).
-
Quartiles
The 25th, 50th, and 75th percentiles, referred to as the first quartile, the second quartile (median), and third quartile, respectively. The quartiles can be used to divide a data set into four parts, with each part containing approximately 25% of the data.
-
Range
A measure of variability, defined to be the largest value minus the smallest value.
-
Sample statistic
A numerical value used as a summary measure for a sample (e.g., the sample mean, x̄, the sample variance, s², and the sample standard deviation, s).
-
Skewness
A measure of the shape of a data distribution. Data skewed to the left result in negative skewness; a symmetric data distribution results in zero skewness; and data skewed to the right result in positive skewness.
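Several formulas for sample skewness are in use. The sketch below assumes one common version, n / ((n − 1)(n − 2)) times the sum of cubed z-scores, and applies it to hypothetical data.

```python
import statistics

data = [46, 54, 42, 46, 32]  # hypothetical sample

n = len(data)
mean = statistics.mean(data)
s = statistics.stdev(data)  # sample standard deviation

# Adjusted sample skewness: scaled sum of cubed standardized values
skewness = n / ((n - 1) * (n - 2)) * sum(((x - mean) / s) ** 3 for x in data)
print(skewness)
```

The result is negative for this sample because the low value 32 pulls the left tail out further than any value stretches the right tail.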
-
Standard deviation
A measure of variability computed by taking the positive square root of the variance.
-
Variance
A measure of variability based on the squared deviations of the data values about the mean.
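A short sketch contrasting the sample variance (n − 1 divisor) with the population variance (n divisor), using the Python standard library on hypothetical data.

```python
import statistics

data = [46, 54, 42, 46, 32]  # hypothetical observations

# Sample variance: sum of squared deviations about the mean, over n - 1
sample_variance = statistics.variance(data)

# Population variance: same sum of squares, over n
population_variance = statistics.pvariance(data)

# Standard deviation: positive square root of the variance
sample_std_dev = statistics.stdev(data)

print(sample_variance, population_variance, sample_std_dev)
```

The squared deviations here sum to 256, giving 256 / 4 = 64 for the sample variance and 256 / 5 = 51.2 for the population variance.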
-
Weighted mean
The mean obtained by assigning each observation a weight that reflects its importance.
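As a sketch, the weighted mean is the sum of each value times its weight, divided by the sum of the weights. The values and weights below are hypothetical (for example, unit costs weighted by quantities purchased).

```python
# Hypothetical observations and their weights
values = [3.00, 3.40, 2.80, 2.90, 3.25]
weights = [1200, 500, 2750, 1000, 800]

# Weighted mean: sum of (weight * value) divided by the sum of the weights
weighted_mean = sum(w * x for w, x in zip(weights, values)) / sum(weights)
print(round(weighted_mean, 2))
```

With equal weights this reduces to the ordinary mean; here the heavily weighted value 2.80 pulls the result down to about 2.96, below the unweighted mean of 3.07.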