What type of statistics summarize or describe relevant characteristics of data?
What type of statistics makes inferences, or generalizations about a population?
What are the (4) Measures of center?
How do you find the mean?
Sum all values then divide by the number of values. (The average).
What is sensitive to extreme values and tends to vary less than other measures of center?
How do you find the median?
- 1. sort the data
- 2. If the number of values is odd, the median is the value in the exact center.
- 3. If the number of values is even, add the two middle values and divide by two.
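The median steps above can be sketched in Python as follows. This is a minimal illustration; the function name `median` is chosen here for convenience and is not part of the original notes.

```python
def median(values):
    """Median via the sort / odd-center / even-average steps."""
    ordered = sorted(values)            # step 1: sort the data
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:                      # step 2: odd count, take the center value
        return ordered[mid]
    # step 3: even count, average the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([5, 1, 3]))       # 3
print(median([4, 1, 3, 2]))    # 2.5
```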
How do you find the mode?
The mode is the value that occurs most frequently.
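A short Python sketch of finding the mode. It returns a list because a data set can have more than one mode. Treating a data set in which no value repeats as having no mode is a common textbook convention and is an assumption here; the function name `modes` is hypothetical.

```python
from collections import Counter


def modes(values):
    """Return every value that occurs with the highest frequency."""
    if not values:
        return []
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []                       # assumption: no repeated value means no mode
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([1, 2, 2, 3, 3]))   # [2, 3]
print(modes([1, 2, 3]))         # []
```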
What is often a good choice if there are some extreme values when looking at the measures of center?
What is a good measure of center for data at the nominal level of measurement?
How do you find the midrange? (rarely used)
What are the modes a data set can have?
What is the rounding-off rule for the mean, median, and midrange?
Carry one more decimal place than is present in the original set of values.
What are the advantages of the mean?
- relatively reliable
- takes every data value into account
The _____ is sensitive to every value, so just one extreme value can affect it dramatically. Therefore, we say the mean is not a ______ _______ __ _______.
- resistant measure of center
What is it called when data values are assigned different weights?
What is the formula for the mean of a frequency distribution?
- Multiply each frequency by its class midpoint, add the products, then divide by the sum of the frequencies.
What is the formula for the weighted mean?
- Multiply each weight w by the corresponding value x, then add the products, finally divide that total by the sum of the weights.
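The weighted mean can be sketched in Python as below. The mean of a frequency distribution is the same calculation with the frequencies as the weights and the class midpoints as the values. The function name `weighted_mean` and the sample numbers are illustrative only.

```python
def weighted_mean(values, weights):
    """Sum of (weight * value) divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total = sum(w * x for w, x in zip(weights, values))
    return total / sum(weights)


# Weighted mean: a score of 90 with weight 3 and a score of 70 with weight 1
print(weighted_mean([90, 70], [3, 1]))           # 85.0

# Frequency distribution: class midpoints 10, 20, 30 with frequencies 2, 3, 5
print(weighted_mean([10, 20, 30], [2, 3, 5]))    # 23.0
```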
What are the measures of variation?
How is the Range of a set of data values calculated?
Range = (maximum data value)-(minimum data value)
What is the standard deviation?
a measure of variation of all values from the mean
The value of the standard deviation is ________.
It is never _________.
Larger values of s indicate __________ amounts of variation.
The value of the standard deviation s can increase dramatically with the inclusion of one or more ___________.
How do you calculate the standard deviation?
- 1. compute the mean.
- 2. subtract the mean from each individual value, giving a list of deviations of the form (x - mean).
- 3. square each of the deviations from step 2, giving values of the form (x - mean)^2.
- 4. Add all of the squares obtained.
- 5. Divide the total by n - 1 (1 less than the total number of sample values).
- 6. Find the square root of the result.
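The six steps above, written out as a small Python sketch for a sample. The name `sample_std` is chosen for illustration; Python's standard library provides `statistics.stdev` for the same calculation.

```python
import math


def sample_std(values):
    """Sample standard deviation s, following the six steps."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two sample values")
    mean = sum(values) / n                        # step 1: compute the mean
    deviations = [x - mean for x in values]       # step 2: x - mean
    squares = [d ** 2 for d in deviations]        # step 3: (x - mean)^2
    total = sum(squares)                          # step 4: add the squares
    variance = total / (n - 1)                    # step 5: divide by n - 1
    return math.sqrt(variance)                    # step 6: square root


# mean = 6, deviations = -5, -3, 8, squares sum to 98, 98 / 2 = 49
print(sample_std([1, 3, 14]))    # 7.0
```

The value computed in step 5, before the square root is taken, is the sample variance.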
When comparing variation in samples with very different means, it is better to use the _____ __ _________.
coefficient of variation.
What is the variance of a set of values for a sample and the population?
sample variance = square of the standard deviation s.
Population variance = square of the population standard deviation sigma.
What is the Range Rule of Thumb for the Standard Deviation?
- Minimum "usual" value = (mean) - 2 x (standard deviation)
- Maximum "usual" value = (mean) + 2 x (standard deviation)
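A minimal Python sketch of the range rule of thumb. The function names and the example mean and standard deviation are hypothetical.

```python
def usual_range(mean, std_dev):
    """Return (minimum usual value, maximum usual value)."""
    return mean - 2 * std_dev, mean + 2 * std_dev


def is_unusual(x, mean, std_dev):
    """True if x falls outside the usual range."""
    low, high = usual_range(mean, std_dev)
    return x < low or x > high


print(usual_range(100, 15))        # (70, 130)
print(is_unusual(135, 100, 15))    # True
print(is_unusual(120, 100, 15))    # False
```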
How is the standard deviation s estimated?
For many data sets, a value is unusual if it differs from the mean by more than _____ standard deviations.
The empirical rule states: that for data sets having a distribution that is approximately bell-shaped, the following properties apply:
1. About ___% of all values fall within 1 standard deviation of the mean.
2. About ___% of all values fall within 2 standard deviations of the mean
3. About ___% of all values fall within 3 standard deviations of the mean.
What is Chebyshev's Theorem?
The proportion of any set of data lying within K standard deviations of the mean is always at least 1 - 1/K^2, where K is any positive number greater than 1.
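Chebyshev's theorem is usually stated with the lower bound 1 - 1/K^2. A small Python sketch of that bound (the function name is chosen for illustration):

```python
def chebyshev_lower_bound(k):
    """Minimum proportion of data within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("K must be greater than 1")
    return 1 - 1 / k ** 2


print(chebyshev_lower_bound(2))    # 0.75
```

So at least 75% of the values in any data set lie within 2 standard deviations of the mean, and for K = 3 the bound is 8/9, or about 89%.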
What is the Mean Absolute Deviation (MAD)?
- the mean distance of the data from the mean
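As a sketch, the mean absolute deviation can be computed in Python like this (the function name and data are illustrative):

```python
def mean_absolute_deviation(values):
    """Average of the absolute distances |x - mean|."""
    n = len(values)
    mean = sum(values) / n
    return sum(abs(x - mean) for x in values) / n


# mean = 5, distances = 3, 1, 1, 3, so MAD = 8 / 4
print(mean_absolute_deviation([2, 4, 6, 8]))    # 2.0
```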
What is the coefficient of variation?
- for a set of nonnegative sample or population data, the coefficient of variation (CV) describes the standard deviation relative to the mean, expressed as a percent: CV = (standard deviation / mean) x 100%
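The coefficient of variation is the standard deviation divided by the mean, multiplied by 100%. A Python sketch for sample data, using the standard library's `statistics.stdev` (sample standard deviation); the function name is hypothetical.

```python
import statistics


def coefficient_of_variation(values):
    """Sample CV as a percent: (s / mean) * 100."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is 0")
    return statistics.stdev(values) / mean * 100


# s = 7 and mean = 6 for this sample
print(round(coefficient_of_variation([1, 3, 14]), 2))    # 116.67
```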
What is a z score?
the number of standard deviations that a data value is from the mean.
What is the round-off rule for z scores?
round to two decimal places.
Whenever a data value is less than the mean, its corresponding z score is ___________.
How is the z score calculated?
What are the Ordinary and unusual values for a z score?
- Ordinary values: -2 ≤ z score ≤ 2
- Unusual values: z score < -2 or z score > 2
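Since a z score is the number of standard deviations a value lies from the mean, it can be computed as (x - mean) / (standard deviation). A short Python sketch with hypothetical function names and example numbers:

```python
def z_score(x, mean, std_dev):
    """z = (x - mean) / standard deviation, rounded to two decimal places."""
    return round((x - mean) / std_dev, 2)


def is_unusual_z(z):
    """Unusual values have z < -2 or z > 2."""
    return z < -2 or z > 2


print(z_score(130, 100, 15))                  # 2.0
print(z_score(60, 100, 15))                   # -2.67
print(is_unusual_z(z_score(60, 100, 15)))     # True
```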
What are Percentiles?
measures of location, denoted P1, P2, ..., P99, which divide a set of data into 100 groups with about 1% of the values in each group.
How is the percentile of a data value found?
- percentile of value x = [(number of values less than x) / (total number of values)] * 100
Round the result to the nearest whole number.
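The percentile of a value is the number of values less than x, divided by the total number of values, times 100. A Python sketch follows; the function name is illustrative. Note that Python's built-in `round` sends exact halves to the nearest even number, which may differ from a textbook's rounding of a result such as 62.5.

```python
def percentile_of_value(values, x):
    """Percent of values strictly less than x, as a whole number."""
    below = sum(1 for v in values if v < x)
    return round(100 * below / len(values))


data = list(range(1, 11))              # the values 1 through 10
print(percentile_of_value(data, 7))    # 60
```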
How do you convert from the kth percentile to the corresponding data value?
- 1. sort data lowest to highest.
- 2. Compute L = [(percentile in question) / 100] * (number of values)
- 3. If L is a whole number, the value of the kth percentile is midway between the Lth value and the next value in the sorted set of data. Find Pk by adding the Lth value and the next value and dividing by 2.
- 4. If L is NOT a whole number, round L up to the next larger whole number. The value of Pk is the Lth value, counting from the lowest.
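The conversion procedure can be sketched in Python as below, taking L = (k / 100) * n. The function name is hypothetical, and k is assumed to be a whole number from 1 to 99. Integer arithmetic is used to test whether L is a whole number so that floating-point error cannot affect the decision.

```python
def value_at_percentile(values, k):
    """Data value at the kth percentile, using the locator L = (k / 100) * n."""
    if not 1 <= k <= 99:
        raise ValueError("k must be between 1 and 99")
    ordered = sorted(values)                  # step 1: sort lowest to highest
    n = len(ordered)
    whole, remainder = divmod(k * n, 100)     # step 2: L = k * n / 100
    if remainder == 0:
        # step 3: L is whole, so average the Lth value and the next one
        return (ordered[whole - 1] + ordered[whole]) / 2
    # step 4: L is not whole, so round up and take that value (1-based position)
    return ordered[whole]


data = [10, 20, 30, 40, 50, 60, 70, 80]
print(value_at_percentile(data, 25))    # 25.0  (L = 2, average of 20 and 30)
print(value_at_percentile(data, 40))    # 40    (L = 3.2, rounded up to 4)
```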
What are Quartiles?
Quartiles are measures of location that divide a set of data into four groups with about 25% of the values in each group.
What is the Procedure for Constructing a Boxplot?
- 1. Find the 5-number summary (minimum value, Q1, median, Q3, maximum value)
- 2. Construct a scale with values that include the minimum and maximum data values.
- 3. Construct a box extending from Q1 to Q3, draw a line in the box at the median value.
- 4. Draw lines extending outward from the box to the minimum and maximum data values.
What does a boxplot tell/show us?
A boxplot gives us information about the distribution and spread of the data, and it is often useful for comparing two or more data sets.
What is a modified boxplot?
modified boxplots represent outliers as special points.
What are the modifications in a modified boxplot?
- 1. a data value is an outlier if it is:
- - above Q3 by an amount greater than 1.5 x IQR
- - below Q1 by an amount greater than 1.5 x IQR
- 2. Solid horizontal line extends only as far as the minimum data value that is not an outlier and the maximum data value that is not an outlier.
What is the Interquartile Range (IQR)?
Q3 - Q1 = IQR
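With the IQR in hand, the outlier rule for modified boxplots can be sketched in Python. Q1 and Q3 are passed in directly because different textbooks and software compute quartiles slightly differently; the function names and data are illustrative.

```python
def outlier_fences(q1, q3):
    """Cutoffs 1.5 x IQR below Q1 and 1.5 x IQR above Q3."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def find_outliers(values, q1, q3):
    """Values below Q1 or above Q3 by more than 1.5 x IQR."""
    low, high = outlier_fences(q1, q3)
    return [v for v in values if v < low or v > high]


print(outlier_fences(20, 40))                           # (-10.0, 70.0)
print(find_outliers([5, 25, 38, 75, -12], 20, 40))      # [75, -12]
```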
What 10 key factors should be considered when designing or analyzing data?
- 1. context of the data
- 2. source of the data
- 3. sampling method
- 4. measures of center
- 5. measures of variation
- 6. distribution
- 7. outliers
- 8. changing patterns over time
- 9. conclusions
- 10. practical implications.
How are class widths calculated?
- class width = [(max data value) - (min data value)] / (number of classes)
Numbers are usually rounded up
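A small Python sketch of the class width calculation. Rounding up to the next whole number with `math.ceil` is an assumption made here; in practice the result is often rounded up to whatever convenient number suits the data. The function name and data are hypothetical.

```python
import math


def class_width(values, num_classes):
    """(max - min) / number of classes, rounded up to a whole number."""
    spread = max(values) - min(values)
    return math.ceil(spread / num_classes)    # assumption: round up to an integer


# (97 - 12) / 7 is about 12.14, which rounds up to 13
print(class_width([12, 45, 97, 60], 7))    # 13
```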