
Raw scores
they are meaningless on their own because we don't know what's good or bad, high or low

Relative scale
 most test scores are judged on a relative scale
 relative to other test takers

Summary statistics
 about summarizing single variables
 focus on quantitative (numerical) variables
 start with a "bag of data" (collection of numbers)
 consists of one or (usually) more variables

what does summarizing mean
 making information more concise (shorter)
 summarizing depends on the sample size (N)
 if N is large, we need to be very concise
 if N is small, we can be less concise (more complete)

Sorting
 the simplest summary technique is to sort the data
 works with small sets of numbers
 easier to see the distribution when the data are sorted
 no information is lost; the presentation is merely simplified

Histogram
 a bar graph of a grouped frequency distribution of a quantitative variable
 the appearance of a histogram can vary depending on how many categories you use

how to create a histogram
 create categories (groups, or bins)
 count the number of people or items in each group
 make a bar graph, one bar for each group
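
A minimal sketch of the three steps above in Python. The scores and the bin width of 10 are made up for illustration, and the helper name `histogram_counts` is hypothetical. Bins with no scores are simply left out here.

```python
from collections import Counter

def histogram_counts(values, bin_width):
    """Count how many values fall into each bin of the given width."""
    # Step 1: assign each value to a bin, labelled by the bin's lower edge.
    bins = [(v // bin_width) * bin_width for v in values]
    # Step 2: count the number of items in each bin.
    return dict(sorted(Counter(bins).items()))

scores = [52, 57, 61, 64, 68, 70, 73, 75, 78, 91]  # made-up test scores
bin_width = 10  # assumed bin width; other widths change the picture
counts = histogram_counts(scores, bin_width)

# Step 3: one bar per bin, drawn here as a row of '#' characters.
for lower, count in counts.items():
    print(f"{lower}-{lower + bin_width - 1}: {'#' * count}")
```

Rerunning with a different `bin_width` shows how the appearance of the histogram depends on the number of bins.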

Frequency polygons
 the same as histograms, but with the midpoints connected by lines rather than drawn as bars
 not used very much

raw frequencies
 counts
 are the original numbers

relative frequencies
 the numbers divided by N (the total)
 percentages are the same as relative frequencies, except with the decimal point shifted over two places
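
A small sketch of the conversion from raw to relative frequencies and percentages. The category labels and counts are made up for illustration.

```python
raw = {"A": 5, "B": 10, "C": 5}  # hypothetical raw frequencies (counts)

n = sum(raw.values())  # N, the total number of observations

# Relative frequency: each count divided by N.
relative = {label: count / n for label, count in raw.items()}

# Percentage: the relative frequency times 100
# (the decimal point shifted two places).
percent = {label: 100 * r for label, r in relative.items()}

for label in raw:
    print(label, raw[label], relative[label], percent[label])
```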

Symmetrical
 left side is the mirror image of the right side
 many distributions are symmetrical

Shapes of distributions
 Symmetrical
 Uniform
 Bell-shaped
 Floor and Ceiling effects
 Skewed
 Bimodal

Uniform
 equal probabilities in all categories
 uniform distribution is symmetrical
 in a uniform distribution, the bars are close together

Bell-shaped
 most common
 another example of a symmetrical distribution
 bars are close together in a bell shape

Floor effects
 there is a lower limit to the possible numbers
 usually this is 0
 examples: incomes, which generally cannot be negative

ceiling effect
an upper limit to the possible numbers

Skewed
 to the right (positively skewed)
 to the left (negatively skewed)
 skew is frequently due to floor and ceiling effects

Bimodal
 two humps or central points
 like two bell shapes put together

Boxplots (or box-and-whisker plots)
 includes median (a small square)
 outliers (small circle)
 non-outlier range (in the shape of a capital I)
 and the middle 50% of the data, from the 25th to the 75th percentile (a big box)

measures of central tendency
 these measure where the "middle" or "center" is, or where most of the action is in the distribution
 includes the mean, median, and mode

measures of dispersion or variability
these measure how spread out the data are

mean
 arithmetic average: add them up and divide by N
 most sensitive to outliers

median
 middlemost number (same as the 50th percentile)
 if there is an even number of values, average the middle two
 sort the numbers first
 less sensitive to outliers
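
A short sketch comparing how the mean and the median react to an outlier, using Python's standard `statistics` module. The numbers are made up for illustration.

```python
from statistics import mean, median

data = [1, 2, 3, 4, 5]
with_outlier = [1, 2, 3, 4, 50]  # same data, but the largest value is an outlier

# The mean is pulled toward the outlier; the median stays put.
mean_plain, median_plain = mean(data), median(data)
mean_out, median_out = mean(with_outlier), median(with_outlier)
print(mean_plain, median_plain)  # 3 3
print(mean_out, median_out)      # 12 3

# With an even number of values, the median is the average of the middle two.
even_median = median([1, 2, 3, 4])
print(even_median)  # 2.5
```

`median` sorts the values internally, so the input does not need to be sorted first.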

mode
 the most frequently occurring number
 the hump in the histograms
 the only measure that works with qualitative data
 the only measure of central tendency that can have more than one value (e.g., bimodal)
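
A brief sketch of finding the mode, including a bimodal case and qualitative data. It uses `statistics.multimode` (available from Python 3.8), which returns every value tied for most frequent. The data are made up for illustration.

```python
from statistics import multimode

# One clear mode.
one_mode = multimode([1, 2, 2, 3])

# Two values tied for most frequent: a bimodal data set.
two_modes = multimode([1, 2, 2, 3, 3, 4])

# The mode also works with qualitative (non-numeric) data.
colour_mode = multimode(["red", "blue", "red", "green"])

print(one_mode, two_modes, colour_mode)
```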

when a distribution is symmetrical and bell-shaped
the mean, median, and mode are the same

when distributions are skewed
mean, median, and mode are separate
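
A small sketch of how the three measures separate in a skewed data set. The numbers are made up to be skewed to the right (a few large values, and a floor near the low end); the ordering shown is typical for such data rather than guaranteed.

```python
from statistics import mean, median, mode

# Made-up, positively skewed data: most values are small, a few are large.
skewed = [1, 1, 2, 2, 2, 3, 3, 4, 10, 20]

skew_mode = mode(skewed)      # the most frequent value
skew_median = median(skewed)  # the middle of the sorted values
skew_mean = mean(skewed)      # pulled toward the long right tail

print(skew_mode, skew_median, skew_mean)  # 2 2.5 4.8
```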

measures of dispersion or variability
 these measure how spread out the data are
 a data set: 3 3 3 3 (0 variability)
 another data set: 1 2 3 4 5 (medium variability)
 another data set: 1 1 3 5 7 (larger variability)

Ordinal measures of variability
 these depend only on the order of the numbers
 range, interquartile range, and semi-interquartile range


interquartile range
 chop off the top 25% (upper quartile)
 chop off the bottom 25% (lower/bottom quartile)
 take the difference between the two cut points (75th percentile minus 25th percentile)

semi-interquartile range
half of the interquartile range
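
A sketch of the three ordinal measures (range, interquartile range, semi-interquartile range) on made-up data. Textbooks and software use several slightly different conventions for locating quartiles; this example assumes the "inclusive" method of `statistics.quantiles`, so other tools may give slightly different cut points.

```python
from statistics import quantiles

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # made-up data

# Range: largest value minus smallest value.
data_range = max(data) - min(data)

# Quartile cut points (assumed convention: method="inclusive").
q1, q2, q3 = quantiles(data, n=4, method="inclusive")

# Interquartile range: 75th percentile minus 25th percentile.
iqr = q3 - q1

# Semi-interquartile range: half of the interquartile range.
semi_iqr = iqr / 2

print(data_range, q1, q3, iqr, semi_iqr)
```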

quantitative measures of variability
 these are based on the actual numbers, not just their orders
 variance and standard deviation

variance
average squared deviation from the mean

Standard deviation
square root of the variance
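
A sketch of variance and standard deviation written directly from the definitions above, applied to the three data sets listed under measures of dispersion. It divides by N (the average squared deviation); some texts divide by N - 1 when estimating from a sample, which gives slightly larger values. The function names are hypothetical.

```python
import math

def variance(values):
    """Average squared deviation from the mean (dividing by N)."""
    n = len(values)
    m = sum(values) / n
    return sum((x - m) ** 2 for x in values) / n

def standard_deviation(values):
    """Square root of the variance."""
    return math.sqrt(variance(values))

data_sets = [
    [3, 3, 3, 3],     # no variability
    [1, 2, 3, 4, 5],  # medium variability
    [1, 1, 3, 5, 7],  # larger variability
]

for data in data_sets:
    print(data, round(variance(data), 2), round(standard_deviation(data), 2))
```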

Norms
 are summary statistics of test results; they tell us what is "normal" or average
 we can tell how far an individual score is from average using summary statistics
 Z scores are commonly used
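
A sketch of a Z score, which expresses a score as the number of standard deviations it lies above or below the mean: z = (score - mean) / standard deviation. The norm values here (mean 100, standard deviation 15) are assumed for illustration only, and the function name is hypothetical.

```python
def z_score(score, norm_mean, norm_sd):
    """Distance of a score from the mean, in standard deviation units."""
    return (score - norm_mean) / norm_sd

NORM_MEAN = 100  # assumed norm (average) for a hypothetical test
NORM_SD = 15     # assumed standard deviation for the same test

z_high = z_score(130, NORM_MEAN, NORM_SD)  # two SDs above average
z_low = z_score(85, NORM_MEAN, NORM_SD)    # one SD below average
z_avg = z_score(100, NORM_MEAN, NORM_SD)   # exactly average

print(z_high, z_low, z_avg)  # 2.0 -1.0 0.0
```

A positive Z score means the raw score is above average and a negative one means it is below, which is what gives a raw score its meaning on a relative scale.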

