# Elementary Statistics

- **Individuals**: objects described by a set of data; they may be people, animals, or things.
- **Variable**: any characteristic of an individual; it can take different values for different individuals.
- **Categorical variable**: places an individual into one of several groups or categories.
- **Quantitative variable**: takes numerical values for which arithmetic operations like adding and averaging make sense; usually recorded in a unit of measurement.
- **Distribution of a variable**: what values it takes and how often it takes these values.
- **Distribution of a categorical variable**: lists the categories and gives the count or % of individuals who fall in each category.
- **Roundoff errors**: do not point to mistakes but to the effects of rounding off results.
- **Pie charts**: show the distribution of a categorical variable as pie slices; must include all the categories that make up the whole. Use a pie chart only when you want to emphasize each category's relation to the whole.
- **Bar graphs**: represent each category as a bar; bar height shows the category counts or %'s. Can compare any set of quantities measured in the same units.
- **Histogram**: the most common graph of the distribution of one quantitative variable.
- **Shape of a distribution**:
  - Symmetric: right and left sides are roughly mirror images.
  - Skewed to the right: the right side of the histogram extends much farther than the left.
  - Skewed to the left: the left side of the histogram extends much farther than the right.
- **Center of a distribution**: the midpoint of the distribution.
- **Spread of a distribution**: the range of the distribution, from the smallest value to the largest.
- **Stemplot**: like a histogram turned on end, using numbers. You cannot choose the classes; they are given to you. Preserves the actual value of each observation. Does not work well with large data sets.
- **Stems and leaves**: the parts of a stemplot.
  - Stem: all but the final (rightmost) digit.
  - Leaf: the final digit.
- **Split stems**: double the number of stems when all the leaves would fall on just a few stems. Each stem appears twice; leaves 0 to 4 go on the upper stem, leaves 5 to 9 go on the lower stem.
- **Timeplot**: plots each observation of a variable against the time at which it was measured. Always put time on the horizontal axis and the variable you are measuring on the vertical axis.
- **Cycles**: regular up and down movements in a timeplot.
- **Trend**: long term up or down movement over time.
- **Time series data**: change in one variable at a specific location over time; displayed with a timeplot.
- **Cross sectional data**: displays a variable at many locations at the same time.
- **Exploratory data analysis**: using graphs and numerical summaries to describe the variables in a data set and the relations among them.
- **Mean is not a resistant measure of center**: the mean is sensitive to a few extreme observations. It is also sensitive to a skewed distribution without outliers, where it is pulled towards the tail.
- **Median M**: the midpoint of a distribution, such that half the observations are smaller and half are larger.
- **Five number summary**: gives the smallest and largest observations as well as the median and the 1st and 3rd quartiles, in the order Minimum, Q1, M, Q3, Maximum.
- **Boxplot**: a central box spans Q1 and Q3; a line in the box marks the median M; lines extend from the box out to the smallest and largest observations. Best used for side by side comparison of more than one distribution.
- **IQR**: interquartile range, the distance between Q1 and Q3. IQR = Q3 - Q1.
- **1.5 x IQR rule for outliers**: call an observation an outlier if it is more than 1.5 x IQR above Q3 or below Q1.
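
The five number summary, the IQR, and the 1.5 x IQR rule can be tried out with a short Python sketch. The function names `five_number_summary` and `outliers` and the sample data are made up for illustration. The cards above do not say how Q1 and Q3 are computed, so this sketch assumes one common textbook convention: Q1 is the median of the observations below the overall median's position, and Q3 is the median of those above it. Other conventions (including those in software packages) can give slightly different quartiles.

```python
from statistics import median


def five_number_summary(data):
    """Return (minimum, Q1, M, Q3, maximum) for at least two observations.

    Assumed quartile convention (not stated in the notes): Q1 and Q3 are
    the medians of the lower and upper halves of the sorted data, and the
    middle observation is left out of both halves when n is odd.
    """
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half]
    upper = xs[half + 1:] if n % 2 else xs[half:]
    return (xs[0], median(lower), median(xs), median(upper), xs[-1])


def outliers(data):
    """Return the observations flagged by the 1.5 x IQR rule."""
    _, q1, _, q3, _ = five_number_summary(data)
    iqr = q3 - q1  # IQR = Q3 - Q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [x for x in data if x < low_fence or x > high_fence]


# Illustrative data only, not taken from the notes.
sample = [5, 7, 8, 9, 10, 11, 12, 13, 40]
print(five_number_summary(sample))  # (5, 7.5, 10, 12.5, 40)
print(outliers(sample))             # [40]
```

For this sample, IQR = 12.5 - 7.5 = 5, so anything below 7.5 - 7.5 = 0 or above 12.5 + 7.5 = 20 is flagged, which picks out 40. The mean of the same data is about 12.8, above the median of 10, which shows how a single extreme observation pulls the mean while the median resists it.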