# AP Stats: Chapter 1

- **statistics**: obtaining information from numerical data produced by an experiment or a sample; involves designing the experiment or sampling procedure, collecting and analyzing data, and making inferences (statements) about the population
- **descriptive statistics**: methods for organizing, displaying, and describing data by using tables, graphs, and summary measures
- **inferential statistics**: methods that use sample results to help make inferences (decisions or predictions) about a population
- **data analysis**: the process of describing data using graphs and numerical summaries
- **individuals**: objects described by a set of data; may be people, animals, or things
- **variable**: any characteristic of an individual
- **categorical variable**: places an individual into one of several groups or categories; the labels can be numerical in some cases (zip codes, age classes)
- **quantitative variable**: takes numerical values for which it makes sense to find an average; always specify the unit
- **distribution**: tells what values a variable takes and how often it takes these values
- **inference**: drawing conclusions that go beyond the data at hand
- **frequency table**: displays the count (frequency) of observations in each category or class
- **relative frequency table**: shows the percents (relative frequencies) of observations in each category or class
- **roundoff error**: the difference between the calculated approximation of a number and its exact mathematical value
- **pie chart**: shows the distribution of a categorical variable as a "pie" whose slices are sized by the counts or percents for the categories; must include all of the categories that make up the whole
- **when can you not use pie charts**: when you don't have all the categories that make up the whole, or when the categories are separate groups (e.g. ages 10-12 yrs) rather than parts of one whole
- **bar graph**: used to display the distribution of a categorical variable or to compare the sizes of different quantities.
  The categories or quantities being compared are on the horizontal axis, and there are blank spaces between the bars.
- **how can graphs be misleading**: bars with different widths; the choice of intervals on the x-axis and y-axis
- **two-way table**: table of counts that organizes data about two categorical variables
- **marginal distribution**: distribution of values of one of the categorical variables in a two-way table among all of the individuals described by the table; found by calculating percentages for one variable alone; says nothing about the relationship between the two variables
- **conditional distribution**: describes the values of one variable among individuals who have a specific value of another variable; the percentages are calculated across the two variables in a two-way table
- **segmented bar graph**: compares the distribution of a categorical variable in each of several groups. There is a bar for each group, with segments that correspond to the different values of the categorical variable; the height of each segment is determined by the percent of individuals in the group with that value, so each bar has a total height of 100%
- **four steps to answer a statistics problem**:
  - STATE the question you want to answer
  - PLAN how you will answer the question and which statistical techniques the problem requires
  - DO make graphs and carry out the calculations
  - CONCLUDE give a practical conclusion in the setting of the real-world problem
- **side-by-side bar graph**: used to compare the distribution of a categorical variable in each of several groups.
  There is a bar for each group for each value of the categorical variable; the height of each bar is determined by the count or percent of individuals in the group with that value.
- **association**: occurs between two variables if specific values of one variable tend to occur in common with specific values of the other
- **qualitative data**: the values of a categorical variable
- **dotplot**: a simple graph that shows each data value as a dot above its location on a number line
- **overall pattern**: in any graph of data, this can be described by the direction, form, and strength of the relationship; for a distribution, use SOCS: shape, outliers, center, and spread
- **center**: the midpoint (median) represents the typical value, and the calculated mean is the average
- **spread**: indicates the variability of the data; includes the maximum and minimum values and the range
- **range**: maximum value minus minimum value
- **outlier**: an observation that lies outside the overall pattern of the other observations
- **residuals**: outliers have large residuals when they are outliers in the y direction but not in the x direction
- **shape**: the number of peaks (modes); skewness or symmetry; the number of clusters and gaps
- **mode**: the value or class in a statistical distribution having the greatest frequency
- **unimodal**: describes a graph of quantitative data with a single peak
- **bimodal**: describes a graph of quantitative data with two clear peaks
- **multimodal**: describes a graph of quantitative data with more than two clear peaks
- **symmetry**: the left and right sides of the graph are approximately mirror images of each other
- **skewed to the right**: the right side of the graph is much longer than the left side; the tail extends to the right
- **skewed to the left**: the left side of the graph is much longer than the right side; the tail extends to the left
- **stemplot**: observations are separated into stems (all but the final digit) and leaves (the final digit); the stems are arranged in a vertical column, increasing downward, and the leaves are written in increasing order out from the stem
- **splitting stems**: a method for spreading out a stemplot that has too few stems; should use
  asterisks (e.g. `5*` and `5**`).
- **back-to-back stemplot**: used to compare the distribution of a quantitative variable for two groups; one group's leaves go on one side of the shared stem and the other group's leaves go on the other side
- **truncate**: removing one or more digits from a value if it has too many digits, as when creating stemplots
- **histogram**: type of bar graph without spaces that displays the frequency or relative frequency of each class of a quantitative variable; the horizontal axis shows the classes of the variable and the vertical axis has the scale of counts or percents; does not preserve the raw data because the values have been grouped into classes
- **time plots**: used to show bivariate (two-variable) quantitative data where the independent variable (x) represents time
- **independent/dependent variable on graph axes**: dependent = y-axis; independent = x-axis
- **mean formula**: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$
- **mean**: arithmetic average; non-resistant measure; represents the size each observation would have if the total were split equally among all observations
- **resistant measure**: a statistic that is not affected very much by extreme observations
- **median**: midpoint M of a distribution; half the observations are smaller and half are larger; represents the typical value; resistant measure
- **median position formula**: $(n+1)/2$, where n = number of observations in the data set; after arranging the data in increasing order, count this many positions inward from either end to find the median
- **mean > median**: suggests a right-skewed distribution
- **mean = median**: suggests a symmetric distribution
- **mean < median**: suggests a left-skewed distribution
- **mode**: the value that occurs the most
- **68-95-99.7 Rule (aka Empirical Rule)**: in a bell-shaped distribution, 68% of the data lie within one standard deviation of the mean, 95% lie within two standard deviations of the mean, and 99.7% lie within three standard deviations of the mean
- **interquartile range (IQR)**: measures the range of the middle 50% of the data; resistant measure; IQR = Q3 - Q1
- **first quartile**: median of the observations to the left of the median in the ordered data
- **third quartile**: median of the observations to the right of the median in the ordered data
- **percentile implication**: being at the 95th percentile means that 95% of the population got that
  score or lower.
- **IQR rule for calculating outliers**: an observation is an outlier if it falls more than 1.5 × IQR above the third quartile or below the first quartile
- **how to use IQR to calculate bottom cutoff value**: Q1 - 1.5 × IQR
- **how to use IQR to calculate top cutoff value**: Q3 + 1.5 × IQR
- **standard deviation**: measure of spread that looks at how far observations typically are from the mean; typical values are found within about one standard deviation above or below the mean; non-resistant measure; a standard deviation of 0 indicates no variability, and the value is greater when observations are more spread out
- **degrees of freedom**: n - 1, where n is the number of observations
- **variance ($s_x^2$)**: the average squared distance of the observations in a data set from their mean
- **standard deviation formula**: $s_x = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2}$
- **variance formula**: $s_x^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2$
- **how to calculate variance and standard deviation**: find the mean of the data, find the deviations of the observations from the mean, square them, and add them up; then divide by the degrees of freedom (n - 1) to find the variance; to find the standard deviation, take the square root of the variance
- **five-number summary**: minimum, first quartile, median, third quartile, maximum; gives a summary of both center and spread and roughly divides the distribution into quarters
- **boxplot**: graphs the five-number summary; the box spans the quartiles, the whiskers extend to the minimum and maximum values, and the center line represents the median
- **modified boxplots**: boxplots that always show the outliers as dots
- **side-by-side boxplots**: boxplots drawn next to each other on the same scale; used to compare the distributions of two data sets
- **detecting skewness in boxplots**: the distribution is skewed toward the side with the longer whisker; a larger difference in whisker lengths means a more strongly skewed distribution
- **detecting range and IQR in boxplots**: the range is represented by the full length of the boxplot; the IQR is represented by the length of the box
- **options for measuring center and spread (resistant or non-resistant)**: the median and IQR are resistant, so use them when analyzing skewed data or data with outliers; the mean and standard deviation are
  non-resistant and sensitive to skew and outliers.
- **sigma (Σ)**: represents a summation, "add them up"
- **index variable**: i
- **lower limit and upper limit**: the numbers below and above a sigma; they give the range of values that are plugged into i and added up
- **summand**: in sigma notation, what you're adding up (e.g. $i^2$)
- **solution**: in sigma notation, the answer that you solve for (your sum after you add everything up)
- Image cards (graph images not recovered; answers only): bar graph; two-way table; marginal distribution; conditional distribution; segmented bar graph; dotplot; back-to-back stemplot; stemplot; frequency table (categories or classes and counts); relative frequency table (categories or classes and percents); frequency histogram and relative frequency histogram; boxplot; side-by-side boxplot; side-by-side bar graph

Author: Gymnastxoxo17 · ID: 240663 · Card Set: AP Stats: Chapter 1 · Description: d · Updated: 2013-10-16T01:58:21Z
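
The cards on the five-number summary, the IQR rule for outliers, and the variance and standard deviation calculation can be tried out with a short Python sketch. This is only an illustration under stated assumptions: the function names and the `scores` data are hypothetical, and the quartiles follow the cards' convention (median of the observations on each side of the median), which can differ from what a calculator or library reports.

```python
import math

def median(sorted_values):
    """Median of an already-sorted list (the middle value, or the average of the two middle values)."""
    n = len(sorted_values)
    mid = n // 2
    if n % 2 == 1:
        return sorted_values[mid]
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum)."""
    data = sorted(values)
    n = len(data)
    lower = data[: n // 2]        # observations to the left of the median
    upper = data[(n + 1) // 2:]   # observations to the right of the median
    return data[0], median(lower), median(data), median(upper), data[-1]

def iqr_outliers(values):
    """Observations more than 1.5 x IQR below Q1 or above Q3."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    bottom_cutoff = q1 - 1.5 * iqr
    top_cutoff = q3 + 1.5 * iqr
    return [x for x in values if x < bottom_cutoff or x > top_cutoff]

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)

def sample_sd(values):
    """Standard deviation: the square root of the variance."""
    return math.sqrt(sample_variance(values))

# Hypothetical data set, made up for illustration only.
scores = [2, 4, 4, 5, 6, 7, 8, 9, 30]
print(five_number_summary(scores))       # -> (2, 4.0, 6, 8.5, 30)
print(iqr_outliers(scores))              # -> [30]
print(sample_variance([1, 2, 3, 4, 5]))  # -> 2.5
```

In this made-up data set the value 30 lies above the top cutoff Q3 + 1.5 × IQR = 8.5 + 6.75 = 15.25, so the IQR rule flags it as an outlier.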
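
The marginal and conditional distribution cards can also be illustrated with a small two-way table. The sketch below is a minimal example with hypothetical counts and hypothetical names (`table`, `marginal_distribution`, `conditional_distribution`); it stores the table as a dictionary of rows and reports percents.

```python
# Hypothetical two-way table of counts: rows are groups, columns are responses.
table = {
    "female": {"yes": 30, "no": 20},
    "male": {"yes": 10, "no": 40},
}

def marginal_distribution(table):
    """Percent of all individuals in each column category, ignoring the row variable."""
    total = sum(sum(row.values()) for row in table.values())
    column_totals = {}
    for row in table.values():
        for category, count in row.items():
            column_totals[category] = column_totals.get(category, 0) + count
    return {category: 100 * count / total for category, count in column_totals.items()}

def conditional_distribution(table, group):
    """Percent in each column category among individuals in one row group only."""
    row = table[group]
    row_total = sum(row.values())
    return {category: 100 * count / row_total for category, count in row.items()}

print(marginal_distribution(table))               # -> {'yes': 40.0, 'no': 60.0}
print(conditional_distribution(table, "female"))  # -> {'yes': 60.0, 'no': 40.0}
print(conditional_distribution(table, "male"))    # -> {'yes': 20.0, 'no': 80.0}
```

With these made-up counts the two conditional distributions differ (60% "yes" among females versus 20% among males), which is the kind of pattern the association card describes; the marginal distribution alone would not reveal it.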