# Stats Exam 1

- Element: the entity on which data are collected. Ex: a player (identified by name)
- Observation: the set of measurements obtained for a particular element
- Variable: a characteristic of an element
- Variable: Categorical (qualitative): non-numerical data that are classified into categories. Ex: position or team
- Variable: Categorical: Nominal: categorical data that have no meaningful order. Ex: position, team
- Variable: Categorical: Ordinal: categorical data that can be ordered. Ex: shirt size (small, medium, large)
- Variable: Quantitative: numerical data that are measured on a numerical scale. Ex: points scored in a game
- Variable: Quantitative: Interval: numerical data with no true 0 point. Ex: temperature in °C or °F
- Variable: Quantitative: Ratio: numerical data with a true 0 point. Ex: points scored
- Cross-Sectional Data: data collected at the same (or approximately the same) point in time. Ex: points scored in a specific week
- Time Series: data collected over different time periods. Ex: points scored over multiple seasons
- Descriptive Statistics: uses tables, graphs, and numerical methods to summarize data
- Inferential Statistics: uses data from a sample to make estimates or test hypotheses about the characteristics of a population
- Population: the set of ALL elements of interest in a study
- Sample: a SUBSET of a population. Sample statistics are used to estimate population parameters
- Frequency Distribution: a table that summarizes the number of items that fall into each of several non-overlapping categories
- Histogram: a graphical display of quantitative data. Uses intervals (classes) to display the data from a frequency table
- Correlation: shows an association between 2 variables
- Measures of Central Tendency: Mean: the average of a sample of n observations (the sum of the values divided by n). The mean is sensitive to extreme values
- Measures of Central Tendency: Median: the middle value of the ordered data, with half of the observations on either side of it. The median is resistant to extreme values
- Measures of Central Tendency: Mode: the observation that occurs most frequently. A data set can have 2 modes (bimodal) or more than 2 modes (multimodal)
- Statistic: a numeric measure of SAMPLE data
- Parameter: a numeric measure of POPULATION data
- Types of Distribution: Symmetric: mean = median
- Types of Distribution: Skewed Right (positive): the mean is greater than the median. The median is the best measure of center
- Types of Distribution: Skewed Left (negative): the mean is less than the median. The median is the best measure of center
- Percentile: the p-th percentile is a value such that at least p% of the observations fall at or below it. To find a percentile:
  - Arrange the observations in increasing order
  - Compute the index: i = (p/100) × n
  - If i is an integer, take the average of the values at positions i and i + 1
  - If i is not an integer, round up: use the value at the position of the next integer greater than i
- Interquartile Range (IQR): the range between the 25th and 75th percentiles (Q3 − Q1). Holds the middle 50% of the data set
- Measures of Variability and Dispersion: Range: the difference between the largest and smallest values in a data set
- Measures of Variability and Dispersion: Variance: based on the squared differences between each value and the mean. Population variance (σ²) divides by N; sample variance (s²) has (n − 1) in the denominator
- Measures of Variability and Dispersion: Standard Deviation: the square root of the variance. Easier to interpret than the variance because it is in the same units as the original data
- Measures of Variability and Dispersion: Coefficient of Variation: measures how large the standard deviation is relative to the mean, expressed as a percentage (CV = standard deviation / mean × 100). Lower is better (less variability relative to the mean). Used to compare data sets that have different standard deviations and means
- Measures of Distribution Shape and Relative Location: Z Scores: give the number of standard deviations an observation is from the mean, z = (value − mean) / standard deviation. A z score of 0 indicates that the value is equal to the mean
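The percentile steps and the variability measures above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names and the `points` data are made up for the example, and the percentile function follows the index rule from the card (i = (p/100) × n), which can differ slightly from the interpolation methods that statistics libraries use by default.

```python
import math


def percentile(values, p):
    """p-th percentile (0 < p < 100) using the index rule i = (p/100) * n."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    ordered = sorted(values)      # step 1: arrange in increasing order
    n = len(ordered)
    i = p * n / 100               # step 2: index, read as a 1-based position
    if i.is_integer():
        k = int(i)
        # step 3: i is an integer, so average the values at positions i and i + 1
        return (ordered[k - 1] + ordered[k]) / 2
    # step 4: i is not an integer, so round up to the next position
    return ordered[math.ceil(i) - 1]


def sample_variance(values):
    """Sample variance s^2: squared deviations from the mean, divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


def coefficient_of_variation(values):
    """CV = standard deviation / mean * 100 (a percentage)."""
    mean = sum(values) / len(values)
    return math.sqrt(sample_variance(values)) / mean * 100


def z_score(x, values):
    """Number of sample standard deviations that x lies from the mean."""
    mean = sum(values) / len(values)
    return (x - mean) / math.sqrt(sample_variance(values))


# Hypothetical data: points scored by one player over eight games.
points = [12, 7, 22, 15, 9, 30, 18, 11]

q1 = percentile(points, 25)   # 10.0
q3 = percentile(points, 75)   # 20.0
print("IQR:", q3 - q1)
print("sample variance:", sample_variance(points))
print("CV (%):", coefficient_of_variation(points))
print("z score of the 30-point game:", z_score(30, points))
```

With eight observations the 25th-percentile index is exactly 2, so the result is the average of the 2nd and 3rd ordered values. The 90th-percentile index is 7.2, which rounds up to the 8th value.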
- Measures of Distribution Shape and Relative Location: Outliers: observations whose z score is greater than 2 in absolute value for highly skewed distributions, or greater than 3 in absolute value for normal distributions
- Measures of Distribution Shape and Relative Location: Chebyshev's Theorem: applies to any distribution. At least (1 − 1/z²) of the observations fall within z standard deviations of the mean, for z > 1
  - Within ±2 standard deviations: at least 75% of the observations
  - Within ±3 standard deviations: at least 89% of the observations
- Measures of Distribution Shape and Relative Location: Empirical Rule (normal distribution):
  - Within ±1 standard deviation: approximately 68% of the observations
  - Within ±2 standard deviations: approximately 95% of the observations
  - Within ±3 standard deviations: almost all (about 99.7%) of the observations
- Measures of Distribution Shape and Relative Location: Correlation Coefficient: measures the relationship between 2 random variables
- Measures of Distribution Shape and Relative Location: Correlation Coefficient: Univariate: data collected on one random variable
- Measures of Distribution Shape and Relative Location: Correlation Coefficient: Bivariate: data collected on two random variables
- Measures of Distribution Shape and Relative Location: Correlation Coefficient: Pearson product-moment sample correlation coefficient (r_xy): measures the strength of the linear relationship. The sign matches the direction (slope) of the relationship. Must fall between -1 and +1. This is a POINT measurement. Rule of thumb for the size of |r_xy|:
  - 0.00 to 0.29: little if any correlation
  - 0.30 to 0.49: weak/low correlation
  - 0.50 to 0.69: moderate correlation
  - 0.70 to 0.89: strong/high correlation
  - 0.90 to 1.00: very strong/very high correlation
- Probability: Experimental Outcome: a sample point
- Probability: Event: a collection of one or more sample points (experimental outcomes)
- Probability: Properties: each probability must fall between 0 and 1 (inclusive), and the probabilities of all sample points must sum to 1
- Probability: When to use the combination or permutation formula?
  - Combinations (C): when order is not important
  - Permutations (P): when order is important
- Probability: Methods (3):
  - Classical: number of favorable outcomes / total number of equally likely outcomes
  - Relative Frequency: used when an experiment is repeated many times
  - Subjective: based on experience or intuition. Used when no relevant data are available
- Discrete Probability Variables: Random Variables: a variable that associates a numerical value with each outcome of an experiment
- Discrete Probability Variables: Random Variables: Discrete: takes a finite number of values (or a countable sequence such as 0, 1, 2, ...). Ex: number of defective radios
- Discrete Probability Variables: Random Variables: Discrete: Properties: 0 ≤ f(x) ≤ 1 and Σf(x) = 1
- Discrete Probability Variables: Random Variables: Discrete: the uniform probability function has the form f(x) = 1/n, where n is the number of values the variable can take
- Discrete Probability Variables: Random Variables: Discrete: Expected Value: the mean of a discrete random variable, E(x) = Σ x·f(x)
- Discrete Probability Variables: Random Variables: Continuous: takes any numerical value in one or more intervals on the real number line. Between any 2 points a 3rd can always be found, as with a time measurement

Author: Anonymous. ID: 66486. Card set: Stats Exam 1. Updated: 2011-02-15T18:27:13Z
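A short Python sketch can check several of the rules in this card set: Chebyshev's bound (at least 1 − 1/z² of the observations within z standard deviations), the combination and permutation counts, and the expected value of a discrete random variable. The function names and the examples (choosing 2 players out of 5, a fair six-sided die) are illustrative assumptions, not part of the original cards. `math.comb` and `math.perm` require Python 3.8 or later.

```python
import math
from fractions import Fraction


def chebyshev_lower_bound(z):
    """Minimum share of observations within z standard deviations (z > 1)."""
    if z <= 1:
        raise ValueError("Chebyshev's theorem is only informative for z > 1")
    return 1 - 1 / z ** 2


def expected_value(pmf):
    """E(x) = sum of x * f(x) for a discrete distribution given as {x: f(x)}."""
    probs = list(pmf.values())
    # Check the two required properties: 0 <= f(x) <= 1 and sum f(x) = 1.
    if any(p < 0 or p > 1 for p in probs):
        raise ValueError("each f(x) must lie between 0 and 1")
    if not math.isclose(float(sum(probs)), 1.0):
        raise ValueError("the probabilities must sum to 1")
    return sum(x * p for x, p in pmf.items())


# Chebyshev: the 75% and 89% figures come from z = 2 and z = 3.
print(chebyshev_lower_bound(2))            # 0.75
print(round(chebyshev_lower_bound(3), 2))  # 0.89

# Hypothetical counting example: pick 2 players out of 5.
pairs = math.comb(5, 2)          # order does not matter: 10 combinations
ordered_pairs = math.perm(5, 2)  # order matters: 20 permutations
print(pairs, ordered_pairs)

# Discrete uniform distribution: a fair die has f(x) = 1/n with n = 6.
die = {face: Fraction(1, 6) for face in range(1, 7)}
print(expected_value(die))       # 7/2
```

Using `Fraction` keeps the die probabilities exact, so the expected value comes out as 7/2 (3.5) with no floating-point rounding.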