
Stem-and-leaf diagram, or stemplot
 Stems: the leading digit or digits of each observation
 Leaf: the final digit
 ex. observations range from 30 to 76
 stems will run from 3 to 7
 arrange the leaves in ascending order
 use 2 lines per stem when there are few observations, to see the distribution better
 1st line holds leaves 0 to 4, ex. 3 | 14
 2nd line holds leaves 5 to 9, ex. 3 | 59
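A minimal sketch of building a one-line-per-stem stemplot in Python, assuming two-digit observations (stem = tens digit, leaf = units digit). The function name `stemplot` and the data values are made up for illustration.

```python
from collections import defaultdict

def stemplot(observations):
    """Return stemplot lines for two-digit observations."""
    leaves_by_stem = defaultdict(list)
    # Sorting first puts the leaves on each stem in ascending order.
    for value in sorted(observations):
        stem, leaf = divmod(value, 10)
        leaves_by_stem[stem].append(leaf)
    lines = []
    # Include every stem from the smallest to the largest, even empty ones.
    for stem in range(min(leaves_by_stem), max(leaves_by_stem) + 1):
        leaves = "".join(str(leaf) for leaf in leaves_by_stem.get(stem, []))
        lines.append(f"{stem} | {leaves}")
    return lines

# Made-up observations between 30 and 76, so the stems run from 3 to 7.
data = [30, 76, 45, 41, 58, 62, 67, 33, 49, 51]
for line in stemplot(data):
    print(line)
```

Reading the printed lines from top to bottom gives the observations in increasing order, which is what makes the stemplot handy for finding a median.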

How to find the median of a Stemplot?
 Find the number of observations; if it is even (ex. 40), the two middle positions are the 20th and 21st.
 Now count from the top of the ordered stemplot to the 20th and 21st observations (ex. 67 & 68)
 Median = (67+68)/2 = 67.5

Pie charts
 Each slice of a pie chart is proportional to the relative frequency of its category
 angle = 360 * relative frequency, ex. 360*0.325 = 117 degrees
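A small sketch of the angle calculation, assuming the category counts are known. The function name `pie_angles` and the counts are hypothetical; 13 out of 40 is chosen because it gives the relative frequency 0.325 from the example.

```python
def pie_angles(counts):
    """Map each category to its slice angle in degrees (360 * relative frequency)."""
    total = sum(counts.values())
    return {category: 360 * count / total for category, count in counts.items()}

# Hypothetical counts: 13 of 40 observations has relative frequency 0.325.
angles = pie_angles({"A": 13, "B": 27})
print(angles)
```

The angles of all slices should add up to 360 degrees, which is a quick check on the arithmetic.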

Bar charts
similar to histograms, but the bars don't touch each other as they do in histograms

Distribution shapes
 Bell shaped
 Triangular
 Uniform or rectangular
 Reverse J shaped
 J shaped
 Right skewed
 Left skewed
 All of the above are unimodal shapes: 1 peak
 Bimodal: 2 peaks
 Multimodal: 3 or more peaks
 all on page 73

Population and sample distribution
 population distribution: the distribution of population data; can be called just the distribution of the variable; comes from a census
 sample distribution: the distribution of sample data

Chapter 3
Measures of Center
 called averages; the 3 measures of center are
 Mean
 Median
 Mode
 The mean and median apply only to quantitative data, whereas the mode can be used with both quantitative and qualitative data

Mean
The sum of the observations divided by the number of observations

Median
 to find the median:
 arrange the data in increasing order. ex. 10, 22, 36, ... (helpful to use a stemplot)
 n = # of observations
 If n is odd, the median is the observation exactly in the middle of the ordered list. ex. let n=13: (n+1)/2 = (13+1)/2 = 7, so the 7th observation in the ordered list is the median
 If n is even, there are two middle observations; use the same formula to locate them. ex. let n=10: (n+1)/2 = (10+1)/2 = 5.5, so the median is halfway between the 5th and 6th observations (their mean)
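The steps above can be sketched in Python as follows. The function name `median` and the sample values are illustrative; positions in the notes are 1-based, while Python lists are 0-based, hence the index shifts.

```python
def median(observations):
    """Median via the (n + 1) / 2 position rule."""
    ordered = sorted(observations)      # arrange in increasing order
    n = len(ordered)                    # number of observations
    if n % 2 == 1:
        position = (n + 1) // 2         # e.g. n = 13 gives the 7th observation
        return ordered[position - 1]    # shift to a 0-based index
    # n even: (n + 1) / 2 ends in .5, so average the two middle observations
    lower = ordered[n // 2 - 1]
    upper = ordered[n // 2]
    return (lower + upper) / 2

print(median([36, 10, 22]))         # odd n: the middle observation
print(median(list(range(1, 11))))   # even n = 10: between the 5th and 6th
```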

Mode
 The most frequent value
 ex. 2,5,2,4,6,2,1,2,1,2, the mode would be 2
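A short sketch of finding the mode with Python's `collections.Counter`. The function name `modes` is made up, and returning every tied value is an assumption (conventions differ on how ties and data with no repeats are handled).

```python
from collections import Counter

def modes(observations):
    """Return the most frequent value(s), sorted; ties are all returned."""
    counts = Counter(observations)
    highest = max(counts.values())
    return sorted(value for value, count in counts.items() if count == highest)

print(modes([2, 5, 2, 4, 6, 2, 1, 2, 1, 2]))   # the example from the notes
print(modes(["red", "blue", "red"]))           # works for qualitative data too
```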

Population mean and Sample Mean
 The mean of population data is called the population mean or the mean of the variable; the mean of sample data is called a sample mean. The same terms are used for the median and mode.
 A population has only one population mean, but different samples from it can give many different sample means.

Summation Notation
 The symbol Σ (sigma) is the summation notation symbol, shorthand for "the sum of"
 Summation notation: Σx_{i}, read as "summation x sub i" or "the sum of the observations of the variable x"
 Σx_{i} = x_{1}+x_{2}+x_{3}+...; 1, 2, 3, ... are the subscripts used to distinguish one observation from another.
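As a rough illustration, Σx_{i} corresponds to Python's built-in `sum`, and the mean is that sum divided by the number of observations. The values below are made up.

```python
x = [3, 7, 8]            # x_1, x_2, x_3 (made-up observations)
total = sum(x)           # Σx_i = x_1 + x_2 + x_3
mean = total / len(x)    # sum of the observations divided by their number
print(total, mean)
```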

Measures of variation or measures of spread
 descriptive measures that indicate the amount of variation, or spread, in a data set
 3 measures of variation:
 Range
 Sample standard deviation
 Interquartile range

Range of data set
 the range of a data set is the difference between its max and min observations
 Range = Max - Min, ex. Range = 78 - 72 = 6 in.
 The larger the range, the greater the variation
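A one-line check of the range formula in Python, using made-up heights (in inches) whose max and min match the example.

```python
heights = [72, 73, 76, 76, 78]            # made-up heights in inches
data_range = max(heights) - min(heights)  # Range = Max - Min
print(data_range)
```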

The Sample Standard Deviation
 takes into account all the observations; it is the preferred measure of variation when the mean is used as the measure of center.
 Measures variation by indicating how far, on average, the observations are from the mean.
 1st step is to find the deviations from the mean, that is, how far each observation is from the mean.
 2nd step is to find the sum of squared deviations
 3rd step is to find the sample variance
 4th and final step in computing the sample standard deviation is to take the square root of the sample variance

Deviations from the mean
 Find the mean (x̄) 1st, then
 to find a deviation from the mean, take x_{i} and subtract the mean from it: x_{i} - x̄, ex. 72 - 75 = -3

The sum of squared deviations
To obtain a measure of the total deviation from the mean for all the observations, we square the deviations from the mean and add them up (the unsquared deviations always sum to zero). The result, Σ(x_{i} - x̄)^{2}, is called the sum of squared deviations.

Sample Variance
 is, roughly, an average of the squared deviations. Obtain it by dividing the sum of squared deviations by n - 1, i.e. 1 less than the sample size. The resulting quantity is called the sample variance, s^{2}
 formula: s^{2} = Σ(x_{i} - x̄)^{2} / (n - 1)

Square root of sample variance
The final step in computing the sample standard deviation is to take the square root of the sample variance
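The four steps can be sketched in Python as below. The function name and the heights are illustrative; the data are chosen so that the mean is 75 and the first deviation is 72 - 75 = -3.

```python
import math

def sample_standard_deviation(observations):
    n = len(observations)
    mean = sum(observations) / n
    # Step 1: deviations from the mean
    deviations = [x - mean for x in observations]
    # Step 2: sum of squared deviations
    sum_of_squares = sum(d ** 2 for d in deviations)
    # Step 3: sample variance, dividing by n - 1
    variance = sum_of_squares / (n - 1)
    # Step 4: square root of the sample variance
    return math.sqrt(variance)

heights = [72, 73, 76, 76, 78]   # made-up heights in inches, mean 75
s = sample_standard_deviation(heights)
print(round(s, 3))
```

For these values the sum of squared deviations is 9 + 4 + 1 + 1 + 9 = 24, so the sample variance is 24/4 = 6 and s is the square root of 6.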

Defining formula for sample standard deviation (s)
 s = sqrt[ Σ(x_{i} - x̄)^{2} / (n - 1) ]


Variation and standard deviation
the more variation there is in a data set, the larger its standard deviation

Section 3.3 Quartiles
 divide a data set into quarters (4 equal parts)
 the most commonly used percentiles
 a data set has 3 quartiles
 Q1: the number that divides the bottom 25% of the data from the top 75%
 Q2: the median; divides the bottom 50% from the top 50%
 Q3: divides the bottom 75% from the top 25%
 1st step: arrange the data in increasing order
 2nd step: determine the median of the entire data set (this is Q2)
 3rd step: determine the median of each half of the data set; the median of the bottom half is Q1 and the median of the top half is Q3
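A sketch of the three steps in Python. One assumption is labeled in the code: when n is odd, the overall median is included in both halves, which is one textbook convention (others leave it out, and software often interpolates instead). The function names and data are made up.

```python
def median(ordered):
    """Median of a list that is already sorted."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

def quartiles(observations):
    ordered = sorted(observations)          # step 1: increasing order
    n = len(ordered)
    q2 = median(ordered)                    # step 2: median of the whole set
    half = n // 2
    # Assumption: for odd n the middle observation goes into both halves.
    bottom = ordered[:half + (n % 2)]
    top = ordered[half:]
    return median(bottom), q2, median(top)  # step 3: median of each half

data = [12, 15, 17, 20, 22, 25, 30, 41]     # made-up data, n = 8
print(quartiles(data))
```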

Interquartile Range
The IQR is the difference between the first and third quartiles: IQR = Q3 - Q1

The Five-Number Summary
 the five-number summary of a data set is Min, Q1, Q2, Q3, Max
 1st: find the Min and Max observations
 Variation of the 1st quarter: Q1 - Min
 Variation of the 2nd quarter: Q2 - Q1
 Variation of the 3rd quarter: Q3 - Q2
 Variation of the 4th quarter: Max - Q3
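A tiny sketch of the four differences, given a five-number summary. The function name `quarter_spreads` and the summary values are hypothetical.

```python
def quarter_spreads(minimum, q1, q2, q3, maximum):
    """Spread of each quarter of the data, from a five-number summary."""
    return [q1 - minimum, q2 - q1, q3 - q2, maximum - q3]

# Hypothetical five-number summary: Min=8, Q1=10, Q2=15, Q3=20, Max=50
spreads = quarter_spreads(8, 10, 15, 20, 50)
print(spreads)
```

A last value much larger than the others, as here, suggests a long upper tail.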

Outliers
observations that fall well outside the overall pattern of the data. An outlier can be the result of a measurement or recording error, or it can be an unusually extreme observation.

Lower and Upper limits
 Lower limit = Q1 - 1.5*IQR
 Upper limit = Q3 + 1.5*IQR

Potential outlier
observations that lie below the lower limit or above the upper limit
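The limits and the potential-outlier check can be sketched as follows, taking Q1 and Q3 as given. The function names and data are made up; the data are chosen so that the median of the bottom half is 10 and the median of the top half is 20.

```python
def outlier_limits(q1, q3):
    iqr = q3 - q1                            # interquartile range
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr    # (lower limit, upper limit)

def potential_outliers(observations, q1, q3):
    lower, upper = outlier_limits(q1, q3)
    return [x for x in observations if x < lower or x > upper]

data = [8, 10, 10, 14, 16, 20, 20, 50]       # made-up data with Q1=10, Q3=20
print(outlier_limits(10, 20))
print(potential_outliers(data, 10, 20))
```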

Box-and-whisker diagram
 based on the five-number summary
 1. determine the quartiles
 2. determine the potential outliers and the adjacent values
 3. Draw a box spanning the quartiles (from Q1 to Q3, with a line at Q2), and lines (whiskers) connecting the box to the adjacent values
 4. Plot each potential outlier with an asterisk *

Adjacent values
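the most extreme observations that still lie within the lower and upper limits; if there are no potential outliers, these are the min and max observations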
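A minimal sketch of finding the adjacent values, assuming the lower and upper limits have already been computed. The function name and the numbers are illustrative.

```python
def adjacent_values(observations, lower_limit, upper_limit):
    """Smallest and largest observations that are not potential outliers."""
    inside = [x for x in observations if lower_limit <= x <= upper_limit]
    return min(inside), max(inside)

data = [8, 10, 10, 14, 16, 20, 20, 50]   # made-up data; limits -5 and 35
print(adjacent_values(data, -5, 35))     # 50 is outside, so the whisker stops at 20
```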

Section 3.4 The population mean (mean of a variable)
 Same method as for the sample mean, but different symbols.
 use the letter μ (mu) for the population mean
 N: size of the population
 formula: μ = Σx_{i} / N

Population standard deviation (standard deviation of a variable)
 To distinguish the population standard deviation from the sample standard deviation, the letter used is:
 σ ("sigma"); the population variance is σ^{2}
 σ = sqrt[ Σ(x_{i} - μ)^{2} / N ]

Population Standard Deviation (computing formula)
 Square each observation and add the results: Σx_{i}^{2}
 σ = sqrt[ Σx_{i}^{2} / N - μ^{2} ]
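A sketch comparing the defining form of the population standard deviation with the usual computing form, σ = sqrt[ Σx_{i}^{2} / N - μ^{2} ]. That computing form is assumed here as the standard one; the function names and data are made up.

```python
import math

def population_sd_defining(values):
    n = len(values)
    mu = sum(values) / n
    # Average squared deviation from the population mean, dividing by N
    return math.sqrt(sum((x - mu) ** 2 for x in values) / n)

def population_sd_computing(values):
    n = len(values)
    mu = sum(values) / n
    # Square each observation, add, divide by N, then subtract mu squared
    return math.sqrt(sum(x ** 2 for x in values) / n - mu ** 2)

population = [2, 4, 4, 4, 5, 5, 7, 9]   # made-up population, mean 5
print(population_sd_defining(population))
print(population_sd_computing(population))
```

Both forms give the same value; the computing form avoids working out each deviation, though it can lose precision in floating point when the values are large relative to their spread.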

Parameter and Statistic
 Parameter: a descriptive measure for a population, ex. μ and σ
 Statistic: a descriptive measure for a sample, ex. x̄ and s

Standardized variable
 z = (x - μ) / σ
 always has mean 0 and standard deviation 1

Standard Score (z-Score)
 For an observed value of a variable x, the corresponding value of the standardized variable z is called the z-score of the observation. The term standard score is often used instead of z-score.
 Negative z-score: the observation is below the mean
 Positive z-score: the observation is above the mean
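A short sketch of standardizing a population in Python, assuming z = (x - μ) / σ with the population mean and population standard deviation. The function name `z_scores` and the data are made up.

```python
import math

def z_scores(values):
    n = len(values)
    mu = sum(values) / n                                      # population mean
    sigma = math.sqrt(sum((x - mu) ** 2 for x in values) / n) # population SD
    return [(x - mu) / sigma for x in values]

population = [2, 4, 4, 4, 5, 5, 7, 9]   # made-up population: mean 5, SD 2
z = z_scores(population)
print(z)

# The standardized values themselves have mean 0 and standard deviation 1.
z_mean = sum(z) / len(z)
z_sd = math.sqrt(sum((v - z_mean) ** 2 for v in z) / len(z))
print(z_mean, z_sd)
```

The observation 2 lies below the mean of 5, so its z-score is negative; 9 lies above the mean, so its z-score is positive.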

