Stem-and-leaf diagram or stemplot
- Stems - the leading digit(s) of each observation
- Leaves - the final digit of each observation
- observations 30-76
- stems will be from 3-7
- arrange the leaves in ascending order
- use 2 lines per stem when there is a small number of observations, to see the distribution better
- 1st line: 3 | leaves 0-4
- 2nd line: 3 | leaves 5-9
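The construction above can be sketched in a few lines of Python. This is only a minimal sketch, assuming two-digit whole-number observations; the function name `stemplot`, the `split` option, and the sample values are hypothetical and not from the notes.

```python
from collections import defaultdict

def stemplot(data, split=False):
    """Return stemplot lines for two-digit integer observations.

    With split=True each stem gets two lines: leaves 0-4, then leaves 5-9.
    """
    rows = defaultdict(list)
    for x in sorted(data):              # sorting keeps the leaves in ascending order
        stem, leaf = divmod(x, 10)      # stem = leading digit, leaf = final digit
        half = 1 if (split and leaf >= 5) else 0
        rows[(stem, half)].append(leaf)

    lines = []
    for stem in range(min(data) // 10, max(data) // 10 + 1):
        for half in ((0, 1) if split else (0,)):
            leaves = "".join(str(leaf) for leaf in rows.get((stem, half), []))
            lines.append(f"{stem} | {leaves}")
    return lines

# Hypothetical observations in the 30-76 range, so the stems run from 3 to 7
sample = [30, 34, 38, 41, 45, 47, 52, 52, 59, 63, 68, 76]
for line in stemplot(sample):
    print(line)
```

Calling `stemplot(sample, split=True)` gives the two-lines-per-stem version.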
How to find the median of a stemplot?
- Find the number of observations; if it is even (ex. 40), the two middle positions are the 20th and 21st.
- Counting from the top of the ordered stemplot, find the 20th and 21st observations (ex. 67 and 68)
- Median = (67+68)/2 = 67.5
- Each slice of a pie chart is proportional to a relative frequency
- 360 * 0.325 (rel. freq.) = 117 degree angle
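The angle calculation can be checked with a short Python sketch. The function name `slice_angle` and the category counts are hypothetical; the counts are chosen so that the first relative frequency is 0.325.

```python
def slice_angle(relative_frequency):
    """Central angle, in degrees, of the pie slice for one relative frequency."""
    return 360 * relative_frequency

# Hypothetical frequencies for three categories (40 observations in total)
counts = {"A": 13, "B": 17, "C": 10}
total = sum(counts.values())
angles = {name: slice_angle(count / total) for name, count in counts.items()}
print(angles)   # category A has relative frequency 13/40 = 0.325
```

The angles of all slices should add up to 360 degrees, which is a quick check on the relative frequencies.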
Bar charts are similar to histograms, but the bars don't touch each other as they do in histograms
- Bell shaped
- Uniform or rectangular
- Reverse J shaped
- J shaped
- Right skewed
- Left skewed
- All of the above are unimodal shapes - 1 peak
- Bimodal - 2 peaks
- Multimodal - 3 or more peaks
- all on page 73
Population and sample distribution
- population distribution - can be called just distribution, comes from census
- sample distribution - the distribution of sample data
Measures of Center
- called averages - the 3 measures of center are the mean, median, and mode
- The mean and median apply only to quantitative data, whereas the mode can be used with both quantitative and qualitative data
Mean: the sum of the observations divided by the number of observations
- to find median:
- arrange the data in increasing order, ex. 10, 22, 36, ... (helpful to use a stemplot)
- n = # of observations
- If n is odd, the median is the observation exactly in the middle of the ordered list. ex. let n=13: (n+1)/2 = (13+1)/2 = 7, so the seventh observation in the ordered list is the median
- If n is even, there are two middle observations; use the same method to find them. ex. let n=10: (n+1)/2 = (10+1)/2 = 5.5, so the median is halfway between the 5th and 6th observations
- mode: the most frequent value
- ex. 2,5,2,4,6,2,1,2,1,2, the mode would be 2
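The three measures of center can be tried out in Python. This is a minimal sketch: `median_by_position` is a hypothetical helper that follows the (n+1)/2 position rule from the notes, while the mean and mode come from Python's standard `statistics` module.

```python
import statistics

def median_by_position(data):
    """Median via the (n + 1) / 2 position rule on the ordered list."""
    ordered = sorted(data)
    n = len(ordered)
    position = (n + 1) / 2                  # 1-based position in the ordered list
    if n % 2 == 1:
        return ordered[int(position) - 1]   # odd n: exactly the middle observation
    lower = ordered[int(position) - 1]      # even n: position 5.5 means the 5th ...
    upper = ordered[int(position)]          # ... and 6th observations
    return (lower + upper) / 2

values = [2, 5, 2, 4, 6, 2, 1, 2, 1, 2]     # the mode example above
print(statistics.mean(values))              # 2.7
print(median_by_position(values))           # 2.0
print(statistics.mode(values))              # 2
```

`statistics.median(values)` gives the same median; the helper is only there to make the position rule visible.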
Population mean and Sample Mean
- The mean of population data is called the population mean or the mean of the variable; the mean of sample data is called a sample mean. The same terms are used for the median and mode.
- There is only one population mean and many sample means.
- Symbol Σ (sigma) - the summation notation symbol, shorthand for "the sum of"
- Summation notation: Σxi is read as "summation x sub i" or "the sum of the observations of the variable x"
- x1+x2+x3+...; the 1, 2, 3 are the subscripts used to distinguish one observation from another.
Measure of variation or measures of spread
- descriptive measure that indicates the amount of variation, or spread
- 3 measures of variation:
- Range
- Sample standard deviation
- Interquartile range
Range of data set
- the range of a data set is the difference between the max and min observations
- Range = Max - Min, ex. Range = 78 - 72 = 6 in.
- The larger the range, the greater the variation
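A quick Python check of the range calculation. The `heights` list is hypothetical data (in inches), chosen so that Max = 78 and Min = 72 as in the example above.

```python
def data_range(data):
    """Range = Max - Min."""
    return max(data) - min(data)

heights = [72, 73, 76, 76, 78]   # hypothetical heights in inches
print(data_range(heights))       # 6
```

Only the two extreme observations affect the result; everything in between is ignored.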
The Sample Standard Deviation
- takes into account all the observations; it is the preferred measure of variation when the mean is used as the measure of center.
- Measures variation by indicating how far, on average, the observations are from the mean.
- 1st step is to find the deviations from the mean, that is, how far each observation is from the mean.
- 2nd step is to compute the sum of squared deviations
- 3rd step is to compute the sample variance
- 4th and final step is to take the square root of the sample variance
Deviations from the mean
- Find the mean (x̄) 1st, then
- to find the deviation from the mean, take xi and subtract the mean from it: xi - x̄ = 72 - 75 = -3
The sum of squared deviations
To obtain a measure of the total deviation from the mean for all the observations, square the deviations from the mean and add them up. The result, Σ(xi - x̄)^2, is called the sum of squared deviations.
- Sample variance: take an average of the squared deviations by dividing the sum of squared deviations by n-1, or 1 less than the sample size. The resulting quantity is called the sample variance, s^2
- formula: s^2 = Σ(xi - x̄)^2 / (n - 1)
Square root of sample variance
The final step in computing the sample standard deviation is to take the square root of the sample variance
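The four steps can be followed one at a time in Python. This is a sketch under stated assumptions: the function name `sample_std_dev` and the `heights` data are hypothetical, with the data chosen so that the mean is 75 and the first deviation is 72 - 75 = -3.

```python
import math
import statistics

def sample_std_dev(data):
    """Sample standard deviation, computed step by step."""
    n = len(data)
    mean = sum(data) / n
    deviations = [x - mean for x in data]       # 1st step: deviations from the mean
    sum_sq = sum(d ** 2 for d in deviations)    # 2nd step: sum of squared deviations
    variance = sum_sq / (n - 1)                 # 3rd step: sample variance (divide by n - 1)
    return math.sqrt(variance)                  # 4th step: square root of the variance

heights = [72, 73, 76, 76, 78]                  # hypothetical sample, mean = 75
print(sample_std_dev(heights))                  # square root of 24 / 4 = 6, about 2.45
```

Python's built-in `statistics.stdev(heights)` also divides by n - 1, so it can be used to check the result.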
Defining formula for sample standard deviation (s)
- s = sqrt( Σ(xi - x̄)^2 / (n - 1) )
Variation and standard deviation
The more variation there is in a data set, the larger its standard deviation
Section 3.3 Quartiles
- divide a data set into quarters (4 equal parts)
- the most commonly used percentiles.
- a data set has 3 Quartiles
- Q1 - the number that divides the bottom 25% of the data from the top 75%
- Q2 - is the median, bottom 50% from top 50%
- Q3 - divides the bottom 75% from the top 25%
- 1st step: arrange the data in increasing order
- 2nd step: determine the median of the entire data set (Q2)
- 3rd step: determine the median of the bottom half (Q1) and the median of the top half (Q3)
The interquartile range, IQR, is the difference between the first and third quartiles; IQR = Q3 - Q1
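The quartile steps and the IQR can be sketched in Python. One assumption to flag: when n is odd, this sketch leaves the middle observation out of both halves; some textbooks include it in both halves instead, which can give slightly different Q1 and Q3. The function name `quartiles` and the data are hypothetical.

```python
import statistics

def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves steps.

    Assumption: when n is odd, the middle observation is left out of both halves.
    """
    ordered = sorted(data)                  # 1st step: increasing order
    n = len(ordered)
    q2 = statistics.median(ordered)         # 2nd step: median of the entire data set
    bottom = ordered[: n // 2]              # 3rd step: median of the bottom half ...
    top = ordered[(n + 1) // 2:]            # ... and median of the top half
    return statistics.median(bottom), q2, statistics.median(top)

data = [1, 3, 5, 7, 9, 11, 13, 15]          # hypothetical observations
q1, q2, q3 = quartiles(data)
iqr = q3 - q1
print(q1, q2, q3, iqr)                      # 4.0 8.0 12.0 8.0
```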
The Five-Number Summary
- the five number summary of the data set is Min, Q1, Q2, Q3, Max
- 1st: Find Min and Max observations
- Variation of the 1st quarter: Q1 - Min
- Variation of the 2nd quarter: Q2 - Q1
- Variation of the 3rd quarter: Q3 - Q2
- Variation of the 4th quarter: Max - Q3
Outliers: observations that fall well outside the overall pattern of the data. An outlier can be the result of a measurement or recording error, or simply an unusually extreme observation
Lower and Upper limits
- Lower limit = Q1 - 1.5*IQR
- Upper limit = Q3 + 1.5*IQR
Potential outliers: observations that lie below the lower limit or above the upper limit
- a boxplot is based on the five-number summary
- 1. determine the quartiles
- 2. determine potential outliers and the adjacent values
- 3. Draw a box around the quartiles, and lines connecting the box to the adjacent values
- 4. Plot each potential outlier with an asterisk *
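The limits and step 2 above can be sketched in Python. The function names and the data are hypothetical, and the quartiles are passed in as already-known values (Q1 = 4 and Q3 = 12 are assumed for this data set). Adjacent values are taken here to be the most extreme observations that still lie within the lower and upper limits.

```python
def limits(q1, q3):
    """Lower and upper limits: Q1 - 1.5*IQR and Q3 + 1.5*IQR."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def potential_outliers(data, q1, q3):
    """Observations below the lower limit or above the upper limit."""
    lower, upper = limits(q1, q3)
    return [x for x in data if x < lower or x > upper]

def adjacent_values(data, q1, q3):
    """Most extreme observations that still lie within the limits."""
    lower, upper = limits(q1, q3)
    inside = [x for x in data if lower <= x <= upper]
    return min(inside), max(inside)

data = [1, 3, 5, 7, 9, 11, 13, 40]          # hypothetical observations
q1, q3 = 4, 12                              # assumed quartiles for this data
print(limits(q1, q3))                       # (-8.0, 24.0)
print(potential_outliers(data, q1, q3))     # [40]
print(adjacent_values(data, q1, q3))        # (1, 13)
```

If there are no potential outliers, the adjacent values are simply the min and max observations.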
the min and max observations
Section 3.4 The population mean (mean of a variable)
- Same method as for the sample mean but with different symbols.
- use the letter μ (mu)
- N - size of the population
- formula: μ = Σxi / N
Population standard deviation(Standard Deviation of a variable)
- To distinguish the population standard deviation from the sample standard deviation, the letter used is:
- σ - "sigma"; the population variance is σ^2
Population standard deviation (computing formula)
- Square each observation and add the results
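A sketch of the computing formula in Python, assuming the usual form σ = sqrt(Σxi^2 / N - μ^2), which is not written out in the notes. The function name `population_std_dev` and the data are hypothetical, and the data are treated as a complete population.

```python
import math
import statistics

def population_std_dev(data):
    """Population standard deviation via sqrt(sum(x^2) / N - mu^2)."""
    n = len(data)                               # N: size of the population
    mu = sum(data) / n                          # population mean
    sum_of_squares = sum(x ** 2 for x in data)  # square each observation and add
    return math.sqrt(sum_of_squares / n - mu ** 2)

population = [72, 73, 76, 76, 78]               # hypothetical population
print(population_std_dev(population))           # about 2.19
```

`statistics.pstdev(population)` divides by N rather than n - 1 and should agree with this result up to rounding.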
Parameter and Statistic
- Parameter: A descriptive measure for a population, ex. μ and σ
- Statistic: A descriptive measure for a sample, ex. x̄ and s
- A standardized variable always has mean 0 and standard deviation 1
Standard Score (z-Score)
- For an observed value of a variable x, the corresponding value of the standardized variable z is called the z-score of the observation. The term standard score is often used instead of z-score.
- Negative z-score - the observation is below the mean
- Positive z-score - the observation is above the mean
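A short Python sketch of z-scores, assuming the standard definition z = (x - μ) / σ, which the notes do not write out. The function name `z_scores` and the data are hypothetical, and the data are treated as a complete population so that μ and σ apply.

```python
import statistics

def z_scores(data):
    """Standardize each observation: z = (x - mu) / sigma."""
    mu = statistics.mean(data)          # population mean
    sigma = statistics.pstdev(data)     # population standard deviation
    return [(x - mu) / sigma for x in data]

population = [72, 73, 76, 76, 78]       # hypothetical population, mean = 75
z = z_scores(population)
print(z)                                # 72 and 73 get negative z-scores, the rest positive
print(statistics.mean(z), statistics.pstdev(z))   # close to 0 and 1
```

The last line illustrates that the standardized variable has mean 0 and standard deviation 1, up to floating-point rounding.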