CENTER OF A DATA SET
The center is a representative or average value that indicates where the middle of the data set is located.
VARIATION OF A DATA SET
The variation is a measure of the amount that the data values vary among themselves.
DISTRIBUTION OF A DATA SET
The distribution is the nature or shape of the distribution of the data set (such as bell shaped, uniform, or skewed).
OUTLIERS OF A DATA SET
The outliers are the sample values that lie very far away from the vast majority of the other sample values.
TIME OF A DATA SET
Time refers to the changing characteristics of the data over time.
The frequency distribution or frequency table lists data values (either individually or by groups of intervals), along with their corresponding frequencies (or counts).
The frequency for a particular class is the number of original values that fall into that class.
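As a minimal sketch of how a frequency table can be built, the Python below counts how many values fall into each class. The data values, the class limits, and the function name frequency_table are made up for illustration only.

```python
def frequency_table(values, classes):
    """Count how many values fall into each (lower, upper) class, limits inclusive."""
    table = {}
    for lower, upper in classes:
        table[(lower, upper)] = sum(1 for v in values if lower <= v <= upper)
    return table

# Hypothetical data and classes, chosen only for illustration.
data = [12, 15, 21, 22, 27, 29, 33, 10, 19, 25]
classes = [(10, 19), (20, 29), (30, 39)]
freq = frequency_table(data, classes)
for (lower, upper), count in freq.items():
    print(f"{lower}-{upper}: {count}")
```

The frequencies should add up to the number of original values, which is a quick check that no value was missed or counted twice.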
LOWER CLASS LIMITS
The lower class limits are the smallest numbers that can belong to the different classes.
UPPER CLASS LIMITS
The upper class limits are the largest numbers that can belong to the different classes.
Class boundaries are the numbers used to separate classes, but without the gaps created by class limits.
EX if the classes are listed as 10-19 and 20-29, the class boundaries would be 9.5, 19.5, and 29.5
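The example above can be sketched in Python. This is only an illustration: the function name class_boundaries is hypothetical, and it assumes every pair of consecutive classes is separated by the same gap.

```python
def class_boundaries(classes):
    """Return boundaries placed halfway across the gap between consecutive classes."""
    # Gap between the upper limit of one class and the lower limit of the next,
    # for example 20 - 19 = 1. Assumed to be the same for all classes.
    gap = classes[1][0] - classes[0][1]
    half = gap / 2
    boundaries = [classes[0][0] - half]
    for lower, upper in classes:
        boundaries.append(upper + half)
    return boundaries

boundaries = class_boundaries([(10, 19), (20, 29)])
print(boundaries)  # [9.5, 19.5, 29.5]
```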
Class midpoints are the values in the middle of the classes.
EX if the classes are 21-30, 31-40, and 41-50, the midpoints would be 25.5, 35.5, and 45.5
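A midpoint can be found by adding the lower and upper class limits and dividing by 2. A short Python sketch of that arithmetic follows; the function name class_midpoints is made up for illustration.

```python
def class_midpoints(classes):
    """Return the midpoint of each (lower, upper) class."""
    return [(lower + upper) / 2 for lower, upper in classes]

midpoints = class_midpoints([(21, 30), (31, 40), (41, 50)])
print(midpoints)  # [25.5, 35.5, 45.5]
```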
The class width is the difference between two consecutive lower class limits or two consecutive lower class boundaries.
EX if the classes are 21-30 and 31-40, the lower class limits are 21 and 31, so width = 31 - 21 = 10
REASONS FOR CONSTRUCTING FREQUENCY DISTRIBUTIONS
- 1 - large data sets can be summarized
- 2 - we gain insight into the nature of the data
- 3 - we have a basis for constructing important graphs
class width ≈ ((maximum value) - (minimum value)) / number of classes (round up to a convenient number)
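A small Python sketch of the class width calculation follows. The function name class_width is hypothetical, and rounding up to the next whole number is an assumption here (a common convention); other convenient widths may be chosen in practice.

```python
import math

def class_width(minimum, maximum, number_of_classes):
    """Estimate a class width: subtract first, then divide, then round up."""
    return math.ceil((maximum - minimum) / number_of_classes)

# Values from 21 to 50 split into 3 classes: (50 - 21) / 3 is about 9.67, rounded up to 10.
width = class_width(21, 50, 3)
print(width)  # 10
```

Note that the subtraction is done before the division. Dividing only the minimum value by the number of classes would give a different, incorrect result.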
RELATIVE FREQUENCY DISTRIBUTION
Relative frequencies are found by dividing each class frequency by the total of all frequencies. A relative frequency distribution includes the same class limits as a frequency distribution, but relative frequencies are used instead of actual frequencies.
RF = class frequency / sum of all frequencies
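The formula above can be applied to a whole list of class frequencies at once. The sketch below uses made-up frequencies and a hypothetical function name, relative_frequencies.

```python
def relative_frequencies(frequencies):
    """Divide each class frequency by the sum of all frequencies."""
    total = sum(frequencies)
    return [f / total for f in frequencies]

# Hypothetical class frequencies, chosen only for illustration.
rel = relative_frequencies([4, 5, 1])
print(rel)  # [0.4, 0.5, 0.1]
```

The relative frequencies should add up to 1 (or very close to it, allowing for rounding).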
The cumulative frequency for a class is the sum of the frequencies for that class and all previous classes.
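Cumulative frequencies are a running total, which Python's itertools.accumulate computes directly. The frequencies below are example values used only for illustration.

```python
from itertools import accumulate

# Example class frequencies, listed in class order.
frequencies = [2, 4, 6, 8, 6, 4, 2]

# Each entry is the sum of that class's frequency and all previous ones.
cumulative = list(accumulate(frequencies))
print(cumulative)  # [2, 6, 12, 20, 26, 30, 32]
```

The last cumulative frequency always equals the total number of data values.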
When the frequencies of a normal distribution are graphed, the result is bell shaped. Frequencies start low, increase to a maximum, then decrease. The distribution should be approximately symmetric.
EX frequencies of 2, 4, 6, 8, 6, 4, 2
A histogram is a bar graph in which the horizontal scale represents classes of data values and the vertical scale represents frequencies. The heights of the bars correspond to the frequency values, and the bars are drawn adjacent to each other (without gaps).
RELATIVE FREQUENCY HISTOGRAM
A relative frequency histogram has the same shape and horizontal scale as a histogram, but the vertical scale is marked with relative frequencies instead of actual frequencies.
HISTOGRAM HORIZONTAL SCALE
use class boundaries or class midpoints
HISTOGRAM VERTICAL SCALE
use the class frequencies for the histogram vertical scale
NORMAL QUANTILE PLOT
If the plotted points lie reasonably close to a straight-line pattern and do not exhibit any other systematic pattern, then the data appear to come from a population having a normal distribution.
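One way the points of a normal quantile plot can be computed is sketched below in Python, using the standard library's statistics.NormalDist. The function name normal_quantile_points and the sample data are made up, and the plotting position (i - 0.5) / n is an assumption: it is one common convention, and other conventions exist.

```python
from statistics import NormalDist

def normal_quantile_points(values):
    """Pair each sorted data value with the z score of its plotting position."""
    n = len(values)
    std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1
    points = []
    for i, x in enumerate(sorted(values), start=1):
        p = (i - 0.5) / n  # assumed plotting position; other conventions exist
        points.append((std_normal.inv_cdf(p), x))
    return points

# Hypothetical sample, chosen only for illustration.
points = normal_quantile_points([5, 3, 9, 7, 1])
for z, x in points:
    print(f"z = {z:6.3f}, x = {x}")
```

Plotting x against z and checking whether the points fall near a straight line is then the visual test described above.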
A frequency polygon uses line segments connected to points located directly above class midpoint values.
An ogive is a line graph that depicts cumulative frequencies, just as the cumulative frequency distribution lists cumulative frequencies.
A dotplot consists of a graph in which each data value is plotted as a point or dot along a scale of values.
- EX . . .
- . .... . . .
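A text dotplot like the one above can be generated with a short Python sketch. The function name text_dotplot and the data are hypothetical, and the sketch assumes the data values are whole numbers so that each value maps to one column.

```python
from collections import Counter

def text_dotplot(values):
    """Return a text dotplot for integer values, one dot per data value."""
    counts = Counter(values)
    scale = range(min(values), max(values) + 1)
    rows = []
    # Build rows from the tallest stack of dots down to the scale.
    for level in range(max(counts.values()), 0, -1):
        row = "".join("." if counts[x] >= level else " " for x in scale)
        rows.append(row.rstrip())
    # Label the scale with the last digit of each value to keep columns aligned.
    rows.append("".join(str(x % 10) for x in scale))
    return "\n".join(rows)

# Hypothetical data, chosen only for illustration.
dotplot = text_dotplot([1, 2, 2, 3, 3, 3, 5])
print(dotplot)
```

Values that occur more than once appear as stacked dots, so the height of each stack shows the frequency of that value.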
A stemplot, or stem-and-leaf plot, represents data by separating each value into 2 parts: the stem (the leftmost digit) and the leaf (the rightmost digit).
EX. 21 would be stem 2 with leaf 1, and 23 would be stem 2 with leaf 3.
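Splitting values into stems and leaves can be sketched in Python as follows. The function name stemplot and the data are made up, and the sketch assumes non-negative whole numbers, with the tens part as the stem and the units digit as the leaf.

```python
from collections import defaultdict

def stemplot(values):
    """Group the units digits (leaves) of the values under their tens parts (stems)."""
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)  # stem = tens part, leaf = units digit
    return {stem: leaves for stem, leaves in sorted(stems.items())}

# Hypothetical data, chosen only for illustration.
plot = stemplot([21, 23, 35, 30, 28])
for stem, leaves in plot.items():
    print(f"{stem} | {''.join(str(leaf) for leaf in leaves)}")
```

For this data the stem 2 carries the leaves 1, 3, and 8, and the stem 3 carries the leaves 0 and 5.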
A Pareto chart is a bar graph for qualitative data with the bars arranged in order according to frequencies. Vertical scales in Pareto charts can represent frequencies or relative frequencies.
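The ordering step of a Pareto chart can be sketched with Python's collections.Counter, whose most_common method lists categories from highest to lowest frequency. The function name pareto_order and the category labels are made up for illustration.

```python
from collections import Counter

def pareto_order(categories):
    """Return (category, frequency) pairs from most frequent to least frequent."""
    return Counter(categories).most_common()

# Hypothetical qualitative data, chosen only for illustration.
ordered = pareto_order(["late", "damaged", "late", "wrong item", "late", "damaged"])
for category, frequency in ordered:
    print(f"{category:<10} {'#' * frequency}")
```

The bars of the chart would then be drawn in this order, tallest first.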
A pie chart is a circular graph in which qualitative data are depicted as slices of a pie.
Never publish pie charts, because they waste ink on non-data components and they lack an appropriate scale.
A scatterplot or scatter diagram is a plot of paired (x, y) data with a horizontal x axis and a vertical y axis. The data are paired in a way that matches each value from one data set with a corresponding value from a second data set.
TIME SERIES GRAPH
A time series graph is a graph of time series data, which are data that have been collected at different points in time.
IMPORTANT CHARACTERISTICS OF DATA