-
Quantitative variables
Involve numeric characteristics such as age, height, sales revenue, or business profits
-
Qualitative variables
Involve non-numeric characteristics such as race, gender, or hair color
-
Variables
Characteristics of a population
-
Inferential statistics
Statistics that determine something about an entire group (population) based on looking at part of the group (sample)
-
Descriptive statistics
Statistics that organize, summarize and present data
-
Causation
A change in one variable will cause a change in another variable
-
Association
Variables appear to have a relationship to each other, but one variable does not necessarily cause a change in the other
-
Statistics
The science of gathering, organizing, analyzing, and presenting data
-
Sample
A portion or a subset from a population
-
Population
The entire set of items, people or measurements that are studied
-
Mutually exclusive
When the occurrence of one event prevents the other events from happening
-
Exhaustive
All observations will be placed in a category. There are no other options
-
What are the two types of quantitative variables?
Discrete and continuous
-
Discrete variable
Can assume only specific, separate values. Ex: population of a city. Discrete variables are counted.
-
Continuous variable
A variable that can be any value within a certain range, such as weight or interest rates
-
Levels of measurement
Used to classify data. Also called scales of measurement. Indicates how data is calculated, summarized, measured, and tested
-
What are the 4 levels of measurement?
Nominal, ordinal, interval, and ratio
-
What 2 levels of measurement are used in qualitative variables?
Nominal and ordinal
-
What 2 levels of measurement are used in quantitative variables?
Interval and ratio
-
Nominal level data
Mutually exclusive and exhaustive, with no logical sequence. Classified and counted. Ex: number of boys and girls in a class
-
Ordinal level data
Mutually exclusive and exhaustive, and can be ranked or ordered. Ex: grades A, B, C, D, F
-
Interval level data
Mutually exclusive and exhaustive. Can be ranked or ordered. The difference between classifications is a consistent unit of measure. Zero does not mean nothing is present. Ex: dress size and temperature
-
Arithmetic Mean
In a population, it is the sum of all the values divided by the number of items. In a sample, it is the sum of the values of the items selected divided by the number of items selected. Ratio-level data uses the arithmetic mean to represent the center
-
Characteristics of an Arithmetic Mean
A mean uses all values in a sample or a population
A mean might be distorted by large or small values called outliers
All interval and ratio data have a mean
Each set of data has a single unique mean
If you sum each item's deviation from the mean, it equals zero
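The last property can be checked numerically. A minimal sketch, assuming a small made-up list of values (the numbers are hypothetical, not taken from any data set):

```python
# Hypothetical values, chosen only for illustration
values = [4, 8, 15, 16, 23, 42]

# Arithmetic mean: sum of the values divided by the number of values
mean = sum(values) / len(values)

# Deviation of each value from the mean
deviations = [x - mean for x in values]

# The deviations cancel out, so their sum is zero
# (up to floating-point rounding for non-integer data)
total_deviation = sum(deviations)

print(mean)
print(total_deviation)
```

The negative deviations of the values below the mean exactly offset the positive deviations of the values above it, which is why the mean is often described as the balance point of the data.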
-
Ratio level data
Mutually exclusive and exhaustive. Can be ranked or ordered. Consistent unit of measurement. Zero means none. The ratio between 2 classifications is meaningful. Ex: gross pay, hours worked, test scores
-
Weighted Mean
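A computation of the arithmetic mean used when you have multiple observations of the same value in the population or sample. Each distinct value is multiplied by its weight (the number of times it occurs), and the sum of these products is divided by the sum of the weights.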
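A short sketch of the calculation, assuming hypothetical hourly wages and a hypothetical count of employees at each wage (the names `wages` and `counts` are illustrative only):

```python
# Hypothetical data: three distinct hourly wages and how many
# employees earn each one (the counts act as the weights)
wages = [10.0, 12.0, 15.0]
counts = [5, 3, 2]

# Weighted mean: sum of (value * weight) divided by the sum of the weights
weighted_total = sum(w * c for w, c in zip(wages, counts))
weighted_mean = weighted_total / sum(counts)

# Same answer as listing every observation and taking the ordinary mean
expanded = [w for w, c in zip(wages, counts) for _ in range(c)]
plain_mean = sum(expanded) / len(expanded)

print(weighted_mean, plain_mean)
```

The second calculation is there only to show that the weighted mean is a shortcut for the ordinary arithmetic mean of the full list of observations.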
-
Median
The midpoint of the values after they have been arranged in order from smallest to largest or largest to smallest. Ordinal data uses the median to represent the center
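A minimal sketch of finding the midpoint with Python's standard `statistics.median`, assuming two small hypothetical lists (one with an odd count, one with an even count):

```python
from statistics import median

# Hypothetical values, left unsorted on purpose; median() sorts internally
odd_count = [7, 1, 3, 9, 5]
even_count = [7, 1, 3, 9]

# Odd number of values: the single middle value after sorting (1 3 [5] 7 9)
median_odd = median(odd_count)

# Even number of values: the average of the two middle values, (3 + 7) / 2
median_even = median(even_count)

print(median_odd, median_even)
```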
-
Geometric Mean
A special form of mean used in a situation in which you are computing averages that compound on each other or you want to compute the rate of change on an item over time.
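Both uses can be sketched in a few lines. The returns and the start and end values below are hypothetical, and the calculation assumes each period's growth factor (1 + rate) is positive:

```python
import math

# Hypothetical yearly returns that compound on each other: +10%, +20%, -5%
returns = [0.10, 0.20, -0.05]
growth_factors = [1 + r for r in returns]

# Geometric mean: the n-th root of the product of the n growth factors
geometric_mean = math.prod(growth_factors) ** (1 / len(growth_factors))
average_rate = geometric_mean - 1

# For comparison, the arithmetic mean of the same returns
arithmetic_rate = sum(returns) / len(returns)

# Rate of change over time: hypothetical value growing from 100 to 150
# over 5 periods
start_value, end_value, periods = 100.0, 150.0, 5
rate_of_change = (end_value / start_value) ** (1 / periods) - 1

print(average_rate, arithmetic_rate, rate_of_change)
```

Compounding `average_rate` over the three years reproduces the actual overall growth, which the arithmetic average of the returns would overstate.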
-
Mode
The value of the observation that occurs most frequently. Nominal data uses the mode to represent the center.
-
Population Mean
The population mean is the mean of all the values in a population. The formula is expressed as follows:
- μ = ΣX / N
- Where:
- μ = Population mean
- Σ = Sum
- X = Population value
- N = Number of values in the population
-
Characteristics of the Median
It is not affected by extremely high or extremely low scores
Each set of data has a single median
It can be computed on ratio-level, interval-level, and ordinal-level data
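The first characteristic can be illustrated by changing one value into an extreme score and comparing the median with the mean. A small sketch using hypothetical numbers:

```python
from statistics import mean, median

# Hypothetical values; the second list replaces 9 with an extreme score
typical = [1, 3, 5, 7, 9]
with_outlier = [1, 3, 5, 7, 900]

median_typical = median(typical)
median_outlier = median(with_outlier)   # the middle value is unchanged

mean_typical = mean(typical)
mean_outlier = mean(with_outlier)       # the mean is pulled toward 900

print(median_typical, median_outlier)
print(mean_typical, mean_outlier)
```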
-
Characteristics of a mode
It is used with all types of data
It is not affected by extremely high or low values
If there are no recurring values, there is no mode
It is possible to have multiple modes
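These characteristics can be sketched with a small helper. The function name `modes` is hypothetical, and it follows the convention on this card that data with no repeated value has no mode (Python's built-in `statistics.multimode` would instead return every value in that case):

```python
from collections import Counter

def modes(values):
    """Return a list of the most frequent values, or [] if nothing repeats."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        # No value occurs more than once, so there is no mode
        return []
    return [value for value, count in counts.items() if count == highest]

# Works with nominal (non-numeric) data
hair_modes = modes(["brown", "black", "brown", "blond"])

# Two values tie for most frequent, so there are multiple modes
tied_modes = modes([1, 2, 2, 3, 3])

# No repeated values, so no mode
no_modes = modes([1, 2, 3])

print(hair_modes, tied_modes, no_modes)
```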
-
Symmetrical Distribution
In a symmetrical distribution, the arithmetic mean, median, and mode are equal. You can use any one of these measures to represent the center.
A symmetrical distribution is where the histogram has the same shape on each side of the center point. In other words, if you cut the histogram in half at the center point, you get two identical pieces. Not all distributions are symmetrical.
-
Positively Skewed Distribution
The arithmetic mean is larger than the median or the mode due to one or more large values
-
Negatively Skewed Distribution
The arithmetic mean is smaller than the median or the mode due to one or more small values
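Comparing the mean with the median gives a rough check of the direction of skew. The sketch below uses hypothetical data and a hypothetical helper named `skew_direction`. It is a simple rule of thumb based on the definitions on these cards, not a formal skewness statistic:

```python
from statistics import mean, median

def skew_direction(values):
    """Rough direction of skew from comparing the mean with the median."""
    m, med = mean(values), median(values)
    if m > med:
        return "positive"
    if m < med:
        return "negative"
    return "no skew detected"

# Hypothetical data with one large value pulling the mean up
right_tail = [2, 3, 3, 4, 5, 40]

# Hypothetical data with one small value pulling the mean down
left_tail = [1, 36, 37, 38, 38, 39]

# Hypothetical balanced data where the mean equals the median
balanced = [1, 2, 3, 4, 5]

print(skew_direction(right_tail))
print(skew_direction(left_tail))
print(skew_direction(balanced))
```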
-
Measures of dispersion
Measures of dispersion tell you about the spread in the data. There are 4 measures of dispersion: range, mean deviation, variance and standard deviation.
-
Range
The difference between the highest and lowest values in the data
-
Mean Deviation
The arithmetic mean of the absolute values of the deviation of each observation from the arithmetic mean
-
Variance
The arithmetic mean of the squared deviations of the observations from the mean
-
Standard Deviation
The square root of the variance
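The four measures of dispersion can be computed side by side. A minimal sketch, assuming a small hypothetical data set treated as a population (so the variance divides by the number of values; a sample variance would conventionally divide by one less than that):

```python
import math

# Hypothetical population values
values = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(values)
mean = sum(values) / n

# Range: highest value minus lowest value
value_range = max(values) - min(values)

# Mean deviation: average absolute distance from the mean
mean_deviation = sum(abs(x - mean) for x in values) / n

# Variance: average squared distance from the mean (population form)
variance = sum((x - mean) ** 2 for x in values) / n

# Standard deviation: square root of the variance
std_dev = math.sqrt(variance)

print(value_range, mean_deviation, variance, std_dev)
```

The standard deviation is expressed in the same units as the data, which makes it easier to interpret than the variance, whose units are squared.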
-
Chebyshev’s Theorem
P.L. Chebyshev was a mathematician who developed a theorem regarding standard deviation. It states that for any population or sample, the proportion of values that lie within plus or minus k standard deviations of the mean is at least:
- 1 - 1/k²
- k = Number of standard deviations
- Note: For this theorem to work, k must be greater than one.
For example, with k = 2, at least 1 - 1/4 = 75% of the data falls within two standard deviations of the mean.
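The bound from Chebyshev's theorem, 1 - 1/k², is easy to evaluate for different values of k. A small sketch, where the function name `chebyshev_minimum` is hypothetical:

```python
def chebyshev_minimum(k):
    """Minimum proportion of values within k standard deviations of the mean."""
    if k <= 1:
        # The theorem only gives a useful bound when k is greater than one
        raise ValueError("k must be greater than 1")
    return 1 - 1 / k ** 2

within_two = chebyshev_minimum(2)     # 1 - 1/4
within_three = chebyshev_minimum(3)   # 1 - 1/9

print(within_two, within_three)
```

Because the theorem holds for any distribution, these are minimums. Data with a bell shape typically has a larger share of values within the same distance of the mean.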
-
Empirical Rule
The empirical rule (also called the normal rule) applies only to a symmetrical, bell-shaped distribution.
Approximately 68% of the data is within plus or minus one standard deviation of the mean.
Approximately 95% of the data is within plus or minus two standard deviations of the mean.
Approximately 99.7% of the data is within plus or minus three standard deviations of the mean.
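The three percentages can be reproduced for an ideal normal (bell-shaped) distribution. The sketch below assumes a perfectly normal distribution and uses the standard relationship between the normal curve and the error function `math.erf`; the helper name `normal_share_within` is hypothetical:

```python
import math

def normal_share_within(k):
    """Share of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

share_one = normal_share_within(1)
share_two = normal_share_within(2)
share_three = normal_share_within(3)

# Print each share as a percentage rounded to one decimal place
for k, share in [(1, share_one), (2, share_two), (3, share_three)]:
    print(k, round(share * 100, 1))
```

Real data is rarely perfectly normal, so the percentages in the empirical rule should be read as approximations.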