CIS2300_TEST1.txt

  1. Population
    • a collection of persons, objects or items of interest.
    • Whatever the researcher is studying
  2. parameter
    • a descriptive measure of the population. Usually denoted by Greek letters
    • e.g. mean(µ), population variance(σ^2), population standard deviation(σ)
    • data from a census are parameters
  3. sample
    a portion of the whole and, if taken properly, representative of the whole
  4. statistic
    • a descriptive measure of the sample. Usually denoted by Roman letters
    • e.g. mean(x̄), sample variance(s^2), sample standard deviation(s)
    • data from a sample are statistics
  5. Descriptive Statistics
    • Using data gathered on a group to describe or reach conclusions about that same group
    • e.g. most athletic stats. The data is gathered from that group and conclusions are drawn about that group only. Basketball stats are about Basketball
  6. Inferential Statistics
    • gathering data from a sample and using the statistics generated to reach conclusions about the population from which the sample was taken
    • sometimes referred to as inductive statistics
  7. empirical rule
    • The approximate percentage of values that lie within a given number of standard deviations of the mean of a set of data, if the data are normally distributed.
    • Distance from the Mean: Values within Distance
    • µ ± 1σ: 68%
    • µ ± 2σ: 95%
    • µ ± 3σ: 99.7%
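
The empirical-rule percentages can be checked against the standard normal curve. This is a minimal sketch using `statistics.NormalDist` from the Python standard library; the helper name `share_within` is made up for illustration.

```python
from statistics import NormalDist


def share_within(k: float) -> float:
    """Proportion of a normal distribution within k standard deviations of the mean."""
    z = NormalDist()  # standard normal: mean 0, standard deviation 1
    return z.cdf(k) - z.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(k):.4f}")
```

The results should land close to the rounded 68%, 95%, and 99.7% figures on the card.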
  8. Population Mean
    • µ = (∑x)/N
    • where x = actual data values
    • N = # total terms
  9. standard deviation
    • square root of the variance
    • σ = sqrt(σ^2)
    • σ = sqrt((∑(x-µ)^2)/N)
  10. sum of squares of x
    • SSx
    • The sum of the squared deviations about the mean of a set of values
  11. variance
    • average of the squared deviations about the arithmetic mean for a set of numbers
    • Population Variance
    • σ^2 = (∑(x-µ)^2)/N
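
The population mean, sum of squares, variance, and standard deviation cards chain together, so one small sketch can cover all four. The data values and the function name `population_summary` are made up for illustration.

```python
from math import sqrt


def population_summary(values):
    """Return (mean, sum of squares, variance, standard deviation) for a population."""
    n_total = len(values)                         # N
    mu = sum(values) / n_total                    # µ = (∑x)/N
    ss_x = sum((x - mu) ** 2 for x in values)     # SSx = ∑(x-µ)^2
    variance = ss_x / n_total                     # σ^2 = SSx/N
    return mu, ss_x, variance, sqrt(variance)     # σ = sqrt(σ^2)


data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up population
mu, ss_x, variance, sigma = population_summary(data)
print(mu, ss_x, variance, sigma)
```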
  12. deviation from the mean
    x-µ
  13. mean absolute deviation (MAD)
    • the average of the absolute values of the deviations around the mean for a set of numbers
    • MAD = (∑|x-µ|)/N
    • where
    • x-µ = actual value of a given number minus the mean
    • N= Number of terms
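
A minimal sketch of the MAD formula, assuming the data are a whole population. The data values and the function name are made up for illustration.

```python
def mean_absolute_deviation(values):
    """MAD = (∑|x-µ|)/N for a population."""
    n_total = len(values)
    mu = sum(values) / n_total
    return sum(abs(x - mu) for x in values) / n_total


data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up population, mean 5
mad = mean_absolute_deviation(data)
print(mad)
```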
  14. Chebyshev's Theorem
    • at least (1-1/k^2) of the values will fall within ± k standard deviations of the mean regardless of the shape of the distribution. Assume k>1
    • e.g. k=2.5, 1-1/(2.5^2) = .84. so at least .84 of all values are within µ ± 2.5σ.
    • or at least .84 of all values will be within 2.5 standard deviations of the mean, µ.
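
The k = 2.5 example can be reproduced with a one-line function. This is a sketch only; the name `chebyshev_lower_bound` is made up for illustration.

```python
def chebyshev_lower_bound(k: float) -> float:
    """Minimum proportion of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the theorem is stated here for k > 1")
    return 1 - 1 / k ** 2


print(round(chebyshev_lower_bound(2.5), 2))
```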
  15. sample variance
    • variance: s^2 = (∑(x-x̄)^2)/(n-1)
    • also
    • s^2 = (∑x^2 - ((∑x)^2)/n)/(n-1)
    • where
    • x = actual value
    • x̄ = sample mean
    • n = sample size
  16. sample standard deviation
    • s = sqrt(s^2), where
    • s^2 = (∑x^2 - ((∑x)^2)/n)/(n-1)
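
The two sample-variance formulas (deviations from the mean, and the shortcut using ∑x and ∑x^2) should give the same answer. A short sketch that checks this on made-up data; the function names are hypothetical.

```python
from math import isclose, sqrt


def sample_variance_definitional(values):
    """s^2 from squared deviations about the sample mean, divided by n-1."""
    n = len(values)
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)


def sample_variance_computational(values):
    """s^2 from the shortcut formula using the sum of x and the sum of x^2."""
    n = len(values)
    sum_x = sum(values)
    sum_x2 = sum(x * x for x in values)
    return (sum_x2 - sum_x ** 2 / n) / (n - 1)


sample = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up sample
s2 = sample_variance_definitional(sample)
s = sqrt(s2)  # sample standard deviation
print(s2, sample_variance_computational(sample), s)
```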
  17. Percentiles
    • measures of central tendency that divide a group of data into 100 parts
    • 87.7% = 87th Percentile
    • percentile location: i=(P/100)n
    • where P = percentile
    • i= percentile location
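    • n = number of values in the data set
    • if i is a whole number, then the Pth percentile is the average of the values at positions i and i+1 in the ordered data
    • if i is NOT a whole number, then the Pth percentile is the value at position (whole-number part of i) + 1
    • e.g. i = 11.8: the whole-number part is 11, so the percentile is the value at the 12th position
    • e.g. i = 11: the percentile is the average of the values at the 11th and 12th positions (position 11.5)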
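
The location rule is easy to get wrong by one position, so here is a minimal sketch of it. It assumes positions are counted from 1 in the sorted data and that 0 < P < 100; the data values and the function name `percentile_value` are made up for illustration.

```python
def percentile_value(values, p):
    """Pth percentile using the location rule i = (P/100)n on 1-based positions."""
    if not 0 < p < 100:
        raise ValueError("this sketch assumes 0 < P < 100")
    ordered = sorted(values)
    n = len(ordered)
    i = p * n / 100  # multiply first so whole-number locations stay exact
    if i == int(i):
        i = int(i)
        # whole number: average the values at positions i and i+1
        return (ordered[i - 1] + ordered[i]) / 2
    # not a whole number: take the value at position (whole part of i) + 1
    return ordered[int(i)]


data = [14, 12, 19, 23, 5, 13, 28, 17]  # made-up data, n = 8
print(percentile_value(data, 30))  # i = 2.4, so the 3rd ordered value
print(percentile_value(data, 50))  # i = 4, so the average of the 4th and 5th values
```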
  18. frequency distribution
    • a summary of data presented in the form of class intervals and frequencies
    • e.g. 1 under 3, 3 under 5, etc.
    • use classes rule of thumb, 5-15 classes
  19. range
    difference between the largest and smallest values in a set of data
  20. classes
    • 5-15 rule of thumb
    • arrangement of values in groups
  21. cumulative frequency
    running total of frequency through the classes of a frequency distribution
  22. relative frequency
    proportion of total frequency that is in any given class interval in a frequency distribution
  23. class width
    range/# classes
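
The frequency distribution, cumulative frequency, and relative frequency cards can be tied together in one small table builder. This is a sketch under the "1 under 3, 3 under 5" class convention; the data, the starting point, and the function name `frequency_table` are made up for illustration.

```python
def frequency_table(values, low, width, n_classes):
    """Rows of (lower, upper, frequency, relative frequency, cumulative frequency)."""
    total = len(values)
    rows, running = [], 0
    for c in range(n_classes):
        lower = low + c * width
        upper = lower + width
        # "lower under upper": lower endpoint included, upper endpoint excluded
        freq = sum(1 for v in values if lower <= v < upper)
        running += freq
        rows.append((lower, upper, freq, freq / total, running))
    return rows


data = [1, 2, 2, 3, 4, 4, 4, 5, 6, 6]  # made-up data
rows = frequency_table(data, low=1, width=2, n_classes=3)
for row in rows:
    print(row)
```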
  24. histogram
    typical vertical bar-chart used to depict a freq. dist.
  25. frequency polygon
    graph in which line segments connect the dots depicting a frequency distribution
  26. ogive
    cumulative frequency polygon- most useful for running totals
  27. pie chart
    • data represented as a whole
    • slice angle in degrees = (class frequency / total frequency) * 360
  28. stem & leaf
    constructed by separating the digits of each number in the data into 2 groups
  29. pareto
    • Vertical bar chart that displays the most common types of defects
    • ranked in order of occurrence left to right
  30. scatter plot
    • 2 dimensional plot of pairs of points from 2 variables
    • good for attempting to determine the relationship between 2 variables
  31. census
    • gather data from a whole population
    • data from a census are parameters
  32. Levels of Data
    • Lowest to Highest
    • Nominal
    • Ordinal
    • Interval
    • Ratio
  33. Nominal
    • Lowest level of data: Used only to classify or categorize
    • e.g. doctor, lawyer, educator, other
    • NON-METRIC Data, aka qualitative data.
  34. Ordinal
    • Higher than Nominal, can be used to rank or order subjects
    • e.g. not helpful, somewhat helpful, moderately helpful, very helpful, extremely helpful
    • NON-METRIC Data, aka qualitative data.
  35. Interval
    • Higher than Ordinal
    • Distances between consecutive numbers have meaning and the data are always numerical
    • e.g. temperature
  36. Ratio
    • Highest Level of data measurement
    • Have the same properties as Interval but they have an absolute zero which indicates absence
    • Ratio of two numbers is meaningful
    • e.g. Height, weight, Kelvin temperature, passenger miles
  37. Parametric Stats
    Must be Interval or Ratio
  38. Non-Parametric Stats
    Can be used with nominal or ordinal data, but can also be used to analyze parametric (interval or ratio) data
  39. grouped data
    data that have been organized into a frequency distribution
  40. ungrouped data
    raw data or data that have not been summarized in any way
  41. median
    • middle value in an ordered array of #s.
    • for an array with an odd number of values, the median is the middle value
    • for an array with an even number of values, the median is the average of the two middle values
    • the position of the median is (n+1)/2
    • e.g. for 77 terms the median is the (77+1)/2 = 39th term
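
A minimal sketch of the odd/even rule for the median. The function name and the sample lists are made up for illustration.

```python
def median(values):
    """Middle value of the ordered data, or the average of the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]  # 1-based position (n+1)/2
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([3, 1, 2]))     # odd count: the middle value
print(median([4, 1, 3, 2]))  # even count: average of the two middle values
```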
  42. Quartiles
    • same rules as percentiles: if i is a whole number, Qx is the average of the ith and (i+1)th values
    • Q25 = Q1 = first 25% of values ending in the Q25 term
    • Q50 = Q2 = first 50% of values ending in the Q50 term
    • Q75 = Q3 = first 75% of values ending in the Q75 term
    • Q2 is the median
  43. measure of central tendency
    yields info about the center, or middle part, of a group of values
  44. mode
    • the most frequently occurring value in a set of data
    • bimodal- data set has two modes
    • multimodal - data set has more than two modes
  45. Inter Quartile Range : IQR
    • The middle 50% of values
    • IQR = Q3-Q1
    • e.g. if Q3 is the 12th term (70) and Q1 is the 4th term (5), IQR = 70-5 = 65
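
Quartiles reuse the percentile location rule with P = 25 and P = 75, and the IQR is their difference. A sketch under that assumption; the data and the helper name `value_at_location` are made up for illustration.

```python
def value_at_location(ordered, p):
    """Value at the Pth percentile of already-sorted data, 1-based location rule."""
    n = len(ordered)
    i = p * n / 100
    if i == int(i):
        i = int(i)
        return (ordered[i - 1] + ordered[i]) / 2  # average of positions i and i+1
    return ordered[int(i)]                        # position (whole part of i) + 1


data = sorted([14, 12, 19, 23, 5, 13, 28, 17])  # made-up data, n = 8
q1 = value_at_location(data, 25)
q3 = value_at_location(data, 75)
iqr = q3 - q1
print(q1, q3, iqr)
```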
  46. Coefficient of Variation
    • The ratio of the standard deviation to the mean, expressed as a percentage and denoted CV
    • CV = (σ/µ)100
    • e.g. for σ=4.84 & µ = 64.4, CV = 7.5%
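
The CV example works out as follows. A minimal sketch; the function name is made up for illustration.

```python
def coefficient_of_variation(sigma: float, mu: float) -> float:
    """CV = (σ/µ) * 100, as a percentage."""
    return sigma / mu * 100


cv = coefficient_of_variation(4.84, 64.4)
print(round(cv, 1))
```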
  47. z score
    • number of standard deviations a value (x) is above or below the mean of a set of numbers when the data are normally distributed
    • z = (x-µ)/σ
    • e.g. x = 1, µ = 4.28, σ = 2.491, z = -1.32
    • x = 9, µ = 4.28, σ = 2.491, z = 1.89
    • z scores still follow the empirical rule
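
A short sketch of the z score formula, using x = 1, µ = 4.28, σ = 2.491 from the card. The function name is made up for illustration.

```python
def z_score(x: float, mu: float, sigma: float) -> float:
    """z = (x-µ)/σ: how many standard deviations x lies from the mean."""
    return (x - mu) / sigma


print(round(z_score(1, 4.28, 2.491), 2))  # value below the mean, negative z
print(round(z_score(9, 4.28, 2.491), 2))  # value above the mean, positive z
```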
  48. coefficient of correlation
    • correlation: measure of the degree of relatedness of variables
    • coefficient of correlation = r
    • r = (big equation)
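
The card leaves the formula for r as a placeholder. The sketch below assumes the usual Pearson product-moment form, r = ∑(x-x̄)(y-ȳ) / sqrt(∑(x-x̄)^2 * ∑(y-ȳ)^2), which may not be written the same way as the "big equation" used in the course. The function name and data are made up for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient for paired data (assumed formula, see above)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 pairs")
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)


print(pearson_r([1, 2, 3], [2, 4, 6]))  # y rises with x
print(pearson_r([1, 2, 3], [6, 4, 2]))  # y falls as x rises
```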
  49. classical method of assigning probability
    • involves an experiment which is a process that produces outcomes, and an event, which is an outcome of an experiment.
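    • P(E) = n_e/N, where N = total number of possible outcomes and n_e = number of outcomes in which the event occurs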
    • Highest probability of an outcome is 1.
    • Lowest probability is 0
  50. a priori
    probabilities can be determined prior to the experiment
  51. intersection
    • contains the elements common to both sets
    • X = {1,2,3,4}, Y = {2,3,6,7}, X∩Y = {2,3}
  52. mutually exclusive events
    • when the occurrence of one event precludes the occurrence of another event
    • e.g. Male and Female. OK and Defective. A person cannot be both Male and Female and a part cannot be both OK and Defective
    • formula: P(X∩Y) = 0
  53. independent events
    • events wherein the occurrence or nonoccurrence of one of the events does not affect the occurrence or nonoccurrence of the other event.
    • e.g. Coin tosses or Die Rolls. The previous event does not influence the following event
    • formula: Independent Events X & Y
    • P(X|Y) = P(X) and P(Y|X) = P(Y)
  54. complement
    • All the elementary events of an experiment not in A comprise its complement.
    • e.g. If the experiment is rolling a die and the event is 5, then the complement is 1,2,3,4,6
    • A'
    • P(A') = 1 - P(A)
  55. relative frequency of occurrence method of assigning probabilities
    the probability of an event occurring is equal to the number of times the event has occurred in the past divided by the total number of opportunities for the event to have occurred.
  56. subjective probability
    assigning probability based on the feelings or insights of the person determining the probability
  57. mn counting rule
    • For an operation that can be done m ways and a second operation that can be done n ways, the two operations can then occur, in order, in mn ways.
    • This rule can be extended to 3 or more operations
    • e.g. # of Groups possible with the following factors
    • gender, marital status, economic class = 2(m/f), 3(single-never married/married/divorced), 3(lower/middle/upper)
    • =18 groups. Therefore 18 samples could be taken to represent all groups.
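
The 18 groups can be listed out to confirm the count. A minimal sketch using `itertools.product`; the variable names are made up for illustration.

```python
from itertools import product

genders = ["male", "female"]
marital_statuses = ["single-never married", "married", "divorced"]
economic_classes = ["lower", "middle", "upper"]

# every ordered choice of one value per factor: 2 * 3 * 3 combinations
groups = list(product(genders, marital_statuses, economic_classes))
print(len(groups))
print(groups[0])
```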
  58. sampling from a population with Replacement
    • sampling n items from a population of size N with replacement would provide N^n possibilities
    • e.g. A die being rolled 3 times in succession, how many different outcomes can occur?
    • N = 6, n=3, 6^3 = 216
    • A lottery of reusable numbers 6 digits long from 0-9
    • N=10, n=6, 10^6 = 1000000
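
Both examples follow from N^n, and the die case is small enough to list every outcome. A sketch; the function name is made up for illustration.

```python
from itertools import product


def count_with_replacement(population_size: int, draws: int) -> int:
    """Number of ordered outcomes when sampling with replacement: N^n."""
    return population_size ** draws


# list every outcome of rolling a six-sided die 3 times in succession
rolls = list(product(range(1, 7), repeat=3))
print(len(rolls), count_with_replacement(6, 3))
print(count_with_replacement(10, 6))  # 6-digit lottery number, digits 0-9 reusable
```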
  59. Combinations
    • Sampling n items from a population of size N without replacement
    • NCn = N!/(n!(N-n)!)
    • e.g. three lawyers are to be sent to a conference from a pool of 16
    • 16!/(3!13!) = 560
    • combination because once selected the lawyer can not be selected again
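
The lawyer example can be checked two ways: with the factorial formula and with `math.comb` from the Python standard library (Python 3.8 or later). The function name `combinations_count` is made up for illustration.

```python
from math import comb, factorial


def combinations_count(population_size: int, sample_size: int) -> int:
    """N!/(n!(N-n)!): ways to choose n items from N without replacement, order ignored."""
    return factorial(population_size) // (
        factorial(sample_size) * factorial(population_size - sample_size)
    )


print(combinations_count(16, 3), comb(16, 3))
```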
  60. Spearman Rank
    • r_s = 1 - (6∑d^2)/(n(n^2-1))
    • where d = difference in the ranks of each pair
    • n = number of pairs
    • High positive number indicates a positive correlation
    • High negative number indicates a negative correlation
    • e.g. for x and y pairs, a Spearman's value of -.830 indicates a strong inverse correlation;
    • that is, when x is high y is low, and vice versa
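
A minimal sketch of the Spearman formula. It assumes the two lists already hold ranks (not raw values) and that there are no tied ranks; the function name and the rank lists are made up for illustration.

```python
def spearman_rank(ranks_x, ranks_y):
    """r_s = 1 - (6 * sum of d^2) / (n(n^2 - 1)) for paired ranks without ties."""
    if len(ranks_x) != len(ranks_y) or len(ranks_x) < 2:
        raise ValueError("need two equal-length rank lists with at least 2 pairs")
    n = len(ranks_x)
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(ranks_x, ranks_y))
    return 1 - (6 * sum_d2) / (n * (n ** 2 - 1))


print(spearman_rank([1, 2, 3, 4, 5], [1, 2, 3, 4, 5]))  # identical rankings
print(spearman_rank([1, 2, 3, 4, 5], [5, 4, 3, 2, 1]))  # fully reversed rankings
```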
  61. General Law of Addition
    • P(A∪B)= P(A) + P(B) - P(A∩B)
    • That is probability of A + probability of B - Probability of A&B together
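
The general law of addition can be verified by counting outcomes for one roll of a fair die. This is a sketch with made-up events (A = even number, B = greater than 3); `Fraction` keeps the probabilities exact.

```python
from fractions import Fraction

outcomes = {1, 2, 3, 4, 5, 6}  # one roll of a fair die
event_a = {2, 4, 6}            # made-up event A: an even number
event_b = {4, 5, 6}            # made-up event B: greater than 3


def prob(event):
    """Classical probability: outcomes in the event over total outcomes."""
    return Fraction(len(event), len(outcomes))


# P(A∪B) counted directly, and via P(A) + P(B) - P(A∩B)
p_union_direct = prob(event_a | event_b)
p_union_by_law = prob(event_a) + prob(event_b) - prob(event_a & event_b)
print(p_union_direct, p_union_by_law)
```

Subtracting P(A∩B) removes the outcomes 4 and 6, which would otherwise be counted twice.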
  62. Special Law of Addition
    • Applies only if the events are mutually exclusive
    • i.e. male or female, where P(A∩B) = 0
    • Then P(A∪B) = P(A) + P(B)
  63. General Law of Multiplication
    • This gives the probability that both A & B will happen at the same time
    • P(A∩B) = P(A) * P(B|A) = P(B) * P(A|B)
    • P(A∩B) means that A & B MUST happen.
    • P(A|B) is the probability of A given that B is true
  64. Special Law of Multiplication
    If X, Y are independent, P(X∩Y) = P(X) * P(Y)
  65. Independent Events X,Y
    • To test to determine if X & Y are independent events, the following must be true
    • P(X|Y) = P(X) and P(Y|X) = P(Y)
  66. Conditional Probability
    P(X|Y) = P(X∩Y)/P(Y) = (P(X)*P(Y|X))/P(Y)
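
The definition P(X|Y) = P(X∩Y)/P(Y) and the independence test can be checked by counting outcomes for one roll of a fair die. A sketch with made-up events (X = even number, Y = greater than 3); the helper names are hypothetical.

```python
from fractions import Fraction

outcomes = {1, 2, 3, 4, 5, 6}  # one roll of a fair die
event_x = {2, 4, 6}            # made-up event X: an even number
event_y = {4, 5, 6}            # made-up event Y: greater than 3


def prob(event):
    """Classical probability: outcomes in the event over total outcomes."""
    return Fraction(len(event), len(outcomes))


def conditional(event, given):
    """P(event | given) = P(event ∩ given) / P(given)."""
    return prob(event & given) / prob(given)


p_x_given_y = conditional(event_x, event_y)
# General law of multiplication: P(X∩Y) = P(X) * P(Y|X), so dividing by P(Y)
# gives the same conditional probability.
p_x_given_y_alt = prob(event_x) * conditional(event_y, event_x) / prob(event_y)
# Independence test: X and Y are independent only if P(X|Y) = P(X).
independent = p_x_given_y == prob(event_x)
print(p_x_given_y, p_x_given_y_alt, independent)
```

Here P(X|Y) differs from P(X), so these two events are not independent.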
Author
gummibear
ID
7317
Card Set
CIS2300_TEST1.txt
Description
CIS 2300 Business statistics test 1
Updated