pfd LIU.txt

  1. Nominal:
    • NO ORDER
    • categorical measures that do not have an order
    • ---e.g. color (red/blue/green/etc.); types of teeth (molars/incisors/premolars/canines)
  2. Ordinal:
    • ORDER of intensity
    • categorical measures that have an order of intensity/degree
    • ---e.g. stage of oral cancer (stages I-IV); curvature of dental root (straight/slight curvature/pronounced curvature)
  3. Interval
    • ex: dates
  4. RATIO
    • ex. perio pocket depth
  5. Continuous Measure
    • Interval: measures that do not have a true zero; the relative difference is the key
    • ---e.g. temperature; dates;
    • Ratio: measures that have a true zero
    • ---e.g. depth of periodontal pocket; size of oral lesion.
  6. Range
    • Distance between the largest and the smallest observation
    • Simplest measure of variability
  7. Percentile
    • Point below which a specified percent of observations lie
    • Percentile of an observation x is given by:

    percentile of x = [(# of obs less than x) + 0.5] / (total # of obs in data) × 100
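The percentile formula above can be sketched in a few lines of Python. This is only an illustration: the function name `percentile_rank` and the pocket-depth values are made up for the example.

```python
def percentile_rank(data, x):
    """Percentile of x: ((# obs less than x) + 0.5) / (total # obs) * 100."""
    if not data:
        raise ValueError("data must not be empty")
    below = sum(1 for value in data if value < x)
    # Multiply before dividing to keep the arithmetic simple.
    return (below + 0.5) * 100 / len(data)


# Hypothetical periodontal pocket depths in mm (not from the source).
pocket_depths_mm = [2, 3, 3, 4, 5, 6, 7, 8, 9, 10]
print(percentile_rank(pocket_depths_mm, 6))
```

Five of the ten observations lie below 6, so the formula gives (5 + 0.5) / 10 × 100.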
  8. Central Location
    • The value on which a distribution tends to center
    • Mean: the arithmetic average
    • Median: the middle item of the data set
    • Mode: the most frequent value
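As a small illustration of the three measures, Python's standard `statistics` module computes each directly. The data values are hypothetical.

```python
import statistics

# Hypothetical number of visits per patient (not from the source).
visits = [1, 2, 2, 3, 4, 7, 9]

mean_value = statistics.mean(visits)      # arithmetic average: 28 / 7
median_value = statistics.median(visits)  # middle item of the sorted data
mode_value = statistics.mode(visits)      # most frequent value

print(mean_value, median_value, mode_value)
```

Note how the two large values pull the mean above the median, while the mode is unaffected by them.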
  9. Confidence Interval (CI)
    • Quantifies the uncertainty in a sample estimate of a population parameter (e.g., mean): the estimate plus or minus a margin of error.
    • A 95% CI comes from a procedure that, over repeated sampling, produces intervals covering the true population parameter 95% of the time.
    • 95% CI for the mean of a normal distribution: will “capture” µ in 95% of repeated samples.
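The "captures µ 95% of the time" idea can be checked with a small simulation. The sketch below assumes the population SD is known, so the interval is mean ± 1.96 × σ/√n; the function names, the parameter values, and the random seed are all illustrative choices, not part of the original notes.

```python
import math
import random
import statistics

# z multiplier for a 95% interval (about 1.96).
Z_95 = statistics.NormalDist().inv_cdf(0.975)


def z_interval(sample, sigma, z=Z_95):
    """95% CI for the mean, assuming the population SD sigma is known."""
    centre = statistics.mean(sample)
    margin = z * sigma / math.sqrt(len(sample))
    return centre - margin, centre + margin


def coverage(mu=5.0, sigma=2.0, n=30, repeats=2000, seed=1):
    """Fraction of simulated samples whose interval contains the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(repeats):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        low, high = z_interval(sample, sigma)
        hits += low <= mu <= high
    return hits / repeats


print(coverage())
```

The printed fraction should land close to 0.95. With an unknown SD and a small sample, a t multiplier would normally replace the z multiplier.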
  10. Descriptive Statistics
    • Dispersion
    • Variance---measures the variation
    • Standard Deviation (SD)---the square root of the variance, denoted by σ; has the same unit as x
    • Standard Error (SE)---an estimate of the precision of parameter estimates. It measures the variability of an estimate due to sampling (for a sample mean, SE = SD/√n).
    • Kurtosis---characterizes the relative peakedness or flatness of a distribution (excess kurtosis ranges from -2 to infinity)
    • Skewness---measures the asymmetry of a distribution (values typically fall between -3 and 3)
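A minimal sketch of these quantities, using only the standard library. The function name `describe` is hypothetical, and the skewness and kurtosis here use the simple moment-based definitions (statistical packages often apply small-sample corrections, so their output may differ slightly).

```python
import math
import statistics


def describe(sample):
    """Return variance, SD, SE of the mean, skewness, and excess kurtosis."""
    n = len(sample)
    mean = statistics.mean(sample)
    variance = statistics.variance(sample)  # sample variance (n - 1 denominator)
    sd = math.sqrt(variance)
    se = sd / math.sqrt(n)                  # standard error of the mean
    # Central moments with an n denominator, used for shape measures.
    m2 = sum((x - mean) ** 2 for x in sample) / n
    m3 = sum((x - mean) ** 3 for x in sample) / n
    m4 = sum((x - mean) ** 4 for x in sample) / n
    return {
        "variance": variance,
        "sd": sd,
        "se": se,
        "skewness": m3 / m2 ** 1.5,
        "excess_kurtosis": m4 / m2 ** 2 - 3,
    }


print(describe([1, 2, 3, 4, 5]))
```

A symmetric sample such as 1 to 5 has skewness 0, and its flat shape gives a negative excess kurtosis.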
  11. Frequency
    • Most commonly used method to describe categorical measures
    • Consists of categories, plus the number of observations and the percentage corresponding to each category.
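A frequency table of this kind can be built with `collections.Counter`. The helper name `frequency_table` and the tooth-type data are invented for the example.

```python
from collections import Counter


def frequency_table(values):
    """Return (category, count, percentage) rows, most frequent first."""
    counts = Counter(values)
    total = len(values)
    return [
        (category, count, 100 * count / total)
        for category, count in counts.most_common()
    ]


# Hypothetical nominal data: type of tooth treated.
teeth = ["molar"] * 4 + ["incisor"] * 3 + ["canine"] * 2 + ["premolar"]

for category, count, percent in frequency_table(teeth):
    print(f"{category:10s} {count:3d} {percent:6.1f}%")
```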
  12. Mode
    Most frequent value
  13. Hypothesis Testing
    • Goal: judge the evidence for a hypothesis
    • Steps for hypothesis testing
    • ♦ State the null & alternative hypotheses
    • ♦ Choose an appropriate statistical test
    • ♦ Conduct the statistical test to obtain the p-value
    • ♦ Compare the p-value against a fixed cutoff for statistical significance, α (usually 0.05), and draw a conclusion
  14. Type I error
    • Rejecting a null hypothesis when it is true means we have committed a Type I error (α error; α is commonly set at 0.05).
  15. Type II error
    • Failing to reject a null hypothesis when it is false means we have committed a Type II error (β error; β is commonly set at 0.2).
  16. P-value of a test
    Probability that the test statistic assumes a value as extreme as, or more extreme than, that observed, given that the null hypothesis is true.
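To make the definition concrete, here is a sketch for the simplest case: a test statistic z that follows a standard normal distribution under the null hypothesis. The function name `z_p_value` is hypothetical, and other tests would use their own reference distribution.

```python
from statistics import NormalDist


def z_p_value(z, two_sided=True):
    """P(value as extreme as z or more) under a standard normal null."""
    null = NormalDist()  # mean 0, SD 1
    if two_sided:
        return 2 * (1 - null.cdf(abs(z)))
    return 1 - null.cdf(z)  # upper-tailed alternative


alpha = 0.05
p = z_p_value(2.5)
print(p, "reject H0" if p < alpha else "fail to reject H0")
```

A z statistic of about 1.96 sits right at the usual two-sided 0.05 cutoff.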
  17. Power
    • (1-β) Probability that you reject the null hypothesis, given that the alternative hypothesis is true.
  18. Parametric test
    • Statistical procedures based on distribution assumptions
    • t-test
    • Analysis of Variance (ANOVA)
    • Chi-Square test
  19. Non-parametric test
    • Statistical procedures not based on distribution assumptions
    • Sign-test
    • Kruskal-Wallis test (non-parametric ANOVA)
  20. 2-group T-test:
    Compare whether two independent groups have the same mean of a normally distributed variable with unknown variance.
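The test statistic for this comparison can be sketched as follows. This version assumes equal variances in the two groups (the pooled form); the function name and the data are illustrative. Turning the statistic into a p-value requires the t distribution, which the standard library does not provide, so a statistics package would normally be used for that step.

```python
import math
import statistics


def pooled_t_statistic(group_a, group_b):
    """Two-sample t statistic and degrees of freedom (equal-variance form)."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = (
        (n_a - 1) * statistics.variance(group_a)
        + (n_b - 1) * statistics.variance(group_b)
    ) / df
    se = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / se
    return t, df


# Hypothetical measurements for two independent groups.
t_stat, df = pooled_t_statistic([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])
print(t_stat, df)
```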
  21. ANOVA
    • Test means among multiple groups
    • Uses F-test. It is a generalization of t-test and equivalent to t-test if comparing two groups.
    • Data will need to satisfy several assumptions (e.g., the outcome has a normal distribution; equal variance for each group; the data are independent between and within groups.)
    • Example
    • Null hypothesis: the means of all groups are equal
    • The F-statistic exceeds the critical value at the 5% level, with a p-value reported as 0.000 (i.e., < 0.001), which is below 0.05
    • Conclusion: not all means of the three groups are the same
    • Follow-up: pairwise comparison of means
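The F-statistic is the ratio of between-group to within-group mean squares. The sketch below computes it by hand; the function name and the data are hypothetical, and the p-value lookup (which needs the F distribution) is left to a statistics package.

```python
import statistics


def one_way_anova_f(*groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [v for group in groups for v in group]
    grand_mean = statistics.mean(all_values)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    ss_between = sum(
        len(group) * (statistics.mean(group) - grand_mean) ** 2
        for group in groups
    )
    ss_within = sum(
        (v - statistics.mean(group)) ** 2 for group in groups for v in group
    )
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Three hypothetical groups.
print(one_way_anova_f([1, 2, 3], [2, 3, 4], [3, 4, 5]))

# With two groups, F equals the square of the pooled two-sample t statistic.
print(one_way_anova_f([1, 2, 3, 4, 5], [2, 3, 4, 5, 6]))
```

The second call illustrates the equivalence with the t-test: for those two groups the pooled t statistic is -1, and F is its square.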
  22. Chi-Square Test
    • Compare observed data with the data we would expect to obtain according to a specific hypothesis.
    • Steps of χ2 goodness of fit test
    • ---Divide the data into c categories;
    • ---Estimate k parameters of the probability model with your hypothesis;
    • ---Compute observed and corresponding expected cell frequencies;
    • ---Test statistic: χ2 = Σ (observed - expected)^2 / expected, summed over the c categories and compared against a χ2 distribution with c - k - 1 degrees of freedom.
    • Example
    • 1. Create 6 intervals (categories): X ≤ 16.25, 16.25 < X ≤ 17.20, 17.20 < X ≤ 18.15, 18.15 < X ≤ 19.10, 19.10 < X ≤ 20.05, and 20.05 < X.
    • 2. Null hypothesis H0: the underlying distribution from which the measurements came is N(18.37, 1.92), i.e. the normal distribution with mean 18.37, variance 1.92.
    • 3. Calculate the observed frequency and expected frequency.
    • The p-value is 0.1072 > 0.05, so we fail to reject the null hypothesis.
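The goodness-of-fit steps can be sketched with the interval boundaries and the N(18.37, 1.92) model from the example. The observed counts below are invented, because the original data are not shown, so the resulting statistic will not reproduce the p-value of 0.1072. The degrees of freedom assume that both the mean and the variance were estimated from the data (k = 2).

```python
import math
from statistics import NormalDist

# Interval boundaries and hypothesised model taken from the example.
CUTS = [16.25, 17.20, 18.15, 19.10, 20.05]
MODEL = NormalDist(mu=18.37, sigma=math.sqrt(1.92))  # 1.92 is the variance


def interval_probabilities(dist, cuts):
    """Probability of each interval (-inf, c1], (c1, c2], ..., (c_last, inf)."""
    cdf = [0.0] + [dist.cdf(c) for c in cuts] + [1.0]
    return [upper - lower for lower, upper in zip(cdf, cdf[1:])]


def chi_square_statistic(observed, expected):
    """Sum over categories of (observed - expected)^2 / expected."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


observed = [4, 7, 10, 12, 5, 2]  # hypothetical counts, NOT the original data
n = sum(observed)
expected = [n * p for p in interval_probabilities(MODEL, CUTS)]
statistic = chi_square_statistic(observed, expected)
degrees_of_freedom = len(observed) - 2 - 1  # c - k - 1 with k = 2 (assumed)

print(statistic, degrees_of_freedom)
```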
  23. Sign test
    • Used to test whether there is a difference between paired samples.
    • Independent pairs of sample data are collected: (x1, y1), (x2, y2), …; the differences within the pairs are calculated, and zeros are ignored.
    • The null hypothesis is: equal numbers of positive and negative differences.
    • ---Example: a one-sided sign test has p-value 0.1719, which is not significant at the 5% level---no statistically significant difference in # of patients seen between the two offices.
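Under the null hypothesis, the number of positive differences follows a Binomial(n, 0.5) distribution, so the p-value is an exact binomial tail. The sketch below is one way to compute it; the function name and the differences are hypothetical, and the one-sided version tests in the direction of the more frequent sign.

```python
from math import comb


def sign_test_p_value(differences, two_sided=False):
    """Exact sign-test p-value; zero differences are ignored."""
    nonzero = [d for d in differences if d != 0]
    n = len(nonzero)
    positives = sum(1 for d in nonzero if d > 0)
    k = max(positives, n - positives)  # count of the more frequent sign
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail) if two_sided else tail


# Hypothetical paired differences: 7 positive, 3 negative, 1 zero.
diffs = [2, 1, 3, -1, 4, 0, 2, -2, 1, 5, -1]
print(sign_test_p_value(diffs))
```

With 7 positive signs out of 10 nonzero differences, the one-sided p-value is 176/1024 = 0.171875, which happens to round to the 0.1719 quoted above. The original data are not shown, so this match is only suggestive.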
  24. Kruskal-Wallis (K-W) Test
    • Based on the ranks of the observations; compares the distribution of a continuous variable among more than two groups (non-parametric ANOVA).
    • The only assumptions required are that the observations are independent and that the population distributions are continuous.
    • Many software packages provide this test (e.g., kwallis in Stata).
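For illustration, the H statistic can be computed by hand from the pooled ranks: H = 12 / (N(N+1)) × Σ R_j² / n_j - 3(N+1), where R_j is the rank sum of group j. The sketch below uses average ranks for tied values but omits the tie-correction factor that software usually applies; the function names and data are hypothetical.

```python
def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def kruskal_wallis_h(*groups):
    """Kruskal-Wallis H statistic (no tie correction)."""
    pooled = [v for group in groups for v in group]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    weighted, start = 0.0, 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        weighted += rank_sum ** 2 / len(group)
        start += len(group)
    return 12 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)


# Three hypothetical, completely separated groups.
print(kruskal_wallis_h([1, 2, 3], [4, 5, 6], [7, 8, 9]))
```

H is then compared against a χ2 distribution with (number of groups - 1) degrees of freedom.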
  25. Analysis of Covariance (ANCOVA)
    • Continuous outcome
    • Merger of ANOVA and Regression
  26. Logistic Regression
    • binary outcome
    • Simple --- single predictor
    • Multiple --- two or more predictors
    • Dependent variable is binary
    • Logistic function is non-linear in terms of the probability of event
  27. Linear Regression
    • continuous outcome
    • Simple --- single predictor
    • Multiple --- two or more predictors
    • Equivalent names for the two sides of the model:
    • dependent -> independent
    • predicted -> predictors
    • response -> explanatory
    • outcome -> covariates
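A minimal sketch of simple linear regression (one predictor) using the ordinary least-squares formulas: slope = Sxy / Sxx and intercept = mean(y) - slope × mean(x). The function name and the data are made up for the example.

```python
import statistics


def fit_simple_linear_regression(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    x_bar, y_bar = statistics.mean(x), statistics.mean(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


# Hypothetical predictor (x) and continuous outcome (y) lying on y = 1 + 2x.
intercept, slope = fit_simple_linear_regression([1, 2, 3, 4], [3, 5, 7, 9])
print(intercept, slope)
```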
  28. Logistic Regression
    • The dependent variable is binary (e.g. whether inflammation of the gingiva is present).
    • Logistic function is non-linear in terms of the probability of event.
    • The parameter estimates can be expressed as odds ratios, which describe the relationship between exposure and outcome, controlling for other factors.
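The link between a coefficient and an odds ratio can be shown numerically: exponentiating a coefficient gives the odds ratio for a one-unit change in that predictor. The coefficients and the smoking exposure below are invented for illustration, not fitted to any data.

```python
import math


def logistic(linear_predictor):
    """Map a linear predictor to a probability between 0 and 1."""
    return 1 / (1 + math.exp(-linear_predictor))


def odds(probability):
    return probability / (1 - probability)


# Hypothetical model: log-odds of gingival inflammation = intercept + beta * smoker.
intercept, beta_smoker = -1.0, 0.7

p_nonsmoker = logistic(intercept)
p_smoker = logistic(intercept + beta_smoker)
odds_ratio = odds(p_smoker) / odds(p_nonsmoker)

print(p_nonsmoker, p_smoker, odds_ratio, math.exp(beta_smoker))
```

The probabilities change non-linearly with the predictor, but the odds ratio is constant and equals exp(beta).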
  29. Analysis of Covariance (ANCOVA)
    • A method for comparing mean values of the outcome between groups when adjusting for covariates (e.g., compare mean LOA across groups, adjusting for age).
    • The response is continuous and the covariates can be both continuous and categorical
    •  An extension of ANOVA or a combination of ANOVA and linear regression
  30. Statistical significance
    • Statistical significance is the desired outcome of a study, so planning for a sufficient sample size is of prime importance.
    • In practice, limited resources and availability of subjects restrict the sample size we can obtain.
  31. Sample Size & Statistical Power
    • Five key factors
    • 1. Sample size--the minimum number of unique subjects in your data required to detect a certain difference
    • 2. Effect size--the difference between parameters to be tested, (e.g., difference in LOA between groups)
    • 3. Significance level (Type I error)--the probability that we reject a null hypothesis when it is true(commonly at 0.05)
    • 4. Power--the probability of rejecting a null hypothesis when it is false (equal to 1 - Type II error; commonly set at 0.8)
    • 5. Variability -- variation of the outcome measure
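To show how the five factors interact, here is a sketch using a common normal-approximation formula for comparing two group means with equal group sizes: n per group = 2 × (σ × (z for 1 - α/2 + z for power) / effect size)². The formula choice, the function name, and the input values are assumptions for illustration; dedicated power software may give slightly larger numbers because it uses the t distribution.

```python
import math
from statistics import NormalDist


def sample_size_per_group(effect_size, sd, alpha=0.05, power=0.80):
    """Subjects per group to detect a mean difference (normal approximation)."""
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = standard_normal.inv_cdf(power)           # power = 1 - Type II error
    n = 2 * (sd * (z_alpha + z_beta) / effect_size) ** 2
    return math.ceil(n)


# Hypothetical: detect a 0.5 mm difference in LOA when the SD is 1 mm.
print(sample_size_per_group(effect_size=0.5, sd=1.0))
```

Halving the effect size roughly quadruples the required sample size, while higher variability or higher desired power also increases it.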