# pfd LIU.txt

## Types of Measures

**Nominal** (category, no order): categorical measures that do not have an order, e.g. color (red/blue/green/etc.); types of teeth (molars/incisors/premolars/canines).

**Ordinal** (category, order of intensity): categorical measures that have an order of intensity/degree, e.g. stage of oral cancer (stages I-IV); curvature of dental root (straight/slight curvature/pronounced curvature).

**Interval** (continuous, no true zero): measures that do not have a true zero; the relative difference is the key, e.g. temperature; dates.

**Ratio** (continuous, true zero): measures that have a true zero, e.g. depth of periodontal pocket; size of oral lesion.

## Descriptive Statistics

**Range**: distance between the largest and the smallest observation; the simplest measure of variability.

**Percentile**: point below which a specified percent of observations lie. The percentile of an observation x is given by:

((# of obs less than x) + 0.5) / (total # of obs in data) × 100

**Central location**: the value on which a distribution tends to center.
- Mean: the arithmetic average
- Median: the middle item of the data set
- Mode: the most frequent value

**Confidence Interval (CI)**: measures the likelihood that the true value of a population parameter (e.g., the mean) is within the margin of error of the sample estimate. A 95% CI is the range of values that would cover the true population parameter 95% of the time; for a normal distribution, the 95% CI will "capture" µ 95% of the time.

**Dispersion**:
- Variance: measures the variation
- Standard Deviation (SD): the square root of the variance, denoted by σ; has the same unit as x
- Standard Error (SE): an estimate of the precision of a parameter estimate.
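The descriptive statistics above (range, percentile, central location, SE, 95% CI) can be sketched in a few lines of Python. This uses a small hypothetical data set, not numbers from the notes; the percentile function implements the formula given above.

```python
import statistics
from math import sqrt

# Hypothetical sample (not from the notes), used to illustrate the definitions above.
data = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(data)

rng = max(data) - min(data)          # range: largest minus smallest observation
mean = statistics.mean(data)         # arithmetic average
median = statistics.median(data)     # middle item of the sorted data
mode = statistics.mode(data)         # most frequent value
sd = statistics.stdev(data)          # sample SD: square root of the variance
se = sd / sqrt(n)                    # SE: precision of the mean estimate

def percentile_of(x, xs):
    """Percentile of observation x: ((# of obs less than x) + 0.5) / n * 100."""
    return (sum(v < x for v in xs) + 0.5) / len(xs) * 100

# Approximate 95% CI for the mean (normal approximation, z = 1.96)
ci = (mean - 1.96 * se, mean + 1.96 * se)

print(rng, mean, median, mode)
print(round(percentile_of(5, data), 2))  # 56.25
```

For this sample the range is 7, the mean 5.0, the median 4.5, and the mode 4; the observation 5 sits at the 56.25th percentile, since 4 of the 8 values lie below it.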
The SE measures the variability of an estimate due to sampling.

**Kurtosis**: characterizes the relative peakedness or flatness of a distribution (range: -2 to infinity).

**Skewness**: measures the asymmetry of a distribution (range: -3 to 3).

**Frequency**: the most commonly used method to describe categorical measures. Consists of categories, with the number of observations and the percentage corresponding to each category.

**Mode**: the most frequent value.

## Hypothesis Testing

Goal: judge the evidence for a hypothesis. Steps for hypothesis testing:
♦ State the null and alternative hypotheses
♦ Choose an appropriate statistical test
♦ Conduct the statistical test to obtain the p-value
♦ Compare the p-value against a fixed cutoff for statistical significance, α (usually 0.05), and draw a conclusion

**Type I error** (reject a true null): rejecting a null hypothesis when it is true (α error, commonly 0.05).

**Type II error** (accept a false null): accepting a null hypothesis when it is false (β error, commonly 0.2).

**P-value of a test**: the probability that the test statistic assumes a value as extreme as, or more extreme than, that observed, given that the null hypothesis is true.

**Power (1 - β)**: the probability that you reject the null hypothesis, given that the alternative hypothesis is true.

**Parametric tests**: statistical procedures based on distribution assumptions, e.g. the t-test, Analysis of Variance (ANOVA), and the Chi-Square test.

**Non-parametric tests**: statistical procedures not based on distribution assumptions, e.g. the sign test and the Kruskal-Wallis test (non-parametric ANOVA).

**2-group t-test**: compares whether two independent groups have the same mean of a normally distributed variable with unknown variance.

**ANOVA**: tests means among multiple groups using the F-test. It is a generalization of the t-test, and is equivalent to the t-test when comparing two groups.
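The relationship between the p-value, α, and the Type I error rate can be illustrated by simulation: when the null hypothesis is true, p-values are uniformly distributed, so a test at α = 0.05 rejects (commits a Type I error) in about 5% of studies. A minimal standard-library sketch; the sample size (30), seed, and number of simulated studies are arbitrary choices, and a z-test with known σ is used only because its p-value is easy to compute without external packages.

```python
import random
from math import sqrt
from statistics import NormalDist

random.seed(42)
std_normal = NormalDist()  # N(0, 1)

def z_test_p_value(sample, mu0=0.0, sigma=1.0):
    """Two-sided z-test p-value for H0: mean == mu0, with known sigma."""
    n = len(sample)
    z = (sum(sample) / n - mu0) / (sigma / sqrt(n))
    return 2 * (1 - std_normal.cdf(abs(z)))

# Simulate many studies in which H0 is TRUE, so every rejection is a Type I error.
alpha, n_sims = 0.05, 2000
rejections = 0
for _ in range(n_sims):
    sample = [random.gauss(0, 1) for _ in range(30)]  # data drawn under H0
    if z_test_p_value(sample) < alpha:
        rejections += 1

type_i_rate = rejections / n_sims
print(type_i_rate)  # close to alpha = 0.05
```

The observed rejection rate hovers near 0.05, which is exactly what "α error = 0.05" promises: the cutoff controls how often a true null is wrongly rejected.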
ANOVA requires the data to satisfy several assumptions (e.g., the outcome has a normal distribution; equal variance in each group; the data are independent between and within groups). Example: the null hypothesis is that the means of all groups are equal; the F-statistic exceeds the critical value at the 5% level with a p-value of 0.000 < 0.05, so not all means of the three groups are the same; follow up with pairwise comparisons of means.

**Chi-Square Test**: compares observed data with the data we would expect to obtain according to a specific hypothesis. Steps of the χ² goodness-of-fit test:
- Divide the data into c categories
- Estimate the k parameters of the probability model under your hypothesis
- Compute the observed and corresponding expected cell frequencies
- Compute the test statistic

Example:
1. Create 6 intervals (categories): X ≤ 16.25, 16.25 < X ≤ 17.20, 17.20 < X ≤ 18.15, 18.15 < X ≤ 19.10, 19.10 < X ≤ 20.05, and 20.05 < X.
2. Null hypothesis H0: the underlying distribution from which the measurements came is N(18.37, 1.92), i.e. the normal distribution with mean 18.37 and variance 1.92.
3. Calculate the observed and expected frequency for each interval. The p-value is 0.1072, so we accept the null hypothesis.

**Sign test**: used to test whether there is a difference between paired samples. Independent pairs of sample data are collected: (x1, y1), (x2, y2), …; the differences of the pairs are calculated, and zeros are ignored. The null hypothesis is that there are equal numbers of positive and negative differences. Example: a one-sided sign test has p-value 0.1719, which is not significant at the 5% level, so there is no statistically significant difference in the number of patients seen between the two offices.

**Kruskal-Wallis (K-W) Test**: based on the ranks of observations; compares the distribution of a continuous variable among more than two groups (non-parametric ANOVA). The only assumptions required for the population distributions are that they are independent and continuous. Many software packages provide this test (e.g., kwallis in STATA).
## Regression and Related Methods

**Linear regression** (continuous outcome): simple (a single predictor) or multiple (two or more predictors). Terminology: the dependent / predicted / response / outcome variable is modeled as a function of the independent / predictor / explanatory variables (covariates).

**Logistic regression** (binary outcome): simple (a single predictor) or multiple (two or more predictors). The dependent variable is binary (e.g., whether inflammation of the gingiva is present). The logistic function is non-linear in terms of the probability of the event. The parameter estimates can be expressed as odds ratios, which describe the relationship between exposure and outcome, controlling for other factors.

**Analysis of Covariance (ANCOVA)** (continuous outcome): a merger of ANOVA and regression; a method for comparing mean values of the outcome between groups while adjusting for covariates (e.g., compare mean LOA across groups, adjusting for age). The response is continuous, and the covariates can be both continuous and categorical. An extension of ANOVA, or a combination of ANOVA and linear regression.

## Sample Size & Statistical Power

Statistical significance is the desired outcome of a study, so planning for enough sample size is of prime importance. Due to limitations of resources and availability of subjects, we can only get a limited sample size.

Five key factors:
1. Sample size: the minimum number of unique subjects in your data required to detect a certain difference
2. Effect size: the difference between parameters to be tested (e.g., difference in LOA between groups)
3. Significance level (Type I error): the probability that we reject a null hypothesis when it is true (commonly 0.05)
4. Power: the probability of rejecting a null hypothesis when it is false (equals 1 - Type II error; commonly 0.8)
5. Variability: variation of the outcome measure

Author: emm64 · ID: 133493 · Card Set: pfd LIU.txt · Description: PFD Liu · Updated: 2012-03-21
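The five factors above are linked: fixing any four determines the fifth. As a sketch (the scenario is assumed, not from the notes), the standard normal-approximation formula for comparing two group means, n per group = 2(z₁₋α/₂ + z₁₋β)² σ² / δ², can be evaluated with Python's standard library:

```python
from math import ceil
from statistics import NormalDist

def n_per_group(effect, sd, alpha=0.05, power=0.80):
    """Normal-approximation sample size per group for comparing two means.

    effect: the difference to detect (effect size, factor 2)
    sd:     outcome variability (factor 5)
    alpha:  significance level / Type I error (factor 3)
    power:  1 - Type II error (factor 4)
    """
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2)   # two-sided critical value
    z_beta = z(power)            # quantile for the desired power
    return ceil(2 * (z_alpha + z_beta) ** 2 * sd ** 2 / effect ** 2)

# Hypothetical scenario: detect a 0.5 mm difference in mean LOA between two
# groups, assuming SD = 1.0 mm, alpha = 0.05, power = 0.8.
print(n_per_group(effect=0.5, sd=1.0))  # 63 subjects per group
```

Note how the factors trade off: halving the detectable effect size quadruples the required n, and raising the desired power or lowering α also pushes n up.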