
Define Medicine
The aggregate of disciplines concerned with illness in humans

Two ways to classify medical disciplines
By type of patient:
1) Clinical disciplines: concerned with illness in individuals
2) Community disciplines: concerned with illness in populations

Define Diagnosis and Diagnostic Probability
From the Greek gnosis (to know)
Definition: perception of a person's current (or past) state of health
The correct diagnostic probability in a given context is the proportion of subjects like the one at hand (diagnostic profile) among whom the illness in question is present.

Define Etiognosis and Etiognosis Probability
Perception regarding the cause of a person's state of health
the correct etiognostic probability in a given context is the proportion of subjects like the one at hand (illness; antecedent present; etiognostic profile) among whom the antecedent was causal to the illness.

Define prognosis and prognostic probability
Definition: perception of a person's future course of health
The correct prognostic probability in a given context is the proportion of subjects like the one at hand (prospective intervention; prognostic profile) among whom the intervention effect, course or outcome will actually occur.

What intervention has had the most profound impact on public health in industrialized countries since WW2?
National health insurance. Clinical medicine moved into the public domain with regard to its financing.

What are the two types of knowledge?
 1) General: Abstract propositions about nature
 2) Particular: Propositions that are time and place specific
*General knowledge tends to be more generalizable

Two types of medical science
 1) Basic medical science:
  The source of innovation in medicine
  More in common with biology than with the art of medicine
 2) Applied medical science:
  The creation of knowledge (probability functions) that can be used for gnostication

What is Bayes Theorem and what does it imply
Theorem: posttest odds = likelihood ratio X pretest odds
It means you must always take into account the probability that someone has the disease before they take the test: convert the pretest probability to odds, multiply by the likelihood ratio, and convert back. The current belief = new evidence x prior belief.
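The odds-based updating above can be sketched in Python (the 10% pretest probability and likelihood ratio of 9 are hypothetical numbers for illustration):

```python
def post_test_probability(pretest_prob, likelihood_ratio):
    """Convert the pretest probability to odds, apply the likelihood
    ratio, then convert the posttest odds back to a probability."""
    pretest_odds = pretest_prob / (1 - pretest_prob)
    posttest_odds = pretest_odds * likelihood_ratio
    return posttest_odds / (1 + posttest_odds)

# Hypothetical example: 10% pretest probability, test with LR of 9
print(round(post_test_probability(0.10, 9), 3))  # 0.5
```

Note that the same test applied at a 1% pretest probability raises it only to about 8%, which is exactly why the pretest probability matters.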

Does science have values?
The word science stems from the Latin scientia (knowledge). It refers both to a rational and empirical method and to a body of knowledge.
There are no values in science except those related to objectivity.

Define Epidemiologic Research
Research on the frequency of occurrence of a phenomenon (of medical interest) in population-time
The relation of any phenomenon to an antecedent (preceding) determinant can be given a causal interpretation if it is conditioned on all extraneous determinants (those not pertaining to the matter under consideration).

Is epidemiology a science?
 Not in the subject matter sense
 It is catholic (universal), Epi is an aspect of various sciences.

Two things study design involves
1) Object design: the form the knowledge is to take, guided by how the knowledge will be used. It should serve the needs of medical practice. Most Epi discussions focus only on methods design.
2) Methods design: the methods to be deployed to obtain empirical content of a particular form.

Study design usually refers to the type of study with the alternatives being one of the following three:
 1) Cohort
 2) Case-control
 3) Cross-sectional

What is a cohort study? What equation is used in Cohort studies?
refers to studies that recruit subjects who have and have not been exposed to some determinant of interest. These subjects are followed up to determine the rates of the outcome/disease that develop among the exposed and unexposed.
The rate ratio (relative risk) is used to express this association.
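The arithmetic can be sketched from a 2x2 table in Python (the cell counts below are hypothetical):

```python
def rate_ratio(a, b, c, d):
    """Risk ratio from a cohort 2x2 table:
    a = exposed cases,   b = exposed non-cases,
    c = unexposed cases, d = unexposed non-cases."""
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    return risk_exposed / risk_unexposed

# Hypothetical cohort: 30/1000 exposed vs 10/1000 unexposed develop disease
print(round(rate_ratio(30, 970, 10, 990), 2))  # 3.0
```

A rate ratio of 3 reads as: the exposed developed the outcome at three times the rate of the unexposed.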

What is the difference between proportion rates and density rates?
proportion rates do not have time in the denominator whereas density rates do.

Difference between rate and risk
Some say these are the same thing; however, in this class risk is defined as an abstract concept. Rate is what gives empirical content to the concept of risk; the rate gives us a sense of the risk.

When is an odds ratio appropriate to use instead of relative risk in a cohort study?
 Two conditions must be satisfied:
 1) The disease rates are low, so that a/(a+b) ≈ a/b
 2) The disease is rare (e.g., <5% prevalence in the population).
If these conditions are met, the odds ratio will approximate the relative risk in a cohort study.
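A quick numeric check of the approximation, using hypothetical 2x2 tables for a rare and a common outcome:

```python
def odds_ratio(a, b, c, d):
    """Odds ratio from a 2x2 table (cross-product ratio)."""
    return (a * d) / (b * c)

def risk_ratio(a, b, c, d):
    """Risk ratio from the same 2x2 table."""
    return (a / (a + b)) / (c / (c + d))

# Rare outcome (1-2%): OR and RR nearly coincide
print(round(risk_ratio(20, 980, 10, 990), 2))  # 2.0
print(round(odds_ratio(20, 980, 10, 990), 2))  # 2.02

# Common outcome (10-30%): OR overstates the RR
print(round(risk_ratio(300, 700, 100, 900), 2))  # 3.0
print(round(odds_ratio(300, 700, 100, 900), 2))  # 3.86
```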

Relative risk can be misleading; what might be better to use?
Rate difference: shows how many extra cases of the disease occurred in the exposed group compared with the unexposed group (e.g., per 1,000).

Define number needed to harm (NNH) and how to calculate it
Indicates how many patients need to be exposed to a risk factor over a specific period to cause harm in one patient who would not otherwise have been harmed. It is the inverse of the attributable risk (rate difference): NNH = 1/rate difference.

What is the number needed to treat? How do you calculate it?
The NNT is the number of patients who need to be treated to prevent one additional bad outcome (i.e. the number of patients that need to be treated for one to benefit compared with a control in a clinical trial). The ideal NNT is 1, where everyone improves with treatment and no one improves with control. The higher the NNT, the less effective the treatment.
It is defined as the inverse of the absolute risk reduction (or risk difference).
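The inverse relationship can be sketched directly (the 20% and 15% event rates are hypothetical):

```python
def number_needed_to_treat(risk_control, risk_treated):
    """NNT = 1 / absolute risk reduction (risk difference)."""
    absolute_risk_reduction = risk_control - risk_treated
    return 1 / absolute_risk_reduction

# Hypothetical trial: bad outcome in 20% of controls vs 15% of treated
print(round(number_needed_to_treat(0.20, 0.15)))  # 20
```

So roughly 20 patients must be treated to prevent one additional bad outcome; the same arithmetic applied to harms instead of benefits gives the NNH.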

Define Case Control Study
refers to studies that recruit subjects who have an outcome/disease and other subjects who do not. The study investigators then determine the (past) exposure/determinant status of each of their cases and controls. Such studies can only calculate an odds ratio; however, if the disease is rare, the odds ratio can be used synonymously with the RR.

What is a cross-sectional study?
refers to studies where subjects are selected without regard for the exposure or outcome status. The exposure and outcome status is determined by the study investigator at the same point in time.
can't use these studies to look at incidence
we don't know whether the exposure or outcome came first (problem with temporality)
Inclusion of prevalent cases can lead to bias such that markers of survival get mistaken for markers of disease

What are the two measures of disease frequency, and which ones do each of the study design types use?
 1) Incidence: measures the frequency of occurrence of new cases of a disease over a particular time
 2) Prevalence: measures the proportion of cases (old and new) of any disease/outcome at a particular point or period.
 Cross-sectional: prevalence rates (cannot provide incidence)
 Cohort studies: prevalence and incidence rates
 Case-control studies: prevalence rates (cannot provide incidence)

Define cumulative incidence
Incidence proportion, i.e. the proportion of a fixed population that develops the outcome of interest (becomes diseased) in a stated time period.

Define incidence density
The ratio of the number of new cases of the outcome of interest to the person-time at risk.
Determined by dividing the number of new cases of the disease in the population by the population-time accrued during follow-up.
This index has a measure of time in the denominator
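The two measures can be contrasted on hypothetical numbers (a fixed cohort of 100 people accruing 150 person-years of follow-up):

```python
# Hypothetical fixed cohort: 100 people followed for up to 2 years,
# 10 develop the disease, 150 person-years accrued in total.
new_cases = 10
population = 100
person_years = 150.0

cumulative_incidence = new_cases / population  # a proportion, no time unit
incidence_density = new_cases / person_years   # cases per person-year

print(cumulative_incidence)         # 0.1
print(round(incidence_density, 3))  # 0.067
```

Cumulative incidence (10%) has no time unit in the denominator; incidence density (about 67 per 1,000 person-years) does.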

Two ways in which estimates of survival that account for such censoring (i.e. person-time issues) can be obtained
1) Life table (actuarial) method: estimates the survival at regular intervals (e.g. at the end of each year)
2) Kaplan-Meier method: the estimate involves a recalculation of survival every time an event occurs. Usually has a step-like appearance
If the intervals in the life table analysis were made shorter, the analysis would look more like Kaplan-Meier.
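A minimal Kaplan-Meier sketch, recalculating survival at each event time (the follow-up data below are hypothetical):

```python
def kaplan_meier(times):
    """times: list of (time, event) pairs, event=1 for death, 0 for censored.
    Returns [(time, survival)] with survival recalculated at each event time."""
    times = sorted(times)
    n_at_risk = len(times)
    survival = 1.0
    curve = []
    i = 0
    while i < len(times):
        t = times[i][0]
        deaths = sum(1 for tt, e in times if tt == t and e == 1)
        removed = sum(1 for tt, e in times if tt == t)
        if deaths:
            survival *= (n_at_risk - deaths) / n_at_risk
            curve.append((t, survival))
        n_at_risk -= removed  # deaths and censored both leave the risk set
        i += removed
    return curve

# Hypothetical data: deaths at t=2 and t=5, censoring at t=3 and t=6
print(kaplan_meier([(2, 1), (3, 0), (5, 1), (6, 0)]))  # [(2, 0.75), (5, 0.375)]
```

Censored subjects (event=0) leave the risk set without stepping the curve down, which is how the person-time of incomplete follow-up is accounted for.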

What are odds ratios from case-control studies?
Incidence density ratios

What are the three axes used to look at study design?
 1) Directionality
 2) Sample Selection
 3) Timing

Define directionality
Investigator movement is forwards (from cause to effect) in cohort studies
Investigator movement is backwards (from effect to cause) in case-control studies
Exposure and outcome occur simultaneously in cross-sectional studies

Define sample selection
 Cohort: sampling on the basis of exposure status
 Case-control: sampling on the basis of outcome status
 Cross-sectional: random selection of subjects without regard for exposure/outcome status

Define Axis III: Timing
 If, for the subjects included in the study:
  exposure and outcome have already occurred: timing is historical
  exposure and outcome will occur during follow-up: timing is concurrent
  exposure has occurred but the outcome will occur during follow-up: timing is mixed

Difference between cohort and dynamic population?
Cohort refers to a type of population. Membership in a cohort is defined by an event; there is no exit from a cohort.
Dynamic populations are defined by a "state." Exit is possible and occurs when the state of an individual changes.

Define ecological studies
Populations are treated as the units of study. When inferences made at the population level are translated to inferences at the individual level, there is a potential for erroneous conclusions (the ecological fallacy).

Explain the difference between case-control and cohort studies given incidence calculations
Both studies can calculate incidence density ratios. However, cohort studies can also provide an estimate of incidence density (absolute risk), whereas case-control studies can only calculate quasi-incidence density rates, which are useful only for calculating incidence density ratios. This is because case-control studies merely sample the incidence density experience among noncases.

Describe the differences in matching in cohort and case-control studies
In cohort studies, matching for extraneous determinants of the outcome will ensure that the comparison being made is restricted to the determinant under consideration.
Matching in case-control studies is often mistakenly thought to have the same effect; however, imbalance in extraneous variables is a problem only when contrasting determinant categories (as in a cohort, e.g. smoking vs. non-smoking), not when contrasting cases and controls.
Matching is intuitively obvious in cohort studies, not in case-control studies.

Define Equipoise
Total uncertainty as to whether a drug works or not. This is the ethical ground for conducting a clinical trial.
Theoretical equipoise is said to exist when, overall, the evidence on behalf of two alternative treatment regimens is exactly balanced, and it is unknown which treatment should be preferred.

Define Efficacy and Effectiveness are they always different?
Efficacy: the effect of the drug in a controlled environment
Effectiveness: the effect of the drug in the population. Usually the effect decreases by 20-30% compared with controlled trials.
Once you state explicitly all the specifications of the drug in question (i.e. the dose, the route of administration, the duration), there should be no difference between efficacy and effectiveness.

Define inclusion criteria, what does it ensure?
the domain within which the question is to be answered (i.e. does drug X have an effect on pneumonia patients? The domain is pneumonia).
Inclusion criteria can also be used to confine the study to only a subdomain of people (i.e. a group at higher risk).
Inclusion criteria ensure that the study has conceptual and practical meaning in terms of answering the question posed at the outset of the clinical trial.

Three things exclusion criteria are used for
 Exclusion criteria are typically used to prevent recruitment of subjects:
 1) who may be hurt through idiosyncratic reactions to the drug in question (i.e. people who may be allergic to the drug)
 2) who will add "noise"
 3) who are likely to be non-compliant with the protocol.

What three features ensure the validity of a clinical trial?
 1) the use of randomization (and possibly also stratification) to assure the comparability of the populations
 2) the use of placebo or sham treatment for comparison, to assure the comparability of the effects
 3) the use of blinding to assure comparability of information and, in part, of effects.

Define randomization, how often does it work?
randomization produces comparable/balanced groups with respect to known AND unknown risk factors for the outcome. This ensures comparability of populations in the drug and placebo arms of the trial.
It works only on average, and only when the number of subjects randomized is large.

define stratification
refers to the manoeuvre by which subjects recruited into a trial are placed into separate groups (strata) before being randomized. This ensures that the intervention and placebo arms are exactly balanced with regard to the factor on which the stratification is carried out.
Typically only the most powerful risk factors are used for stratification: the ones that cannot be left until randomization.
Stratification happens before randomization; if done after, it is a completely different issue.

Give three reasons not to stratify
 1) a large trial where randomization alone would produce balanced groups
 2) multiple important outcomes
 3) special circumstances, e.g. each stratum would contain only a small number of people and confidentiality would be breached.

Define blocking
refers to the manoeuvre which ensures that equal numbers of subjects are allocated to the intervention and placebo arms at any time during the process of subject enrollment (given 2 arms, an allocation ratio of 1:1 is maintained).
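A sketch of permuted-block allocation for a two-arm trial (the block size, labels, and seed are arbitrary choices for illustration):

```python
import random

def blocked_allocation(n_subjects, block_size=4, seed=None):
    """Permuted-block randomization for two arms with 1:1 allocation.
    Within every block, half the slots are drug ('D') and half placebo ('P')."""
    rng = random.Random(seed)
    allocation = []
    while len(allocation) < n_subjects:
        block = ['D'] * (block_size // 2) + ['P'] * (block_size // 2)
        rng.shuffle(block)  # random order within the block
        allocation.extend(block)
    return allocation[:n_subjects]

alloc = blocked_allocation(20, block_size=4, seed=1)
# After every complete block the arms are exactly balanced
print(alloc[:8].count('D'), alloc[:8].count('P'))  # 4 4
```

The within-block shuffle keeps individual assignments unpredictable while the block structure keeps the running totals balanced throughout enrollment.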

How will you assess whether there are substantial differences in risk factor distributions between the drug and placebo arms? How do we solve this if it occurs?
Eyeball the frequency of the risk factor distributions in the drug and placebo arms of the trial and also look at the P value; this is almost invariably sufficient to evaluate whether randomization worked.
It is generally accepted that baseline differences in risk factors must be adjusted for before outcomes in the drug and placebo groups can be compared.

What are two ways to ensure validity of a study if blinding is not possible?
1) have the person who records the outcome assessment of the treatment and placebo groups be blinded
2) make the outcome as hard/objective as possible (i.e. death or severe disease)

define intention to treat
the analysis by intention to treat is the only analysis that respects the randomization. there is no assurance that the groups in any other analysis are comparable in terms of the known and unknown risk factors.

Define generalizability, what is the common view and why is it flawed?
generalizability is an issue distinct from validity; it concerns whether or not the results obtained in one group of people can be applied to another group (i.e. is a study in men applicable to women?)
The common view is that no, we must interpret evidence literally and cannot extrapolate results to any group not included in the trial.
Problems with this:
 1. When studies say the effect of a drug is modified by subjects' characteristics (age, gender, race), they are in fact referring to "effect modification". Look for effect modifiers.
 2. Common misconception: outcome rates vary by race, so a drug found to be useful for one race should be useful for others. Is that so? What are the problems with this proposition?
 a. There are differences between races
 b. If compliance is low, the issue raised is relevance, not generalizability
 c. Extrapolating results from adults to children or the elderly must be considered with caution in terms of medication dosages
 3. If we cannot generalize, why study at all? Because the individuals in any study vary somehow or other; we can never find exactly similar people to generalize our results to.

What is a clinical trial ideal and not ideal for?
 it is ideal for answering questions regarding efficacy of therapy. It is therefore critical for providing knowledge for causal prognosis setting. It has no/little bearing on the other cognitive products of medicine (i.e. diagnosis)
 it is not ideal for answering questions regarding unintended side effects, especially those that are infrequent (as trial sizes are usually too small to observe rare side effects).

Similarities and differences between confounder and effect modification?
 Confounding:
 Nuisance phenomenon that we wish to eliminate
 Third level variable
 Distorts the true relationship between disease and determinant (X and Y)
 RR in stratified analysis are the same
 Always want to adjust for it
 Effect modification:
 Feature of nature, a biological phenomenon that we wish to explain
 Third level variable
 It is a subject characteristic/factor whose presence/absence alters the effect of a determinant on a disease
 RR in stratified analysis are not the same
 Want to explain it, not adjust for it
 Sometimes apparent effect modification is an artifact! When? If you calculate the wrong effect measure. E.g. in the OCP and cardiovascular study (RR=3, the same for all age groups), calculating the RD across age groups would suggest a modifier when in fact there is none. Go with the RR when you have a modifier.

3 Condition for confounders?
1) The extraneous variable should be a risk factor for the disease or a marker for a risk factor. Which variable is the risk factor? That is based on a priori knowledge.
A variable cannot be a confounder if it merely correlates with the exposure but is not a risk factor for the disease.
2) The extraneous variable should be associated with the determinant. This relationship has to be present in the study data and is not based on a priori knowledge.
3) The confounder must not represent an intermediate step in the causal pathway between determinant and disease.

Two types of confounders?
Type I: an extraneous determinant of the outcome is the confounder (e.g. SES); imbalance in Table 1 (solution: randomization)
Type II: imbalance in co-interventions (other drugs given to the intervention arm); occurs post-randomization

How to remove confounders at the design stage?
 Randomization (in RCTs): to avoid risk factors being distributed unequally between the two groups
 Matching (cohort study): e.g. smokers and non-smokers recruited after matching for alcohol consumption in a study of CHD

How to remove confounders at the analysis stage?
  Adjustment by restriction
  stratification
  standardization
  regression analysis

What are 3 types of weighting schemes?
1) Mantel-Haenszel weights: the amount of information is based on the numbers in cells c and d and the total
2) Inverse of the variance
 Note: a larger stratum with more information gets more weight (sample size is only one part of the information)
3) Maximum likelihood methods: give more weight where there is more comparative information
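The Mantel-Haenszel pooling can be sketched for stratified risk data (the two strata below are hypothetical, each constructed to have a stratum-specific RR of 2):

```python
def mantel_haenszel_rr(strata):
    """Pooled Mantel-Haenszel risk ratio across strata.
    Each stratum is (a, b, c, d): exposed cases/non-cases,
    unexposed cases/non-cases."""
    num = den = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        num += a * (c + d) / n  # weighted exposed cases
        den += c * (a + b) / n  # weighted unexposed cases
    return num / den

# Two hypothetical strata, both with a stratum-specific RR of 2
strata = [(20, 80, 10, 90), (8, 92, 4, 96)]
print(round(mantel_haenszel_rr(strata), 2))  # 2.0
```

When the stratum-specific RRs agree, the pooled estimate reproduces them; when they differ, each stratum contributes in proportion to the information it carries.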

What is regression?
Modeling of data so that outcome occurrence is described as a function of its determinants. The effect of each X is independent of the effect of other X(s).

What is confounding by indication?
In a non-randomized trial, the outcome is worse among those receiving the intervention. Why? Because we are trying to prevent the outcome among those with the severe condition that is the indication for treatment. There is no randomization to control for known and unknown confounders.

What is residual confounding due to?
SES, behavioral, and lifestyle factors

Two reasons for residual confounders?
1) Sometimes we cannot measure the confounder precisely.
2) Sometimes we misclassify the information with regard to the determinant. This is less problematic. It can create two problems (residual confounding, effect modification).

What happens when we have non-differential (random) misclassification?
The effect estimate is biased toward the null, with a decrease in the slope of the relation.

Three solutions to random misclassification?
1) Design the study to be as free as possible from SES effects (e.g. compare factory workers with another group of factory workers). If controlling for SES drops the RR from 3 to 1.5, and you have not measured SES accurately, then perhaps the whole relation between X and Y would be abolished.
2) Determine the relation between outcome and determinant before and after control for SES. If the RR is the same before and after adjustment for SES, you are safe.
3) Measure SES as accurately as possible.

The logic of statistical inference, what are three points that we need to ponder?
If we use the 3rd percentile as the cut-off point on a growth chart, we will misclassify 3% of normal children. If we choose the 5th percentile, 5% are misclassified, and so on.
But the probability that we flag a truly malnourished/overweight child is more than 3%, 5%, ... How much higher? It depends on the country: higher in a severely malnourished country.
We will also misclassify some abnormal children as normal.

What are the two types of tests used to test for statistical significance? What are their conditions?
 1) Parametric tests
 2) Non parametric tests (distribution free tests)
The process underlying parametric testing for statistical significance includes:
 1) The normal distribution is described by 2 parameters: (1) the mean (a measure of central tendency) and (2) the SD (a measure of dispersion)
 90% of all values in a normal distribution fall within 1.645 SDs of the mean; 95% within 1.96 SDs; 99% within 2.576 SDs.
 2) Central Limit Theorem: the distribution of the sample mean will approach a normal distribution as the size of the sample increases to infinity. If the distribution of the variable in the population is normal, this will kick in much sooner.
 a) The mean of many samples drawn from the population will be identical to the true population mean.
 b) The standard error of the sample mean equals the population SD (σ) divided by the square root of the sample size: SE = σ/√n.
 c) The mean of our sample variable will have a normal distribution if
i. The original population has a normal distribution
ii. The sample size is large
3) 95% of sample means will fall within 1.96 SEs of the true mean. If we sample the Canadian population randomly and the true gender distribution is 0.51/0.49 (F/M), there is a <5% chance [(100-95)/2 = 2.5% in each tail] that our sample proportion will fall more than 1.96 SEs from 0.51.

Three steps for using non-parametric tests?
Step 1: Before using these tests, transform the data (log, square root) to change the distribution to normal, then calculate the OR or RR. If the distribution of the data is not normal, it can be logged.
Step 2: If that did not work, use the t-test (more powerful than non-parametric tests; used when "n" is too small).
Step 3: If that did not work (the distribution cannot be made normal), use a non-parametric (exact) test.

Define P-value and what is the frequentist theory?
P quantifies the probability of obtaining a result as extreme as (or more extreme than) the one observed, if H0 is true.
 Frequentist theory:
 1) if the H0 is true (no difference between drug and placebo) and
 2) we carry out 100 studies (comparing drug vs. placebo), and
 3) we compute a P value from each study,
then the P value will be <0.05 in only 5% of the studies. Or the probability that the P value is less than 0.05 “by chance alone” is 5%.
It implies that if H0 is not true (H1 is true), then the probability that the P value obtained in a single study will be <0.05 is more than 5%. How much more depends on the true effect size and the study's power.
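The frequentist claim can be checked by simulation: with a true H0, roughly 5% of studies should yield P < 0.05 (the 20% event rate, arm size of 200, and 2000 simulated studies are arbitrary choices for illustration):

```python
import math
import random

def two_proportion_p(x1, n1, x2, n2):
    """Two-sided z-test p-value comparing two proportions."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        return 1.0
    z = abs(p1 - p2) / se
    # two-sided p from the standard normal CDF
    return 2 * (1 - 0.5 * (1 + math.erf(z / math.sqrt(2))))

rng = random.Random(42)
trials = 2000
n_sig = 0
for _ in range(trials):
    # H0 is true: both arms have the same 20% event rate
    x1 = sum(rng.random() < 0.2 for _ in range(200))
    x2 = sum(rng.random() < 0.2 for _ in range(200))
    if two_proportion_p(x1, 200, x2, 200) < 0.05:
        n_sig += 1

print(n_sig / trials)  # roughly 0.05
```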

Relationship between Ho, P, and alpha?
 Two important points:
  If H0 is true, the probability that P < alpha equals alpha (for any cut-off point)
  If H0 is not true (H1 is true), the probability that P < alpha is greater than alpha.
However, in the real world we do not know what is happening in the population.
For any question on this issue, just look at the alpha given (0.02, 0.05, ...) and check whether H0 is assumed true or not:
  If H0 is true, then the probability that P < alpha equals alpha;
  if H0 is not true (H1 is true), then the probability that P < alpha is greater than alpha.

What does the frequentist theory tell us about confidence intervals?
Frequentist theory assures us that
  if we calculate a rate or a mean based on a sample of population and
  compute 95% CI around the rate or mean
then, the population rate or mean will be contained within our 95% CI in 95 of every 100 times we sample the population and calculate the rate or mean and 95% CI.
 A 90% CI is a narrower and a 99% CI a wider limit.
It is important how you interpret the CI.
Say: 95% of intervals constructed this way from repeated samples will contain the true value (the parameter, µ); do not say that this particular interval contains the true value with 95% probability.
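The coverage claim can also be checked by simulation (the true proportion of 0.3, sample size of 500, and 2000 repetitions are arbitrary; a simple Wald interval is used for the sketch):

```python
import math
import random

def coverage_of_95ci(true_p=0.3, n=500, n_studies=2000, seed=7):
    """Fraction of repeated samples whose 95% Wald CI covers the true proportion."""
    rng = random.Random(seed)
    covered = 0
    for _ in range(n_studies):
        x = sum(rng.random() < true_p for _ in range(n))
        p_hat = x / n
        se = math.sqrt(p_hat * (1 - p_hat) / n)
        lo, hi = p_hat - 1.96 * se, p_hat + 1.96 * se
        if lo <= true_p <= hi:
            covered += 1
    return covered / n_studies

print(coverage_of_95ci())  # close to 0.95
```

Each simulated study is one "sampling of the population"; about 95 in every 100 of the resulting intervals contain the true value, as the frequentist theory promises.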

Which is more superior? P value or CI?
CI is superior to the P value because it gives more information.
  The P value alone just tells you whether H0 is rejected or not; you must then look at the ratio, difference, etc. to get a sense of the magnitude of the effect
  The CI adds a sense of precision to the information
  The rate ratio gives the magnitude of the effect.

When comparing means and CIs, what three scenarios are possible and what do they mean?
1) The 95% CIs do not overlap: P<0.05, reject the H0
2) The 95% CI of one rate includes the point estimate of the second rate: P>0.05, cannot reject H0
3) The 95% CIs of the two rates just overlap but do not include each other's point estimates: not clear whether P<0.05 or P>0.05; a significance test is needed.

What are the 4 main points addressed in Miettinen's paper regarding the differences between cohort and case-control studies?
1) Causal evidence about determinant-outcome relationships cannot be obtained except through studies of incidence. Measuring incidence rates among different determinant categories requires the following of populations and documentation of their experience.
2) The cohort study represents a design option in which the study experience is documented with a census conducted with regard to cases and a census conducted with regard to noncases as well.
3) The case-control study represents a design option in which the study experience is documented with a census conducted with regard to cases and a sampling of the experience with regard to noncases.
4) There are two types of case-control studies, loosely referred to as the population-based case-control study and the hospital-based case-control study.

What is a population-based case-control study?
Primary study base, secondary scheme for case ascertainment
The study base is defined first (i.e. case-control studies carried out within a defined cohort, clinical trial, city, etc.)
The challenge is complete case ascertainment; controls should be selected as neighbours or by other random sampling techniques.

What is a hospital-based case-control study?
Primary scheme for case ascertainment, secondary study base
Cases are identified first, from a hospital registry
The challenge is sampling the study base experience (i.e. the catchment population of the hospital or cancer registry, etc.)
Practical approach: identify a disease/condition that has the same referral pattern as your case disease/condition but is NOT associated with the determinant under study.

What is the difference between matching in cohort and matching in case-control studies?
 Matching in cohort studies bears directly on validity (a similar effect to stratification in an RCT). In a cohort study and an RCT, this ensures that the confounding variable is equally distributed across the determinant contrast.
 Difference: matching in a cohort involves selection with regard to subject recruitment, whereas stratification in a clinical trial has no bearing on entry of subjects into the study.
Matching in case-control studies does not bear on validity. We already expect beforehand that risk factors for the outcome will be more prevalent among cases than noncases. We match in case-control studies for efficiency reasons: we ensure that there are cases and controls in each stratum of the confounder. If we don't do this, one of our cells will have 0 cases or controls, and this will cause problems with our model when adjusting for confounders.

How is a case-control study related to etiognostic probability? What is the equation for etiognostic probability?
Case-control studies provide the rate ratio (RR) linking the etiologic determinant and the disease.
Clinically, we are interested in the etiognostic probability that the agent (exposure) in question was responsible for the disease. It is obtained through this equation:
Etiognostic probability = (RR - 1)/RR
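The equation is a one-liner (the RR of 4 is a hypothetical value for illustration):

```python
def etiognostic_probability(rate_ratio):
    """Probability that the exposure caused the disease in an exposed case:
    the attributable fraction among the exposed, (RR - 1) / RR."""
    return (rate_ratio - 1) / rate_ratio

# Hypothetical: an RR of 4 gives a 75% probability the exposure was causal
print(etiognostic_probability(4))  # 0.75
```

Note that an RR of 1 gives a probability of 0 (no excess risk to attribute), and the probability approaches 1 only as the RR grows very large.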

