quals: clinical data mining part I

  1. Sources of clinical data in a health care system (7)
    • 1) Government - census and demographics, epidemiological data, economic data
    • 2) Hospital/clinical/doctor - health records, doctor notes, queries
    • 3) Pharma/biotech/medical devices/diagnostics - domain knowledge
    • 4) drug store - sales
    • 5) internet/library/journals - domain knowledge, public databases
    • 6) individual/personal - fitness trackers, home monitors, forums
    • 7) insurance companies - claims
  2. Types of clinical data from the government (3). Why is it generated?
    • census and demographics - structured
    • epidemiological data - structured
    • economic data - structured
    • Uses: public use, policy making
  3. Types of clinical data from the hospital/clinic/doctor (3). Why is it generated?
    • health records - structured (procedures, diagnoses, prescriptions) and unstructured (images: x-rays, CT scans, slides, photographs)
    • doctor notes - unstructured
    • queries - unstructured
    • uses: operations, quality, billing
  4. Types of clinical data from the pharma/biotech/medical devices/diagnostics (1). Why is it generated?
    • domain knowledge - structured
    • uses: marketing, business intelligence
  5. Types of clinical data from the drug store (1). Why is it generated?
    • sales - structured
    • uses: operations, marketing
  6. Types of clinical data from the internet/library/journals (2). Why is it generated?
    • domain knowledge - unstructured
    • public databases - structured
    • uses: public use, professional gain
  7. Types of clinical data from the individual/personal (3). Why is it generated?
    • fitness tracker - structured (heartbeat monitor - unstructured)
    • home monitors - structured
    • forums - unstructured
    • uses: personal interest, caregiver records
  8. Types of clinical data from the insurance companies (1). Why is it generated?
    • claims - structured
    • uses: business intelligence, billing
  9. Key players in healthcare in the United States (3)
    • patients
    • insurers and government policymakers
    • clinicians
  10. Views and priorities of patients in the healthcare system
    • view: what is right for me
    • prevention and care
    • information and unbiased guidance
    • perceived value
  11. Views and priorities of insurers and government policy makers in the healthcare system
    • view: what is best for society
    • measured effectiveness
    • access and cost
  12. Views and priorities of clinicians in the healthcare system
    • view: what is best for medicine
    • professionalism
    • autonomy
    • science and technology
  13. What is an observational study? (3)
    • uncontrolled assignments
    • researcher does not interfere with the world
    • subjects self-select into comparator groups
  14. Advantages of observational studies (2)
    • can use secondary data
    • avoid ethical issues
  15. Issues using evidence from observational data in patient care? (5)
    • misclassification bias
    • selection bias
    • missing data
    • up-coding
    • ground truth difficult to source
  16. misclassification bias
    • data is labeled incorrectly
    • exposure/treatment misclassification
    • disease/outcome misclassification
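
A minimal worked sketch of exposure misclassification, assuming a made-up 2x2 table and assumed recording accuracy (sensitivity 0.8, specificity 0.9) that is the same for cases and controls. None of these numbers come from the card set. Under these assumptions, the observed odds ratio is pulled toward 1.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Odds ratio of a 2x2 exposure-by-disease table."""
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)


def misclassify_exposure(exposed, unexposed, sensitivity, specificity):
    """Expected recorded counts when true exposure is labeled imperfectly.

    sensitivity: probability a truly exposed person is recorded as exposed
    specificity: probability a truly unexposed person is recorded as unexposed
    """
    recorded_exposed = exposed * sensitivity + unexposed * (1 - specificity)
    recorded_unexposed = exposed * (1 - sensitivity) + unexposed * specificity
    return recorded_exposed, recorded_unexposed


# Hypothetical true counts (illustrative only).
true_cases = (200, 800)      # (exposed, unexposed)
true_controls = (100, 900)   # (exposed, unexposed)

true_or = odds_ratio(*true_cases, *true_controls)

# Same error rates in cases and controls (non-differential misclassification).
obs_cases = misclassify_exposure(*true_cases, sensitivity=0.8, specificity=0.9)
obs_controls = misclassify_exposure(*true_controls, sensitivity=0.8, specificity=0.9)
observed_or = odds_ratio(*obs_cases, *obs_controls)

print(f"true OR = {true_or:.2f}, observed OR = {observed_or:.2f}")
```

If the error rates differ between cases and controls (differential misclassification), the bias can go in either direction, so the attenuation seen here should not be assumed in general.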
  17. selection bias in cohort studies (2)
    • loss to follow-up bias: people who didn't get the treatment have less interaction with their health system and are less likely to follow up, so their disease goes unobserved
    • recording bias: exposure is more likely to have been recorded if the disease is likely (e.g. if the doctor suspects the patient could get lung cancer, they are more likely to ask if they smoke)
  18. selection bias in case-control studies (2)
    • self-selection bias: increased likelihood of sampling from a particular group (e.g. sick exposed people are more likely to participate in the study)
    • control selection bias: the process for picking controls makes them more likely to be unexposed (e.g. controls are drawn from the community)
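
A small sketch of how self-selection can manufacture an association in a case-control study. The population counts and participation rates are invented for illustration. The population has no true association (odds ratio 1), but exposed cases are assumed to enroll more often than everyone else.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Odds ratio of a 2x2 exposure-by-disease table."""
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)


# Hypothetical source population with no exposure-disease association.
population = {
    "exposed_cases": 100,
    "unexposed_cases": 400,
    "exposed_controls": 200,
    "unexposed_controls": 800,
}

# Assumed participation probabilities: sick exposed people self-select more often.
participation = {
    "exposed_cases": 0.9,
    "unexposed_cases": 0.5,
    "exposed_controls": 0.5,
    "unexposed_controls": 0.5,
}

# Expected number of participants in each cell of the study sample.
sample = {cell: population[cell] * participation[cell] for cell in population}

population_or = odds_ratio(**population)
sample_or = odds_ratio(**sample)

print(f"population OR = {population_or:.2f}, study-sample OR = {sample_or:.2f}")
```

The sample odds ratio equals the population odds ratio multiplied by the ratio of participation odds, so any selection that depends jointly on exposure and disease status distorts the estimate.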
  19. up-coding
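    • biases in billing codes assigned to patients (e.g. coding a more severe or more costly diagnosis or procedure than the record supports, to raise reimbursement)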
  20. 10 steps to good big data studies
    • 1) clearly state the type of question
    • 2) report standard diagnostics
    • 3) report the number of hypotheses tested
    • 4) report exact cohort definitions
    • 5) empirically calibrate significance thresholds and confidence intervals
    • 6) test for, and quantify, non-stationarity in the data used
    • 7) examine multiple dimensions of performance (calibration, URPC, effect size, estimated FDR)
    • 8) quantify the "instability" of the analysis (i.e. variance in the face of alternative study designs and data sources)
    • 9) replication - over time, across sites, and using different study designs
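
Steps 3 and 7 above (number of hypotheses tested, estimated FDR) can be illustrated with a short sketch. The card set does not name a method. The Benjamini-Hochberg procedure is used here as one common choice, and the p-values are invented for illustration.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans, True where the hypothesis is rejected
    under the Benjamini-Hochberg step-up procedure at FDR level alpha."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Largest rank k such that the k-th smallest p-value <= (k / m) * alpha.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank

    # Reject every hypothesis ranked at or below max_rank.
    rejected = [False] * m
    for idx in order[:max_rank]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values from ten tests run on the same data set.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]

naive_hits = sum(p < 0.05 for p in p_values)   # ignores how many tests were run
rejected = benjamini_hochberg(p_values, alpha=0.05)
bh_hits = sum(rejected)

print(f"p < 0.05 without correction: {naive_hits}; after BH at FDR 0.05: {bh_hits}")
```

Reporting the number of hypotheses tested matters because the same p-value clears the threshold in a single-test study but may not once the other tests are counted.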
  21. fruitful areas of activity (4)
    • risk stratification: cost, latent disease, decompensation
    • personalizing evidence: risk for me, what treatment will work
    • understanding disease: insights to disease progression, using passively collected data for outcome assessment
    • practice management: predicting missed appointments, medication adherence, cost-blooms, complications, care coordination, population health management
bmi 215 quals questions