quals: clinical data mining part V

tips for predictive modeling project: data clean up will take about ___% of the time

80% - do not take short cuts here
tips for predictive modeling project: try _____ things first

simple
tips for predictive modeling project: don't get fooled by AUC, examine...

precision recall, calibration, net-reclassification
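A minimal sketch of why precision and recall are worth checking alongside AUC, assuming binary labels (1 = event) and a hypothetical 0.5 decision threshold. The toy data and the function name `precision_recall` are illustrative, not from the cards.

```python
def precision_recall(y_true, y_score, threshold=0.5):
    """Return (precision, recall) when scores >= threshold are called positive."""
    tp = fp = fn = 0
    for truth, score in zip(y_true, y_score):
        predicted_positive = score >= threshold
        if predicted_positive and truth == 1:
            tp += 1
        elif predicted_positive and truth == 0:
            fp += 1
        elif not predicted_positive and truth == 1:
            fn += 1
    # Guard against empty denominators (no predicted or no actual positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical toy data: 2 events among 10 patients (imbalanced outcome).
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_score = [0.9, 0.4, 0.8, 0.7, 0.3, 0.2, 0.2, 0.1, 0.1, 0.1]
precision, recall = precision_recall(y_true, y_score, threshold=0.5)
print(precision, recall)
```

With a rare outcome, a handful of false positives can dominate the flagged group, which is the kind of problem a single ranking summary can hide.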
tips for predictive modeling project: ask whether more ____ and/or ____ will increase performance, and whether ____ from different models are correlated.
- data
- features
- errors
tips for predictive modeling project: don't get attached to...

one model
tips for predictive modeling project: think about model deployment (2)
- ease of applying the model
- cost of taking action
learning algorithms with bigger search spaces have ____ bias, but ____ variance.
- less bias
- more variance
shrinking the search space reduces the ____, at the cost of increasing ____
- variance
- bias
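The trade-off in the two cards above is often summarized by the standard decomposition of expected squared error. This is a sketch assuming squared-error loss; the symbols (true function $f$, fitted model $\hat{f}$, noise variance $\sigma^2$) are not in the original cards.

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{variance}}
  + \sigma^2
```

A bigger search space tends to shrink the first term and grow the second; shrinking the search space does the reverse.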
ways to evaluate predictive models
- AUROC
- AUPRC
- train, test, validation sets
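One possible way to build train, validation, and test sets, sketched with the standard library only. The 60/20/20 proportions, the seed, and the name `train_val_test_split` are assumptions for illustration. Splitting on patient IDs rather than individual visits is also an assumption, chosen so that repeated observations of one patient stay in a single set.

```python
import random


def train_val_test_split(records, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle records once and cut them into train, validation, and test lists."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_test = int(len(shuffled) * test_frac)
    n_val = int(len(shuffled) * val_frac)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


# Hypothetical patient identifiers.
patient_ids = list(range(100))
train, val, test = train_val_test_split(patient_ids)
print(len(train), len(val), len(test))
```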
error metrics: regression (2)
- mean-squared error
- absolute error
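The two regression metrics can be sketched as below. The card says "absolute error"; this example assumes the mean absolute error is meant. The numbers are made up.

```python
def mean_squared_error(y_true, y_pred):
    """Average of squared differences; penalizes large misses heavily."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_error(y_true, y_pred):
    """Average of absolute differences; less sensitive to outliers."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical observed and predicted values.
y_true = [3.0, 5.0, 2.0, 7.0]
y_pred = [2.0, 5.0, 4.0, 4.0]
mse = mean_squared_error(y_true, y_pred)
mae = mean_absolute_error(y_true, y_pred)
print(mse, mae)
```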
error metrics: classification (class labels) (3)
- misclassification error
- Cohen's kappa
- sensitivity, specificity, etc.
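A sketch of the three class-label metrics computed from a 2x2 confusion matrix, assuming binary labels (1 = positive) and that both classes appear in the data. Function names and data are hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def class_label_metrics(y_true, y_pred):
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    n = tp + tn + fp + fn
    observed = (tp + tn) / n  # observed agreement (accuracy)
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    return {
        "misclassification_error": 1 - observed,
        # Undefined if expected == 1 (all labels and predictions in one class).
        "kappa": (observed - expected) / (1 - expected),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Hypothetical labels and predictions for 10 patients.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = class_label_metrics(y_true, y_pred)
print(metrics)
```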
error metrics: classification (probability) (2)
- AUCs
- calibration
error metrics: discrimination
- distinguish those who will die from those who will survive
- two bins: dead or alive
error metrics: calibration
- fine-grained accuracies at different levels of risk
- e.g. what fraction of patients placed by the score into the 35-40% survival bin actually survived?
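The bin check described above can be sketched as follows, assuming `y_true` is 1 when the patient survived and `y_prob` is the predicted survival probability. Equal-width bins, the bin count, and the data are assumptions for illustration.

```python
def calibration_table(y_true, y_prob, n_bins=5):
    """Group predictions into equal-width bins and compare mean predicted
    probability with the observed outcome rate in each bin."""
    bins = [[] for _ in range(n_bins)]
    for truth, prob in zip(y_true, y_prob):
        index = min(int(prob * n_bins), n_bins - 1)  # prob == 1.0 goes in last bin
        bins[index].append((truth, prob))
    rows = []
    for index, members in enumerate(bins):
        if not members:
            continue  # skip empty bins
        observed = sum(t for t, _ in members) / len(members)
        predicted = sum(p for _, p in members) / len(members)
        # (bin lower edge, bin upper edge, count, mean predicted, observed rate)
        rows.append((index / n_bins, (index + 1) / n_bins,
                     len(members), predicted, observed))
    return rows


# Hypothetical outcomes (1 = survived) and predicted survival probabilities.
y_true = [0, 0, 1, 0, 1, 1, 0, 1, 1, 1]
y_prob = [0.05, 0.15, 0.30, 0.35, 0.45, 0.55, 0.65, 0.70, 0.85, 0.95]
table = calibration_table(y_true, y_prob, n_bins=5)
for row in table:
    print(row)
```

A well-calibrated score has an observed rate close to the mean predicted probability in every bin.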
error metrics: underfit
- high bias
- low variance
error metrics: overfit
- low bias
- high variance
field test for a predictive model (questions to ask) (6)
- what would the intervention be?
- who dispenses the intervention?
- what is the threshold for action - i.e. what level of false positives is acceptable in the predictions?
- what is the outcome we track - consult rates, time between AD setup and death
- what is the performance measure of the prediction - physician agreement, useful consults, actual accuracy
- what would be the mechanics of dispensing the intervention - what is the capacity to intervene
goals of data mining clinical text (7)
- biosurveillance
- automatic terminology management
- decision support
- automatic deidentification
- document coding
- cohort building
- discover new knowledge (text-mining)
biosurveillance

monitor next outbreak
automatic terminology management

add disease terms that are used but not in your list
decision support

recommend treatments
automatic deidentification

NLP challenges
document coding

query and reporting
cohort building

enable clinical research
discover new knowledge (text-mining)

make discoveries using informatics tools
key issues with data mining using EHR data (4)
- haiku of acronyms: ungrammatical, misspellings, concatenations, acronyms, abbreviations
- high variance in quality: clear communication (radiology reports) vs. documentation (progress notes)
- lot of copy-pasting: institution-specific template use
- ridiculous amount of agony in getting access: fear, misunderstanding, and confusion around security, privacy, de-identification, and anonymization
common sources of bias in EHR data (6)
- insurance level - patients without coverage are less likely to seek professional care
- misdiagnosis, miscoding of drugs
- incomplete record keeping
- miscoding of diagnosis
- loss to follow-up
- incomplete/false record linkage
Things to worry about with data mining in EHR data (4)
- repeated observations
- irregular time intervals
- large number of (sparse) features
- timing - ordering of events is crucial, different questions require different time scales
learning health care system
- using data to directly impact point-of-care
- i.e. physicians look for evidence from other patients in the EHR and apply it to the current patient's case
- allows researchers to do studies on the fly
- not adopted yet at the point of care
- STRIDE should be used for research, not patient care - but something like this would be a good application of the learning health system
what are national efforts to enable reuse of electronic health data for research?
- OMOP (Observational Medical Outcomes Partnership)
- OHDSI (Observational Health Data Sciences and Informatics)
- i2b2 (Informatics for Integrating Biology and the Bedside)

Author

tulipyoursweety

350173

Updated

2019-12-31T05:35:31Z
