The End Of Knowledge - Vault 2 - ICLR (2014-2023)

graph LR
classDef healthcare fill:#f9d4d4, font-weight:bold, font-size:14px;
classDef parkinson fill:#d4f9d4, font-weight:bold, font-size:14px;
classDef mortality fill:#d4d4f9, font-weight:bold, font-size:14px;
classDef challenges fill:#f9f9d4, font-weight:bold, font-size:14px;
A[Suchi Saria<br>ICLR 2018] --> B[Data-driven transformation 1]
A --> C[Policy changes incentivized<br>patient data digitization 2]
A --> D[Parkinson's affects millions,<br>treatment manages symptoms 3]
D --> E[Measuring Parkinson's severity<br>is subjective, infrequent 4]
D --> F[Android app collects<br>Parkinson's symptom data 5]
F --> G[Semi-supervised learning used<br>for severity mapping 6]
F --> H[Objective function learns<br>severity from rankings 7]
D --> I[mPDS correlates with<br>clinical instruments prospectively 8]
D --> J[mPDS enables frequent,<br>objective symptom measurement 9]
A --> K[Approach applicable to<br>other diseases 10]
A --> L[Understanding domain crucial<br>for problem formulation 11]
A --> M[Semi-supervised learning, sensors<br>enable limited supervision 12]
A --> N[Early diagnosis recognizes<br>diseases before symptoms 13]
A --> O[Machine learning may<br>enable early recognition 14]
A --> P[Mortality risk prediction<br>uses initial measurements 15]
P --> Q[Different datasets produce<br>conflicting risk scores 16]
P --> R[Unknown interventions affect<br>risk between prediction and outcome 17]
P --> S[ML averages over unseen factors,<br>causing sensitivity 18]
P --> T[Counterfactual reasoning models<br>risk under interventions 19]
T --> U[Counterfactual Gaussian process<br>estimates risk trajectories 20]
T --> V[Counterfactual model produces<br>stable risk estimates 21]
A --> W[Controlling for interventions<br>important beyond healthcare 22]
A --> X[Robust ML requires<br>iterative process 23]
A --> Y[Applied to early<br>sepsis prediction 24]
A --> Z[Challenges: problem framing, robustness,<br>validation, uncertainty, collaboration 25]
A --> AA[Healthcare offers impactful<br>ML applications 26]
A --> AB[Survival analysis more<br>natural than classification 27]
A --> AC[Constraints enforce expected<br>Parkinson's progression, sensitivity concerning 28]
A --> AD[Deep learning raises generalization,<br>uncertainty quantification concerns 29]
A --> AE[Problem formulation, bias, uncertainty,<br>weak supervision prioritized 30]
class A,B,C,K,L,M,N,O,W,X,Y,Z,AA,AD,AE healthcare;
class D,E,F,G,H,I,J,AC parkinson;
class P,Q,R,S,T,U,V,AB mortality;
class Z challenges;

Summary:

1.-Healthcare is undergoing a data-driven transformation, enabling novel software-based interventions to improve care quality and reduce costs.

2.-Policy changes in 2008-2010 incentivized healthcare systems to digitize patient data, making diverse datasets available for analysis.

3.-Parkinson's disease affects millions, costing billions annually, and treatment focuses on symptom management as the disease progresses.

4.-Measuring Parkinson's severity is challenging, relying on subjective assessments by neurologists during infrequent clinic visits.

5.-The team developed an Android app using phone sensors to collect active test data on Parkinson's symptoms from patients at home.

6.-Obtaining clinical ratings to map sensor data to severity is expensive, so they used semi-supervised learning with comparison pairs instead.

7.-An objective function was optimized to learn a severity score from sensor data that is concordant with clinician severity rankings.

8.-The mobile Parkinson's disease score (mPDS) showed high correlation with standard clinical instruments in a prospective study.

9.-mPDS enables frequent, objective measurement of Parkinson's symptoms and progression at home, which was not previously possible.

10.-This approach of augmenting clinical capability with machine learning on device data is applicable to many other diseases.

11.-Achieving clinical impact required deeply understanding the medical domain to innovate on problem formulation and data collection.

12.-Using semi-supervised learning and exploiting sensor data enabled learning clinically meaningful scores with limited supervision.

13.-Early diagnosis aims to recognize diseases earlier than current diagnostic guidelines based on visible signs and symptoms.

14.-Early recognition could enable prevention in some cases, and machine learning on patient data may make this possible.

15.-Mortality risk prediction in hospitals tries to use initial patient measurements to forecast death risk during admission.

16.-Training ML models on different hospital datasets can produce conflicting risk scores for the same patient.

17.-This problem arises because risk is affected by unknown interventions that occur between prediction time and outcome.

18.-ML averages over these unseen factors, causing model predictions to be sensitive to dataset-specific intervention patterns.

19.-Counterfactual reasoning is proposed, explicitly modeling risk under specific intervention regimes to control for these hidden factors.

20.-A counterfactual Gaussian process model is developed to estimate risk trajectories under different intervention regimes from time series.

21.-On simulated data, the counterfactual model produces stable risk estimates across intervention regimes while a standard model's estimates vary.

22.-Controlling for intervention regime when forecasting from temporal data is important for decision support applications beyond healthcare.

23.-Developing robust ML systems for healthcare requires an iterative process of model building, diagnostics, problem reformulation, and actionability assessment.

24.-This approach has been applied to early prediction of sepsis, a life-threatening condition, to enable early treatment.

25.-Further challenges include framing the right problems, robustness to unexpected scenarios, expensive gold-standard data for validation, calibrated uncertainty estimates, and human-machine collaboration.

26.-Healthcare offers many impactful application areas for machine learning as novel data sources become available.

27.-Survival analysis accounting for censoring is a more natural formulation than binary classification for the mortality forecasting problem.

28.-Adding constraints to the Parkinson's model to enforce the expected progression over disease stages is possible, but sensitivity is a concern.

29.-Deep learning for healthcare raises concerns about generalization to out-of-distribution patients and the adequacy of uncertainty quantification methods.

30.-In the speaker's experience, problem formulation, bias adjustment, uncertainty estimation, and weak supervision have been higher priority than complex models.
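
Points 6 and 7 describe learning a severity score from comparison pairs rather than absolute clinical ratings. The sketch below is one possible way to illustrate that idea, not the mPDS method itself: it fits a linear score with a pairwise logistic ranking loss on synthetic data. The function names (`fit_pairwise_score`, `concordance`), the linear form of the score, the loss, and all data are assumptions made for illustration.

```python
import numpy as np


def fit_pairwise_score(X, pairs, lr=0.1, epochs=500, l2=1e-3):
    """Learn weights w so that X[i] @ w > X[j] @ w for each (i, j) in pairs.

    Each pair is (more_severe_index, less_severe_index); no absolute
    severity labels are required.
    """
    X = np.asarray(X, dtype=float)
    hi = np.array([p[0] for p in pairs])
    lo = np.array([p[1] for p in pairs])
    diff = X[hi] - X[lo]                      # feature difference per pair
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        margin = diff @ w
        # Gradient of mean log(1 + exp(-margin)) plus an L2 penalty.
        weight = 1.0 / (1.0 + np.exp(margin))
        grad = -(diff * weight[:, None]).mean(axis=0) + l2 * w
        w -= lr * grad
    return w


def concordance(scores, pairs):
    """Fraction of pairs that the scores order correctly."""
    return float(np.mean([scores[i] > scores[j] for i, j in pairs]))


# Synthetic stand-in for sensor features (hypothetical, not study data).
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 3))
latent_severity = X @ np.array([2.0, -1.0, 0.5])  # hidden from the learner

pairs = []
for _ in range(200):
    i, j = rng.choice(60, size=2, replace=False)
    pairs.append((i, j) if latent_severity[i] > latent_severity[j] else (j, i))

w = fit_pairwise_score(X, pairs)
print("concordance on training pairs:", concordance(X @ w, pairs))
```

The learner only ever sees which member of each pair is more severe, which mirrors the motivation in point 6: comparisons are cheaper to obtain than full clinical ratings.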
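
Points 16 to 19 argue that a risk model which ignores interventions inherits each hospital's treatment policy. The toy simulation below is a sketch of that effect under strong simplifying assumptions: a single binary severity flag, a single observed treatment, and treatment assigned at random given severity. The death probabilities, treatment rates, and function names are invented for illustration; this is not the counterfactual Gaussian process model from the talk.

```python
import numpy as np


def simulate_hospital(treat_prob_if_severe, n=40000, seed=0):
    """Simulate one hospital with its own treatment policy (toy model)."""
    rng = np.random.default_rng(seed)
    severe = rng.random(n) < 0.5
    treated = severe & (rng.random(n) < treat_prob_if_severe)
    # Assumed death probabilities, identical across hospitals:
    # severe untreated 0.40, severe treated 0.10, mild 0.05.
    p_death = np.where(severe, np.where(treated, 0.10, 0.40), 0.05)
    died = rng.random(n) < p_death
    return severe, treated, died


def naive_risk(severe, treated, died):
    """Risk for severe patients, averaging over whatever treatment occurred."""
    return float(died[severe].mean())


def risk_without_treatment(severe, treated, died):
    """Risk for severe patients under the fixed regime 'no treatment'."""
    return float(died[severe & ~treated].mean())


hospital_a = simulate_hospital(treat_prob_if_severe=0.9, seed=1)
hospital_b = simulate_hospital(treat_prob_if_severe=0.2, seed=2)

print("naive risk, A vs B:", naive_risk(*hospital_a), naive_risk(*hospital_b))
print("risk if untreated, A vs B:",
      risk_without_treatment(*hospital_a), risk_without_treatment(*hospital_b))
```

In this setup the naive estimates differ between hospitals because they average over different treatment rates, while the estimates under a fixed regime agree up to sampling noise. With hidden or time-varying interventions, simple stratification like this is no longer sufficient, which is the gap the counterfactual approach in points 19 to 21 targets.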
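
Point 27 notes that survival analysis, which accounts for censoring, is a more natural framing than binary classification. As a minimal sketch of why, the standard Kaplan-Meier estimator below is compared with a naive calculation that counts censored patients as survivors. The follow-up times and event flags are made-up illustrative values, and `kaplan_meier` is a hypothetical helper name.

```python
import numpy as np


def kaplan_meier(times, events):
    """Kaplan-Meier survival curve.

    times:  follow-up time for each patient
    events: 1 if death was observed at that time, 0 if the patient was censored
    Returns (event_times, survival_probability_after_each_event_time).
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    event_times = np.unique(times[events])
    survival = []
    s = 1.0
    for t in event_times:
        at_risk = np.sum(times >= t)            # still under observation at t
        deaths = np.sum((times == t) & events)  # deaths observed exactly at t
        s *= 1.0 - deaths / at_risk
        survival.append(s)
    return event_times, np.array(survival)


# Made-up follow-up data: three patients are censored (event flag 0).
times = [2, 3, 3, 5, 8, 8, 9]
events = [1, 0, 1, 1, 0, 1, 0]

event_times, survival = kaplan_meier(times, events)
naive_survival = 1.0 - np.mean(events)  # treats censored patients as survivors

print("event times:", event_times)
print("Kaplan-Meier survival:", survival)
print("naive survival fraction:", naive_survival)
```

Here the naive fraction is higher than the final Kaplan-Meier estimate, because patients who left observation early are counted as if they had survived the whole period.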

Knowledge Vault built by David Vivancos 2024