Taking the Pulse Of Ethical ML in Health
Marzyeh Ghassemi
graph LR
    classDef ethical fill:#f9d4d4,font-weight:bold,font-size:14px
    classDef biases fill:#d4f9d4,font-weight:bold,font-size:14px
    classDef interaction fill:#d4d4f9,font-weight:bold,font-size:14px
    classDef regulation fill:#f9f9d4,font-weight:bold,font-size:14px
    A[Taking the Pulse<br>Of Ethical ML<br>in Health] --> B[Ethical<br>AI]
    A --> C[Biases<br>Disparities]
    A --> D[Human-AI<br>Interaction]
    A --> E[Regulation<br>Approaches]
    B --> B1[Ethical AI:<br>Address biases,<br>fairness, safety. 1]
    B --> B2[Triage models:<br>Prioritize patients,<br>balance efficiency. 2]
    B --> B3[Human error:<br>Preventable adverse<br>events, misdiagnoses. 3]
    B --> B4[RCT limitations:<br>Limited applicability,<br>diverse populations. 4]
    C --> C1[AI biases:<br>Perpetuate, amplify<br>existing biases. 5]
    C --> C2[Racial bias:<br>AI detects race<br>from images. 6]
    C --> C3[Language model biases:<br>Produces biased<br>clinical notes. 7]
    C --> C4[Data collection:<br>Biases, representation<br>in datasets. 8]
    C --> C5[Labeling practices:<br>Induce measurement<br>gaps, biases. 9]
    C --> C6[Biased embeddings:<br>Pre-trained models<br>perpetuate biases. 11]
    D --> D1[Human-AI interaction:<br>Affects clinician,<br>patient decisions. 17]
    D --> D2[Explainable AI:<br>Post-hoc explanations,<br>new biases. 18]
    D --> D3[Trust in AI:<br>People trust flawed<br>AI advice. 19]
    D --> D4[Gender biases:<br>Physician-patient gender<br>impacts outcomes. 22]
    D --> D5[Racial disparities:<br>Preventative measures,<br>shared race. 23]
    D --> D6[Discrimination:<br>Poor care for certain<br>demographics. 24]
    E --> E1[Regulatory frameworks:<br>Government oversight<br>structures. 21]
    E --> E2[Lessons from aviation:<br>Feedback, auditing,<br>training. 20]
    E --> E3[Personalization vs.<br>worsonalization:<br>Demographics worsen performance. 25]
    E --> E4[Privacy-utility trade-offs:<br>Differential privacy,<br>biases. 28]
    E --> E5[Stable algorithms:<br>Reduce trade-offs<br>in AI models. 29]
    E --> E6[Interdisciplinary approach:<br>ML, healthcare,<br>social sciences. 30]
    class A,B,B1,B2,B3,B4 ethical
    class C,C1,C2,C3,C4,C5,C6 biases
    class D,D1,D2,D3,D4,D5,D6 interaction
    class E,E1,E2,E3,E4,E5,E6 regulation


1.- Ethical AI in healthcare: Developing machine learning models for medical applications while addressing biases, fairness, and safety concerns.

2.- Triage models: AI systems to prioritize patients in emergency rooms, balancing efficiency with potential biases.

3.- Human error in healthcare: Significant rates of preventable adverse events and misdiagnoses in current medical practice.

4.- Limitations of randomized controlled trials: Only 10-20% of treatments are based on RCTs, and trial results often have limited applicability to diverse populations.

5.- AI model biases: Machine learning models can perpetuate and amplify existing biases in healthcare data and practices.

6.- Racial bias in imaging: AI models can detect self-reported race from medical images, raising concerns about potential discrimination.

7.- Language model biases: AI text generation models can produce biased content in clinical notes, reinforcing stereotypes.

8.- Data collection challenges: Biases and gaps in representation must be considered in the healthcare datasets used for AI training.

9.- Labeling practices: Standard data labeling methods can induce measurement gaps and biases in AI models.

10.- Normative vs. descriptive data: Collecting data for the specific context of application can improve model performance and fairness.

11.- Biased embeddings: Pre-trained models can perpetuate biases even when downstream tasks use balanced data.

12.- Demographic attributes in prediction: Including demographic features in models can sometimes worsen performance for specific subgroups.

13.- Fair use audits: Systematic evaluation of how demographic features impact model performance across different groups.

14.- Optimizing decision support checklists: Developing fair and effective clinical checklists using mixed integer programming.

15.- Distribution shifts: Challenges in deploying models across different healthcare settings with varying patient populations.

16.- Causal attribution of shifts: Using game theory to identify sources of performance drops when deploying models in new settings.

17.- Human-AI interaction: How model recommendations are presented to clinicians and patients affects their decision-making.

18.- Explainable AI limitations: Post-hoc explanations of black-box models can sometimes introduce new biases or reduce performance.

19.- Trust in AI systems: People tend to trust AI advice even when it's clearly flawed, highlighting the need for careful deployment.

20.- Lessons from aviation safety: Applying principles of systematic feedback, auditing, and training from aviation to healthcare AI.

21.- Regulatory frameworks: Proposing structures for government oversight and regulation of healthcare AI systems.

22.- Gender biases in healthcare: Differences in patient outcomes based on physician-patient gender concordance.

23.- Racial disparities in care: Uptake of preventative measures improves when patients and physicians share the same race.

24.- Discrimination in healthcare: Higher rates of poor quality care for certain demographic groups, like Muslim women in maternity services.

25.- Personalization vs. "worsonalization": Instances where including demographic information can worsen model performance for individuals.

26.- Challenges of individual-level interpretation: Difficulties in providing personalized explanations of AI model predictions to patients.

27.- Regulation of model updates: Balancing the need for model improvements with safety concerns in deployed healthcare AI systems.

28.- Privacy-utility trade-offs: Challenges in applying differential privacy techniques to healthcare data without introducing new biases.

29.- Stable learning algorithms: Developing methods to reduce extreme trade-offs between privacy, utility, and stability in AI models.

30.- Interdisciplinary approach: Combining insights from machine learning, healthcare, social sciences, and other fields to address ethical AI challenges.
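
To make points 12, 13, and 25 concrete, a fair use audit can be sketched as a per-group comparison of a model trained without a demographic attribute against one trained with it. The data, the group labels, and the function names `group_accuracy` and `fair_use_audit` below are illustrative assumptions, not taken from the talk; accuracy stands in for whatever metric an actual audit would use.

```python
from collections import defaultdict

def group_accuracy(y_true, y_pred, groups):
    """Return {group: accuracy} for predictions split by a group label."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)
    return {g: correct[g] / total[g] for g in total}

def fair_use_audit(y_true, pred_without, pred_with, groups):
    """Per-group accuracy change from adding the demographic attribute.

    A negative value means the group is served worse by the model that
    uses the attribute (the "worsonalization" case).
    """
    acc_without = group_accuracy(y_true, pred_without, groups)
    acc_with = group_accuracy(y_true, pred_with, groups)
    return {g: acc_with[g] - acc_without[g] for g in acc_without}

# Toy, made-up data: true labels, group membership, and the predictions of
# two hypothetical models (without and with the demographic attribute).
y_true       = [1, 0, 1, 1, 0, 1, 0, 0]
groups       = ["a", "a", "a", "a", "b", "b", "b", "b"]
pred_without = [0, 0, 0, 1, 0, 1, 0, 0]
pred_with    = [1, 0, 1, 1, 0, 1, 0, 1]

gains = fair_use_audit(y_true, pred_without, pred_with, groups)
harmed = sorted(g for g, gain in gains.items() if gain < 0)
print(gains)   # per-group change in accuracy
print(harmed)  # groups for which the attribute made things worse
```

In this toy case overall accuracy rises when the attribute is added, yet group "b" does worse, which is the pattern an aggregate metric would hide.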

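Point 16 mentions game theory for attributing performance drops to sources of shift. The summary does not name the method, so the sketch below assumes Shapley values, a standard game-theoretic attribution: each candidate shift is a "player", and the value of a coalition is the performance drop observed when only those shifts are applied. The shift names and the numbers in `DROP` are hypothetical.

```python
from itertools import combinations
from math import factorial

def shapley_values(players, value):
    """Exact Shapley values for a set function value(frozenset) -> float."""
    n = len(players)
    result = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for size in range(n):
            # Weight of coalitions of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution of p when added to coalition s.
                total += weight * (value(s | {p}) - value(s))
        result[p] = total
    return result

# Hypothetical accuracy drop at a new hospital when only the listed
# distributions are shifted from source to target (made-up numbers).
DROP = {
    frozenset(): 0.0,
    frozenset({"covariates"}): 0.04,
    frozenset({"labels"}): 0.02,
    frozenset({"covariates", "labels"}): 0.10,
}

def performance_drop(shifted):
    return DROP[frozenset(shifted)]

attribution = shapley_values(["covariates", "labels"], performance_drop)
print(attribution)
```

The attributions sum to the total drop of the fully shifted setting, so each shift receives a share of the observed degradation. Exact computation enumerates every coalition and is only practical for a handful of candidate shifts.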

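One way to see the privacy-utility tension in point 28 is with the Laplace mechanism applied to a simple count query. This is a swapped-in illustration, not the technique discussed in the talk: the noise scale depends only on the privacy budget, so a small subgroup absorbs the same absolute noise as a large one and suffers a much larger relative error. The group sizes and `epsilon` are made-up values.

```python
import random

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)

def private_count(true_count, epsilon, rng, sensitivity=1.0):
    """Release a count under epsilon-DP using the Laplace mechanism."""
    return true_count + laplace_noise(sensitivity / epsilon, rng)

def expected_relative_error(true_count, epsilon, sensitivity=1.0):
    """Expected |noise| of Laplace(0, b) is b, so relative error is b / count."""
    return (sensitivity / epsilon) / true_count

rng = random.Random(0)  # fixed seed so the sketch is repeatable
epsilon = 0.1           # assumed privacy budget
counts = {"majority": 10_000, "minority": 50}  # made-up group sizes

released = {g: private_count(c, epsilon, rng) for g, c in counts.items()}
rel_err = {g: expected_relative_error(c, epsilon) for g, c in counts.items()}
print(released)
print(rel_err)
```

With these assumed numbers the expected relative error for the minority group is two hundred times that of the majority group, even though both counts receive noise from the same distribution.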