Knowledge Vault 6/62 - ICML 2021
Rethinking Drug Discovery in the Era of Digital Biology
Daphne Koller

Concept Graph & Summary using Claude 3.5 Sonnet | GPT-4o | Llama 3:

graph LR
classDef drug fill:#f9d4d4,font-weight:bold,font-size:14px
classDef ML fill:#d4f9d4,font-weight:bold,font-size:14px
classDef bio fill:#d4d4f9,font-weight:bold,font-size:14px
classDef tech fill:#f9f9d4,font-weight:bold,font-size:14px
classDef genetics fill:#f4d4f9,font-weight:bold,font-size:14px
A[Rethinking Drug Discovery<br/>in the Era<br/>of Digital Biology] --> B[Drug<br/>Discovery]
A --> C[Machine<br/>Learning]
A --> D[Biology<br/>Integration]
A --> E[Genetics<br/>Data]
A --> F[Technology<br/>Applications]
B --> B1[New medications<br/>development<br/>process. 1]
B --> B2[Decline in<br/>R&D<br/>productivity. 2]
B --> B3[AI-driven drug<br/>development<br/>predictions. 3]
B --> B4[Iterative model<br/>training<br/>process. 13]
B --> B5[Vast chemical<br/>compound<br/>synthesis. 24]
B --> B6[On-demand compound<br/>library<br/>creation. 25]
C --> C1[AI with<br/>high-quality<br/>data. 4]
C --> C2[Learning data<br/>relationships<br/>directly. 6]
C --> C3[AI generalization<br/>outside training<br/>data. 18]
C --> C4[AI-enhanced<br/>experimental<br/>procedures. 20]
C --> C5[AI in<br/>medical<br/>image analysis. 30]
C --> C6[Continuous<br/>predictions<br/>better. 27]
D --> D1[Bio data and<br/>AI<br/>integration. 5]
D --> D2[Biology and<br/>AI<br/>combination. 14]
D --> D3[In vitro<br/>biological<br/>models. 15]
D --> D4[3D stem cell<br/>cultures. 16]
D --> D5[AI-based biology<br/>representations. 7]
D --> D6[Interdisciplinary<br/>biomedical<br/>collaboration. 23]
E --> E1[DNA for<br/>disease<br/>understanding. 9]
E --> E2[Linking genetics<br/>to<br/>traits. 11]
E --> E3[Genetic-disease<br/>causal<br/>links. 17]
E --> E4[Reprogramming cells<br/>for<br/>modeling. 29]
E --> E5[High-quality<br/>datasets<br/>importance. 19]
E --> E6[Quantitative compound<br/>affinity<br/>readout. 12]
F --> F1[Modeling biology<br/>with AI<br/>limits. 21]
F --> F2[Moving research<br/>to<br/>biotech. 22]
F --> F3[AI for<br/>molecular<br/>structures. 26]
F --> F4[AI in<br/>protein<br/>prediction. 28]
F --> F5[Targeted treatments<br/>for patient<br/>subgroups. 8]
F --> F6[Human<br/>genetic<br/>data. 10]
class A,B,B1,B2,B3,B4,B5,B6 drug
class C,C1,C2,C3,C4,C5,C6 ML
class D,D1,D2,D3,D4,D5,D6 bio
class E,E1,E2,E3,E4,E5,E6 genetics
class F,F1,F2,F3,F4,F5,F6 tech

Summary:

1.- Drug discovery: The process of developing new medications, with recent successes in vaccines, cancer treatments, and therapies for genetic diseases such as cystic fibrosis.

2.- Eroom's Law: Exponential decrease in pharmaceutical R&D productivity, with current cost per approved drug exceeding $5 billion due to high failure rates.

3.- Machine learning in drug discovery: Using AI to make better predictions at decision points throughout the drug development process.

4.- insitro approach: Integrating machine learning with high-quality data creation/collection to improve predictions across the pharmaceutical R&D value chain.

5.- Convergence of life sciences and machine learning: Combining biological data generation tools with AI to drive insights in drug discovery.

6.- End-to-end learning: Machine learning approach that learns data representations, uncovering relationships between instances not apparent in original labeling.

7.- Human biology modeling: Using machine learning to create representations of human biology for predicting clinical impact of interventions.

8.- Patient heterogeneity: Understanding that complex diseases often comprise multiple biological processes, necessitating targeted therapeutics for specific patient subgroups.

9.- Human genetic data: Leveraging DNA sequencing and phenotypic information to understand genetic drivers of disease and drug targets.

10.- Biobanks: Large-scale collections of biological samples and data, like UK Biobank, enabling genetic and clinical research.

11.- Genome-wide association studies: Research linking genetic variants to clinical outcomes or traits, revealing genetic architecture of diseases and characteristics.

12.- nDexer technology: Provides a more quantitative, sensitive readout of compound binding affinity, improving the inputs to machine learning models.

13.- Active learning loop: Iterative process of model training, compound selection, and testing to improve predictive capabilities in drug discovery.

14.- Digital biology: Emerging discipline combining quantitative biology measurement/intervention tools with data science/machine learning for biological insights and interventions.

15.- Organs-on-chips: In vitro models replicating multiple cell types and complex relationships, useful for studying biological systems.

16.- Organoids: 3D cell cultures derived from stem cells, forming miniature organ-like structures for more scalable, faithful organ recapitulation.

17.- Causal relations in genetics: Genetic variants associated with disease phenotypes often indicate causal relationships, although some confounding factors can still distort the association.

18.- Out-of-distribution robustness: Developing machine learning models that can generalize to data outside the training distribution, important for biological applications.

19.- Data quality and artifacts: Importance of high-quality, purpose-built datasets for machine learning in biology, so that models do not latch onto spurious correlations.

20.- Machine learning in wet lab processes: Using AI to optimize cell culture conditions and experimental procedures, enhancing data generation.

21.- Limits of learnability in biology: Philosophical question about which biological processes can be effectively modeled and predicted by machine learning.

22.- Academia to industry transition: Challenges and motivations for moving from academic research to industrial applications in biotech.

23.- Team-based approach: Importance of interdisciplinary collaboration and effective teamwork in tackling complex biomedical problems.

24.- DNA-encoded libraries: Technology enabling synthesis and testing of vast numbers of chemical compounds for drug discovery.

25.- Programmable DEL synthesis: Advanced method allowing on-demand creation of large compound libraries based on DNA-encoded instructions.

26.- Graph neural networks: Machine learning models effective at processing molecular structures for predicting binding affinities and other properties.

27.- Regression vs. classification in drug discovery: Utilizing continuous predictions (regression) can provide more informative results than binary classification.

28.- AlphaFold impact: Demonstration of machine learning's potential in solving complex biological problems like protein structure prediction.

29.- Induced pluripotent stem cells: Technology enabling creation of diverse cell types from reprogrammed adult cells, useful for disease modeling.

30.- Machine learning for histopathology: Application of AI to analyze medical images, potentially improving diagnostic accuracy and efficiency.
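
Item 2 describes an exponential decline in R&D productivity. As a rough illustration of what "exponential" implies, the sketch below assumes a constant halving time. The nine-year default is an assumption based on commonly cited estimates of Eroom's Law, not a figure from the talk, and `relative_productivity` is a hypothetical helper name.

```python
def relative_productivity(years_elapsed, halving_time=9.0):
    """Productivity relative to the starting year, assuming it halves
    every `halving_time` years (assumed value, not from the talk)."""
    return 0.5 ** (years_elapsed / halving_time)

for years in (0, 9, 18, 36, 60):
    # Under this assumption, each extra halving time cuts output per dollar in half.
    print(f"after {years:2d} years: {relative_productivity(years):.4f} of initial productivity")
```

Under this assumption the decline compounds: two halving times leave a quarter of the original productivity, four leave one sixteenth.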
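
The active learning loop in item 13 (train, select compounds, test, repeat) can be sketched in a few lines. This is a toy under stated assumptions, not the method used in the talk: compounds are reduced to a single hypothetical descriptor `x`, the `assay` function stands in for a wet-lab measurement, and the "model" is a 1-nearest-neighbour lookup with distance used as a crude uncertainty bonus.

```python
import random

def assay(x):
    """Stand-in for a wet-lab measurement (hypothetical): affinity peaks near x = 0.7."""
    return 1.0 - (x - 0.7) ** 2

def predict(x, labeled):
    """1-nearest-neighbour model: (predicted affinity, distance to nearest tested compound)."""
    nearest_x, nearest_y = min(labeled, key=lambda pair: abs(pair[0] - x))
    return nearest_y, abs(nearest_x - x)

def active_learning(pool, n_rounds=5, batch_size=4, explore=1.0, seed=0):
    rng = random.Random(seed)
    untested = list(pool)
    rng.shuffle(untested)
    # Round 0: test a small random batch to seed the model.
    labeled = [(x, assay(x)) for x in untested[:batch_size]]
    untested = untested[batch_size:]
    for _ in range(n_rounds):
        # "Training" is implicit: the 1-NN model is just the labeled set.
        def score(x):
            mean, dist = predict(x, labeled)
            return mean + explore * dist  # favour promising and unexplored compounds
        # Select the top-scoring untested compounds, then "run the assay" on them.
        untested.sort(key=score, reverse=True)
        batch, untested = untested[:batch_size], untested[batch_size:]
        labeled += [(x, assay(x)) for x in batch]
    return labeled

pool = [i / 199 for i in range(200)]  # 200 hypothetical compounds
results = active_learning(pool)
best_x, best_y = max(results, key=lambda pair: pair[1])
print(f"tested {len(results)} of {len(pool)} compounds; best affinity {best_y:.3f} at x={best_x:.3f}")
```

The `explore` weight trades off exploiting compounds predicted to bind well against sampling regions the model has not yet seen; a real pipeline would replace the nearest-neighbour lookup with a trained model and a principled uncertainty estimate.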
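
To make item 26 concrete, here is a minimal sketch of the core operation in a graph neural network: one round of message passing over a molecular graph, followed by a readout into a single molecule-level vector. The toy molecule, the one-hot features, and the function names are assumptions for illustration; a real model would add learned weight matrices and nonlinearities, which are omitted here.

```python
# Hypothetical toy molecule: the heavy atoms of ethanol, C-C-O, as an undirected graph.
atoms = ["C", "C", "O"]
bonds = [(0, 1), (1, 2)]

# One-hot atom features over a tiny vocabulary (an assumption for illustration).
vocab = ["C", "O"]
features = [[1.0 if a == v else 0.0 for v in vocab] for a in atoms]

def neighbours(n_atoms, bonds):
    """Build an adjacency list from the bond list."""
    adj = [[] for _ in range(n_atoms)]
    for i, j in bonds:
        adj[i].append(j)
        adj[j].append(i)
    return adj

def message_pass(h, adj):
    """One round: each atom's new state = its own state + mean of its neighbours' states."""
    new_h = []
    for i, nbrs in enumerate(adj):
        mean = [sum(h[j][k] for j in nbrs) / len(nbrs) for k in range(len(h[i]))]
        new_h.append([own + msg for own, msg in zip(h[i], mean)])
    return new_h

def readout(h):
    """Sum atom states into one molecule-level vector (independent of atom order)."""
    return [sum(column) for column in zip(*h)]

adj = neighbours(len(atoms), bonds)
h1 = message_pass(features, adj)
molecule_vector = readout(h1)
print(h1)               # [[2.0, 0.0], [1.5, 0.5], [1.0, 1.0]]
print(molecule_vector)  # [4.5, 1.5]
```

After one round, the two carbon atoms no longer share the same state, because the middle carbon has an oxygen neighbour. This sensitivity to local chemical context is what makes the representation useful for predicting properties such as binding affinity.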
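
Item 27's point about regression versus classification can be seen in a small example. The compound names, affinity values, and hit cutoff below are all hypothetical; the sketch only shows what information a binary label discards.

```python
# Hypothetical binding measurements (arbitrary units, higher = tighter binding).
affinity = {"cmpd_A": 0.52, "cmpd_B": 0.95, "cmpd_C": 0.48, "cmpd_D": 0.10}
threshold = 0.5  # assumed hit cutoff

# Classification view: each compound collapses to hit / non-hit.
binary = {name: value >= threshold for name, value in affinity.items()}
hits = [name for name, is_hit in binary.items() if is_hit]

# Regression view: the continuous values keep the full ranking.
ranked = sorted(affinity, key=affinity.get, reverse=True)

print(hits)    # ['cmpd_A', 'cmpd_B']
print(ranked)  # ['cmpd_B', 'cmpd_A', 'cmpd_C', 'cmpd_D']
```

In the binary view, `cmpd_A` and `cmpd_B` look identical even though their affinities differ widely, while `cmpd_A` and `cmpd_C` receive opposite labels despite nearly equal measurements. A model trained on the continuous values sees both distinctions.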

Knowledge Vault built by David Vivancos 2024