Knowledge Vault 6/51 - ICML 2020
Doing Some Good with Machine Learning
Lester Mackey

Concept Graph & Resume using Claude 3.5 Sonnet | Chat GPT4o | Llama 3:

graph LR
classDef main fill:#f9d4f9, font-weight:bold, font-size:14px
classDef motivation fill:#f9d4d4, font-weight:bold, font-size:14px
classDef projects fill:#d4f9d4, font-weight:bold, font-size:14px
classDef methods fill:#d4d4f9, font-weight:bold, font-size:14px
classDef covid fill:#f9f9d4, font-weight:bold, font-size:14px
classDef future fill:#d4f9f9, font-weight:bold, font-size:14px
Main[Doing Some Good with Machine Learning] --> A[Motivation and Background]
Main --> B[Notable Projects]
Main --> C[Methods and Approaches]
Main --> D[COVID-19 Response]
Main --> E[Future Directions]
A --> A1[Use computer science for positive world change 1]
A --> A2[Improved seismic detection for nuclear proliferation 2]
A --> A3[Inspired to tackle social problems with data 6]
A --> A4[Started Statistics for Social Good at Stanford 7]
A --> A5[Divide-and-conquer approach for social problems 8]
A --> A6[Collaborated with like-minded organizations for social good 13]
B --> B1[ALS Prediction Prize: predict disease progression 3]
B --> B2[Solution outperformed clinicians, reduced trial sizes 5]
B --> B3[Optimized financial coaching for low-income clients 9]
B --> B4[Predicted high-cost healthcare individuals for interventions 10]
B --> B5[Created Opiate Atlas for palliative care research 11]
B --> B6[Studied nonprofit reviews, found harsh feedback 12]
C --> C1[ALS dataset: irregular, outliers, missingness, irrelevant features 4]
C --> C2[Improved sub-seasonal climate forecasts 15]
C --> C3[Sub-Seasonal Climate Forecast Rodeo competition 16]
C --> C4[Compiled Sub-Seasonal Rodeo Dataset from meteorological sources 17]
C --> C5[Developed models, ensembled for maximum skill 18]
C --> C6[Ensemble outperformed competitors, improved temperature forecasts 19]
D --> D1[Suggested COVID forecasting during pandemic 21]
D --> D2[DELPHI group tasked with COVID-19 forecasting 22]
D --> D3[Collected, shared US COVID indicators for forecasting 23]
D --> D4[Created COVIDCast website with daily county-level data 24]
D --> D5[Improved precipitation forecasts, room for accuracy 20]
D --> D6[Contributed to community website for social projects 14]
E --> E1[Volunteer opportunities: DataKind, DSSG, Statistics Without Borders 25]
E --> E2[Data science contests for social good 26]
E --> E3[DSSG summer fellowships for students and postdocs 27]
E --> E4[Teach ML courses with social impact focus 28]
E --> E5[Incentivize ML for social good in organizations 29]
E --> E6[Dedicate time for positive social change impact 30]
class Main main
class A,A1,A2,A3,A4,A5,A6 motivation
class B,B1,B2,B3,B4,B5,B6 projects
class C,C1,C2,C3,C4,C5,C6 methods
class D,D1,D2,D3,D4,D5,D6 covid
class E,E1,E2,E3,E4,E5,E6 future

Resume:

1.- Author hoped to use computer science to change the world for the better since starting grad school.

2.- Worked on combating nuclear proliferation by improving seismic event detection for the Comprehensive Nuclear-Test-Ban Treaty Organization.

3.- After PhD, participated in the ALS Prediction Prize to predict disease progression rate to help clinicians and clinical trials.

4.- Key challenges with the ALS dataset included irregular time series, outliers, missingness, and many irrelevant features.

5.- Their solution placed first, outperforming 12 clinicians. It's estimated to reduce ALS drug trial sizes by 20%.

6.- As a postdoc, inspired by DataKind and the Data Science for Social Good program to tackle social problems with data.

7.- Started the Statistics for Social Good working group at Stanford to work on problems like poverty, hunger, education, trafficking.

8.- Took a divide and conquer approach - members identified problem partners, datasets, inferential questions and reported back to the group.

9.- Worked with SparkPoint on optimizing financial coaching services for low-income clients based on detailed client interaction data.

10.- With the Clinical Excellence Research Center, predicted which individuals in Denmark would incur high future healthcare costs, to enable proactive interventions.

11.- Helped the Global Oncology Initiative create an opiate consumption data visualization called the Opiate Atlas for palliative care researchers.

12.- Studied nonprofit reviews with GreatNonprofits, finding that clients reliant on services for basic needs gave some of the harshest feedback.

13.- Collaborated with other like-minded organizations and individuals working towards using data science for social good.

14.- Contributed to the community website statsforchange.github.io to share resources, potential partners, datasets, and ideas for social good projects.

15.- After moving, began working on improving sub-seasonal (2-6 week) climate forecasts, which are important but very inaccurate.

16.- The US Bureau of Reclamation ran a year-long real-time sub-seasonal forecasting competition called the Sub-Seasonal Climate Forecast Rodeo.

17.- Compiled the Sub-Seasonal Rodeo Dataset from many public meteorological data sources to enable training sub-seasonal forecasting models.

18.- Developed two models (multi-task linear regression and auto-regressive k-NN) and ensembled them to maximize cosine similarity skill.

19.- Their ensemble model outperformed the top competitor and improved over the US operational forecast (CFSv2) by 37-53% for temperature.

20.- For precipitation, their approach improved over CFSv2 by 128-154%, but there is still significant room for improving accuracy.

21.- During the COVID-19 pandemic, received a suggestion to work on COVID forecasting, given colleagues' expertise in flu forecasting.

22.- The DELPHI group at CMU, a CDC Influenza Forecasting Center of Excellence, was tasked with COVID-19 forecasting.

23.- The DELPHI COVID-19 response team first focused on collecting and sharing US COVID indicators for use in forecasting.

24.- Created COVIDCast website with daily, county-level data on symptoms, searches, doctor's visits. Data available via API for forecasting.

25.- Many opportunities exist to volunteer data science skills through DataKind, the Data Science for Social Good Solve platform, and Statistics Without Borders.

26.- Participating in data science contests like those on InnoCentive, DrivenData, DREAM Challenges, and Kaggle is another avenue.

27.- Data Science for Social Good summer fellowships are available for students/postdocs at CMU, Stanford, UW to learn skills on real projects.

28.- We should teach more ML courses with real social impact, publish more ML for social good work, provide dedicated venues.

29.- Companies and universities should incentivize ML for social good through goals, support, and rewards like law firms do for pro bono work.

30.- If everyone dedicated even 1-5% of their time to being a positive force for social change through their work, imagine the impact.
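
Item 18 mentions ensembling two models to maximize cosine similarity skill. The following is a minimal Python sketch of that metric and of one possible ensembling rule, not the author's implementation. The summary does not say how the models were combined, so averaging unit-normalized predictions is an assumption here, as are the function names `cosine_skill` and `ensemble` and the toy numbers.

```python
import numpy as np


def cosine_skill(pred_anom, obs_anom):
    """Cosine similarity between predicted and observed anomaly vectors."""
    pred = np.asarray(pred_anom, dtype=float)
    obs = np.asarray(obs_anom, dtype=float)
    denom = np.linalg.norm(pred) * np.linalg.norm(obs)
    # Return 0 skill when either vector is all zeros (undefined cosine).
    return float(pred @ obs / denom) if denom > 0 else 0.0


def ensemble(preds):
    """Average unit-normalized predictions.

    Hypothetical ensembling rule chosen for illustration; assumes every
    prediction vector has a nonzero norm.
    """
    unit = [np.asarray(p, dtype=float) / np.linalg.norm(p) for p in preds]
    return np.mean(unit, axis=0)


# Toy anomalies at four grid points (made-up numbers, illustration only).
obs = np.array([1.0, -0.5, 0.3, 0.8])
model_a = np.array([0.9, -0.1, 0.6, 0.2])
model_b = np.array([0.4, -0.9, -0.1, 1.1])

combined = ensemble([model_a, model_b])
for name, p in [("model_a", model_a), ("model_b", model_b), ("ensemble", combined)]:
    print(f"{name}: skill = {cosine_skill(p, obs):.3f}")
```

Because cosine similarity ignores scale, normalizing each prediction before averaging keeps one model from dominating the ensemble simply by producing larger anomalies.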

Knowledge Vault built by David Vivancos 2024