If we decide to do another version of the course, here are some new topics that could be exciting to add. This is off the top of my head; feel free to suggest other topics.
Bias / fairness
- Detecting and reducing bias in ML systems
- Ethics for ML practitioners
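A lesson on detecting bias could start from something as simple as demographic parity: comparing the positive-prediction rate across groups. A minimal sketch (the group labels and example predictions are made up for illustration):

```python
from collections import defaultdict

def positive_rate_by_group(predictions, groups):
    """Fraction of positive (1) predictions for each group label."""
    counts = defaultdict(lambda: [0, 0])  # group -> [positives, total]
    for pred, group in zip(predictions, groups):
        counts[group][0] += pred
        counts[group][1] += 1
    return {g: pos / total for g, (pos, total) in counts.items()}

def demographic_parity_gap(predictions, groups):
    """Largest gap in positive-prediction rate between any two groups."""
    rates = positive_rate_by_group(predictions, groups)
    return max(rates.values()) - min(rates.values())

# Hypothetical binary predictions over two groups:
# group "a" gets positives 75% of the time, group "b" only 25%.
preds  = [1, 0, 1, 1, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
gap = demographic_parity_gap(preds, groups)  # 0.75 - 0.25 = 0.5
```

Real fairness audits use more than one metric (equalized odds, calibration by group, etc.), but a per-group rate table like this is usually the first artifact to produce.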
Deployment
- More complicated web serving scenarios (ensembles, graphs of models, low-latency, larger models)
- More prescriptive recommendations on deployment (how to run A/B tests, shadow mode, instant rollbacks, etc.)
- Model optimization (quantization, distillation, compression, etc.)
- Edge / mobile deployment
- On-prem or data-sensitive deployment
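The shadow-mode idea in the bullets above fits in a few lines: the candidate model sees live traffic and its outputs are logged for offline comparison, but only the incumbent's answer is returned. Everything here (the model callables, the log format) is hypothetical:

```python
def serve_with_shadow(request, primary_model, shadow_model, shadow_log):
    """Return the primary model's prediction; record the shadow model's
    prediction alongside it so the two can be compared offline."""
    primary_pred = primary_model(request)
    try:
        # A shadow failure must never affect the live response.
        shadow_pred = shadow_model(request)
    except Exception:
        shadow_pred = None
    shadow_log.append({"request": request,
                       "primary": primary_pred,
                       "shadow": shadow_pred})
    return primary_pred

# Toy thresholds standing in for real inference services
incumbent = lambda x: x >= 0.5   # current production model
candidate = lambda x: x >= 0.4   # model under evaluation in shadow mode

log = []
result = serve_with_shadow(0.45, incumbent, candidate, log)
```

The key design point is the `try`/`except`: shadow traffic should be fire-and-forget from the caller's perspective, which is also why production setups often move the shadow call off the request path entirely (e.g. via a log replay).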
Troubleshooting
- More specific PyTorch recommendations
Testing
- More specific testing recommendations -- what "test coverage" means for ML, what to do when tests fail, etc.
- More on data slices, how to pick them, and how to manage them
- Testing suggestions for language data
Monitoring
- More on what to monitor
- How to set up a monitoring system
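One concrete answer to "what to monitor" is feature drift: compare a live window of a feature against a training-time reference and alert when it shifts. A deliberately crude sketch (the three-sigma threshold and the example windows are assumptions, and real systems typically use distribution-level tests rather than just the mean):

```python
import statistics

def mean_shift_alert(reference, live, threshold=3.0):
    """Flag drift when the live mean moves more than `threshold`
    reference standard deviations from the reference mean."""
    ref_mean = statistics.mean(reference)
    ref_std = statistics.stdev(reference)
    shift = abs(statistics.mean(live) - ref_mean) / ref_std
    return shift > threshold

# Reference window captured at training time, live windows from production
reference = [10.0, 10.5, 9.5, 10.2, 9.8, 10.1, 9.9, 10.4]
stable    = [10.0, 10.3, 9.7, 10.1]   # no alert
drifted   = [14.0, 14.5, 13.8, 14.2]  # alert
```

The same shape (reference window, live window, threshold, alert) carries over to fancier statistics like KS tests or population stability index.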
Data
- Managing data at a larger scale
- Managing user data for ML
Infrastructure / tooling
- Feature stores -- why, when, and how
- Logging infrastructure for ML
- Spark -- why, when, and how
- Tools for building reproducible data pipelines (Airflow, Kubeflow, etc)
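For the feature-store bullet, the "why" is easiest to show with the core operation: point-in-time lookup, i.e. "what was this feature's value as of time t", which is what keeps training data free of leakage from the future. A toy in-memory sketch (entity and feature names are invented; real feature stores add online/offline storage, TTLs, and joins):

```python
class TinyFeatureStore:
    """Toy feature store: timestamped feature values per entity,
    served as-of a given time (point-in-time correctness)."""

    def __init__(self):
        self._rows = {}  # (entity_id, feature) -> sorted list of (ts, value)

    def write(self, entity_id, feature, ts, value):
        self._rows.setdefault((entity_id, feature), []).append((ts, value))
        self._rows[(entity_id, feature)].sort()

    def read_as_of(self, entity_id, feature, ts):
        """Most recent value at or before `ts`, or None if nothing yet."""
        latest = None
        for row_ts, value in self._rows.get((entity_id, feature), []):
            if row_ts <= ts:
                latest = value
        return latest

store = TinyFeatureStore()
store.write("user_42", "purchases_7d", ts=1, value=3)
store.write("user_42", "purchases_7d", ts=5, value=4)
```

A training job reading features as of each example's label timestamp gets `3` for an example labeled at t=3 and `4` for one labeled at t=9, which is exactly the guarantee that is hard to get from ad-hoc joins.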
Model lifecycle management
- How to know when to retrain models
- How to set up reproducible retraining pipelines
- How to select data for your next training run (active learning & friends)
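The simplest member of the "active learning & friends" family mentioned above is uncertainty sampling: label the pool examples the model is least sure about. A minimal binary-classification sketch (the scores are hypothetical model outputs):

```python
def uncertainty_sample(probabilities, k):
    """Return indices of the k unlabeled examples whose predicted
    probability is closest to 0.5, i.e. where the model is least
    certain -- the simplest active-learning selection strategy."""
    ranked = sorted(range(len(probabilities)),
                    key=lambda i: abs(probabilities[i] - 0.5))
    return ranked[:k]

# Hypothetical model scores on an unlabeled pool: the examples near
# 0.5 (indices 1 and 3) are the ones worth sending to annotators.
scores = [0.95, 0.52, 0.10, 0.48, 0.80]
to_label = uncertainty_sample(scores, 2)
```

Other strategies in the family (margin sampling, entropy, query-by-committee) swap in a different scoring function but keep the same rank-and-take-top-k loop.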