Course material for iXperience Data Science 2018. Explanatory notes and code for work in deep learning, machine learning and data science.
Week 1: Data Science Fundamentals
| | Monday | Tuesday | Wednesday | Thursday | Friday |
|---|---|---|---|---|---|
| Topic Summary | Introduction to Data Science | Introduction to Python | Fundamentals of data manipulation in Python | Data visualization | Collaborative work and version control |
| Class structure | The pipeline from data to models in production. Deep learning and the data scientist's skillset. | Syntax and core constructs: lists, dictionaries, functions, classes. | NumPy and pandas. | Matplotlib deep dive. | Git and GitHub. |
| Homework Assignments | Vim, tmux, navigating the terminal. | Python programming exercises. | Data structures in Python and view construction in pandas. | Plotting figures with Matplotlib. | Collaborative project extracting features from cryptocurrency trading and order book data. |
Week 2: Introduction to Machine Learning
| | Monday | Tuesday | Wednesday | Thursday | Friday |
|---|---|---|---|---|---|
| Topic Summary | Introduction to Machine Learning | Machine Learning algorithms | Evaluation of classifiers | Essential SQL for data scientists | Spark and Big Data |
| Class structure | Quality of fit, bias-variance trade-off, decision boundaries. | Tree-based methods, support vector machines, hyperparameter optimization. | Class imbalance, ROC, precision and recall, confusion matrices, boosting. | Declarative languages, SQL syntax, selecting, grouping, joining, indices and optimisation. | RDDs, big data pipelines and the PySpark API. |
| Homework Assignments | Plotting decision boundaries, evaluating model complexity, bias and variance. Cross-validation. | Hyperparameter optimisation: grid search vs random search. | Modelling with class imbalance, rigorous model evaluation. | SQL exercises. | Spark pipeline for feature extraction. |
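To make the Wednesday topic concrete, here is a minimal sketch (not part of the course code) of why accuracy misleads under class imbalance and why precision and recall are reported instead. The labels below are made-up toy data, and the helper names `confusion_counts` and `precision_recall` are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy imbalanced data set (assumed for illustration): 95 negatives, 5 positives.
y_true = [0] * 95 + [1] * 5

# Baseline that always predicts the majority class.
y_majority = [0] * 100

# A model that raises 6 false alarms but finds 4 of the 5 positives.
y_model = [1] * 6 + [0] * 89 + [1] * 4 + [0] * 1

acc_majority = accuracy(y_true, y_majority)
acc_model = accuracy(y_true, y_model)

tp, fp, fn, tn = confusion_counts(y_true, y_majority)
pr_majority = precision_recall(tp, fp, fn)

tp, fp, fn, tn = confusion_counts(y_true, y_model)
pr_model = precision_recall(tp, fp, fn)

print("majority baseline:", acc_majority, pr_majority)
print("model:            ", acc_model, pr_model)
```

The majority baseline wins on accuracy while finding no positives at all, which is the situation the class-imbalance homework asks you to evaluate rigorously.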
Week 3: Advanced Machine Learning
| | Monday | Tuesday | Wednesday | Thursday | Friday |
|---|---|---|---|---|---|
| Topic Summary | Dimensionality reduction | Clustering | GPU Server Setup | Introduction to neural networks | Convolutional networks |
| Class structure | Linear vs non-linear dimensionality reduction. PCA, t-SNE. | Density-based clustering, DBSCAN, hierarchical clustering. | GPU acceleration, NVIDIA CUDA and cuDNN. | Feedforward networks motivation and development, introduction to the Keras API. | Why convolutions, genesis and building blocks of convolutional models, transfer learning. |
| Homework Assignments | t-SNE, density and preserved quantities. | Assessing clustering quality. | Setting up a GPU server for deep learning with Google Cloud Compute. | Feedforward networks with Keras. | Convolutional networks and transfer learning. |
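As a small illustration of the linear side of Monday's dimensionality reduction topic, the sketch below implements PCA with a singular value decomposition in NumPy. It is one common way to compute PCA, not necessarily the one used in class; the function name `pca` and the synthetic data are assumptions made for this example.

```python
import numpy as np


def pca(X, n_components):
    """Project X onto its top principal components using the SVD.

    Returns the projected data, the component directions (one per row),
    and the fraction of total variance explained by each kept component.
    """
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by singular value.
    _, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    explained_variance = (S ** 2) / (X.shape[0] - 1)
    ratio = explained_variance / explained_variance.sum()
    return X_centered @ components.T, components, ratio[:n_components]


# Synthetic 2-D data lying close to the line y = 2x (assumed toy example).
rng = np.random.default_rng(0)
t = rng.normal(size=200)
X = np.column_stack([t, 2 * t + 0.1 * rng.normal(size=200)])

Z, components, ratio = pca(X, n_components=1)
print("projected shape:", Z.shape)
print("first component:", components[0])
print("explained variance ratio:", ratio)
```

Because the data are almost one-dimensional, a single component should capture nearly all of the variance. Non-linear methods such as t-SNE do not have this kind of closed-form projection.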
Week 4
| | Monday | Tuesday | Wednesday | Thursday | Friday |
|---|---|---|---|---|---|
| Topic Summary | Recurrent models | Recurrent models | Autoencoders | Model productionization | Putting it all together |
| Class structure | Simple RNN cells, memory and vanishing gradients. | Generators, LSTMs and implementation. | Foundations of autoencoders and unsupervised learning. | Model serving and APIs with Flask and Celery. | Integrating model design and productionization. |
| Homework Assignments | Recurrent model intuitions. | Temperature prediction and generative sequence modelling. | Generative adversarial network design. | Creating a web server to host a trained model. | Start-to-finish modelling pipeline. |
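The vanishing gradient problem from Monday's recurrent models session can be seen with a one-unit RNN in plain Python. This is a minimal sketch under simplifying assumptions (scalar state, fixed weights `w` and `u`, a made-up sine input), not the course implementation; the function names are hypothetical.

```python
import math


def rnn_forward(xs, w, u, h0=0.0):
    """Run a scalar RNN: h_t = tanh(w * h_{t-1} + u * x_t)."""
    hs = [h0]
    for x in xs:
        hs.append(math.tanh(w * hs[-1] + u * x))
    return hs


def grad_wrt_initial_state(hs, w):
    """Return d h_t / d h_0 for t = 1..T using the chain rule.

    Each step contributes a factor w * (1 - h_t ** 2), the derivative of
    tanh(w * h_{t-1} + u * x_t) with respect to h_{t-1}.
    """
    grads = []
    g = 1.0
    for h in hs[1:]:
        g *= w * (1.0 - h * h)
        grads.append(g)
    return grads


# Assumed toy input: 50 steps of a slow sine wave.
xs = [math.sin(0.3 * t) for t in range(50)]
hs = rnn_forward(xs, w=0.5, u=1.0)
grads = grad_wrt_initial_state(hs, w=0.5)

print("gradient after 5 steps: ", grads[4])
print("gradient after 50 steps:", grads[49])
```

With a recurrent weight of magnitude below 1, every factor in the product has magnitude below 1, so the influence of the initial state shrinks geometrically with sequence length. Gated cells such as the LSTM covered on Tuesday were designed to reduce this effect.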