Skip to content

Harshchafle/Machine-Learning-and-Data-Analytics-with-python

Repository files navigation

Machine Learning Learning Timetable

Phase 1: Foundations & Supervised Learning

Python for Data Science & Math Fundamentals

Day 1: Python Refresh & NumPy

Topic 1: Python Refresher (if needed): Functions, classes, modules.

Topic 2: Introduction to NumPy: Arrays, array operations, broadcasting.

Day 2: Pandas for Data Manipulation

Topic 1: Pandas DataFrames: Creation, selection, indexing.

Topic 2: Data Cleaning with Pandas: Handling missing values, duplicates.

Day 3: Data Visualization with Matplotlib & Seaborn

Topic 1: Matplotlib Basics: Line plots, scatter plots, histograms.

Topic 2: Seaborn for Statistical Plots: Box plots, violin plots, heatmaps.

Day 4: Introduction to Linear Algebra for ML

Topic 1: Vectors and Matrices: Operations, dot product, transpose.

Topic 2: Systems of Linear Equations (conceptual understanding).

Day 5: Introduction to Calculus for ML

Topic 1: Derivatives and Gradients (conceptual understanding).

Topic 2: Gradient Descent intuition (how models learn).

Day 6: Probability & Statistics Basics

Topic 1: Probability distributions (Normal, Bernoulli).

Topic 2: Descriptive Statistics (mean, median, mode, variance, standard deviation).

Day 7: Review & Practice

Review all concepts learned.

Solve small coding challenges using NumPy and Pandas.

Week 2: Exploratory Data Analysis (EDA) & Data Preprocessing

Day 1: Advanced EDA Techniques

Topic 1: Correlation analysis.

Topic 2: Outlier detection and treatment.

Day 2: Feature Engineering - Numerical Data

Topic 1: Scaling (Standardization, Normalization).

Topic 2: Binning, polynomial features.

Day 3: Feature Engineering - Categorical Data

Topic 1: One-Hot Encoding.

Topic 2: Label Encoding, Target Encoding (conceptual).

Day 4: Data Splitting & Cross-Validation

Topic 1: Train-Validation-Test Split.

Topic 2: K-Fold Cross-Validation.

Day 5: Introduction to Scikit-learn

Topic 1: Scikit-learn API: Estimators, transformers, pipelines.

Topic 2: Basic data loading and preprocessing with scikit-learn.

Day 6: Building a First ML Pipeline

Topic 1: Create a basic pipeline for data preprocessing and model training.

Topic 2: Implement a simple regression model end-to-end.

Day 7: Review & Mini Project

Consolidate EDA and preprocessing skills.

Mini-Project: Perform EDA and preprocess a simple dataset (e.g., Iris, Boston Housing).

Week 3: Linear & Logistic Regression

Day 1: Linear Regression Theory

Topic 1: Simple Linear Regression: Hypothesis, cost function (MSE).

Topic 2: Gradient Descent for Linear Regression.

Day 2: Implementing Linear Regression

Topic 1: Coding Linear Regression from scratch (using NumPy).

Topic 2: Linear Regression with scikit-learn.

Day 3: Regression Evaluation Metrics

Topic 1: MAE, MSE, RMSE, R-squared.

Topic 2: Interpreting regression coefficients.

Day 4: Logistic Regression Theory

Topic 1: Logistic Regression: Sigmoid function, cross-entropy loss.

Topic 2: Gradient Descent for Logistic Regression.

Day 5: Implementing Logistic Regression

Topic 1: Coding Logistic Regression from scratch (using NumPy).

Topic 2: Logistic Regression with scikit-learn.

Day 6: Classification Evaluation Metrics

Topic 1: Accuracy, Precision, Recall, F1-score.

Topic 2: Confusion Matrix.

Day 7: Review & Practice

Solve problems involving both regression and classification on small datasets.

Week 4: Regularization & Model Selection

Day 1: Overfitting & Underfitting

Topic 1: Bias-Variance Trade-off.

Topic 2: Identifying overfitting and underfitting.

Day 2: Regularization Techniques

Topic 1: L1 (Lasso) Regularization.

Topic 2: L2 (Ridge) Regularization.

Day 3: Hyperparameter Tuning

Topic 1: Grid Search.

Topic 2: Random Search.

Day 4: Model Persistence

Topic 1: Saving and loading models (Joblib, Pickle).

Topic 2: Versioning trained models (conceptual).

Day 5: Project Structuring Introduction

Topic 1: Basic folder structure for an ML project (data/, notebooks/, src/, models/).

Topic 2: README.md and requirements.txt importance.

Day 6: Git & GitHub Basics

Topic 1: Version control with Git: add, commit, push, pull.

Topic 2: Creating a GitHub repository for your project.

Day 7: Mini-Project: Predictive Model with Regularization & Hyperparameter Tuning

Choose a classification or regression dataset.

Implement a full pipeline: preprocessing, model training with regularization, hyperparameter tuning, evaluation.

Structure your project with basic folders and use Git.

Phase 2: Advanced Supervised & Unsupervised Learning (Weeks 9-16)

This phase introduces more complex algorithms and moves into unsupervised methods.

Week 5: Decision Trees & Ensemble Methods (Part 1)

Day 1: Decision Tree Theory

Topic 1: How Decision Trees work: Splits, impurity (Gini, Entropy).

Topic 2: Overfitting in Decision Trees and Pruning.

Day 2: Implementing Decision Trees

Topic 1: Decision Trees for Classification with scikit-learn.

Topic 2: Decision Trees for Regression with scikit-learn.

Day 3: Ensemble Learning Introduction

Topic 1: Bagging vs. Boosting (conceptual).

Topic 2: Random Forests theory.

Day 4: Random Forests Implementation

Topic 1: Random Forest Classifier with scikit-learn.

Topic 2: Random Forest Regressor with scikit-learn.

Day 5: Feature Importance

Topic 1: Understanding feature importance from tree-based models.

Topic 2: Practical applications of feature importance.

Day 6: Advanced Evaluation Metrics

Topic 1: ROC AUC Curve and PR Curve (for imbalanced data).

Topic 2: Model interpretability beyond coefficients.

Day 7: Review & Practice

Compare performance of Linear/Logistic Regression, Decision Trees, and Random Forests on a dataset.

Week 6: Ensemble Methods (Part 2) & SVMs

Day 1: Gradient Boosting Theory

Topic 1: AdaBoost (conceptual).

Topic 2: Gradient Boosting Machines (GBM) intuition.

Day 2: XGBoost & LightGBM

Topic 1: Introduction to XGBoost library.

Topic 2: Introduction to LightGBM library.

Day 3: Support Vector Machines (SVM) Theory

Topic 1: Linear SVM: Max-margin classifier.

Topic 2: Kernels (Polynomial, RBF) for non-linear data.

Day 4: Implementing SVMs

Topic 1: SVM Classifier with scikit-learn.

Topic 2: SVM Regressor with scikit-learn.

Day 5: Choosing the Right Algorithm

Topic 1: Factors influencing algorithm choice (data size, linearity, interpretability).

Topic 2: When to use which algorithm.

Day 6: Project Setup & Environment Management

Topic 1: Virtual environments (conda, venv).

Topic 2: requirements.txt best practices.

Day 7: Mini-Project: Advanced Classification/Regression

Apply XGBoost/LightGBM or SVM to a moderately complex dataset.

Focus on hyperparameter tuning and robust evaluation.

Week 7: Unsupervised Learning - Clustering

Day 1: Introduction to Unsupervised Learning

Topic 1: Difference between supervised and unsupervised learning.

Topic 2: Use cases for clustering and dimensionality reduction.

Day 2: K-Means Clustering Theory

Topic 1: K-Means algorithm: Centroids, inertia.

Topic 2: Elbow method for optimal K.

Day 3: Implementing K-Means

Topic 1: K-Means with scikit-learn.

Topic 2: Interpreting clusters.

Day 4: Hierarchical Clustering

Topic 1: Agglomerative vs. Divisive clustering (conceptual).

Topic 2: Dendrograms.

Day 5: DBSCAN Clustering

Topic 1: DBSCAN algorithm: Density-based clustering.

Topic 2: Advantages and disadvantages over K-Means.

Day 6: Cluster Evaluation Metrics

Topic 1: Silhouette Score.

Topic 2: Davies-Bouldin Index.

Day 7: Review & Practice

Apply different clustering algorithms to an unlabeled dataset and compare results.

Week 8: Unsupervised Learning - Dimensionality Reduction

Day 1: Introduction to Dimensionality Reduction

Topic 1: Curse of Dimensionality.

Topic 2: Use cases for dimensionality reduction.

Day 2: Principal Component Analysis (PCA) Theory

Topic 1: PCA intuition: Variance maximization, eigenvalues/eigenvectors.

Topic 2: Interpreting principal components.

Day 3: Implementing PCA

Topic 1: PCA with scikit-learn.

Topic 2: Using PCA for visualization and feature extraction.

Day 4: t-SNE and UMAP (Conceptual)

Topic 1: t-SNE for visualization of high-dimensional data.

Topic 2: UMAP for dimensionality reduction and visualization.

Day 5: Feature Selection Techniques

Topic 1: Filter methods (e.g., variance threshold, correlation).

Topic 2: Wrapper methods (e.g., RFE).

Day 6: Model Interpretability

Topic 1: SHAP values (conceptual).

Topic 2: LIME (conceptual).

Day 7: Mini-Project: Unsupervised Learning Application

Apply clustering and/or dimensionality reduction to a suitable dataset.

Visualize the results and interpret the findings.

Phase 3: Deep Learning & MLOps Foundations (Weeks 9-16)

This phase introduces Deep Learning concepts and lays the groundwork for MLOps.

Week 9: Introduction to Neural Networks & TensorFlow/Keras

Day 1: Neural Network Fundamentals

Topic 1: Perceptrons, activation functions.

Topic 2: Feedforward Neural Networks architecture.

Day 2: Backpropagation & Optimization

Topic 1: How neural networks learn (conceptual understanding of backpropagation).

Topic 2: Optimizers (SGD, Adam, RMSprop).

Day 3: TensorFlow/Keras Basics

Topic 1: Setting up TensorFlow/Keras environment.

Topic 2: Building a simple dense neural network for classification.

Day 4: Training & Evaluation of NNs

Topic 1: Training process (epochs, batch size).

Topic 2: Loss functions, metrics for neural networks.

Day 5: Overfitting in Neural Networks

Topic 1: Dropout regularization.

Topic 2: Early stopping.

Day 6: Introduction to Computer Vision with CNNs

Topic 1: Convolutional Neural Network (CNN) intuition.

Topic 2: Common CNN architectures (LeNet, AlexNet - conceptual).

Day 7: Review & Practice

Implement a simple neural network for a classification task (e.g., MNIST).

Week 10: Convolutional Neural Networks (CNNs)

Day 1: Deeper Dive into CNNs

Topic 1: Convolutional layers, pooling layers.

Topic 2: Batch Normalization.

Day 2: Building a CNN for Image Classification

Topic 1: Practical implementation of a CNN in Keras for image classification (e.g., CIFAR-10).

Topic 2: Data Augmentation for image data.

Day 3: Transfer Learning

Topic 1: What is transfer learning and why it's useful.

Topic 2: Using pre-trained models (e.g., VGG16, ResNet) with Keras.

Day 4: Object Detection (Conceptual)

Topic 1: Introduction to object detection tasks.

Topic 2: High-level overview of R-CNN, YOLO, SSD.

Day 5: Image Segmentation (Conceptual)

Topic 1: Introduction to image segmentation tasks.

Topic 2: High-level overview of U-Net, Mask R-CNN.

Day 6: Model Explainability for Deep Learning (Conceptual)

Topic 1: Grad-CAM, LRP (Layer-wise Relevance Propagation).

Topic 2: Importance of interpreting deep learning models.

Day 7: Mini-Project: Image Classification with Transfer Learning

Build an image classifier using a pre-trained CNN on a custom dataset.

Week 11: Natural Language Processing (NLP) Fundamentals

Day 1: NLP Basics

Topic 1: Text preprocessing: Tokenization, stemming, lemmatization.

Topic 2: Bag-of-Words, TF-IDF.

Day 2: Recurrent Neural Networks (RNNs)

Topic 1: RNN architecture and sequential data.

Topic 2: Limitations of simple RNNs.

Day 3: LSTMs & GRUs

Topic 1: Long Short-Term Memory (LSTM) networks.

Topic 2: Gated Recurrent Units (GRU).

Day 4: Implementing RNNs/LSTMs for Text

Topic 1: Text classification with LSTMs in Keras.

Topic 2: Sentiment analysis example.

Day 5: Word Embeddings

Topic 1: Word2Vec, GloVe (conceptual).

Topic 2: Using pre-trained word embeddings.

Day 6: Attention Mechanism & Transformers (Conceptual)

Topic 1: Introduction to Attention.

Topic 2: High-level overview of Transformer architecture.

Day 7: Mini-Project: Basic NLP Task

Build a text classifier (e.g., spam detection) using traditional ML and then an LSTM.

Week 12: MLOps - Problem Definition & Data Management

Day 1: ML Project Lifecycle Overview

Topic 1: CRISP-DM vs. typical ML lifecycle (Problem Definition, Data, Modeling, Deployment, Monitoring).

Topic 2: Why MLOps is crucial.

Day 2: Defining the ML Problem

Topic 1: Business understanding: What problem are we solving?

Topic 2: Defining success metrics (business and technical).

Day 3: Data Collection Strategies

Topic 1: Data sources (databases, APIs, web scraping).

Topic 2: Data acquisition pipelines (conceptual).

Day 4: Data Versioning (DVC - Conceptual)

Topic 1: Why version data?

Topic 2: Introduction to Data Version Control (DVC).

Day 5: Feature Stores (Conceptual)

Topic 1: Why use feature stores?

Topic 2: Online vs. offline feature stores.

Day 6: Data Governance & Ethics in ML

Topic 1: Privacy concerns (GDPR, PII).

Topic 2: Fairness, bias in data and models.

Day 7: Case Study Analysis

Analyze a real-world ML project's problem definition and data strategy.

Week 13: MLOps - Model Development & Experiment Tracking

Day 1: Experiment Tracking

Topic 1: Why track experiments?

Topic 2: Introduction to MLflow (logging parameters, metrics, models).

Day 2: Reproducibility in ML

Topic 1: Ensuring reproducible results.

Topic 2: Seed management, environment consistency.

Day 3: Code Organization for ML Projects

Topic 1: Modular code structure (src/, scripts/, tests/).

Topic 2: Best practices for writing clean and testable ML code.

Day 4: Unit Testing for ML Code

Topic 1: Testing data preprocessing functions.

Topic 2: Testing model training components.

Day 5: Model Registry (Conceptual)

Topic 1: Centralized model management.

Topic 2: Versioning and staging models.

Day 6: CI/CD for ML (Conceptual)

Topic 1: Continuous Integration/Continuous Deployment for ML.

Topic 2: Automated testing and deployment.

Day 7: Practice: MLflow Integration

Integrate MLflow into one of your previous mini-projects to track experiments.

Week 14: MLOps - Deployment Strategies

Day 1: Model Deployment Overview

Topic 1: Different deployment patterns (batch, real-time, edge).

Topic 2: REST APIs for model serving.

Day 2: Flask/FastAPI for Model Serving

Topic 1: Building a simple Flask/FastAPI endpoint to serve predictions.

Topic 2: Receiving requests and returning predictions.

Day 3: Docker for Containerization

Topic 1: Docker fundamentals: Images, containers, Dockerfiles.

Topic 2: Containerizing your ML model and its dependencies.

Day 4: Cloud Deployment (AWS/GCP/Azure - Conceptual)

Topic 1: Overview of cloud ML services (SageMaker, Vertex AI, Azure ML).

Topic 2: Basic conceptual understanding of deploying a Docker container to a cloud platform.

Day 5: Kubernetes (Conceptual)

Topic 1: Introduction to Kubernetes for orchestration.

Topic 2: Scaling and managing containerized applications.

Day 6: Serverless Functions for ML (Conceptual)

Topic 1: Deploying models as serverless functions (AWS Lambda, Google Cloud Functions).

Topic 2: Pros and cons of serverless.

Day 7: Mini-Project: Deploy a Simple Model

Take one of your trained models, create a Flask/FastAPI endpoint, and containerize it with Docker.

Week 15: MLOps - Monitoring & Maintenance

Day 1: Model Monitoring

Topic 1: Why monitor models in production?

Topic 2: Key metrics to monitor (performance, data drift, concept drift).

Day 2: Data Drift Detection

Topic 1: Techniques for detecting changes in input data distribution.

Topic 2: Tools for data drift monitoring (e.g., Evidently AI - conceptual).

Day 3: Concept Drift Detection

Topic 1: Detecting changes in the relationship between input and output.

Topic 2: Strategies for handling concept drift (model retraining).

Day 4: Model Retraining Strategies

Topic 1: Scheduled retraining.

Topic 2: Event-driven retraining.

Day 5: A/B Testing for ML Models

Topic 1: Comparing different model versions in production.

Topic 2: Gradual rollout strategies.

Day 6: Incident Response & Alerting

Topic 1: Setting up alerts for model performance degradation.

Topic 2: Troubleshooting common MLOps issues.

Day 7: Review & Project Ideation

Reflect on the entire ML lifecycle.

Brainstorm ideas for a capstone project that incorporates MLOps principles.

Week 16: Advanced MLOps & Special Topics (as needed)

Day 1: Responsible AI & Explainable AI (XAI)

Topic 1: Bias and fairness in ML models.

Topic 2: Introduction to XAI tools (e.g., LIME, SHAP) in MLOps context.

Day 2: ML Security

Topic 1: Adversarial attacks on ML models.

Topic 2: Securing ML pipelines.

Day 3: Model Governance & Auditing

Topic 1: Compliance and regulatory requirements.

Topic 2: Maintaining audit trails for models.

Day 4: Reinforcement Learning (Conceptual)

Topic 1: Introduction to RL: Agents, environments, rewards.

Topic 2: Q-learning, Deep Q-Networks (conceptual).

Day 5: Graph Neural Networks (Conceptual)

Topic 1: Introduction to GNNs.

Topic 2: Use cases for graph data.

Day 6: Ethics in AI

Topic 1: Societal impact of AI.

Topic 2: Ethical guidelines for AI development.

Day 7: Capstone Project Planning & Literature Review

Finalize your capstone project idea.

Start researching relevant papers, datasets, and existing solutions.

About

Learning Python to Upskill in programming

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors