This repository collects assignments, lecture material, and supporting resources for the Language & Image Processing course in the Master's program at CIMAT. The coursework blends Natural Language Processing (NLP) and Computer Vision (CV) with a focus on reproducible pipelines in Python.
This course covers key areas of natural language processing and computer vision including:
- Introduction to NLP and traditional algorithms
- Embeddings
- Neural networks and text classification
- Deep learning architectures for text
- Transformers
- Introduction to Computer Vision
- Fundamentals of Convolutional Neural Networks (CNN)
- Pre-trained models: VGG, ResNet
- Diffusion models
- Fine-tuning and feature extraction
- Object detection with pre-trained models
- Introduction to text in images
- Text generation models in images
The repository is organized as follows:
language-image-processing/
├── nlp/ # Natural Language Processing assignments
│ ├── 01_corpus_analysis/
│ └── 02_deep_learning_arquitectures/
├── cv/ # Computer Vision assignments (upcoming)
├── LICENSE
└── README.md
| Assignment | Module | Key Methods | Link |
|---|---|---|---|
| 01 | Corpus Analysis | Token statistics, Zipf's law, TF-IDF, Logistic/SVM baselines | 📂 View |
| 02 | Deep Learning Architectures | RNN/LSTM/GRU, LLaMA-3 LoRA, mDeBERTa, text generation & classification | 📂 View |
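As a rough illustration of the token statistics and Zipf's law analysis named for Assignment 01, the sketch below builds a rank-frequency table and estimates the log-log slope, which sits near -1 for large corpora that follow Zipf's law. It is not the assignment's actual code: the helper names `rank_frequency` and `zipf_slope`, the regex tokenizer, and the toy sentence are all assumptions made for illustration, and only the standard library is used.

```python
import math
import re
from collections import Counter


def rank_frequency(text):
    """Return (rank, token, count) tuples sorted by descending count."""
    # Hypothetical tokenizer: lowercase, then split on word characters.
    tokens = re.findall(r"\w+", text.lower())
    counts = Counter(tokens)
    # Sort by count (descending), then alphabetically so ties are deterministic.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(rank, tok, n) for rank, (tok, n) in enumerate(ordered, start=1)]


def zipf_slope(table):
    """Least-squares slope of log(count) against log(rank).

    Requires at least two distinct ranks; a value near -1 is the
    classic Zipf signature on a large corpus.
    """
    xs = [math.log(rank) for rank, _, _ in table]
    ys = [math.log(n) for _, _, n in table]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


# Toy input (an assumption); a real analysis would read the course corpus.
sample = "the cat and the dog and the bird"
table = rank_frequency(sample)
for rank, token, count in table:
    print(rank, token, count)
print("log-log slope:", round(zipf_slope(table), 3))
```

On a sentence this short the slope carries little meaning; the fit only becomes informative on a corpus with thousands of distinct tokens.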
| Assignment | Module | Key Methods | Link |
|---|---|---|---|
| 03 | ... | ... | ... |
| 04 | ... | ... | ... |
Programming & Analysis:
- Python (≥3.10) for preprocessing, modeling, and experimentation
- Core libraries: `pandas`, `numpy`, `scikit-learn`, `matplotlib`, `seaborn`, `spaCy`, `gensim`, `tqdm`
Documentation & Reporting:
- LaTeX for formal reports
- Markdown for repository documentation
- Git for version control and collaboration
Development Tools:
- JupyterLab / VS Code for interactive exploration
- Virtual environments (`venv`) for isolated dependencies
- spaCy language models (`es_core_news_sm`) for Spanish NLP tasks
- Python 3.10+
- `pip` and `virtualenv` (or `python -m venv`)
- Optional: GPU-enabled PyTorch/TensorFlow for advanced experiments
- LaTeX distribution for compiling reports
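A quick way to confirm the Python-side requirements is a small standard-library check like the one below. This is a sketch, not a script shipped with the repository: the function names are hypothetical, the import names are assumed from the library list (`scikit-learn` imports as `sklearn`, `spaCy` as `spacy`), and it assumes the `es_core_news_sm` model was installed as a regular Python package, which is how spaCy models are normally distributed.

```python
import importlib.util
import sys

# Import names assumed for the core libraries listed in this README.
CORE_MODULES = [
    "pandas", "numpy", "sklearn", "matplotlib",
    "seaborn", "spacy", "gensim", "tqdm",
]
# spaCy models install as importable packages, so the same check applies.
SPACY_MODEL = "es_core_news_sm"


def python_ok(minimum=(3, 10)):
    """True if the running interpreter meets the minimum version."""
    return sys.version_info >= minimum


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


print("Python >= 3.10:", python_ok())
print("Missing libraries:", missing_modules(CORE_MODULES) or "none")
print("Missing spaCy model:", missing_modules([SPACY_MODEL]) or "none")
```

Running it inside the activated virtual environment shows which packages still need to be installed there. It does not check the LaTeX distribution or GPU support.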
This project is licensed under the MIT License — see LICENSE for details.
This repository represents academic work in natural language processing and computer vision.