A complete implementation of the Transformer architecture ("Attention Is All You Need") built from scratch in PyTorch for translating English text to Hindi. Trained on the IIT Bombay English-Hindi Corpus.
This project implements a Sequence-to-Sequence Transformer model with Multi-Head Self-Attention and Cross-Attention mechanisms. It features a custom training loop, dynamic learning rate scheduling, and real-time validation metrics.
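One common realization of such dynamic learning rate scheduling is the warmup-then-decay ("Noam") schedule from the original paper; a minimal sketch, assuming d_model = 512 and 4,000 warmup steps (whether `train.py` uses exactly this schedule is an assumption):

```python
# "Noam" schedule from "Attention Is All You Need": linear warmup followed by
# inverse-square-root decay. Assumed here, not verified against train.py.
def noam_lr(step: int, d_model: int = 512, warmup_steps: int = 4000) -> float:
    step = max(step, 1)  # guard against step 0
    return d_model ** -0.5 * min(step ** -0.5, step * warmup_steps ** -1.5)

# Plugged into PyTorch via LambdaLR (base lr of 1.0, so noam_lr sets the rate directly):
# scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=noam_lr)
```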
Key Features:
- Architecture: 6-layer Encoder/Decoder, 512 embedding dimension, 8 attention heads.
- Dataset: cfilt/iitb-english-hindi.
- Tokenizer: Word-level tokenization trained specifically on the corpus (a loading/tokenizer sketch follows this list).
- Visualization: Integration with Altair for attention map visualization.
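A minimal sketch of the data/tokenizer pipeline, assuming the HuggingFace `datasets` and `tokenizers` packages and the corpus's standard translation-dict format; the file names, special tokens, and `min_frequency` are illustrative, not the repo's exact settings:

```python
from datasets import load_dataset
from tokenizers import Tokenizer
from tokenizers.models import WordLevel
from tokenizers.pre_tokenizers import Whitespace
from tokenizers.trainers import WordLevelTrainer

ds = load_dataset("cfilt/iitb-english-hindi", split="train")

def sentences(lang):
    # Each item holds a {"en": ..., "hi": ...} translation pair.
    for item in ds:
        yield item["translation"][lang]

def build_word_level_tokenizer(lang, path):
    # Word-level model: every whitespace-split word is its own token;
    # unseen words fall back to [UNK].
    tokenizer = Tokenizer(WordLevel(unk_token="[UNK]"))
    tokenizer.pre_tokenizer = Whitespace()
    trainer = WordLevelTrainer(
        special_tokens=["[UNK]", "[PAD]", "[SOS]", "[EOS]"],
        min_frequency=2,  # illustrative: drop words seen only once
    )
    tokenizer.train_from_iterator(sentences(lang), trainer=trainer)
    tokenizer.save(path)
    return tokenizer

tokenizer_src = build_word_level_tokenizer("en", "tokenizer_en.json")
tokenizer_tgt = build_word_level_tokenizer("hi", "tokenizer_hi.json")
```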
The model was trained for over 900,000 steps; the best performance was observed at the checkpoint from step 651,791.
| Metric | Score | Description |
|---|---|---|
| SacreBLEU | 62.48 | High correspondence with reference translations. |
| CHRF++ | 71.78 | Strong character-level n-gram overlap. |
| CER | 0.28 | Low Character Error Rate (~28%). |
| Training Loss | 2.95 | Converged from initial ~10.24. |
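These scores can be reproduced with off-the-shelf packages; a minimal sketch using `sacrebleu` and `torchmetrics` (the repo's own validation code may compute them differently):

```python
import sacrebleu
from torchmetrics.text import CharErrorRate

def score(hypotheses, references):
    """hypotheses: list[str] of model outputs; references: list[str], aligned."""
    # SacreBLEU expects a list of reference *streams*, hence the extra list level.
    bleu = sacrebleu.corpus_bleu(hypotheses, [references]).score
    # word_order=2 turns chrF into chrF++ (adds word bigrams to char n-grams).
    chrf = sacrebleu.CHRF(word_order=2).corpus_score(hypotheses, [references]).score
    cer = CharErrorRate()(hypotheses, references).item()
    return {"sacrebleu": bleu, "chrf++": chrf, "cer": cer}
```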
Loss Convergence: The loss shows a consistent downward trend, stabilizing after step 600k.
Metric Evolution: Both SacreBLEU and CHRF++ improve steadily over the course of training, peaking around the selected checkpoint (step 651,791).
The model demonstrates robust handling of formal sentence structures and proper noun transliteration.
| Source (English) | Prediction (Hindi) | Note |
|---|---|---|
| (a) the manufacture or processing of goods; | ( क ) माल का विनिर्माण या प्रसंस्करण | Perfect translation of legal/formal text. |
| Daniel Molkentin | डेनियल मॉल्केनटिन | Accurate transliteration of foreign names. |
| Last Quarter Moon | पिछले चौथाई चन्द्रमा | Correct semantic understanding. |
Understanding how the Decoder focuses on specific Encoder tokens during generation (Cross-Attention):

[Image of Encoder-Decoder Attention]
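A minimal sketch of how such a heatmap can be rendered with Altair (the visualization library listed under Key Features); `attn` is assumed to be one head's cross-attention matrix of shape (tgt_len, src_len), and the token lists are assumed inputs:

```python
import altair as alt
import pandas as pd

def attention_heatmap(attn, src_tokens, tgt_tokens):
    # Flatten the matrix into long form: one row per (target, source) cell.
    rows = [
        {"source": s, "target": t, "weight": float(attn[i][j])}
        for i, t in enumerate(tgt_tokens)
        for j, s in enumerate(src_tokens)
    ]
    df = pd.DataFrame(rows)
    return (
        alt.Chart(df)
        .mark_rect()
        .encode(
            x=alt.X("source:O", sort=None),  # keep token order, not alphabetical
            y=alt.Y("target:O", sort=None),
            color="weight:Q",
            tooltip=["source", "target", "weight"],
        )
    )
```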
Installation:

1. Clone the repository:

   ```bash
   git clone https://github.com/KaranAnchan/en-hi-nmt-transformer.git
   cd en-hi-nmt-transformer
   ```

2. Install the dependencies:

   ```bash
   pip install -r requirements.txt
   ```
To translate a sentence using the trained weights:

```python
import torch
from model import build_transformer
from config import get_config
from train import greedy_decode, get_ds
# 1. Load Config and Weights
config = get_config()
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# Re-build the model structure (ensure vocab sizes match your training)
# You might need to load the tokenizer first to get exact vocab sizes
model = build_transformer(vocab_src_len=..., vocab_tgt_len=...,
                          seq_len=config['seq_len'], d_model=config['d_model'])
checkpoint = torch.load("weights/tmodel_06.pt", map_location=device)
model.load_state_dict(checkpoint['model_state_dict'])
model.to(device)
# 2. Translate
src_text = "The manufacture of goods."
# ... (Load tokenizers and preprocess text) ...
# output = greedy_decode(model, src_text, ...)
# print(output)  # Expected output: "माल का विनिर्माण।"
```
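For reference, a minimal greedy-decoding sketch; the actual `greedy_decode` in `train.py` may differ, and the model's `encode`/`decode`/`project` methods plus the `[SOS]`/`[EOS]` ids are assumptions about the interface:

```python
import torch

def greedy_decode_sketch(model, src, src_mask, sos_id, eos_id, max_len, device):
    # Encode the source once and reuse the memory at every decoding step.
    memory = model.encode(src, src_mask)
    ys = torch.tensor([[sos_id]], dtype=torch.long, device=device)
    for _ in range(max_len - 1):
        # Causal mask so each position attends only to earlier target tokens.
        size = ys.size(1)
        tgt_mask = torch.tril(torch.ones(size, size, device=device)).bool()
        out = model.decode(memory, src_mask, ys, tgt_mask)
        logits = model.project(out[:, -1])          # next-token distribution
        next_id = logits.argmax(dim=-1, keepdim=True)
        ys = torch.cat([ys, next_id], dim=1)
        if next_id.item() == eos_id:                # stop at end-of-sentence
            break
    return ys.squeeze(0)
```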
To train the model from scratch:

```bash
python train.py
```

Modify `config.py` to adjust the batch size, learning rate, or number of epochs.
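For orientation, an illustrative `get_config()` sketch carrying the architecture values listed above; every other key and value is an assumption, not the repo's actual configuration:

```python
def get_config():
    return {
        "batch_size": 8,          # assumption
        "num_epochs": 30,         # assumption
        "lr": 1e-4,               # assumption
        "seq_len": 350,           # assumption: maximum sequence length
        "d_model": 512,           # embedding dimension (from Key Features)
        "num_layers": 6,          # encoder/decoder depth (from Key Features)
        "num_heads": 8,           # attention heads (from Key Features)
        "lang_src": "en",
        "lang_tgt": "hi",
        "model_folder": "weights",
        "model_basename": "tmodel_",
    }
```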
Project structure:

- `model.py`: Complete Transformer architecture (Embeddings, Positional Encoding, Multi-Head Attention, Encoder/Decoder blocks).
- `train.py`: Training loop, validation, and checkpoint saving.
- `dataset.py`: Custom PyTorch Dataset class for bilingual text.
- `config.py`: Hyperparameters and file path configurations.
- `attention_visual.ipynb`: Notebook for generating attention heatmaps.
- Dataset provided by IIT Bombay.
- Architecture based on the paper Attention Is All You Need (Vaswani et al., 2017).



