Tiny Shakespeare Transformer

A lightweight character-level language model based on the Transformer architecture, trained on the Tiny Shakespeare dataset.

Overview

This project demonstrates how to build a simple decoder-only Transformer model from scratch using PyTorch.
It covers the full pipeline: data loading, tokenization, model training, evaluation, and saving artifacts.

Dataset: Tiny Shakespeare
Tokenizer: Character-level tokenizer based on GPT-2
Model: Custom Transformer Decoder
Frameworks: PyTorch, Hugging Face Transformers, Datasets
Logging: Weights and Biases (wandb)

Model Architecture

Token Embedding Layer
Multiple Transformer Decoder Layers
GELU Activation Functions
Dropout Regularization
Final Linear Layer projecting to vocabulary size
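
The components listed above could be wired together roughly as follows. This is a minimal sketch, not the repository's actual model class: the name TinyDecoder, the learned positional embedding, the 4x feed-forward width, and the vocabulary size of 65 are all assumptions made for illustration. It builds the decoder-only stack from PyTorch's nn.TransformerEncoder with a causal mask, which is one common way to get masked self-attention without cross-attention.

```python
import torch
import torch.nn as nn


class TinyDecoder(nn.Module):
    """Hypothetical decoder-only Transformer (illustrative, not the project's class)."""

    def __init__(self, vocab_size, block_size=128, n_embd=256, n_head=4,
                 n_layer=4, dropout=0.1):
        super().__init__()
        self.block_size = block_size
        self.tok_emb = nn.Embedding(vocab_size, n_embd)
        # Assumption: learned positional embeddings (the README lists only token embeddings).
        self.pos_emb = nn.Embedding(block_size, n_embd)
        self.drop = nn.Dropout(dropout)
        layer = nn.TransformerEncoderLayer(
            d_model=n_embd, nhead=n_head, dim_feedforward=4 * n_embd,
            dropout=dropout, activation="gelu", batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, num_layers=n_layer)
        self.head = nn.Linear(n_embd, vocab_size)

    def forward(self, idx):
        # idx: (batch, time) integer token ids, with time <= block_size
        _, t = idx.shape
        pos = torch.arange(t, device=idx.device)
        x = self.drop(self.tok_emb(idx) + self.pos_emb(pos))
        # Causal mask: True above the diagonal blocks attention to future positions.
        mask = torch.triu(
            torch.ones(t, t, dtype=torch.bool, device=idx.device), diagonal=1)
        x = self.blocks(x, mask=mask)
        return self.head(x)  # (batch, time, vocab_size)


model = TinyDecoder(vocab_size=65)  # 65 is an assumed character vocabulary size
logits = model(torch.zeros(2, 16, dtype=torch.long))
print(logits.shape)  # torch.Size([2, 16, 65])
```

The causal mask is what makes the stack behave as a decoder: each position can only attend to itself and earlier positions, so the logits at step t never depend on tokens after t.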

Hyperparameters

Parameter	Value
Batch size	64
Block size	128
Embedding dim	256
Number of heads	4
Number of layers	4
Dropout	0.1
Learning rate	3e-4
Epochs	3–5
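
For reference, the values in the table can be collected in a single configuration object. The sketch below is illustrative only: the class name TrainConfig and its field names are assumptions, and epochs is set to the lower end of the stated range. It also derives two quantities that follow directly from the table: the per-head dimension and the number of tokens per batch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hypothetical container for the hyperparameters in the table above."""
    batch_size: int = 64
    block_size: int = 128
    n_embd: int = 256
    n_head: int = 4
    n_layer: int = 4
    dropout: float = 0.1
    learning_rate: float = 3e-4
    epochs: int = 3  # the README gives a range of 3 to 5

    @property
    def head_dim(self) -> int:
        # Multi-head attention splits the embedding evenly across heads.
        if self.n_embd % self.n_head != 0:
            raise ValueError("n_embd must be divisible by n_head")
        return self.n_embd // self.n_head

    @property
    def tokens_per_batch(self) -> int:
        return self.batch_size * self.block_size


cfg = TrainConfig()
print(cfg.head_dim)          # 64
print(cfg.tokens_per_batch)  # 8192
```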

Training

Dataset is tokenized at the character level.
Inputs are grouped into fixed-length sequences (block_size).
Data is split into training and validation sets (80/20).
Model is optimized using the AdamW optimizer.
Loss is computed using CrossEntropyLoss.
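
The data preparation steps above can be sketched in plain Python. This is a simplified illustration under stated assumptions: it builds a character vocabulary directly from the text (which may differ from the tokenizer the project actually loads), uses non-overlapping windows, and splits sequentially rather than randomly. The helper names build_vocab, make_blocks, and split_train_val are hypothetical.

```python
def build_vocab(text):
    """Map each distinct character to an integer id and back."""
    chars = sorted(set(text))
    stoi = {ch: i for i, ch in enumerate(chars)}
    itos = {i: ch for ch, i in stoi.items()}
    return stoi, itos


def make_blocks(ids, block_size):
    """Cut ids into (input, target) pairs; the target is the input shifted by one."""
    blocks = []
    for start in range(0, len(ids) - block_size, block_size):
        chunk = ids[start:start + block_size + 1]
        blocks.append((chunk[:-1], chunk[1:]))
    return blocks


def split_train_val(blocks, train_fraction=0.8):
    """Sequential split: the first 80% for training, the rest for validation."""
    cut = int(len(blocks) * train_fraction)
    return blocks[:cut], blocks[cut:]


text = "To be, or not to be, that is the question."
stoi, itos = build_vocab(text)
ids = [stoi[ch] for ch in text]
blocks = make_blocks(ids, block_size=8)
train, val = split_train_val(blocks)

x, y = blocks[0]
print("".join(itos[i] for i in x))  # To be, o
print("".join(itos[i] for i in y))  # o be, or
```

Each target sequence is the input shifted one character to the right, so every position in a block supplies one next-character prediction.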

Training metrics (loss, learning rate, epoch time) are logged to the Weights and Biases project dashboard.
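
A single optimization step with AdamW and CrossEntropyLoss might look like the sketch below. To keep it self-contained, a small embedding-plus-linear model stands in for the Transformer and random ids stand in for a real batch; the vocabulary size of 65 and the train_step helper are assumptions. The metrics dictionary shows the kind of values that could be passed to wandb.log, which is deliberately not called here.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
vocab_size, block_size, batch_size = 65, 128, 64  # vocab_size is an assumption

# Stand-in model so the snippet runs on its own; the project uses a Transformer here.
model = nn.Sequential(nn.Embedding(vocab_size, 256), nn.Linear(256, vocab_size))
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)
loss_fn = nn.CrossEntropyLoss()


def train_step(x, y):
    """Run one forward/backward pass and return the scalar loss."""
    model.train()
    logits = model(x)  # (batch, time, vocab_size)
    # CrossEntropyLoss expects (N, classes) logits and (N,) targets, so flatten both.
    loss = loss_fn(logits.reshape(-1, vocab_size), y.reshape(-1))
    optimizer.zero_grad(set_to_none=True)
    loss.backward()
    optimizer.step()
    return loss.item()


# Random ids stand in for a real batch; targets are the inputs shifted by one.
data = torch.randint(0, vocab_size, (batch_size, block_size + 1))
x, y = data[:, :-1], data[:, 1:]

losses = [train_step(x, y) for _ in range(20)]
metrics = {"loss": losses[-1], "lr": optimizer.param_groups[0]["lr"]}
print(metrics)  # values of this form could be passed to wandb.log(metrics)
```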

Model Access

The trained model and tokenizer are available on Hugging Face Hub:

NataliaH/tiny_shakespeare_transformer on Hugging Face

How to Use

Install required dependencies:

pip install torch transformers datasets wandb scikit-learn

Load the model and tokenizer:

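import torch
from transformers import AutoTokenizer

# TransformerModel is the project's own model class; define or import it first
# and construct it with the same hyperparameters that were used for training.
model = TransformerModel(...)  # Initialize model architecture
model.load_state_dict(torch.load("model.pth", map_location="cpu"))
model.eval()  # disable dropout for inference
tokenizer = AutoTokenizer.from_pretrained("gpt2")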

Generate text or fine-tune further as needed.
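
Text generation with an autoregressive model usually means sampling one token at a time and feeding it back in. The sketch below is one way to do that, assuming the model maps a (batch, time) tensor of token ids to (batch, time, vocab_size) logits; the generate function, its parameters, and the stand-in model are hypothetical and not part of the repository.

```python
import torch
import torch.nn as nn


@torch.no_grad()
def generate(model, idx, max_new_tokens, block_size=128, temperature=1.0):
    """Sample max_new_tokens ids autoregressively, starting from idx (batch, time)."""
    model.eval()
    for _ in range(max_new_tokens):
        context = idx[:, -block_size:]  # keep at most block_size tokens
        logits = model(context)[:, -1, :] / temperature
        probs = torch.softmax(logits, dim=-1)
        next_id = torch.multinomial(probs, num_samples=1)
        idx = torch.cat([idx, next_id], dim=1)
    return idx


torch.manual_seed(0)
vocab_size = 65  # assumed vocabulary size
# Stand-in model with the assumed input/output shapes; swap in the trained model.
demo_model = nn.Sequential(nn.Embedding(vocab_size, 32), nn.Linear(32, vocab_size))

prompt = torch.zeros(1, 1, dtype=torch.long)
out = generate(demo_model, prompt, max_new_tokens=20)
print(out.shape)  # torch.Size([1, 21])
```

With the trained model and its tokenizer loaded, the resulting ids could be turned back into text with tokenizer.decode(out[0].tolist()), provided the ids come from that tokenizer's vocabulary. Lower temperature values make the sampling more conservative.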

Project Links

License

This project is licensed under the MIT License.

