This repo is a minimal implementation of the Llama2-7B model, with only two dependencies (torch and sentencepiece) for a straightforward setup.
Beyond minimal-llama, this repo adds:
- Refined decoding
- Correct LoRA fine-tuning
- KV cache (see the sketch after this list)
- Multi-turn conversation
- Beam search (in progress)
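
For context, here is a minimal sketch of what a KV cache does during autoregressive decoding. The class and tensor shapes are illustrative, not this repo's actual API:

```python
import torch

class KVCache:
    """Illustrative key/value cache for one attention layer (not this repo's API)."""

    def __init__(self, max_batch, max_seq, n_heads, head_dim):
        self.k = torch.zeros(max_batch, max_seq, n_heads, head_dim)
        self.v = torch.zeros(max_batch, max_seq, n_heads, head_dim)

    def update(self, start_pos, k_new, v_new):
        # Write the keys/values for the newly computed positions...
        bsz, seqlen = k_new.shape[:2]
        self.k[:bsz, start_pos:start_pos + seqlen] = k_new
        self.v[:bsz, start_pos:start_pos + seqlen] = v_new
        # ...and return everything cached so far, so attention for the new
        # token can reuse all previous positions without recomputing them.
        return self.k[:bsz, :start_pos + seqlen], self.v[:bsz, :start_pos + seqlen]
```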
- Set up the environment:

```bash
conda create --name llama python=3.10
conda activate llama
git clone https://github.com/YUECHE77/LLaMA2.git
cd LLaMA2
pip install torch sentencepiece
```
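
Optionally, a quick sanity check that the two dependencies import (a minimal snippet, nothing repo-specific):

```python
import torch
import sentencepiece

print(torch.__version__)
print(torch.cuda.is_available())   # True if your torch build can see a GPU
print(sentencepiece.__version__)
```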
- Download the model: get the Llama-2-7b-chat model and tokenizer from Hugging Face. You can use the base model instead if you prefer.
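
If you prefer the command line, one way to fetch the weights is with huggingface-cli (an assumption on my part, not a script from this repo; the meta-llama repos are gated, so you need an approved account and access token):

```bash
pip install huggingface_hub
huggingface-cli login
huggingface-cli download meta-llama/Llama-2-7b-chat --local-dir /path/to/Llama-2-7b-chat
```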
- Fine-tune: similar to minimal-llama, this repo uses the Alpaca dataset with only 200 samples for quick experimentation.
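
Each record follows the standard Alpaca instruction/input/output schema; the sample below is illustrative of what alpaca_data_200_samples.json contains:

```json
{
  "instruction": "Give three tips for staying healthy.",
  "input": "",
  "output": "1. Eat a balanced diet. 2. Exercise regularly. 3. Get enough sleep."
}
```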
```bash
python finetune.py \
    --model-path /path/to/Llama-2-7b-chat \
    --data-path alpaca_data_200_samples.json \
    --save-path /path/to/save/lora_weights.pth \
    --lr 1e-5 \
    --accumulate-steps 8
```
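
For background on what the saved LoRA weights represent: a low-rank adapter trains a small update B·A alongside each frozen linear layer. A minimal sketch, not this repo's exact implementation:

```python
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base linear layer plus a trainable low-rank update (illustrative)."""

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False          # only the adapter is trained
        self.lora_a = nn.Linear(base.in_features, rank, bias=False)
        self.lora_b = nn.Linear(rank, base.out_features, bias=False)
        nn.init.zeros_(self.lora_b.weight)   # the update starts at zero
        self.scale = alpha / rank

    def forward(self, x):
        return self.base(x) + self.scale * self.lora_b(self.lora_a(x))
```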
- Inference: omit --lora-path if you haven't fine-tuned.
```bash
python inference.py \
    --model-path /path/to/Llama-2-7b-chat \
    --lora-path /path/to/lora_weights.pth \
    --max-len 128 \
    --sampling \
    --temperature 0.7 \
    --top-k 50 \
    --top-p 0.9
```
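
--temperature rescales the logits, --top-k keeps only the k most likely tokens, and --top-p (nucleus sampling) keeps the smallest set of tokens whose cumulative probability reaches p. A minimal sketch of how these combine (illustrative, not the repo's exact code):

```python
import torch

def sample_next_token(logits, temperature=0.7, top_k=50, top_p=0.9):
    """Sample one token id from a 1-D logits tensor (illustrative)."""
    logits = logits / temperature
    # Top-k: keep only the k highest-scoring tokens (torch.topk returns them sorted).
    k = min(top_k, logits.size(-1))
    topk_vals, topk_idx = torch.topk(logits, k)
    probs = torch.softmax(topk_vals, dim=-1)
    # Top-p: keep the smallest prefix whose cumulative mass reaches top_p.
    cumulative = torch.cumsum(probs, dim=-1)
    keep = cumulative - probs < top_p   # always keeps at least the first token
    probs = probs * keep
    probs = probs / probs.sum()
    next_id = topk_idx[torch.multinomial(probs, num_samples=1)]
    return next_id.item()
```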
- Chat: type 'exit' to end the conversation. You can modify the maximum history length in ModelArgs (see model.py); the history gets truncated once it exceeds this value (see the sketch after the command).
```bash
python chat.py \
    --model-path /path/to/Llama-2-7b-chat \
    --lora-path /path/to/lora_weights.pth \
    --max-len 128 \
    --sampling \
    --temperature 0.7 \
    --top-k 50 \
    --top-p 0.9
```
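
As a rough picture of the history truncation mentioned above: a minimal sketch assuming a list of (user, assistant) turns and a sentencepiece tokenizer. The Llama-2 chat [INST] format is real; the helper and its names are hypothetical, not this repo's exact code:

```python
def build_prompt(history, max_history_len, tokenizer):
    """Concatenate (user, assistant) turns, dropping the oldest ones that overflow.

    `history` is a list of (user_msg, assistant_msg) pairs; `tokenizer` is a
    sentencepiece SentencePieceProcessor. Illustrative only.
    """
    turns = [f"[INST] {u} [/INST] {a}" for u, a in history]
    # Drop the oldest turns until the encoded prompt fits the history budget.
    while turns and len(tokenizer.encode(" ".join(turns))) > max_history_len:
        turns.pop(0)
    return " ".join(turns)
```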
