Project PRIMAL: 4-bit Prime-Harmonic Training Engine

Status: Active Research / Proof of Concept
Hardware Target: Consumer GPUs (e.g., GTX 1080 Ti, RTX 3060)
License: MIT

🚀 The 11GB Challenge

Training Large Language Models (LLMs) usually requires massive VRAM because of the **Shadow Weight Tax**: Quantization-Aware Training (QAT) keeps the model in 4-bit but maintains a full FP16/FP32 copy of the weights for updates, effectively doubling memory usage.

Project PRIMAL removes the shadow weights entirely.

It implements a Discrete Optimization Loop that trains a 0.1B-parameter model directly on a rigid 4-bit integer grid. This allows for massive batch sizes and high throughput on older cards like the GTX 1080 Ti.

Text-to-speech components:

Neural Vocoder: HiFi-GAN v3 integration via SpeechBrain.
Automation: voice_test.py for batch inference verification.

🚀 Quick Start (Inference)

To generate speech from text, ensure you are in the project root and run:

# Single Sentence
python tts_inference.py --text "Project Trinity is alive." --checkpoint "checkpoints/ghost_tts/best_sentinel.pt" --output "output.wav"

# Automated Batch Test
python voice_test.py

Results will be saved in tests/voice_samples/.

⚡ Key Features

1. Prime Harmonic Grid (v3.0.0)

Instead of linear INT4 quantization (which wastes precision on large numbers), PRIMAL uses a custom 13-value Look-Up Table (LUT) derived from prime reciprocals. This concentrates precision around zero, where 90% of LLM weights reside.

Index	Value	Description
0-6	`±1, ±0.5, ±0.33...`	Coarse adjustment for "body" layers.
7	`0.0`	Exact zero for sparsity.
Fine	`±0.66, ±0.25...`	(Layer 12 Only) High-precision bridge.
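
As a rough illustration of how a prime-reciprocal grid concentrates levels near zero, the sketch below builds a 13-value table and snaps weights to the nearest entry. The exact values are an assumption: it uses 0, ±1, and ±1/p for the primes 2, 3, 5, 7, and 11, which gives 13 entries but is not confirmed to be the project's actual LUT. The function names `build_prime_grid` and `quantize_to_grid` are hypothetical, and the indices are plain sorted order, so they are not claimed to match the index numbering in the table above.

```python
import numpy as np

def build_prime_grid(primes=(2, 3, 5, 7, 11)):
    """Assumed grid: 0, +/-1, and +/-1/p for each prime p (13 values in total)."""
    positive = [1.0] + [1.0 / p for p in primes]
    values = sorted([-v for v in positive] + [0.0] + positive)
    return np.array(values, dtype=np.float32)

def quantize_to_grid(weights, grid):
    """Return (index, value) of the nearest grid entry for every weight."""
    # Broadcast to shape (..., len(grid)) and pick the closest level per weight.
    idx = np.abs(weights[..., None] - grid).argmin(axis=-1)
    return idx.astype(np.uint8), grid[idx]

grid = build_prime_grid()
w = np.array([0.02, -0.4, 0.3, 0.95, -0.12], dtype=np.float32)
idx, wq = quantize_to_grid(w, grid)
print(grid)
print(idx, wq)
```

In this assumed grid, 7 of the 13 levels fall inside [-0.2, 0.2], while a uniform grid over [-1, 1] with the same number of levels would place only 3 there.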

2. The "Poltergeist"

Antigravity: Shadowless 8-bit Discrete Training (Linear Protocol)

Discrete training often fails due to "stochastic thrashing" during gradient accumulation. PRIMAL solves this with Decoupled Flipping:

Backward Pass: No updates. Gradients cast "votes" (+1 or -1) into an int8 buffer.
Optimizer Step: Votes are aggregated. Weights only flip if there is a consensus across micro-batches (e.g., Batch 64).
Adaptive Probability: Weights flip stochastically based on their magnitude (Z-Score filtering).
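
The three steps above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the project's kernel: `cast_votes` and `optimizer_step` are hypothetical names, a "flip" is assumed to mean moving one position along the sorted grid, `consensus` is an assumed vote threshold, and the Z-Score filtering is replaced by a single placeholder probability `flip_prob`.

```python
import numpy as np

def cast_votes(vote_buf, grad):
    """Backward pass: record one vote per weight and leave the weights untouched."""
    votes = -np.sign(grad).astype(np.int16)            # descent direction: +1, -1 or 0
    total = vote_buf.astype(np.int16) + votes          # widen so the add cannot wrap
    return np.clip(total, -127, 127).astype(np.int8)   # saturate back into int8

def optimizer_step(idx, vote_buf, n_levels, consensus, rng, flip_prob=1.0):
    """Optimizer step: move a grid index only where the votes reach consensus."""
    agree = np.abs(vote_buf.astype(np.int16)) >= consensus
    # Placeholder for the adaptive probability (assumption: one fixed value).
    chosen = rng.random(idx.shape) < flip_prob
    step = np.sign(vote_buf).astype(np.int16) * (agree & chosen)
    new_idx = np.clip(idx.astype(np.int16) + step, 0, n_levels - 1).astype(np.uint8)
    return new_idx, np.zeros_like(vote_buf)            # votes are cleared after the step

# Four weights, all starting at index 6 of an assumed 13-level grid.
idx = np.full(4, 6, dtype=np.uint8)
votes = np.zeros(4, dtype=np.int8)
micro_batch_grads = [
    [0.1, -0.2, 0.3, -0.1],
    [0.2, -0.1, -0.3, -0.1],
    [0.1, -0.3, 0.2, 0.1],
    [0.3, -0.1, -0.1, -0.2],
]
for g in micro_batch_grads:
    votes = cast_votes(votes, np.array(g))
idx, votes = optimizer_step(idx, votes, n_levels=13, consensus=4,
                            rng=np.random.default_rng(0))
print(idx)
```

Only the first two weights receive four matching votes, so only they move. The third weight's votes cancel out and the fourth reaches a net of 2, so both stay put, which is the thrashing the consensus rule is meant to suppress.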

3. Efficiency Benchmarks (GTX 1080 Ti)

Metric	Model Size	Result	Notes
Training VRAM	0.1B Params	10.3 GB	@ Batch Size 64 (Full Saturation)
Inference VRAM	1.1B Params	~550 MB	4-bit Loading Verified
Throughput	0.1B Params	~6,000 TPS	Training (Python Loop)
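
The inference figure in the table can be checked with simple arithmetic on weight storage alone. The sketch below assumes tightly packed weights, decimal megabytes, and one int8 vote per weight during training. The helper name `weight_megabytes` is hypothetical, and activations, the LUT, and framework overhead are not modelled.

```python
def weight_megabytes(n_params, bits_per_weight):
    """Storage for n_params values at the given width, in decimal megabytes."""
    return n_params * bits_per_weight / 8 / 1e6

inference_4bit = weight_megabytes(1_100_000_000, 4)   # 1.1B params packed at 4 bits
fp16_shadow = weight_megabytes(1_100_000_000, 16)     # FP16 shadow copy a QAT setup would add
train_4bit = weight_megabytes(100_000_000, 4)         # 0.1B training model, weights only
vote_buffer = weight_megabytes(100_000_000, 8)        # assumed: one int8 vote per weight

print(inference_4bit, fp16_shadow, train_4bit, vote_buffer)
```

Under these assumptions the 1.1B model's weights come to 550 MB, matching the inference row, while an FP16 shadow copy alone would add 2,200 MB. For the 0.1B training run the weights and vote buffer account for roughly 150 MB, so most of the 10.3 GB reported at batch size 64 goes to something the table does not break down.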

🛠️ Installation & Usage

Prerequisites

NVIDIA GPU (Pascal or newer)
CUDA 11.8+
Python 3.10+
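
A quick way to check these prerequisites before installing is sketched below. It assumes PyTorch is the framework in use, which this section does not state, and `check_prereqs` is a hypothetical helper rather than a script shipped in the repository. Pascal corresponds to CUDA compute capability 6.x, so the GPU check looks for a major version of at least 6.

```python
import sys

def check_prereqs():
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info < (3, 10):
        problems.append("Python 3.10+ required, found %d.%d" % sys.version_info[:2])
    try:
        import torch  # assumption: the project runs on PyTorch
    except ImportError:
        problems.append("PyTorch not importable; GPU checks skipped")
        return problems
    if not torch.cuda.is_available():
        problems.append("No CUDA device visible to PyTorch")
        return problems
    major, minor = torch.cuda.get_device_capability(0)
    if major < 6:  # Pascal is compute capability 6.x
        problems.append("GPU compute capability %d.%d is older than Pascal" % (major, minor))
    cuda = torch.version.cuda  # None on CPU-only builds
    if cuda is not None and tuple(int(p) for p in cuda.split(".")[:2]) < (11, 8):
        problems.append("CUDA 11.8+ required, PyTorch was built against %s" % cuda)
    return problems

issues = check_prereqs()
print("OK" if not issues else "\n".join(issues))
```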

Quick Start

# Clone the repo
git clone https://github.com/batteryphil/Primal-Discrete-LLM-Training.git
cd Primal-Discrete-LLM-Training

# Install dependencies
pip install -r requirements.txt

# Run the training demo (0.1B Model)
python primal_train_ghost.py
