This repository explores language-specific neurons in LLMs. The current workflow has seven steps:
1. `1_tokenize.py`: Tokenize the source dataset for each language and save train/validation token tensors to `data/1_output/<experiment>/`.
2. `2_record_activations.py`: Run the base model on the tokenized data and record per-language neuron statistics to `data/2_output/<experiment>/`.
3. `3_identify_neurons.py`: Apply LAPE over the recorded activations and save the selected language-specific neurons to `data/3_output/<experiment>/`.
4. `4_identified_neurons_eval.py`: Evaluate neuron ablations directly on the base model and save cross-language perplexity matrices to `data/4_output/<experiment>/`.
5. `5_generate_language_specific_model.py`: Export one ablated model per language by zeroing the selected neurons and save them to `data/5_output/<experiment>/`.
6. `6_eval_language_models.py`: Evaluate the exported language-specific models and save cross-language perplexity matrices to `data/6_output/<experiment>/`.
7. `finetuning_impl/7_finetuning.py` (via `7-finetuning.sh`): Finetune the model while restricting updates to the selected neuron set, using `configs/7_finetuning_latn.yaml` for the current experiment setup.
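The core of step 3 is LAPE (Language Activation Probability Entropy): for each neuron, normalize its per-language activation probabilities into a distribution and score it by entropy, so neurons that fire almost exclusively for one language get scores near zero. A minimal sketch of that scoring (the tensor layout and function name are assumptions for illustration, not the repo's API):

```python
import torch

def lape(act_prob: torch.Tensor, eps: float = 1e-10) -> torch.Tensor:
    """Language Activation Probability Entropy per neuron.

    act_prob: [n_languages, n_neurons], where entry (l, j) is the
    probability that neuron j activates on text in language l.
    Returns one entropy score per neuron; low = language-specific.
    """
    # Normalize each neuron's activation probabilities over languages.
    p = act_prob / act_prob.sum(dim=0, keepdim=True).clamp_min(eps)
    # Entropy of that distribution; near 0 when one language dominates.
    return -(p * p.clamp_min(eps).log()).sum(dim=0)
```

Step 3 would then keep the neurons whose entropy falls below some threshold; the exact selection rule used in the repo is not sketched here.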
Use a simple pip setup with the requirements file:
```bash
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
```

The main pipeline scripts use Hydra with `configs/default.yaml`:
```bash
python 1_tokenize.py
python 2_record_activations.py
python 3_identify_neurons.py
python 4_identified_neurons_eval.py
python 5_generate_language_specific_model.py
python 6_eval_language_models.py
```

Step 7 uses the finetuning config and helper launcher:
```bash
bash 7-finetuning.sh
```

- Shared pipeline settings live in `configs/default.yaml`.
- The current finetuning experiment lives in `configs/7_finetuning_latn.yaml`.
- `cfg.main.languages` defines the language set used across the main pipeline unless a later step overrides it.
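Steps 4 and 5 both rest on zeroing the selected neurons' contribution. For a standard two-layer transformer MLP block, one way to do this is to zero the neuron's row in the up-projection (so it never activates) and its column in the down-projection (so nothing of it reaches the residual stream). A minimal sketch, assuming plain `nn.Linear` projections; the names `up_proj`/`down_proj` and the neuron-index bookkeeping are assumptions, not the repo's actual code:

```python
import torch
import torch.nn as nn

def ablate_mlp_neurons(up_proj: nn.Linear, down_proj: nn.Linear, neuron_ids) -> None:
    """Zero the selected intermediate MLP neurons so they contribute nothing."""
    with torch.no_grad():
        # Silence the neuron's pre-activation entirely.
        up_proj.weight[neuron_ids, :] = 0.0
        if up_proj.bias is not None:
            up_proj.bias[neuron_ids] = 0.0
        # Remove any residual-stream contribution from those neurons.
        down_proj.weight[:, neuron_ids] = 0.0
```

Applying this per language to a copy of the base model yields one ablated model per language, as step 5 describes.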
- `2_visualize_activations.py` is a helper for inspecting recorded activations and is not one of the numbered pipeline steps.
- Step 7 is implemented separately from the Hydra pipeline driven by `configs/default.yaml`; it uses the selected neuron artifact from step 3.
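Restricting finetuning updates to the selected neuron set, as step 7 does, can be implemented by masking gradients so the optimizer only ever moves the chosen parameter rows. A minimal sketch of that idea using a backward hook (the function name and row-wise masking are assumptions, not necessarily how `finetuning_impl/7_finetuning.py` does it):

```python
import torch
import torch.nn as nn

def restrict_updates(param: nn.Parameter, row_ids) -> None:
    """Mask gradients so only the given rows of `param` are updated.

    The hook multiplies every incoming gradient by a 0/1 mask, so any
    optimizer step leaves the unselected rows untouched.
    """
    mask = torch.zeros_like(param)
    mask[row_ids] = 1.0
    param.register_hook(lambda grad: grad * mask)
```

Because the hook runs on every backward pass, this works with any standard optimizer and requires no changes to the training loop itself.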