This repository explores language-specific neurons in LLMs. The current workflow has seven steps:
1. `1_tokenize.py`: Tokenize the source dataset for each language and save train/validation token tensors to `data/1_output/<experiment>/`.
2. `2_record_activations.py`: Run the base model on the tokenized data and record per-language neuron statistics to `data/2_output/<experiment>/`.
3. `3_identify_neurons.py`: Apply LAPE over the recorded activations and save the selected language-specific neurons to `data/3_output/<experiment>/`.
4. `4_identified_neurons_eval.py`: Evaluate neuron ablations directly on the base model and save cross-language perplexity matrices to `data/4_output/<experiment>/`.
5. `5_generate_language_specific_model.py`: Export one ablated model per language by zeroing the selected neurons and save them to `data/5_output/<experiment>/`.
6. `6_eval_language_models.py`: Evaluate the exported language-specific models and save cross-language perplexity matrices to `data/6_output/<experiment>/`.
7. `finetuning_impl/7_finetuning.py` (via `7-finetuning.sh`): Finetune the model while restricting updates to the selected neuron set, using `configs/7_finetuning_latn.yaml` for the current experiment setup.
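The core of step 3 is LAPE (Language Activation Probability Entropy): for each neuron, normalize its per-language activation probabilities into a distribution and score it by entropy, so neurons that fire almost exclusively for one language get scores near zero. A minimal sketch of that scoring (the tensor layout and function name are assumptions for illustration, not the repo's API):

```python
import torch

def lape(act_prob: torch.Tensor, eps: float = 1e-10) -> torch.Tensor:
    """Language Activation Probability Entropy per neuron.

    act_prob: [n_languages, n_neurons], where entry (l, j) is the
    probability that neuron j activates on text in language l.
    Returns one entropy score per neuron; low = language-specific.
    """
    # Normalize each neuron's activation probabilities over languages.
    p = act_prob / act_prob.sum(dim=0, keepdim=True).clamp_min(eps)
    # Entropy of that distribution; near 0 when one language dominates.
    return -(p * p.clamp_min(eps).log()).sum(dim=0)
```

Step 3 would then keep the neurons whose entropy falls below some threshold; the exact selection rule used in the repo is not sketched here.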
Use a simple pip setup with the requirements file:
```bash
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
```

The main pipeline scripts use Hydra with `configs/default.yaml`:
```bash
python 1_tokenize.py
python 2_record_activations.py
python 3_identify_neurons.py
python 4_identified_neurons_eval.py
python 5_generate_language_specific_model.py
python 6_eval_language_models.py
```

Step 7 uses the finetuning config and helper launcher:
```bash
bash 7-finetuning.sh
```

- Shared pipeline settings live in `configs/default.yaml`.
- The current finetuning experiment lives in `configs/7_finetuning_latn.yaml`.
- `cfg.main.languages` defines the language set used across the main pipeline unless a later step overrides it.
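Steps 4 and 5 both rest on zeroing the selected neurons' contribution. For a standard two-layer transformer MLP block, one way to do this is to zero the neuron's row in the up-projection (so it never activates) and its column in the down-projection (so nothing of it reaches the residual stream). A minimal sketch, assuming plain `nn.Linear` projections; the names `up_proj`/`down_proj` and the neuron-index bookkeeping are assumptions, not the repo's actual code:

```python
import torch
import torch.nn as nn

def ablate_mlp_neurons(up_proj: nn.Linear, down_proj: nn.Linear, neuron_ids) -> None:
    """Zero the selected intermediate MLP neurons so they contribute nothing."""
    with torch.no_grad():
        # Silence the neuron's pre-activation entirely.
        up_proj.weight[neuron_ids, :] = 0.0
        if up_proj.bias is not None:
            up_proj.bias[neuron_ids] = 0.0
        # Remove any residual-stream contribution from those neurons.
        down_proj.weight[:, neuron_ids] = 0.0
```

Applying this per language to a copy of the base model yields one ablated model per language, as step 5 describes.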
- `2_visualize_activations.py` is a helper for inspecting recorded activations and is not one of the numbered pipeline steps.
- Step 7 is implemented separately from the Hydra pipeline driven by `configs/default.yaml`; it uses the selected neuron artifact from step 3.
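Restricting finetuning updates to the selected neuron set, as step 7 does, can be implemented by masking gradients so the optimizer only ever moves the chosen parameter rows. A minimal sketch of that idea using a backward hook (the function name and row-wise masking are assumptions, not necessarily how `finetuning_impl/7_finetuning.py` does it):

```python
import torch
import torch.nn as nn

def restrict_updates(param: nn.Parameter, row_ids) -> None:
    """Mask gradients so only the given rows of `param` are updated.

    The hook multiplies every incoming gradient by a 0/1 mask, so any
    optimizer step leaves the unselected rows untouched.
    """
    mask = torch.zeros_like(param)
    mask[row_ids] = 1.0
    param.register_hook(lambda grad: grad * mask)
```

Because the hook runs on every backward pass, this works with any standard optimizer and requires no changes to the training loop itself.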