This is the repository for the paper: Knowing You Don’t Know: Learning When to Continue Search in Multi-round RAG through Self-Practicing (SIGIR '25). It provides a framework to run SIM-RAG experiments and evaluate models. Follow the steps below to set up and run your experiments.
- Clone the repository to your local machine:

  ```bash
  git clone https://github.com/your/repository.git
  cd repository
  ```
- Ensure you have all the required dependencies installed (refer to `requirements.txt` or the installation instructions in the repo).
- If you're using GPT, make sure to set your API key in the environment. You can do this by adding the following line to your `.bashrc`, `.zshrc`, or equivalent shell configuration file:

  ```bash
  export OPENAI_API_KEY="your-api-key-here"
  ```

  Then, run:

  ```bash
  source ~/.bashrc  # or `source ~/.zshrc` for Zsh users
  ```
- Likewise, if you're using Llama, make sure to set the local path to Llama in the environment:

  ```bash
  export LLAMA_PATH="/path/to/your/llama"
  ```

  Then, run:

  ```bash
  source ~/.bashrc  # or `source ~/.zshrc` for Zsh users
  ```
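  As a quick sanity check (not part of the repo), you can confirm from Python that these variables are visible before launching an experiment; the scripts presumably read them via `os.environ`:

  ```python
  # Sanity check: confirm the environment variables set above are
  # visible to Python. Purely illustrative; not part of the repo.
  import os

  for var in ("OPENAI_API_KEY", "LLAMA_PATH"):
      print(f"{var}: {'set' if os.environ.get(var) else 'NOT set'}")
  ```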
- Download our prebuilt corpus files `corpus.pkl`, `wiki_corpus.pkl`, `retriever_settings.pkl`, and `wiki_retriever_settings.pkl` into the `bm25_search` directory for retrieval:
  ```bash
  git clone https://huggingface.co/datasets/dyang39/SIM-RAG-Corpus bm25_search
  ```
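  If the cloned `.pkl` files are only a few bytes, they are likely Git LFS pointers; install `git-lfs` and re-clone. The snippet below (not from the repo) verifies the download, assuming the files are standard Python pickles:

  ```python
  # Verify the corpus pickles load; illustrative only. Pickles that
  # contain custom classes may need the repo's modules importable.
  import pickle
  from pathlib import Path

  for name in ("corpus.pkl", "wiki_corpus.pkl",
               "retriever_settings.pkl", "wiki_retriever_settings.pkl"):
      with (Path("bm25_search") / name).open("rb") as f:
          obj = pickle.load(f)
      print(f"{name}: loaded {type(obj).__name__}")
  ```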
- (Optional) Prepare the original datasets. The datasets have already been prepared and are ready to use. However, if you'd like to prepare them yourself, place the 2WikiMultihopQA dataset (downloaded from its GitHub repository) in the `data` directory; the scripts load HotpotQA and TriviaQA directly from Hugging Face. Once ready, run the following scripts to process the datasets:
  ```bash
  python data/prepare_2wikimultihopqa.py
  python data/prepare_triviaqa.py
  python data/prepare_hotpotqa.py
  ```
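  For reference, pulling these datasets from Hugging Face typically looks like the sketch below; the config and split names are assumptions, and the repo's prepare scripts may load them differently:

  ```python
  # Illustrative only: typical Hugging Face loading for HotpotQA and
  # TriviaQA. Some `datasets` versions may need trust_remote_code=True.
  from datasets import load_dataset

  hotpot = load_dataset("hotpot_qa", "distractor", split="validation")
  trivia = load_dataset("trivia_qa", "rc", split="validation")
  print(hotpot[0]["question"])
  print(trivia[0]["question"])
  ```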
To run the SIM-RAG experiment, you'll first need to create a custom script using `run_SIM-RAG.py`. This script will guide you through entering parameters and generating an executable `.sh` file for your experiment.

- Run `run_SIM-RAG.py`:

  ```bash
  python run_SIM-RAG.py
  ```
- Follow the prompts to enter details for the SIM-RAG experiment.
- After you have entered all details, the script will generate a `.sh` file in the `bash_scripts` directory, which can be used to run the SIM-RAG experiment.
- Change the permissions of the generated `.sh` file to make it executable:

  ```bash
  chmod +x bash_scripts/{script_filename}
  ```
- Run the generated `.sh` file to start the experiment:

  ```bash
  ./bash_scripts/{script_filename}
  ```
Once the SIM-RAG experiment is complete, you can evaluate the predictions using `evaluate_SIM-RAG.py`.
- The predictions for each dataset are saved in the `predictions/` directory in the format `{name}_predictions.csv`.
- To evaluate the predictions, run `evaluate_SIM-RAG.py` with the experiment name:

  ```bash
  python evaluate_SIM-RAG.py --experiment_name {name}
  ```
- The script will output the EM and F1 scores of the predictions. Fine-grained, intermediate evaluation data and statistics can also be found in `logs/{name}_log.txt`.
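  For reference, EM (exact match) and F1 here are the standard SQuAD-style answer metrics; below is a minimal sketch of how they are conventionally computed (the repo's implementation may differ in normalization details):

  ```python
  # Standard SQuAD-style EM/F1 sketch; illustrative, not the repo's code.
  import re
  import string
  from collections import Counter

  def normalize(text: str) -> str:
      text = "".join(ch for ch in text.lower() if ch not in string.punctuation)
      text = re.sub(r"\b(a|an|the)\b", " ", text)  # drop articles
      return " ".join(text.split())

  def exact_match(pred: str, gold: str) -> float:
      return float(normalize(pred) == normalize(gold))

  def f1(pred: str, gold: str) -> float:
      p, g = normalize(pred).split(), normalize(gold).split()
      overlap = sum((Counter(p) & Counter(g)).values())
      if overlap == 0:
          return 0.0
      precision, recall = overlap / len(p), overlap / len(g)
      return 2 * precision * recall / (precision + recall)

  print(exact_match("The Eiffel Tower", "Eiffel Tower"))              # 1.0
  print(round(f1("Eiffel Tower, Paris", "the Eiffel Tower"), 2))      # 0.8
  ```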
To run in the background (e.g., on a remote server), modify the `.sh` file to run each command with `nohup`. If you have downloaded checkpoints for an already trained Critic, modify the `.sh` file to run only the last line (inference), passing the path to the Critic via `--dm_path`. Make sure the tokenizer is in the same directory.
We provide a general-purpose Critic:

- SIM-RAG-Llama3-2B: This Flan-T5-based Critic is fine-tuned on six datasets, including TriviaQA, HotPotQA, 2WikiMultiHopQA, PopQA, and Musique. It can be used directly as a general-purpose Critic in our inference pipeline; a loading sketch follows below.
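Here is a rough sketch of loading such a checkpoint for inference, assuming a Flan-T5-style (seq2seq) Critic as described above; the directory, prompt format, and output labels are hypothetical, not the repo's actual interface (see the inference code and `--dm_path` above):

```python
# Hypothetical sketch: load a local seq2seq Critic checkpoint (the
# directory passed via --dm_path; the tokenizer must live there too)
# and ask whether the current answer is sufficient. Prompt format
# and labels are assumptions; see the repo for the real interface.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

critic_dir = "/path/to/critic"  # hypothetical path
tokenizer = AutoTokenizer.from_pretrained(critic_dir)
model = AutoModelForSeq2SeqLM.from_pretrained(critic_dir)

prompt = "Question: ...\nAnswer so far: ...\nIs this answer sufficient?"
inputs = tokenizer(prompt, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=5)
print(tokenizer.decode(out[0], skip_special_tokens=True))  # e.g. "yes"/"no"
```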
To reproduce the results in the paper, we provide the following checkpoints:
If you find this work useful, please kindly cite:

```bibtex
@article{yang2025rag,
  title={Knowing You Don't Know: Learning When to Continue Search in Multi-round RAG through Self-Practicing},
  author={Yang, Diji and Zeng, Linda and Rao, Jinmeng and Zhang, Yi},
  journal={arXiv preprint arXiv:2505.02811},
  year={2025}
}
```