HEAR-Bench is an audio-augmented benchmark built on top of RoboTwin 2.0. This project focuses on two workflows:
- collecting simulation data with synchronized RGB, state, and audio observations
- training and evaluating pi0-style policies on those tasks
The main HEAR-specific changes are in `envs/`, `assets/audios/`, the audio task configs in `task_config/`, and the pi0 data processing / evaluation pipeline.
- Officially supported: data collection, audio task definitions, pi0 data processing, pi0 training, and pi0 evaluation.
- Main entrypoints:
  - collection: `script/collect_data.py`, `script/collect_data_mp.py`
  - processing: `policy/pi0/scripts/process_data.py`, `policy/pi0/scripts/process_data_mp.py`
  - evaluation: `script/eval_policy.py`
- Legacy compatibility wrappers `script/eval_policy_openpi_qwen3*.py` are kept, but the unified evaluation entry is `script/eval_policy.py`.
- `envs/`: RoboTwin tasks and HEAR audio task implementations
- `assets/audios/`: bundled HEAR-specific audio assets
- `task_config/`: collection/evaluation configs, including `demo_clean_audio*.yml`
- `script/`: collection, evaluation, asset installation, and utility scripts
- `policy/pi0/`: pi0 data processing, training, and deployment code
- `description/`: instruction generation used after collection
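A fresh checkout can be sanity-checked against the layout above with a short script. This is only a convenience sketch built from the directory list in this README (it is not a tool shipped with the repo); run it from the repository root:

```python
from pathlib import Path

# Directories listed in this README's layout overview.
EXPECTED_DIRS = [
    "envs",
    "assets/audios",
    "task_config",
    "script",
    "policy/pi0",
    "description",
]

def missing_dirs(root: str = ".") -> list[str]:
    """Return the expected directories that are absent under `root`."""
    base = Path(root)
    return [d for d in EXPECTED_DIRS if not (base / d).is_dir()]

if __name__ == "__main__":
    missing = missing_dirs()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("Repository layout looks complete.")
```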
HEAR-Bench follows the same simulator setup, dependency chain, and asset download flow as RoboTwin 2.0. Users should treat this repository as a RoboTwin-based extension rather than a separate simulator stack.
Reference:
- RoboTwin 2.0 install guide: https://robotwin-platform.github.io/doc/usage/robotwin-install.html
- RoboTwin 2.0 repository: https://github.com/RoboTwin-Platform/RoboTwin
If this project is added into `IRMVLab/HEAR`, keep the folder name as `HEAR-Bench` and run all commands from `HEAR/HEAR-Bench`.
```bash
conda create -n hear-bench python=3.10 -y
conda activate hear-bench
sudo apt install libvulkan1 mesa-vulkan-drivers vulkan-tools ffmpeg
bash script/_install.sh
bash script/_download_assets.sh
```

If embodiment paths are still unresolved after downloading assets, run:

```bash
python script/update_embodiment_config_path.py
```

HEAR-specific audio files are already included in `assets/audios/`. No extra asset download is required for audio tasks beyond the standard RoboTwin assets.
`policy/pi0` follows the OpenPI package layout and expects a separate Python 3.11 environment.

```bash
cd policy/pi0
pip install uv
uv sync
cd ../..
```

Single-process collection:

```bash
bash collect_data.sh open_microwave_audio demo_clean_audio 0
```

Parallel collection:

```bash
python script/collect_data_mp.py \
    --task_name open_microwave_audio \
    --task_config demo_clean_audio \
    --gpus 0,1 \
    --num_workers 8
```

Collected trajectories are written to `data/<task_name>/<task_config>/`.
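A collection run can be spot-checked by counting entries under the documented output root. Note that this README only specifies the `data/<task_name>/<task_config>/` directory, not the per-episode file layout, so this sketch simply counts immediate children as a rough progress check; adapt it to whatever your collection run actually produces:

```python
from pathlib import Path

def count_episodes(task_name: str, task_config: str, data_root: str = "data") -> int:
    """Count top-level entries under data/<task_name>/<task_config>/.

    The per-episode file naming is not documented here, so each
    immediate child (file or episode subdirectory) is counted as one
    collected trajectory.
    """
    episode_dir = Path(data_root) / task_name / task_config
    if not episode_dir.is_dir():
        return 0
    return sum(1 for _ in episode_dir.iterdir())

if __name__ == "__main__":
    print(count_episodes("open_microwave_audio", "demo_clean_audio"))
```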
Sequential processing:

```bash
python policy/pi0/scripts/process_data.py \
    --task_name open_microwave_audio \
    --setting demo_clean_audio \
    --expert_data_num 50
```

Parallel processing:

```bash
python policy/pi0/scripts/process_data_mp.py \
    --task_name open_microwave_audio \
    --setting demo_clean_audio \
    --expert_data_num 50
```

By default, processed data is exported to `processed_data/<task>-<setting>-<num>`. Use `--output_dir` to override it.
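When wiring downstream scripts to the default export location, the `processed_data/<task>-<setting>-<num>` pattern can be reproduced with a small helper. This is an illustrative convenience, not a function provided by the repo:

```python
from pathlib import Path

def default_processed_dir(task_name: str, setting: str, expert_data_num: int) -> Path:
    """Default export path per this README: processed_data/<task>-<setting>-<num>."""
    return Path("processed_data") / f"{task_name}-{setting}-{expert_data_num}"

if __name__ == "__main__":
    # Matches the processing examples above.
    print(default_processed_dir("open_microwave_audio", "demo_clean_audio", 50))
```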
```bash
cd policy/pi0
bash finetune.sh robotwin hear_pi0 0
cd ../..
```

Use your processed dataset and OpenPI config as needed for custom experiments.
Recommended command:

```bash
python script/eval_policy.py \
    --config policy/pi0/deploy_policy.yml \
    --task_name pour_water_audio_full \
    --task_config demo_clean_audio \
    --policy_name pi0 \
    --train_config_name robotwin \
    --model_name hear_pi0 \
    --checkpoint_id 30000 \
    --ckpt_setting hear_pi0 \
    --seed 0
```

Evaluation outputs are saved under `eval_result/<task>/<policy>/<task_config>/<ckpt_setting>/`.
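When sweeping tasks, checkpoints, or seeds, it can help to assemble the recommended command programmatically and launch it with `subprocess.run`. This sketch only reuses the flags shown above (no additional flags are assumed to exist):

```python
import subprocess

def build_eval_command(task_name: str, checkpoint_id: int, seed: int = 0) -> list[str]:
    """Assemble the recommended script/eval_policy.py invocation from this README."""
    return [
        "python", "script/eval_policy.py",
        "--config", "policy/pi0/deploy_policy.yml",
        "--task_name", task_name,
        "--task_config", "demo_clean_audio",
        "--policy_name", "pi0",
        "--train_config_name", "robotwin",
        "--model_name", "hear_pi0",
        "--checkpoint_id", str(checkpoint_id),
        "--ckpt_setting", "hear_pi0",
        "--seed", str(seed),
    ]

if __name__ == "__main__":
    # Example sweep over seeds for one checkpoint; run from the repo root.
    for seed in range(3):
        cmd = build_eval_command("pour_water_audio_full", 30000, seed=seed)
        print(" ".join(cmd))
        # subprocess.run(cmd, check=True)  # uncomment to actually launch
```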
- Upstream RoboTwin assets (`background_texture`, `embodiments`, `objects`) are still downloaded through `script/_download_assets.sh`.
- HEAR-Bench only adds lightweight audio assets under `assets/audios/`; users do not need a second asset package.
- When publishing into the `IRMVLab/HEAR` repository, this folder should be committed as `HEAR-Bench/`, and the parent repository README should link to this directory rather than duplicating setup details.
HEAR-Bench builds on RoboTwin 2.0 and reuses its simulator, asset pipeline, and task infrastructure. Please cite RoboTwin and OpenPI as appropriate in downstream work.