MuJoCo workspace for SO-101 pick-and-place and teacher-policy training.
The env loads the upstream SO-101 Menagerie MJCF at runtime, builds the task scene on top of it, and keeps the Menagerie gripper collision model. The current default training-scene physics are aligned closely with the standalone grasp-tuning scene in scripts/tune_grasp_physics.py, except the training env still uses the tabletop task scene instead of a raised support platform.
- Base robot model: external/mujoco_menagerie/robotstudio_so101/so101.xml
- Calibration reference: external/SO-ARM100/Simulation/SO101/so101_new_calib.xml
- Table: black tabletop in front of the robot
- Task object: dynamic green box, 28 mm x 28 mm x 28 mm, 0.03 kg
- Receptacle: purple round receptacle, driven by mocap during the task
- Cameras: cam_high and wrist_cam
- Wrist image observation key: observation.images.cam_right_wrist
The task language still says "green cylinder" in a few places, but the current task object in code is a box/cube.
Training-scene defaults now match the grasp-tuning baseline closely:
- Gripper collision model: Menagerie collision geometry
- Integrator: implicitfast
- Solver: Newton
- Timestep: 0.002
- Solver iterations: 30
- Solver line-search iterations: 60
- Object mass: 0.03 kg
- Object contact friction: 1.0 0.005 0.0005
- Object condim: 6
- Gripper contact friction: 1.0 0.005 0.0005
- Gripper condim: 6
- Contact solref: 0.01 1
- Table friction: 0.95 0.02 0.001
These values live in so101_rl/config.py and are applied in so101_rl/scene.py.
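As a rough illustration of how defaults like these map onto MJCF attributes, the sketch below builds an `<option>` element and an object `<geom>` element with the standard library. The `PhysicsDefaults` class and both helper functions are hypothetical and are not the contents of so101_rl/config.py or so101_rl/scene.py. The attribute names (`timestep`, `integrator`, `solver`, `iterations`, `ls_iterations`, `friction`, `condim`, `solref`, `mass`) are standard MJCF. Placing `solref` on the object geom is an assumption about where the contact setting is applied.

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET


@dataclass(frozen=True)
class PhysicsDefaults:
    """Hypothetical container; the real config.py layout may differ."""
    integrator: str = "implicitfast"
    solver: str = "Newton"
    timestep: float = 0.002
    iterations: int = 30
    ls_iterations: int = 60
    object_mass: float = 0.03  # kg
    object_friction: tuple = (1.0, 0.005, 0.0005)  # sliding, torsional, rolling
    object_condim: int = 6
    contact_solref: tuple = (0.01, 1.0)
    table_friction: tuple = (0.95, 0.02, 0.001)


def _vec(values):
    """Format a sequence of numbers as a space-separated MJCF attribute."""
    return " ".join(f"{v:g}" for v in values)


def option_element(cfg):
    """Build the MJCF <option> element for the solver settings."""
    return ET.Element("option", {
        "integrator": cfg.integrator,
        "solver": cfg.solver,
        "timestep": f"{cfg.timestep:g}",
        "iterations": str(cfg.iterations),
        "ls_iterations": str(cfg.ls_iterations),
    })


def object_geom_element(cfg, half_extent=0.014):
    """Build a box geom; MJCF box sizes are half-extents (28 mm cube -> 0.014 m)."""
    h = f"{half_extent:g}"
    return ET.Element("geom", {
        "type": "box",
        "size": f"{h} {h} {h}",
        "mass": f"{cfg.object_mass:g}",
        "friction": _vec(cfg.object_friction),
        "condim": str(cfg.object_condim),
        "solref": _vec(cfg.contact_solref),
    })


cfg = PhysicsDefaults()
option_xml = ET.tostring(option_element(cfg), encoding="unicode")
geom_xml = ET.tostring(object_geom_element(cfg), encoding="unicode")
print(option_xml)
print(geom_xml)
```

Keeping the numbers in one frozen dataclass makes it easier to diff the training scene against the grasp-tuning scene, whichever mechanism the repo actually uses to apply them.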
Task modes:
- pick_place: move the green object into the receptacle
- pick_only: grasp and lift the object
- grasp_only: establish a grasp
- lift_only: lift the object after grasping
pick_place succeeds when the object is physically inside the receptacle and settled there. The narrower task modes use grasp/lift success criteria instead.
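One way to read "physically inside the receptacle and settled" is a geometric containment test combined with a speed threshold. The sketch below is only an illustration of that idea: the function name, the cylindrical containment model, and every threshold value are assumptions, not the criteria implemented in so101_rl/envs/pick_place.py.

```python
import math


def is_inside_and_settled(obj_pos, obj_vel, rec_pos,
                          inner_radius=0.04, max_height=0.05, max_speed=0.01):
    """Return True if the object is within a cylinder above the receptacle
    base and moving slower than max_speed.

    All thresholds are hypothetical placeholders (meters and meters/second).
    """
    dx = obj_pos[0] - rec_pos[0]
    dy = obj_pos[1] - rec_pos[1]
    dz = obj_pos[2] - rec_pos[2]

    # Inside: horizontally within the inner radius and vertically within
    # the band between the receptacle base and its assumed rim height.
    inside = math.hypot(dx, dy) <= inner_radius and 0.0 <= dz <= max_height

    # Settled: linear speed below a small threshold.
    speed = math.sqrt(sum(v * v for v in obj_vel))
    settled = speed <= max_speed

    return inside and settled
```

A real check would likely also require the condition to hold for several consecutive steps, so that an object bouncing through the receptacle volume is not counted as a success.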
The env supports two observation modes:
- full: image observations plus state/task keys
- teacher: privileged state vector for RL
The current teacher observation is 52D, not 47D.
The repo does not vendor the Menagerie XML directly into the package. It resolves the robot model from one of:
- SO101_MJCF_PATH=/absolute/path/to/so101.xml
- SO101_MENAGERIE_DIR=/absolute/path/to/robotstudio_so101
- SO101_TRS_DIR=/absolute/path/to/Simulation/SO101
- robot_descriptions
The default local search path prefers:
- external/mujoco_menagerie/robotstudio_so101
- external/SO-ARM100/Simulation/SO101
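The lookup described above can be sketched as a small resolver that tries the environment variables first and then the local default directories. This is an illustrative sketch, not the code in so101_rl/model_source.py: the function name `resolve_so101_mjcf` is hypothetical, the precedence order is an assumption, and the robot_descriptions fallback is omitted.

```python
import os
from pathlib import Path


def resolve_so101_mjcf(env=None, repo_root="."):
    """Return the first existing SO-101 MJCF file among the known locations.

    Assumed precedence: explicit file path, Menagerie directory,
    SO-ARM100 directory, then the local defaults under external/.
    """
    env = os.environ if env is None else env
    root = Path(repo_root)

    candidates = []
    if env.get("SO101_MJCF_PATH"):
        candidates.append(Path(env["SO101_MJCF_PATH"]))
    if env.get("SO101_MENAGERIE_DIR"):
        candidates.append(Path(env["SO101_MENAGERIE_DIR"]) / "so101.xml")
    if env.get("SO101_TRS_DIR"):
        candidates.append(Path(env["SO101_TRS_DIR"]) / "so101_new_calib.xml")

    # Local defaults, in the preference order listed above.
    candidates.append(
        root / "external/mujoco_menagerie/robotstudio_so101/so101.xml")
    candidates.append(
        root / "external/SO-ARM100/Simulation/SO101/so101_new_calib.xml")

    for path in candidates:
        if path.is_file():
            return path

    tried = ", ".join(str(p) for p in candidates)
    raise FileNotFoundError(f"SO-101 MJCF not found; tried: {tried}")
```

Listing every attempted path in the error message makes a missing asset fetch easy to diagnose.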
Create a venv and install the repo:
python3 -m venv .venv
source .venv/bin/activate
python3 -m pip install -e .
Install training extras if you want PPO scripts:
python3 -m pip install -e ".[training]"
Fetch the upstream assets:
bash scripts/fetch_so101_assets.sh
Typical local environment setup:
export SO101_MENAGERIE_DIR="$PWD/external/mujoco_menagerie/robotstudio_so101"
export SO101_TRS_DIR="$PWD/external/SO-ARM100/Simulation/SO101"
For headless rendering:
export MUJOCO_GL=egl
For GUI viewing:
unset MUJOCO_GL
View the current training scene:
unset MUJOCO_GL && python3 -m so101_rl.viewer
Short headless smoke test:
MUJOCO_GL=egl python3 -m so101_rl.smoke --steps 20 --policy zero
Teacher-task smoke test:
MUJOCO_GL=egl python3 scripts/pick_teacher_smoke.py --steps 20 --policy zero
Standalone grasp-physics tuning scene:
unset MUJOCO_GL && python3 scripts/tune_grasp_physics.py
Headless grasp-physics tuning run:
MUJOCO_GL=egl python3 scripts/tune_grasp_physics.py --headless --print-every 20
Current scripts in scripts/:
- control_spec.py
- inspect_observations.py
- scripted_pick_place.py
- eval_scripted_policy.py
- collect_teacher_dataset.py
- pick_teacher_smoke.py
- inspect_pick_teacher_starts.py
- train_pick_teacher_ppo.py
- eval_pick_teacher_ppo.py
- tune_grasp_physics.py
Typical PPO training command:
MUJOCO_GL=egl python3 scripts/train_pick_teacher_ppo.py
Typical PPO evaluation command:
MUJOCO_GL=egl python3 scripts/eval_pick_teacher_ppo.py \
--checkpoint outputs/ppo_pick_teacher/best_model/best_model.zip \
--deterministic
Key files:
- so101_rl/config.py
- so101_rl/model_source.py
- so101_rl/scene.py
- so101_rl/envs/pick_place.py
- so101_rl/viewer.py
- so101_rl/smoke.py
- scripts/tune_grasp_physics.py
- Control is joint-position delta control over the six position actuators.
- With timestep=0.002 and frame_skip=10, the policy/control loop runs at 50 Hz.
- The receptacle is a mocap body, so the task goal is kinematic even though the object remains fully dynamic.
- The viewer and smoke tooling still depend on the local OpenGL setup. Use unset MUJOCO_GL for GUI and MUJOCO_GL=egl for headless runs.
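The control-rate arithmetic and the delta-control idea can be sketched in a few lines. The rate follows directly from the timestep and frame skip stated above. The `apply_delta_action` helper, its `max_delta` scale, and the choice to integrate deltas onto the previous targets (rather than the measured joint positions) are assumptions for illustration, not the env's actual implementation.

```python
TIMESTEP = 0.002   # physics step in seconds
FRAME_SKIP = 10    # physics steps per policy action

# Each policy action spans FRAME_SKIP physics steps.
control_dt = TIMESTEP * FRAME_SKIP   # seconds per policy action
control_hz = 1.0 / control_dt        # policy/control rate in Hz


def apply_delta_action(current_targets, action, joint_ranges, max_delta=0.05):
    """Turn a normalized action in [-1, 1] into new joint position targets.

    max_delta (radians per control step) is a hypothetical scale factor.
    Targets are clamped to each joint's (low, high) range.
    """
    new_targets = []
    for q, a, (lo, hi) in zip(current_targets, action, joint_ranges):
        a = max(-1.0, min(1.0, a))        # clip the normalized action
        target = q + max_delta * a        # integrate the delta
        new_targets.append(max(lo, min(hi, target)))
    return new_targets


print(f"control_dt = {control_dt:.3f} s, control rate = {control_hz:.0f} Hz")
```

For the six position actuators, the resulting targets would be written to the actuator controls and then held for all ten physics steps of that control interval.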