# Post-Training Workflow Guide

What to do after your model finishes training on Colab.


## 📥 Step 1: Download Your Trained Model

After training completes on Google Colab:

1. Find the model in Google Drive:

   ```
   MyDrive/physicalaihack/models/act_shape_insertion/
   ```

   This is a folder containing:

   - `model.safetensors` (model weights)
   - `config.json` (model config)
   - Normalization files
   - Training config

2. Download it to your Mac:

   - Download the entire `act_shape_insertion` folder
   - Or use the Drive desktop app

3. Place it in the local models directory:

   ```shell
   # The folder should be at:
   # /Users/bencxr/dev/physicalaihack/models/act_shape_insertion/

   # Verify the files exist:
   ls models/act_shape_insertion/
   # Should show: model.safetensors, config.json, etc.
   ```
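
If you prefer a scripted check, a minimal Python sketch works too. The expected file names below are an assumption based on the list above; adjust them to match what your Colab run actually produced.

```python
from pathlib import Path

# Files the Colab training run is expected to produce (an assumption
# based on this guide; check your actual Drive folder for exact names).
EXPECTED_FILES = ["model.safetensors", "config.json"]

def check_model_dir(model_dir):
    """Return the names of expected files missing from model_dir."""
    model_dir = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (model_dir / name).exists()]

missing = check_model_dir("models/act_shape_insertion")
if missing:
    print(f"Missing: {missing} -- re-download the folder from Google Drive")
else:
    print("Model folder looks complete")
```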

## 🧪 Step 2: Run Evaluation

### Basic Evaluation (20 episodes)

```shell
# Activate environment
source lerobot-env/bin/activate

# Run evaluation
python eval_act_sim.py
```

What happens:

- Runs 20 test episodes in simulation
- Computes success rate, cycle time, and failure modes
- Saves videos to `eval_videos/`
- Saves metrics to `eval_results/`

Expected time: 5-10 minutes
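
Conceptually, the evaluation loop inside `eval_act_sim.py` looks something like the sketch below. This is a toy stand-in, not the real implementation: `StubEnv` and the lambda policy are hypothetical placeholders for the actual simulator and trained ACT policy.

```python
from statistics import mean

def run_evaluation(policy, env, episodes=20, max_steps=500):
    """Roll the policy out for N episodes and collect simple metrics."""
    records = []
    for ep in range(episodes):
        obs = env.reset()
        steps, success = 0, False
        while steps < max_steps and not success:
            obs, success = env.step(policy(obs))
            steps += 1
        records.append({"episode": ep, "success": success, "steps": steps})
    successes = [r for r in records if r["success"]]
    return {
        "success_rate": 100.0 * len(successes) / episodes,
        "mean_steps": mean(r["steps"] for r in successes) if successes else None,
        "episodes": records,
    }

class StubEnv:
    """Toy environment that 'succeeds' after a fixed number of steps."""
    def __init__(self, steps_to_success=50):
        self.steps_to_success = steps_to_success
        self.t = 0
    def reset(self):
        self.t = 0
        return 0.0
    def step(self, action):
        self.t += 1
        return 0.0, self.t >= self.steps_to_success

summary = run_evaluation(lambda obs: 0.0, StubEnv(), episodes=5)
print(summary["success_rate"])  # 100.0 -- the stub always succeeds
```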

### Advanced Options

```shell
# More episodes for better statistics
python eval_act_sim.py --episodes 50

# No visual rendering (faster)
python eval_act_sim.py --no-render

# Don't save videos (saves disk space)
python eval_act_sim.py --no-videos

# Custom model path
python eval_act_sim.py --model models/best_model.pth
```

## 📊 Step 3: Analyze Results

### View Latest Results

```shell
python analyze_eval_results.py --latest
```

Output:

- Success rate vs. target (70%)
- Cycle time vs. target (<10s)
- Gap analysis

### Compare Multiple Runs

```shell
python analyze_eval_results.py --compare
```

Output:

- Trends across evaluations
- Best/worst/average metrics
- Common failure patterns

### View All Results

```shell
python analyze_eval_results.py
```
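
If you want to poke at the raw metrics yourself, the timestamped filenames sort chronologically, so the lexicographically last file is the latest run. A minimal sketch (the `success_rate` field name is an assumption about the JSON schema; open one file to confirm what `eval_act_sim.py` actually writes):

```python
import json
from pathlib import Path

def load_latest_results(results_dir="eval_results"):
    """Load the newest eval_results_*.json, or None if there are none.

    Filenames embed a timestamp, so lexicographic order is chronological.
    """
    results_dir = Path(results_dir)
    if not results_dir.is_dir():
        return None
    files = sorted(results_dir.glob("eval_results_*.json"))
    if not files:
        return None
    with open(files[-1]) as f:
        return json.load(f)

latest = load_latest_results()
if latest is not None:
    # "success_rate" is an assumed field name -- check your JSON.
    print(latest.get("success_rate"))
```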

## 🎯 Step 4: Interpret Results

### Success Rate: 70%+ ✅

You're ready for the hackathon!

- Model performs well
- Proceed to hardware transition
- Document your approach

Next steps:

- Review failure cases to understand edge cases
- Prepare for hardware differences
- Practice your explanation for the judges

### Success Rate: 50-70% ⚠️

Good, but there is room to improve:

- Collect 20-30 more demos (focus on failures)
- Package and upload to Colab
- Retrain (2-4 hours)
- Re-evaluate

Focus areas:

- Watch failed episodes in `eval_videos/`
- Identify specific failure patterns
- Add demos that show the correct behavior

### Success Rate: <50% ❌

Needs significant improvement:

- Review demo quality with `inspect_demos.py`
- Collect 50+ high-quality demos
- Ensure a consistent technique
- Consider simplifying the task initially
- Retrain with the larger dataset

Common issues:

- Inconsistent demos (different approaches)
- Too few demos (<20)
- Poor demo quality (jerky movements)
- Training bugs (check the Colab logs)

## 🔄 Step 5: Iteration Loop

If you need to improve:

### 1. Analyze Failures

```shell
# Watch failed episodes
open eval_videos/episode_005.mp4  # Replace with a failed episode
```

Look for:

- Where does the policy fail? (grab, transport, release)
- Is it consistent or random?
- Does it fail in specific situations?
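
To tally which episodes failed and why without scrubbing through every video, a sketch like this groups episodes by failure mode. The `episodes`/`failure_mode` record layout is an assumption about the results JSON; adjust it to whatever `eval_act_sim.py` actually writes.

```python
from collections import defaultdict

def failures_by_mode(results):
    """Group failed episode numbers by failure mode.

    Assumes results["episodes"] is a list of dicts with "episode",
    "success", and "failure_mode" keys (an assumed schema).
    """
    grouped = defaultdict(list)
    for ep in results["episodes"]:
        if not ep["success"]:
            grouped[ep.get("failure_mode", "other")].append(ep["episode"])
    return dict(grouped)

example = {"episodes": [
    {"episode": 1, "success": True},
    {"episode": 2, "success": False, "failure_mode": "failed_to_grab"},
    {"episode": 3, "success": False, "failure_mode": "dropped_shape"},
    {"episode": 4, "success": False, "failure_mode": "failed_to_grab"},
]}
print(failures_by_mode(example))
# {'failed_to_grab': [2, 4], 'dropped_shape': [3]}
```

A cluster like `failed_to_grab: [2, 4]` tells you exactly which videos to open first.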

### 2. Collect Targeted Demos

```shell
python teleop_sim.py
```

Focus on:

- Scenarios where the policy fails
- A consistent approach technique
- Smooth, deliberate movements
- Successful insertions only

Target: 20-50 total demos (including previous ones)

### 3. Retrain on Colab

```shell
# Package the new demos
cd sim_data
tar -czf shape_insertion_data_v2.tar.gz *
```

Then:

1. Upload to Google Drive
2. Run the training notebook (2-4 hours)
3. Download the new model
4. Repeat the evaluation

### 4. Track Progress

```shell
# Compare all evaluation runs
python analyze_eval_results.py --compare
```

Look for:

- Improvement in success rate
- Reduction in specific failure modes
- More consistent performance
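
If you want a quick trend line without the analysis script, something like this reads every results file in order (again assuming a top-level `success_rate` field in each JSON, which you should verify against a real file):

```python
import json
from pathlib import Path

def success_rate_trend(results_dir="eval_results"):
    """Return (filename, success_rate) pairs, oldest run first.

    Assumes each eval_results_*.json has a top-level "success_rate"
    field (an assumed schema -- check a real file).
    """
    results_dir = Path(results_dir)
    if not results_dir.is_dir():
        return []
    trend = []
    for path in sorted(results_dir.glob("eval_results_*.json")):
        with open(path) as f:
            trend.append((path.name, json.load(f)["success_rate"]))
    return trend

for name, rate in success_rate_trend():
    print(f"{name}: {rate:.1f}%")
```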

πŸ“ File Structure After Evaluation

physicalaihack/
β”œβ”€β”€ models/
β”‚   β”œβ”€β”€ act_shape_insertion_final.pth     # Downloaded from Colab
β”‚   └── best_model.pth                    # (optional) best checkpoint
β”‚
β”œβ”€β”€ eval_videos/                          # Created by eval_act_sim.py
β”‚   β”œβ”€β”€ episode_001.mp4                   # Video of episode 1
β”‚   β”œβ”€β”€ episode_002.mp4
β”‚   └── ...
β”‚
β”œβ”€β”€ eval_results/                         # Created by eval_act_sim.py
β”‚   β”œβ”€β”€ eval_results_20260128_143022.json # Metrics from run 1
β”‚   β”œβ”€β”€ eval_results_20260128_151544.json # Metrics from run 2
β”‚   └── ...
β”‚
β”œβ”€β”€ sim_data/                             # Your collected demos
β”‚   └── shape_insertion_demos.pkl
β”‚
β”œβ”€β”€ eval_act_sim.py                       # Evaluation script ⭐
β”œβ”€β”€ analyze_eval_results.py               # Analysis script
└── POST-TRAINING-GUIDE.md                # This guide

## 🎓 Understanding the Metrics

### Success Rate

- What: Percentage of episodes that successfully insert the shape
- Target: 70%+ for hackathon readiness
- Formula: (successes / total_episodes) × 100

### Cycle Time

- What: Time from episode start to successful insertion
- Target: <10 seconds
- Only counts: Successful episodes

### Failure Modes

- `failed_to_grab`: Policy never picked up the shape
- `dropped_shape`: Picked up the shape but dropped it before the slot
- `timeout`: Ran out of time (500 steps)
- `other`: Miscellaneous failures
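
Putting the definitions above into code: success rate is computed over all episodes, cycle time over successful episodes only. `control_hz` is a hypothetical steps-per-second rate used here to convert steps to seconds; match it to whatever rate your simulator actually runs at.

```python
from statistics import mean

def compute_metrics(episodes, control_hz=50):
    """Success rate over all episodes; mean cycle time over successes only.

    Each record needs "success" (bool) and "steps" (int). control_hz is
    a hypothetical control rate -- match it to your simulator.
    """
    successes = [e for e in episodes if e["success"]]
    cycle_times = [e["steps"] / control_hz for e in successes]
    return {
        "success_rate": round(100.0 * len(successes) / len(episodes), 1),
        "mean_cycle_time_s": mean(cycle_times) if cycle_times else None,
    }

episodes = [{"success": True, "steps": 300},
            {"success": True, "steps": 400},
            {"success": False, "steps": 500}]  # timeout at max_steps
print(compute_metrics(episodes))
# {'success_rate': 66.7, 'mean_cycle_time_s': 7.0}
```

Note that averaging cycle time over successes only means the metric can look *better* as the success rate drops; always read the two numbers together.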

## 💡 Tips for Success

### During Evaluation

- Watch at least a few episodes live to understand behavior
- Save videos for later analysis and presentation
- Run multiple evaluation rounds (3-5) for better statistics
- Check for consistent vs. random failures

### During Iteration

- Quality > quantity for demos
- Be consistent in your approach
- Focus on smooth, deliberate movements
- Only save successful demonstrations
- Collect 20+ demos minimum; 50+ is better

### Before the Hackathon

- Achieve a 70%+ success rate
- Document your approach
- Understand the failure modes
- Prepare for hardware differences
- Practice explaining to the judges

## 🚨 Troubleshooting

### "Model not found" Error

```shell
# Check that the model exists
ls -lh models/

# If missing, download from Google Drive
# Place at: models/act_shape_insertion_final.pth
```

### "Import Error: ACT"

```shell
# Ensure LeRobot is installed
source lerobot-env/bin/activate
pip list | grep lerobot

# If missing, reinstall
cd lerobot
pip install -e .
```

### Policy Performs Poorly

- Check that the model downloaded correctly (not corrupted)
- Verify training completed successfully
- Review the training loss curves in Colab
- Ensure demo quality is good
- Consider collecting more demos

### Videos Not Saving

```shell
# Check the directory exists and is writable
ls -ld eval_videos/

# Check disk space
df -h .
```

## 📞 Quick Reference

### Common Commands

```shell
# Basic evaluation
python eval_act_sim.py

# Extended evaluation
python eval_act_sim.py --episodes 50

# View latest results
python analyze_eval_results.py --latest

# Compare all runs
python analyze_eval_results.py --compare

# Watch a specific episode
open eval_videos/episode_001.mp4

# Collect more demos
python teleop_sim.py

# Package for retraining
cd sim_data && tar -czf shape_insertion_data_v2.tar.gz *
```

### Key Files

- `eval_act_sim.py` - Main evaluation script
- `analyze_eval_results.py` - Results analysis
- `inspect_demos.py` - Demo quality checker
- `teleop_sim.py` - Collect more demos
- `colab-notebook-clean.ipynb` - Retraining

## 🎯 Success Checklist

Before the hackathon (this week):

- [ ] Model trained on Colab (2-4 hours)
- [ ] Model downloaded to your Mac
- [ ] Evaluation run (5-10 minutes)
- [ ] Success rate >70%
- [ ] Cycle time <10s
- [ ] Failure modes understood
- [ ] Videos saved for presentation
- [ ] Approach documented

If success rate is <70%:

- [ ] Analyzed failures
- [ ] Collected 20+ more demos
- [ ] Retrained on Colab
- [ ] Re-evaluated
- [ ] Achieved target metrics

## 🎉 You're Ready!

Once you hit a 70%+ success rate:

1. ✅ You're ready for the hackathon
2. 🎥 Save your best videos for the demo
3. 📝 Document your approach
4. 🤖 Prepare for the hardware transition

At the hackathon:

- Fine-tune on real robot demos
- Transfer your sim-to-real approach
- Show your simulation results
- Present to the judges!

Good luck! 🚀🤖


Last updated: January 2026