alessioarcara/aaas-project

Autonomous and Adaptive Systems Project

alessio.arcara@studio.unibo.it

W&B Report

Learned Agents: Multi-layout Showcase

Asymmetric Advantages
Coordination Ring
Counter Circuit
Cramped Room
Forced Coordination

Installation

1. Clone the repository:

git clone git@github.com:alessioarcara/aaas-project.git
cd aaas-project

2. Set up the environment:

You can set up the environment using uv or standard pip.

Option A: Using uv (recommended)

uv venv
source .venv/bin/activate  # On Windows use `.venv\Scripts\activate`
uv sync

Option B: Using pip

The project declares its metadata and dependencies in pyproject.toml following PEP 621, so an editable install with pip picks them up directly.

python -m venv .venv
source .venv/bin/activate   # On Windows: .venv\Scripts\activate
pip install -e .

Usage

Train an agent pair

Experiments are managed through YAML configuration files (see configs/). To train an agent pair, run the training script with one or more config files. Files are applied left to right, so when the same setting appears in several files, the value from the rightmost file takes effect.

uv run python scripts/train.py --configs configs/base.yaml configs/experiment_1.yaml

Arguments:

  • --configs: One or more paths to YAML config files (space-separated).
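The override rule for stacked configs can be illustrated with a small sketch. This is not the project's loader: `merge_configs`, the recursive (deep) merge behavior, and the dictionary contents standing in for configs/base.yaml and configs/experiment_1.yaml are all assumptions made for illustration.

```python
from copy import deepcopy
from typing import Any


def merge_configs(*configs: dict[str, Any]) -> dict[str, Any]:
    """Merge config dicts left to right; later values override earlier ones."""
    merged: dict[str, Any] = {}
    for config in configs:
        _merge_into(merged, config)
    return merged


def _merge_into(target: dict[str, Any], override: dict[str, Any]) -> None:
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(target.get(key), dict):
            # Both sides are mappings: recurse so sibling keys survive.
            _merge_into(target[key], value)
        else:
            # Scalars, lists, and type mismatches are replaced wholesale.
            target[key] = deepcopy(value)


# Hypothetical contents standing in for the two YAML files.
base = {"env": {"layout": "cramped_room"}, "ppo": {"lr": 3e-4, "clip_eps": 0.2}}
experiment = {"ppo": {"lr": 1e-4}}

config = merge_configs(base, experiment)
print(config)
```

In this sketch the experiment file only needs to list the settings it changes; everything else is inherited from the base file.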

Test an agent pair

The notebook notebooks/testbench.ipynb is an interactive testbench for analyzing trained agent pairs:

  1. Agent Pair Selection: Select any agent pair from available checkpoints or from your own training runs using the checkpoint dropdown.
  2. Quantitative Metrics: Automatically benchmark agents across layouts.
  3. Qualitative Assessment: Visualize agent pair behavior in a selected layout using the layout dropdown.

Tune hyperparameters

Hyperparameter optimization is performed using Optuna.

uv run tune.py --config configs/base.yaml --study-name your_study_name --trials 100

Arguments:

  • --config: Base configuration file.
  • --study-name: Name of the Optuna study.
  • --trials: Number of optimization trials.

To modify which hyperparameters are optimized, edit the objective function inside the tuning script.
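As a rough guide to what editing the objective function involves, the sketch below shows the general shape of an Optuna objective. The search space (`lr`, `clip_eps`, `num_minibatches`), the helper names, and the placeholder score are hypothetical and are not taken from the project's tuning script; only the Optuna calls (`suggest_float`, `suggest_categorical`, `create_study`, `optimize`) are part of the library.

```python
from typing import Any


def sample_hyperparameters(trial: Any) -> dict[str, Any]:
    """Draw one candidate configuration from a hypothetical search space."""
    return {
        # Learning rates span orders of magnitude, so sample on a log scale.
        "lr": trial.suggest_float("lr", 1e-5, 1e-3, log=True),
        "clip_eps": trial.suggest_float("clip_eps", 0.1, 0.3),
        "num_minibatches": trial.suggest_categorical("num_minibatches", [4, 8, 16]),
    }


def objective(trial: Any) -> float:
    """Return the score Optuna should maximize for one trial."""
    params = sample_hyperparameters(trial)
    # Placeholder score only: a real objective would train an agent pair
    # with `params` and return a metric such as mean episode return.
    return -abs(params["lr"] - 3e-4)


def run_study(study_name: str, n_trials: int):
    """Create a study and run the given number of trials."""
    import optuna  # imported here so the functions above work without Optuna

    study = optuna.create_study(study_name=study_name, direction="maximize")
    study.optimize(objective, n_trials=n_trials)
    return study
```

Adding or removing a `trial.suggest_*` call in `sample_hyperparameters` changes which hyperparameters are searched. With Optuna installed, `run_study("your_study_name", 100)` would correspond to the --study-name and --trials arguments.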

About

A JAX/Flax implementation of Independent PPO for cooperative multi-agent reinforcement learning on the Overcooked-AI benchmark
