Adds per-core power tracking alongside the existing flat node-level model, without changing the reward signal. The flat model charges COST_USED_MW for any used node regardless of utilization; the proportional model scales the compute portion linearly from COST_IDLE_MW (0 cores) to COST_USED_MW (96 cores), i.e. `COST_IDLE + (COST_USED - COST_IDLE) * (cores_used / CORES_PER_NODE)`.

New metrics (eval log + TensorBoard):
- PropPower (MWh): agent/base/base_off under proportional model
- metrics/prop_power_mwh, baseline_prop_power_mwh, baseline_off_prop_power_mwh
- metrics/savings_prop_power_vs_baseline_off (MWh gap vs baseline_off)
- metrics/savings_prop_cost_vs_baseline_off (€ gap vs baseline_off, directly comparable to metrics/savings_off)

All proportional values are computed post-hoc from already-tracked per-step lists (episode_on_nodes, episode_used_cores, episode_price_stats), so no per-step changes to power_cost() or power_consumption_mwh() were needed.
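The per-node formula from the description can be sketched as follows. The MW constants here are illustrative placeholders, not the repository's actual settings:

```python
# Sketch of the flat vs. proportional per-node power models described above.
# COST_IDLE_MW / COST_USED_MW values are assumed for illustration only.
CORES_PER_NODE = 96
COST_IDLE_MW = 0.2   # assumed draw of a powered-on but idle node
COST_USED_MW = 0.5   # assumed draw of a fully utilized node

def flat_node_power_mw(used: bool) -> float:
    """Flat model: any used node is charged full power, regardless of cores."""
    return COST_USED_MW if used else 0.0

def proportional_node_power_mw(cores_used: int) -> float:
    """Proportional model: linear ramp from idle (0 cores) to full (96 cores)."""
    assert 0 <= cores_used <= CORES_PER_NODE
    return COST_IDLE_MW + (COST_USED_MW - COST_IDLE_MW) * (cores_used / CORES_PER_NODE)

# A node running a single core is charged far less than under the flat model,
# while a fully packed node matches the flat charge:
print(flat_node_power_mw(True))        # 0.5
print(proportional_node_power_mw(1))   # ~0.203
print(proportional_node_power_mw(96))  # 0.5
```

At full utilization the two models agree, so the proportional metrics stay directly comparable to the existing flat ones.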
📝 Walkthrough

The changes refactor power and energy cost calculations from node-count-based to core-utilization-based metrics across the reward system. Job completion tracking is relocated from launch-time to completion-time in job management. Multiple function signatures are updated to pass metrics objects and new core-usage parameters throughout baseline, environment, and reward calculation modules. Episode summary output is enhanced with power consumption metrics.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~65 minutes
🚥 Pre-merge checks
❌ Failed checks (1 inconclusive)
✅ Passed checks (2 passed)
Actionable comments posted: 3
🧹 Nitpick comments (1)
src/reward_calculation.py (1)
267-301: Consider moving the price signal to core occupancy too.

Lines 294-295 switched cost and efficiency to `num_used_cores`, but Lines 299-301 still scale the price term with `num_used_nodes`. With that mix, a one-core workload and a fully packed node get the same price reward if they touch the same number of nodes, even though their power exposure is now very different. Passing core occupancy into `_reward_price()` would keep the reward components aligned with the new per-core model.

♻️ Possible direction

```diff
-    def _reward_price(self, current_price: float, average_future_price: float, num_used_nodes: int) -> float:
+    def _reward_price(self, current_price: float, average_future_price: float, num_used_cores: int) -> float:
         """
         Active signed price reward with fast saturation and negative-price overdrive.
         - Saturates quickly with better-than-context prices and used nodes.
         - Always applies overdrive when current price is negative.
         """
-        if num_used_nodes <= 0:
+        if num_used_cores <= 0:
             return 0.0
+        used_node_equiv = num_used_cores / CORES_PER_NODE
         context_avg = self._price_context_average(average_future_price)
         price_span = max(self.prices.MAX_PRICE - self.prices.MIN_PRICE, 1e-6)
         relative_advantage = (context_avg - current_price) / price_span
         advantage_component = self.PRICE_ADVANTAGE_GAIN * relative_advantage
         tau = self.PRICE_NODE_TAU_POS if advantage_component >= 0.0 else self.PRICE_NODE_TAU_NEG
-        node_component = 1.0 - np.exp(-num_used_nodes / tau)
+        node_component = 1.0 - np.exp(-used_node_equiv / tau)
         raw_reward = advantage_component * node_component
         if current_price < 0.0:
             negative_strength = (1.0 - np.exp(-abs(current_price) / self.NEGATIVE_PRICE_TAU))
-            negative_node_component = (1.0 - np.exp(-num_used_nodes / self.NEGATIVE_PRICE_NODE_TAU))
+            negative_node_component = (1.0 - np.exp(-used_node_equiv / self.NEGATIVE_PRICE_NODE_TAU))
             overdrive = negative_node_component * negative_strength
         ...
-        price_reward = self._reward_price(
-            current_price, average_future_price, num_used_nodes
-        )
+        price_reward = self._reward_price(
+            current_price, average_future_price, num_used_cores
+        )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/reward_calculation.py` around lines 267 - 301, The price-term is still scaled by node count while cost/efficiency moved to per-core, causing mismatched signals; update calculate to pass num_used_cores (or a derived core occupancy) into _reward_price instead of num_used_nodes, and update _reward_price's signature/implementation to interpret that parameter as core-occupancy (adjust scaling/normalization inside _reward_price accordingly); ensure any docstrings and callers reflect the new parameter name (e.g., core_occupancy or num_used_cores) so price_reward aligns with the per-core cost/efficiency model.
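A toy calculation makes the mismatch concrete. TAU here is an assumed stand-in for the real PRICE_NODE_TAU_* constants, which are not visible in the diff context:

```python
import math

# Toy numbers showing why node-count scaling diverges from the per-core
# cost model. TAU is an assumed placeholder for PRICE_NODE_TAU_POS.
CORES_PER_NODE = 96
TAU = 5.0

def node_component_by_nodes(num_used_nodes: int) -> float:
    # Current behavior: saturation driven by the number of used nodes.
    return 1.0 - math.exp(-num_used_nodes / TAU)

def node_component_by_cores(num_used_cores: int) -> float:
    # Suggested behavior: convert core occupancy to node-equivalents first.
    used_node_equiv = num_used_cores / CORES_PER_NODE
    return 1.0 - math.exp(-used_node_equiv / TAU)

# Under the current scaling, a 1-core job and a fully packed node both
# report one used node, so their price reward component is identical.
# The per-core variant gives the 1-core job much less weight, and a fully
# packed node (96 cores = 1.0 node-equivalents) matches the old value.
print(node_component_by_nodes(1))
print(node_component_by_cores(1))
print(node_component_by_cores(96))
```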
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: a7e10be4-9d08-4e37-8880-42306e668ce1
📒 Files selected for processing (7)
src/baseline.py, src/environment.py, src/evaluation_summary.py, src/job_management.py, src/metrics_tracker.py, src/reward_calculation.py, test/test_sanity_env.py
```python
f"Baseline Occupancy (Nodes)={baseline_occupancy_nodes_pct:.2f}%, "
f"Power={float(episode_data['agent_power_consumption_mwh']):.1f}/"
f"{float(episode_data['baseline_power_consumption_mwh']):.1f}/"
f"{float(episode_data['baseline_power_consumption_off_mwh']):.1f} MWh "
f"(agent/base/base_off)"
```
Drop the second Power= field.
Line 67 duplicates the same metric already emitted at Lines 40-43, so the episode line now contains two Power= sections. That makes the output noisy and brittle for any downstream log parsing.
✂️ Proposed fix

```diff
-    f"Baseline Occupancy (Nodes)={baseline_occupancy_nodes_pct:.2f}%, "
-    f"Power={float(episode_data['agent_power_consumption_mwh']):.1f}/"
-    f"{float(episode_data['baseline_power_consumption_mwh']):.1f}/"
-    f"{float(episode_data['baseline_power_consumption_off_mwh']):.1f} MWh "
-    f"(agent/base/base_off)"
+    f"Baseline Occupancy (Nodes)={baseline_occupancy_nodes_pct:.2f}%"
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/evaluation_summary.py` around lines 66 - 70, The episode summary f-string
is emitting a duplicated "Power=" field; update the f-string that builds the
line using baseline_occupancy_nodes_pct and episode_data (keys
agent_power_consumption_mwh, baseline_power_consumption_mwh,
baseline_power_consumption_off_mwh) to remove the redundant "Power=" segment so
only the single Power metric (already emitted earlier) remains; locate the
formatted string construction in evaluation_summary.py and delete the second
Power=/... group while preserving the remaining text and numeric formatting.
```
Completion is counted here (when duration hits zero), not at launch time.
'wait_time' on each job is the time spent in the queue before being launched,
which is the standard HPC metric for scheduler responsiveness.
```
Completion-time episode counters now mix different job cohorts.
src/environment.py Lines 294-304 only reset episode metrics between truncations; they do not clear running_jobs, the queue, or backlog. After Lines 117-125 moved these increments to completion time, a job submitted in episode N and finishing in episode N+1 contributes to episode N+1's jobs_completed and avg_wait_time, while record_episode_completion() still divides by that episode's jobs_submitted. That makes the per-episode completion stats drift and can push completion_rate past 100%. Track submission cohort/episode on the job, or keep separate episode-local "started/launched" counters for the summary.
Also applies to: 114-125
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/job_management.py` around lines 81 - 83, Episode metrics are being
polluted because completion-time increments (in job completion handling around
running_jobs and updates to jobs_completed and avg_wait_time) count jobs that
were submitted in previous episodes; update the logic so episode boundaries are
respected by either: (a) tagging each Job object with its submission_episode
(e.g., set submission_episode when the job is enqueued) and only increment
per-episode counters (jobs_completed, avg_wait_time used by
record_episode_completion()) when job.submission_episode == current_episode, or
(b) maintain separate episode-local launch/completion counters (e.g.,
episode_jobs_launched, episode_jobs_completed, episode_wait_accumulator) that
are reset by record_episode_completion() and incremented at launch/completion
time; ensure record_episode_completion() uses those episode-local counters (not
global running_jobs/queue/backlog) so completion_rate cannot exceed 100%.
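Option (b) from the prompt above can be sketched as follows. The names (`EpisodeStats`, `wait_accumulator`, etc.) are illustrative, not the actual job_management.py API:

```python
# Sketch of episode-local counters so that a job submitted in episode N
# but completing in episode N+1 cannot inflate episode N+1's stats.
# All class and field names are hypothetical.
from dataclasses import dataclass

@dataclass
class Job:
    duration: int
    submission_episode: int
    wait_time: int = 0

@dataclass
class EpisodeStats:
    current_episode: int = 0
    jobs_submitted: int = 0
    jobs_completed: int = 0
    wait_accumulator: int = 0

    def on_submit(self, job: Job) -> None:
        # Tag the job with its submission cohort at enqueue time.
        job.submission_episode = self.current_episode
        self.jobs_submitted += 1

    def on_complete(self, job: Job) -> None:
        # Count the completion only if the job belongs to this episode's cohort.
        if job.submission_episode == self.current_episode:
            self.jobs_completed += 1
            self.wait_accumulator += job.wait_time

    def record_episode_completion(self) -> dict:
        completed, submitted = self.jobs_completed, self.jobs_submitted
        summary = {
            "completion_rate": completed / submitted if submitted else 0.0,
            "avg_wait_time": self.wait_accumulator / completed if completed else 0.0,
        }
        # Roll over to the next episode and reset episode-local counters.
        self.current_episode += 1
        self.jobs_submitted = self.jobs_completed = self.wait_accumulator = 0
        return summary
```

Because completions are only counted against their own submission cohort, `completed <= submitted` always holds within an episode, so the completion rate is bounded by 100%.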
```python
from stable_baselines3.common.env_checker import check_env
check_env(make_env_with_carry(), warn=True)
print("[OK] gym check passed")
```
🧩 Analysis chain
🏁 Script executed:

```shell
#!/bin/bash
# Verify check_env usage and whether close() is paired in the same control block.
rg -n -C4 'if args\.check_gym|check_env\(|\.close\(' test/test_sanity_env.py
```

Repository: FairRootGroup/powersched
Length of output: 2611
🏁 Script executed:

```shell
#!/bin/bash
# Search for other usages of check_env in the repository
rg -n 'check_env' --type py
```

Repository: FairRootGroup/powersched
Length of output: 510
🌐 Web query:
stable-baselines3 check_env closes environment cleanup
💡 Result:
Yes, the stable-baselines3 check_env function calls env.close on the environment during execution, specifically as part of the render check when skip_render_check=False (default is True, so it skips by default). If env.render_mode is set, it calls env.render followed by env.close. For environments wrapped in vectorized environments like DummyVecEnv (used internally in _check_nan), the VecEnv.close method is not explicitly called by check_env, but VecEnvs have a close method for cleanup that closes underlying envs (e.g., SubprocVecEnv sends "close" command and joins processes). check_env does not perform full environment cleanup beyond potential render-related close; it leaves the environment in a stepped state after calling reset and step for validation. Users should handle further cleanup if needed after calling check_env(env). To avoid render close: pass skip_render_check=True (default). Official source code confirms: in check_env, if not skipping render check: _check_render(env), which does env.render; env.close if render_mode present.
Citations:
- 1: https://stable-baselines3.readthedocs.io/en/master/_modules/stable_baselines3/common/env_checker.html
- 2: https://stable-baselines3.readthedocs.io/en/v1.0/_modules/stable_baselines3/common/env_checker.html
- 3: https://stable-baselines3.readthedocs.io/en/master/common/env_checker.html
🏁 Script executed:

```shell
#!/bin/bash
# Check test_checkenv.py for check_env usage and cleanup
cat -n test/test_checkenv.py
```

Repository: FairRootGroup/powersched
Length of output: 1065
Close the environment instance passed to check_env() explicitly.
The environment created at line 388 is not explicitly closed. The check_env() function from stable-baselines3 does not automatically close the passed environment instance, leaving it in a stepped state. In this long-running sanity script that may be invoked repeatedly, explicit cleanup prevents resource/descriptor leaks, consistent with the pattern used elsewhere in this file (lines 115, 136, 198, 378).
♻️ Proposed fix

```diff
 if args.check_gym:
     from stable_baselines3.common.env_checker import check_env
-    check_env(make_env_with_carry(), warn=True)
+    gym_env = make_env_with_carry()
+    try:
+        check_env(gym_env, warn=True)
+    finally:
+        gym_env.close()
     print("[OK] gym check passed")
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/test_sanity_env.py` around lines 387 - 389, The environment returned by
make_env_with_carry() passed into check_env() isn't explicitly closed; update
the snippet that calls check_env(make_env_with_carry(), warn=True) to capture
the created env (e.g., env = make_env_with_carry()), pass that env to
check_env(env, warn=True), and then call env.close() (or use a try/finally to
ensure env.close() runs) after the check to match the cleanup pattern used
elsewhere.