Conversation
📝 Walkthrough

Added a pre-admission queue-aging pass, enabled MAX_JOB_AGE-based dropping, centralized per-step "lost" job accounting across queue/backlog/assignment, and extended metrics/evaluation with loss-rate and proportional-cost/savings fields.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Env as ComputeClusterEnv
    participant Queue as JobQueue
    participant Backlog as Backlog
    participant Assigner as assign_jobs_to_available_nodes
    participant Nodes as ComputeNodes
    participant Metrics as MetricsTracker
    Env->>Queue: age_job_queue(job_queue_2d, next_empty_slot)
    Queue-->>Env: (next_empty_slot, queue_aged_dropped)
    Env->>Backlog: age_backlog_queue(backlog)
    Backlog-->>Env: backlog_aged_dropped
    Env->>Assigner: admit + assign (includes new arrivals)
    Assigner->>Nodes: assign jobs
    Assigner-->>Env: queue_dropped (per-call)
    Env->>Env: total_dropped = queue_aged_dropped + backlog_aged_dropped + backlog_dropped + sum(queue_dropped)
    Env->>Metrics: record per-step losses and proportional-costs
    Metrics-->>Env: episode fields (jobs_lost_total, loss_rate, agent_prop_cost, ...)
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
🚥 Pre-merge checks | ❌ 3 failed checks (1 warning, 2 inconclusive)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/environment.py`:
- Around line 365-366: The code currently folds queue-full rejections
(backlog_dropped) into the reward-facing counters (num_dropped_this_step and
ultimately jobs_dropped) causing RewardCalculator.calculate to penalize them
like MAX_JOB_AGE expirations; instead, keep backlog_dropped as a separate
metric-only drop: stop adding backlog_dropped into num_dropped_this_step and
into the aggregate jobs_dropped rollup, leaving increments to
metrics.jobs_rejected_queue_full and metrics.episode_jobs_rejected_queue_full
only; if you intended both to be penalized, alternatively rename those metrics
and the reward variable to "total_rejections" and update
RewardCalculator.calculate to reference the renamed total so intent is clear.
In `@src/job_management.py`:
- Around line 13-18: age_backlog_queue currently increments job wait times every
call and assign_jobs_to_available_nodes calls it multiple times per simulated
hour, causing jobs to age multiple hours; modify
age_backlog_queue(backlog_queue: deque, _metrics: MetricsTracker, _is_baseline:
bool = False) to accept an optional flag (e.g., apply_age: bool = True) that
skips incrementing waiting age when False, and update callers (notably
assign_jobs_to_available_nodes and any refill/greedy follow-up invocations) to
pass apply_age=False for follow-up calls within the same simulation step while
ensuring the single once-per-step call still uses apply_age=True so jobs age
only once per step.
In `@src/reward_calculation.py`:
- Around line 367-368: The current override sets drop_penalty_weighted = -1
which bypasses drop_weight and can make rewards increase after the 5-drop
threshold; instead saturate the drop count before weighting and then apply the
configured drop_weight. Replace the hard-coded -1 branch with: compute
saturated_count = min(num_dropped_this_step, 5), compute raw_drop_penalty =
-saturated_count * (the per-drop penalty used elsewhere, e.g.
drop_penalty_per_drop or the previous formula), then set drop_penalty_weighted =
raw_drop_penalty * drop_weight so that drop_weight==0 disables the term and the
>5 case is monotone with the 5-drop case (no extra hard-coded value). Ensure you
use the existing symbols num_dropped_this_step, drop_penalty_weighted,
drop_weight (and drop_penalty_per_drop or the original raw penalty expression)
when making the change.
In `@train.py`:
- Around line 399-401: The percentage prints for proportional cost savings
divide by totals that can be zero or negative; update the two percent
expressions in the print block that references
total_savings_prop_cost_vs_baseline / total_baseline_prop_cost and
total_savings_prop_cost_vs_baseline_off / total_baseline_off_prop_cost to use
the existing safe_ratio() (and format with fmt_optional()) helpers instead of
raw division so the output is guarded for non-positive totals; locate the print
lines around the PROPORTIONAL COST SAVINGS block and replace the raw divisions
with safe_ratio(...) wrapped/printed via fmt_optional(...) consistent with how
cost_per_1000 is handled.
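The `src/job_management.py` finding above is the subtlest of the four. A minimal sketch of the suggested `apply_age` flag is shown below; the flag name comes from the prompt, while the job layout (`[duration, nodes, cores, age]` lists) and the `MAX_JOB_AGE` value are assumptions for illustration.

```python
from collections import deque

# Sketch of the suggested apply_age flag for age_backlog_queue.
# Jobs are modeled as [duration, nodes, cores, age]; the real layout may differ.
MAX_JOB_AGE = 72  # assumed expiry threshold, in simulated hours

def age_backlog_queue(backlog_queue: deque, apply_age: bool = True) -> int:
    """Age waiting jobs once and drop any that exceed MAX_JOB_AGE.

    Follow-up calls within the same simulated step pass apply_age=False
    so jobs never age more than one hour per step.
    """
    if not apply_age:
        return 0
    dropped = 0
    for _ in range(len(backlog_queue)):
        job = backlog_queue.popleft()
        job[3] += 1  # age by one simulated hour
        if job[3] > MAX_JOB_AGE:
            dropped += 1  # expired: do not requeue
        else:
            backlog_queue.append(job)
    return dropped

backlog = deque([[4, 1, 8, 10], [2, 1, 4, MAX_JOB_AGE]])
print(age_backlog_queue(backlog))                   # 1 (the second job expires)
print(age_backlog_queue(backlog, apply_age=False))  # 0 (follow-up call is a no-op)
```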
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: bdc0b799-fcc6-43fd-bd64-feb1928a7fc5
📒 Files selected for processing (8)
- src/baseline.py
- src/config.py
- src/environment.py
- src/evaluation_summary.py
- src/job_management.py
- src/metrics_tracker.py
- src/reward_calculation.py
- train.py
```python
self.metrics.jobs_rejected_queue_full += backlog_dropped
self.metrics.episode_jobs_rejected_queue_full += backlog_dropped
```
Queue-full rejections are now feeding the reward-side drop counter.
Lines 365-366 already track backlog_dropped separately as jobs_rejected_queue_full, but Line 385 also folds it into num_dropped_this_step, and Lines 504-505 then roll it into jobs_dropped. Because RewardCalculator.calculate() consumes num_dropped_this_step, queue-capacity rejections now get the same penalty as MAX_JOB_AGE expirations. If that split is intentional, rename the metric/penalty to “total rejections”; otherwise keep backlog_dropped out of the reward-facing counter.
Possible split between penalized drops and queue-full rejections

```diff
- num_dropped_this_step = backlog_aged_dropped + backlog_dropped
+ num_dropped_this_step = backlog_aged_dropped
+ num_rejected_queue_full_this_step = backlog_dropped
@@
- self.metrics.jobs_dropped += num_dropped_this_step
- self.metrics.episode_jobs_dropped += num_dropped_this_step
+ self.metrics.jobs_dropped += num_dropped_this_step + num_rejected_queue_full_this_step
+ self.metrics.episode_jobs_dropped += num_dropped_this_step + num_rejected_queue_full_this_step
```

Also applies to: 385-385, 504-505
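The split the diff proposes behaves as in this toy sketch; the metric names come from the review, while the surrounding class is invented for illustration.

```python
# Illustrative only: a toy metrics holder showing how penalized drops and
# queue-full rejections stay separate for the reward while both still feed
# the aggregate jobs_dropped metric.

class Metrics:
    def __init__(self) -> None:
        self.jobs_dropped = 0
        self.jobs_rejected_queue_full = 0

def account_step(metrics: Metrics, backlog_aged_dropped: int, backlog_dropped: int) -> int:
    num_dropped_this_step = backlog_aged_dropped          # reward-facing
    num_rejected_queue_full_this_step = backlog_dropped   # metric-only
    metrics.jobs_rejected_queue_full += num_rejected_queue_full_this_step
    metrics.jobs_dropped += num_dropped_this_step + num_rejected_queue_full_this_step
    return num_dropped_this_step  # only this would reach the reward calculation

m = Metrics()
reward_drops = account_step(m, backlog_aged_dropped=2, backlog_dropped=3)
print(reward_drops, m.jobs_dropped, m.jobs_rejected_queue_full)  # 2 5 3
```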
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/environment.py` around lines 365 - 366, The code currently folds
queue-full rejections (backlog_dropped) into the reward-facing counters
(num_dropped_this_step and ultimately jobs_dropped) causing
RewardCalculator.calculate to penalize them like MAX_JOB_AGE expirations;
instead, keep backlog_dropped as a separate metric-only drop: stop adding
backlog_dropped into num_dropped_this_step and into the aggregate jobs_dropped
rollup, leaving increments to metrics.jobs_rejected_queue_full and
metrics.episode_jobs_rejected_queue_full only; if you intended both to be
penalized, alternatively rename those metrics and the reward variable to
"total_rejections" and update RewardCalculator.calculate to reference the
renamed total so intent is clear.
♻️ Duplicate comments (2)
src/reward_calculation.py (1)
364-367: ⚠️ Potential issue | 🟠 Major

Apply `drop_weight` to the new lost-job penalty.

Line 367 writes the raw penalty straight into `drop_penalty_weighted`, so `--drop-weight 0` no longer disables this term and the default reward changes as soon as any job is lost.

Suggested fix

```diff
- drop_penalty_weighted = 0
- if num_dropped_this_step > 0:
-     drop_penalty_weighted = -1.0 - 0.25 * min(num_dropped_this_step - 1, 1000)  # harsher penalty for losing many jobs, capped at -251.0 for 1000+ jobs
+ raw_drop_penalty = 0.0
+ if num_dropped_this_step > 0:
+     raw_drop_penalty = -1.0 - 0.25 * min(num_dropped_this_step - 1, 1000)  # harsher penalty for losing many jobs, capped at -251.0 for 1000+ jobs
+ drop_penalty_weighted = weights.drop_weight * raw_drop_penalty
```

Based on learnings: Configure reward components using weighted parameters: efficiency-weight (default 0.7), price-weight (default 0.2), idle-weight (default 0.1), job-age-weight (default 0.0), and drop-weight (default 0.0)
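Under the suggested fix, the penalty term behaves like the small standalone function below. The curve constants are copied from the diff; the function wrapper itself is illustrative, not the project's actual code.

```python
def drop_penalty_weighted(num_dropped_this_step: int, drop_weight: float) -> float:
    """Escalating lost-job penalty, scaled by the configured drop_weight."""
    raw_drop_penalty = 0.0
    if num_dropped_this_step > 0:
        # -1.0 for the first loss, -0.25 per further loss, capped at -251.0
        raw_drop_penalty = -1.0 - 0.25 * min(num_dropped_this_step - 1, 1000)
    return drop_weight * raw_drop_penalty

print(drop_penalty_weighted(1, drop_weight=0.0) == 0.0)  # True: weight 0 disables the term
print(drop_penalty_weighted(1, drop_weight=1.0))         # -1.0
print(drop_penalty_weighted(2000, drop_weight=1.0))      # -251.0 (cap reached)
```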
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/reward_calculation.py` around lines 364 - 367, The lost-job penalty is being assigned directly to drop_penalty_weighted so it ignores the configured drop_weight; update the calculation in the block that computes drop_penalty_weighted (which uses num_dropped_this_step) to multiply the computed penalty by the drop_weight parameter (i.e., scale the -1.0 - 0.25 * ... expression by drop_weight) so that when drop_weight is 0 the term contributes nothing and when nonzero it applies the intended scaled penalty.

train.py (1)
390-401: ⚠️ Potential issue | 🟡 Minor

Remove the empty f-strings.

Ruff already flags Lines 390, 397, and 401 with F541, so these headings will keep failing lint until the unused `f` prefixes are dropped.

Suggested fix

```diff
- print(f"\n=== AGENT LOST JOBS PER SAVED EURO ===")
+ print("\n=== AGENT LOST JOBS PER SAVED EURO ===")
@@
- print(f"\n=== POWER & PRICE METRICS (TOTAL OVER EVALUATION) ===")
+ print("\n=== POWER & PRICE METRICS (TOTAL OVER EVALUATION) ===")
@@
- print(f"\n=== PROPORTIONAL COST SAVINGS (TOTAL OVER EVALUATION) ===")
+ print("\n=== PROPORTIONAL COST SAVINGS (TOTAL OVER EVALUATION) ===")
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@train.py` around lines 390 - 401, The lint errors come from unused f-string prefixes on heading prints; remove the leading f from the plain heading print calls that contain no interpolations (the lines printing "=== AGENT LOST JOBS PER SAVED EURO ===", "=== POWER & PRICE METRICS (TOTAL OVER EVALUATION) ===", and "=== PROPORTIONAL COST SAVINGS (TOTAL OVER EVALUATION) ===") while leaving the other formatted print(...) calls (e.g., those using variables like total_jobs_dropped, total_agent_prop_power_mwh, etc.) unchanged so only strings without {} are regular string literals.
🧹 Nitpick comments (2)
src/job_management.py (1)
242-246: The `num_dropped` return slot looks stale now.

This helper initializes `num_dropped` at Line 246 and never mutates it, so the extra tuple element and the new caller contract just add bookkeeping noise. Consider returning `(num_processed_jobs, next_empty_slot, next_job_id)` and dropping the downstream `queue_dropped`/`extra_dropped` plumbing.

Possible simplification

```diff
-) -> tuple[int, int, int, int]:
+) -> tuple[int, int, int]:
@@
-    num_dropped = 0
@@
-    return num_processed_jobs, next_empty_slot, num_dropped, next_job_id
+    return num_processed_jobs, next_empty_slot, next_job_id
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/job_management.py` around lines 242 - 246, The helper currently declares and returns a stale num_dropped variable that is never mutated; remove the num_dropped local and stop returning it from the helper (return only (num_processed_jobs, next_empty_slot, next_job_id) from the helper function), then update all callers of that helper (those unpacking into queue_dropped / extra_dropped or expecting a 4-tuple) to accept the 3-tuple and remove any downstream bookkeeping/logic that records queue_dropped/extra_dropped; ensure the helper's signature and any documentation/comments (and tests) are updated to reflect the new 3-value contract.

src/metrics_tracker.py (1)
205-208: Use a strict zip for the new proportional-cost accumulator.

These episode lists are supposed to have one sample per step. Without `strict=True`, a missing append silently truncates `baseline_prop_cost` and hides the bookkeeping bug.

Suggested fix

```diff
  baseline_prop_cost: float = sum(
      (COST_IDLE_MW * MAX_NODES + _compute_delta_mw * (cores / CORES_PER_NODE)) * price
-     for cores, price in zip(self.episode_baseline_used_cores, self.episode_price_stats)
+     for cores, price in zip(self.episode_baseline_used_cores, self.episode_price_stats, strict=True)
  )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/metrics_tracker.py` around lines 205 - 208, The proportional-cost accumulator using sum over zip(self.episode_baseline_used_cores, self.episode_price_stats) can silently truncate if the two episode lists differ in length; update the comprehension in the baseline_prop_cost calculation to use zip(..., strict=True) so a length mismatch raises immediately (refer to baseline_prop_cost, self.episode_baseline_used_cores, self.episode_price_stats, and the expression using COST_IDLE_MW, MAX_NODES, _compute_delta_mw, CORES_PER_NODE) and fail-fast on bookkeeping bugs.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@src/reward_calculation.py`:
- Around line 364-367: The lost-job penalty is being assigned directly to
drop_penalty_weighted so it ignores the configured drop_weight; update the
calculation in the block that computes drop_penalty_weighted (which uses
num_dropped_this_step) to multiply the computed penalty by the drop_weight
parameter (i.e., scale the -1.0 - 0.25 * ... expression by drop_weight) so that
when drop_weight is 0 the term contributes nothing and when nonzero it applies
the intended scaled penalty.
In `@train.py`:
- Around line 390-401: The lint errors come from unused f-string prefixes on
heading prints; remove the leading f from the plain heading print calls that
contain no interpolations (the lines printing "=== AGENT LOST JOBS PER SAVED
EURO ===", "=== POWER & PRICE METRICS (TOTAL OVER EVALUATION) ===", and "===
PROPORTIONAL COST SAVINGS (TOTAL OVER EVALUATION) ===") while leaving the other
formatted print(...) calls (e.g., those using variables like total_jobs_dropped,
total_agent_prop_power_mwh, etc.) unchanged so only strings without {} are
regular string literals.
---
Nitpick comments:
In `@src/job_management.py`:
- Around line 242-246: The helper currently declares and returns a stale
num_dropped variable that is never mutated; remove the num_dropped local and
stop returning it from the helper (return only (num_processed_jobs,
next_empty_slot, next_job_id) from the helper function), then update all callers
of that helper (those unpacking into queue_dropped / extra_dropped or expecting
a 4-tuple) to accept the 3-tuple and remove any downstream bookkeeping/logic
that records queue_dropped/extra_dropped; ensure the helper's signature and any
documentation/comments (and tests) are updated to reflect the new 3-value
contract.
In `@src/metrics_tracker.py`:
- Around line 205-208: The proportional-cost accumulator using sum over
zip(self.episode_baseline_used_cores, self.episode_price_stats) can silently
truncate if the two episode lists differ in length; update the comprehension in
the baseline_prop_cost calculation to use zip(..., strict=True) so a length
mismatch raises immediately (refer to baseline_prop_cost,
self.episode_baseline_used_cores, self.episode_price_stats, and the expression
using COST_IDLE_MW, MAX_NODES, _compute_delta_mw, CORES_PER_NODE) and fail-fast
on bookkeeping bugs.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: d3a54a8c-8417-4fc5-99fa-382032527540
📒 Files selected for processing (11)
- CLAUDE.md
- src/baseline.py
- src/callbacks.py
- src/config.py
- src/environment.py
- src/evaluation_summary.py
- src/job_management.py
- src/metrics_tracker.py
- src/plotter.py
- src/reward_calculation.py
- train.py
✅ Files skipped from review due to trivial changes (1)
- CLAUDE.md
🚧 Files skipped from review as they are similar to previous changes (3)
- src/evaluation_summary.py
- src/config.py
- src/baseline.py
```python
drop_penalty = min(0, PENALTY_DROPPED_JOB * num_dropped_this_step)
drop_penalty_weighted = weights.drop_weight * drop_penalty

# 6. penalty for lost jobs (aged out or rejected because queue/backlog was full)
```
How about we keep the original and simply bump (reduce) PENALTY_DROPPED_JOB to a (much) lower value? Then the reward scale stays compatible with the rest of the components, and drop_weight=0.0 correctly disables it.
♻️ Duplicate comments (2)
train.py (1)
390-390: ⚠️ Potential issue | 🟡 Minor

Remove the no-op `f` prefixes from these section headers.

Lines 390, 397, and 401 are plain strings, so Ruff still raises F541 here and the lint job stays red.

✂️ Minimal fix

```diff
- print(f"\n=== AGENT LOST JOBS PER SAVED EURO ===")
+ print("\n=== AGENT LOST JOBS PER SAVED EURO ===")
@@
- print(f"\n=== POWER & PRICE METRICS (TOTAL OVER EVALUATION) ===")
+ print("\n=== POWER & PRICE METRICS (TOTAL OVER EVALUATION) ===")
@@
- print(f"\n=== PROPORTIONAL COST SAVINGS (TOTAL OVER EVALUATION) ===")
+ print("\n=== PROPORTIONAL COST SAVINGS (TOTAL OVER EVALUATION) ===")
```

Also applies to: 397-397, 401-401
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@train.py` at line 390, Three print calls use unnecessary f-strings which trigger Ruff F541; for the header prints (e.g., the lines printing "=== AGENT LOST JOBS PER SAVED EURO ===" and the two other plain header strings in the same block) remove the leading f so they are simple string literals (change print(f"...") to print("...")); locate the statements in train.py by matching those exact header string contents and update all three occurrences.

src/reward_calculation.py (1)
367-370: ⚠️ Potential issue | 🟠 Major

Apply the lost-job curve through `weights.drop_weight`.

As written, this branch bypasses the configured weight entirely, so `--drop-weight=0` no longer disables the term and a single lost job injects a fixed penalty on a very different scale from the other normalized components. If the steeper curve is intentional, keep it as a raw penalty and multiply it by `weights.drop_weight` before adding it to `reward`.

⚖️ Keep the new shape, but respect the configured weight

```diff
- drop_penalty_weighted = 0
+ drop_penalty_weighted = 0.0
  if self.ALLOW_DROP_PENALTY and num_dropped_this_step > 0:
-     drop_penalty_weighted = -1.0 - 0.25 * min(num_dropped_this_step - 1, 1000)  # harsher penalty for losing many jobs, capped at -251.0 for 1000+ jobs
+     raw_drop_penalty = -1.0 - 0.25 * min(num_dropped_this_step - 1, 1000)
+     drop_penalty_weighted = weights.drop_weight * raw_drop_penalty
```

Based on learnings: Configure reward components using weighted parameters: efficiency-weight (default 0.7), price-weight (default 0.2), idle-weight (default 0.1), job-age-weight (default 0.0), and drop-weight (default 0.0)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/reward_calculation.py` around lines 367 - 370, The drop-penalty branch currently sets drop_penalty_weighted to a raw negative value and ignores the configured weight; modify the logic in the ALLOW_DROP_PENALTY branch so the calculated penalty (based on num_dropped_this_step and the steeper curve) is multiplied by weights.drop_weight before being used in the final reward, e.g., compute the negative penalty as now into drop_penalty_weighted and then scale it by self.weights.drop_weight (or equivalent) so --drop-weight=0 disables the term and configured weights control its magnitude when added to reward.
🧹 Nitpick comments (1)
src/metrics_tracker.py (1)
205-213: Make the new proportional-cost reducers fail fast on timeline mismatches.

These loops assume one baseline sample per price sample. Plain `zip()` silently truncates on the first short input, which would underreport the new proportional totals instead of surfacing the mismatch. `strict=True` (or an explicit length check) is safer here.

🔍 Safer aggregation

```diff
  baseline_prop_cost: float = sum(
      (COST_IDLE_MW * MAX_NODES + _compute_delta_mw * (cores / CORES_PER_NODE)) * price
-     for cores, price in zip(self.episode_baseline_used_cores, self.episode_price_stats)
+     for cores, price in zip(
+         self.episode_baseline_used_cores,
+         self.episode_price_stats,
+         strict=True,
+     )
  )
  baseline_off_prop_cost: float = sum(
      (COST_IDLE_MW * used + _compute_delta_mw * (cores / CORES_PER_NODE)) * price
-     for used, cores, price in zip(self.episode_baseline_used_nodes, self.episode_baseline_used_cores, self.episode_price_stats)
+     for used, cores, price in zip(
+         self.episode_baseline_used_nodes,
+         self.episode_baseline_used_cores,
+         self.episode_price_stats,
+         strict=True,
+     )
  )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/metrics_tracker.py` around lines 205 - 213, The proportional-cost aggregations (baseline_prop_cost and baseline_off_prop_cost) use zip(...) which silently truncates mismatched timelines; update the loops to fail fast by using zip(..., strict=True) or perform explicit length checks between self.episode_price_stats and self.episode_baseline_used_cores / self.episode_baseline_used_nodes before computing baseline_prop_cost, baseline_off_prop_cost (and then compute savings_prop_cost_vs_baseline from agent_prop_cost) so any timeline length mismatch raises an error rather than underreporting totals.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@src/reward_calculation.py`:
- Around line 367-370: The drop-penalty branch currently sets
drop_penalty_weighted to a raw negative value and ignores the configured weight;
modify the logic in the ALLOW_DROP_PENALTY branch so the calculated penalty
(based on num_dropped_this_step and the steeper curve) is multiplied by
weights.drop_weight before being used in the final reward, e.g., compute the
negative penalty as now into drop_penalty_weighted and then scale it by
self.weights.drop_weight (or equivalent) so --drop-weight=0 disables the term
and configured weights control its magnitude when added to reward.
In `@train.py`:
- Line 390: Three print calls use unnecessary f-strings which trigger Ruff F541;
for the header prints (e.g., the lines printing "=== AGENT LOST JOBS PER SAVED
EURO ===" and the two other plain header strings in the same block) remove the
leading f so they are simple string literals (change print(f"...") to
print("...")); locate the statements in train.py by matching those exact header
string contents and update all three occurrences.
---
Nitpick comments:
In `@src/metrics_tracker.py`:
- Around line 205-213: The proportional-cost aggregations (baseline_prop_cost
and baseline_off_prop_cost) use zip(...) which silently truncates mismatched
timelines; update the loops to fail fast by using zip(..., strict=True) or
perform explicit length checks between self.episode_price_stats and
self.episode_baseline_used_cores / self.episode_baseline_used_nodes before
computing baseline_prop_cost, baseline_off_prop_cost (and then compute
savings_prop_cost_vs_baseline from agent_prop_cost) so any timeline length
mismatch raises an error rather than underreporting totals.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 1f367dd8-c24f-4870-9e45-a05be8e2d582
📒 Files selected for processing (11)
- CLAUDE.md
- src/baseline.py
- src/callbacks.py
- src/config.py
- src/environment.py
- src/evaluation_summary.py
- src/job_management.py
- src/metrics_tracker.py
- src/plotter.py
- src/reward_calculation.py
- train.py
✅ Files skipped from review due to trivial changes (1)
- CLAUDE.md
🚧 Files skipped from review as they are similar to previous changes (3)
- src/config.py
- src/evaluation_summary.py
- src/job_management.py
Fix lost-job handling, queue aging, and evaluation reporting

- age queued/backlogged jobs exactly once per step before refill/new arrivals
- drop MAX_JOB_AGE expirations during that start-of-step aging pass
- stop repeated same-step assignment attempts from aging waiting jobs multiple times
- treat all lost jobs consistently in the reward signal, including queue-full rejections
- replace the old >5-drop override with an explicit escalating hard loss penalty
- clarify logs/metrics/docs that jobs_dropped is total lost jobs, while queue-full rejects remain tracked separately
- expose lost-job aliases in summaries/logging for clearer reporting
- guard proportional cost savings percentages with safe_ratio()/fmt_optional()
- update bookkeeping tests for the new aging and loss-accounting behavior

Fix: Remove experimental Drop-Penalty
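The safe_ratio()/fmt_optional() guard mentioned in the summary could be implemented along these lines; both helper bodies are assumptions, since only their names appear in the review.

```python
from typing import Optional

def safe_ratio(numerator: float, denominator: float) -> Optional[float]:
    """Return numerator/denominator, or None when the total is non-positive."""
    if denominator <= 0:
        return None
    return numerator / denominator

def fmt_optional(value: Optional[float], digits: int) -> str:
    """Format a possibly-missing ratio; 'n/a' keeps the report printable."""
    return "n/a" if value is None else f"{value:.{digits}f}"

savings, baseline_total = 1250.0, 0.0  # degenerate total: raw division would crash
pct = safe_ratio(100.0 * savings, baseline_total)
print(f"Vs Baseline: {fmt_optional(pct, 1)}%")  # Vs Baseline: n/a%
```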
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/environment.py (1)
364-372: ⚠️ Potential issue | 🟠 Major

Invalidate the pending-job cache when arrivals spill only into backlog.

This block only marks a mutation when jobs land in the fixed queue or when overflow is rejected. If `add_new_jobs()` pushes arrivals straight into `self.backlog_queue`, `_update_pending_job_stats()` keeps the previous `backlog_size` and `pending_*` values for the whole step.

Based on learnings: Environment state space must include node counts, job queue, electricity prices, pending job statistics (count, core-hours, avg duration, max nodes), and backlog size.

🩹 Proposed fix

```diff
+ backlog_size_before = len(self.backlog_queue)
  new_jobs, self.next_empty_slot, backlog_dropped = add_new_jobs(
      job_queue_2d, new_jobs_count, new_jobs_durations, new_jobs_nodes, new_jobs_cores, self.next_empty_slot, self.backlog_queue
  )
- if len(new_jobs) > 0:
+ if len(new_jobs) > 0 or len(self.backlog_queue) != backlog_size_before:
      queue_backlog_mutated = True
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/environment.py` around lines 364 - 372, Record the backlog size before calling add_new_jobs (e.g., prev_backlog = len(self.backlog_queue)), then after the call compare len(self.backlog_queue) to prev_backlog and set queue_backlog_mutated = True if they differ; keep the existing checks for len(new_jobs) > 0 and backlog_dropped > 0 and update metrics as before, and ensure this change makes _update_pending_job_stats() see the updated backlog_size and pending_* values when arrivals are pushed into self.backlog_queue but not into the fixed queue.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@train.py`:
- Around line 390-403: Change the three header print calls that use f-strings
with no substitutions (e.g., print(f"\n=== AGENT LOST JOBS PER SAVED EURO ==="),
print(f"\n=== POWER & PRICE METRICS (TOTAL OVER EVALUATION) ==="), etc.) to use
literal strings (remove the leading f) to avoid Ruff F541; also update the power
header text to indicate these are proportional metrics (e.g., include
"PROPORTIONAL" or "PROPORTIONAL POWER & PRICE METRICS") because the block prints
*_prop_power_mwh and *_prop_mean_price values (see variables
total_agent_prop_power_mwh, total_baseline_prop_power_mwh,
total_baseline_off_prop_power_mwh and total_agent_prop_mean_price,
total_baseline_prop_mean_price, total_baseline_off_prop_mean_price). Ensure the
label change is applied only to the header string and not to the variable
formatting or fmt_optional calls.
---
Outside diff comments:
In `@src/environment.py`:
- Around line 364-372: Record the backlog size before calling add_new_jobs
(e.g., prev_backlog = len(self.backlog_queue)), then after the call compare
len(self.backlog_queue) to prev_backlog and set queue_backlog_mutated = True if
they differ; keep the existing checks for len(new_jobs) > 0 and backlog_dropped
> 0 and update metrics as before, and ensure this change makes
_update_pending_job_stats() see the updated backlog_size and pending_* values
when arrivals are pushed into self.backlog_queue but not into the fixed queue.
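The before/after length check from that fix amounts to the pattern below; `add_new_jobs_stub` is a toy stand-in (not the real `add_new_jobs`) that only pushes into the backlog, which is exactly the case the old check missed.

```python
from collections import deque

def add_new_jobs_stub(backlog_queue: deque, arrivals: list) -> list:
    """Toy stand-in: the fixed queue is full, so everything spills to backlog."""
    backlog_queue.extend(arrivals)
    return []  # nothing landed in the fixed queue

backlog_queue: deque = deque()
backlog_size_before = len(backlog_queue)

new_jobs = add_new_jobs_stub(backlog_queue, arrivals=[("job-1", 4), ("job-2", 2)])

# The old check (len(new_jobs) > 0) misses backlog-only arrivals;
# comparing backlog length before and after catches them.
queue_backlog_mutated = len(new_jobs) > 0 or len(backlog_queue) != backlog_size_before
print(queue_backlog_mutated)  # True
```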
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 5b62862f-cfee-410d-8594-856ef11d1227
📒 Files selected for processing (11)
- CLAUDE.md
- src/baseline.py
- src/callbacks.py
- src/config.py
- src/environment.py
- src/evaluation_summary.py
- src/job_management.py
- src/metrics_tracker.py
- src/plotter.py
- src/reward_calculation.py
- train.py
✅ Files skipped from review due to trivial changes (2)
- CLAUDE.md
- src/reward_calculation.py
🚧 Files skipped from review as they are similar to previous changes (3)
- src/callbacks.py
- src/job_management.py
- src/evaluation_summary.py
```python
print(f"\n=== AGENT LOST JOBS PER SAVED EURO ===")
print(f"  Total Lost Jobs (Agent): {total_jobs_dropped:,}")
print(f"  Total Lost Jobs (Baseline): {total_baseline_jobs_dropped:,}")
print(f"  Vs Baseline: {fmt_optional(total_dropped_jobs_per_saved_euro, 6)} jobs/€")
print(f"  Vs Baseline_off: {fmt_optional(total_dropped_jobs_per_saved_euro_off, 6)} jobs/€")

print(f"\n=== POWER & PRICE METRICS (TOTAL OVER EVALUATION) ===")
print(f"  Agent: Power={total_agent_power_mwh:,.1f} MWh, Mean Price={total_agent_mean_price:.2f} €/MWh")
print(f"  Baseline: Power={total_baseline_power_mwh:,.1f} MWh, Mean Price={total_baseline_mean_price:.2f} €/MWh")
print(f"  Baseline_off: Power={total_baseline_off_power_mwh:,.1f} MWh, Mean Price={total_baseline_off_mean_price:.2f} €/MWh")
print(f"  Agent: Power={total_agent_prop_power_mwh:,.1f} MWh, Mean Price={total_agent_prop_mean_price:.2f} €/MWh")
print(f"  Baseline: Power={total_baseline_prop_power_mwh:,.1f} MWh, Mean Price={total_baseline_prop_mean_price:.2f} €/MWh")
print(f"  Baseline_off: Power={total_baseline_off_prop_power_mwh:,.1f} MWh, Mean Price={total_baseline_off_prop_mean_price:.2f} €/MWh")
print(f"\n=== PROPORTIONAL COST SAVINGS (TOTAL OVER EVALUATION) ===")
print(f"  Vs Baseline: €{total_savings_prop_cost_vs_baseline:,.0f}, {fmt_optional(prop_savings_pct_vs_baseline, 1)}%")
print(f"  Vs Baseline_off: €{total_savings_prop_cost_vs_baseline_off:,.0f}, {fmt_optional(prop_savings_pct_vs_baseline_off, 1)}%")
```
🧩 Analysis chain
🏁 Script executed:

```shell
#!/bin/bash
set -euo pipefail
python - <<'PY'
import ast
from pathlib import Path
path = Path("train.py")
tree = ast.parse(path.read_text())
for node in ast.walk(tree):
    if isinstance(node, ast.JoinedStr) and not any(isinstance(v, ast.FormattedValue) for v in node.values):
        print(f"{path}:{node.lineno}: constant f-string")
PY
```

Repository: FairRootGroup/powersched
Length of output: 1603
Use literal strings for headers and label the power block as proportional.
Three header print statements use f-strings without any substitutions, triggering Ruff F541. Additionally, the header at line 397 should indicate these are proportional metrics, since the variables printed (*_prop_power_mwh, *_prop_mean_price) are proportional values.
🧹 Proposed fix

```diff
- print(f"\n=== AGENT LOST JOBS PER SAVED EURO ===")
+ print("\n=== AGENT LOST JOBS PER SAVED EURO ===")
  print(f"  Total Lost Jobs (Agent): {total_jobs_dropped:,}")
  print(f"  Total Lost Jobs (Baseline): {total_baseline_jobs_dropped:,}")
  print(f"  Vs Baseline: {fmt_optional(total_dropped_jobs_per_saved_euro, 6)} jobs/€")
  print(f"  Vs Baseline_off: {fmt_optional(total_dropped_jobs_per_saved_euro_off, 6)} jobs/€")
- print(f"\n=== POWER & PRICE METRICS (TOTAL OVER EVALUATION) ===")
+ print("\n=== PROPORTIONAL POWER & PRICE METRICS (TOTAL OVER EVALUATION) ===")
  print(f"  Agent: Power={total_agent_prop_power_mwh:,.1f} MWh, Mean Price={total_agent_prop_mean_price:.2f} €/MWh")
  print(f"  Baseline: Power={total_baseline_prop_power_mwh:,.1f} MWh, Mean Price={total_baseline_prop_mean_price:.2f} €/MWh")
  print(f"  Baseline_off: Power={total_baseline_off_prop_power_mwh:,.1f} MWh, Mean Price={total_baseline_off_prop_mean_price:.2f} €/MWh")
- print(f"\n=== PROPORTIONAL COST SAVINGS (TOTAL OVER EVALUATION) ===")
+ print("\n=== PROPORTIONAL COST SAVINGS (TOTAL OVER EVALUATION) ===")
  print(f"  Vs Baseline: €{total_savings_prop_cost_vs_baseline:,.0f}, {fmt_optional(prop_savings_pct_vs_baseline, 1)}%")
  print(f"  Vs Baseline_off: €{total_savings_prop_cost_vs_baseline_off:,.0f}, {fmt_optional(prop_savings_pct_vs_baseline_off, 1)}%")
```

🧰 Tools
🪛 Ruff (0.15.7)
[error] 390-390: f-string without any placeholders
Remove extraneous f prefix
(F541)
[error] 397-397: f-string without any placeholders
Remove extraneous f prefix
(F541)
[error] 401-401: f-string without any placeholders
Remove extraneous f prefix
(F541)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@train.py` around lines 390 - 403, Change the three header print calls that
use f-strings with no substitutions (e.g., print(f"\n=== AGENT LOST JOBS PER
SAVED EURO ==="), print(f"\n=== POWER & PRICE METRICS (TOTAL OVER EVALUATION)
==="), etc.) to use literal strings (remove the leading f) to avoid Ruff F541;
also update the power header text to indicate these are proportional metrics
(e.g., include "PROPORTIONAL" or "PROPORTIONAL POWER & PRICE METRICS") because
the block prints *_prop_power_mwh and *_prop_mean_price values (see variables
total_agent_prop_power_mwh, total_baseline_prop_power_mwh,
total_baseline_off_prop_power_mwh and total_agent_prop_mean_price,
total_baseline_prop_mean_price, total_baseline_off_prop_mean_price). Ensure the
label change is applied only to the header string and not to the variable
formatting or fmt_optional calls.
No description provided.