Conversation
📝 Walkthrough

Added a pre-admission queue-aging pass, enabled MAX_JOB_AGE-based dropping, centralized per-step "lost" job accounting across queue/backlog/assignment, and extended metrics/evaluation with loss-rate and proportional-cost/savings fields.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Env as ComputeClusterEnv
    participant Queue as JobQueue
    participant Backlog as Backlog
    participant Assigner as assign_jobs_to_available_nodes
    participant Nodes as ComputeNodes
    participant Metrics as MetricsTracker
    Env->>Queue: age_job_queue(job_queue_2d, next_empty_slot)
    Queue-->>Env: (next_empty_slot, queue_aged_dropped)
    Env->>Backlog: age_backlog_queue(backlog)
    Backlog-->>Env: backlog_aged_dropped
    Env->>Assigner: admit + assign (includes new arrivals)
    Assigner->>Nodes: assign jobs
    Assigner-->>Env: queue_dropped (per-call)
    Env->>Env: total_dropped = queue_aged_dropped + backlog_aged_dropped + backlog_dropped + sum(queue_dropped)
    Env->>Metrics: record per-step losses and proportional-costs
    Metrics-->>Env: episode fields (jobs_lost_total, loss_rate, agent_prop_cost, ...)
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
🚥 Pre-merge checks | ❌ 3 failed checks (1 warning, 2 inconclusive)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/environment.py`:
- Around line 365-366: The code currently folds queue-full rejections
(backlog_dropped) into the reward-facing counters (num_dropped_this_step and
ultimately jobs_dropped) causing RewardCalculator.calculate to penalize them
like MAX_JOB_AGE expirations; instead, keep backlog_dropped as a separate
metric-only drop: stop adding backlog_dropped into num_dropped_this_step and
into the aggregate jobs_dropped rollup, leaving increments to
metrics.jobs_rejected_queue_full and metrics.episode_jobs_rejected_queue_full
only; if you intended both to be penalized, alternatively rename those metrics
and the reward variable to "total_rejections" and update
RewardCalculator.calculate to reference the renamed total so intent is clear.
In `@src/job_management.py`:
- Around line 13-18: age_backlog_queue currently increments job wait times every
call and assign_jobs_to_available_nodes calls it multiple times per simulated
hour, causing jobs to age multiple hours; modify
age_backlog_queue(backlog_queue: deque, _metrics: MetricsTracker, _is_baseline:
bool = False) to accept an optional flag (e.g., apply_age: bool = True) that
skips incrementing waiting age when False, and update callers (notably
assign_jobs_to_available_nodes and any refill/greedy follow-up invocations) to
pass apply_age=False for follow-up calls within the same simulation step while
ensuring the single once-per-step call still uses apply_age=True so jobs age
only once per step.
In `@src/reward_calculation.py`:
- Around line 367-368: The current override sets drop_penalty_weighted = -1
which bypasses drop_weight and can make rewards increase after the 5-drop
threshold; instead saturate the drop count before weighting and then apply the
configured drop_weight. Replace the hard-coded -1 branch with: compute
saturated_count = min(num_dropped_this_step, 5), compute raw_drop_penalty =
-saturated_count * (the per-drop penalty used elsewhere, e.g.
drop_penalty_per_drop or the previous formula), then set drop_penalty_weighted =
raw_drop_penalty * drop_weight so that drop_weight==0 disables the term and the
>5 case is monotone with the 5-drop case (no extra hard-coded value). Ensure you
use the existing symbols num_dropped_this_step, drop_penalty_weighted,
drop_weight (and drop_penalty_per_drop or the original raw penalty expression)
when making the change.
In `@train.py`:
- Around line 399-401: The percentage prints for proportional cost savings
divide by totals that can be zero or negative; update the two percent
expressions in the print block that references
total_savings_prop_cost_vs_baseline / total_baseline_prop_cost and
total_savings_prop_cost_vs_baseline_off / total_baseline_off_prop_cost to use
the existing safe_ratio() (and format with fmt_optional()) helpers instead of
raw division so the output is guarded for non-positive totals; locate the print
lines around the PROPORTIONAL COST SAVINGS block and replace the raw divisions
with safe_ratio(...) wrapped/printed via fmt_optional(...) consistent with how
cost_per_1000 is handled.
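The `src/job_management.py` finding above is the subtlest of the four. A minimal sketch of the suggested `apply_age` flag is shown below; the flag name comes from the prompt, while the job layout (`[duration, nodes, cores, age]` lists) and the `MAX_JOB_AGE` value are assumptions for illustration.

```python
from collections import deque

# Sketch of the suggested apply_age flag for age_backlog_queue.
# Jobs are modeled as [duration, nodes, cores, age]; the real layout may differ.
MAX_JOB_AGE = 72  # assumed expiry threshold, in simulated hours

def age_backlog_queue(backlog_queue: deque, apply_age: bool = True) -> int:
    """Age waiting jobs once and drop any that exceed MAX_JOB_AGE.

    Follow-up calls within the same simulated step pass apply_age=False
    so jobs never age more than one hour per step.
    """
    if not apply_age:
        return 0
    dropped = 0
    for _ in range(len(backlog_queue)):
        job = backlog_queue.popleft()
        job[3] += 1  # age by one simulated hour
        if job[3] > MAX_JOB_AGE:
            dropped += 1  # expired: do not requeue
        else:
            backlog_queue.append(job)
    return dropped

backlog = deque([[4, 1, 8, 10], [2, 1, 4, MAX_JOB_AGE]])
print(age_backlog_queue(backlog))                   # 1 (the second job expires)
print(age_backlog_queue(backlog, apply_age=False))  # 0 (follow-up call is a no-op)
```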
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: bdc0b799-fcc6-43fd-bd64-feb1928a7fc5
📒 Files selected for processing (8)
- src/baseline.py
- src/config.py
- src/environment.py
- src/evaluation_summary.py
- src/job_management.py
- src/metrics_tracker.py
- src/reward_calculation.py
- train.py
```python
self.metrics.jobs_rejected_queue_full += backlog_dropped
self.metrics.episode_jobs_rejected_queue_full += backlog_dropped
```
Queue-full rejections are now feeding the reward-side drop counter.
Lines 365-366 already track backlog_dropped separately as jobs_rejected_queue_full, but Line 385 also folds it into num_dropped_this_step, and Lines 504-505 then roll it into jobs_dropped. Because RewardCalculator.calculate() consumes num_dropped_this_step, queue-capacity rejections now get the same penalty as MAX_JOB_AGE expirations. If that split is intentional, rename the metric/penalty to “total rejections”; otherwise keep backlog_dropped out of the reward-facing counter.
Possible split between penalized drops and queue-full rejections

```diff
- num_dropped_this_step = backlog_aged_dropped + backlog_dropped
+ num_dropped_this_step = backlog_aged_dropped
+ num_rejected_queue_full_this_step = backlog_dropped
@@
- self.metrics.jobs_dropped += num_dropped_this_step
- self.metrics.episode_jobs_dropped += num_dropped_this_step
+ self.metrics.jobs_dropped += num_dropped_this_step + num_rejected_queue_full_this_step
+ self.metrics.episode_jobs_dropped += num_dropped_this_step + num_rejected_queue_full_this_step
```

Also applies to: 385-385, 504-505
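The split the diff proposes behaves as in this toy sketch; the metric names come from the review, while the surrounding class is invented for illustration.

```python
# Illustrative only: a toy metrics holder showing how penalized drops and
# queue-full rejections stay separate for the reward while both still feed
# the aggregate jobs_dropped metric.

class Metrics:
    def __init__(self) -> None:
        self.jobs_dropped = 0
        self.jobs_rejected_queue_full = 0

def account_step(metrics: Metrics, backlog_aged_dropped: int, backlog_dropped: int) -> int:
    num_dropped_this_step = backlog_aged_dropped          # reward-facing
    num_rejected_queue_full_this_step = backlog_dropped   # metric-only
    metrics.jobs_rejected_queue_full += num_rejected_queue_full_this_step
    metrics.jobs_dropped += num_dropped_this_step + num_rejected_queue_full_this_step
    return num_dropped_this_step  # only this would reach the reward calculation

m = Metrics()
reward_drops = account_step(m, backlog_aged_dropped=2, backlog_dropped=3)
print(reward_drops, m.jobs_dropped, m.jobs_rejected_queue_full)  # 2 5 3
```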
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/environment.py` around lines 365 - 366, The code currently folds
queue-full rejections (backlog_dropped) into the reward-facing counters
(num_dropped_this_step and ultimately jobs_dropped) causing
RewardCalculator.calculate to penalize them like MAX_JOB_AGE expirations;
instead, keep backlog_dropped as a separate metric-only drop: stop adding
backlog_dropped into num_dropped_this_step and into the aggregate jobs_dropped
rollup, leaving increments to metrics.jobs_rejected_queue_full and
metrics.episode_jobs_rejected_queue_full only; if you intended both to be
penalized, alternatively rename those metrics and the reward variable to
"total_rejections" and update RewardCalculator.calculate to reference the
renamed total so intent is clear.
♻️ Duplicate comments (2)
src/reward_calculation.py (1)
364-367: ⚠️ Potential issue | 🟠 Major

Apply `drop_weight` to the new lost-job penalty.

Line 367 writes the raw penalty straight into `drop_penalty_weighted`, so `--drop-weight 0` no longer disables this term and the default reward changes as soon as any job is lost.

Suggested fix

```diff
- drop_penalty_weighted = 0
- if num_dropped_this_step > 0:
-     drop_penalty_weighted = -1.0 - 0.25 * min(num_dropped_this_step - 1, 1000)  # harsher penalty for losing many jobs, capped at -251.0 for 1000+ jobs
+ raw_drop_penalty = 0.0
+ if num_dropped_this_step > 0:
+     raw_drop_penalty = -1.0 - 0.25 * min(num_dropped_this_step - 1, 1000)  # harsher penalty for losing many jobs, capped at -251.0 for 1000+ jobs
+ drop_penalty_weighted = weights.drop_weight * raw_drop_penalty
```

Based on learnings: Configure reward components using weighted parameters: efficiency-weight (default 0.7), price-weight (default 0.2), idle-weight (default 0.1), job-age-weight (default 0.0), and drop-weight (default 0.0)
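Under the suggested fix, the penalty term behaves like the small standalone function below. The curve constants are copied from the diff; the function wrapper itself is illustrative, not the project's actual code.

```python
def drop_penalty_weighted(num_dropped_this_step: int, drop_weight: float) -> float:
    """Escalating lost-job penalty, scaled by the configured drop_weight."""
    raw_drop_penalty = 0.0
    if num_dropped_this_step > 0:
        # -1.0 for the first loss, -0.25 per further loss, capped at -251.0
        raw_drop_penalty = -1.0 - 0.25 * min(num_dropped_this_step - 1, 1000)
    return drop_weight * raw_drop_penalty

print(drop_penalty_weighted(1, drop_weight=0.0) == 0.0)  # True: weight 0 disables the term
print(drop_penalty_weighted(1, drop_weight=1.0))         # -1.0
print(drop_penalty_weighted(2000, drop_weight=1.0))      # -251.0 (cap reached)
```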
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/reward_calculation.py` around lines 364 - 367, The lost-job penalty is being assigned directly to drop_penalty_weighted so it ignores the configured drop_weight; update the calculation in the block that computes drop_penalty_weighted (which uses num_dropped_this_step) to multiply the computed penalty by the drop_weight parameter (i.e., scale the -1.0 - 0.25 * ... expression by drop_weight) so that when drop_weight is 0 the term contributes nothing and when nonzero it applies the intended scaled penalty.

train.py (1)
390-401: ⚠️ Potential issue | 🟡 Minor

Remove the empty f-strings.

Ruff already flags Lines 390, 397, and 401 with F541, so these headings will keep failing lint until the unused `f` prefixes are dropped.

Suggested fix

```diff
- print(f"\n=== AGENT LOST JOBS PER SAVED EURO ===")
+ print("\n=== AGENT LOST JOBS PER SAVED EURO ===")
@@
- print(f"\n=== POWER & PRICE METRICS (TOTAL OVER EVALUATION) ===")
+ print("\n=== POWER & PRICE METRICS (TOTAL OVER EVALUATION) ===")
@@
- print(f"\n=== PROPORTIONAL COST SAVINGS (TOTAL OVER EVALUATION) ===")
+ print("\n=== PROPORTIONAL COST SAVINGS (TOTAL OVER EVALUATION) ===")
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@train.py` around lines 390 - 401, The lint errors come from unused f-string prefixes on heading prints; remove the leading f from the plain heading print calls that contain no interpolations (the lines printing "=== AGENT LOST JOBS PER SAVED EURO ===", "=== POWER & PRICE METRICS (TOTAL OVER EVALUATION) ===", and "=== PROPORTIONAL COST SAVINGS (TOTAL OVER EVALUATION) ===") while leaving the other formatted print(...) calls (e.g., those using variables like total_jobs_dropped, total_agent_prop_power_mwh, etc.) unchanged so only strings without {} are regular string literals.
🧹 Nitpick comments (2)
src/job_management.py (1)
242-246: The `num_dropped` return slot looks stale now.

This helper initializes `num_dropped` at Line 246 and never mutates it, so the extra tuple element and the new caller contract just add bookkeeping noise. Consider returning `(num_processed_jobs, next_empty_slot, next_job_id)` and dropping the downstream `queue_dropped`/`extra_dropped` plumbing.

Possible simplification

```diff
-) -> tuple[int, int, int, int]:
+) -> tuple[int, int, int]:
@@
-    num_dropped = 0
@@
-    return num_processed_jobs, next_empty_slot, num_dropped, next_job_id
+    return num_processed_jobs, next_empty_slot, next_job_id
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/job_management.py` around lines 242 - 246, The helper currently declares and returns a stale num_dropped variable that is never mutated; remove the num_dropped local and stop returning it from the helper (return only (num_processed_jobs, next_empty_slot, next_job_id) from the helper function), then update all callers of that helper (those unpacking into queue_dropped / extra_dropped or expecting a 4-tuple) to accept the 3-tuple and remove any downstream bookkeeping/logic that records queue_dropped/extra_dropped; ensure the helper's signature and any documentation/comments (and tests) are updated to reflect the new 3-value contract.

src/metrics_tracker.py (1)
205-208: Use a strict zip for the new proportional-cost accumulator.

These episode lists are supposed to have one sample per step. Without `strict=True`, a missing append silently truncates `baseline_prop_cost` and hides the bookkeeping bug.

Suggested fix

```diff
  baseline_prop_cost: float = sum(
      (COST_IDLE_MW * MAX_NODES + _compute_delta_mw * (cores / CORES_PER_NODE)) * price
-     for cores, price in zip(self.episode_baseline_used_cores, self.episode_price_stats)
+     for cores, price in zip(self.episode_baseline_used_cores, self.episode_price_stats, strict=True)
  )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/metrics_tracker.py` around lines 205 - 208, The proportional-cost accumulator using sum over zip(self.episode_baseline_used_cores, self.episode_price_stats) can silently truncate if the two episode lists differ in length; update the comprehension in the baseline_prop_cost calculation to use zip(..., strict=True) so a length mismatch raises immediately (refer to baseline_prop_cost, self.episode_baseline_used_cores, self.episode_price_stats, and the expression using COST_IDLE_MW, MAX_NODES, _compute_delta_mw, CORES_PER_NODE) and fail-fast on bookkeeping bugs.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@src/reward_calculation.py`:
- Around line 364-367: The lost-job penalty is being assigned directly to
drop_penalty_weighted so it ignores the configured drop_weight; update the
calculation in the block that computes drop_penalty_weighted (which uses
num_dropped_this_step) to multiply the computed penalty by the drop_weight
parameter (i.e., scale the -1.0 - 0.25 * ... expression by drop_weight) so that
when drop_weight is 0 the term contributes nothing and when nonzero it applies
the intended scaled penalty.
In `@train.py`:
- Around line 390-401: The lint errors come from unused f-string prefixes on
heading prints; remove the leading f from the plain heading print calls that
contain no interpolations (the lines printing "=== AGENT LOST JOBS PER SAVED
EURO ===", "=== POWER & PRICE METRICS (TOTAL OVER EVALUATION) ===", and "===
PROPORTIONAL COST SAVINGS (TOTAL OVER EVALUATION) ===") while leaving the other
formatted print(...) calls (e.g., those using variables like total_jobs_dropped,
total_agent_prop_power_mwh, etc.) unchanged so only strings without {} are
regular string literals.
---
Nitpick comments:
In `@src/job_management.py`:
- Around line 242-246: The helper currently declares and returns a stale
num_dropped variable that is never mutated; remove the num_dropped local and
stop returning it from the helper (return only (num_processed_jobs,
next_empty_slot, next_job_id) from the helper function), then update all callers
of that helper (those unpacking into queue_dropped / extra_dropped or expecting
a 4-tuple) to accept the 3-tuple and remove any downstream bookkeeping/logic
that records queue_dropped/extra_dropped; ensure the helper's signature and any
documentation/comments (and tests) are updated to reflect the new 3-value
contract.
In `@src/metrics_tracker.py`:
- Around line 205-208: The proportional-cost accumulator using sum over
zip(self.episode_baseline_used_cores, self.episode_price_stats) can silently
truncate if the two episode lists differ in length; update the comprehension in
the baseline_prop_cost calculation to use zip(..., strict=True) so a length
mismatch raises immediately (refer to baseline_prop_cost,
self.episode_baseline_used_cores, self.episode_price_stats, and the expression
using COST_IDLE_MW, MAX_NODES, _compute_delta_mw, CORES_PER_NODE) and fail-fast
on bookkeeping bugs.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: d3a54a8c-8417-4fc5-99fa-382032527540
📒 Files selected for processing (11)
- CLAUDE.md
- src/baseline.py
- src/callbacks.py
- src/config.py
- src/environment.py
- src/evaluation_summary.py
- src/job_management.py
- src/metrics_tracker.py
- src/plotter.py
- src/reward_calculation.py
- train.py
✅ Files skipped from review due to trivial changes (1)
- CLAUDE.md
🚧 Files skipped from review as they are similar to previous changes (3)
- src/evaluation_summary.py
- src/config.py
- src/baseline.py
```python
drop_penalty = min(0, PENALTY_DROPPED_JOB * num_dropped_this_step)
drop_penalty_weighted = weights.drop_weight * drop_penalty

# 6. penalty for lost jobs (aged out or rejected because queue/backlog was full)
```
How about we keep the original and simply bump (reduce) PENALTY_DROPPED_JOB to a (much) lower value? Then the reward scale stays compatible with the rest of the components, and drop_weight=0.0 correctly disables it.
♻️ Duplicate comments (2)
train.py (1)
390-390: ⚠️ Potential issue | 🟡 Minor

Remove the no-op `f` prefixes from these section headers.

Lines 390, 397, and 401 are plain strings, so Ruff still raises F541 here and the lint job stays red.

✂️ Minimal fix

```diff
- print(f"\n=== AGENT LOST JOBS PER SAVED EURO ===")
+ print("\n=== AGENT LOST JOBS PER SAVED EURO ===")
@@
- print(f"\n=== POWER & PRICE METRICS (TOTAL OVER EVALUATION) ===")
+ print("\n=== POWER & PRICE METRICS (TOTAL OVER EVALUATION) ===")
@@
- print(f"\n=== PROPORTIONAL COST SAVINGS (TOTAL OVER EVALUATION) ===")
+ print("\n=== PROPORTIONAL COST SAVINGS (TOTAL OVER EVALUATION) ===")
```

Also applies to: 397-397, 401-401
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@train.py` at line 390, Three print calls use unnecessary f-strings which trigger Ruff F541; for the header prints (e.g., the lines printing "=== AGENT LOST JOBS PER SAVED EURO ===" and the two other plain header strings in the same block) remove the leading f so they are simple string literals (change print(f"...") to print("...")); locate the statements in train.py by matching those exact header string contents and update all three occurrences.

src/reward_calculation.py (1)
367-370: ⚠️ Potential issue | 🟠 Major

Apply the lost-job curve through `weights.drop_weight`.

As written, this branch bypasses the configured weight entirely, so `--drop-weight=0` no longer disables the term and a single lost job injects a fixed penalty on a very different scale from the other normalized components. If the steeper curve is intentional, keep it as a raw penalty and multiply it by `weights.drop_weight` before adding it to `reward`.

⚖️ Keep the new shape, but respect the configured weight

```diff
- drop_penalty_weighted = 0
+ drop_penalty_weighted = 0.0
  if self.ALLOW_DROP_PENALTY and num_dropped_this_step > 0:
-     drop_penalty_weighted = -1.0 - 0.25 * min(num_dropped_this_step - 1, 1000)  # harsher penalty for losing many jobs, capped at -251.0 for 1000+ jobs
+     raw_drop_penalty = -1.0 - 0.25 * min(num_dropped_this_step - 1, 1000)
+     drop_penalty_weighted = weights.drop_weight * raw_drop_penalty
```

Based on learnings: Configure reward components using weighted parameters: efficiency-weight (default 0.7), price-weight (default 0.2), idle-weight (default 0.1), job-age-weight (default 0.0), and drop-weight (default 0.0)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/reward_calculation.py` around lines 367 - 370, The drop-penalty branch currently sets drop_penalty_weighted to a raw negative value and ignores the configured weight; modify the logic in the ALLOW_DROP_PENALTY branch so the calculated penalty (based on num_dropped_this_step and the steeper curve) is multiplied by weights.drop_weight before being used in the final reward, e.g., compute the negative penalty as now into drop_penalty_weighted and then scale it by self.weights.drop_weight (or equivalent) so --drop-weight=0 disables the term and configured weights control its magnitude when added to reward.
🧹 Nitpick comments (1)
src/metrics_tracker.py (1)
205-213: Make the new proportional-cost reducers fail fast on timeline mismatches.

These loops assume one baseline sample per price sample. Plain `zip()` silently truncates on the first short input, which would underreport the new proportional totals instead of surfacing the mismatch. `strict=True` (or an explicit length check) is safer here.

🔍 Safer aggregation

```diff
  baseline_prop_cost: float = sum(
      (COST_IDLE_MW * MAX_NODES + _compute_delta_mw * (cores / CORES_PER_NODE)) * price
-     for cores, price in zip(self.episode_baseline_used_cores, self.episode_price_stats)
+     for cores, price in zip(
+         self.episode_baseline_used_cores,
+         self.episode_price_stats,
+         strict=True,
+     )
  )
  baseline_off_prop_cost: float = sum(
      (COST_IDLE_MW * used + _compute_delta_mw * (cores / CORES_PER_NODE)) * price
-     for used, cores, price in zip(self.episode_baseline_used_nodes, self.episode_baseline_used_cores, self.episode_price_stats)
+     for used, cores, price in zip(
+         self.episode_baseline_used_nodes,
+         self.episode_baseline_used_cores,
+         self.episode_price_stats,
+         strict=True,
+     )
  )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/metrics_tracker.py` around lines 205 - 213, The proportional-cost aggregations (baseline_prop_cost and baseline_off_prop_cost) use zip(...) which silently truncates mismatched timelines; update the loops to fail fast by using zip(..., strict=True) or perform explicit length checks between self.episode_price_stats and self.episode_baseline_used_cores / self.episode_baseline_used_nodes before computing baseline_prop_cost, baseline_off_prop_cost (and then compute savings_prop_cost_vs_baseline from agent_prop_cost) so any timeline length mismatch raises an error rather than underreporting totals.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@src/reward_calculation.py`:
- Around line 367-370: The drop-penalty branch currently sets
drop_penalty_weighted to a raw negative value and ignores the configured weight;
modify the logic in the ALLOW_DROP_PENALTY branch so the calculated penalty
(based on num_dropped_this_step and the steeper curve) is multiplied by
weights.drop_weight before being used in the final reward, e.g., compute the
negative penalty as now into drop_penalty_weighted and then scale it by
self.weights.drop_weight (or equivalent) so --drop-weight=0 disables the term
and configured weights control its magnitude when added to reward.
In `@train.py`:
- Line 390: Three print calls use unnecessary f-strings which trigger Ruff F541;
for the header prints (e.g., the lines printing "=== AGENT LOST JOBS PER SAVED
EURO ===" and the two other plain header strings in the same block) remove the
leading f so they are simple string literals (change print(f"...") to
print("...")); locate the statements in train.py by matching those exact header
string contents and update all three occurrences.
---
Nitpick comments:
In `@src/metrics_tracker.py`:
- Around line 205-213: The proportional-cost aggregations (baseline_prop_cost
and baseline_off_prop_cost) use zip(...) which silently truncates mismatched
timelines; update the loops to fail fast by using zip(..., strict=True) or
perform explicit length checks between self.episode_price_stats and
self.episode_baseline_used_cores / self.episode_baseline_used_nodes before
computing baseline_prop_cost, baseline_off_prop_cost (and then compute
savings_prop_cost_vs_baseline from agent_prop_cost) so any timeline length
mismatch raises an error rather than underreporting totals.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 1f367dd8-c24f-4870-9e45-a05be8e2d582
📒 Files selected for processing (11)
- CLAUDE.md
- src/baseline.py
- src/callbacks.py
- src/config.py
- src/environment.py
- src/evaluation_summary.py
- src/job_management.py
- src/metrics_tracker.py
- src/plotter.py
- src/reward_calculation.py
- train.py
✅ Files skipped from review due to trivial changes (1)
- CLAUDE.md
🚧 Files skipped from review as they are similar to previous changes (3)
- src/config.py
- src/evaluation_summary.py
- src/job_management.py
Fix lost-job handling, queue aging, and evaluation reporting

- age queued/backlogged jobs exactly once per step before refill/new arrivals
- drop MAX_JOB_AGE expirations during that start-of-step aging pass
- stop repeated same-step assignment attempts from aging waiting jobs multiple times
- treat all lost jobs consistently in the reward signal, including queue-full rejections
- replace the old >5-drop override with an explicit escalating hard loss penalty
- clarify logs/metrics/docs that jobs_dropped is total lost jobs, while queue-full rejects remain tracked separately
- expose lost-job aliases in summaries/logging for clearer reporting
- guard proportional cost savings percentages with safe_ratio()/fmt_optional()
- update bookkeeping tests for the new aging and loss-accounting behavior

Fix: Remove experimental Drop-Penalty
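The safe_ratio()/fmt_optional() guard mentioned in the summary could be implemented along these lines; both helper bodies are assumptions, since only their names appear in the review.

```python
from typing import Optional

def safe_ratio(numerator: float, denominator: float) -> Optional[float]:
    """Return numerator/denominator, or None when the total is non-positive."""
    if denominator <= 0:
        return None
    return numerator / denominator

def fmt_optional(value: Optional[float], digits: int) -> str:
    """Format a possibly-missing ratio; 'n/a' keeps the report printable."""
    return "n/a" if value is None else f"{value:.{digits}f}"

savings, baseline_total = 1250.0, 0.0  # degenerate total: raw division would crash
pct = safe_ratio(100.0 * savings, baseline_total)
print(f"Vs Baseline: {fmt_optional(pct, 1)}%")  # Vs Baseline: n/a%
```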
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/environment.py (1)
364-372: ⚠️ Potential issue | 🟠 Major

Invalidate the pending-job cache when arrivals spill only into backlog.

This block only marks a mutation when jobs land in the fixed queue or when overflow is rejected. If `add_new_jobs()` pushes arrivals straight into `self.backlog_queue`, `_update_pending_job_stats()` keeps the previous `backlog_size` and `pending_*` values for the whole step.

Based on learnings: Environment state space must include node counts, job queue, electricity prices, pending job statistics (count, core-hours, avg duration, max nodes), and backlog size.

🩹 Proposed fix

```diff
+ backlog_size_before = len(self.backlog_queue)
  new_jobs, self.next_empty_slot, backlog_dropped = add_new_jobs(
      job_queue_2d, new_jobs_count, new_jobs_durations, new_jobs_nodes, new_jobs_cores, self.next_empty_slot, self.backlog_queue
  )
- if len(new_jobs) > 0:
+ if len(new_jobs) > 0 or len(self.backlog_queue) != backlog_size_before:
      queue_backlog_mutated = True
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/environment.py` around lines 364 - 372, Record the backlog size before calling add_new_jobs (e.g., prev_backlog = len(self.backlog_queue)), then after the call compare len(self.backlog_queue) to prev_backlog and set queue_backlog_mutated = True if they differ; keep the existing checks for len(new_jobs) > 0 and backlog_dropped > 0 and update metrics as before, and ensure this change makes _update_pending_job_stats() see the updated backlog_size and pending_* values when arrivals are pushed into self.backlog_queue but not into the fixed queue.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@train.py`:
- Around line 390-403: Change the three header print calls that use f-strings
with no substitutions (e.g., print(f"\n=== AGENT LOST JOBS PER SAVED EURO ==="),
print(f"\n=== POWER & PRICE METRICS (TOTAL OVER EVALUATION) ==="), etc.) to use
literal strings (remove the leading f) to avoid Ruff F541; also update the power
header text to indicate these are proportional metrics (e.g., include
"PROPORTIONAL" or "PROPORTIONAL POWER & PRICE METRICS") because the block prints
*_prop_power_mwh and *_prop_mean_price values (see variables
total_agent_prop_power_mwh, total_baseline_prop_power_mwh,
total_baseline_off_prop_power_mwh and total_agent_prop_mean_price,
total_baseline_prop_mean_price, total_baseline_off_prop_mean_price). Ensure the
label change is applied only to the header string and not to the variable
formatting or fmt_optional calls.
---
Outside diff comments:
In `@src/environment.py`:
- Around line 364-372: Record the backlog size before calling add_new_jobs
(e.g., prev_backlog = len(self.backlog_queue)), then after the call compare
len(self.backlog_queue) to prev_backlog and set queue_backlog_mutated = True if
they differ; keep the existing checks for len(new_jobs) > 0 and backlog_dropped
> 0 and update metrics as before, and ensure this change makes
_update_pending_job_stats() see the updated backlog_size and pending_* values
when arrivals are pushed into self.backlog_queue but not into the fixed queue.
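The before/after length check from that fix amounts to the pattern below; `add_new_jobs_stub` is a toy stand-in (not the real `add_new_jobs`) that only pushes into the backlog, which is exactly the case the old check missed.

```python
from collections import deque

def add_new_jobs_stub(backlog_queue: deque, arrivals: list) -> list:
    """Toy stand-in: the fixed queue is full, so everything spills to backlog."""
    backlog_queue.extend(arrivals)
    return []  # nothing landed in the fixed queue

backlog_queue: deque = deque()
backlog_size_before = len(backlog_queue)

new_jobs = add_new_jobs_stub(backlog_queue, arrivals=[("job-1", 4), ("job-2", 2)])

# The old check (len(new_jobs) > 0) misses backlog-only arrivals;
# comparing backlog length before and after catches them.
queue_backlog_mutated = len(new_jobs) > 0 or len(backlog_queue) != backlog_size_before
print(queue_backlog_mutated)  # True
```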
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 5b62862f-cfee-410d-8594-856ef11d1227
📒 Files selected for processing (11)
- CLAUDE.md
- src/baseline.py
- src/callbacks.py
- src/config.py
- src/environment.py
- src/evaluation_summary.py
- src/job_management.py
- src/metrics_tracker.py
- src/plotter.py
- src/reward_calculation.py
- train.py
✅ Files skipped from review due to trivial changes (2)
- CLAUDE.md
- src/reward_calculation.py
🚧 Files skipped from review as they are similar to previous changes (3)
- src/callbacks.py
- src/job_management.py
- src/evaluation_summary.py
```python
print(f"\n=== AGENT LOST JOBS PER SAVED EURO ===")
print(f"  Total Lost Jobs (Agent): {total_jobs_dropped:,}")
print(f"  Total Lost Jobs (Baseline): {total_baseline_jobs_dropped:,}")
print(f"  Vs Baseline: {fmt_optional(total_dropped_jobs_per_saved_euro, 6)} jobs/€")
print(f"  Vs Baseline_off: {fmt_optional(total_dropped_jobs_per_saved_euro_off, 6)} jobs/€")

print(f"\n=== POWER & PRICE METRICS (TOTAL OVER EVALUATION) ===")
print(f"  Agent: Power={total_agent_power_mwh:,.1f} MWh, Mean Price={total_agent_mean_price:.2f} €/MWh")
print(f"  Baseline: Power={total_baseline_power_mwh:,.1f} MWh, Mean Price={total_baseline_mean_price:.2f} €/MWh")
print(f"  Baseline_off: Power={total_baseline_off_power_mwh:,.1f} MWh, Mean Price={total_baseline_off_mean_price:.2f} €/MWh")
print(f"  Agent: Power={total_agent_prop_power_mwh:,.1f} MWh, Mean Price={total_agent_prop_mean_price:.2f} €/MWh")
print(f"  Baseline: Power={total_baseline_prop_power_mwh:,.1f} MWh, Mean Price={total_baseline_prop_mean_price:.2f} €/MWh")
print(f"  Baseline_off: Power={total_baseline_off_prop_power_mwh:,.1f} MWh, Mean Price={total_baseline_off_prop_mean_price:.2f} €/MWh")
print(f"\n=== PROPORTIONAL COST SAVINGS (TOTAL OVER EVALUATION) ===")
print(f"  Vs Baseline: €{total_savings_prop_cost_vs_baseline:,.0f}, {fmt_optional(prop_savings_pct_vs_baseline, 1)}%")
print(f"  Vs Baseline_off: €{total_savings_prop_cost_vs_baseline_off:,.0f}, {fmt_optional(prop_savings_pct_vs_baseline_off, 1)}%")
```
🧩 Analysis chain
🏁 Script executed:

```shell
#!/bin/bash
set -euo pipefail
python - <<'PY'
import ast
from pathlib import Path
path = Path("train.py")
tree = ast.parse(path.read_text())
for node in ast.walk(tree):
    if isinstance(node, ast.JoinedStr) and not any(isinstance(v, ast.FormattedValue) for v in node.values):
        print(f"{path}:{node.lineno}: constant f-string")
PY
```

Repository: FairRootGroup/powersched
Length of output: 1603
Use literal strings for headers and label the power block as proportional.
Three header print statements use f-strings without any substitutions, triggering Ruff F541. Additionally, the header at line 397 should indicate these are proportional metrics, since the variables printed (*_prop_power_mwh, *_prop_mean_price) are proportional values.
🧹 Proposed fix

```diff
- print(f"\n=== AGENT LOST JOBS PER SAVED EURO ===")
+ print("\n=== AGENT LOST JOBS PER SAVED EURO ===")
  print(f"  Total Lost Jobs (Agent): {total_jobs_dropped:,}")
  print(f"  Total Lost Jobs (Baseline): {total_baseline_jobs_dropped:,}")
  print(f"  Vs Baseline: {fmt_optional(total_dropped_jobs_per_saved_euro, 6)} jobs/€")
  print(f"  Vs Baseline_off: {fmt_optional(total_dropped_jobs_per_saved_euro_off, 6)} jobs/€")
- print(f"\n=== POWER & PRICE METRICS (TOTAL OVER EVALUATION) ===")
+ print("\n=== PROPORTIONAL POWER & PRICE METRICS (TOTAL OVER EVALUATION) ===")
  print(f"  Agent: Power={total_agent_prop_power_mwh:,.1f} MWh, Mean Price={total_agent_prop_mean_price:.2f} €/MWh")
  print(f"  Baseline: Power={total_baseline_prop_power_mwh:,.1f} MWh, Mean Price={total_baseline_prop_mean_price:.2f} €/MWh")
  print(f"  Baseline_off: Power={total_baseline_off_prop_power_mwh:,.1f} MWh, Mean Price={total_baseline_off_prop_mean_price:.2f} €/MWh")
- print(f"\n=== PROPORTIONAL COST SAVINGS (TOTAL OVER EVALUATION) ===")
+ print("\n=== PROPORTIONAL COST SAVINGS (TOTAL OVER EVALUATION) ===")
  print(f"  Vs Baseline: €{total_savings_prop_cost_vs_baseline:,.0f}, {fmt_optional(prop_savings_pct_vs_baseline, 1)}%")
  print(f"  Vs Baseline_off: €{total_savings_prop_cost_vs_baseline_off:,.0f}, {fmt_optional(prop_savings_pct_vs_baseline_off, 1)}%")
```

🧰 Tools
🪛 Ruff (0.15.7)
[error] 390-390: f-string without any placeholders
Remove extraneous f prefix
(F541)
[error] 397-397: f-string without any placeholders
Remove extraneous f prefix
(F541)
[error] 401-401: f-string without any placeholders
Remove extraneous f prefix
(F541)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@train.py` around lines 390 - 403, Change the three header print calls that
use f-strings with no substitutions (e.g., print(f"\n=== AGENT LOST JOBS PER
SAVED EURO ==="), print(f"\n=== POWER & PRICE METRICS (TOTAL OVER EVALUATION)
==="), etc.) to use literal strings (remove the leading f) to avoid Ruff F541;
also update the power header text to indicate these are proportional metrics
(e.g., include "PROPORTIONAL" or "PROPORTIONAL POWER & PRICE METRICS") because
the block prints *_prop_power_mwh and *_prop_mean_price values (see variables
total_agent_prop_power_mwh, total_baseline_prop_power_mwh,
total_baseline_off_prop_power_mwh and total_agent_prop_mean_price,
total_baseline_prop_mean_price, total_baseline_off_prop_mean_price). Ensure the
label change is applied only to the header string and not to the variable
formatting or fmt_optional calls.
No description provided.