Simplify power functions to scalar interface #41
Conversation
📝 Walkthrough

Replaces array- and state-derived power/reward computations with aggregate scalar inputs: functions now accept (num_on_nodes, total_used_cores). Call sites and logging are updated accordingly.
Sequence Diagram(s)sequenceDiagram
participant Env as Environment
participant RC as RewardCalculator
participant PC as power_consumption_mwh
participant PCost as power_cost
Env->>RC: calculate(num_on_nodes, num_used_cores, current_price, env_print)
RC->>PC: power_consumption_mwh(num_on_nodes, total_used_cores)
PC-->>RC: power_mwh
RC->>PCost: power_cost(num_on_nodes, total_used_cores, current_price)
PCost-->>RC: cost
RC-->>Env: (reward, energy_cost, utilization_reward, price_reward, ...)
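To make the walkthrough concrete, here is a minimal sketch of the scalar interface it describes. The constant values and the linear interpolation between idle and fully-used per-node draw are assumptions for illustration only, not the repository's actual model:

```python
# Sketch of the scalar power helper described above. COST_IDLE_MW, COST_USED_MW,
# and CORES_PER_NODE are placeholder values; the linear idle-to-used
# interpolation is an assumed model, not necessarily the repo's formula.
COST_IDLE_MW = 0.05    # assumed per-node draw when powered on but idle (MW)
COST_USED_MW = 0.15    # assumed per-node draw when all cores are busy (MW)
CORES_PER_NODE = 48    # assumed core count per node

def power_consumption_mwh(num_powered_nodes: int, total_used_cores: int) -> float:
    """Energy for one 1-hour step: idle baseline plus a per-core increment."""
    per_core_extra = (COST_USED_MW - COST_IDLE_MW) / CORES_PER_NODE
    return num_powered_nodes * COST_IDLE_MW + total_used_cores * per_core_extra
```

Because the per-node values sum linearly, two aggregates reproduce what a per-node numpy array would compute, which is the point of the interface change.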
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks: ✅ 3 passed
Actionable comments posted: 1
🧹 Nitpick comments (2)
src/environment.py (1)
480-485: Keep `step_cost` single-sourced.

`calculate()` already returns the cost for these exact aggregates, but line 490 overwrites it with a second copy of the formula. Reusing the returned value here will keep the reward path and metrics path from drifting if the cost model changes again.

♻️ Proposed cleanup

```diff
 step_reward, step_cost, eff_reward_norm, price_reward, idle_penalty_norm, job_age_penalty_norm = self.reward_calculator.calculate(
     num_used_nodes, num_idle_nodes, current_price, average_future_price,
     num_off_nodes, num_launched_jobs, num_node_changes, job_queue_2d,
     num_unprocessed_jobs, self.weights, num_dropped_this_step, self.env_print,
     num_on_nodes, num_used_cores,
 )
 self.metrics.episode_reward += step_reward
 step_power_mwh = power_consumption_mwh(num_on_nodes, num_used_cores)
-
-step_cost = step_power_mwh * current_price
```

Also applies to: 488-490
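In miniature, the single-sourcing this comment asks for looks like the following. The names and the cost model here are illustrative stand-ins, not the repository's actual code:

```python
# Illustrative only: reuse the cost a calculator already returns instead of
# recomputing it from the same inputs at the call site.
def calculate(num_on_nodes: int, total_used_cores: int, price: float) -> tuple[float, float]:
    # Hypothetical linear cost model for the sketch.
    power_mwh = num_on_nodes * 0.05 + total_used_cores * 0.002
    cost = power_mwh * price
    reward = -cost  # stand-in reward
    return reward, cost

reward, step_cost = calculate(10, 100, 2.0)
# step_cost is used directly downstream; no second `power_mwh * price`
# appears at the call site, so the two paths cannot drift apart.
```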
src/reward_calculation.py (1)
82-89: Trim the retired efficiency-bounds path or align it with the live reward.

`calculate()` now uses `_reward_energy_efficiency_utilization_normalized()`, so these price-dependent `_min_efficiency_reward`/`_max_efficiency_reward` values no longer influence the active reward path in this file. Keeping `_compute_bounds()` wired to the old formula makes future tuning harder to reason about.
Actionable comment:

src/reward_calculation.py (1)

16-31: Validate the scalar inputs in `power_consumption_mwh`.

The helper currently accepts invalid scalar combinations (a negative `num_powered_nodes`, a negative `total_used_cores`, or `total_used_cores > num_powered_nodes * CORES_PER_NODE`) and returns bogus values. Add input validation at the start of `power_consumption_mwh` that raises `ValueError` on these impossible states, with a clear message referencing the offending values and the constants (`COST_IDLE_MW`, `COST_USED_MW`, `CORES_PER_NODE`), so callers fail fast instead of producing skewed energy/cost metrics.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: caec7276-9770-4135-a9b4-13b3d85e0d4e
📒 Files selected for processing (3)
src/baseline.py, src/environment.py, src/reward_calculation.py
🧹 Nitpick comments (1)
src/environment.py (1)
480-485: Pass the new scalar pair by keyword.

These last two arguments are both plain integers, and this PR already fixed one silent reversal in the scalar power path. Keywords would make this call self-documenting and harder to swap accidentally.
♻️ Suggested call-site tweak
```diff
 step_reward, step_cost, eff_reward_norm, price_reward, idle_penalty_norm, job_age_penalty_norm = self.reward_calculator.calculate(
     num_used_nodes, num_idle_nodes, current_price, average_future_price,
     num_off_nodes, num_launched_jobs, num_node_changes, job_queue_2d,
     num_unprocessed_jobs, self.weights, num_dropped_this_step, self.env_print,
-    num_on_nodes, num_used_cores,
+    num_on_nodes=num_on_nodes,
+    total_used_cores=num_used_cores,
 )
```
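Beyond passing the pair by keyword at the call site, the swap hazard can also be ruled out in the signature itself by making the trailing scalars keyword-only. This toy function is a sketch of that design option, not the PR's actual `calculate` signature, and its cost model is made up for the example:

```python
# The bare '*' makes the integer aggregates keyword-only, so a positional
# swap becomes a TypeError instead of a silently wrong number.
def power_cost(*, num_on_nodes: int, total_used_cores: int, price: float) -> float:
    # Hypothetical cost model for demonstration only.
    mwh = num_on_nodes * 0.05 + total_used_cores * 0.002
    return mwh * price

# Correct usage is explicit at the call site:
cost = power_cost(num_on_nodes=10, total_used_cores=100, price=2.0)
```

Calling `power_cost(10, 100, 2.0)` positionally raises `TypeError` immediately, which is the fail-fast behavior the review is after.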
The per-node power values sum linearly, so the numpy array computation reduces to two scalars. Replace the overloaded array/int signatures with `power_consumption_mwh(num_powered_nodes: int, total_used_cores: int)`.

- Remove `_power_consumption_from_state_mwh` and `_used_cores_from_state`
- Remove the overloaded `np.ndarray | int` type union
- Replace optional `nodes`/`cores_available` kwargs in `calculate()` with required `num_on_nodes`/`total_used_cores`; drop the old-model fallback branch
- `baseline_off` passes `num_used_nodes` instead of `num_on_nodes` — no `include_idle_nodes` flag needed
- `int()` casts at numpy boundary points (environment.py, baseline.py)
`calculate()` already returns `total_cost` as its second value; the immediately following `step_cost = step_power_mwh * current_price` was recomputing the same value from the same inputs.
Force-pushed from 0ed3b30 to f17da87
🧹 Nitpick comments (1)
src/reward_calculation.py (1)
256-266: Redundant variable assignment.

Line 260 `used_cores = total_used_cores` is unnecessary — `total_used_cores` can be used directly in the subsequent logic.

♻️ Suggested simplification

```diff
 step_power_mwh = power_consumption_mwh(num_on_nodes, total_used_cores)
 if step_power_mwh <= 0.0:
     return 0.0
-used_cores = total_used_cores
-if used_cores <= 0.0:
+if total_used_cores <= 0:
     efficiency_ratio = 0.0
 else:
-    efficiency_raw = used_cores / step_power_mwh  # core-hours per MWh for this 1h step
+    efficiency_raw = total_used_cores / step_power_mwh  # core-hours per MWh for this 1h step
     efficiency_max = float(CORES_PER_NODE) / COST_USED_MW
     efficiency_ratio = float(np.clip(efficiency_raw / efficiency_max, 0.0, 1.0))
```
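After the simplification, the normalized efficiency term reads like the following standalone sketch. The power model and constant values are assumptions for illustration; only the ratio-and-clip structure mirrors the suggested diff:

```python
import numpy as np

# Placeholder constants; the real module defines its own values.
CORES_PER_NODE = 48
COST_IDLE_MW = 0.05
COST_USED_MW = 0.15

def efficiency_ratio(num_on_nodes: int, total_used_cores: int) -> float:
    # Assumed linear power model standing in for power_consumption_mwh.
    per_core_extra = (COST_USED_MW - COST_IDLE_MW) / CORES_PER_NODE
    step_power_mwh = num_on_nodes * COST_IDLE_MW + total_used_cores * per_core_extra
    if step_power_mwh <= 0.0:
        return 0.0
    if total_used_cores <= 0:
        return 0.0
    efficiency_raw = total_used_cores / step_power_mwh     # core-hours per MWh
    efficiency_max = float(CORES_PER_NODE) / COST_USED_MW  # best case: full node
    return float(np.clip(efficiency_raw / efficiency_max, 0.0, 1.0))
```

Under this assumed model a fully used node hits the maximum ratio and an all-idle fleet scores zero, which is the normalization behavior the live reward path relies on.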
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 36f0ca32-5920-48ef-826c-fde5e67165da
📒 Files selected for processing (5)
src/baseline.py, src/environment.py, src/reward_calculation.py, test/test_price_history.py, test/test_prices_cycling.py
🚧 Files skipped from review as they are similar to previous changes (2)
- src/environment.py
- src/baseline.py
The per-node power values sum linearly, so the numpy array computation over all 335 nodes reduces to two scalars: number of powered-on nodes and total used cores. Make that the interface:

- Replace `power_consumption_mwh(nodes_or_num_used: np.ndarray | int, cores_or_num_idle: np.ndarray | int, include_idle_nodes=True)` with `power_consumption_mwh(num_powered_nodes: int, total_used_cores: int)`: a one-liner, no numpy
- Remove `_power_consumption_from_state_mwh`, `_used_cores_from_state`, and the overloaded type union
- Replace the `nodes`/`cores_available` array kwargs in `calculate()` with required `num_on_nodes: int, total_used_cores: int`, and drop the old-model fallback branch
- `baseline_off` simply passes `num_used_nodes` instead of `num_on_nodes`, so no `include_idle_nodes` flag is needed
- `int()` casts added at the two numpy boundary points (environment.py, baseline.py) to keep type annotations honest throughout

Also fixes a latent bug in `_compute_bounds`: the old scalar path had the arguments reversed for the min-efficiency case; under the new interface the intent is explicit. The overloaded signature masked the error: `0` and `MAX_NODES` were silently interpreted as `num_used_nodes` and `num_idle_nodes` respectively, which gave a different (lower) cost than the true all-idle scenario and therefore a looser normalization floor.