
fix: restore over_time time slider behavior in examples #857

Closed
blooop wants to merge 3 commits into main from fix/over-time-slider-behavior

Conversation

@blooop (Owner) commented Mar 28, 2026

Problem

The recent commit e033fbe (which fixed panel labeling) inadvertently broke the expected behavior of over_time examples in published documentation:

  • Before: Over_time examples created single tabs with interactive time sliders for scrubbing through snapshots
  • After: Each plot_sweep() call with time_src created separate tabs with timestamp labels like over_time [2000-01-01 00:00:00]

This meant that meta-generated examples that were supposed to show interactive time sliders instead showed multiple disconnected tabs, breaking the user experience in published docs.

Solution

This PR restores the original behavior by implementing smart detection of over_time series continuations:

Changes Made

  1. Added _is_over_time_series_continuation() method: Detects when an over_time result is part of an existing time series with the same base title

  2. Modified append_result() logic: Only adds timestamp labels when results are NOT part of a time series continuation

  3. Preserved single-tab behavior: Multiple plot_sweep() calls with the same title now aggregate into one result with time snapshots
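The three changes above can be sketched roughly as follows. This is a simplified illustration, not the PR's actual code: `BenchCfg` and `BenchResult` are minimal stand-ins, `coords` stands in for the dataset's `ds.coords`, and the `time_label` argument approximates what `_time_event_label` computes in bencher/bench_report.py.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class BenchCfg:
    """Stand-in for the real config: just the fields this sketch needs."""
    title: str
    over_time: bool = False


@dataclass
class BenchResult:
    """Stand-in for the real result; `coords` mimics ds.coords membership."""
    bench_cfg: BenchCfg
    coords: Set[str] = field(default_factory=set)


class BenchReport:
    def __init__(self) -> None:
        self.bench_results: List[BenchResult] = []

    def _is_over_time_series_continuation(self, bench_res: BenchResult) -> bool:
        """True when a result with the same base title already exists, so this
        snapshot should aggregate into that tab instead of opening a new one."""
        if not bench_res.bench_cfg.over_time or "over_time" not in bench_res.coords:
            return False
        base_title = bench_res.bench_cfg.title
        return any(r.bench_cfg.title == base_title for r in self.bench_results)

    def append_result(self, bench_res: BenchResult, time_label: Optional[str] = None) -> str:
        """Compute the tab title: attach a timestamp label only when the
        result is NOT a continuation of an existing time series (change 2)."""
        title = bench_res.bench_cfg.title
        if time_label is not None and not self._is_over_time_series_continuation(bench_res):
            title = f"{title} [{time_label}]"
        self.bench_results.append(bench_res)
        return title
```

Under this sketch, later snapshots with the same base title keep the bare title, so they can aggregate into the existing tab rather than spawning new timestamped ones.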

Result

  • ✅ Over_time examples now create single tabs with interactive time sliders (original behavior restored)
  • ✅ Multiple time snapshots aggregate correctly instead of creating separate tabs
  • ✅ Time labels only appear for genuinely separate over_time experiments
  • ✅ Published docs will show proper interactive functionality

Testing

  • Logic verification completed with unit tests
  • Continuation detection works correctly for same/different titles
  • No breaking changes to existing non-over_time functionality

Fixes the issue introduced by commit e033fbe while preserving its intended panel labeling improvements.

Summary by Sourcery

Restore correct aggregation behavior for over_time benchmark results so that multiple snapshots appear under a single tab with a time slider instead of separate timestamped tabs.

Bug Fixes:

  • Prevent over_time examples from generating multiple separate tabs with timestamped titles instead of a single interactive time slider tab.

Enhancements:

  • Add detection for over_time series continuations to decide when to aggregate results versus label them with timestamps.

Over_time examples were creating separate tabs per time snapshot instead of
single tabs with interactive time sliders, breaking the expected user experience
in published docs.

Changes:
- Add _is_over_time_series_continuation() to detect time series continuations
- Modify append_result() to only add time labels for non-continuation results
- Preserve single-tab behavior for plot_sweep() calls with same title
- Restore interactive time slider functionality for over_time examples

Fixes issue where commit e033fbe inadvertently changed over_time behavior
from aggregated time sliders to separate timestamped tabs.
@sourcery-ai sourcery-ai bot (Contributor) commented Mar 28, 2026

Reviewer's Guide

Restores the original single-tab over_time time slider behavior by detecting when new over_time results should be aggregated with existing series and only appending time labels for genuinely separate experiments.

Sequence diagram for append_result over_time continuation detection

sequenceDiagram
    participant Caller
    participant BenchReport
    participant BenchResult
    participant BenchCfg
    participant DataSet
    participant Coords
    participant Panel

    Caller->>BenchReport: append_result(bench_res)
    BenchReport->>BenchResult: get bench_cfg
    BenchResult-->>BenchReport: bench_cfg
    BenchReport->>BenchCfg: get title
    BenchCfg-->>BenchReport: title

    BenchReport->>BenchReport: _time_event_label(bench_res)
    BenchReport-->>BenchReport: label or null

    alt label is not null
        BenchReport->>BenchReport: _is_over_time_series_continuation(bench_res)
        BenchReport->>BenchResult: get bench_cfg
        BenchResult-->>BenchReport: bench_cfg
        BenchReport->>BenchCfg: read over_time
        BenchCfg-->>BenchReport: over_time
        BenchReport->>BenchResult: get ds
        BenchResult-->>BenchReport: ds
        BenchReport->>DataSet: get coords
        DataSet-->>BenchReport: coords
        BenchReport->>Coords: contains(over_time)
        Coords-->>BenchReport: bool
        alt over_time and coord present
            BenchReport->>BenchReport: iterate bench_results
            BenchReport-->>BenchReport: found existing_res with same title?
        end
        BenchReport-->>BenchReport: should_add_time_label
        alt should_add_time_label
            BenchReport->>BenchReport: title = title + " [" + label + "]"
        end
    end

    BenchReport->>BenchReport: bench_results.append(bench_res)

    BenchReport->>BenchResult: plot()
    BenchResult-->>BenchReport: pane
    BenchReport->>Panel: append_tab(pane, title)
    Panel-->>Caller: updated UI tab

Class diagram for updated BenchReport over_time aggregation logic

classDiagram
    class BenchReport {
        list~BenchResult~ bench_results
        _time_event_label(bench_res: BenchResult) str
        _is_over_time_series_continuation(bench_res: BenchResult) bool
        append_result(bench_res: BenchResult) void
        append_to_result(bench_res: BenchResult, pane: pn_panel) void
        append_tab(pane: pn_panel, title: str) void
    }

    class BenchResult {
        BenchCfg bench_cfg
        DataSet ds
        plot() pn_panel
    }

    class BenchCfg {
        str title
        bool over_time
    }

    class DataSet {
        Coords coords
    }

    class Coords {
        contains(key: str) bool
    }

    BenchReport "1" --> "*" BenchResult : bench_results
    BenchResult "1" --> "1" BenchCfg : bench_cfg
    BenchResult "1" --> "1" DataSet : ds
    DataSet "1" --> "1" Coords : coords

File-Level Changes

Change (bencher/bench_report.py): Detect continuation of over_time series to aggregate related results into a single tab instead of creating separate timestamped tabs.
  • Introduce _is_over_time_series_continuation to identify over_time results that share a base title with an existing result and should be aggregated
  • Guard continuation detection by checking both bench_cfg.over_time and the presence of an 'over_time' coordinate on the dataset
  • Search existing bench_results for a result with the same base title to decide if the new result is a continuation

Change (bencher/bench_report.py): Adjust result appending logic so time labels are only added for non-continuation over_time results, preserving single-tab slider behavior for series.
  • Compute the time label with _time_event_label, then determine should_add_time_label based on both the presence of a label and not being an over_time continuation
  • Apply the time label to the title only when should_add_time_label is True, otherwise keep the original title for aggregation
  • Append the BenchResult to bench_results after computing the final title and then create the corresponding panel tab with append_tab


@sourcery-ai sourcery-ai bot left a comment


Hey - I've found 1 issue, and left some high level feedback:

  • The _is_over_time_series_continuation() implementation doesn’t fully match its docstring: it only checks the incoming bench_res for over_time/coords but doesn’t verify that any matching existing result is itself an over_time series, which can lead to unintended aggregation when a non-over_time result happens to share the same title.
  • Relying on bench_cfg.title equality as the sole key for series continuation may be brittle if titles are reused for logically distinct experiments; consider also keying on a more stable identifier (e.g., a run ID or config hash) to avoid accidental merging.
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments

### Comment 1
<location path="bencher/bench_report.py" line_range="85-94" />
<code_context>
             label = label[:57] + "..."
         return label

+    def _is_over_time_series_continuation(self, bench_res: BenchResult) -> bool:
+        """Check if this over_time result is a continuation of an existing time series.
+        
+        Returns True if there's already a result with the same title that has over_time,
+        indicating this should be part of the same aggregated time series.
+        """
+        if not bench_res.bench_cfg.over_time or "over_time" not in bench_res.ds.coords:
+            return False
+            
+        base_title = bench_res.bench_cfg.title
+        
+        # Check if we already have a result with this exact title (no time label)
+        for existing_res in self.bench_results:  # Check all existing results
+            if existing_res.bench_cfg.title == base_title:
+                return True
</code_context>
<issue_to_address>
**issue (bug_risk):** Filter by over_time when deciding if this is a continuation.

The implementation doesn’t match the docstring: it only checks for matching titles and never confirms that any `existing_res` is actually over_time. As a result, a non-over_time result with the same title will cause a later over_time result to be treated as a continuation. In the loop, also check something like `existing_res.bench_cfg.over_time` or `"over_time" in existing_res.ds.coords` so you only aggregate into true over_time series.
</issue_to_address>


Comment on lines +85 to +94
def _is_over_time_series_continuation(self, bench_res: BenchResult) -> bool:
    """Check if this over_time result is a continuation of an existing time series.

    Returns True if there's already a result with the same title that has over_time,
    indicating this should be part of the same aggregated time series.
    """
    if not bench_res.bench_cfg.over_time or "over_time" not in bench_res.ds.coords:
        return False

    base_title = bench_res.bench_cfg.title

issue (bug_risk): Filter by over_time when deciding if this is a continuation.

The implementation doesn’t match the docstring: it only checks for matching titles and never confirms that any existing_res is actually over_time. As a result, a non-over_time result with the same title will cause a later over_time result to be treated as a continuation. In the loop, also check something like existing_res.bench_cfg.over_time or "over_time" in existing_res.ds.coords so you only aggregate into true over_time series.
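The guard the reviewer asks for could look like the sketch below. The stand-alone function and the `SimpleNamespace` stubs are illustrative assumptions rather than the project's actual code; the point is the extra check that the matching existing result is itself an over_time series.

```python
from types import SimpleNamespace


def is_over_time_series_continuation(bench_res, bench_results) -> bool:
    """Return True only when an existing result shares the base title AND is
    itself an over_time series, per the review comment above."""
    if not bench_res.bench_cfg.over_time or "over_time" not in bench_res.ds.coords:
        return False
    base_title = bench_res.bench_cfg.title
    return any(
        r.bench_cfg.title == base_title
        and r.bench_cfg.over_time              # the extra guard the review asks for
        and "over_time" in r.ds.coords
        for r in bench_results
    )


def make_res(title, over_time, coords):
    """Minimal stand-in for BenchResult, for illustration only."""
    return SimpleNamespace(
        bench_cfg=SimpleNamespace(title=title, over_time=over_time),
        ds=SimpleNamespace(coords=coords),
    )
```

With this version, a non-over_time result that merely shares a title no longer causes a later over_time result to be treated as a continuation.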

@github-actions

Performance Report for 8656a74

Metric       Value
Total tests  980
Total time   89.65s
Mean         0.0915s
Median       0.0020s
Top 10 slowest tests
Test Time (s)
test.test_bench_examples.TestBenchExamples::test_example_meta 22.902
test.test_over_time_save_perf::test_save_faster_without_aggregated_tab 5.425
test.test_hash_persistent.TestCrossProcessDeterminism::test_hash_stable_across_two_processes[ResultBool] 3.956
test.test_generated_examples::test_generated_example[result_types/result_image/result_image_to_video.py] 3.067
test.test_over_time_repeats.TestMaxSliderPoints::test_default_subsampling_caps_at_max 2.068
test.test_optuna_result.TestOptunaResult::test_collect_optuna_plots_with_repeats 1.054
test.test_optuna_result.TestOptunaReportRouting::test_optuna_plots_per_sweep_tab 1.042
test.test_bencher.TestBencher::test_bench_cfg_hash 0.959
test.test_result_bool.TestVolumeResult::test_volume_3float_multi_repeat 0.891
test.test_over_time_repeats.TestShowAggregatedTimeTab::test_curve_aggregated_tab_absent_when_disabled 0.854

Full report

Updated by Performance Tracking workflow

Apply pre-commit formatting fixes:
- Standardize indentation and whitespace
- Consolidate multi-line assignment
- Remove extra blank lines

These changes ensure CI pre-commit checks pass.
@github-actions

Performance Report for 16c0602

Metric       Value
Total tests  980
Total time   88.75s
Mean         0.0906s
Median       0.0020s
Top 10 slowest tests
Test Time (s)
test.test_bench_examples.TestBenchExamples::test_example_meta 22.871
test.test_over_time_save_perf::test_save_faster_without_aggregated_tab 4.990
test.test_hash_persistent.TestCrossProcessDeterminism::test_hash_stable_across_two_processes[ResultBool] 3.836
test.test_generated_examples::test_generated_example[result_types/result_image/result_image_to_video.py] 2.921
test.test_over_time_repeats.TestMaxSliderPoints::test_default_subsampling_caps_at_max 2.440
test.test_bench_runner.TestBenchRunner::test_benchrunner_unified_interface 1.197
test.test_optuna_result.TestOptunaResult::test_collect_optuna_plots_with_repeats 1.054
test.test_optuna_result.TestOptunaReportRouting::test_optuna_plots_per_sweep_tab 1.017
test.test_result_bool.TestVolumeResult::test_volume_3float_multi_repeat 0.860
test.test_bencher.TestBencher::test_combinations 0.825

Full report

Updated by Performance Tracking workflow

@github-actions

Performance Report for 392f256

Metric       Value
Total tests  1019
Total time   102.21s
Mean         0.1003s
Median       0.0010s
Top 10 slowest tests
Test Time (s)
test.test_bench_examples.TestBenchExamples::test_example_meta 22.934
test.test_over_time_save_perf::test_save_faster_without_aggregated_tab 5.393
test.test_hash_persistent.TestCrossProcessDeterminism::test_hash_stable_across_two_processes[ResultBool] 3.694
test.test_generated_examples::test_generated_example[advanced/advanced_cartesian_animation.py] 2.928
test.test_over_time_repeats.TestMaxSliderPoints::test_default_subsampling_caps_at_max 2.928
test.test_generated_examples::test_generated_example[result_types/result_image/result_image_to_video.py] 2.864
test.test_bench_runner.TestBenchRunner::test_benchrunner_unified_interface 1.289
test.test_optuna_result.TestOptunaResult::test_collect_optuna_plots_with_repeats 1.278
test.test_bencher.TestBencher::test_combinations_over_time 1.276
test.test_bencher.TestBencher::test_combinations 1.087

Full report

Updated by Performance Tracking workflow

@blooop blooop closed this Apr 3, 2026