OpenSTEF Meta V0.1 #771

Open
Lars800 wants to merge 93 commits into release/v4.0.0 from research/HybridForecaster2.0

Conversation

@Lars800 (Collaborator) commented Nov 28, 2025

Introducing OpenSTEF-Meta, a one-stop shop for all things meta-learning.

This sub-package introduces four common meta-learning algorithms:

Residual Forecaster

  • Two-stage forecasting model
  • Primary model is fitted as usual
  • Secondary model is fitted on the residuals of the primary model
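
As an illustration of the two-stage idea only (not the code in this PR), here is a minimal plain-Python sketch. `MeanModel`, `GroupMeanModel` and `ToyResidualForecaster` are hypothetical toy classes; the actual OpenSTEF-Meta interfaces may look quite different.

```python
from statistics import mean


class MeanModel:
    """Toy primary model (hypothetical): always predicts the training mean."""

    def fit(self, X, y):
        self.mean_ = mean(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


class GroupMeanModel:
    """Toy secondary model (hypothetical): mean target per value of the first feature."""

    def fit(self, X, y):
        groups = {}
        for row, target in zip(X, y):
            groups.setdefault(row[0], []).append(target)
        self.means_ = {key: mean(values) for key, values in groups.items()}
        return self

    def predict(self, X):
        # Unseen groups get a zero correction.
        return [self.means_.get(row[0], 0.0) for row in X]


class ToyResidualForecaster:
    """Fit the primary model on y, then the secondary model on the primary's residuals."""

    def __init__(self, primary, secondary):
        self.primary = primary
        self.secondary = secondary

    def fit(self, X, y):
        self.primary.fit(X, y)
        residuals = [t - p for t, p in zip(y, self.primary.predict(X))]
        self.secondary.fit(X, residuals)
        return self

    def predict(self, X):
        # Final forecast = primary forecast + predicted residual.
        return [p + r for p, r in zip(self.primary.predict(X), self.secondary.predict(X))]


X = [[0], [1], [0], [1]]
y = [10.0, 20.0, 12.0, 22.0]
model = ToyResidualForecaster(MeanModel(), GroupMeanModel()).fit(X, y)
preds = model.predict([[0], [1]])
```

In this toy data the primary model predicts 16.0 everywhere, and the secondary model learns a correction of -5.0 for group 0 and +5.0 for group 1.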

Stacking Forecaster

  • Multiple Base Forecasters
  • Single Regressor is fitted on Base Predictions
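
A rough sketch of the stacking step, assuming two base forecasters and a linear regressor without intercept as the single regressor. The function names `fit_meta_weights` and `stack_predict` are made up for this example and are not part of the PR. In practice the regressor is usually fitted on out-of-sample base predictions to limit leakage; that step is omitted here.

```python
def fit_meta_weights(base_preds, y):
    """Least-squares weights (no intercept) for two columns of base predictions.

    Solves the 2x2 normal equations directly; assumes the two prediction
    columns are not collinear (otherwise the determinant is zero).
    """
    a = sum(p[0] * p[0] for p in base_preds)
    b = sum(p[0] * p[1] for p in base_preds)
    d = sum(p[1] * p[1] for p in base_preds)
    e = sum(p[0] * t for p, t in zip(base_preds, y))
    f = sum(p[1] * t for p, t in zip(base_preds, y))
    det = a * d - b * b
    return ((d * e - b * f) / det, (a * f - b * e) / det)


def stack_predict(base_preds, weights):
    """Combine base predictions with the fitted regressor weights."""
    return [weights[0] * p[0] + weights[1] * p[1] for p in base_preds]


# Each row holds the predictions of base forecaster 1 and 2 for one timestamp.
train_base_preds = [[1.0, 2.0], [2.0, 1.0], [3.0, 3.0]]
# Toy target constructed as 0.25 * forecaster 1 + 0.75 * forecaster 2.
y_train = [1.75, 1.25, 3.0]

weights = fit_meta_weights(train_base_preds, y_train)
stacked = stack_predict([[4.0, 8.0]], weights)
```

Because the toy target is an exact linear mix of the base predictions, the fitted weights recover that mix.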

Learned Weights Forecaster

  • Multiple Base Forecasters
  • Classification model learns optimal model weights
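
One possible reading of this approach, sketched under assumptions: each training sample is labelled with the index of the base forecaster that had the smallest absolute error, a classifier is trained on those labels, and its predicted class probabilities are used as combination weights. `FrequencyClassifier` is a hypothetical stand-in for a real classification model, and the `"day"`/`"night"` feature is invented for the example.

```python
from collections import Counter, defaultdict


def best_model_labels(base_preds, y):
    """Index of the base forecaster with the smallest absolute error, per sample."""
    return [
        min(range(len(row)), key=lambda j: abs(row[j] - target))
        for row, target in zip(base_preds, y)
    ]


class FrequencyClassifier:
    """Toy classifier (hypothetical): class frequencies per categorical key."""

    def fit(self, keys, labels, n_classes):
        counts = defaultdict(Counter)
        for key, label in zip(keys, labels):
            counts[key][label] += 1
        self.n_classes = n_classes
        self.proba_ = {
            key: [c[j] / sum(c.values()) for j in range(n_classes)]
            for key, c in counts.items()
        }
        return self

    def predict_proba(self, keys):
        # Unseen keys fall back to uniform weights.
        uniform = [1.0 / self.n_classes] * self.n_classes
        return [self.proba_.get(key, uniform) for key in keys]


def combine(base_preds, weights):
    """Weighted sum of base predictions, one weight vector per sample."""
    return [sum(w * p for w, p in zip(ws, row)) for row, ws in zip(base_preds, weights)]


keys_train = ["day", "day", "night", "night"]
base_train = [[10.0, 14.0], [11.0, 15.0], [5.0, 3.0], [6.0, 2.0]]
y_train = [10.0, 11.5, 3.0, 2.5]

labels = best_model_labels(base_train, y_train)
clf = FrequencyClassifier().fit(keys_train, labels, n_classes=2)

keys_new = ["day", "night"]
base_new = [[12.0, 16.0], [4.0, 1.0]]
forecast = combine(base_new, clf.predict_proba(keys_new))
```

Here forecaster 0 wins during the day and forecaster 1 at night, so the learned weights select them accordingly.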

Rules Forecaster

  • Multiple Base Forecasters
  • Pre-defined rules on how to combine base methods.
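
A rule-based combiner could look roughly like the sketch below: an ordered list of (condition, model name) pairs where the first matching rule wins and a default model is used otherwise. The function `combine_by_rules`, the feature names `hour` and `is_holiday`, and the rules themselves are illustrative assumptions, not the rules shipped in this PR.

```python
from typing import Callable

# A rule pairs a condition on the input features with the base model to use.
Rule = tuple[Callable[[dict], bool], str]


def combine_by_rules(features: dict, base_preds: dict, rules: list, default: str) -> float:
    """Return the prediction of the first base model whose rule matches."""
    for condition, model_name in rules:
        if condition(features):
            return base_preds[model_name]
    return base_preds[default]


# Hypothetical rules: holidays go to gblinear, early-morning hours to lgbm.
rules: list[Rule] = [
    (lambda f: f["is_holiday"], "gblinear"),
    (lambda f: f["hour"] < 6, "lgbm"),
]

base_preds = {"lgbm": 9.5, "xgboost": 10.0, "gblinear": 12.0}

night = combine_by_rules({"hour": 3, "is_holiday": False}, base_preds, rules, default="xgboost")
holiday = combine_by_rules({"hour": 3, "is_holiday": True}, base_preds, rules, default="xgboost")
midday = combine_by_rules({"hour": 12, "is_holiday": False}, base_preds, rules, default="xgboost")
```

Rule order matters in this sketch: a holiday at 03:00 matches the holiday rule before the early-morning rule is checked.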

This is an initial implementation:

  • The methods have been tested on the Liander 2024 Hugging Face dataset.
  • The results are accurate and improve on the existing LGBM, XGBoost and GBLinear models.
  • The code can still be optimized further for efficiency.

Lars800 and others added 30 commits November 7, 2025 15:48
commit 37089b8
Author: Egor Dmitriev <egor.dmitriev@alliander.com>
Date:   Mon Nov 17 15:29:59 2025 +0100

    fix(#728): Fixed parallelism stability issues, and gblinear feature pipeline. (#752)

    * fix(STEF-2475): Added loky as default option for parallelism since fork causes instabilities for xgboost results.

    Signed-off-by: Egor Dmitriev <egor.dmitriev@alliander.com>

    * fix(STEF-2475): Added better support for flatliners and predicting when data is sparse.

    Signed-off-by: Egor Dmitriev <egor.dmitriev@alliander.com>

    * fix(STEF-2475): Feature handling improvements for gblinear. Like imputation, nan dropping, and checking if features are available.

    Signed-off-by: Egor Dmitriev <egor.dmitriev@alliander.com>

    * fix(#728): Added checks on metrics to gracefully handle empty data. Added flatline filtering during evaluation.

    Signed-off-by: Egor Dmitriev <egor.dmitriev@alliander.com>

    * fix(#728): Updated xgboost to skip scaling on empty prediction.

    Signed-off-by: Egor Dmitriev <egor.dmitriev@alliander.com>

    * fix(STEF-2475): Added parallelism parameters.

    Signed-off-by: Egor Dmitriev <egor.dmitriev@alliander.com>

    ---------

    Signed-off-by: Egor Dmitriev <egor.dmitriev@alliander.com>

commit a85a3f7
Author: Egor Dmitriev <egor.dmitriev@alliander.com>
Date:   Fri Nov 14 14:31:34 2025 +0100

    fix(STEF-2475): Fixed rolling aggregate adder by adding forward filling and stating support for only one horizon. (#750)

    Signed-off-by: Egor Dmitriev <egor.dmitriev@alliander.com>

commit 4f0c664
Author: Egor Dmitriev <egor.dmitriev@alliander.com>
Date:   Thu Nov 13 16:54:15 2025 +0100

    feature: Disabled data cutoff by default to be consistent with openstef 3.  And other minor improvements. (#748)

commit 493126e
Author: Egor Dmitriev <egor.dmitriev@alliander.com>
Date:   Thu Nov 13 16:12:35 2025 +0100

    fix(STEF-2475): fix and refactor backtesting. Renamed horizon to prediction in context of backtestforecasting config for clarity. Added more colors. Fixed data split function to handle 0.0 splits. (#747)

    * fix: Fixed data collation during backtesting. Renamed horizon to prediction in context of backtestforecasting config for clarity. Added more colors. Fixed data split function to handle 0.0 splits.

    * fix: Formatting.

    Signed-off-by: Egor Dmitriev <egor.dmitriev@alliander.com>

    * fix: Formatting.

    Signed-off-by: Egor Dmitriev <egor.dmitriev@alliander.com>

    ---------

    Signed-off-by: Egor Dmitriev <egor.dmitriev@alliander.com>

commit 6b1da44
Author: Egor Dmitriev <egor.dmitriev@alliander.com>
Date:   Thu Nov 13 16:05:32 2025 +0100

    feature: forecaster hyperparams and eval metrics (#746)

    * feature(#729) Removed to_state and from_state methods in favor of builtin python state saving functions.

    Signed-off-by: Egor Dmitriev <egor.dmitriev@alliander.com>

    * feature(#729): Fixed issue where generic transform pipeline could not be serialized.

    Signed-off-by: Egor Dmitriev <egor.dmitriev@alliander.com>

    * feature(#729): Added more state saving tests

    Signed-off-by: Egor Dmitriev <egor.dmitriev@alliander.com>

    * feature(#729): Added more state saving tests

    Signed-off-by: Egor Dmitriev <egor.dmitriev@alliander.com>

    * feature(#729): Added more state saving tests

    Signed-off-by: Egor Dmitriev <egor.dmitriev@alliander.com>

    * feature: standardized objective function. Added custom evaluation functions for forecasters.

    * fix: Formatting.

    Signed-off-by: Egor Dmitriev <egor.dmitriev@alliander.com>

    ---------

    Signed-off-by: Egor Dmitriev <egor.dmitriev@alliander.com>
Signed-off-by: Lars van Someren <lvsom1@gmail.com>
* Added LightGBM, LightGBM Linear Trees and Hybrid Stacking Forecasters

* Fixed small issues

* Ruff compliance

* fixed quality checks

* Fixed last issues, Signed-off-by: Lars van Someren <lars.vansomeren@sia-partners.com>

* fixed comments

* Refactor LightGBM to LGBM

* Update LGBM and LGBMLinear defaults, fixed comments

* Fixed comments

* Added SkopsModelSerializer

* Fixed issues

* Gitignore optimization and dev sandbox

* Added MultiQuantileAdapter Class

* small fix

* Hybrid V2

* Small fix

* Squashed commit of the following: 37089b8, a85a3f7, 4f0c664, 493126e, 6b1da44 (the same five commits as listed above)

* set silence

* small fix

* Fix final learner

* fixed lgbm efficiency

* updated lgbm linear params

* Fixed type and quality issues

* remove deprecated files

Signed-off-by: Lars van Someren <lvsom1@gmail.com>

* change: Fixed dependencies to align more with the current release.

Signed-off-by: Egor Dmitriev <egor.dmitriev@alliander.com>

* change: Style fixes.

Signed-off-by: Egor Dmitriev <egor.dmitriev@alliander.com>

---------

Signed-off-by: Lars van Someren <lvsom1@gmail.com>
Signed-off-by: Egor Dmitriev <egor.dmitriev@alliander.com>
Co-authored-by: Egor Dmitriev <egor.dmitriev@alliander.com>
Signed-off-by: Lars van Someren <lvsom1@gmail.com>
@MvLieshout (Collaborator)

Really nice additions to OpenSTEF, I have left some comments. Mostly small nitpicks.

Signed-off-by: Lars van Someren <lvsom1@gmail.com>
Squashed commit of the following:

commit 6d140bc
Author: Lars Schilders <123180911+lschilders@users.noreply.github.com>
Date:   Wed Dec 17 10:33:19 2025 +0100

    feature: add regex pattern matching in FeatureSelection and fix combine bug (#787)

commit 32a42bb
Author: Lars Schilders <123180911+lschilders@users.noreply.github.com>
Date:   Tue Dec 16 13:50:40 2025 +0100

    feature: Selector transform (#786)

    * feature: add Selector transform

    * add ForecastInputDataset testcases

    * add selected_features to presets

    * add doctest

commit d3977b1
Author: Egor Dmitriev <egor.dmitriev@alliander.com>
Date:   Mon Dec 15 09:17:38 2025 +0100

    feature: added tutorials for basic functionality. Added convenience method for simple openstef baselines. (#785)

    * feature: Added tutorial start.

    Signed-off-by: Egor Dmitriev <egor.dmitriev@alliander.com>

    * feature: Added example notebooks. First draft.

    Signed-off-by: Egor Dmitriev <egor.dmitriev@alliander.com>

    * chore(examples): add examples workspace project and register it in workspace; update lock

    * fix(lint): add missing docstrings in baselines package (D104, D103)

    * chore(examples): add examples workspace project and register it in workspace; update lock

    * chore(examples): Updated text in examples.

    ---------

    Signed-off-by: Egor Dmitriev <egor.dmitriev@alliander.com>

commit 8a4097c
Author: Bart Pleiter <bart.pleiter@alliander.com>
Date:   Wed Dec 10 16:19:48 2025 +0100

    fix: exclude stdev column from quantile column checking. (#783)

    * fix: exclude stdev column from quantile column checking.

    Signed-off-by: Bart Pleiter <bart.pleiter@alliander.com>

    * fix: duplicate removed.

    Signed-off-by: Bart Pleiter <bart.pleiter@alliander.com>

    * fix: type

    Signed-off-by: Bart Pleiter <bart.pleiter@alliander.com>

    ---------

    Signed-off-by: Bart Pleiter <bart.pleiter@alliander.com>

commit 43987fc
Author: Egor Dmitriev <egor.dmitriev@alliander.com>
Date:   Wed Dec 10 10:24:32 2025 +0100

    fix(STEF-2549): Added none check for model end date from mlflow. Added experiment tags. (#782)

    Signed-off-by: Egor Dmitriev <egor.dmitriev@alliander.com>

commit 1891009
Author: Lars Schilders <123180911+lschilders@users.noreply.github.com>
Date:   Tue Dec 9 14:40:40 2025 +0100

    feature: check for model config change and skip model selection (#781)

    * feature: check for model config change and skip model selection

    * changed checking model compatibility

    * check for tag compatibility only

    * fix tests

    * rename new methods in callback

commit c37ac92
Author: Lars Schilders <123180911+lschilders@users.noreply.github.com>
Date:   Tue Dec 9 09:22:58 2025 +0100

    fix: clip values of wind and solar components to below 0 (#779)

    * fix: clip values of wind and solar components to below 0

    * add test for not all components zero

commit 3eb7e69
Author: Egor Dmitriev <egor.dmitriev@alliander.com>
Date:   Mon Dec 8 15:55:16 2025 +0100

    feat(mlflow): suppress MLflow emoji URL logs (#780)

    * feat(mlflow): suppress MLflow emoji URL logs

    Add MLFLOW_SUPPRESS_PRINTING_URL_TO_STDOUT=true environment variable
    to prevent MLflow from printing 'View run...' messages with emojis
    that don't comply with ECS JSON logging format.

    Signed-off-by: Egor Dmitriev <egor.dmitriev@alliander.com>

    * feature: Style fixes.

    Signed-off-by: Egor Dmitriev <egor.dmitriev@alliander.com>

    ---------

    Signed-off-by: Egor Dmitriev <egor.dmitriev@alliander.com>

commit eca628e
Author: Lars Schilders <123180911+lschilders@users.noreply.github.com>
Date:   Fri Dec 5 16:27:53 2025 +0100

    feature: nonzero flatliner preset (#777)

    * add predict_nonzero_flatliner to presets

    * remove redundant validate_required_columns

commit 61e1699
Author: Lars Schilders <123180911+lschilders@users.noreply.github.com>
Date:   Fri Dec 5 15:37:05 2025 +0100

    feature: add standard deviation column to ForecastDataset and add it in ConfidenceIntervalApplicator (#778)

    * feature: add standard deviation column to ForecastDataset and add it in ConfidenceIntervalApplicator

    * simplify code for adding column

commit 4f70d00
Author: Lars Schilders <123180911+lschilders@users.noreply.github.com>
Date:   Fri Dec 5 10:41:23 2025 +0100

    chore: change radiation unit to Wm-2 (#776)

    * chore: change expected radiation unit to W/m-2

    * change values in test for radiation features adder

    * fix docs for dni/gti unit

    * formatting

commit 71ac428
Author: Bart Pleiter <bart.pleiter@alliander.com>
Date:   Wed Dec 3 09:45:34 2025 +0100

    feature: added use_median option to flatliner forecaster so it predic… (#773)

    * feature: added use_median option to flatliner forecaster so it predicts the median of the training data.

    Signed-off-by: Bart Pleiter <bart.pleiter@alliander.com>

    * feature: improved naming to predict_median.

    Signed-off-by: Bart Pleiter <bart.pleiter@alliander.com>

    ---------

    Signed-off-by: Bart Pleiter <bart.pleiter@alliander.com>

commit 45ca37f
Author: Lars Schilders <123180911+lschilders@users.noreply.github.com>
Date:   Wed Nov 26 15:45:07 2025 +0100

    fix: fixes in EvaluationPipeline and TimeSeriesPlotter (#769)

    * Remove target column from predictions to avoid duplication for lead_times

    * get sample_interval class attr

commit ee41442
Author: Egor Dmitriev <egor.dmitriev@alliander.com>
Date:   Wed Nov 26 14:01:16 2025 +0100

    fix: Improved mlflow to use run names and load proper models for reuse. Fixed time series plotter to use correct sample interval parameter. (#768)

    * feature: Improved mlflow to use run names and load proper models for reuse. Fixed time series plotter to use correct sample interval parameter.

    Signed-off-by: Egor Dmitriev <egor.dmitriev@alliander.com>

    * feature(STEF-2551): Fixed path. Changed run_name to step_name in backtester.

    Signed-off-by: Egor Dmitriev <egor.dmitriev@alliander.com>

    ---------

    Signed-off-by: Egor Dmitriev <egor.dmitriev@alliander.com>

commit 7deb69e
Author: Bart Pleiter <bart.pleiter@alliander.com>
Date:   Fri Nov 21 14:42:54 2025 +0100

    chore: replaced alliander emails with lfenergy email. (#767)

    Signed-off-by: Bart Pleiter <bart.pleiter@alliander.com>

Signed-off-by: Lars van Someren <lvsom1@gmail.com>
…ybridForecaster2.0

Signed-off-by: Lars van Someren <lvsom1@gmail.com>
Collaborator

colsample_bytree not used in this file?

Copilot AI left a comment

Pull request overview

This PR introduces OpenSTEF-Meta v0.1, a new sub-package that provides meta-learning algorithms for ensemble forecasting. The package implements four forecasting approaches: Residual Forecaster (2-stage modeling on residuals), Stacking Forecaster (multiple models with meta-regressor), Learned Weights Forecaster (classification-based model selection), and Rules Forecaster (rule-based combination).

Changes:

  • New openstef-meta package with ensemble forecasting models and forecast combiners
  • Integration of ensemble models into existing OpenSTEF workflows and benchmarking infrastructure
  • Addition of predict_contributions method to forecasters for explainability support
  • New utility classes for decision trees, pinball loss calculation, and ensemble datasets
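
For readers unfamiliar with the pinball (quantile) loss mentioned above, here is a minimal sketch of the standard definition, max(q·(y − ŷ), (q − 1)·(y − ŷ)), averaged over samples. The function name `pinball_loss` and its signature are assumptions for illustration and may not match the utility added in this PR.

```python
def pinball_loss(y_true, y_pred, quantile):
    """Mean pinball loss for a single quantile level in (0, 1)."""
    if not 0.0 < quantile < 1.0:
        raise ValueError("quantile must be strictly between 0 and 1")
    losses = [
        # Under-prediction is weighted by q, over-prediction by (1 - q).
        max(quantile * (t - p), (quantile - 1.0) * (t - p))
        for t, p in zip(y_true, y_pred)
    ]
    return sum(losses) / len(losses)


y_true = [10.0, 10.0]
y_pred = [8.0, 12.0]  # one under-prediction, one over-prediction, both off by 2

loss_q90 = pinball_loss(y_true, y_pred, quantile=0.9)
loss_q50 = pinball_loss(y_true, y_pred, quantile=0.5)
```

At the 0.9 quantile the under-prediction costs 1.8 and the over-prediction 0.2; at the median both errors cost the same, which makes the loss half the mean absolute error.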

Reviewed changes

Copilot reviewed 55 out of 62 changed files in this pull request and generated 10 comments.

| File | Description |
| --- | --- |
| packages/openstef-meta/* | New package implementing meta-learning algorithms |
| packages/openstef-models/src/openstef_models/workflows/* | Integration support for ensemble models |
| packages/openstef-models/src/openstef_models/models/forecasting/* | Added predict_contributions methods |
| packages/openstef-models/src/openstef_models/transforms/general/* | New Flagger transform and fixes |
| packages/openstef-beam/src/openstef_beam/benchmarking/* | Benchmark support for ensemble models |
| examples/benchmarks/* | Example benchmarks for ensemble and residual models |
| pyproject.toml, uv.lock | Dependency and workspace updates |

{
"A": [1, 1],
"B": [0, 1],
"C": [0, 0], # Unchanged

Copilot AI Feb 17, 2026

Misleading comment: The comment on line 57 says "Unchanged" but column C is actually transformed like all other columns. The Flagger transform converts all selected features to binary flags (0 or 1) indicating whether values are within training range. The comment should be removed or clarified, for example: "C: [0, 0] - both values are outside training range [1.0, 3.0]".
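
The behavior described in this comment can be sketched as follows. `RangeFlagger` is a hypothetical plain-Python stand-in, not the Flagger class from the PR, and treating the training minimum and maximum as inclusive bounds is an assumption.

```python
class RangeFlagger:
    """Toy transform: flag values as 1 if inside the training range, else 0."""

    def fit(self, data):
        # Remember the (min, max) seen during training for every column.
        self.ranges_ = {col: (min(vals), max(vals)) for col, vals in data.items()}
        return self

    def transform(self, data):
        flagged = {}
        for col, vals in data.items():
            if col in self.ranges_:
                low, high = self.ranges_[col]
                flagged[col] = [1 if low <= v <= high else 0 for v in vals]
            else:
                # Columns not seen during fit are passed through untouched.
                flagged[col] = list(vals)
        return flagged


train = {"C": [1.0, 2.0, 3.0]}  # training range for C is [1.0, 3.0]
test = {"C": [0.5, 4.0]}        # both values fall outside that range

flags = RangeFlagger().fit(train).transform(test)
```

With this data column C becomes `[0, 0]`, which matches the clarified comment suggested above: the column is transformed, not left unchanged.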

train_dataset: TimeSeriesDataset,
test_dataset: TimeSeriesDataset,
):
"""Test fit and transform flags correctly leaves other columns unchanged."""

Copilot AI Feb 17, 2026

Misleading docstring: The test docstring on line 43 says "leaves other columns unchanged" but the Flagger transform actually changes ALL selected columns by converting them to binary flags. Consider updating the docstring to accurately describe the test behavior, such as "Test fit and transform correctly flags all features based on training ranges."

@MvLieshout MvLieshout changed the base branch from research/v4.1.0 to release/v4.0.0 February 17, 2026 10:02
@MvLieshout MvLieshout requested a review from a team February 17, 2026 10:02
@MvLieshout MvLieshout changed the base branch from release/v4.0.0 to research/v4.1.0 February 17, 2026 10:02
@MvLieshout MvLieshout changed the base branch from research/v4.1.0 to release/v4.0.0 February 17, 2026 10:05
Signed-off-by: Marnix van Lieshout <marnix.van.lieshout@alliander.com>
Labels: feature, OpenSTEF 4.0, OpenSTEF-Meta

3 participants