Add model-selection strategies and local BioModels batch mode #2
Open
nprzrosas wants to merge 1 commit into
Summary
This PR makes it easier to evaluate `data2sbml` on curated BioModels and adds more flexible ways to choose the final inferred model.
There are two main improvements:
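- batch evaluation of local BioModels entries, including models without a usable `.sedml` file
- selection of the final inferred model by different criteria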
What changed
- `data2sbml.py`: model-selection strategies `pysr_model_selection`, `global_rmse`, and `global_multiobjective`.
- `biomodels_batch.py`: can now run directly on a local curated BioModels directory with `--local-models-dir`.
- Fallback simulation settings: `--fallback-start`, `--fallback-duration`, `--fallback-points`.
- `--require-sedml` for cases where strict SED-ML-only behavior is preferred.
- `biomodels_batch_summary.tsv` with simulation provenance and richer per-model metrics.

Why this is useful
Previously, batch evaluation worked best when a model had both SBML and a usable SED-ML time-course definition.
In practice, the curated local BioModels set is not that uniform, so some models were hard to evaluate without
manual handling.
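As a rough illustration of how the flags listed in this PR might interact, here is a minimal sketch. Only the flag names come from the PR description; the default values, the `resolve_time_course` helper, and the `(start, end, points)` tuple shape are assumptions made for the example and may not match `biomodels_batch.py`.

```python
import argparse
from typing import Optional, Tuple

# (start, end, number of points) of a time course; a hypothetical shape.
TimeCourse = Tuple[float, float, int]


def build_parser() -> argparse.ArgumentParser:
    """Build a parser using the flag names from the PR; defaults are assumed."""
    parser = argparse.ArgumentParser(description="Sketch of a batch runner CLI")
    parser.add_argument("--local-models-dir", required=True)
    parser.add_argument("--fallback-start", type=float, default=0.0)
    parser.add_argument("--fallback-duration", type=float, default=100.0)
    parser.add_argument("--fallback-points", type=int, default=101)
    parser.add_argument("--require-sedml", action="store_true")
    return parser


def resolve_time_course(
    sedml: Optional[TimeCourse], args: argparse.Namespace
) -> Optional[Tuple[float, float, int, str]]:
    """Return (start, end, points, provenance), or None if the model is skipped."""
    if sedml is not None:
        start, end, points = sedml
        return start, end, points, "sedml"
    if args.require_sedml:
        # Strict mode: no SED-ML time course means no simulation.
        return None
    end = args.fallback_start + args.fallback_duration
    return args.fallback_start, end, args.fallback_points, "fallback"
```

Recording the provenance string alongside each result is one way a summary file could distinguish SED-ML-driven runs from fallback runs.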
This PR makes the workflow more practical for benchmarking:
Validation
- `pytest tests` passes locally (20 passed)
- BIOMD0000000001
  - status = success
  - rmse_mean = 9.036302267035591e-21

Notes
Manual inspection of the simulated plots compared with the original simulations for BIOMD0000000001 indicates that the simulated signals do not reproduce the original trajectories. As a result, additional work is needed to improve this pipeline.
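One way to cross-check a single summary metric against a visual comparison is to look at the error per species, both in absolute terms and relative to each species' own range. The sketch below is a generic illustration, not the metric code used in this PR, and it does not diagnose the discrepancy; the function names and the dictionary layout are assumptions.

```python
import math
from typing import Dict, Sequence


def rmse(reference: Sequence[float], simulated: Sequence[float]) -> float:
    """Root-mean-square error between two series sampled on the same time grid."""
    if len(reference) != len(simulated):
        raise ValueError("series must share the same time grid")
    squared = sum((r - s) ** 2 for r, s in zip(reference, simulated))
    return math.sqrt(squared / len(reference))


def per_species_report(
    reference: Dict[str, Sequence[float]],
    simulated: Dict[str, Sequence[float]],
) -> Dict[str, Dict[str, float]]:
    """Report RMSE and range-normalized RMSE for every reference species."""
    report = {}
    for species, ref in reference.items():
        err = rmse(ref, simulated[species])
        span = max(ref) - min(ref)
        # A constant reference has no range to normalize by.
        nrmse = err / span if span > 0 else float("nan")
        report[species] = {"rmse": err, "nrmse": nrmse}
    return report
```

A range-normalized value is easier to compare across species whose concentrations differ by orders of magnitude than an absolute RMSE averaged over all of them.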
This should still be treated as an evaluation-focused improvement, not a claim that the full BioModels set has been validated.
The new global selection strategies are in place, but they still need broader benchmarking across more models.
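To make the difference between the two global strategies concrete, here is a minimal sketch under stated assumptions. The strategy names come from this PR, but the `Candidate` fields, the scoring formula, and the `complexity_weight` parameter are hypothetical and may differ from what `data2sbml.py` actually does.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    """A candidate inferred model; the fields are illustrative only."""
    name: str
    rmse: float       # error over all species combined
    complexity: int   # e.g. total expression size


def select_global_rmse(candidates: List[Candidate]) -> Candidate:
    """Pick the candidate with the lowest overall RMSE."""
    return min(candidates, key=lambda c: c.rmse)


def select_global_multiobjective(
    candidates: List[Candidate], complexity_weight: float = 0.5
) -> Candidate:
    """Pick the candidate with the best trade-off of error and complexity.

    Both terms are scaled by their maximum so they are comparable
    (an assumed normalization, chosen for simplicity).
    """
    max_rmse = max(c.rmse for c in candidates) or 1.0
    max_complexity = max(c.complexity for c in candidates) or 1

    def score(c: Candidate) -> float:
        return c.rmse / max_rmse + complexity_weight * c.complexity / max_complexity

    return min(candidates, key=score)
```

With candidates that have similar error but very different sizes, the two functions can disagree, which is the kind of case broader benchmarking would need to cover.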