Conversation
13-task plan covering robustness fixes, DDA support, new DIA-NN params, InfinDIA groundwork, comprehensive documentation, and issue cleanup. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Without pipefail, if the command before tee fails, tee returns 0 and the Nextflow task appears to succeed. This masked failures in generate_cfg, diann_msstats, samplesheet_check, and sdrf_parsing. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
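The masking described in this commit can be reproduced outside Nextflow. The following is a minimal sketch, assuming `bash` is on the PATH; the helper name `run` is illustrative and not part of the pipeline.

```python
import subprocess


def run(script: str) -> int:
    """Run a bash snippet and return its exit status."""
    return subprocess.run(["bash", "-c", script]).returncode


# `false` fails, but the pipeline reports the status of `tee`, its last command.
masked = run("false | tee /dev/null")

# With pipefail, a failure anywhere in the pipeline becomes the pipeline status.
surfaced = run("set -o pipefail; false | tee /dev/null")

print(masked, surfaced)
```

The first status is what a task script without pipefail would hand back to the workflow engine, which is why the failing modules looked successful.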
These are the longest-running tasks and most susceptible to transient failures (OOM, I/O timeouts). The error_retry label enables automatic retry on signal exits (130-145, 104, 175). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
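As a rough illustration of which exit statuses the retry covers, the sketch below encodes only the ranges named in this commit message. The function name `is_retryable` is hypothetical, and the actual `error_retry` configuration in the pipeline is not shown here.

```python
# Exit statuses named in the commit message: 130-145, plus 104 and 175.
RETRYABLE_EXIT_CODES = frozenset(range(130, 146)) | {104, 175}


def is_retryable(exit_status: int) -> bool:
    """Return True when a task exit status falls in the retryable set."""
    return exit_status in RETRYABLE_EXIT_CODES


for status in (1, 104, 137, 146, 175):
    print(status, "retry" if is_retryable(status) else "fail")
```

Status 137 (128 + SIGKILL) is the one typically seen when a process is killed for exceeding memory, so it falls inside the retried range.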
Guard ch_searchdb and ch_experiment_meta with ifEmpty to fail fast with clear error messages instead of hanging indefinitely. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds conf/diann_versions/v2_3_2.config with ghcr.io/bigbio/diann:2.3.2 container. Use -profile diann_v2_3_2 to opt in. Default stays 1.8.1. Enables DDA support and InfinDIA features. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- New param diann_dda (boolean, default: false)
- Version guard: requires DIA-NN >= 2.3.2
- Passes --dda to all 5 DIA-NN modules when enabled
- Accepts DDA acquisition method in SDRF when diann_dda=true
- Added --dda to blocked lists in all modules
Closes #5
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- test_dda: BSA dataset with diann_dda=true on DIA-NN 2.3.2
- test_dia_skip_preanalysis: tests previously untested skip path
Both added to extended_ci.yml stage 2a.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- diann_light_models: 10x faster in-silico library generation
- diann_export_quant: fragment-level parquet export
- diann_site_ms1_quant: MS1 apex intensities for PTM quantification
All require DIA-NN >= 2.0.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Experimental support for InfinDIA (DIA-NN 2.3.0+). Passes --infin-dia to library generation when enabled. Version guard enforces >= 2.3.0. No test config — InfinDIA requires large databases. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Complete reference for all ~70 pipeline parameters grouped by category with types, defaults, descriptions, and version requirements. Closes #1 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- DDA mode documentation with limitations
- Missing param sections (preprocessing, extra_args scope, verbose output)
- DIA-NN version selection guide
- Parquet vs TSV output explanation
- MSstats format section
- pmultiqc citation added
- README updated with version table and parameter reference link
Closes #3, #9, #15
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add version guard for DIA-NN 2.0+ params (--light-models, --export-quant, --site-ms1-quant) to prevent crashes with 1.8.1
- Add *.site_report.parquet as optional output in FINAL_QUANTIFICATION for site-level PTM quantification
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1. test_dda.config: Add diann_version = '2.3.2' so the version guard
doesn't reject DDA mode (default is 1.8.1, guard requires >= 2.3.2)
2. quantmsdiann.nf: Update branch condition to also match "dda"
acquisition method. Previously "dda".contains("dia") was false,
causing all DDA files to be silently dropped from processing.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
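The silent drop described in item 2 comes from a substring test. The sketch below restates it in Python; the function names are hypothetical, and the real branch condition in quantmsdiann.nf is not reproduced here.

```python
def old_is_selected(acquisition_method: str) -> bool:
    """Previous behaviour: only methods containing 'dia' were kept."""
    return "dia" in acquisition_method.lower()


def new_is_selected(acquisition_method: str) -> bool:
    """Updated behaviour: methods containing 'dia' or 'dda' are kept."""
    method = acquisition_method.lower()
    return "dia" in method or "dda" in method


# 'dda' does not contain the substring 'dia', so the old test rejected it.
print(old_is_selected("dda"), new_is_selected("dda"))
```

Because a rejected file simply never entered a branch, nothing failed; the files were just absent from downstream processing.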
These flags exist in DIA-NN 1.8.x but were removed in 2.3.x, causing 'unrecognised option' warnings. Only pass them for versions < 2.3. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Bruker .d to mzML conversion via tdf2mzml is no longer needed. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Merge dev branch (version bump, tdf2mzml removal, lint fixes, DOI update)
- Update test_dda.config to use PXD022287 HeLa DDA dataset with subset FASTA
- Add test_dda profile to CI matrix in ci.yml
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The test_dda profile uses ghcr.io/bigbio/diann:2.3.2 which is a private container requiring authentication. Add Docker login step (matching merge_ci.yml) conditioned on test_dda profile. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Remove implementation plan from repo, add docs/plans/ to .gitignore
- Add lib/VersionUtils.groovy for semantic version comparison (prevents string comparison bugs like '2.10.0' < '2.3')
- Update all version guards in dia.nf and module scripts to use VersionUtils.versionLessThan/versionAtLeast
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
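To show why string comparison misorders versions, here is a minimal Python sketch of numeric, component-wise comparison. It is not the contents of lib/VersionUtils.groovy; the function names only mirror the ones mentioned in the commit, and the sketch assumes purely numeric dotted versions (no suffix such as `dev`).

```python
def _parts(version: str) -> list:
    """Split '2.3.2' into [2, 3, 2]; assumes numeric components only."""
    return [int(piece) for piece in version.split(".")]


def compare_versions(a: str, b: str) -> int:
    """Return -1, 0 or 1, padding the shorter version with zeros."""
    pa, pb = _parts(a), _parts(b)
    width = max(len(pa), len(pb))
    pa += [0] * (width - len(pa))
    pb += [0] * (width - len(pb))
    return (pa > pb) - (pa < pb)


def version_less_than(a: str, b: str) -> bool:
    return compare_versions(a, b) < 0


def version_at_least(a: str, b: str) -> bool:
    return compare_versions(a, b) >= 0


# Lexicographic comparison looks at '1' vs '3' and gets the order wrong.
print("2.10.0" < "2.3", version_less_than("2.10.0", "2.3"))
```

Padding with zeros makes `2.3` and `2.3.0` compare as equal, which matters when a guard is written against a two-component minimum.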
DDA analysis support is a major feature warranting a major version bump. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Rename output version→versions in sdrf_parsing/meta.yml
- Add ch_ prefix to input_file→ch_input_file in input_check/meta.yml
- Fix grammar in pmultiqc and diann_msstats meta.yml descriptions
- Fix glob pattern in decompress_dotd/meta.yml (double-dot expansion)
- Update CITATIONS.md to link published Nature Methods article
- Fix schema_input.json error messages (source name, whitespace)
- Standardize quantmsdiann keyword in utils meta.yml
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Update the quantms-utils and pmultiqc images
Fix/insilico log diann 2.3.2
Since quantmsdiann is a DIA-NN-only pipeline, the diann_ prefix on parameters is redundant. Renamed all user-facing params:
diann_debug -> debug_level
diann_speclib -> speclib
diann_extra_args -> extra_args
diann_dda -> dda
diann_light_models -> light_models
diann_export_quant -> export_quant
diann_site_ms1_quant -> site_ms1_quant
diann_pre_select -> pre_select
diann_report_decoys -> report_decoys
diann_export_xic -> export_xic
diann_normalize -> normalize
diann_use_quant -> use_quant
diann_tims_sum -> tims_sum
diann_im_window -> im_window
diann_channel_run_norm -> channel_run_norm
diann_channel_spec_norm -> channel_spec_norm
Removed diann_no_peptidoforms entirely, as it is superseded by the new scoring_mode parameter (generic/proteoforms/peptidoforms). --no-peptidoforms remains in blocked flags lists.
Note: diann_version is NOT renamed (used in profile names).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
feat: rename FDR params and expose matrix-level q-value controls
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
feat: add DIA-NN scoring mode parameter (Generic/Proteoforms/Peptidoforms)
Actionable comments posted: 7
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
modules/local/diann/diann_msstats/main.nf (1)
1-7: 🛠️ Refactor suggestion | 🟠 Major
Add the required `diann` process label.
This DIA module process is missing `label 'diann'`, which breaks the expected label-based selection pattern.
As per coding guidelines, `modules/local/diann/*/main.nf`: All DIA-NN process modules must include both `label 'process_'` and `label 'diann'` labels for container and resource selection.
🏷️ Suggested fix
 process DIANN_MSSTATS {
     tag "diann_msstats"
     label 'process_medium'
+    label 'diann'
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@modules/local/diann/diann_msstats/main.nf` around lines 1 - 7, The DIANN_MSSTATS process is missing the required diann label which prevents label-based selection; update the process DIANN_MSSTATS declaration to include label 'diann' alongside the existing label 'process_medium' (i.e., ensure both label 'process_medium' and label 'diann' are present in the process block) so container and resource selection works as other DIA-NN modules expect.
♻️ Duplicate comments (2)
docs/usage.md (1)
98-104: ⚠️ Potential issue | 🟡 Minor
Remove the second `Preprocessing Options` block.
This duplicates the earlier section at Line 32 and creates a second source of truth for the same parameters. Please fold any new content into the original section instead of reintroducing the heading here.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/usage.md` around lines 98 - 104, Remove this duplicate "Preprocessing Options" block and merge any unique flags or descriptions here (e.g., --reindex_mzml, --mzml_statistics, --mzml_features, --convert_dotd) into the original "Preprocessing Options" section that appears earlier in the doc; delete this repeated heading and its bullet items so there is only one authoritative Preprocessing Options section containing the consolidated flags and defaults.
workflows/dia.nf (1)
76-95: ⚠️ Potential issue | 🟠 Major
Propagate the resolved DDA mode to every DIA-NN step.
`ch_is_dda` folds in `params.dda`, but only `INSILICO_LIBRARY_GENERATION` consumes that resolved boolean. The later DIA-NN modules still decide from `meta.acquisition_method`, so `--dda true` can make library generation run in DDA mode while PRELIMINARY/ASSEMBLE/INDIVIDUAL/FINAL stay in DIA mode.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@workflows/dia.nf` around lines 76 - 95, ch_is_dda computed from ch_experiment_meta/params.dda is not being propagated to downstream DIA-NN steps (only INSILICO_LIBRARY_GENERATION uses it), causing later modules to re-evaluate meta.acquisition_method; update the PRELIMINARY, ASSEMBLE, INDIVIDUAL, FINAL (and any other DIA-NN) invocations to consume the resolved ch_is_dda value instead of reading meta.acquisition_method directly; for example, pass ch_is_dda as an input channel or wire it into those process calls so their logic uses the boolean from ch_is_dda (and remove or override any meta.acquisition_method-based checks inside those processes).
🧹 Nitpick comments (3)
conf/tests/test_latest_dia.config (1)
14-15: Consider updating "latest" profile to DIA-NN 2.3.2.
The profile name and description reference "latest DIA (2.2.0)", but the PR includes support for DIA-NN 2.3.2 which adds DDA support (per `conf/diann_versions/v2_3_2.config`). If 2.3.2 is intended to be the latest supported version, consider updating this test profile.
Also applies to: 41-42
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@conf/tests/test_latest_dia.config` around lines 14 - 15, Update the "latest DIA" test profile strings to reflect DIA-NN v2.3.2: change config_profile_name and config_profile_description to mention "latest DIA (2.3.2)" and update the description to note DDA support; ensure these edits align with the new version file conf/diann_versions/v2_3_2.config and also update the duplicate occurrences referenced at lines 41-42.
conf/modules/dia.config (1)
13-31: Optional: consolidate duplicated `ext.args` selectors.
These five blocks are functionally identical and can be merged into one regex selector to reduce config drift risk.
♻️ Suggested consolidation
 process {
-
-    withName: ".*:DIA:INSILICO_LIBRARY_GENERATION" {
-        ext.args = { params.extra_args ?: '' }
-    }
-
-    withName: ".*:DIA:PRELIMINARY_ANALYSIS" {
-        ext.args = { params.extra_args ?: '' }
-    }
-
-    withName: ".*:DIA:ASSEMBLE_EMPIRICAL_LIBRARY" {
-        ext.args = { params.extra_args ?: '' }
-    }
-
-    withName: ".*:DIA:INDIVIDUAL_ANALYSIS" {
-        ext.args = { params.extra_args ?: '' }
-    }
-
-    withName: ".*:DIA:FINAL_QUANTIFICATION" {
+    withName: ".*:DIA:(INSILICO_LIBRARY_GENERATION|PRELIMINARY_ANALYSIS|ASSEMBLE_EMPIRICAL_LIBRARY|INDIVIDUAL_ANALYSIS|FINAL_QUANTIFICATION)" {
         ext.args = { params.extra_args ?: '' }
     }
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@conf/modules/dia.config` around lines 13 - 31, The five identical selectors (withName: ".*:DIA:INSILICO_LIBRARY_GENERATION", ".*:DIA:PRELIMINARY_ANALYSIS", ".*:DIA:ASSEMBLE_EMPIRICAL_LIBRARY", ".*:DIA:INDIVIDUAL_ANALYSIS", ".*:DIA:FINAL_QUANTIFICATION") all set ext.args = { params.extra_args ?: '' } and should be consolidated into a single selector using a combined regex (e.g., ".*:DIA:(INSILICO_LIBRARY_GENERATION|PRELIMINARY_ANALYSIS|ASSEMBLE_EMPIRICAL_LIBRARY|INDIVIDUAL_ANALYSIS|FINAL_QUANTIFICATION)") so that ext.args is defined once; update the withName block to use that combined regex and remove the five duplicate blocks, keeping the ext.args assignment as-is.
docs/parameters.md (1)
55-64: Document `--dda` in one place to avoid drift.
This parameter is now described in both the general DIA-NN table and the dedicated DDA section. A single canonical entry plus a short cross-reference would be easier to keep in sync.
Also applies to: 124-133
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/parameters.md` around lines 55 - 64, Consolidate the `--dda` parameter documentation by keeping a single canonical entry in the DIA‑NN parameters table (the `--dda` row) and remove the duplicate detailed description from the dedicated DDA section; in that DDA section replace the duplicate text with a short cross-reference pointing to the parameters table (e.g., "See `--dda` in the DIA‑NN parameters table"). Apply the same consolidation for the other duplicate at the later occurrence (lines referenced in the review), ensuring the unique symbol `--dda` is the single source of truth and the DDA section only links to it.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/parameters.md`:
- Line 30: The docs list flags `--convert_dotd` and `--empirical_assembly_log`
but those parameters are missing from nextflow_schema.json; either add these
params to the pipeline config and then regenerate nextflow_schema.json with
`nf-core pipelines schema build` so the schema includes `--convert_dotd` and
`--empirical_assembly_log`, or remove these entries from docs/parameters.md (and
any other occurrences noted) to keep docs and schema in sync; update the
pipeline parameter definitions where `convert_dotd` and `empirical_assembly_log`
are declared so they match the names/types documented before rebuilding the
schema.
In `@modules/local/diann/final_quantification/main.nf`:
- Around line 49-56: The blocked-flag list named blocked in main.nf is missing
the channel-normalization CLI switches owned by params.channel_run_norm and
params.channel_spec_norm; add the corresponding flags (e.g. --channel-run-norm
and --channel-spec-norm) to the blocked array where it’s defined and also add
them to the second blocked list referenced later (the block at lines ~78-83) so
task.ext.args cannot silently pass those switches through.
In `@modules/local/diann/insilico_library_generation/meta.yml`:
- Around line 22-24: The metadata description for the boolean flag is_dda in
meta.yml mentions the wrong CLI flag name (--diann_dda); update that description
to reference the actual runtime flag (--dda) so it matches the module command
path and runtime behavior, keeping the rest of the description intact.
In `@nextflow_schema.json`:
- Around line 324-344: The q-value fields precursor_qvalue, matrix_qvalue, and
matrix_spec_q are currently unbounded numbers; update each property's schema to
constrain values to the valid probability range by adding "minimum": 0 and
"maximum": 1 (retaining "type": "number") so values outside [0,1] (e.g., -1 or
1.5) will fail validation before reaching DIA-NN.
- Around line 479-513: The CI lacks test matrix entries for new DIA-NN
feature/version combos; update the test-features-matrix in merge_ci.yml to add
matrix entries for test_light_models with versions 2.1.0 and 2.2.0, add a new
test_infin_dia profile pinned to 2.3.2, and include test_dda pinned to 2.3.2
(matching the extended_ci.yml profile) so the workflow gating in
workflows/dia.nf is exercised for light_models, enable_infin_dia/pre_select, and
dda minimum supported versions.
In `@nextflow.config`:
- Line 380: The pipeline manifest currently sets version = '2.0.0dev' which
marks published metadata as a development build; change the version value in
nextflow.config (the version variable) from '2.0.0dev' to the final release
string '2.0.0' so the manifest advertises the correct release version.
In `@subworkflows/local/create_input_channel/main.nf`:
- Around line 103-104: The code collects fixedMods via rows.collect and then
unconditionally sets meta.fixedmodifications to the first unique value, which
silently ignores conflicting values; change this to mirror the enzyme
validation: after computing fixedMods (the unique non-empty list), if
fixedMods.size() == 1 set meta.fixedmodifications to that value, if
fixedMods.size() == 0 set it to null, and if fixedMods.size() > 1 throw an error
(or pipeline exit) with a clear message referencing the affected file/rows;
update the logic around the fixedMods variable and meta.fixedmodifications to
perform this consistency check instead of taking fixedMods[0].
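The consistency check requested for `fixedMods` can be sketched in Python as follows. This is an illustration under stated assumptions, not the subworkflow's Groovy code: the function name `resolve_fixed_mods` and the row key `fixedmodifications` are hypothetical stand-ins for however the rows are actually represented.

```python
def resolve_fixed_mods(rows, key="fixedmodifications"):
    """Return the single fixed-modification value shared by all rows.

    Returns None when no row defines one and raises ValueError when
    rows disagree, instead of silently taking the first value.
    """
    values = sorted({(row.get(key) or "").strip() for row in rows} - {""})
    if len(values) > 1:
        raise ValueError(f"Conflicting fixed modifications: {values}")
    return values[0] if values else None


consistent = [{"fixedmodifications": "Carbamidomethyl (C)"},
              {"fixedmodifications": "Carbamidomethyl (C)"}]
print(resolve_fixed_mods(consistent))
print(resolve_fixed_mods([{"fixedmodifications": ""}]))
```

Raising on disagreement mirrors the enzyme validation the comment refers to: a conflict stops the run with a message rather than producing results under the wrong search settings.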
📒 Files selected for processing (37)
- AGENTS.md
- README.md
- conf/diann_versions/v2_1_0.config
- conf/diann_versions/v2_2_0.config
- conf/diann_versions/v2_3_2.config
- conf/modules/dia.config
- conf/pride_codon_slurm.config
- conf/tests/test_dda.config
- conf/tests/test_dia.config
- conf/tests/test_dia_2_2_0.config
- conf/tests/test_dia_dotd.config
- conf/tests/test_dia_parquet.config
- conf/tests/test_dia_quantums.config
- conf/tests/test_dia_skip_preanalysis.config
- conf/tests/test_full_dia.config
- conf/tests/test_latest_dia.config
- docs/parameters.md
- docs/usage.md
- modules/local/diann/assemble_empirical_library/main.nf
- modules/local/diann/assemble_empirical_library/meta.yml
- modules/local/diann/diann_msstats/main.nf
- modules/local/diann/final_quantification/main.nf
- modules/local/diann/final_quantification/meta.yml
- modules/local/diann/generate_cfg/main.nf
- modules/local/diann/individual_analysis/main.nf
- modules/local/diann/individual_analysis/meta.yml
- modules/local/diann/insilico_library_generation/main.nf
- modules/local/diann/insilico_library_generation/meta.yml
- modules/local/diann/preliminary_analysis/main.nf
- modules/local/diann/preliminary_analysis/meta.yml
- modules/local/pmultiqc/main.nf
- modules/local/samplesheet_check/main.nf
- modules/local/utils/mzml_statistics/main.nf
- nextflow.config
- nextflow_schema.json
- subworkflows/local/create_input_channel/main.nf
- workflows/dia.nf
✅ Files skipped from review due to trivial changes (7)
- modules/local/diann/assemble_empirical_library/meta.yml
- conf/diann_versions/v2_2_0.config
- modules/local/diann/preliminary_analysis/meta.yml
- modules/local/diann/individual_analysis/meta.yml
- modules/local/diann/generate_cfg/main.nf
- modules/local/diann/final_quantification/meta.yml
- conf/tests/test_dda.config
🚧 Files skipped from review as they are similar to previous changes (3)
- AGENTS.md
- conf/diann_versions/v2_3_2.config
- conf/tests/test_dia_skip_preanalysis.config
- is_dda:
    type: boolean
    description: Whether DDA mode is enabled (auto-detected from SDRF or set via --diann_dda)
Update flag name in is_dda metadata description.
Line 24 mentions --diann_dda, but the module command path uses --dda; this description should match runtime behavior.
📝 Suggested fix
- description: Whether DDA mode is enabled (auto-detected from SDRF or set via --diann_dda)
+ description: Whether DDA mode is enabled (auto-detected from SDRF or set via --dda)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@modules/local/diann/insilico_library_generation/meta.yml` around lines 22 -
24, The metadata description for the boolean flag is_dda in meta.yml mentions
the wrong CLI flag name (--diann_dda); update that description to reference the
actual runtime flag (--dda) so it matches the module command path and runtime
behavior, keeping the rest of the description intact.
"precursor_qvalue": {
    "type": "number",
    "description": "Precursor-level q-value filtering threshold for the DIA-NN main report. Maps to --qvalue.",
    "default": 0.01,
    "fa_icon": "fas fa-filter",
    "help_text": "Controls how strictly precursor identifications are filtered in the DIA-NN main report. For proteogenomics with variant databases, the standard 0.01 (1%) is recommended."
},
"matrix_qvalue": {
    "type": "number",
    "description": "Q-value threshold for DIA-NN output matrices (pr_matrix, pg_matrix, etc.). Maps to --matrix-qvalue.",
    "default": 0.01,
    "fa_icon": "fas fa-filter",
    "help_text": "Controls the global q-value filtering applied when generating the quantification matrices. Default matches DIA-NN's built-in default of 1%."
},
"matrix_spec_q": {
    "type": "number",
    "description": "Run-specific protein q-value filter for protein/gene matrices. Maps to --matrix-spec-q.",
    "default": 0.05,
    "fa_icon": "fas fa-filter",
    "help_text": "An additional run-specific protein-level FDR filter applied to the protein and gene matrices. Default matches DIA-NN's built-in default of 5%. For proteogenomics/variant detection, consider setting to 1.0 to retain variant proteins that lack sufficient unique peptides for protein-level confidence."
},
Constrain q-value parameters to the valid probability range.
These three thresholds are currently any number, so values like -1 or 1.5 will pass schema validation and reach DIA-NN. Please bound them to [0, 1].
Suggested schema patch
"precursor_qvalue": {
"type": "number",
+ "minimum": 0,
+ "maximum": 1,
"description": "Precursor-level q-value filtering threshold for the DIA-NN main report. Maps to --qvalue.",
"default": 0.01,
"fa_icon": "fas fa-filter",
"help_text": "Controls how strictly precursor identifications are filtered in the DIA-NN main report. For proteogenomics with variant databases, the standard 0.01 (1%) is recommended."
},
"matrix_qvalue": {
"type": "number",
+ "minimum": 0,
+ "maximum": 1,
"description": "Q-value threshold for DIA-NN output matrices (pr_matrix, pg_matrix, etc.). Maps to --matrix-qvalue.",
"default": 0.01,
"fa_icon": "fas fa-filter",
"help_text": "Controls the global q-value filtering applied when generating the quantification matrices. Default matches DIA-NN's built-in default of 1%."
},
"matrix_spec_q": {
"type": "number",
+ "minimum": 0,
+ "maximum": 1,
"description": "Run-specific protein q-value filter for protein/gene matrices. Maps to --matrix-spec-q.",
"default": 0.05,
"fa_icon": "fas fa-filter",
"help_text": "An additional run-specific protein-level FDR filter applied to the protein and gene matrices. Default matches DIA-NN's built-in default of 5%. For proteogenomics/variant detection, consider setting to 1.0 to retain variant proteins that lack sufficient unique peptides for protein-level confidence."
},
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@nextflow_schema.json` around lines 324 - 344, The q-value fields
precursor_qvalue, matrix_qvalue, and matrix_spec_q are currently unbounded
numbers; update each property's schema to constrain values to the valid
probability range by adding "minimum": 0 and "maximum": 1 (retaining "type":
"number") so values outside [0,1] (e.g., -1 or 1.5) will fail validation before
reaching DIA-NN.
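What the proposed `minimum`/`maximum` bounds would reject can be shown with a small stand-alone check. This is a hedged sketch that imitates the effect of the schema constraint in plain Python; it does not use the pipeline's schema validator, and the function name `qvalue_errors` is hypothetical.

```python
Q_VALUE_PARAMS = ("precursor_qvalue", "matrix_qvalue", "matrix_spec_q")


def qvalue_errors(params: dict) -> list:
    """Collect messages for q-value parameters outside the range [0, 1]."""
    errors = []
    for name in Q_VALUE_PARAMS:
        if name not in params:
            continue  # absent parameters fall back to their defaults
        value = params[name]
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            errors.append(f"{name}: expected a number, got {value!r}")
        elif not 0 <= value <= 1:
            errors.append(f"{name}: {value} is outside [0, 1]")
    return errors


print(qvalue_errors({"precursor_qvalue": 0.01, "matrix_spec_q": 1.0}))
print(qvalue_errors({"precursor_qvalue": -1, "matrix_qvalue": 1.5}))
```

Both bounds are inclusive, so the `matrix_spec_q` value of 1.0 suggested in the help text for variant detection remains valid.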
"dda": {
    "type": "boolean",
    "description": "Explicitly enable DDA mode. Normally auto-detected from the SDRF acquisition method column. Use only when SDRF lacks this column. Requires DIA-NN >= 2.3.2.",
    "fa_icon": "fas fa-flask",
    "default": false
},
"light_models": {
    "type": "boolean",
    "description": "Enable --light-models for 10x faster in-silico library generation (DIA-NN >= 2.0).",
    "fa_icon": "fas fa-bolt",
    "default": false
},
"export_quant": {
    "type": "boolean",
    "description": "Enable --export-quant for fragment-level parquet data export (DIA-NN >= 2.0).",
    "fa_icon": "fas fa-file-export",
    "default": false
},
"site_ms1_quant": {
    "type": "boolean",
    "description": "Enable --site-ms1-quant to use MS1 apex intensities for PTM site quantification (DIA-NN >= 2.0).",
    "fa_icon": "fas fa-crosshairs",
    "default": false
},
"enable_infin_dia": {
    "type": "boolean",
    "description": "Enable InfinDIA for ultra-large search spaces (DIA-NN >= 2.3.0). Experimental.",
    "fa_icon": "fas fa-infinity",
    "default": false
},
"pre_select": {
    "type": "integer",
    "description": "Set --pre-select N precursor limit for InfinDIA pre-search.",
    "fa_icon": "fas fa-filter"
},
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
printf '== DIA-NN feature flag call sites ==\n'
rg -n -C2 \
--iglob '*.nf' \
--iglob '*.config' \
--iglob '*.yml' \
'(\-\-(dda|light-models|export-quant|site-ms1-quant|infin-dia|pre-select))|(params\.(dda|light_models|export_quant|site_ms1_quant|enable_infin_dia|pre_select))|diann_version' .
printf '\n== CI workflow matrix ==\n'
fd -i 'extended_ci\.yml|merge_ci\.yml' .github/workflows -x sed -n '1,220p' {}
Repository: bigbio/quantmsdiann
Length of output: 32767
🏁 Script executed:
# Check if modules have conditional version-gating at the module level
rg -n "light_models|export_quant|site_ms1_quant|enable_infin_dia" \
modules/local/diann/*/main.nf \
-A 1 -B 1
# Verify merge_ci.yml includes DDA/2.3.2 tests
grep -A 50 "test-features-matrix:" .github/workflows/merge_ci.yml | head -60
Repository: bigbio/quantmsdiann
Length of output: 3378
Add CI matrix entries to test new DIA-NN features at their minimum supported versions.
The workflow gating in workflows/dia.nf prevents unsupported version+feature combinations from reaching the CLI, but merge_ci.yml lacks test coverage for light_models, enable_infin_dia/pre_select (2.3.0+), and DDA (2.3.2+). Extend the test-features-matrix in merge_ci.yml to include:
- test_light_models × {2.1.0, 2.2.0}
- test_infin_dia × 2.3.2 (new test profile)
- test_dda × 2.3.2 (from extended_ci.yml, add to merge path)
This ensures version guards are validated during the merge-to-master gate.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@nextflow_schema.json` around lines 479 - 513, The CI lacks test matrix
entries for new DIA-NN feature/version combos; update the test-features-matrix
in merge_ci.yml to add matrix entries for test_light_models with versions 2.1.0
and 2.2.0, add a new test_infin_dia profile pinned to 2.3.2, and include
test_dda pinned to 2.3.2 (matching the extended_ci.yml profile) so the workflow
gating in workflows/dia.nf is exercised for light_models,
enable_infin_dia/pre_select, and dda minimum supported versions.
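The kind of version guard these matrix entries would exercise can be sketched in shell. This is an illustration only, not the gating code in workflows/dia.nf: `version_ge` and `require_version` are hypothetical helpers, and the comparison assumes GNU `sort -V` is available. The minimum versions used in the usage note are the ones quoted in the schema descriptions above.

```shell
# version_ge A B: succeeds when version A >= version B.
# Assumption: GNU coreutils `sort -V` (version sort) is available.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical guard: fail with a clear message when a feature is requested
# on a DIA-NN version older than its stated minimum.
require_version() {
  local feature="$1" minimum="$2" actual="$3"
  if ! version_ge "$actual" "$minimum"; then
    echo "ERROR: $feature requires DIA-NN >= $minimum (got $actual)" >&2
    return 1
  fi
}
```

With these helpers, `require_version dda 2.3.2 1.8.1` fails with an error message, while `require_version enable_infin_dia 2.3.0 2.3.2` succeeds. A CI matrix that pins each test profile to its minimum version checks exactly this boundary.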
Replace duplicated blocked-flags logic (10+ lines x 5 modules) with
a single centralized registry in lib/BlockedFlags.groovy. Each module
now uses one line: `args = BlockedFlags.strip('MODULE_NAME', args, log)`
Fixes from PR #36 review:
- Add --no-prot-inf to ASSEMBLE, INDIVIDUAL, FINAL blocked lists
- Add --channel-run-norm, --channel-spec-norm to FINAL blocked list
- Add --var-mod, --fixed-mod, --channels to INSILICO (via COMMON)
- Add --relaxed-prot-inf, --pg-level to ASSEMBLE blocked list
- Add version guard for channel normalization flags (require >= 2.0)
- Add warning when --normalize false conflicts with channel norm flags
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
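To make the idea of a centralized strip step concrete, here is a minimal shell sketch. It is not the Groovy implementation in lib/BlockedFlags.groovy, whose internals are not shown in this conversation: `strip_blocked` is a hypothetical function, the blocked list is passed in by the caller, and it assumes flags start with `--` and that values contain no quoted spaces.

```shell
# Hypothetical sketch: remove blocked flags (and their values) from an argument string.
# $1: space-separated list of blocked flags; $2: user-supplied extra arguments.
strip_blocked() {
  local blocked=" $1 "
  local -a toks out=()
  local tok skip=0
  # read -ra splits on whitespace without glob expansion
  read -ra toks <<< "$2"
  for tok in "${toks[@]}"; do
    if [[ "$tok" == --* ]]; then
      if [[ "$blocked" == *" $tok "* ]]; then
        echo "WARNING: removing blocked flag $tok" >&2
        skip=1          # also drop the values that follow this flag
        continue
      fi
      skip=0            # an allowed flag ends the skipped region
    elif (( skip )); then
      continue          # value belonging to a blocked flag
    fi
    out+=("$tok")
  done
  echo "${out[*]}"
}
```

Called as `strip_blocked "--dda --threads" "--threads 8 --min-pep-len 7 --dda"`, the sketch prints `--min-pep-len 7` and warns about the two removed flags on stderr. Keeping the list in one place means a new blocked flag is a one-line change rather than an edit to every module.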
Add comments explaining:
- Why blocked flags exist (prevent silent DIA-NN flag conflicts)
- Why they live in a Groovy class, not config files (safety: can't be overridden by user configs)
- How to add new blocked flags (edit one file, no module changes)
- In each module: point developers to lib/BlockedFlags.groovy

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ASSEMBLE_EMPIRICAL_LIBRARY doesn't set --no-prot-inf, --relaxed-prot-inf, or --pg-level in its command, so blocking them would strip user values without providing a replacement. Users should be able to pass these to the assembly step if needed. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Each blocked flag now has a comment explaining WHY it's blocked:
- "Pipeline-managed": the pipeline sets it from params/SDRF/metadata
- "No-effect guard": the flag has no effect in this step but is blocked to prevent users from wrongly believing it does

This prevents future contributors (human or AI) from removing flags without understanding the intent behind the block.

Reverts the accidental removal of protein inference flags from ASSEMBLE_EMPIRICAL_LIBRARY: they are intentionally blocked as a no-effect guard since --gen-spec-lib produces a spectral library, not a quantified report.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
refactor: centralize blocked flags + fix missing guards from PR #36 review
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
chore: bump version to 2.0.0 (Rome) and update CHANGELOG
Minor update
- New v2_5_0.config version profile (ghcr.io/bigbio/diann:2.5.0)
- Add diann_v2_5_0 profile to nextflow.config
- Block --parent flag in BlockedFlags.groovy COMMON list (container-managed, overriding breaks model discovery)
- Document fine-tuning workflow in docs/usage.md:
  - How to generate tuning libraries
  - How to fine-tune RT/IM/fragment models
  - How to use fine-tuned models via --extra_args
- Update version tables in docs/usage.md and AGENTS.md

New DIA-NN 2.5.0 CLI flags (passable via --extra_args):
- --tokens, --rt-model, --fr-model, --im-model (model selection)
- --aa-eq (amino acid equivalence for reannotation)
- --tune-lib, --tune-rt, --tune-im, --tune-fr (fine-tuning)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Document all new DIA-NN 2.5.0 flags available via --extra_args: model selection (--tokens, --rt-model, --fr-model, --im-model), fine-tuning (--tune-lib, --tune-rt, --tune-im, --tune-fr, etc.), and --aa-eq for amino acid equivalence. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add aa_eq param (default: false) that maps to DIA-NN's --aa-eq flag. When enabled, I&L, Q&E, N&D are treated as equivalent amino acids during reannotation, which is essential for entrapment FDR benchmarks.
- Added to nextflow.config, nextflow_schema.json, docs/parameters.md
- Passed to all 5 DIA-NN modules
- Added to BlockedFlags.groovy COMMON list (pipeline-managed)
- Moved from "via extra_args" to proper pipeline parameter in docs

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… for Vadim

Rewrite the fine-tuning documentation to:
- Explain what tokens/dict.txt are (neural network encoding of modifications)
- Clarify that --tune-lib cannot be combined with --gen-spec-lib (separate DIA-NN invocations, confirmed in DIA-NN #1499)
- Document the full two-run workflow with concrete commands
- Propose the future integrated FINE_TUNE_MODELS step with a question for @vdemichev about the right integration point

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
feat: add DIA-NN 2.5.0 support with model fine-tuning documentation
PR checklist
- `nf-core pipelines lint`
- `nextflow run . -profile test,docker --outdir <OUTDIR>`
- `nextflow run . -profile debug,test,docker --outdir <OUTDIR>`
- `docs/usage.md` is updated.
- `docs/output.md` is updated.
- `CHANGELOG.md` is updated.
- `README.md` is updated (including new tool citations and authors/contributors).

Summary by CodeRabbit
Documentation
New Features
Bug Fixes
Chores