
feat: add msisensorpro/baseline module; clarify pro -d dual semantics #11190

Draft
nh13 wants to merge 2 commits into nf-core:master from nh13:msisensorpro-baseline


nh13 commented Apr 14, 2026

Draft for discussion. Closes #11188, overlaps with #6007 (and closed #6350).

What

  1. New module msisensorpro/baseline wrapping msisensor-pro baseline. Builds a trained baseline microsatellite file from a panel of normal samples, for use with msisensor-pro pro -d in tumor-only MSI calling. The configure file consumed by baseline -i is built internally from the staged *_all files so callers can feed a collected channel of msisensorpro/pro outputs directly (rather than hand-writing a configure file that references Nextflow work paths).
  2. msisensorpro/pro/meta.yml: expand the -d input description to document both supported modes (raw scan list with hard threshold via -i, vs trained baseline). No behavior change; pro's main.nf already accepts either file type and -i already flows through $args.
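
To make the input shape in item 1 concrete, here is a minimal Python sketch of how a configure file could be assembled from staged `*_all` files. It is not the module's actual script. The function name `write_configure` is hypothetical, and the two-column layout (sample name, tab, path to the `*_all` file) and the rule of deriving the sample name by stripping the `_all` suffix are assumptions that should be checked against the msisensor-pro documentation.

```python
from pathlib import Path


def write_configure(all_files, out_path):
    """Write one 'sample<TAB>absolute_path' line per staged *_all file.

    Assumption: `msisensor-pro baseline -i` reads a two-column,
    tab-separated file of sample name and path. Verify before relying on it.
    """
    lines = []
    for f in sorted(Path(p) for p in all_files):
        # Assumed naming rule: sample name is the file name minus "_all".
        name = f.name
        sample = name[: -len("_all")] if name.endswith("_all") else name
        lines.append(f"{sample}\t{f.resolve()}")
    if not lines:
        raise ValueError("no *_all files were staged")
    Path(out_path).write_text("\n".join(lines) + "\n")
    return lines


# Example call from a task's work directory (paths are illustrative):
#   write_configure(Path(".").glob("*_all"), "configure.txt")
```

Resolving each path inside the task means the configure file never has to be written by hand against Nextflow work directories, which is the motivation given above.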

Why

msisensor-pro pro -d accepts either a scan list or a baseline file, per ProUsage() in cpp/distribution.cpp L199-L206. The wiki's Best-Practices page documents only the baseline workflow for tumor-only calling, and the maintainer recommends it in xjtu-omics/msisensor-pro#77. Until now the baseline subcommand had no module, so pipelines couldn't produce a baseline within nf-core/modules and defaulted to the scan-list-with-hard-threshold fallback mode.

TODOs before ready for review

  • Verify the live test under -profile docker against real fixture data. The chr21 test fasta may not contain enough microsatellites for baseline to produce a usable output. If so, either locate richer fixtures or drop to stub-only testing and open a follow-up for fixtures.
  • nf-core modules test msisensorpro/baseline --profile {docker,singularity,conda}
  • Snapshot (tests/main.nf.test.snap) once live test passes
  • Migrate msisensorpro/pro's test to chain through the new baseline module so CI exercises mode 2 (the recommended mode)
  • Maintainer input on whether to rename pro's second input from list to something like microsat_file (would be a breaking doc-only change)

Happy to iterate on any of the above. I specifically wanted to surface the design of the baseline module's input shape (an internally constructed configure file) before doing more work.

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests! (live test present but unverified, see TODOs)
  • If you've added a new tool - have you followed the module conventions in the contribution docs?
  • Follow the naming conventions.
  • Follow the parameters requirements.
  • Follow the input/output options guidelines.
  • Add a resource label
  • Use BioConda and BioContainers if possible to fulfil software requirements.
  • Broadcast software version numbers to topic: versions
  • nf-core modules test <MODULE> --profile docker
  • nf-core modules test <MODULE> --profile singularity
  • nf-core modules test <MODULE> --profile conda
  • Remove all TODO statements. (none in module code; TODOs above are PR-level followups)

nh13 added 2 commits April 14, 2026 13:45
Wraps `msisensor-pro baseline`, which builds a trained baseline
microsatellite file from a panel of normal samples. The resulting
baseline is consumed by `msisensorpro/pro -d` to enable the per-site
trained thresholds recommended by the msisensor-pro maintainers for
tumor-only MSI calling (xjtu-omics/msisensor-pro#77,
https://github.com/xjtu-omics/msisensor-pro/wiki/Best-Practices).

The configure file required by `msisensor-pro baseline -i` is built
internally from the staged `*_all` files so pipelines can feed the
output of `msisensorpro/pro` on each normal directly into this module
as a collected channel, without hand-writing a configure file that
references Nextflow work paths.
…line

`msisensor-pro pro -d` documents two supported invocations
(cpp/distribution.cpp `ProUsage()` L199-L206): mode 1 with a raw scan
list and a hard threshold set via `-i`, and mode 2 with a trained
baseline. The module's input description only hinted at mode 1 by
calling the file a "micro-satellite list". Expand the description to
cover both modes, point to the baseline workflow recommended by the
maintainers, and mention that the hard threshold is configurable via
`ext.args`.

No behavior change.
Development

Successfully merging this pull request may close these issues.

[BUG] msisensorpro/pro silently runs in hard-threshold fallback mode; baseline module missing

1 participant