Skip to content

Modkit CpG pileup problems with modkit v0.6.0 compared to v0.5.0 #545

@laulambr

Description

@laulambr

Hi,

I’m running into issues when generating cpg pileups using the latest modkit v0.6.0 release. Running the same BAM file through v0.5.0 produces expected results, but I’m unable to replicate these using v0.6.0. I am not sure if I missed something or made a mistake but I would gladly get some feedback

1. Expected behavior using modkit v0.5.0

Using v0.5.0, the following command (with either --motif CG 0 or --cpg) generates a combined-strand bedmethyl output as expected:

Command with modkit v0.5.0 (switching --motif CG 0 with --cpg outputs same results as expected), generates a bedmethyl file where info on CpG sites is combined.

modkit pileup $bam $modbam_output/reads_WT-cpg.modkit050.bedmethyl\
    --ref ${ref} \
    --header \
    --region ${region} \
    --threads ${threads} \
    --log-filepath $modbam_output/log_reads_WT_modkitv050-cpg.txt \
    --motif CG 0 \
    --filter-threshold 0.9 \
    --combine-strands 

2. modkit v0.6.0 with --motif CG 0
With v0.6.0, running the same command produces a bedmethyl file but strand information is not combined, even when --combine-strands is supplied:

  $modkit pileup $bam $modbam_output/reads_WT-cpg.modkitv060.bedmethyl \
    --ref ${ref} \
    --header \
    --region ${region} \
    --threads ${threads} \
    --log-filepath $modbam_output/log_reads_WT_modkitv060-cpg.txt \
    --filter-threshold 0.9 \
    --motif CG 0 \
    --combine-strands

3. modkit v0.6.0 with --cpg
Alternatively, I tried the --cpg flag instead of the motif, but this requires specifying --modified-bases, so I used --modified-bases 5mC as described (https://nanoporetech.github.io/modkit/intro_pileup.html#narrowing-output-to-cpg-dinucleotides).
However, this produces an empty file and the following log messages:

$modkit pileup $bam $modbam_output/reads_WT-cpg.modkitv060-cpg.bedmethyl \
    --ref ${ref} \
    --header \
    --region ${region} \
    --threads ${threads} \
    --log-filepath $modbam_output/log_reads_WT_modkitv060-cpg.txt \
    --filter-threshold 0.9 \
    --cpg \
    --modified-bases 5mC

Following error output:

[modkit-core/src/pileup/subcommand.rs::743][2025-12-02 18:29:46][INFO] discarded 0 contigs with zero aligned reads
[modkit-core/src/pileup/subcommand.rs::518][2025-12-02 18:29:46][INFO] parsed 1 base modification(s). Base modifictions other than 'C:m' will be counted as 'N_other'.
[modkit-core/src/pileup/subcommand.rs::868][2025-12-02 18:29:46][DEBUG] using standard file writer
[modkit-core/src/pileup/subcommand.rs::1106][2025-12-02 18:29:46][INFO] using optimized workers for A,C all-context
[modkit-core/src/interval_chunks.rs::390][2025-12-02 18:29:46][DEBUG] there is a single contig to work on (in 1 parts)
[modkit-core/src/pileup/subcommand.rs::1366][2025-12-02 18:29:46][ERROR] ~7 failed processing
[modkit-core/src/pileup/subcommand.rs::1373][2025-12-02 18:29:46][INFO] Done, processed 0 rows.

I have added the output bedmethyl files from the above commands to this message.

reads_WT-cpg.modkitv060.bedmethyl.txt
reads_WT-cpg.modkitv060-cpg.bedmethyl.txt
reads_WT-cpg.modkit050.bedmethyl.txt

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't workingenhancementNew feature or request

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions