Implement skani as fastani alternative #30
implement fastani clusterer
wwood left a comment
Yeh I think that's roughly the idea
num_kmers: 1000,
kmer_length: 21,
},
&crate::skani::SkaniClusterer { threshold: 99.0 },
Higher than for fastani. Skani gives ANI of >98% for all pairs measured.
A few steps closer. Still need to add this to the CLI.
Are you thinking per disconnected component after the preclusterer? Or just in total? If the latter then no point in preclustering, I think.
Not sure. Both are possibilities, though skani doesn't recommend comparing genomes with <82% ANI, so we would have to deal with that if we skip preclustering, right? Though it says "If the resulting aligned fraction for the two genomes is < 15%, no output is given.", so maybe <82% just doesn't give an answer, rather than giving an unreliable answer.
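To make the "per disconnected component" option concrete, here is a minimal sketch. It assumes the preclusterer yields pairs of genome indices that passed its ANI cutoff; the helper name `connected_components` and the input shape are hypothetical and not taken from this PR. The idea is that skani would then only be run on pairs inside one component, so distant pairs (the <82% ANI case) are never compared.

```rust
use std::collections::BTreeMap;

/// Find the root of `x`, compressing the path on the way (iterative).
fn find(parent: &mut Vec<usize>, x: usize) -> usize {
    let mut root = x;
    while parent[root] != root {
        root = parent[root];
    }
    // Second pass: point every node on the path directly at the root.
    let mut cur = x;
    while parent[cur] != root {
        let next = parent[cur];
        parent[cur] = root;
        cur = next;
    }
    root
}

/// Hypothetical helper: group genome indices `0..n` into connected
/// components, given the pairs the preclusterer considered close enough.
fn connected_components(n: usize, pairs: &[(usize, usize)]) -> Vec<Vec<usize>> {
    let mut parent: Vec<usize> = (0..n).collect();
    for &(a, b) in pairs {
        let ra = find(&mut parent, a);
        let rb = find(&mut parent, b);
        if ra != rb {
            // Keep the smaller index as the root so output order is stable.
            let (lo, hi) = if ra < rb { (ra, rb) } else { (rb, ra) };
            parent[hi] = lo;
        }
    }
    // Collect members per root; BTreeMap keeps components sorted by root.
    let mut groups: BTreeMap<usize, Vec<usize>> = BTreeMap::new();
    for i in 0..n {
        let r = find(&mut parent, i);
        groups.entry(r).or_default().push(i);
    }
    groups.into_values().collect()
}
```

Clustering "in total" would instead compare every pair of genomes, which is where the question about skani's behaviour below 82% ANI comes in.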
options: fastani, skani
fastani_min_aligned_threshold,
fastani_fraglen,
),
Preclusterer::Dashing { min_ani, threads } => match self.clusterer {
This is getting a bit unwieldy. I tried a let preclusterer = match, but it doesn't work since the match arms are different types. Should the Preclusterer/Clusterer enums be defined using the underlying structs instead of dummy ones?
So the issue is that here we have to define the behaviour for every combination of clusterer and preclusterer, right?
I think the answer is yes. Either an enum or dyn would work, up to you.
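As a sketch of the dyn option: if both stages sit behind traits, each CLI choice is matched once and the combinations no longer need a nested match. Everything here is an assumption made for illustration: the trait names `PreclusterStep` and `ClusterStep`, the `describe` method, the choice enums, and the stub structs are hypothetical stand-ins, not the types in this PR.

```rust
/// Hypothetical trait for the preclustering stage.
trait PreclusterStep {
    fn describe(&self) -> String;
}

/// Hypothetical trait for the final clustering stage.
trait ClusterStep {
    fn describe(&self) -> String;
}

// Stub structs standing in for the real implementations.
struct DashingPreclusterer { min_ani: f32 }
struct FinchPreclusterer { min_ani: f32 }
struct FastaniClusterer { threshold: f32 }
struct SkaniClusterer { threshold: f32 }

impl PreclusterStep for DashingPreclusterer {
    fn describe(&self) -> String { format!("dashing(min_ani={:.1})", self.min_ani) }
}
impl PreclusterStep for FinchPreclusterer {
    fn describe(&self) -> String { format!("finch(min_ani={:.1})", self.min_ani) }
}
impl ClusterStep for FastaniClusterer {
    fn describe(&self) -> String { format!("fastani(threshold={:.1})", self.threshold) }
}
impl ClusterStep for SkaniClusterer {
    fn describe(&self) -> String { format!("skani(threshold={:.1})", self.threshold) }
}

/// Hypothetical CLI-level choices.
enum PreclustererChoice { Dashing, Finch }
enum ClustererChoice { Fastani, Skani }

/// One match per stage: the arms share the type `Box<dyn PreclusterStep>`.
fn build_preclusterer(choice: PreclustererChoice, min_ani: f32) -> Box<dyn PreclusterStep> {
    match choice {
        PreclustererChoice::Dashing => Box::new(DashingPreclusterer { min_ani }),
        PreclustererChoice::Finch => Box::new(FinchPreclusterer { min_ani }),
    }
}

fn build_clusterer(choice: ClustererChoice, threshold: f32) -> Box<dyn ClusterStep> {
    match choice {
        ClustererChoice::Fastani => Box::new(FastaniClusterer { threshold }),
        ClustererChoice::Skani => Box::new(SkaniClusterer { threshold }),
    }
}

/// Downstream code takes trait objects and never sees the concrete pairing.
fn describe_pipeline(pre: &dyn PreclusterStep, cl: &dyn ClusterStep) -> String {
    format!("{} -> {}", pre.describe(), cl.describe())
}
```

With n preclusterers and m clusterers this needs n + m match arms instead of n × m. An enum that wraps the underlying structs would give the same effect with static dispatch, at the cost of forwarding each method through a match.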
Also, I get this warning on compile:
I didn't go through every line, but it seems about right. I think you need to add skani to the conda yml, and can you enable runs on PR using on: [push, pull_request] in the actions yml please?
You added that argument, so it won't show up until the docs are redeployed from main/release.
Add skani as fastani alternative
new method?)
find_representatives and find_memberships back into clusterer.rs
Clusterer through above functions so it needs only implement calculate_ani
get_threshold method?
calculate_skani
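One way to read the checklist above is a trait whose implementors supply only `calculate_ani` and `get_threshold`, with `find_representatives` and `find_memberships` as provided methods. The sketch below is under stated assumptions: the signatures, the `Option<f32>` return (modelling skani giving no output for some pairs), the simple greedy pass, and the `ToyClusterer` lookup table are all hypothetical and stand in for whatever the real functions do.

```rust
use std::collections::HashMap;

/// Hypothetical trait shape: only the first two methods must be implemented.
trait Clusterer {
    /// ANI between two genomes, or None if the tool reports nothing.
    fn calculate_ani(&self, a: &str, b: &str) -> Option<f32>;
    fn get_threshold(&self) -> f32;

    /// Provided: greedy pass, a genome becomes a representative unless an
    /// earlier representative already covers it at or above the threshold.
    fn find_representatives(&self, genomes: &[&str]) -> Vec<usize> {
        let mut reps: Vec<usize> = Vec::new();
        for (i, &g) in genomes.iter().enumerate() {
            let covered = reps.iter().any(|&r| {
                matches!(self.calculate_ani(genomes[r], g),
                         Some(ani) if ani >= self.get_threshold())
            });
            if !covered {
                reps.push(i);
            }
        }
        reps
    }

    /// Provided: assign each genome to its best representative by ANI.
    /// Returns one member list per representative, in `reps` order.
    fn find_memberships(&self, genomes: &[&str], reps: &[usize]) -> Vec<Vec<usize>> {
        let mut clusters: Vec<Vec<usize>> = vec![Vec::new(); reps.len()];
        for (i, &g) in genomes.iter().enumerate() {
            let mut best: Option<(usize, f32)> = None;
            for (slot, &r) in reps.iter().enumerate() {
                // A representative is treated as identical to itself.
                let ani = if r == i { Some(100.0) } else { self.calculate_ani(genomes[r], g) };
                if let Some(ani) = ani {
                    if ani >= self.get_threshold() && best.map_or(true, |(_, b)| ani > b) {
                        best = Some((slot, ani));
                    }
                }
            }
            if let Some((slot, _)) = best {
                clusters[slot].push(i);
            }
        }
        clusters
    }
}

/// Toy implementor backed by a lookup table instead of a real ANI tool.
struct ToyClusterer {
    anis: HashMap<(String, String), f32>,
    threshold: f32,
}

impl Clusterer for ToyClusterer {
    fn calculate_ani(&self, a: &str, b: &str) -> Option<f32> {
        // Look the pair up in either order; a missing pair means "no output".
        self.anis
            .get(&(a.to_string(), b.to_string()))
            .or_else(|| self.anis.get(&(b.to_string(), a.to_string())))
            .copied()
    }
    fn get_threshold(&self) -> f32 {
        self.threshold
    }
}
```

Under this shape, a skani-backed implementor would only wrap its ANI call (the `calculate_skani` item) and return its threshold, and the shared clustering logic would live in one place.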