Description
I have noticed a possible bug with allele dosages on Cassavabase.
I am looking at data in the:
Genotyping Protocol: IITA DArT-GBS 08 Aug 2021
Allele dosages in downloaded data appear to count the REF allele.
According to the header of the VCF:
##FORMAT=<ID=DS,Number=A,Type=Float,Description="estimated ALT dose [P(RA) + P(AA)]">
However, the DS field in the VCF that I downloaded using the Wizard, as well as the corresponding tab-separated dosage file, both appear to count the REF allele. Can someone confirm this? I could be crazy :) Or it could be only this dataset.
I think the DS field is calculated by Cassavabase, is that correct? I guess so, because the source VCF for the genotyping protocol (which I provided) does not have the DS field. I very much appreciate that the DB computes dosages and adds them to the VCF! It just needs a tweak so that DS counts the ALT allele, as the header says.
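For anyone who wants to reproduce the check, here is a minimal sketch of how DS could be compared against the genotype calls. It assumes diploid, biallelic records with both GT and DS present in the FORMAT column. The function names (`alt_count`, `ds_deviation`) are my own and are not part of Cassavabase or any VCF library. If DS counts ALT, the first number returned should be near zero. If DS counts REF, the second one should be.

```python
def alt_count(gt):
    """Count ALT ('1') alleles in a biallelic GT string such as '0|1' or '1/1'.

    Returns None if any allele is missing ('.').
    """
    alleles = gt.replace("|", "/").split("/")
    if "." in alleles:
        return None
    return sum(int(a) for a in alleles)


def ds_deviation(vcf_lines):
    """Mean absolute deviation of DS from the ALT count and from the REF count.

    Assumes diploid, biallelic sites (REF count = 2 - ALT count).
    Returns (alt_dev, ref_dev), or None if no usable genotype was found.
    """
    alt_dev = ref_dev = 0.0
    n = 0
    for line in vcf_lines:
        if line.startswith("#"):
            continue  # skip meta-information and header lines
        fields = line.rstrip("\n").split("\t")
        keys = fields[8].split(":")  # FORMAT column
        if "GT" not in keys or "DS" not in keys:
            continue
        gt_i, ds_i = keys.index("GT"), keys.index("DS")
        for sample in fields[9:]:
            vals = sample.split(":")
            if len(vals) <= max(gt_i, ds_i):
                continue  # trailing fields dropped for this sample
            alt = alt_count(vals[gt_i])
            if alt is None or vals[ds_i] == ".":
                continue
            ds = float(vals[ds_i])
            alt_dev += abs(ds - alt)
            ref_dev += abs(ds - (2 - alt))
            n += 1
    if n == 0:
        return None
    return alt_dev / n, ref_dev / n
```

Usage would be something like `ds_deviation(open("download.vcf"))`, where the file name is a placeholder for the Wizard download.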
My use case / why this matters:
I am doing predictions that use the phased haplotypes extracted from the VCF file. I need the genomic-predicted marker-effects (computed using the dosages as predictors in an RR-BLUP model) to measure the effect of the "1" allele in the VCF (the ALT allele). To ensure this, in the past, I have used the haplotypes extracted from the VCF to manually compute my own dosages (summing the two haplotypes for each individual).
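The manual workaround described above can be sketched roughly as follows. This is only an illustration, assuming phased, diploid, biallelic GT strings like `0|1`. The helper names (`haplotypes_to_dosage`, `dosage_row`) are hypothetical and do not reflect how Cassavabase computes DS.

```python
def haplotypes_to_dosage(gt):
    """ALT dosage for one phased diploid genotype: sum of the two haplotype alleles.

    '0|0' -> 0, '0|1' or '1|0' -> 1, '1|1' -> 2; returns None if either allele is missing.
    """
    hap1, hap2 = gt.split("|")
    if hap1 == "." or hap2 == ".":
        return None
    return int(hap1) + int(hap2)


def dosage_row(gts):
    """ALT dosages for all samples at one marker, given their phased GT strings."""
    return [haplotypes_to_dosage(gt) for gt in gts]
```

Because the dosage is built directly from the "1" alleles on each haplotype, marker effects estimated from it refer to the ALT allele by construction.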
Related to: #3853
Hope this makes sense, happy to clarify! Thanks!