-A StrandBiasBySample: outputs, for each sample, the number of reads supporting each allele (ref and alt) on each strand orientation.
Strand bias is a type of sequencing bias in which one DNA strand is
favored over the other, which can result in incorrect evaluation of the
amount of evidence observed for one allele vs. the other. The
StrandBiasBySample annotation produces read counts per allele and per
strand that are used by other annotation modules (FisherStrand and
StrandOddsRatio) to estimate strand bias using statistical approaches.
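To make the statistical step concrete, here is a minimal sketch of a two-sided Fisher's exact test on the 2x2 table of per-allele, per-strand read counts, with the p-value also expressed on the Phred scale. The function names are hypothetical, and this is not GATK's code: the FisherStrand module may preprocess the counts differently, so the numbers need not match the FS value in a real VCF.

```python
from math import comb, log10


def fisher_exact_two_sided(ref_fwd, ref_rev, alt_fwd, alt_rev):
    """Two-sided Fisher's exact test p-value for a 2x2 strand table.

    Rows are alleles (ref, alt); columns are strands (forward, reverse).
    """
    ref_total = ref_fwd + ref_rev
    alt_total = alt_fwd + alt_rev
    fwd_total = ref_fwd + alt_fwd
    denom = comb(ref_total + alt_total, fwd_total)

    def prob(x):
        # Hypergeometric probability of x ref reads on the forward strand,
        # with all row and column totals held fixed.
        return comb(ref_total, x) * comb(alt_total, fwd_total - x) / denom

    lo = max(0, fwd_total - alt_total)
    hi = min(ref_total, fwd_total)
    p_obs = prob(ref_fwd)
    # Sum the probabilities of all tables at least as unlikely as the observed one.
    # The small relative tolerance guards against floating-point ties.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, p)


def phred_scale(p, floor=1e-320):
    """Convert a p-value to the Phred scale (-10 * log10(p))."""
    return -10.0 * log10(max(p, floor))


balanced = fisher_exact_two_sided(10, 10, 10, 10)  # no strand bias
skewed = fisher_exact_two_sided(20, 0, 0, 20)      # each allele seen on one strand only
print(balanced, skewed, phred_scale(skewed))
```

A higher Phred-scaled value means stronger evidence that the two alleles are distributed differently across strands.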
This annotation produces four values, corresponding to the number of reads that support the following (in that order):
- the reference allele on the forward strand
- the reference allele on the reverse strand
- the alternate allele on the forward strand
- the alternate allele on the reverse strand
Example
GT:AD:GQ:PL:SB 0/1:53,51:99:1758,0,1835:23,30,33,18
In this example, the reference allele is supported by 23 forward reads and 30 reverse reads, and the alternate allele is supported by 33 forward reads and 18 reverse reads.
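The same reading of the SB field can be done programmatically. The sketch below splits the FORMAT and sample columns from the example above and names the four counts. The function name and dictionary keys are made up for illustration, and the code assumes a biallelic site with an SB field present.

```python
def parse_strand_counts(format_field, sample_field):
    """Return the four SB counts from one sample column of a VCF record."""
    keys = format_field.split(":")
    values = sample_field.split(":")
    record = dict(zip(keys, values))
    # SB order: ref forward, ref reverse, alt forward, alt reverse.
    ref_fwd, ref_rev, alt_fwd, alt_rev = (int(x) for x in record["SB"].split(","))
    return {
        "ref_fwd": ref_fwd,
        "ref_rev": ref_rev,
        "alt_fwd": alt_fwd,
        "alt_rev": alt_rev,
    }


counts = parse_strand_counts("GT:AD:GQ:PL:SB", "0/1:53,51:99:1758,0,1835:23,30,33,18")
print(counts)
```

In this record the per-strand counts add up to the AD values: 23 + 30 = 53 reads for the reference allele and 33 + 18 = 51 for the alternate allele.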
Command line example
java \
-Xmx${MEM} \
-Djava.io.tmpdir=${JAVA_TMPDIR} \
-jar ${GATK} \
-T GenotypeGVCFs \
-R ${REF_SEQ} \
-A Coverage \
-A FisherStrand \
-A StrandBiasBySample \
-D ${SNP_DBSNP} \
-o ${SMPL_NAME}.vcf \
-nt ${PROCS} \
-V samples.vcf.list
Notes on annotations in the VCF INFO column:
MLEAC: maximum likelihood expectation of allele count
MLEAF: maximum likelihood expectation of allele frequency
##FORMAT=<ID=MLEAC,Number=A,Type=Integer,Description="Maximum likelihood expectation (MLE) for the allele counts (not necessarily the same as the AC), for each ALT allele, in the same order as listed, for this pool">
##FORMAT=<ID=MLEAF,Number=A,Type=Float,Description="Maximum likelihood expectation (MLE) for the allele frequency (not necessarily the same as the AF), for each ALT allele, in the same order as listed, for this pool">