False positive het candidates with shallow matched normal #14

@jalberge

After manually reviewing odd SNPs in het_coverage.normal.tsv (flagged because they sit in suspiciously small regions of CNLoH), I found that the suspicious sites share the characteristics below and turn out to be false-positive heterozygous SNPs (they should be HOM ALT):

  1. (hg38) chr2:189728585: total coverage 8; ALT has 7 reads, REF has 1 (QV11). A QV11 base should not pass the filter, so this site should be 100% VAF ALT (HOM ALT). This is a case of a false-positive REF base.
  2. (hg38) chr2:189731263: total coverage 7; ALT has 6 reads, REF has 1 (QV11). Same problem: the QV11 base should not pass the filter, and this site should be 100% VAF ALT (HOM ALT); see Fig. 1.
  3. (hg38) chr2:189734985: 2 reads (missed PCR) are chimeric at pseudo-palindromic regions. This is more specific to the enzymatic shearing protocol we use, and I need to think more about how to filter these, maybe during BAM alignment.
  4. (hg38) chr4:39565930: 1 read is MAPQ0 (Fig. 3). I think sites with MAPQ0 reads should be treated as suspicious and not used for genotyping.
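
For context on why a single QV11 base is weak evidence for a REF allele: assuming QV here is a standard Phred-scaled base quality, Q = -10 * log10(p), the implied per-base error probability can be computed as in this small sketch (the function name is only illustrative):

```python
def phred_to_error_prob(q: float) -> float:
    """Convert a Phred-scaled quality Q to an error probability p = 10^(-Q/10)."""
    return 10 ** (-q / 10)


# QV11 corresponds to roughly an 8% chance that the base call is wrong,
# versus 1% at Q20 and 0.1% at Q30.
for q in (11, 20, 30):
    print(q, round(phred_to_error_prob(q), 4))
```

With only 7 or 8 reads at a site, one base at that error level is enough to turn a HOM ALT site into an apparent het.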

Solutions:

  • Do not use MAPQ0 reads for genotyping (branch filter_mapq0)
  • Rely only on high-quality bases for genotyping
  • Sequence deeper whenever possible
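
A minimal sketch of what the first two filters could look like, applied to site 1 above. This is not the code in the filter_mapq0 branch; the `Obs` record, the function names, the thresholds (`min_baseq=20`, `min_mapq=1`), the A/G alleles, and the Q30/MAPQ60 values for the ALT reads are all assumptions made for illustration. Only the 7 ALT + 1 REF (QV11) counts come from the issue.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Obs:
    """One read's observation at a candidate site (hypothetical record type)."""
    base: str   # base the read shows at the site
    baseq: int  # Phred-scaled base quality
    mapq: int   # mapping quality of the read


def allele_counts(obs: Iterable[Obs], ref: str, alt: str,
                  min_baseq: int = 20, min_mapq: int = 1) -> Tuple[int, int]:
    """Count REF/ALT reads, skipping low-quality bases and low-MAPQ reads.

    The default thresholds are assumptions, not the pipeline's settings;
    min_mapq=1 drops only MAPQ0 reads.
    """
    n_ref = n_alt = 0
    for o in obs:
        if o.mapq < min_mapq or o.baseq < min_baseq:
            continue  # not trustworthy enough to use for genotyping
        if o.base == ref:
            n_ref += 1
        elif o.base == alt:
            n_alt += 1
    return n_ref, n_alt


def alt_vaf(n_ref: int, n_alt: int) -> float:
    """ALT allele fraction; NaN when no usable reads remain."""
    total = n_ref + n_alt
    return n_alt / total if total else float("nan")


# Site 1 as described: 7 ALT reads plus 1 REF read with QV11.
# Alleles and the ALT reads' qualities are placeholders.
site1 = [Obs("G", 30, 60)] * 7 + [Obs("A", 11, 60)]

unfiltered = allele_counts(site1, ref="A", alt="G", min_baseq=0, min_mapq=0)
filtered = allele_counts(site1, ref="A", alt="G")

print(unfiltered, alt_vaf(*unfiltered))  # (1, 7) 0.875
print(filtered, alt_vaf(*filtered))      # (0, 7) 1.0
```

With the low-quality REF base excluded, the site has no REF support left and would no longer be a het candidate.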

Fig. 1 (image)

Fig. 2 (image)

Fig. 3 (image)
