ASE for Synonymous + Non-synonymous Variants  #93

@JPFinnigan

The current implementation outputs VAR and REF read counts for non-synonymous variants only. It would be great, as a user, to have the option to output read-support counts for all variants. I've used Varlens to get around this limitation in Isovar, but that route has its own limitations, which I'll discuss below.

Per a conversation w/ Alex:

Hey John,

I looked a little bit and found that on line 67 of isovar.effect_prediction I'm doing the following:

nonsynonymous_coding_effects = effects.drop_silent_and_noncoding()

Do you want me to make this optional for the purposes of counting variant reads and assembling variant sequences?

If so, can you file an issue on the repo? https://github.com/openvax/isovar/issues
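A minimal sketch of what making that filter optional might look like. Only `drop_silent_and_noncoding()` comes from the conversation above; the `select_effects` helper, its `nonsynonymous_only` flag, and the `StubEffectCollection` class are hypothetical names used for illustration, and the stub's filtering rule is a placeholder rather than the library's actual logic.

```python
class StubEffectCollection:
    """Hypothetical stand-in for the effect collection used in isovar.effect_prediction."""

    def __init__(self, effects):
        self.effects = list(effects)

    def drop_silent_and_noncoding(self):
        # Placeholder rule for this sketch only: keep anything not labelled
        # as silent or as one of a few noncoding classes.
        excluded = {"Silent", "Intronic", "Intergenic", "3' UTR", "5' UTR"}
        return StubEffectCollection(e for e in self.effects if e not in excluded)


def select_effects(effects, nonsynonymous_only=True):
    """Apply the existing hard filter only when the caller asks for it.

    Defaulting to True keeps the current behaviour; passing False returns
    every effect so that read counts can be reported for all variants.
    """
    if nonsynonymous_only:
        return effects.drop_silent_and_noncoding()
    return effects
```

Keeping the default at `True` would leave existing output unchanged, while `select_effects(effects, nonsynonymous_only=False)` would pass every variant through to read counting.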

Eliminating the hard filter for non-synonymous variants affords the user a bit of added flexibility, but would necessitate additional descriptors for each variant to enable filtering to variant classes of interest. I think two additional columns, "Effect_Class" and "Effect" would solve the filtering problem and make working with the isovar output relatively easy.

I believe two columns may be required, largely because of my experience working with Varlens. The Varlens output has an "effect" column that describes the specific coding effect of a variant (e.g. "p.G12D"). However, I've found this difficult to work with in practice because, AFAIK, there is no easy way to distinguish non-synonymous SNVs ("p.G12D"), in-frame indels ("p.HDVPS811del"), and frameshifts ("p.A117fs") from the effect string alone. It may be better to keep the effect class ("Exon, non-synonymous") in a column separate from the descriptor of the specific effect ("p.G12D").
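To illustrate why a single effect string is awkward to work with, here is a rough sketch of the pattern matching a user currently has to write to recover the class. The regular expressions cover only the three example shapes quoted above (plus the silent case), and the `classify_effect` name and its class labels are hypothetical, not part of Varlens or Isovar.

```python
import re

# Patterns for the example shapes only; real output contains more forms.
SUBSTITUTION = re.compile(r"^p\.([A-Z])\d+([A-Z])$")   # e.g. p.G12D
INFRAME_DELETION = re.compile(r"^p\.[A-Z]+\d+del$")    # e.g. p.HDVPS811del
FRAMESHIFT = re.compile(r"^p\.[A-Z]\d+fs$")            # e.g. p.A117fs


def classify_effect(description):
    """Guess a coarse effect class from a protein-level effect string."""
    match = SUBSTITUTION.match(description)
    if match:
        ref_aa, alt_aa = match.groups()
        # Same residue on both sides means the amino acid did not change.
        return "Silent" if ref_aa == alt_aa else "Non-synonymous SNV"
    if INFRAME_DELETION.match(description):
        return "In-frame deletion"
    if FRAMESHIFT.match(description):
        return "Frameshift"
    return "Unrecognized"


for effect in ["p.G12D", "p.HDVPS811del", "p.A117fs"]:
    print(effect, "->", classify_effect(effect))
```

Anything outside these shapes falls through to "Unrecognized", which is the fragility a dedicated effect-class column would avoid.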

Ideally an effect class column would provide the same filtering as the current hard-coded isovar filters, or use the standard Ensembl classes.

  • 3' UTR
  • 5' UTR
  • exonic-splice-site
  • Incomplete
  • Intergenic
  • Intragenic
  • Intronic
  • intronic-splice-site
  • non-coding-transcript
  • Silent
  • splice-acceptor
  • Splice-donor
  • Stop-loss
  • Stop-gain
  • Exon, Non-synonymous

The specific use case I have in mind is counting the number of variants, the number of variants with RNA read support, and finally how the latter category breaks down by variant type (e.g. SNV, SNV w/ coding effect, indel, etc.).
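With an effect-class column in place, that summary could be computed along these lines. This is only a sketch: the record layout, the `effect_class` and `num_alt_reads` field names, and the example values are assumptions made for illustration, not the actual Isovar output schema.

```python
from collections import Counter


def summarize_read_support(variants, min_alt_reads=1):
    """Return (total variants, variants with RNA support, supported counts per class)."""
    supported = [v for v in variants if v["num_alt_reads"] >= min_alt_reads]
    by_class = Counter(v["effect_class"] for v in supported)
    return len(variants), len(supported), by_class


# Made-up example records, one dict per variant.
example_variants = [
    {"effect_class": "Exon, Non-synonymous", "num_alt_reads": 12},
    {"effect_class": "Silent", "num_alt_reads": 4},
    {"effect_class": "Intronic", "num_alt_reads": 0},
    {"effect_class": "Exon, Non-synonymous", "num_alt_reads": 0},
]

total, n_supported, by_class = summarize_read_support(example_variants)
print(total, n_supported, dict(by_class))
```

The `min_alt_reads` threshold is likewise an illustrative parameter; what counts as "read support" would be up to the user.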
