REVUE: Repository for Variants with Unexpected Effects

Clinical sequencing of tumor samples has emerged as a component of routine cancer care. By identifying genomic alterations that contribute to tumor initiation or progression, clinical sequencing may be used to identify predictive biomarkers of drug response, refine patient cancer diagnoses, assess heritable cancer risk, or inform patient prognosis. Most genomic alterations are accurately annotated with tools such as the Variant Effect Predictor (VEP) [1], which infer their effects on the mRNA and protein by following basic rules of transcription, post-transcriptional mRNA processing, and translation. For example, most single-nucleotide changes in coding regions of the DNA lead to missense mutations, nonsense mutations, or frameshift insertions or deletions at the protein level, which may hyperactivate or inactivate the protein. However, some genomic variants are not as easily captured by these rules, which can cause inappropriate or unclear annotation of the protein effect. For example, a variant that alters an existing splice site or creates a new one can inactivate or activate the protein and, similarly, specific mutations in the non-coding promoter region of a gene may de-regulate gene expression. Collectively, we term these variants “variants with unexpected effects (VUE)”. While many of these are functionally characterized and documented in the literature, there is currently no centralized database that identifies, curates, and programmatically stores these events to enable their annotation during routine clinical cancer genomic sequencing. Certain VUEs may have therapeutic implications, and mis-annotating them may lead to suboptimal treatment decisions for individual patients with cancer. For example, in-frame KIT exon 11 deletions are recurrent in patients with gastrointestinal stromal tumors (GIST) and are biomarkers for the use of imatinib as first-line therapy.
About 2-3% of GISTs harbor KIT exon 11 deletions that extend into the non-coding intron between exons 10 and 11. Although these events have been shown to cause in-frame deletions, they are typically misclassified as inactivating splice site mutations, which can prevent patients carrying these mutations from receiving standard-of-care imatinib. To address this unmet clinical need, we are building a novel bioinformatic application, REVUE, a REpository for Variants with Unexpected Effects. The application will curate and store all VUE-relevant information, be freely accessible via an intuitive website and through an application programming interface (API), and accept feedback and submissions of new VUEs from the cancer genomics community. It will be an important resource for clinical bioinformatic pipelines to accurately annotate all DNA mutations, with important implications for the subset of patients with cancers harboring VUEs.

Final Report:

Clinical sequencing of tumor samples has emerged as a component of routine cancer care. By identifying genomic alterations that contribute to tumor initiation or progression, clinical cancer genomic sequencing may be used to identify predictive biomarkers of drug response, refine patient cancer diagnoses, assess heritable cancer risk, or inform patient prognosis. Most genomic alterations are accurately annotated with tools such as the Variant Effect Predictor (VEP), which infer variant effects on the mRNA and protein by following basic rules of transcription, post-transcriptional mRNA processing, and translation. However, the effects of a select subset of variants cannot be predicted as easily by these rules. While many of these “variants with unexpected effects (VUE)” are functionally characterized and documented in the literature, they are often mis-annotated during routine clinical cancer genomic sequencing. Importantly, certain VUEs may have therapeutic implications, and mis-annotating them may lead to suboptimal treatment decisions for individual patients with cancer. In our two-year project, we created reVUE, a centralized database resource available to the scientific community that curates and programmatically stores VUEs to enable the annotation of these variants during routine clinical cancer genomic sequencing.

Summary of Work

Our overall objective was to develop a database that keeps track of genomic variants in cancer that have unexpected consequences on the protein: a repository for Variants with Unexpected Effects (reVUE). Our specific goals were two-fold: first, to create the software and database infrastructure to store VUEs, and second, to curate VUEs via literature review, public data mining, and community suggestions. We successfully implemented the reVUE database (https://cancerrevue.org) and curated 109 VUEs, spanning 22 genes from 31 articles. This is the first time, to our knowledge, that sequencing data have been exhaustively mined to prospectively identify candidate VUEs in parallel with literature curation of these variants. Several curated VUEs were associated with clinical treatment implications, including VUEs in KIT, MET, ATM, EGFR, and BRCA1/2. The website homepage has a summary listing of all VUEs, and gene pages provide additional details about individual VUEs. Community crowdsourcing is also available through surveys accessible on the website. The reVUE database has also been integrated into the publicly available bioinformatic ecosystem of cancer variant annotation and interpretation tools that currently includes Genome Nexus, OncoKB, and cBioPortal. Users of these platforms can therefore view the correct protein effects of VUEs, with data and annotation provided by reVUE. Importantly, all the software and data are publicly available for incorporation into other genomics tools and clinical or research pipelines.
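To illustrate how a pipeline might incorporate a curated VUE list, the following is a minimal Python sketch, not the reVUE implementation. It assumes the curated records have already been loaded into an in-memory table keyed by genomic change. The names `Variant`, `VueRecord`, `VUE_TABLE`, and `annotate` are hypothetical, and the coordinates and alleles are placeholders rather than real genomic positions. The single entry mirrors the KIT exon 11 case described in the abstract, where a deletion that a rule-based annotator would call a splice site mutation is in fact an in-frame deletion.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Variant:
    """A genomic change; frozen so it can be used as a dictionary key."""
    chromosome: str
    position: int
    ref: str
    alt: str


@dataclass(frozen=True)
class VueRecord:
    """A curated variant with an unexpected effect (hypothetical schema)."""
    gene: str
    default_effect: str   # what a rule-based annotator would report
    revised_effect: str   # the curated, functionally supported effect


# Hypothetical curated table. The position and alleles are placeholders,
# NOT real coordinates; the deletion removes 6 bases (a multiple of 3).
VUE_TABLE: Dict[Variant, VueRecord] = {
    Variant("4", 1000, "ACGTACG", "A"): VueRecord(
        gene="KIT",
        default_effect="splice_site",
        revised_effect="in_frame_deletion",
    ),
}


def annotate(variant: Variant, default_effect: str,
             vue_table: Dict[Variant, VueRecord] = VUE_TABLE) -> str:
    """Return the curated effect if the variant is a known VUE,
    otherwise fall back to the rule-based annotation."""
    record = vue_table.get(variant)
    if record is None:
        return default_effect
    return record.revised_effect


if __name__ == "__main__":
    kit_deletion = Variant("4", 1000, "ACGTACG", "A")
    print(annotate(kit_deletion, "splice_site"))
    print(annotate(Variant("4", 2000, "A", "G"), "missense"))
```

Keying the table on the exact genomic change keeps the override narrow: only variants that were explicitly curated are re-annotated, and every other variant retains the annotation produced by the standard tool.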

The website, code, and data are available at: