Improving Bioinformatics Methods for Analysis of Virus-Associated Cancers

Over the past century, the association between viruses and cancer has remained a focal topic in cancer research because a large number of tumors have a viral origin. According to the World Health Organization, an estimated 5% of cancers are caused by indirect viral mechanisms (e.g., host immunosuppression), and as many as 15% of all cancers are directly caused by viruses known as “tumor viruses”. Mechanistically, tumor viruses express viral oncoproteins that are associated with tumor development; most also insert their own genome into the host genome, and some form new viral-host fusion genes. Unfortunately, the role of viral genomic integration in tumor development has been difficult to study because robust and generalizable viral analysis pipelines are lacking. Further, given the complexity of the novel viral-host fusion genes that are formed, functional prediction and annotation also remain a problem, again because robust informatics toolkits are lacking. Consequently, pipelines that facilitate the detection of viral integration sites in the host genome will likely be crucial both for understanding the genetics of virus-driven disease and for building genetics-based companion diagnostics in the future. Here we propose to address this gap in our field through the following aims: 1) To develop a pipeline that works across multiple next-generation sequencing (NGS) technologies and accurately detects virus-host integration and fusion sites in virus-associated cancers, and 2) To integrate and re-analyze public data related to virus-associated cancers in the context of viral-host integration and fusion events in order to facilitate functional annotation. Together, these aims will leverage public resources to improve the functional annotation of viral-host events.