Novel algorithms to account for uncertainties in the sequencing of genetic material with skewed abundance

Date

2022

Abstract

The sequencing of genetic material (microbial DNA or RNA) is essential in biological experiments. However, while the cost of sequencing has decreased substantially, the highly skewed distribution of genetic material makes it challenging to accurately represent the genetic content of a sample. For instance, in DNA-based metagenomic experiments, DNA fragments are randomly sampled and used to identify and quantify the organisms present in an environmental sample. Rare species are sampled less frequently, which complicates subsequent bioinformatic analyses. Given how prevalent the uneven distribution of genetic material is, and how drastically it affects bioinformatic analyses, our research focuses on new graph-based and deep learning-based methods to address these issues in three different contexts. Specifically, we propose (1) an imputation method that can accurately recover the abundance of under-represented genetic material in single-cell RNA-seq experiments, (2) a binning method to reduce genome fragmentation in viral metagenome sequencing experiments, and (3) a tool to explore and cluster viral populations based on their genomic structure. Our contributions thus address three popular biological contexts in which skewed abundance hampers bioinformatic analyses. Furthermore, the last two chapters focus on understanding viral diversity and modeling the genesis of novel virus strains through recombination. Although recombination is at the core of the current COVID-19 crisis, it remains understudied, and few tools exist to model how viral populations evolve through recombination.

Keywords

Bioinformatics

Extent

125 pages

Rights

All UHM dissertations and theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission from the copyright owner.

Email libraryada-l@lists.hawaii.edu if you need this content in ADA-compliant format.