Novel algorithms to account for uncertainties in the sequencing of genetic material with skewed abundance
Date
2022
Abstract
The sequencing of genetic material (microbial DNA or RNA) is essential in biological experiments. However, while the cost of sequencing has decreased substantially, the highly skewed abundance distribution of genetic material makes it challenging to accurately represent the genetic content of a sample. For instance, in DNA-based metagenomic experiments, DNA fragments are randomly sampled and used to identify and quantify the organisms present in an environmental sample. Rare species are sampled less frequently, which complicates subsequent bioinformatic analyses. Given the prevalence of uneven abundance and its drastic implications for bioinformatic analyses, our research focuses on new graph- and deep learning-based methods to address these issues in three different contexts. Specifically, we propose (1) an imputation method that can accurately recover the abundance of under-represented genetic material in single-cell RNA-seq experiments, (2) a binning method to reduce genome fragmentation in viral metagenome sequencing experiments, and (3) a tool to explore and cluster viral populations based on their genomic structure. These contributions address three popular biological contexts in which skewed abundance hampers bioinformatic analyses. Furthermore, the last two chapters focus on understanding viral diversity and modeling the genesis of novel virus strains through recombination. Despite being at the core of the current COVID-19 crisis, recombination remains understudied, and few tools exist to model how viral populations evolve through recombination.
Keywords
Bioinformatics
Extent
125 pages
Rights
All UHM dissertations and theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission from the copyright owner.