DATA SCIENCE FOR MOLECULAR GENETICS AND COMMUNICATION IN THE NATURAL SCIENCES
Date
2022
Abstract
“By 2025, it’s estimated that 463 exabytes of data will be created each day globally – that’s the equivalent of 212,765,957 DVDs per day!” -World Economic Forum
Data science refers to the study of increasingly large and complex datasets. Data that are too large for standard tools (e.g., Excel, Google Sheets) to analyze are often referred to as “big data.” While big data exist across many areas and are thought to be the path to answering many questions, there is still no consensus on the fundamental principles and skills needed to interact with big data. Further, the skills needed to study big data are not universally taught in a systematic way at the college level. The resulting gap in skills leaves students unable to analyze the same big data that are touted as the way to answer complex questions. This dissertation proposes a plan to close the big data knowledge gap by incorporating data science principles from diverse disciplines into a biology curriculum. Specifically, essential information was distilled from three independent study systems in cancer diagnostics, plant genomics, and academic publishing. Each study system contributed a different perspective on the skills and knowledge gained from analyzing big data. From these systems, I identified three critical areas that are central to using big data effectively.
From these diverse perspectives, I developed a model to assist instructors in constructing curricula that will work in many different biological contexts. I piloted the use of these principles in a summer course. I found that by incorporating instruction developed across knowledge areas, meaningful data science instruction can occur in any curriculum at any student level.
Keywords
Bioinformatics
Extent
115 pages
Rights
All UHM dissertations and theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission from the copyright owner.