Automated topic analysis for restricted scope health corpora: methodology and comparison with human performance

Date
2021-01-05
Authors
Maeder, Anthony
Tieman, Jennifer
Naveda, Bertha
Champion, Stephanie
Agnew, Tamara
Starting Page
775
Abstract
This paper addresses the problem of identifying topics that describe the information content of restricted-size sets of scientific papers extracted from publication databases. Conventional computational approaches, based on natural language processing with unsupervised classification algorithms, typically require large numbers of papers to achieve adequate training. The method presented here instead combines a simpler word-frequency-based technique with context modeling. Its application is illustrated on corpora drawn from a curated literature search site for COVID-19 research publications. The results are compared with those of a conventional human-based approach and show partial overlap in the topics identified. The findings suggest that computational approaches may provide an alternative to human expert topic analysis, provided adequate contextual models are available.
Keywords
Text Analytics, topic analysis, natural language processing, keyword extraction, term frequency
Extent
7 pages
Related To
Proceedings of the 54th Hawaii International Conference on System Sciences
Rights
Attribution-NonCommercial-NoDerivatives 4.0 International