Please use this identifier to cite or link to this item: http://hdl.handle.net/10125/70706

Automated topic analysis for restricted scope health corpora: methodology and comparison with human performance

File:0077.pdf
Size:389.7 kB
Format:Adobe PDF

Item Summary

Title:Automated topic analysis for restricted scope health corpora: methodology and comparison with human performance
Authors:Maeder, Anthony
Tieman, Jennifer
Naveda, Bertha
Champion, Stephanie
Agnew, Tamara
Keywords:Text Analytics
topic analysis
natural language processing
keyword extraction
term frequency
Date Issued:05 Jan 2021
Abstract:This paper addresses the problem of identifying topics that describe information content in restricted-size sets of scientific papers extracted from publication databases. Conventional computational approaches, based on natural language processing using unsupervised classification algorithms, typically require large numbers of papers to achieve adequate training. The method presented here instead uses a simpler word-frequency-based approach coupled with context modeling. An example is provided of its application to corpora resulting from a curated literature search site for COVID-19 research publications. The results are compared with a conventional human-based approach, indicating partial overlap in the topics identified. The findings suggest that computational approaches may provide an alternative to human expert topic analysis, provided adequate contextual models are available.
Pages/Duration:7 pages
URI:http://hdl.handle.net/10125/70706
ISBN:978-0-9981331-4-0
DOI:10.24251/HICSS.2021.095
Rights:Attribution-NonCommercial-NoDerivatives 4.0 International
https://creativecommons.org/licenses/by-nc-nd/4.0/
Appears in Collections: Text Analytics


This item is licensed under a Creative Commons License.