Automated topic analysis for restricted scope health corpora: methodology and comparison with human performance

dc.contributor.author: Maeder, Anthony
dc.contributor.author: Tieman, Jennifer
dc.contributor.author: Naveda, Bertha
dc.contributor.author: Champion, Stephanie
dc.contributor.author: Agnew, Tamara
dc.date.accessioned: 2020-12-24T19:08:00Z
dc.date.available: 2020-12-24T19:08:00Z
dc.date.issued: 2021-01-05
dc.description.abstract: This paper addresses the problem of identifying topics which describe information content, in restricted size sets of scientific papers extracted from publication databases. Conventional computational approaches, based on natural language processing using unsupervised classification algorithms, typically require large numbers of papers to achieve adequate training. The approach presented here uses a simpler word-frequency-based approach coupled with context modeling. An example is provided of its application to corpora resulting from a curated literature search site for COVID-19 research publications. The results are compared with a conventional human-based approach, indicating partial overlap in the topics identified. The findings suggest that computational approaches may provide an alternative to human expert topic analysis, provided adequate contextual models are available.
dc.format.extent: 7 pages
dc.identifier.doi: https://doi.org/10.24251/HICSS.2021.095
dc.identifier.isbn: 978-0-9981331-4-0
dc.identifier.uri: http://hdl.handle.net/10125/70706
dc.language.iso: English
dc.relation.ispartof: Proceedings of the 54th Hawaii International Conference on System Sciences
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International
dc.rights.uri: https://creativecommons.org/licenses/by-nc-nd/4.0/
dc.subject: Text Analytics
dc.subject: topic analysis
dc.subject: natural language processing
dc.subject: keyword extraction
dc.subject: term frequency
dc.title: Automated topic analysis for restricted scope health corpora: methodology and comparison with human performance
prism.startingpage: 775

Files

Original bundle

Name: 0077.pdf
Size: 389.7 KB
Format: Adobe Portable Document Format
