Mining and Representing Unstructured Nicotine Use Data in a Structured Format for Secondary Use

Ngwenya, Mandlenkosi; Bankole, Felix

Files

0371.pdf (533.89 KB)

Date

2019-01-08

Authors

Ngwenya, Mandlenkosi

Bankole, Felix

Abstract

The objective of this study was to use rules, natural language processing (NLP), and machine learning to address the problem of clinical data interoperability across healthcare providers. Addressing this problem has the potential to make clinical data comparable, retrievable, and exchangeable between healthcare providers. Our focus was on giving structure to unstructured patient smoking information. We collected our data from the MIMIC-III database, wrote rules for annotating the data, and then trained a conditional random field (CRF) sequence classifier. We obtained F-measures of 86%, 72%, 69%, 80%, and 12% for substance smoked, frequency, amount, temporal, and duration, respectively. Amount smoked yielded a small value due to scarcity of related data. For smoking status, we obtained F-measures of 94.8% for the non-smoker class, 83.0% for current-smoker, and 65.7% for past-smoker. We created a FHIR profile for mapping the extracted data based on openEHR reference models; in future work, we will explore mapping to CIMI models.

Keywords

Big Data on Healthcare Application, Information Technology in Healthcare, Big data, Conditional Random Fields, FHIR profiles, Natural language processing, Unstructured health data

URI

http://hdl.handle.net/10125/59811

Extent

10 pages

Related To

Proceedings of the 52nd Hawaii International Conference on System Sciences

Rights

Attribution-NonCommercial-NoDerivatives 4.0 International

Collections

Big Data on Healthcare Application

