Robust Optimization for Inference on Machine Learning Generated Variables

Schecter, Aaron; Li, Weifeng

dc.contributor.author	Schecter, Aaron
dc.contributor.author	Li, Weifeng
dc.date.accessioned	2023-12-26T18:36:42Z
dc.date.available	2023-12-26T18:36:42Z
dc.date.issued	2024-01-03
dc.identifier.doi	10.24251/HICSS.2024.132
dc.identifier.isbn	978-0-9981331-7-1
dc.identifier.other	59f0118d-0bb1-4a9e-88de-3f7832274c75
dc.identifier.uri	https://hdl.handle.net/10125/106509
dc.language.iso	eng
dc.relation.ispartof	Proceedings of the 57th Hawaii International Conference on System Sciences
dc.rights	Attribution-NonCommercial-NoDerivatives 4.0 International
dc.rights.uri	https://creativecommons.org/licenses/by-nc-nd/4.0/
dc.subject	Data Science and Machine Learning to Support Business Decisions
dc.subject	bias correction
dc.subject	machine learning
dc.subject	measurement error
dc.subject	regression
dc.subject	robust optimization
dc.title	Robust Optimization for Inference on Machine Learning Generated Variables
dc.type	Conference Paper
dc.type.dcmi	Text
dcterms.abstract	Leveraging supervised machine learning (SML) algorithms to operationalize constructs from unstructured data like text or images is becoming common in practice and research. As a result, variables generated through SML are used in regression models to make inferences and test theories. However, variables produced by SML will have measurement errors compared to the underlying construct. We propose using robust optimization to reduce the negative impact of these errors and produce less biased coefficient estimates while conducting more accurate hypothesis testing. To extend the burgeoning literature on this issue, our proposed method focuses on the generalized research setting where a flexible number of dependent and independent variables are measured by SML algorithms. We combine recent robust optimization techniques to fit a linear regression model in the presence of uncertain measurement error. We theoretically demonstrate the consistency and efficiency of the robust approach. Through simulations, we demonstrate the effectiveness of our approach.
dcterms.extent	10 pages
prism.startingpage	1100

Files

Original bundle

Name: 0107.pdf
Size: 344.74 KB
Format: Adobe Portable Document Format

Collections

Data Science and Machine Learning to Support Business Decisions