Using Isolation Forest and Alternative Data Products to Overcome Ground Truth Data Scarcity for Improved Deep Learning-based Agricultural Land Use Classification Models

dc.contributor.author García Pereira, Agustín
dc.contributor.author Porwol, Lukasz
dc.contributor.author Ojo, Adegboyega
dc.date.accessioned 2022-12-27T19:16:44Z
dc.date.available 2022-12-27T19:16:44Z
dc.date.issued 2023-01-03
dc.description.abstract High-quality labelled datasets represent a cornerstone in the development of deep learning models for land use classification. The high cost of data collection, the inherent errors introduced during data mapping efforts, the lack of local knowledge, and the spatial variability of the data hinder the development of accurate and spatially-transferable deep learning models in the context of agriculture. In this paper, we investigate the use of Isolation Forest (IF), an anomaly detection algorithm, to reduce noise in a large-scale, low-resolution alternative ground truth dataset used to train land use deep learning models. We use a modest-size, high-resolution and high-fidelity manually collected ground-truth dataset to calibrate Isolation Forest parameters and evaluate our approach, highlighting the relatively low cost of the methodology. Our data-centric methodology demonstrates the efficacy of deep learning methods coupled with IF to create mid-resolution land-use models and map products for agriculture using an alternative ground-truth dataset. Moreover, we compare our deep learning approach with a traditional algorithm used in remote sensing and evaluate the spatial transferability of the created models. Finally, we reflect upon the lessons learnt and future work.
dc.format.extent 10
dc.identifier.doi 10.24251/HICSS.2023.610
dc.identifier.isbn 978-0-9981331-6-4
dc.identifier.uri https://hdl.handle.net/10125/103244
dc.language.iso eng
dc.relation.ispartof Proceedings of the 56th Hawaii International Conference on System Sciences
dc.rights Attribution-NonCommercial-NoDerivatives 4.0 International
dc.rights.uri https://creativecommons.org/licenses/by-nc-nd/4.0/
dc.subject Geospatial Big Data Analytics
dc.subject agriculture
dc.subject data
dc.subject datasets
dc.subject deep learning
dc.subject gis
dc.subject ground truth
dc.subject isolation forest
dc.title Using Isolation Forest and Alternative Data Products to Overcome Ground Truth Data Scarcity for Improved Deep Learning-based Agricultural Land Use Classification Models
dc.type.dcmi text
prism.startingpage 4978
Files
Original bundle
Name: 0485.pdf
Size: 787.97 KB
Format: Adobe Portable Document Format