Using Isolation Forest and Alternative Data Products to Overcome Ground Truth Data Scarcity for Improved Deep Learning-based Agricultural Land Use Classification Models

dc.contributor.author: García Pereira, Agustín
dc.contributor.author: Porwol, Lukasz
dc.contributor.author: Ojo, Adegboyega
dc.date.accessioned: 2022-12-27T19:16:44Z
dc.date.available: 2022-12-27T19:16:44Z
dc.date.issued: 2023-01-03
dc.description.abstract: High-quality labelled datasets represent a cornerstone in the development of deep learning models for land use classification. The high cost of data collection, the inherent errors introduced during data mapping efforts, the lack of local knowledge, and the spatial variability of the data hinder the development of accurate and spatially-transferable deep learning models in the context of agriculture. In this paper, we investigate the use of Isolation Forest (IF), an anomaly detection algorithm, to reduce noise in a large-scale, low-resolution alternative ground truth dataset used to train land use deep learning models. We use a modest-size, high-resolution and high-fidelity manually collected ground-truth dataset to calibrate Isolation Forest parameters and evaluate our approach, highlighting the relatively low cost of the methodology. Our data-centric methodology demonstrates the efficacy of deep learning methods coupled with IF to create mid-resolution land-use models and map products for agriculture using an alternative ground-truth dataset. Moreover, we compare our deep learning approach with a traditional algorithm used in remote sensing and evaluate the spatial transferability of the created models. Finally, we reflect upon the lessons learnt and future work.
dc.format.extent: 10
dc.identifier.doi: 10.24251/HICSS.2023.610
dc.identifier.isbn: 978-0-9981331-6-4
dc.identifier.uri: https://hdl.handle.net/10125/103244
dc.language.iso: eng
dc.relation.ispartof: Proceedings of the 56th Hawaii International Conference on System Sciences
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International
dc.rights.uri: https://creativecommons.org/licenses/by-nc-nd/4.0/
dc.subject: Geospatial Big Data Analytics
dc.subject: agriculture
dc.subject: data
dc.subject: datasets
dc.subject: deep learning
dc.subject: gis
dc.subject: ground truth
dc.subject: isolation forest
dc.title: Using Isolation Forest and Alternative Data Products to Overcome Ground Truth Data Scarcity for Improved Deep Learning-based Agricultural Land Use Classification Models
dc.type.dcmi: text
prism.startingpage: 4978
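
The abstract describes using Isolation Forest to reduce noise in a large alternative ground-truth dataset before training. The following is a minimal sketch of that general idea, not the authors' implementation: it assumes scikit-learn's IsolationForest, fits one forest per class label, and drops samples flagged as anomalous within their own class. The feature matrix, the class names, the contamination value, and the helper name filter_noisy_labels are all hypothetical and chosen only for illustration; the paper calibrates its parameters against a separate manually collected dataset, which this sketch does not reproduce.

```python
import numpy as np
from sklearn.ensemble import IsolationForest


def filter_noisy_labels(features, labels, contamination=0.1, seed=0):
    """Return a boolean mask that is False for samples flagged as anomalous
    within their own class. Hypothetical helper, for illustration only."""
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    keep = np.zeros(labels.shape[0], dtype=bool)
    for cls in np.unique(labels):
        idx = np.flatnonzero(labels == cls)
        forest = IsolationForest(
            n_estimators=100,
            contamination=contamination,  # assumed share of noisy labels
            random_state=seed,
        )
        # fit_predict returns +1 for inliers and -1 for outliers.
        keep[idx] = forest.fit_predict(features[idx]) == 1
    return keep


# Synthetic stand-in for per-sample features (e.g. spectral values).
rng = np.random.default_rng(0)
crop = rng.normal(loc=0.0, scale=1.0, size=(200, 4))
grass = rng.normal(loc=10.0, scale=1.0, size=(200, 4))
# Ten samples per class that carry the wrong label (simulated label noise).
crop_mislabelled = rng.normal(loc=10.0, scale=1.0, size=(10, 4))
grass_mislabelled = rng.normal(loc=0.0, scale=1.0, size=(10, 4))

features = np.vstack([crop, crop_mislabelled, grass, grass_mislabelled])
labels = np.array(["crop"] * 210 + ["grass"] * 210)

keep = filter_noisy_labels(features, labels, contamination=0.1)
clean_features, clean_labels = features[keep], labels[keep]
print(f"kept {keep.sum()} of {labels.size} samples")
```

In a sketch like this, the contamination parameter is the main setting to tune; a small trusted dataset, as mentioned in the abstract, could be used to choose it by checking which value yields the best downstream accuracy.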

Files

Original bundle
Name: 0485.pdf
Size: 787.97 KB
Format: Adobe Portable Document Format