The Effect of Training Set Timeframe on the Future Performance of Machine Learning-based Malware Detection Models

dc.contributor.author: Galen, Colin
dc.contributor.author: Steele, Robert
dc.date.accessioned: 2020-12-24T19:09:05Z
dc.date.available: 2020-12-24T19:09:05Z
dc.date.issued: 2021-01-05
dc.description.abstract: The occurrence of previously unseen malicious code, or malware, is an implicit and ongoing issue for all software-based systems. It has been recognized that machine learning, applied to features statically extracted from binary executable files, offers a number of promising benefits, such as the ability to detect malware that has not been previously encountered. Nevertheless, it is understood that these models will not continue to perform equally well over time as new and potentially less recognizable malware emerges. In this study, we apply a range of machine learning models to features extracted from a large collection of software executables in Portable Executable format, ordered by the date each binary was first encountered and consisting of both malware and benign examples, while considering different training set configurations and timeframes. We analyze and quantify the relative performance deterioration of these machine learning models on future test sets of these features, and discuss insights into the characteristics and rate of performance deterioration in machine learning-based malware detection and into training set selection.
dc.format.extent: 10 pages
dc.identifier.doi: 10.24251/HICSS.2021.105
dc.identifier.isbn: 978-0-9981331-4-0
dc.identifier.uri: http://hdl.handle.net/10125/70717
dc.language.iso: English
dc.relation.ispartof: Proceedings of the 54th Hawaii International Conference on System Sciences
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International
dc.rights.uri: https://creativecommons.org/licenses/by-nc-nd/4.0/
dc.subject: Accountability, Evaluation, and Obscurity of AI Algorithms
dc.subject: artificial intelligence
dc.subject: machine learning
dc.subject: malware
dc.subject: model maintenance
dc.title: The Effect of Training Set Timeframe on the Future Performance of Machine Learning-based Malware Detection Models
prism.startingpage: 857
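
The evaluation setup summarized in the abstract (train on samples from one timeframe, then measure performance on later ones) can be illustrated with a minimal sketch. This is not the authors' code: the `Sample` record, the function names, and the fixed-length test windows are all assumptions made for illustration, and the record above does not state the window sizes or models the paper actually used.

```python
from datetime import date, timedelta
from typing import Callable, List, NamedTuple, Sequence, Tuple


class Sample(NamedTuple):
    """Hypothetical record for one executable (names are illustrative)."""
    first_seen: date    # date the binary was first encountered
    features: tuple     # placeholder for statically extracted features
    is_malware: bool    # ground-truth label


def split_by_time(
    samples: Sequence[Sample],
    train_start: date,
    train_end: date,
    test_window_days: int,
) -> Tuple[List[Sample], List[Tuple[date, List[Sample]]]]:
    """Split samples into one training set and successive future test windows.

    Training samples satisfy train_start <= first_seen < train_end.
    Every later sample is assigned to a fixed-length window that starts at
    train_end, so no test sample predates any training sample.
    """
    ordered = sorted(samples, key=lambda s: s.first_seen)
    train = [s for s in ordered if train_start <= s.first_seen < train_end]

    windows = {}
    for s in ordered:
        if s.first_seen < train_end:
            continue
        index = (s.first_seen - train_end).days // test_window_days
        windows.setdefault(index, []).append(s)

    # Empty windows are skipped; each entry carries its window start date.
    future = [
        (train_end + timedelta(days=index * test_window_days), windows[index])
        for index in sorted(windows)
    ]
    return train, future


def accuracy_per_window(
    predict: Callable[[tuple], bool],
    future: List[Tuple[date, List[Sample]]],
) -> List[Tuple[date, float]]:
    """Score an already-trained predictor on each future window in order."""
    results = []
    for window_start, batch in future:
        correct = sum(predict(s.features) == s.is_malware for s in batch)
        results.append((window_start, correct / len(batch)))
    return results
```

Comparing the per-window scores for models trained on different `train_start`/`train_end` ranges gives a rough picture of how quickly detection quality decays; accuracy is used here only for brevity, and other metrics may suit imbalanced malware data better.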

Files

Original bundle
Name: 0085.pdf
Size: 1.15 MB
Format: Adobe Portable Document Format