Phishing Sites Detection from a Web Developer’s Perspective Using Machine Learning

Zhou, Xin; Verma, Rakesh

dc.contributor.author	Zhou, Xin
dc.contributor.author	Verma, Rakesh
dc.date.accessioned	2020-01-04T08:31:54Z
dc.date.available	2020-01-04T08:31:54Z
dc.date.issued	2020-01-07
dc.description.abstract	The Internet has enabled unprecedented communication and new technologies. Concomitantly, it has brought the bane of phishing and exacerbated vulnerabilities. In this paper, we propose a model to detect phishing webpages from a web developer’s perspective. From this standpoint, we design 120 novel features based on webpage content, four novel time-based features, and two novel search-based features; we also use 34 other content-based and 11 heuristic features to optimize the model. We select Random Committee (base learner: Random Tree) for our framework because it performed best in a comparison with six other algorithms: Hellinger Distance Decision Tree, SVM, Logistic Regression, J48, Naive Bayes, and Random Forest. In real-time experiments, the model achieved 99.4% precision and 98.3% MCC with a 0.1% false positive rate in 5-fold cross-validation using the realistic scenario of an unbalanced dataset.
dc.format.extent	10 pages
dc.identifier.doi	10.24251/HICSS.2020.794
dc.identifier.isbn	978-0-9981331-3-3
dc.identifier.uri	http://hdl.handle.net/10125/64536
dc.language.iso	eng
dc.relation.ispartof	Proceedings of the 53rd Hawaii International Conference on System Sciences
dc.rights	Attribution-NonCommercial-NoDerivatives 4.0 International
dc.rights.uri	https://creativecommons.org/licenses/by-nc-nd/4.0/
dc.subject	Machine Learning and Cyber Threat Intelligence and Analytics
dc.subject	machine learning
dc.subject	phishing website
dc.subject	random committee
dc.title	Phishing Sites Detection from a Web Developer’s Perspective Using Machine Learning
dc.type	Conference Paper
dc.type.dcmi	Text

Files

Original bundle

Name: 0641.pdf
Size: 689.61 KB
Format: Adobe Portable Document Format

Collections

Machine Learning and Cyber Threat Intelligence and Analytics