Please use this identifier to cite or link to this item: http://hdl.handle.net/10125/64536

Phishing Sites Detection from a Web Developer’s Perspective Using Machine Learning

File      Size       Format
0641.pdf  689.61 kB  Adobe PDF

Item Summary

Title:Phishing Sites Detection from a Web Developer’s Perspective Using Machine Learning
Authors:Zhou, Xin
Verma, Rakesh
Keywords:Machine Learning and Cyber Threat Intelligence and Analytics
machine learning
phishing website
random committee
Date Issued:07 Jan 2020
Abstract:The Internet has enabled unprecedented communication and new technologies. Concomitantly, it has brought the bane of phishing and exacerbated vulnerabilities. In this paper, we propose a model to detect phishing webpages from a web developer’s perspective. From this standpoint, we design 120 novel content-based features, four novel time-based features, and two novel search-based features, and we add 34 other content-based and 11 heuristic features to optimize the model. We select Random Committee (base learner: Random Tree) for our framework because it performed best in a comparison with six other algorithms: Hellinger Distance Decision Tree, SVM, Logistic Regression, J48, Naive Bayes, and Random Forest. In real-time experiments, the model achieved 99.4% precision and 98.3% MCC with a 0.1% false positive rate in 5-fold cross-validation using the realistic scenario of an unbalanced dataset.
Pages/Duration:10 pages
URI:http://hdl.handle.net/10125/64536
ISBN:978-0-9981331-3-3
DOI:10.24251/HICSS.2020.794
Rights:Attribution-NonCommercial-NoDerivatives 4.0 International
https://creativecommons.org/licenses/by-nc-nd/4.0/
Appears in Collections: Machine Learning and Cyber Threat Intelligence and Analytics
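
Note on the reported metrics: the abstract quotes precision, MCC (Matthews correlation coefficient), and false positive rate. The sketch below shows how these three quantities are conventionally computed from the counts of a binary confusion matrix. It is an illustration only, not the authors' code; the function name `binary_metrics` and the example counts are hypothetical and are not taken from the paper.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Return (precision, false_positive_rate, mcc) from confusion-matrix counts.

    tp/fp/tn/fn are true positives, false positives, true negatives and
    false negatives, with "positive" meaning "classified as phishing".
    Any ratio whose denominator is zero is reported as 0.0.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    false_positive_rate = fp / (fp + tn) if (fp + tn) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return precision, false_positive_rate, mcc


if __name__ == "__main__":
    # Hypothetical counts for one unbalanced test fold (NOT from the paper):
    # far more legitimate pages (negatives) than phishing pages (positives).
    precision, fpr, mcc = binary_metrics(tp=90, fp=1, tn=999, fn=10)
    print(f"precision={precision:.3f} fpr={fpr:.4f} mcc={mcc:.3f}")
```

MCC is often preferred on unbalanced data because it uses all four cells of the confusion matrix, whereas precision ignores true and false negatives.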



This item is licensed under a Creative Commons License.