A Hybrid Approach For Phishing Website Detection Using Machine Learning.

EOI: 10.11242/viva-tech.01.04.111


Mr. Harsh Kansagara, Mr. Vandan Raval, Mr. Faiz Shaikh, Prof. Saniket Kudoo, "A Hybrid Approach For Phishing Website Detection Using Machine Learning.", VIVA-IJRI Volume 1, Issue 4, Article 111, pp. 1-6, 2021. Published by Computer Engineering Department, VIVA Institute of Technology, Virar, India.


In this technical age, an attacker has many ways to gain illegitimate access to people’s sensitive information. One of them is phishing: misleading people into entering their sensitive information on fraudulent websites that look like the real website. The phisher’s aim is to steal personal information, bank details, and similar data. Entering personal information on websites is becoming riskier day by day, because any site might be a phishing attack that steals it. Phishing website detection is therefore necessary to alert the user and block the website. Such detection needs to be automated, and machine learning is one way to achieve this. Machine learning is an efficient technique for detecting phishing attacks because it removes drawbacks of existing approaches. An efficient machine learning model combined with a content-based approach proves very effective at detecting phishing websites. Our proposed system uses a hybrid approach that combines a machine learning based method with a content-based method. URL-based features are extracted and passed to the machine learning model; in the content-based approach, the TF-IDF algorithm detects a phishing website using the top keywords of the web page. This hybrid approach is used to achieve a highly efficient result. Finally, our system notifies the user whether the website is phishing or legitimate.
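
The two halves of the hybrid approach can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not list the URL features, the TF-IDF weighting variant, or how the top keywords lead to a verdict, so the feature set in `url_features`, the smoothed IDF formula, and the function names are all assumptions made for illustration. The sketch stops at producing the inputs (a feature dictionary and a keyword list) and does not train a classifier.

```python
import math
import re
from collections import Counter
from urllib.parse import urlparse


def url_features(url):
    """Return a few lexical URL features (an assumed, illustrative set)."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "num_dots_in_host": host.count("."),
        "has_at_symbol": "@" in url,
        # True when the host is a dotted-quad IP address instead of a name.
        "has_ip_host": bool(re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", host)),
        "uses_https": parsed.scheme == "https",
    }


def tokenize(text):
    """Lowercase the text and keep alphabetic words only."""
    return re.findall(r"[a-z]+", text.lower())


def top_keywords(page_text, corpus, k=3):
    """Rank the words of page_text by TF-IDF against a background corpus."""
    tokens = tokenize(page_text)
    if not tokens:
        return []
    term_counts = Counter(tokens)
    doc_sets = [set(tokenize(doc)) for doc in corpus]
    n_docs = len(doc_sets)
    scores = {}
    for word, count in term_counts.items():
        doc_freq = sum(1 for words in doc_sets if word in words)
        # Smoothed IDF (an assumption): avoids division by zero for unseen words.
        idf = math.log((1 + n_docs) / (1 + doc_freq)) + 1
        scores[word] = (count / len(tokens)) * idf
    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return ranked[:k]
```

In a full system along the lines the abstract describes, the feature dictionary would be turned into a numeric vector for the machine learning model, and the top keywords would feed the content-based check; both of those steps are left out here because the abstract does not specify them.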


Content-based approach, Machine learning, Phishing detection, Random Forest, TF-IDF.

