Phishing attacks are a major cyber threat that continues to evolve and grow more sophisticated, causing billions of dollars in losses each year according to the recent Internet Crime Report. However, traditional offline and inline phishing detection engines are limited in how they can detect evasive phishing pages. Because of their performance requirements, inline solutions can only target specific campaigns and, at best, act as basic static analyzers. Offline detection solutions, on the other hand, rely heavily on crawlers to collect content and update the blocklist used to block phishing pages. They, too, are limited: offline tools cannot detect evasive phishing pages that present benign content to a crawler, or short-lived campaigns that abandon the URL before it is added to the blocklist.
At Netskope, we have developed a deep learning-based Inline Phishing Detection Engine (DL-IPD) that can address all of these limitations. It allows us to inspect web content and block phishing web pages in real time, complementing our offline analyzers and blocklists. The model continuously learns the dynamic behavior patterns of phishing pages and can identify patient-zero and evasive phishing attacks. Similar to generative pre-training of large language models, such as BERT and ChatGPT/GPT-4, we use a large number of web pages to train the HTML encoder and then use it to build a phishing classifier. We have been awarded three U.S. patents for our innovative approach to phishing detection.
The Inline Phishing Detection Engine (DL-IPD) is available to all Intelligent SSE and Netskope Next Gen-SWG customers. In this blog post, we will share some of the patient-zero and evasive phishing examples that our engine has detected.
Patient-zero threat
To detect phishing pages, traditional offline solutions typically crawl URLs flagged in traffic or collected from the security community. Once a page is identified as phishing, it is added to a blocklist database used for URL filtering to prevent future access. However, collecting features and updating the database takes time, and during that delay the first victims can reach the page undetected, potentially compromising the entire organization. Inline analysis offers an advantage by inspecting the actual content that users see, thereby preventing patient-zero attacks. In the following section, we present examples of patient-zero campaigns that have been detected.
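Before turning to the examples, the blocklist delay described above can be illustrated with a toy simulation. This is only a sketch under stated assumptions: the function names, the single-URL setup, and the `crawl_delay` parameter are hypothetical, and the inline function simply assumes the classifier recognizes the page on every request. It does not describe how any real blocklist pipeline or DL-IPD is implemented.

```python
def offline_blocklist_verdicts(visit_times, crawl_delay):
    """Return one verdict per visit to a single phishing URL.

    Assumption: the first visit flags the URL for crawling, and the URL
    only reaches the blocklist `crawl_delay` time units later.
    """
    verdicts = []
    listed_at = None  # time at which the URL appears on the blocklist
    for t in sorted(visit_times):
        if listed_at is None:
            # First sighting triggers the crawl and the eventual update.
            listed_at = t + crawl_delay
        verdicts.append("blocked" if t >= listed_at else "allowed")
    return verdicts


def inline_verdicts(visit_times):
    """Assumption: an inline engine inspects the content of every request
    and recognizes the page as phishing, including on the first visit."""
    return ["blocked" for _ in visit_times]


if __name__ == "__main__":
    visits = [0, 5, 20]  # hypothetical visit times, in minutes
    print(offline_blocklist_verdicts(visits, crawl_delay=10))
    print(inline_verdicts(visits))
```

In this toy model, every visit that arrives before the blocklist update is a patient-zero exposure, which is the window that inline inspection is meant to close.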
Credential and credit card theft
Credential theft phishing pages are designed to look like legitimate websites, such as online services or e-commerce websites, and trick users into entering their login credentials and credit card information. The following is an example of a patient-zero credential theft phishing page that was detected by DL-IPD on April 11, 2023. The campaign mimics the login page of CTT Correios de Portugal, the national postal service of Portugal, and requests victims to enter their personal and credit card information.
Health scam
Since the deployment of Netskope Advanced Inline Phishing Detection, we have been able to track various forms of health scams, including fake cures for serious illnesses, unproven health supplements and treatments, and bogus medical equipment and devices. The following is an example of a health scam that was detected by Netskope on April 10, before any other vendor (based on VirusTotal records). This scam involves tricking the victim into purchasing a health product package.
Malicious adware pages
Malicious adware sites display unwanted or harmful ads to users. These websites often employ deceptive tactics to entice users to click on ads, which may redirect them to malicious sites or download malware onto their devices.
DL-IPD has detected various types of malicious adware pages that ask for permission to show notifications in the browser. Once the user approves, the pages bombard them with ads that redirect to fake antivirus scams, free iPad scams, and other types of scams. The following is a patient-zero adware page detected by our Inline Phishing Detection Engine.
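To make the notification-permission behavior concrete, here is a minimal sketch that checks page source for the standard browser calls used to request notification or push permission (`Notification.requestPermission` and `pushManager.subscribe`). The function name and pattern list are our own illustrative choices. This is a simple string heuristic, not the DL-IPD model, and an obfuscated script would evade it.

```python
import re

# Standard Web API calls that prompt the user for notification/push permission.
NOTIFICATION_PATTERNS = [
    re.compile(r"Notification\s*\.\s*requestPermission\s*\("),
    re.compile(r"pushManager\s*\.\s*subscribe\s*\("),
]


def requests_notification_permission(html: str) -> bool:
    """Return True if the page source visibly calls a notification-permission API."""
    return any(pattern.search(html) for pattern in NOTIFICATION_PATTERNS)


if __name__ == "__main__":
    page = "<script>Notification.requestPermission().then(showAds);</script>"
    print(requests_notification_permission(page))
```

A permission prompt alone is not malicious, since many legitimate sites use notifications, so a signal like this would only be one input among many.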
Detection of evasive phishing pages
Phishing attacks are becoming more sophisticated with the use of cloaking, URL rotation, obfuscation, and dynamic code generation. These techniques make it challenging for traditional phishing detection tools relying on signature-based or classic feature extraction techniques to detect evasive pages.
DL-IPD can detect evasive pages. First, as an inline engine, it has access to the actual content that will be delivered to the user, making server-side evasion techniques ineffective. Moreover, unlike traditional signature-based inline tools or classic feature extraction techniques, this classifier uses deep learning to learn complex signatures and can detect signs of obfuscation or code that attempts to generate dynamic content.
In the ongoing battle between phishing attackers and phishing solutions, attackers are always coming up with new ways to bypass security solutions. However, this tool makes it one step harder for the malicious pages to evade detection. In the following section, we will provide examples of evasive pages detected by DL-IPD.
Example 1 – Phishing with dynamic content
Attackers may use JavaScript to generate the phishing page dynamically. This means that the phishing content is not visible in the page source code, making it harder for detection tools to accurately classify the page. The following phishing page detected by Netskope uses JavaScript to add all the DOM elements after it loads in the victim’s browser.
The body of the HTML first checks whether the browser can run JavaScript. If it cannot, no page content is displayed, since a client without JavaScript is likely an analysis tool rather than a real victim. The rest of the body is mostly empty, containing only two obfuscated scripts that dynamically add the page’s elements when loaded in a real browser, creating a Microsoft login page.
Even though the code to generate the phishing page is obfuscated, the structure of the landing HTML page is already suspicious: the entire body consists of only two scripts, no visual elements, and a warning to enable JavaScript, which points to an attempt to generate the page dynamically. Our DL-based classifier learned such page structures and patterns and blocked them.
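The structural cues described above (a body made up of scripts, a JavaScript warning, and no visual elements) can be sketched as hand-written features using Python's standard `html.parser`. This is only an illustration of the kind of pattern involved. The class, function names, and tag lists are assumptions of ours, and the actual classifier learns such patterns from data instead of relying on fixed rules like these.

```python
from html.parser import HTMLParser

# Assumed list of tags that render nothing visible on their own.
NON_VISUAL_TAGS = {"script", "noscript", "style", "meta", "link", "title", "template"}
# Tags whose contents are not shown when JavaScript is enabled.
HIDDEN_CONTAINERS = {"script", "noscript", "style"}


class BodyStructure(HTMLParser):
    """Collects coarse structural counts from the <body> of a page."""

    def __init__(self):
        super().__init__()
        self.in_body = False
        self.hidden_depth = 0
        self.script_count = 0
        self.has_noscript = False
        self.visual_elements = 0
        self.visible_text_chars = 0

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self.in_body = True
            return
        if not self.in_body:
            return
        if tag == "script":
            self.script_count += 1
        elif tag == "noscript":
            self.has_noscript = True
        if tag in HIDDEN_CONTAINERS:
            self.hidden_depth += 1
        elif tag not in NON_VISUAL_TAGS and self.hidden_depth == 0:
            self.visual_elements += 1

    def handle_endtag(self, tag):
        if tag == "body":
            self.in_body = False
        elif tag in HIDDEN_CONTAINERS and self.hidden_depth > 0:
            self.hidden_depth -= 1

    def handle_data(self, data):
        # Count only text that would be visible in a JavaScript-enabled browser.
        if self.in_body and self.hidden_depth == 0:
            self.visible_text_chars += len(data.strip())


def body_structure_features(html: str) -> dict:
    parser = BodyStructure()
    parser.feed(html)
    parser.close()
    return {
        "script_count": parser.script_count,
        "has_noscript": parser.has_noscript,
        "visual_elements": parser.visual_elements,
        "visible_text_chars": parser.visible_text_chars,
    }


def looks_script_only(html: str) -> bool:
    """True when the body holds scripts but no visible elements or text."""
    f = body_structure_features(html)
    return f["script_count"] >= 1 and f["visual_elements"] == 0 and f["visible_text_chars"] == 0
```

Many legitimate single-page applications also ship a nearly empty body, so a rule like `looks_script_only` would be far too blunt on its own. It is shown only to make the described page structure tangible.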
Example 2 – Use of images instead of HTML code
Phishing pages sometimes use images instead of writing HTML code to visually imitate a legitimate website. By doing so, attackers can evade static analyzers or phishing solutions that analyze only the HTML code of the page. At the same time, image-based phishing pages may appear more genuine to victims and require less effort to create.
The following phishing page on a compromised host was detected by Advanced Inline Phishing Detection. The entire HTML code of this page consists of a Base64 encoded image rendered in the background as an authentication form.
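One way to picture a page like this is to measure how much of the HTML is taken up by Base64 image data embedded through `data:image/...;base64,` URIs. The sketch below computes that fraction. The function name and regular expression are our own assumptions, and this ratio is an illustrative hand-made feature, not a description of how DL-IPD analyzes pages.

```python
import base64
import binascii
import re

# Matches data URIs that embed a Base64-encoded image directly in the HTML.
DATA_IMAGE_RE = re.compile(r"data:image/[a-zA-Z0-9.+-]+;base64,([A-Za-z0-9+/=]+)")


def embedded_image_ratio(html: str) -> float:
    """Fraction of the document's characters that are Base64 image payload."""
    if not html:
        return 0.0
    payload_chars = 0
    for match in DATA_IMAGE_RE.finditer(html):
        payload = match.group(1)
        try:
            # Only count payloads that are well-formed Base64.
            base64.b64decode(payload, validate=True)
        except (binascii.Error, ValueError):
            continue
        payload_chars += len(payload)
    return payload_chars / len(html)


if __name__ == "__main__":
    # Hypothetical page whose body is a single background image.
    image = base64.b64encode(b"\x89PNG" + bytes(600)).decode()
    page = f'<html><body style="background:url(data:image/png;base64,{image})"></body></html>'
    print(round(embedded_image_ratio(page), 2))
```

A ratio close to 1 means the page carries almost no markup besides the image, which matches the structure described above. Legitimate pages also inline small images, so the value is only meaningful alongside other signals.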
Summary
Traditional inline and offline phishing detection approaches have limitations in detecting sophisticated phishing attacks that use evasion techniques. In this blog post, we demonstrated how Netskope DL-based Inline Phishing Detection (DL-IPD) has been able to fill the gap and detect previously unknown phishing campaigns. The ability to continuously learn and mine complicated patterns that could signal evasion tactics enhances the coverage of traditional solutions. Advanced Inline Phishing Detection has been globally enabled for all Netskope Standard Threat Protection inline customers since March 2023.
The author would like to acknowledge the significant contributions from Christopher Talampas, Aries De Vera, Maela Angeles and Emmanuel De Vera on this project.
IOC Links
- http://lsodeo[.]dbe[.]gov[.]mm/wp-admin/.Adminser/CTT/signin[.]php
- http://lsodeo[.]dbe[.]gov[.]mm/wp-admin/.Adminser/CTT/wallet[.]php?a8475af7b320248b7381898ca5fbe81485268095
- http://fimdafraquezamuscular[.]site/fu/h50/tab/cdv01/
- https://doutornature[.]com/fu/h50/native/ques1-m[.]php?pagina=cdv01&click_area=obteragora2&_ga=2.180014756.1601786821.1681389141-532695057.1681389141&sf_vid=d3f8521623b974db16e73fd21ae2b714_57&sf_l=690368343
- https://doutornature[.]com/fu/h50/native/presell1-d.php?sf_vid=d3f8521623b974db16e73fd21ae2b714_57&_ga=2.254421640.1601786821.1681389141-532695057.1681389141
- https://pay[.]doutornature[.]com/go/g7w7g?ckp_src=690368620&_ga=2.254421640.1601786821.1681389141-532695057.1681389141&sf_vid=d3f8521623b974db16e73fd21ae2b714_57&sf_l=690368620
- rondureblog[.]com/VD7C6Qr3eJDbumg8kfdzo2LrRqz0mxHtk6qcu2tKCRw/
- https://oneadvupfordesign[.]com/HAj6yqps_tKQK-kQDgSpA3_QiJwwlXAPPwPq1YtK1dM/?cid=ZD4yd7Ing3sAFao7AA9q0wBVNZAAAAAA&sid=46739
- https://clouddoc-authorize[.]firebaseapp[.]com/……xx…/xx…/robots.txt.html
- https://eiinspire[.]com/wp-includes/Text/Diff/Renderer/Diff/index[.]html