Paper Info
- Paper Name: Catching Transparent Phish: Analyzing and Detecting MITM Phishing Toolkits
- Conference: CCS ’21
- Author List: Brian Kondracki, Babak Amin Azad, Oleksii Starov, Nick Nikiforakis
- Link to Paper: here
- Food: Pear and hazelnut torta Caprese
Paper Summary
MITM phishing toolkits are a new type of phishing toolkit that act as malicious reverse proxies between victims and the servers they impersonate. Because the toolkit sits in the middle of the connection, attackers can observe and steal victims’ sensitive information, such as cookies, from the traffic between victims and servers. Unlike traditional phishing toolkits, MITM phishing toolkits let victims keep browsing the impersonated site normally even after the sensitive information has been extracted, which makes anomalies harder for victims to notice and makes these toolkits stealthier than traditional ones.
This paper is the first to systematically investigate MITM phishing toolkits. It identifies their fundamental nature, namely that they are reverse proxies, and reframes the problem of identifying MITM phishing websites as identifying websites that behave like reverse proxies, on the assumption that few benign websites use this architecture. The authors find two general classes of features for this purpose: unusual round-trip time (RTT) ratios observed in the network stack, and the TLS libraries used by the websites. They then collect a large dataset and build a machine-learning model that identifies MITM phishing websites with high accuracy and precision. Since the features used by the model come from the transport and IP layers, the approach does not rely on the content returned by the websites. As a result, it is not affected by the cloaking mechanisms widely used by phishing toolkits.
The authors deployed their tool, called PHOCA, to detect MITM phishing websites in the wild. Over a year-long investigation, PHOCA detected 1,220 verified MITM phishing websites. The authors then studied these sites and found distinctive characteristics: their servers tend to be reused, and they tend to survive longer before being flagged as phishing websites. The authors also collaborated with Palo Alto Networks (PAN) and discovered real-world MITM phishing attacks against its users. The collaboration further showed that PAN’s infrastructure failed to detect 42.4% of the MITM phishing websites discovered by PHOCA, in part because of the cloaking mechanisms used by the phishing websites. Recognizing PHOCA’s capability, PAN is integrating PHOCA into its infrastructure to discover MITM phishing websites in the future.
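The timing intuition can be sketched in a few lines of Python. This is only an illustration and not PHOCA’s actual feature extraction: the names `LayerTimings`, `timing_ratios`, and `measure`, and the choice of exactly these two ratios, are assumptions made for the example. The idea is that a reverse proxy answers the TCP handshake itself but must contact the real server before it can answer an HTTP request, so the HTTP delay tends to be large relative to the TCP handshake delay.

```python
import socket
import ssl
import time
from dataclasses import dataclass


@dataclass
class LayerTimings:
    tcp_rtt: float   # seconds spent on the TCP handshake (connect)
    tls_rtt: float   # seconds spent on the TLS handshake
    http_rtt: float  # seconds from sending a GET to the first response byte


def timing_ratios(t: LayerTimings) -> dict:
    """Express higher-layer delays relative to the TCP handshake delay."""
    base = max(t.tcp_rtt, 1e-9)  # guard against division by zero
    return {
        "tls_over_tcp": t.tls_rtt / base,
        "http_over_tcp": t.http_rtt / base,
    }


def measure(host: str, port: int = 443, timeout: float = 5.0) -> LayerTimings:
    """Time each layer of one HTTPS request (hypothetical helper, needs network)."""
    ctx = ssl.create_default_context()
    t0 = time.perf_counter()
    sock = socket.create_connection((host, port), timeout=timeout)
    t1 = time.perf_counter()
    # Wrapping an already connected socket performs the TLS handshake here.
    tls = ctx.wrap_socket(sock, server_hostname=host)
    t2 = time.perf_counter()
    try:
        request = f"GET / HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
        t3 = time.perf_counter()
        tls.sendall(request.encode("ascii"))
        tls.recv(1)  # block until the first byte of the response arrives
        t4 = time.perf_counter()
    finally:
        tls.close()
    return LayerTimings(tcp_rtt=t1 - t0, tls_rtt=t2 - t1, http_rtt=t4 - t3)
```

Ratios like these would be one input to a classifier rather than a verdict on their own, since slow benign servers and CDNs can also inflate the HTTP delay. Calling `timing_ratios(measure("example.com"))` against a live host would return the two ratios for a single request; repeated measurements would be needed to smooth out network jitter.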
Discussion
The discussion mostly highlighted the positives of the paper and its proposed methods. One concern raised was the number of sites that may have been missed from the data set. Given the data sources the approach relies on, attackers can hide phishing websites by avoiding any appearance on phishing blocklists or in CA/Certificate Transparency logs. This can be done in a variety of ways, including avoiding HTTPS domains or creating benign-looking domains with malicious subdomains. However, this question of coverage does not affect the detection method proposed in the paper.
Another point of discussion was the publication of the detection methods. An attacker could read the paper and modify their phishing toolkits to avoid detection. The paper addresses this scenario by noting that, even after the top 150 most important features were removed, the classifier maintained an accuracy greater than 97%. Also, just as attackers cloak their sites, the detection mechanism itself can be cloaked by inserting delays into the fingerprint collection or by distributing the process across different machines.
One soft criticism of the paper was the absence of a list of the top features that the model uses. It would be interesting to know whether the TLS features or the network timing features are stronger.
Another question concerned the ground truth of the detections. For example, some of the detected phishing websites could have been set up by companies to pen-test their own employees, or could be honeypots run by other researchers.
From the Authors
The authors began their project by exploring open-source phishing toolkits and how attackers actually use them. To do so, they needed a way to detect the toolkits, and they ultimately took inspiration from a master’s thesis they found.
Initially they were able to detect whether a phishing toolkit was being used purely from network features, but they needed extra features to differentiate between toolkits. They found features unique to each toolkit (e.g., Muraena uses a random 5-letter string for the cookie name, and Evilginx can be used as an open proxy). However, reviewers disliked this because of a contradiction the authors had not initially seen: they described network-level features as difficult to bypass, which made their detection solid, yet they were using easily changeable features, such as the Muraena random cookie name, to fingerprint the toolkits. Ultimately, they decided to fall back to a generalized classifier that determines whether a site is a phishing site or not.
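To see why a fingerprint like the random cookie name is brittle, consider a minimal sketch of such a check. The function name `has_five_letter_cookie` and the exact pattern are assumptions made for this illustration, not the authors’ code.

```python
import re

# Assumed pattern: a cookie name made of exactly five ASCII letters.
FIVE_LETTERS = re.compile(r"[A-Za-z]{5}")


def has_five_letter_cookie(set_cookie_headers):
    """Return True if any Set-Cookie header sets a cookie with a five-letter name."""
    for header in set_cookie_headers:
        # The cookie name is everything before the first "=".
        name = header.split("=", 1)[0].strip()
        if FIVE_LETTERS.fullmatch(name):
            return True
    return False
```

A rule like this also matches benign five-letter cookie names, and it stops matching as soon as the toolkit changes the length of its cookie name, which is the weakness the reviewers pointed out.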
However, they still had pushback from the reviewers. An important realization concerned how they framed the paper, in this case how they described the toolkits. Originally they described them as “2FA toolkits”, which confused the reviewers, who were convinced that 2FA, and not the MITM, was the phishing angle. The reviewers were skeptical as to whether the authors were properly detecting the 2FA phishing mechanism, so the authors completely removed the 2FA aspect from the paper and mentioned it only as a benefit of the toolkits, for a better spin.
One question that came up was whether the suspicious TLS requests made during detection could alert the phishers. Out of the box, the TLS fingerprinting generates a lot of connections to determine which library a site is using, and this is what the paper uses, but it could easily be filtered down to just the libraries that the phishing toolkits use.
There were several challenging aspects of this paper. The author briefly mentioned that creating a fingerprinting method for the different toolkits was challenging. Dealing with reviewer feedback also proved difficult. Some reviewers misunderstood aspects of the paper; for example, one reviewer thought that Facebook’s API, which was used to gather phishing URLs, was a detection API for phishing sites. Having a company’s support helped ground the paper. With only a small number of detected sites, reviewers initially did not think this was a real problem, but once they were shown Palo Alto’s data corroborating the findings, they changed their minds.
An overlooked but important part of the paper was naming the framework. The name was created at the end of the paper process; the first two submissions used a generic name, which would have made the framework difficult to refer to in future work.