TL;DR: Crowdsource hacker Luke “hakluke” Stephens documents a tool for discovering the origin host behind a reverse proxy, which is useful for bypassing WAFs and other reverse proxies.
We’ve all been there: you settle down into your lovely comfy office chair with a perfectly warm coffee, ready to start hacking a new web application. You instinctively throw '"><img src=x onerror=alert()> into the search bar, and are immediately greeted with a familiar block page.
You have been blocked by a Web Application Firewall (WAF). In this case, Akamai. What now? Do you quit? No! Hackers don’t quit, they hack!
Common WAF Implementation
The most common WAF implementation that I see places the WAF in front of the application as a reverse proxy, as shown in the diagram below:
The redirection of traffic through the WAF is usually achieved using DNS. The hostname will typically be a CNAME or A record that points to the WAF, and then the WAF can determine which origin (web server) to send the request to based on the Host header.
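As a rough illustration of that last step, the sketch below shows how a reverse proxy might choose an origin from the Host header. The `ORIGINS` table, the hostnames and the `pick_origin` helper are all hypothetical, and the addresses come from the documentation ranges rather than any real deployment.

```python
from typing import Optional

# Hypothetical mapping from public hostname to origin address.
# The addresses are placeholders from the reserved documentation ranges.
ORIGINS = {
    "shop.example.com": "203.0.113.10",
    "blog.example.com": "203.0.113.20",
}


def pick_origin(host_header: str) -> Optional[str]:
    """Return the origin a proxy would forward to for this Host header."""
    # Drop an optional port and normalise case (hostnames are case-insensitive).
    # Simplification: bracketed IPv6 literals are not handled here.
    hostname = host_header.split(":", 1)[0].strip().lower()
    return ORIGINS.get(hostname)
```

The point to notice is that the routing decision depends only on the Host header, which the client controls.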
Here’s the problem: for the WAF to reach the origin web server, that server usually has to be reachable over the internet. Oftentimes, when an origin web server is set up, it can be accessed directly by anyone on the internet, not just the WAF. In this case, all we need to bypass the WAF is the direct IP address of the origin web server. The traffic flow would then look something like this:
Finding the IP address
There are a bunch of different ways to potentially find the origin IP address including:
- Scouring historical DNS records from a service such as SecurityTrails
- Abusing an SSRF in the web application
- Information disclosure (for example, in error messages)
In this blog post, we’ll be exploring a different method. For this example, we will be using tesla.com as the target – but I will not be revealing any sensitive information. A quick dig of the target reveals a few IP addresses.
hakluke$ dig tesla.com +short
Passing these IP addresses to IPInfo reveals that the IP addresses are owned by, you guessed it, Akamai.
If I was looking to find the origin IP address of the tesla.com web server, the first thing I would do is get a list of IP addresses associated with the organization. Again, there are many ways to do this, but one such way is to look up the ASN details of the organization. For demonstration purposes, I used the HackerTarget ASN lookup tool at https://hackertarget.com/as-ip-lookup/. There are a few different organizations with “Tesla” in the name, but “Tesla, US” is the one we are after.
The results reveal a series of IP addresses associated with Tesla.
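ASN lookups usually return address ranges in CIDR notation rather than individual addresses, so the ranges need expanding before each host can be checked. The following is a minimal sketch using Python's standard `ipaddress` module; the `expand_ranges` name is my own, and the sample ranges are documentation addresses, not Tesla's.

```python
import ipaddress
from typing import Iterable, Iterator


def expand_ranges(cidrs: Iterable[str]) -> Iterator[str]:
    """Yield every usable host address in each CIDR block, as a string."""
    for cidr in cidrs:
        # strict=False tolerates input such as 192.0.2.5/30 with host bits set.
        network = ipaddress.ip_network(cidr.strip(), strict=False)
        for host in network.hosts():
            yield str(host)


if __name__ == "__main__":
    # Print one address per line, ready to pipe into a tool that reads stdin.
    for address in expand_ranges(["192.0.2.0/30"]):
        print(address)
```

Because the function is a generator, even large ranges are streamed one address at a time instead of being held in memory.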
Great! Now we have a list of IP addresses, one of which might be the origin server of tesla.com! The next step is to systematically check all of these IP addresses to see if they return the tesla.com website. Unfortunately, there are a few things that make this difficult to do, namely:
- Navigating to the IP address directly may not actually return the correct website because many web servers employ virtual hosts.
- There are thousands of IP addresses to sort through, so it would take too long to do this manually.
- We can’t directly compare the original response with the IP response because many pages will return slightly different responses on every load (for example, because of nonces).
Fortunately, there are solutions to all of these problems!
- We can add a Host header containing the original hostname to every request, which should return the correct website even if we are accessing the IP address directly.
- We can write a tool to do this for us over thousands of hosts (I already have!)
- Instead of comparing responses byte-for-byte, we can use the Levenshtein algorithm to determine similarity!
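To make the last point concrete, here is a small sketch of Levenshtein distance and a similarity check built on it. This is a textbook dynamic-programming version written for illustration, not the code from any particular tool, and the `looks_similar` helper and its threshold of 5 are assumptions chosen for the example.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character edits turning a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the working row as short as possible
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def looks_similar(original: str, candidate: str, threshold: int = 5) -> bool:
    """Treat two responses as the same page if few edits separate them."""
    return levenshtein(original, candidate) <= threshold
```

Two pages that differ only in a short nonce are a handful of edits apart, while an unrelated default page is hundreds or thousands of edits away, so a small threshold separates them well. The cost is quadratic in the response length, which is worth keeping in mind for large pages.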
Hakoriginfinder is a Go tool for discovering the origin host behind a reverse proxy, which makes it useful for bypassing WAFs and other reverse proxies. You supply it with a list of IP addresses (via stdin) along with a hostname, and it makes HTTP and HTTPS requests to every IP address. It then attempts to find the origin host by comparing each response with the response of the real website, using the Levenshtein algorithm to identify similar responses.
You can see the tool here: https://github.com/hakluke/hakoriginfinder
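For readers who want to see the request side of this idea, the sketch below sends a request to a bare IP address while presenting the target hostname in the Host header, using only Python's standard library. It is an illustration of the technique rather than a port of hakoriginfinder; `probe_plan` and `fetch_with_host` are hypothetical names, and the fixed ports 80 and 443 are an assumption.

```python
import http.client
import ssl
from typing import Iterable, Iterator, Tuple


def probe_plan(ips: Iterable[str]) -> Iterator[Tuple[str, bool]]:
    """Yield (ip, use_tls) pairs: each address is tried over HTTP and HTTPS."""
    for ip in ips:
        yield ip, False
        yield ip, True


def fetch_with_host(ip: str, hostname: str, use_tls: bool,
                    timeout: float = 5.0) -> str:
    """Request / from ip while presenting hostname in the Host header."""
    if use_tls:
        # The certificate will not match a bare IP address, so verification
        # is switched off for this probe only.
        context = ssl.create_default_context()
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
        conn = http.client.HTTPSConnection(ip, 443, timeout=timeout,
                                           context=context)
    else:
        conn = http.client.HTTPConnection(ip, 80, timeout=timeout)
    try:
        # Supplying our own Host header replaces the one http.client
        # would otherwise derive from the IP address.
        conn.request("GET", "/", headers={"Host": hostname})
        return conn.getresponse().read().decode("utf-8", errors="replace")
    finally:
        conn.close()
```

One limitation of this sketch: connecting by IP means no SNI hostname is sent during the TLS handshake, so servers that select a virtual host by SNI may still return the wrong site over HTTPS. Only run probes like this against targets you are authorised to test.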
The best remediation for this type of WAF bypass is to whitelist the IP addresses of your WAF provider on the web server. The origin server should not be accessible from anywhere except the WAF, which forces everyone to use the application through the WAF even if they know the origin server’s IP address.
WAF providers will usually list the IP addresses that need to be whitelisted in their documentation. For reference, here are the relevant links for Akamai, Imperva and Cloudflare.
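The check behind such a whitelist is simple, as the sketch below suggests. In practice it would normally be enforced in a firewall or in the web server configuration rather than in application code, and the `WAF_RANGES` values here are placeholder documentation ranges, not any provider's real addresses.

```python
import ipaddress

# Placeholder ranges from the reserved documentation blocks. A real
# deployment would load the ranges published by the WAF provider.
WAF_RANGES = [
    ipaddress.ip_network(cidr)
    for cidr in ("198.51.100.0/24", "203.0.113.0/24")
]


def is_from_waf(client_ip: str) -> bool:
    """Return True only if the connecting address belongs to the WAF."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in WAF_RANGES)
```

Any connection for which `is_from_waf` returns False would be dropped, so knowing the origin IP address no longer helps an attacker.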
Luke Stephens, a.k.a. hakluke. Currently living on the Sunshine Coast in Australia, I recently resigned from my role as the Manager of Training and Quality Assurance for Bugcrowd to start my own consultancy, Haksec. I do a lot of penetration testing and bug bounties and create content for hackers. Check out my YouTube channel.