What is a Prototype Pollution vulnerability and how does page-fetch help?

Detectify, Jun 08, 2021

Prototype Pollution is a problem that can affect JavaScript applications. That includes applications running in web browsers as well as those running under Node.js on the server side, but today we’re going to focus on the web side of things.

We’ll also take a look at page-fetch: a new open source tool released by the Detectify Security Research team that can, amongst other things, help you hunt for prototype pollution issues in the wild!

Prototypes

Before we can talk about Prototype Pollution, we should probably start with what a Prototype is. JavaScript, like many languages, has objects: a set of keys and values grouped together:

var myObject = {
    name: "My super awesome object",
    isAwesome: true,
    rating: 10
}

Those values can be basic types like numbers or strings, but also functions, arrays, or other objects. It’s pretty common in the world of Object Oriented Programming to want one object to be a “descendant” of another; the descendant inheriting the properties of its parent. Let’s look at an example:

var webPage = {
    title: "Detectify Labs",
    navigation: ["/home", "/search", "/contact"],
    content: "Welcome to Detectify Labs!"
}

var blogPost = {
    title: "Detectify Labs",
    navigation: ["/home", "/search", "/contact"],
    content: "This is a super interesting blog post...",
    comments: ["Nice work", "Best thing I ever read!"]
}

These two objects, webPage and blogPost, have a lot in common. The title and navigation properties are both identical! Wouldn’t it be nice if that data didn’t have to be duplicated? The good news is that JavaScript lets us “connect” one object to another.

Let’s rewrite the code above and make it so that the webPage object is the prototype of the blogPost object:

var webPage = {
    title: "Detectify Labs",
    navigation: ["/home", "/search", "/contact"],
    content: "Welcome to Detectify Labs!"
}

var blogPost = {
    content: "This is a super interesting blog post...",
    comments: ["Nice work", "Best thing I ever read!"]
}

blogPost.__proto__ = webPage

Note that we didn’t include the title or navigation parts in the blogPost object this time, but we did do this extra bit:

blogPost.__proto__ = webPage

That __proto__ property is special! It’s a link to the “prototype” of an object, and by assigning the webPage object to it, the webPage object becomes the prototype of the blogPost object. That means if we try to access properties like title or navigation on the blogPost object, JavaScript realises that they don’t exist and so looks at the prototype object’s properties instead:

// This prints "Detectify Labs" to the console - the value came from
// the blogPost object's prototype: the webPage object!
console.log(blogPost.title)

One thing that’s worth noting here: we “overrode” the “content” property. So if we look at the content property of the blogPost object, we’ll see:

// This prints "This is a super interesting blog post..." to the console
console.log(blogPost.content)

Now, if you happen to be a professional JavaScript developer you might be cringing a little, or perhaps even penning an email to let me know that this isn’t the right way to do things – and you’d be right!

MDN covers this better than I could hope to here, but this functionality is technically deprecated, and not really the right way to write modern JavaScript code. For our purposes though, knowing just this way of doing things is what we need in order to understand Prototype Pollution.
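For comparison, here is a minimal sketch of the same relationship written without assigning to __proto__, using the standard Object.create and Object.getPrototypeOf functions. The object contents are copied from the example above; the rest of this post sticks with __proto__ because that is what matters for Prototype Pollution.

```javascript
// The same webPage/blogPost relationship, built with Object.create
// instead of assigning to __proto__.
const webPage = {
    title: "Detectify Labs",
    navigation: ["/home", "/search", "/contact"],
    content: "Welcome to Detectify Labs!"
}

// Object.create returns a new object whose prototype is webPage
const blogPost = Object.create(webPage)
blogPost.content = "This is a super interesting blog post..."
blogPost.comments = ["Nice work", "Best thing I ever read!"]

// title is inherited from the prototype; content is blogPost's own property
console.log(blogPost.title)
console.log(blogPost.content)

// Object.getPrototypeOf reads the same link that __proto__ exposes
const linked = Object.getPrototypeOf(blogPost) === webPage
console.log(linked)

// hasOwnProperty tells inherited properties apart from own ones
const ownsTitle = Object.prototype.hasOwnProperty.call(blogPost, "title")
const ownsContent = Object.prototype.hasOwnProperty.call(blogPost, "content")
console.log(ownsTitle, ownsContent)
```

The last two checks show the distinction the rest of the post relies on: a property can be readable on an object without belonging to that object.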

The Global Object

When JavaScript runs in a web browser, there’s always this thing lurking in the background called the Global Object, and its name is “window”.

When you make a new variable, like this one:

var myVariable = 123

It’s actually a property of the global object:

var myVariable = 123

// This prints "123" to the console
console.log(window.myVariable)

This is relevant to our interests because, by default, all objects share the same prototype (Object.prototype), and the global object inherits from it too. The interesting part for us is that if we are able to set properties on the prototype of one object, we can affect the properties of all other objects, the window object included!

As an example, if we set a property on the prototype of an empty object, that property becomes available on the window object:

var emptyObject = {}
emptyObject.__proto__.myProperty = "my value"

// Prints "my value" to the console
console.log(window.myProperty)

// Also prints "my value" to the console
console.log(myProperty)

// Even this prints "my value" to the console too!
var newObject = {}
console.log(newObject.myProperty)
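The sharing shown above can also be checked without a browser. The following is a small sketch that uses two unrelated plain objects instead of window (an assumption made so it runs under Node.js as well), and removes the polluted property again at the end.

```javascript
// Two unrelated object literals share one prototype: Object.prototype
const first = {}
const second = { name: "unrelated" }
const samePrototype = Object.getPrototypeOf(first) === Object.getPrototypeOf(second)
const isObjectPrototype = Object.getPrototypeOf(first) === Object.prototype
console.log(samePrototype, isObjectPrototype)

// Writing through one object's __proto__ is visible from the other object
first.__proto__.myProperty = "my value"
const seenFromSecond = second.myProperty
console.log(seenFromSecond)

// The value is inherited: second does not own a property with that name
const isOwn = Object.prototype.hasOwnProperty.call(second, "myProperty")
console.log(isOwn)

// Clean up so the rest of the program is not affected
delete Object.prototype.myProperty
```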

Having control over the properties of other objects, especially the window object, gives us a great deal of possibilities for vulnerabilities to occur – and we’ll look at some examples in a while – but first: how could this happen anyway?

Vulnerable Code

One of the most common places for Prototype Pollution to rear its ugly head is in processing the query string. You’re probably fairly used to seeing query strings like this one:

?id=456123&theme=dark&food=cheese

They’re used on just about every web application there is to provide user input. Often they are processed by something like PHP or Ruby on the server-side, but they can be just as easily processed by JavaScript running in the browser too.

A query string like the one above would often be turned into a JavaScript object like this one so the application can more easily refer to the values:

{
    id: 456123,
    theme: "dark",
    food: "cheese"
}

That’s all fine, but sometimes values are a little more complex than that. You might have seen a query string that looks more like this:

?user[id]=456123&user[food]=cheese&theme=dark

This notation lets us group values together; in this case the id and food parameters are “grouped” so that they pertain to a user. Ideally we’d like a query string like that to be turned into a JavaScript object like this:

{
    user: {
        id: 456123,
        food: "cheese"
    },
    theme: "dark"
}

Here’s some code that performs that very transformation:

// This is the object we wish to create :)
const query = {}

const u = new URL(location)
for (const [key, value] of u.searchParams){

	if (!key.includes('[')){
		query[key] = value
		continue
	}

	// We have a key/value pair like k1[k2]=value
	const [k1, k2] = key.split('[').map(kn => kn.replace(']', ''))

	if (query[k1] == undefined){
		query[k1] = {}
	}

	// This could be a problem!
	query[k1][k2] = value
}

Don’t worry too much if you get a little lost in the code. The part we really need to pay attention to is this:

// This could be a problem!
query[k1][k2] = value

Given a query string key/value pair like user[id]=456123, k1 becomes user, k2 becomes id, and value becomes 456123, equivalent to this:

query["user"]["id"] = "456123"

You may be able to see how this could be an issue. If we provided a query string that swapped user for __proto__, the id property would be set on the prototype of the query object – the same one shared by all other objects by default, including the global window object.
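To make that concrete, here is a sketch that wraps the same loop in a function so it can be fed a query string directly. The name parseQuery and the idea of passing the query string as an argument are assumptions made for illustration; the parsing logic itself is unchanged from the code above.

```javascript
// parseQuery is a hypothetical wrapper around the loop shown above.
// It takes a query string instead of reading it from location.
function parseQuery(search) {
	const query = {}

	for (const [key, value] of new URLSearchParams(search)) {
		if (!key.includes('[')) {
			query[key] = value
			continue
		}

		// A key/value pair like k1[k2]=value
		const [k1, k2] = key.split('[').map(kn => kn.replace(']', ''))

		if (query[k1] == undefined) {
			query[k1] = {}
		}

		// When k1 is "__proto__", query[k1] is Object.prototype
		query[k1][k2] = value
	}

	return query
}

// A normal query string produces the nested object we wanted
const normal = parseQuery("?user[id]=456123&user[food]=cheese&theme=dark")
console.log(normal.user.id, normal.user.food, normal.theme)

// Swapping user for __proto__ sets id on the shared prototype instead
parseQuery("?__proto__[id]=456123")
const bystander = {}
const leaked = bystander.id
console.log(leaked)

// Clean up so the rest of the program is not affected
delete Object.prototype.id
```

The bystander object was never passed to the parser, yet it reports an id, which is exactly the cross-object effect described above.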

As for why that could be a problem: let’s add a little more code:

const messages = {
	error: "Error: something went wrong",
	success: "Everything worked as expected :)"
}

if (query.message != undefined){
	document.querySelector('#message').innerHTML = messages[query.message]
}

This is a fairly normal looking bit of code: use a user supplied value to decide which of a predefined list of messages to display. Under normal circumstances it would be perfectly secure too, but that is not the case in the presence of Prototype Pollution.

Because the messages object shares a prototype with the query object, we can pollute it with any value we desire – say, an XSS payload perhaps.

If we were to provide a query string that looked like this for example, we would have the ability to run arbitrary JavaScript in the context of the webpage in question:

?__proto__[payload]=<img%20src%20onerror=alert(document.domain)>&message=payload

The payload parameter pollutes the prototype, making it available on the messages object, and the message parameter is used to choose our payload to be displayed and ultimately have our JavaScript executed.
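The lookup half of that attack can be reproduced in isolation. This sketch simulates the pollution by writing to Object.prototype directly (standing in for the query string parser) and leaves out the DOM write so it runs outside a browser. The lookupOwn helper is a hypothetical addition, included only to contrast an inherited lookup with an own-property lookup.

```javascript
const messages = {
	error: "Error: something went wrong",
	success: "Everything worked as expected :)"
}

// Simulate the effect of ?__proto__[payload]=... on the shared prototype
Object.prototype.payload = "<img src onerror=alert(document.domain)>"

// messages has no payload property of its own, but the lookup still succeeds
const inheritedLookup = messages["payload"]
console.log(inheritedLookup)

// lookupOwn (hypothetical helper) only returns properties the table itself owns
function lookupOwn(table, key) {
	return Object.prototype.hasOwnProperty.call(table, key) ? table[key] : undefined
}

const ownLookupPayload = lookupOwn(messages, "payload")
const ownLookupError = lookupOwn(messages, "error")
console.log(ownLookupPayload, ownLookupError)

// Clean up so the rest of the program is not affected
delete Object.prototype.payload
```

The plain bracket lookup returns the attacker's markup, while the own-property lookup returns undefined for the polluted key and still finds the predefined messages.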

This was just a small example, but there are quite a few JavaScript libraries in the wild that are vulnerable to Prototype Pollution. We’d like to give a big shout-out to Sergey Bobrov for collating a really great list of some of them here.

Detection

“Traditional” web vulnerability scanning (if there is such a thing) tends to work by sending requests to web servers, and analysing the response – be it HTML, JSON, XML, or something else entirely. This can work just fine for vulnerabilities such as reflected XSS, where a user-supplied parameter is reflected in the response without adequate output escaping, because the response changes when the user input changes. This is often not the case for client-side vulnerabilities like DOM XSS and Prototype Pollution.

To detect client-side vulnerabilities we need to use a browser to actually run the JavaScript in a page. It would be fairly time-consuming to do that manually, but thankfully it’s relatively easy to automate such things with a headless browser.


Tools for prototype pollution and client-side vulnerabilities

Of course, Detectify can already find these issues for you if you’re running the Deep Scan DAST scanner, and we’ve already found hundreds of prototype pollution issues for our customers!

If you’re a pentester, bug bounty hunter, or security researcher you’re in luck too. We’ve just released a tool called page-fetch that you can, amongst other things, use to help you look for prototype pollution and other client-side issues. This tool follows up the web scanner, Ugly Duckling, which Detectify released for the ethical hacker community.

Page-fetch

Page-fetch is a fairly simple tool. It takes a list of URLs as its input, fetches them using a headless Chrome browser, and stores a copy of every response it saw – all the JavaScript files, CSS files, images, API requests etc.

Having a local copy of these resources makes it a great deal easier to search through them, use them to build custom word lists or whatever else you might want to do. There are filters to exclude third-party requests, save only third-party requests, and to include or exclude requests based on their content-type.

The bit that’s interesting to us here though, is that you can also provide a snippet of JavaScript that will be executed on every web page that is fetched. The return value of that JavaScript snippet will be displayed in the tool’s output.

Let’s look at how to install the tool and use it to detect a prototype pollution vulnerability.

How it works

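Page-fetch is written in Go. The easiest way to install it is with go get (on Go 1.18 and later, go get no longer installs binaries, so use go install with an @latest version suffix instead):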

▶ go get github.com/detectify/page-fetch

Provided you have Go set up properly, you shouldn’t have to do anything else other than making sure you have a Chrome or Chromium browser installed.

The simplest way to use the tool is to echo a single URL into it, like this:

▶ echo https://detectify.com | page-fetch 
GET https://detectify.com/ 200 text/html; charset=utf-8 
GET https://detectify.com/site/themes/detectify/css/detectify.css?v=1619009098 200 text/css 
GET https://detectify.com/site/themes/detectify/img/detectify_logo_black.svg 200 image/svg+xml 
GET https://fonts.googleapis.com/css?family=Merriweather:300i 200 text/css; charset=utf-8 
...

The output shows every request that was made while rendering the input URL, and the response for each request is stored in a directory called out by default:

▶ tree out
out
├── detectify.com
│   ├── assets
│   │   └── images
│   │       ├── customerlogos
│   │       │   ├── epi_logo_startpage.svg
│   │       │   ├── epi_logo_startpage.svg.meta
│   │       │   ├── grammarly_logo_startpage.svg
│   │       │   ├── grammarly_logo_startpage.svg.meta
...

For each file stored there’s also a .meta file that contains the request and response headers, the HTTP method that was used etc.

To run JavaScript on each page, we can use the -j or --javascript option. As an example, let’s use it to pull out the page title on detectify.com:

▶ echo https://detectify.com | page-fetch -j 'document.querySelector("title").innerText' | grep ^JS
JS (https://detectify.com): Website vulnerability scanner - scan web assets | Detectify

The return value is printed on a line starting with JS so that it’s easy to grep for.

To look for prototype pollution then, all we need to do is pick a payload to try in the query string of our input URL, and then test to see if the value was set as we expected:

▶ echo 'http://poc-tools-storage.s3-website.eu-west-2.amazonaws.com/pp.html?__proto__[testparam]=testval' | page-fetch -j 'window.testparam == "testval"? "vulnerable" : "not vulnerable"' | grep ^JS
JS (http://poc-tools-storage.s3-website.eu-west-2.amazonaws.com/pp.html?__proto__[testparam]=testval): vulnerable

Our test code just checks whether window.testparam is equal to testval: if it is, it returns the string vulnerable, and otherwise it returns not vulnerable. Let’s check a known non-vulnerable page to make sure we aren’t seeing a false-positive:

▶ echo 'https://example.com?__proto__[testparam]=testval' | page-fetch -j 'window.testparam == "testval"? "vulnerable" : "not vulnerable"' | grep ^JS
JS (https://example.com?__proto__[testparam]=testval): not vulnerable

All looks to be well! Example.com doesn’t load any JavaScript at all so we can be pretty confident that no prototype pollution vulnerabilities exist there!
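The check used in the commands above reads the marker from window. A variation, sketched here as an assumption rather than anything page-fetch requires, is to read it from a freshly created object, which tests the shared prototype directly. The helper names isPolluted and report are hypothetical, and the pollution is simulated so the sketch can run outside a browser.

```javascript
// isPolluted (hypothetical helper): a brand new object has no own properties,
// so any value it reports for name must have come from its prototype.
function isPolluted(name, expected) {
	const fresh = {}
	return fresh[name] === expected
}

// report (hypothetical helper) mirrors the strings used in the commands above
function report(name, expected) {
	return isPolluted(name, expected) ? "vulnerable" : "not vulnerable"
}

// Before any pollution the check is negative
const before = report("testparam", "testval")

// Simulate what a vulnerable page would do with ?__proto__[testparam]=testval
Object.prototype.testparam = "testval"
const after = report("testparam", "testval")

// Clean up so the rest of the program is not affected
delete Object.prototype.testparam

console.log(before, after)
```

Assuming the -j option accepts any expression, as in the examples above, the same idea fits on one line: ({}).testparam == "testval" ? "vulnerable" : "not vulnerable".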

Hopefully you’ve now got a pretty good idea of what prototype pollution is, why it matters, and how it can be detected. Until next time!
