Modern PHP Security Part 1: bug classes

August 13, 2020

In this blog post, security researcher and Detectify Crowdsource hacker Thomas Chauchefoin (@swapgs) and fellow security researcher Lena David (@_lemeda) discuss modern bug classes in PHP. Both researchers work at Synacktiv, a French company highly skilled in the field of offensive security (penetration testing, reverse engineering, etc.). This is part 1 in their guest blog series.

Part 2 covers breaching and hardening the PHP Engine.

Introduction

Despite its reputation, PHP still has a very active ecosystem (packages, enterprise-grade CMSs, huge performance gains in recent versions). If you are a security auditor, you will most likely have to compromise PHP applications, often deployed the DevOps way.

The goal of this blog series is to describe a few modern bug classes and provide actionable mitigations for them, show how to circumvent PHP's built-in security features, and finally explain how to harden your deployments.

Modern vulnerability classes

SQL injection

This class of vulnerability arises when user-controlled input can be injected into SQL queries in a way that makes them deviate from their intended behavior. The consequences range from the mere retrieval of database records to the ability to execute commands on the underlying system. Although the use of PDO and of the ORMs that come with frameworks has made SQLIs less common than they used to be, they have not disappeared just yet, as shown by recent examples in Magento (double preparing of part of a query under specific conditions, cf. https://www.ambionics.io/blog/magento-sqli) and Laravel Query Builder (see https://stitcher.io/blog/unsafe-sql-functions-in-laravel and https://freek.dev/1317-an-important-security-release-for-laravel-query-builder).

The latter example is explained in more detail in the linked articles, but the principle is as follows: Laravel Query Builder provides a handy arrow notation to make it easy to query JSON data, and takes care of transforming statements using that notation into the corresponding SQL queries. As an example, a statement such as the following:

$query = DB::table('users')->addSelect('biography->en');

will be transformed into a query looking like:

SELECT json_extract(`biography`, '$."en"') FROM users;

The code responsible for carrying out this transformation relies on several Laravel functions, including the following:

protected function wrapJsonPath($value, $delimiter = '->')
{
    return '\'$."'.str_replace($delimiter, '"."', $value).'"\'';
}

In particular, note that the above function does not escape single quotes. Thus, a single quote in the arrow notation can be used to close the json_extract call and alter the semantics of the resulting query.

Suppose the query, as defined in the PHP code, looks like the following:

$query = DB::table('users')->addSelect("biography->{$_GET['lang']}");

By setting the lang parameter to **"'), (select @@version) FROM users#, the resulting SQL query becomes:

SELECT json_extract(`biography`, '$."**"'), (select @@version) FROM users#"') FROM users;

which is equivalent to the following, since everything after # is a comment:

SELECT json_extract(`biography`, '$."**"'), (select @@version) FROM users;
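The transformation can be reproduced outside Laravel with a minimal sketch. wrapJsonPath is copied from the snippet quoted above; buildJsonSelect is a hypothetical, heavily simplified stand-in for the query builder (it is not Laravel's actual code) and reuses the biography column from the first example:

```php
<?php
// Copy of the helper quoted above, made standalone for the demonstration.
function wrapJsonPath($value, $delimiter = '->')
{
    return '\'$."' . str_replace($delimiter, '"."', $value) . '"\'';
}

// Hypothetical stand-in for the query builder: splits "column->path"
// and wraps the path with the helper above. Not Laravel's real code.
function buildJsonSelect(string $selector): string
{
    [$column, $path] = explode('->', $selector, 2);

    return 'SELECT json_extract(`' . $column . '`, ' . wrapJsonPath($path) . ') FROM users;';
}

// Benign selector: the path ends up inside the quoted JSON path.
echo buildJsonSelect('biography->en'), PHP_EOL;

// Attacker-controlled value: the single quote closes the JSON path early.
$lang = '**"\'), (select @@version) FROM users#';
echo buildJsonSelect('biography->' . $lang), PHP_EOL;
```

The second line printed contains the injected sub-select outside of the json_extract call, because nothing in the path handling neutralizes the single quote.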

As a fix, the query builder now only allows alphanumeric characters and _ in column names (https://github.com/spatie/laravel-query-builder/commit/3aa483b63c79d9fabcb8653fe837a7736eb93bea).

This has also been fixed in Laravel 5.8.11 by ensuring that single quotes are properly escaped in the affected code (cf. https://github.com/laravel/framework/commit/be1896cbeb2e413615fb61791101f8b199e1bf3d). Additionally, a warning has been added to the Laravel documentation:

“PDO does not support binding column names. Therefore, you should never allow user input to dictate the column names referenced by your queries, including “order by” columns, etc. If you must allow the user to select certain columns to query against, always validate the column names against a white-list of allowed columns.”

The generic way of avoiding SQLIs in one's application is to never use user input directly when building queries. Instead, rely on stored procedures or on prepared statements with variable binding, so that the provided parameters are treated as data rather than as SQL code and cannot affect the query's intended behavior.
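Both pieces of advice (binding values, allowlisting identifiers that cannot be bound) can be sketched with plain PDO. The table, the column names, and the functions safeOrderColumn and findUsersByLang are hypothetical and only serve the illustration:

```php
<?php
// Hypothetical allowlist of columns users may sort by.
const SORTABLE_COLUMNS = ['id', 'name', 'created_at'];

// Column names cannot be bound, so fall back to a default
// unless the requested name is explicitly allowed.
function safeOrderColumn(string $requested, string $default = 'id'): string
{
    return in_array($requested, SORTABLE_COLUMNS, true) ? $requested : $default;
}

// Values go through a bound parameter; the only interpolated part
// is a column name that came out of the allowlist.
function findUsersByLang(PDO $pdo, string $lang, string $orderBy): array
{
    $column = safeOrderColumn($orderBy);
    $stmt = $pdo->prepare("SELECT id, name FROM users WHERE lang = :lang ORDER BY $column");
    $stmt->execute([':lang' => $lang]);

    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}
```

With this shape, a request such as ?sort=name;DROP TABLE users simply sorts by id, and the lang value never becomes part of the SQL text.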

Deserialization via phar://

This bug class is not new but was made popular by Sam Thomas (@_s_n_t) when he spoke at BlackHat US 2018. It is based on the fact that the headers of PHP Phar archives (think of JAR archives, but for PHP) have a field containing data serialized with PHP's native format, as described in "Phar File Format".

[Figure: Phar archive]

This serialization format allows encoding object instances. During the deserialization process, the engine will call the "magic" method __wakeup on the fresh instance, and then __destruct once the last reference to the instance is dropped. This behavior can sometimes be used to craft call chains with interesting behavior (see https://www.ei.ruhr-uni-bochum.de/media/emma/veroeffentlichungen/2014/09/10/POPChainGeneration-CCS14.pdf for a great paper about such chains).

It is very unlikely to find an application that loads an arbitrary Phar archive using Phar::loadPhar, but there is another way in, and that's where the magic happens. Internally, the engine often uses an abstraction named streams in I/O-related functions. It makes it easier to work with remote resources, compression, etc., in a unified and extensible way.
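The order in which these magic methods fire can be observed with a small sketch. The Demo class and its log are hypothetical; the serialized string is written by hand so that no temporary instance is created beforehand:

```php
<?php
// Hypothetical class that records which magic methods the engine calls.
class Demo
{
    public static $log = [];
    public $name = 'demo';

    public function __wakeup()
    {
        self::$log[] = 'wakeup';
    }

    public function __destruct()
    {
        self::$log[] = 'destruct';
    }
}

// Native serialization of a Demo instance with one public property.
$payload = 'O:4:"Demo":1:{s:4:"name";s:4:"demo";}';

$object = unserialize($payload); // the engine calls __wakeup here
unset($object);                  // last reference dropped: __destruct runs

print_r(Demo::$log);             // wakeup first, then destruct
```

No method is called explicitly by the script: both calls are a side effect of deserializing data, which is what makes gadget chains possible when that data is attacker-controlled.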

The available stream handlers can be listed with stream_get_wrappers(), and we can notice the presence of phar://:

php > var_dump(stream_get_wrappers());
array(12) {
  [0]=>
  string(5) "https"
  [1]=>
  string(4) "ftps"
  [2]=>
  string(13) "compress.zlib"
  [3]=>
  string(14) "compress.bzip2"
  [4]=>
  string(3) "php"
  [5]=>
  string(4) "file"
  [6]=>
  string(4) "glob"
  [7]=>
  string(4) "data"
  [8]=>
  string(4) "http"
  [9]=>
  string(3) "ftp"
  [10]=>
  string(4) "phar"
  [11]=>
  string(3) "zip"
}

Passing a URL starting with phar:// (and pointing to an actual archive planted on the server, even if its extension is not .phar) to any function that internally uses streams (e.g. a function calling php_stream_open_wrapper_ex()) triggers the deserialization process when the archive is loaded. Such functions are very common, even where you wouldn't expect them; below are some examples of vulnerable code patterns:

if(file_exists($_GET['file'] .'tpl')) { ... }

if(filesize($_GET['file']) > 1000) { ... }

if(md5_file($_GET['file']) == ...) { ... }

To be clear, the only requirement is to control the prefix of the path passed to one of the vulnerable functions, in order to force the use of phar://; the suffix is not an issue.
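Since everything hinges on controlling the prefix, one possible defensive pattern is to refuse anything that looks like a stream URL and to always prepend a fixed base directory. The sketch below assumes a hypothetical templates/ directory next to the script; hasStreamWrapperPrefix and templateExists are illustrative names, not existing PHP functions:

```php
<?php
// True when the path starts with something that looks like a stream wrapper
// (phar://, php://, zip://, ...) or with the data: scheme.
function hasStreamWrapperPrefix(string $path): bool
{
    return preg_match('#^[a-z0-9.+-]+://#i', $path) === 1
        || stripos($path, 'data:') === 0;
}

// Hypothetical template lookup: the attacker never controls the prefix,
// because a fixed base directory always comes first.
function templateExists(string $name): bool
{
    if (hasStreamWrapperPrefix($name)) {
        return false;
    }

    return file_exists(__DIR__ . '/templates/' . basename($name) . '.tpl');
}
```

The fixed base directory is what actually removes the attacker's control over the prefix; the explicit check mainly makes the intent obvious and allows rejecting such requests early.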

For a nice example of real-life exploitation of such bugs, check out the report of HackerOne submission #410237 by @stevenseeley. An attentive reader will notice that SMB shares can be used on Windows servers to avoid having to plant a file on the local filesystem first 😉

If your environment does not require Phar archives (e.g. you run Composer from the CLI but not through the web server), you can call stream_wrapper_unregister('phar'); to prevent attackers from using this wrapper.
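As a minimal sketch, the call can be placed early in the application's bootstrap code (where exactly is an assumption that depends on your application), guarded so that it only runs when the wrapper is registered:

```php
<?php
// Unregister phar:// for the current request if it is currently registered.
if (in_array('phar', stream_get_wrappers(), true)) {
    stream_wrapper_unregister('phar');
}

// The wrapper no longer appears in the list of available stream wrappers.
var_dump(in_array('phar', stream_get_wrappers(), true)); // bool(false)
```

The change only applies to the current request, so the call has to run on every request that may reach vulnerable code.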

Arbitrary instantiation

Meta-programming allows, among other things, instantiating a class by its name. This feature is very powerful and is often used in application routers to map parts of the URL to controllers and automatically call a given method (e.g. reaching /foo/bar will instantiate the class Foo and call the method bar on it).

It is not obvious whether meta-programming can lead to security issues. In fact, it can, and the engine even exposes interesting classes by default!

The exploitation of arbitrary instantiations is very similar to that of arbitrary deserializations, except that you should look for interesting chains starting with __construct and __destruct rather than __wakeup and __destruct (other magic methods may be involved depending on method calls, use of attributes, etc.). Beware of the number of arguments passed to the constructor, though, as a mismatch may raise an exception.

A first interesting class is finfo, a wrapper around libmagic, which is used to guess a file's format. The constructor takes two arguments, the second pointing to the magic database to use. If this parameter does not point to a valid database, the contents of the file it points to will be printed in PHP Notices:

php> new finfo(0,'/etc/passwd');
PHP Notice: finfo::finfo(): Warning: offset `root:x:0:0:root:/root:/bin/bash' invalid in php shell code on line 1 
PHP Notice: finfo::finfo(): Warning: offset `daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin' invalid in php shell code on line 1
(...)

SimpleXMLElement is another interesting finding. It accepts between 1 and 5 arguments, and the first one will automatically be parsed as an XML document by the constructor. Parsing XML documents is notoriously unsafe if the processing of external entities is not disabled. For PHP, it mostly depends on the version of libxml linked to the interpreter. This is still actionable in real-life scenarios, as demonstrated by RIPSTech with their Shopware <= 5.3.4 exploit (https://blog.ripstech.com/2017/shopware-php-object-instantiation-to-blind-xxe/).

Depending on the interactions on the instance, you may be able to find other chains. For instance, SplFileObject takes between 1 and 4 arguments and will allow disclosing the first line of a file if the instance is echo’ed:

php> echo new SplFileObject('/etc/passwd');
root:x:0:0:root:/root:/bin/bash

It should be noted that SplFileObject internally uses a stream, so stream wrappers (php://, phar://, …) can be used, as mentioned in this write-up by pwntester: https://www.pwntester.com/blog/2014/01/17/hackyou2014-web400-write-up/.

It would be really nice to have a tool like PHPGGC but for arbitrary instantiation vulnerabilities. This is “left as an exercise to the reader” 😉

As is often the case, the best way to avoid introducing such vulnerabilities is to explicitly list the allowed classes and check against that list before instantiating anything.
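A minimal sketch of such a list for a router follows. The controller classes, the CONTROLLERS map, and resolveController are all hypothetical names used for illustration:

```php
<?php
// Hypothetical controllers reachable from the URL.
final class HomeController
{
    public function index(): string
    {
        return 'home';
    }
}

final class BlogController
{
    public function index(): string
    {
        return 'blog';
    }
}

// Explicit map from URL segment to class: nothing else can be instantiated.
const CONTROLLERS = [
    'home' => HomeController::class,
    'blog' => BlogController::class,
];

function resolveController(string $name): object
{
    if (!array_key_exists($name, CONTROLLERS)) {
        throw new InvalidArgumentException('Unknown controller');
    }

    $class = CONTROLLERS[$name];

    // No user-controlled constructor arguments are passed either.
    return new $class();
}
```

With this approach, a request naming finfo, SimpleXMLElement, or SplFileObject is rejected before any constructor runs, since the user input is only ever used as a key of the map.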

Server-side request forgery

Server-side request forgery occurs when a third party controls a URL that will be accessed by the application. This commonly results in the user being able to either scan and map the internal network or reach internal resources not intended to be public.

In the context of cloud environments, another potential consequence consists in the user being able to access information about the server’s configuration. Indeed, metadata about the running instance can often be accessed using endpoints available solely on a link-local IP address — or, put differently, solely from the running instance and in particular not from any user’s computer.

For instance, such endpoints include the following in the case of Amazon EC2:

  • http://169.254.169.254/latest/meta-data/
  • http://169.254.169.254/latest/user-data/

Similarly, metadata about a DigitalOcean Droplet are available at: http://169.254.169.254/metadata/v1.json.

An SSRF on a cloud-hosted application will typically allow retrieving elements such as NPM secrets or SSH keys — and, depending on the specific context, various other secrets which are, needless to say, intended to remain private — thus defeating the very purpose of exposing the aforementioned endpoints only locally. It should be noted that lately, some cloud hosts like AWS have implemented new means to make SSRF attacks far more difficult to carry out.

Another interesting situation occurs when an application suffers from an SSRF and is run using FastCGI (as is the case for most PHP applications). A typical setup consists of using nginx alongside PHP-FPM, as already described last year in another Detectify Blog post. The latter listens either on a local TCP port (typically 9000) or on a UNIX socket (/var/run/php-fpm*.sock). The general idea is that when the web server receives an HTTP request for a PHP script, it encodes it in the FastCGI format (which basically consists of headers, the request's context, and then its body) and forwards it to PHP-FPM. PHP-FPM then starts a PHP worker, runs the script pointed to by the request, and sends the result of the execution back to the web server within a FastCGI response.

Now, if an SSRF affects the server, it becomes possible to reach the listening service directly, and thus to have an arbitrary PHP script executed. For instance, the following snippet (Client.php can be found in https://github.com/adoy/PHP-FastCGI-Client) generates a valid FastCGI payload that will call the script /var/www/index.php (use nc -lvp 1234 to dump the request):
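To make the "headers, context, body" layout more concrete, here is a rough sketch of how a FastCGI request can be serialized by hand, based on the record layout of the FastCGI specification. The helper names fcgiRecord and fcgiNameValue are made up for this illustration, and the sketch ignores padding and records longer than 65535 bytes:

```php
<?php
// Record types used by a simple request (values from the FastCGI specification).
const FCGI_BEGIN_REQUEST = 1;
const FCGI_PARAMS = 4;
const FCGI_STDIN = 5;

// One record = 8-byte header + content. Header fields: version, type,
// request id (16 bits), content length (16 bits), padding length, reserved.
function fcgiRecord(int $type, string $content, int $requestId = 1): string
{
    return pack('CCnnCC', 1, $type, $requestId, strlen($content), 0, 0) . $content;
}

// Name-value pair: each length takes 1 byte if below 128,
// otherwise 4 bytes with the high bit set.
function fcgiNameValue(string $name, string $value): string
{
    $encodeLength = function (int $n): string {
        return $n < 128 ? chr($n) : pack('N', $n | 0x80000000);
    };

    return $encodeLength(strlen($name)) . $encodeLength(strlen($value)) . $name . $value;
}

// Begin-request body: role 1 (responder), flags 0, five reserved bytes.
$request = fcgiRecord(FCGI_BEGIN_REQUEST, pack('nC', 1, 0) . str_repeat("\0", 5))
    // Context: the same kind of variables a web server would send.
    . fcgiRecord(FCGI_PARAMS,
        fcgiNameValue('SCRIPT_FILENAME', '/var/www/index.php')
        . fcgiNameValue('REQUEST_METHOD', 'GET'))
    . fcgiRecord(FCGI_PARAMS, '')  // an empty PARAMS record ends the context
    . fcgiRecord(FCGI_STDIN, '');  // an empty STDIN record ends the (empty) body

echo bin2hex($request), PHP_EOL;
```

Nothing in this format authenticates the sender, which is why being able to reach the PHP-FPM listener matters so much.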

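<?php
// Client.php comes from https://github.com/adoy/PHP-FastCGI-Client
require 'Client.php';

// Point the client at the netcat listener so that the raw request gets dumped
$client = new Client('localhost:1234', -1);

echo $client->request(
    // Context of the request, as a web server would send it
    array(
        'GATEWAY_INTERFACE' => 'FastCGI/1.0',
        'REQUEST_METHOD' => 'GET',
        'SCRIPT_FILENAME' => '/var/www/index.php',
        'SERVER_ADDR' => '127.0.0.1',
        'SERVER_PORT' => '80',
        'SERVER_NAME' => 'test',
        'SERVER_PROTOCOL' => 'HTTP/1.1',
        'CONTENT_LENGTH' => 0,
    ),
    // No request body
    null
);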

It is also possible to put php.ini directives in the context of the FastCGI request, and in particular to set auto_prepend_file to php://input and to put PHP code directly in the request’s body to have it executed on the server. An example can be found at https://github.com/tarunkant/Gopherus/blob/master/scripts/FastCGI.py. It should however be noted that the entry SCRIPT_FILENAME has to point to an existing file.

In practice, the feasibility of this depends on whether the wrapper in use can handle raw data. For instance, the FastCGI.py script linked above relies on the gopher wrapper, which should be fine with any data submitted. Conversely, this may not be the case with the http:// wrapper, which may have issues handling payloads containing characters such as new lines (%0a).

Avoiding SSRFs involves different approaches depending on the kind of resource that is to be fetched. More specifically, if the requested resource is intended to be internal (for instance, another application on an adjacent server), an allowlist can be used to define which resources can be accessed. If, conversely, the requested resource is intended to be external, it will likely not be feasible to use an allowlist.

In that case, one should at least ensure that the URL or IP address provided by the user does not resolve to an internal host; even though this is generally not considered optimal, a blocklist may be considered here. It is important to note that it is not sufficient to resolve the requested domain only once, before actually issuing the request. Indeed, a second resolution can occur when the request is sent, and the resulting IP address may be different the second time.
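The two approaches can be sketched as follows. The host name in ALLOWED_INTERNAL_HOSTS and the three function names are hypothetical, and the resolution helper only handles IPv4 since it relies on gethostbynamel():

```php
<?php
// Hypothetical allowlist of internal services the application may call.
const ALLOWED_INTERNAL_HOSTS = ['reports.internal.example'];

// Internal resources: exact match on scheme and host.
function isAllowedInternalUrl(string $url): bool
{
    $parts = parse_url($url);
    if (!is_array($parts) || !isset($parts['scheme'], $parts['host'])) {
        return false;
    }

    return in_array(strtolower($parts['scheme']), ['http', 'https'], true)
        && in_array(strtolower($parts['host']), ALLOWED_INTERNAL_HOSTS, true);
}

// External resources: reject private and reserved ranges
// (loopback, link-local metadata addresses, RFC 1918 networks).
function isPublicIp(string $ip): bool
{
    return filter_var(
        $ip,
        FILTER_VALIDATE_IP,
        FILTER_FLAG_NO_PRIV_RANGE | FILTER_FLAG_NO_RES_RANGE
    ) !== false;
}

// Resolve once, validate every returned address, and return one address
// that the HTTP client should then be forced to connect to.
function resolveToPublicIp(string $host): ?string
{
    $ips = gethostbynamel($host); // IPv4 only, false on failure
    if ($ips === false || $ips === []) {
        return null;
    }
    foreach ($ips as $ip) {
        if (!isPublicIp($ip)) {
            return null;
        }
    }

    return $ips[0];
}
```

To avoid the second resolution described above, the validated address has to be the one actually used for the connection; with the cURL extension, this can be done by passing a "host:port:address" entry through CURLOPT_RESOLVE. Redirects need the same treatment, either by disabling them or by validating each hop.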

Server-side template injection

SSTIs happen when user-supplied data is unsafely embedded into a server-side template: if the user input contains a template expression, the latter will be executed when the template is rendered. This generally leads to code execution on the underlying server.

Such a vulnerability was found in the SEOmatic plugin for Craft CMS in 2018, and resulted from the way canonical URLs were built and the fact they were reflected in a response header. Practically, using internals of the Craft CMS, it was possible to retrieve elements such as the password granting access to the underlying database.

The main way to safeguard one's application against this class of vulnerability is to avoid using data supplied by users directly when creating templates, and to pass user input as parameters to the template instead. It should also be noted that some templating engines, such as Mustache, are designed not to allow code execution, which also helps reduce the risk of such security issues occurring.
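The difference between the two usages can be illustrated with a toy renderer. The render function below is hypothetical and is not a real templating engine: it only replaces {{ name }} placeholders with values from an array, which is enough to show where user input should go:

```php
<?php
// Toy renderer for illustration only: replaces {{ name }} with $vars['name'].
function render(string $template, array $vars): string
{
    return preg_replace_callback(
        '/\{\{\s*(\w+)\s*\}\}/',
        function (array $match) use ($vars): string {
            return htmlspecialchars((string) ($vars[$match[1]] ?? ''), ENT_QUOTES);
        },
        $template
    );
}

$context = ['secret' => 's3cr3t'];   // data the template has access to
$userInput = '{{ secret }}';         // attacker-supplied "name"

// Unsafe: user input becomes part of the template source and gets evaluated.
$unsafe = render('Hello ' . $userInput, $context);

// Safe: user input is only ever a parameter, so it is rendered as plain text.
$safe = render('Hello {{ name }}', ['name' => $userInput] + $context);

echo $unsafe, PHP_EOL; // Hello s3cr3t
echo $safe, PHP_EOL;   // Hello {{ secret }}
```

With a real engine, the expression language is much richer than a variable lookup, which is how the unsafe variant typically escalates from information disclosure to code execution.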

This article is a transcription of the content we presented at Sec4Dev 2020 and Detectify Hacker School Online #9. If you missed these talks or if you prefer blog posts instead of videos, enjoy!

Continue to Part 2, where we discuss engine bugs and how to harden your systems.

About the authors:

Lena David (@_lemeda) and Thomas Chauchefoin (@swapgs) are security researchers working at Synacktiv. Thomas is also a Detectify Crowdsource hacker. They are both interested in the security of web technologies and have practiced it over several years of penetration testing and red team engagements.

