XPath injection

Need

Implementation of proper input validation and sanitization to prevent XPath injection attacks.

Context

  • Usage of PHP 5.0+ for server-side scripting and web development
  • Usage of DOMDocument for XML parsing and manipulation
  • Usage of DOMXPath for querying XML documents using XPath expressions

Description

Non compliant code

public function searchUsers($searchTerm)
{
    $xml = new DOMDocument;
    $xml->load('users.xml');
    $xpath = new DOMXPath($xml);
    $users = $xpath->query("//user[contains(name,'$searchTerm')]");

    foreach ($users as $user) {
        echo $user->nodeValue, PHP_EOL;
    }
}

This function searchUsers($searchTerm) is designed to search for users in an XML file based on a search term provided by the user. The function uses the DOMDocument and DOMXPath classes to load the XML file and execute an XPath query.

The XPath query is "//user[contains(name,'$searchTerm')]", which searches for <user> elements whose <name> child element contains the search term.

The problem here is that the $searchTerm variable is inserted directly into the XPath query without any sanitization or validation. An attacker can supply a search term containing a quote character that closes the string literal early, so the rest of the input is parsed as XPath syntax rather than as text. This changes the logic of the query and lets the attacker read nodes the function was never meant to return. This is known as an XPath Injection vulnerability.

For example, an attacker could provide the search term ') or ('1'='1, which would result in the XPath query "//user[contains(name,'') or ('1'='1')]". Both sides of the or are true for every node (every string contains the empty string, and '1'='1' always holds), so the query returns all <user> elements regardless of their <name>, effectively bypassing the search term restriction.

The vulnerability could be even more dangerous if the XML document contained sensitive data and the attacker knew the structure of the document, as they could craft an XPath query to access this data.
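The effect of the injected search term can be reproduced with a small in-memory document. This is only a sketch: the sample XML and the buildQuery helper are assumptions made for illustration, standing in for users.xml and for the string interpolation in the function above.

```php
<?php
// Hypothetical sample data standing in for users.xml.
$xmlSource = <<<XML
<users>
  <user><name>alice</name></user>
  <user><name>bob</name></user>
  <user><name>carol</name></user>
</users>
XML;

$doc = new DOMDocument();
$doc->loadXML($xmlSource);
$xpath = new DOMXPath($doc);

// Illustrative helper: same interpolation as the non-compliant function.
function buildQuery($searchTerm)
{
    return "//user[contains(name,'$searchTerm')]";
}

// A benign term matches only the names that contain it.
$benign = $xpath->query(buildQuery('ali'));

// The injected term closes the string literal and appends an always-true clause:
// //user[contains(name,'zzz') or ('1'='1')]
$injected = $xpath->query(buildQuery("zzz') or ('1'='1"));

echo $benign->length, PHP_EOL;   // 1
echo $injected->length, PHP_EOL; // 3
```

Although no name contains "zzz", the injected query matches every <user> element because the appended clause is always true.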

Steps

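  • Avoid building XPath queries by concatenating user input. Prefer parameterized XPath queries where the API supports variable binding.
  • If dynamic XPath queries are unavoidable, sanitize or validate the input before it reaches the query, for example by encoding it as a safe XPath string literal or by rejecting input that contains characters outside an allowlist.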
  • Use a safe API which avoids the use of the interpreter entirely or provides a parameterized interface.
  • Use least privilege principle when connecting to XML databases. Only use the permissions necessary to perform the operation.
  • Use a web application firewall to detect and block XPath injection attacks.
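One way to follow the first step is to keep the XPath expression constant and apply the search term in PHP, so the input never reaches the XPath parser. The sketch below assumes a small in-memory document in place of users.xml; the function name searchUserNames is hypothetical.

```php
<?php
// Hypothetical sample data standing in for users.xml.
$xmlSource = '<users><user><name>alice</name></user><user><name>bob</name></user></users>';

/**
 * Returns the names that contain $searchTerm. The XPath expression is a
 * constant string, so user input is never parsed as XPath.
 */
function searchUserNames(DOMDocument $doc, $searchTerm)
{
    $xpath = new DOMXPath($doc);
    $matches = array();

    foreach ($xpath->query('//user/name') as $nameNode) {
        $name = $nameNode->nodeValue;
        // An empty term matches every name, mirroring XPath contains().
        if ($searchTerm === '' || strpos($name, $searchTerm) !== false) {
            $matches[] = $name;
        }
    }

    return $matches;
}

$doc = new DOMDocument();
$doc->loadXML($xmlSource);

print_r(searchUserNames($doc, 'ali'));           // only "alice" matches
print_r(searchUserNames($doc, "') or ('1'='1")); // no match: treated as plain text
```

The trade-off is that every <name> node is visited in PHP, which is reasonable for small documents but may be slower than a selective query on large ones.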

Compliant code

public function searchUsers($searchTerm)
{
    $xml = new DOMDocument;
    $xml->load('users.xml');
    $xpath = new DOMXPath($xml);
    // Encode the input as an XPath string literal instead of interpolating it raw
    $literal = $this->toXPathLiteral($searchTerm);
    $users = $xpath->query("//user[contains(name,$literal)]");

    foreach ($users as $user) {
        echo $user->nodeValue, PHP_EOL;
    }
}

private function toXPathLiteral($value)
{
    if (strpos($value, "'") === false) {
        return "'" . $value . "'";
    }
    if (strpos($value, '"') === false) {
        return '"' . $value . '"';
    }
    // Both quote types occur: join the apostrophe-free pieces with concat()
    return "concat('" . str_replace("'", "',\"'\",'", $value) . "')";
}

The original code was vulnerable to XPath injection because it inserted $searchTerm directly into the XPath query without any sanitization or validation. Any quote character in the input could close the string literal, and the rest of the input would then be parsed as part of the XPath expression, potentially leading to unauthorized data access.

The fixed code mitigates this vulnerability by encoding $searchTerm as an XPath string literal before building the query. XPath 1.0 has no escape character inside string literals, so the toXPathLiteral helper chooses a delimiter that does not occur in the input: single quotes when the input contains no apostrophe, double quotes when it contains no double quote, and a concat() call that joins apostrophe-free pieces when it contains both.

This way, even if $searchTerm contains quotes or XPath syntax, the whole value is treated as plain text to match against and is never evaluated by the XPath engine, thus preventing XPath injection.
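As a self-contained sketch of how encoding input as an XPath string literal neutralizes the attack, the script below runs a benign term, the injection payload, and a name containing both quote types against an in-memory document. The helper names xpathLiteral and countMatches and the sample data are assumptions made for illustration.

```php
<?php
/**
 * Hypothetical helper: encodes $value as an XPath 1.0 string literal.
 * XPath 1.0 has no escape character, so the delimiter is chosen to
 * avoid any quote that occurs in the value.
 */
function xpathLiteral($value)
{
    if (strpos($value, "'") === false) {
        return "'" . $value . "'";
    }
    if (strpos($value, '"') === false) {
        return '"' . $value . '"';
    }
    // Both quote types occur: join the apostrophe-free pieces with concat().
    return "concat('" . str_replace("'", "',\"'\",'", $value) . "')";
}

// Hypothetical helper: counts <user> elements whose <name> contains $term.
function countMatches(DOMXPath $xpath, $term)
{
    return $xpath->query('//user[contains(name,' . xpathLiteral($term) . ')]')->length;
}

// Hypothetical sample data standing in for users.xml.
$doc = new DOMDocument();
$doc->loadXML('<users><user><name>alice</name></user><user><name>O\'Neil "Red"</name></user></users>');
$xpath = new DOMXPath($doc);

echo countMatches($xpath, 'ali'), PHP_EOL;               // 1
echo countMatches($xpath, "') or ('1'='1"), PHP_EOL;     // 0, payload is matched as plain text
echo countMatches($xpath, "O'Neil \"Red\""), PHP_EOL;    // 1, both quote types handled via concat()
```

The payload that previously returned every user now matches nothing, because it is compared against the names as an ordinary string.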

In addition to this code-level mitigation, it's also recommended to use a web application firewall to detect and block XPath injection attacks, and to follow the principle of least privilege when connecting to XML databases.

References