XPath injection

Need

Prevention of XPath injection attacks

Context

  • Usage of TypeScript for type-checking and static typing in JavaScript development
  • Usage of Express for building web applications and APIs
  • Usage of xpath for parsing and querying XML documents
  • Usage of xmldom for parsing and manipulating XML documents

Description

Non compliant code

import express from 'express';
import xpath from 'xpath';
import dom from 'xmldom';

const app = express();

app.get('/search', (req, res) => {
  const { query } = req.query;

  const xml = `
    <users>
      <user>
        <name>Alice</name>
        <age>25</age>
      </user>
      <user>
        <name>Bob</name>
        <age>30</age>
      </user>
    </users>
  `;

  const doc = new dom.DOMParser().parseFromString(xml);
  const select = xpath.useNamespaces({ 'ns': 'http://www.w3.org/2005/Atom' });

  const result = select(`//user[name[contains(text(), '${query}')]]`, doc);

  res.json(result);
});

app.listen(3000, () => {
  console.log('Server is running on port 3000');
});

The vulnerability in the provided code is XPath injection. XPath injection occurs when dynamic XPath statements are generated without proper data validation. In this code, the user-supplied input from the query parameter is directly interpolated into the XPath query without any validation or sanitization.

The vulnerable code section is:

const result = select(`//user[name[contains(text(), '${query}')]]`, doc);

Here, the value of the query parameter is directly embedded into the XPath query without any validation. An attacker can exploit this by manipulating the query parameter to inject their own XPath expressions.

For example, an attacker can send the query parameter x') or ('1'='1, which results in the following XPath query:

//user[name[contains(text(), 'x') or ('1'='1')]]

The injected or clause is always true, so the predicate matches every name element and the query returns all users instead of only those whose name contains the search term. XPath is a read-only query language, so the impact is data disclosure rather than modification or deletion.

This vulnerability allows an attacker to change the structure of the XPath expression, potentially leading to unauthorized access to any information stored in the XML document being queried.

To mitigate this vulnerability, user input must never become part of the expression text. Validate the input on the server side and pass it to the query as an XPath variable, or strictly escape it, instead of interpolating it.
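The effect of interpolation can be seen without running an XPath engine at all. The following minimal sketch uses a hypothetical helper, buildVulnerableQuery, that mirrors the template string from the vulnerable handler; the payload is only one illustrative example of input that closes the string literal early.

```typescript
// Hypothetical helper that mirrors the interpolation in the vulnerable handler.
function buildVulnerableQuery(query: string): string {
  return `//user[name[contains(text(), '${query}')]]`;
}

// A benign search term stays inside the string literal.
const benign = buildVulnerableQuery('Ali');

// A hostile term closes the literal and appends its own boolean clause.
const hostile = buildVulnerableQuery("x') or ('1'='1");

console.log(benign);  // //user[name[contains(text(), 'Ali')]]
console.log(hostile); // //user[name[contains(text(), 'x') or ('1'='1')]]
```

In the second expression the text after the first closing quote is parsed as XPath syntax rather than as data, which is the core of the problem.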

Steps

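  • Perform input validation on the server-side to ensure that the 'query' parameter is safe to use in the XPath statement, for example by checking its type, length and allowed characters.
  • Use parameterized queries (XPath variables) to supply user input to the XPath statement instead of concatenating or interpolating it into the expression.
  • Implement error handling for exceptions that may occur during the XPath evaluation, and avoid returning internal error details to the client.
  • Consider whether XPath is needed at all. For simple lookups, parsing the XML into plain objects (for example with xml2js) and filtering in application code avoids building XPath expressions from user input. Note that xml2js and xmlbuilder do not evaluate XPath, so they offer no XPath-specific protection by themselves.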
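The validation step can be sketched as an allowlist check. The pattern and the name isValidSearchTerm below are assumptions chosen for illustration (letters, digits, spaces and hyphens, 1 to 64 characters); a real application would pick the character set that fits its data.

```typescript
// Hypothetical allowlist: letters, digits, spaces and hyphens, 1 to 64 characters.
const SEARCH_TERM_PATTERN = /^[A-Za-z0-9 -]{1,64}$/;

// Type guard: accepts only strings that match the allowlist.
// Express may hand over arrays or objects for repeated or nested
// query parameters, so the type check comes first.
function isValidSearchTerm(value: unknown): value is string {
  return typeof value === 'string' && SEARCH_TERM_PATTERN.test(value);
}

console.log(isValidSearchTerm('Alice'));
console.log(isValidSearchTerm("x') or ('1'='1"));
```

Because quotes and parentheses are not in the allowlist, input that passes this check cannot terminate a string literal. The trade-off is that legitimate values such as O'Brien are rejected as well, which is one reason to prefer variable binding over validation alone.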

Compliant code

import express from 'express';
import xpath from 'xpath';
import dom from 'xmldom';

const app = express();

app.get('/search', (req, res) => {
  const { query } = req.query;

  // Perform input validation on the server-side
  if (!query || typeof query !== 'string') {
    return res.status(400).json({ error: 'Invalid query parameter' });
  }

  const xml = `
    <users>
      <user>
        <name>Alice</name>
        <age>25</age>
      </user>
      <user>
        <name>Bob</name>
        <age>30</age>
      </user>
    </users>
  `;

  const doc = new dom.DOMParser().parseFromString(xml, 'text/xml');
  const namespaces = { ns: 'http://www.w3.org/2005/Atom' };

  let result;
  try {
    // Parameterized query: the user input is bound to the XPath variable $query,
    // so it is always treated as a string value and never parsed as XPath syntax.
    // If the installed xpath typings do not declare parse(), a cast may be needed.
    const evaluator = xpath.parse('//user[name[contains(text(), $query)]]');
    const nodes = evaluator.select({ node: doc, namespaces, variables: { query } });
    // xmldom nodes reference their parent and owner document (circular),
    // so serialize each match to an XML string before sending it as JSON.
    result = nodes.map((node) => node.toString());
  } catch (error) {
    return res.status(500).json({ error: 'Internal server error' });
  }

  res.json(result);
});

app.listen(3000, () => {
  console.log('Server is running on port 3000');
});

The fixed code addresses the vulnerability by validating the input on the server side and by passing it to the XPath engine as a variable instead of interpolating it into the expression. Here's an explanation of the changes made:

  1. The code imports the necessary modules: express for creating the server, xpath for executing XPath queries, and xmldom for parsing XML documents.

  2. The server is created using express() and assigned to the app variable.

  3. An endpoint /search is defined using app.get(), which listens for GET requests.

  4. Inside the endpoint handler, the query parameter is extracted from the request query string using req.query.

  5. Input validation is performed on the server-side to ensure that the query parameter is present and of type string. If the validation fails, a 400 Bad Request response is sent back to the client with an error message. This check rejects missing values as well as the arrays or objects that Express can produce for repeated or nested parameters. On its own it does not prevent injection.

  6. A sample XML document containing user information is defined as a multi-line string.

  7. The XML document is parsed using dom.DOMParser().parseFromString(xml, 'text/xml') to create a document object model (DOM) representation.

  8. A namespace mapping is defined, with the prefix 'ns' bound to 'http://www.w3.org/2005/Atom', and is passed to the evaluator. The sample document declares no namespaces, so the mapping is not strictly required here.

  9. Inside a try-catch block, the expression is compiled with xpath.parse(). It contains the placeholder $query rather than the user's text, so the structure of the expression is fixed before any input is seen.

  10. The compiled expression is evaluated using evaluator.select(), which receives the document as the context node and the user input through the variables option. The engine treats the value strictly as a string, so quotes or operators inside it cannot change the query.

  11. If any error occurs during the execution of the XPath query, a 500 Internal Server Error response is sent back to the client with a generic error message.

  12. Finally, the matching nodes, serialized as XML strings, are sent back to the client as a JSON response using res.json(result).

  13. The server is started and listens on port 3000, and a message is logged to the console.

By validating input on the server-side and binding it as an XPath variable instead of building the expression from user input, the fixed code prevents XPath injection attacks and ensures the safe execution of XPath queries.
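Where an XPath engine offers no way to bind variables, a common fallback is to quote the value as an XPath string literal before placing it in the expression. XPath 1.0 has no escape character inside string literals, so the sketch below switches quote style when it can and falls back to concat() when the value contains both kinds of quote. The name toXPathLiteral is a hypothetical helper, not part of the xpath package.

```typescript
// Hypothetical helper: returns an XPath 1.0 expression that evaluates to `value`.
function toXPathLiteral(value: string): string {
  // No single quotes: a single-quoted literal is safe.
  if (!value.includes("'")) {
    return `'${value}'`;
  }
  // Single quotes but no double quotes: use a double-quoted literal.
  if (!value.includes('"')) {
    return `"${value}"`;
  }
  // Both kinds of quote: split on single quotes and rejoin the pieces
  // with a double-quoted single quote inside concat().
  const parts = value.split("'").map((part) => `'${part}'`);
  return `concat(${parts.join(`, "'", `)})`;
}

console.log(toXPathLiteral('Alice'));   // 'Alice'
console.log(toXPathLiteral("O'Brien")); // "O'Brien"
console.log(toXPathLiteral(`a'b"c`));   // concat('a', "'", 'b"c')
```

The helper returns the complete literal including its quotes, so the expression template must not add quotes of its own, for example `//user[name[contains(text(), ${toXPathLiteral(query)})]]`. Variable binding remains the preferred option when the library supports it, because it keeps user input out of the expression text entirely.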

References