What is XPath
Many web applications use XML (Extensible Markup Language) to store and transport data in a format that is both human-readable and machine-readable. It is often used to separate data from presentation.
XSLT (Extensible Stylesheet Language Transformations) is the W3C-recommended stylesheet language for XML. It is used to transform an XML document into other formats, such as HTML.
XPath is a major element of XSLT. It is a query language used to navigate through an XML document and select the nodes that hold the required information.
To give an example, let’s consider this XML document:
<?xml version="1.0" encoding="UTF-8"?>
<title lang="en">Learn Hacking</title>
In a modern browser, you can load the XML document using an XMLHttpRequest object:
var xmlhttprequest = new XMLHttpRequest();
The following call then evaluates an XPath expression, held in the variable xpath, against the loaded document to select the title of the book:
xmlDoc.evaluate(xpath, xmlDoc, null, XPathResult.ANY_TYPE, null);
What is an XPath Injection Attack
Let’s understand this with an example.
Suppose we have an authentication system on a web page that takes a username and password from the user and uses XPath to look up the matching user in the following XML document.
<?xml version="1.0" encoding="utf-8"?>
Suppose it builds the following XPath query to look up the user:
FindUserXPath = "//User[UserName/text()='" & Request("Username") & "' and Password/text()='" & Request("Password") & "']"
An attacker can then submit a malicious username and password that make the query select XML nodes without knowing any valid credentials. For example:
Username: blah' or 1=1 or 'a'='a
Password: blah
Because "and" binds more tightly than "or" in XPath, FindUserXPath becomes logically equivalent to:
//User[(UserName/text()='blah' or 1=1) or
('a'='a' and Password/text()='blah')]
Because the first part of the predicate is always true, the password check becomes irrelevant and the predicate holds for every User node, including the admin. The query therefore succeeds without valid credentials, and the application may authenticate the attacker or reveal sensitive information from the server, which the attacker can exploit for malicious purposes. An application that builds XPath queries from unvalidated input in this way is vulnerable to an XPath injection attack.
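The string concatenation behind this attack can be reproduced in a few lines. The following is a minimal sketch in JavaScript, not the application's actual code: buildFindUserXPath is a hypothetical helper that mirrors the FindUserXPath concatenation, and the "alice" credentials are made up for illustration.

```javascript
// Hypothetical helper mirroring the FindUserXPath concatenation:
// user input is pasted directly between the single quotes.
function buildFindUserXPath(username, password) {
  return "//User[UserName/text()='" + username +
         "' and Password/text()='" + password + "']";
}

// A normal login attempt produces the intended query.
const normalQuery = buildFindUserXPath("alice", "s3cret");
console.log(normalQuery);
// //User[UserName/text()='alice' and Password/text()='s3cret']

// The malicious username closes the quoted string early and
// appends its own conditions to the predicate.
const injectedQuery = buildFindUserXPath("blah' or 1=1 or 'a'='a", "blah");
console.log(injectedQuery);
// //User[UserName/text()='blah' or 1=1 or 'a'='a' and Password/text()='blah']
```

The second query contains the "or 1=1" condition even though the application never intended the user to supply any XPath syntax.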
To prevent XPath injection, follow these guidelines:
Use a parameterized XPath interface whenever possible.
If the XPath query must be constructed dynamically, escape the user inputs properly.
In a dynamically constructed XPath query, if you use quotes to delimit untrusted input, make sure to escape that quote character in the untrusted input so that the input cannot break out of the quoted part. For example, if a single quote (') is used to delimit the input username, replace every single quote (') in that input with the XML-encoded version of the character, &apos;.
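The escaping step described above might look like the following JavaScript sketch. The helper names escapeSingleQuotes and buildEscapedFindUserXPath are hypothetical, and the sketch assumes every untrusted value is placed inside a single-quoted literal.

```javascript
// Hypothetical helper: XML-encode single quotes so the input cannot
// close the single-quoted literal it is placed in.
function escapeSingleQuotes(untrusted) {
  return String(untrusted).replace(/'/g, "&apos;");
}

// Same concatenation as the vulnerable query, but every untrusted
// value is escaped before it is inserted between the quotes.
function buildEscapedFindUserXPath(username, password) {
  return "//User[UserName/text()='" + escapeSingleQuotes(username) +
         "' and Password/text()='" + escapeSingleQuotes(password) + "']";
}

const escapedQuery = buildEscapedFindUserXPath("blah' or 1=1 or 'a'='a", "blah");
console.log(escapedQuery);
// //User[UserName/text()='blah&apos; or 1=1 or &apos;a&apos;=&apos;a' and Password/text()='blah']
```

The only single quotes left in the result are the four that delimit the two literals, so the malicious text stays inside the username string. Note that this only encodes the quotes; a legitimate value that contains an apostrophe may no longer match the stored data unless it is handled the same way.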
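A precompiled XPath query is preferable where one is available. Its structure is fixed before any user input is supplied, and the inputs are passed in as parameters, so they are treated as data and no character that should have been escaped can be missed.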