"Codenomicon Lab's core focus is to empower the decision makers to provide better quality software and better quality products"

Codenomicon Labs

XML Security and Fuzzing

This document collects together discussions related to XML security and fuzzing.

The official place for updates regarding XML vulnerabilities:

Fuzzing overview:

Codenomicon Ltd.:

Introduction

XML has come a long way from the days when it provided support for just a few applications. It is now used to a significant extent by all the main specifiers of protocols, and in some (but not all) cases is the dominant specification language. Usually the notation is associated with use of the XML-defined encoding, with a few exceptions.

For a long time academics have been writing about the robustness aspects of XML. An example of such an article can be seen here:

http://www.ibm.com/developerworks/java/library/j-fuzztest.html

In fact, some people still believe that XML will make communication protocols and file formats more reliable. The reality is simply that, until now, there have been no good XML fuzzers.

Common misunderstandings about XML resilience to attacks:

  • XML formats make protocols/files resistant to broken messages because an XML parser assumes nothing about the input
  • XML format is carefully and formally defined, and therefore any deviation is automatically detected
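The gap in the second assumption can be illustrated with a minimal sketch. It uses Python's standard xml.etree.ElementTree parser; the element names (order, item) and the sizes are made up for illustration. A well-formedness check catches the syntactically broken document, but it happily accepts documents that are perfectly legal XML and still nothing like what an application expects.

```python
import xml.etree.ElementTree as ET


def is_well_formed(data: bytes) -> bool:
    """Return True if the parser accepts the document, False if it rejects it."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


# Hypothetical documents, for illustration only.
malformed = b"<order><item>1</order>"                                 # mismatched tag
repeated = b"<order>" + b"<item>1</item>" * 10000 + b"</order>"      # legal, but 10000 items
oversized = b"<order><item>" + b"9" * 1_000_000 + b"</item></order>"  # legal, but a 1 MB field

for name, doc in [("malformed", malformed), ("repeated", repeated), ("oversized", oversized)]:
    print(name, is_well_formed(doc))
```

Only the first document is rejected. Whatever code handles the other two, in the parser or in the application behind it, has to cope with them on its own.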

XML fuzzing

This is similar to our earlier work with ASN.1 vulnerabilities (PROTOS SNMP case in 2001-2002), the major improvement being to add the XML capability to our proprietary fuzzing framework. Early this year (2009) we released some of our first XML-based tools to the market and used XML fuzzing technology against a set of open source XML implementations. The result was that once again, everything broke.

XML fuzzing takes XML message structures and alters them in ways beyond imagination. Breaking the encoding, repetition of tags and elements, dropping of tags and elements, recursive structures, overflows, special characters, and many, many other methods will easily corrupt XML parsing and XML-based protocol communications. The result is a denial-of-service situation, corruption of data, or maybe even a situation where hostile code can be executed on the vulnerable host.
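A few of the anomaly types listed above can be sketched in a handful of lines of Python. This is a toy illustration only, not the Codenomicon framework: the sample message, the function names and the chosen sizes are all assumptions made for the example.

```python
# Hypothetical sample message used as the starting point for all anomalies.
SAMPLE = "<msg><to>alice</to><body>hello</body></msg>"

# Characters that have special meaning in XML or in code handling strings.
SPECIAL = "<&]]>'\"\x00"


def repeat_element(doc: str, element: str, count: int) -> str:
    """Repetition: replace the first occurrence of `element` with `count` copies."""
    return doc.replace(element, element * count, 1)


def drop_close_tag(doc: str, tag: str) -> str:
    """Dropping: remove the first closing tag named `tag`."""
    return doc.replace(f"</{tag}>", "", 1)


def nest_deeply(tag: str, depth: int) -> str:
    """Recursive structure: `depth` levels of the same element inside itself."""
    return f"<{tag}>" * depth + f"</{tag}>" * depth


def overflow_text(doc: str, old: str, length: int) -> str:
    """Overflow: replace the first occurrence of `old` with a very long string."""
    return doc.replace(old, "A" * length, 1)


def inject_special(doc: str, old: str) -> str:
    """Special characters: replace the first occurrence of `old` with SPECIAL."""
    return doc.replace(old, SPECIAL, 1)


def break_encoding(data: bytes, position: int) -> bytes:
    """Broken encoding: insert a byte that is never valid in UTF-8."""
    return data[:position] + b"\xff" + data[position:]


def generate_cases(sample: str) -> dict:
    """Build one test case per anomaly type, as bytes ready to send or save."""
    cases = {
        "repeat": repeat_element(sample, "<to>alice</to>", 1000),
        "drop": drop_close_tag(sample, "body"),
        "nest": nest_deeply("msg", 10000),
        "overflow": overflow_text(sample, "hello", 100000),
        "special": inject_special(sample, "hello"),
    }
    encoded = {name: doc.encode("utf-8") for name, doc in cases.items()}
    encoded["encoding"] = break_encoding(sample.encode("utf-8"), 5)
    return encoded
```

A real fuzzer combines and varies such anomalies systematically across every element of the message; the sketch produces just one case of each kind.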

Is the problem in a specific protocol using XML, or in individual XML parsers?

Like ASN.1, XML is not a protocol by itself, but a method for describing structures. It is used in a wide range of modern protocols and file formats.

These flaws are caused by a set of problematic inputs that can be given to any application using XML. Each vendor (library, programming language, parser, application using XML) will have different flaws. This is exactly like the times when flaws were found in ASN.1. There, too, it was not a single flaw in ASN.1 itself: each vendor had their own set of flaws, and those flaws could be triggered by ASN.1-aware fuzzers.
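Because the same inputs produce different failures in different implementations, each implementation has to be run against the test cases separately. A minimal harness sketch in Python follows. It assumes the parser under test can be driven from a small child program (here Python's own xml.etree.ElementTree, chosen only as a stand-in); the exit code convention and the timeout value are assumptions of this example.

```python
import subprocess
import sys

# Child program: parse whatever arrives on stdin with the parser under test.
# Exit code 0 means the input was accepted, 3 means it was cleanly rejected.
# Any other exit status (unexpected exception, death by signal) is suspicious.
CHILD = r"""
import sys
import xml.etree.ElementTree as ET
try:
    ET.fromstring(sys.stdin.buffer.read())
except ET.ParseError:
    sys.exit(3)
"""


def run_case(data: bytes, timeout: float = 10.0) -> str:
    """Feed one test case to the parser in a separate process and classify the outcome."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", CHILD],
            input=data,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return "hang"      # possible denial of service
    if proc.returncode == 0:
        return "accepted"
    if proc.returncode == 3:
        return "rejected"  # the parser detected the anomaly and failed cleanly
    return "failure"       # crash or unexpected error: worth investigating
```

Running the parser in its own process means a crash or hang in the parser does not take the harness down with it. For example, run_case(b"<a/>") returns "accepted" and run_case(b"<a>") returns "rejected"; the interesting cases are the ones that come back as "hang" or "failure".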

Is this and that product/library vulnerable to these issues?

This study is part of the CROSS project at Codenomicon, and therefore only open source XML libraries and projects have been tested. So far, all of the open source libraries we have tested have failed against XML fuzzing. Since finding the first set of vulnerabilities in early 2009, we have worked hard with CERT-FI to report the issues to the affected projects, and they have been working to distribute their patches to downstream users and vendors using their code.

We cannot discuss the security of commercial XML products or library versions within the CROSS project, as the project is intended to benefit the open source community only.

Exploitability?

Codenomicon does not release exploitation details. Codenomicon never focuses on the actual exploitation of flaws, so any discussion regarding exploitability of these issues should be taken to individual vendors or CERT-FI. However, time and experience have shown that with most flaws, exploitation is just a matter of time and effort. Fixing the issue and patching affected systems is what really matters.

Level of Exploitability: depends on the vulnerable software, see below...

Ease of Exploitability: also depends on the vulnerable software, see below...

Targets: Anything that uses XML

XML processing in applications is almost always implemented through the use of XML libraries. If a C library is used, any flaw can easily result in malicious code execution. Unfortunately, most libraries out there are written in C, and thus errors such as stack buffer overflows are not that uncommon. When this is the case, exploitability depends on the anti-exploitation features of the platform (ASLR, DEP, NX bits, canaries etc.).

The attack interface depends on where XML is used. If the library is used to process files, then the flaw is most probably local only. If, on the other hand, the library is used in communications software, the flaw is immediately reachable remotely.

Authentication or encryption mechanisms might be useless from a protection perspective, because many of the flaws are in the XML parsing itself, not in the data processing (which can also fail under XML fuzzing). Using authentication, signing or encryption only narrows the set of people who can exploit the flaw. For example, if a service requires authentication before XML can be uploaded, an attacker needs either to become a customer of the service or to steal someone else's credentials.

How to fix the issues?

Currently, the only thing you can do is to monitor for updates from your selected vendors, and once updates are available, install them.

The effectiveness of the currently available updates depends on how XML libraries are used. A programming language can include XML support itself. An application can link an XML library statically or dynamically. People also implement their own proprietary XML parsers. Only time will tell how significant an impact these XML flaws will have, or whether anyone will even notice them until their individual piece of software is finally tested with an XML-capable fuzzer.

There are thousands if not millions of more or less critical applications using XML. No matter how much publicity these issues receive, it is probable that we can never reach all developers using XML - unfortunately.

Note that after fixing generic XML parsing issues, application-specific ones will still remain. A parser might not handle a generic XML file far enough to hit a flaw: the generic file might not contain the right kinds of fields, the expected namespaces or schema etc. Tailoring the fuzzing to the interface and implementation practically always uncovers more issues.
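Tailoring can be sketched as starting from a valid, application-specific sample and placing anomalies into each of its fields in turn, so that the structure, element names and namespace stay what the application expects. The sample message, its namespace (urn:example:orders) and the anomaly list below are hypothetical; the code uses Python's standard xml.etree.ElementTree.

```python
import xml.etree.ElementTree as ET

# Hypothetical application message; a real one would be captured from the interface under test.
SAMPLE = '<req xmlns="urn:example:orders"><id>42</id><note>hi</note></req>'

# A few illustrative field values: empty, overlong, negative, format string, markup-like.
ANOMALIES = ["", "A" * 4096, "-1", "%n%n%n", "<![CDATA["]


def tailored_cases(sample: str):
    """Yield one document per (element, anomaly) pair, keeping the rest of the sample valid."""
    count = len(list(ET.fromstring(sample).iter()))
    for index in range(count):
        for anomaly in ANOMALIES:
            root = ET.fromstring(sample)              # fresh copy for every case
            list(root.iter())[index].text = anomaly   # overwrite one field only
            yield ET.tostring(root, encoding="unicode")


cases = list(tailored_cases(SAMPLE))
print(len(cases), "test cases")
```

Every generated document is still well-formed and still carries the expected namespace, so it gets past generic parsing and reaches the application code that interprets the fields.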

Application developers using the vulnerable libraries should review their software and update their libraries, rebuild if needed, release a patch, and inform their users if they are vulnerable.

The main thing here is not the flaws themselves, but understanding the fact that XML does not protect you from parser errors. Worse, using XML can make you even more vulnerable due to its complexity.

Where is XML used?

XML is used everywhere. XML is used in cloud computing, web applications, mobile applications, 3D images, documents, instant messaging...

XML is used in many remote communication facilities (XML-RPC, SOAP, and XMPP) as well as in various document formats (docx, OpenOffice, playlists, configuration files, SVG vector graphics, RSS feeds, semantic web formats, you name it, we play it). Thus, there are many vectors for remote attack, such as sending malicious documents or network requests. Targets are similarly varied, from server components (request brokers, XML firewalls, schema checkers) to client applications (office software, Flash, ...).

If you were to wave a magic wand and eliminate from the world all communications that are encoded using XML-defined values, disaster would certainly strike on a scale far beyond any that the most pessimistic had described for possible effects of the Y2K (year 2000) computer bugs. Aircraft would collide, mobile phones would cease to work, virtually all telecoms and network switches would be unmanageable and unmaintainable and would gradually die, electric power distribution systems would cease to work, and to look a little further ahead before we wave our magic wand, smart-card-based electronic transactions would fail to complete and your washing machine might fail to work! But worst of all, your selected social networking media would suddenly collapse and your life would become a misery!

Since web services and XML over HTTP have been the favored integration mechanism for legacy and new systems in service-oriented architecture, XML parsing is a critical item in almost all systems and all areas of the economy. The analyzed weaknesses are in the core languages and frameworks commonly used for web technologies. Therefore, any system can be vulnerable.

Impacted business sectors are:

Banking

  • External web service interfaces (WSI) for money transfers (corporate system integration)
  • External WSI e-billing integration
  • Stock information systems
  • Intranet systems

Manufacturing

  • In all sectors, through logistics chain integration of suppliers and customers through WSI
  • Intranet systems

Retail

  • In all sectors, through logistics chain integration of suppliers and customers through WSI
  • Consumer E-commerce companies providing WS interfaces (shopping)
  • Intranet systems

Health Care

  • Patient Data systems integration
  • Intranet systems

Government

  • Tax information systems integration
  • Intranet systems

Electric/Gas/Water Network Companies

  • Remote meter reading
  • Network/Device management
  • Intranet systems

How will CROSS help going forward?

The Codenomicon CROSS team continues to report any flaws that it finds in open source libraries to CERT-FI. Please check this page and the CERT-FI web site for updates on the process.

So far Sun, Apache and Python have announced fixes for the vulnerabilities found. Codenomicon and CERT-FI are currently also working with other projects to fix further vulnerabilities. The CERT-FI advisory and our web pages will be updated as announcements come from other projects.

The affected libraries are extremely popular and are used by almost all software, system and OS vendors globally. This means that an extremely large number of products depend on just a small set of XML libraries.

Let us know if there are other open source libraries that you think should be tested.

Other XML implementations

A list of web services platforms that may or may not be impacted:

http://en.wikipedia.org/wiki/List_of_Web_service_Frameworks