WS-Addressing, EPR comparison, Exclusive Canonicalization and QNames in content

The WS-Addressing WG is considering how to compare Endpoint References (EPRs), specifically the [reference parameters] property. The [reference parameters] of an EPR are XML elements, so they really need to be canonicalized for comparison to work. Canonicalization plays a major part in XML Digital Signature, so I'm going to explore that for a while before coming back to EPR comparison.
 
There are currently two widely implemented canonicalization algorithms: Canonical XML (XMLC14N) and Exclusive XML Canonicalization (EXCC14N). The former, also known as Inclusive XML Canonicalization, works fine for XML where the context for canonicalization doesn't change. The latter was designed to deal with canonicalization where the context might change. In both cases, context essentially means [in-scope namespaces] (see the XML Information Set specification for a definition of this property). In general, when computing signatures for elements in SOAP messages, EXCC14N is what you want, because the signature does not break (a false negative) if extra namespace declarations get added on ancestor elements. Where EXCC14N does suffer, however, is when QNames or other data that relies on [in-scope namespaces], such as XPath expressions, appear in element or attribute content. Consider the following SOAP message:
 
<soap:Envelope xmlns:soap='http://www.w3.org/2003/05/soap-envelope'
   xmlns:wsa='http://schemas.xmlsoap.org/ws/2004/08/addressing'
   xmlns:m='http://example.org/weather'
   xmlns:a='http://example.net/aggregator' >
  <soap:Header>
   <wsa:Action>http://example.org/2005/01/Report</wsa:Action>
   <wsa:To>http://example.org/services/weather</wsa:To>
   <wsa:ReplyTo>
    <wsa:Address>http://example.net/aggregator</wsa:Address>
    <wsa:ReferenceParameters>
     <a:ReporterId>45</a:ReporterId>
     <a:Filter>m:Report/@Temp > 30</a:Filter>
     <a:Filter>m:Report/@Wind > 10</a:Filter>
    </wsa:ReferenceParameters>
   </wsa:ReplyTo>
  </soap:Header>
  <soap:Body>
   <m:Weather Temperature='Celsius' WindSpeed='Knots' />
  </soap:Body>
</soap:Envelope>
 
Each a:Filter element has 4 [in-scope namespaces] declared in the message, for the prefixes soap, wsa, a and m (the Infoset also always includes the implicit xml prefix). Only one of these prefixes, a, is what the EXCC14N spec calls 'visibly utilized' by an a:Filter element, that is, used as the prefix of the element or of one of its attributes. If we were canonicalizing the wsa:ReplyTo element (not an unreasonable assumption given we'll later on be talking about EPR comparison) then two prefixes, wsa and a, are 'visibly utilized' within the fragment. All other namespace prefixes (and their declarations) are ignored for the purposes of EXCC14N. The soap and m prefixes are therefore not included in the output of the algorithm, and so are not protected by a digest (and associated signature) over the wsa:ReplyTo element.
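
To make this concrete, here is a sketch of what EXCC14N with no prefix list should produce for the wsa:ReplyTo element of the message above. It assumes the input is exactly that element and its descendants, with whitespace as shown in the message. The a prefix is declared on each element that uses it, since no output ancestor has already rendered it, and the declarations for soap and m appear nowhere, even though the m prefix is still sitting in the a:Filter content.

```xml
<wsa:ReplyTo xmlns:wsa="http://schemas.xmlsoap.org/ws/2004/08/addressing">
    <wsa:Address>http://example.net/aggregator</wsa:Address>
    <wsa:ReferenceParameters>
     <a:ReporterId xmlns:a="http://example.net/aggregator">45</a:ReporterId>
     <a:Filter xmlns:a="http://example.net/aggregator">m:Report/@Temp &gt; 30</a:Filter>
     <a:Filter xmlns:a="http://example.net/aggregator">m:Report/@Wind &gt; 10</a:Filter>
    </wsa:ReferenceParameters>
   </wsa:ReplyTo>
```

Nothing in these octets records which namespace URI the prefix m was bound to, so a digest computed over them cannot notice if that binding changes.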
 
If someone changes the namespace declaration for the prefix m, perhaps to http://example.org/otherweatherservice, then the meaning of the XPath expressions in the a:Filter elements could change, perhaps to refer to a report that gives temperature in Fahrenheit and wind speed in kilometers per hour, thus potentially luring this unsuspecting dinghy sailor out to sail in conditions far too cold and calm to be much fun. An XML digital signature would not detect such a change when using EXCC14N. The upshot of all this is that modifying an apparently unused namespace declaration can potentially change the meaning of some content, and that such a modification will go unnoticed if EXCC14N is used.
 
So why not use XMLC14N instead? Indeed, in this case, XMLC14N would work just fine and would catch the change to the mapping for the prefix m, thus rendering such an attack ineffective. However, in the world of SOAP, intermediaries may add (or remove) SOAP headers, and the [in-scope namespaces] for a given XML element may change as a result. For example, the above message could have an additional header (and namespace declaration) added, as follows:
 
<soap:Envelope xmlns:soap='http://www.w3.org/2003/05/soap-envelope'
   xmlns:wsa='http://schemas.xmlsoap.org/ws/2004/08/addressing'
   xmlns:m='http://example.org/weather'
   xmlns:a='http://example.net/aggregator'
   xmlns:wsse='http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-secext-1.0.xsd' >
  <soap:Header>
   <wsa:Action>http://example.org/2005/01/Report</wsa:Action>
   <wsa:To>http://example.org/services/weather</wsa:To>
   <wsa:ReplyTo>
    <wsa:Address>http://example.net/aggregator</wsa:Address>
    <wsa:ReferenceParameters>
     <a:ReporterId>45</a:ReporterId>
     <a:Filter>m:Report/@Temp > 30</a:Filter>
     <a:Filter>m:Report/@Wind > 10</a:Filter>
    </wsa:ReferenceParameters>
   </wsa:ReplyTo>
   <wsse:Security>
   . . .
   </wsse:Security>
  </soap:Header>
  <soap:Body>
   <m:Weather Temperature='Celsius' WindSpeed='Knots' />
  </soap:Body>
</soap:Envelope>
 
Such an addition would cause XMLC14N to return a different canonical form for the wsa:ReplyTo element than it would for the first XML example, because the wsse namespace declaration is now in scope and is therefore written into the output. This happens despite the fact that the element has not actually been tampered with and has the same meaning in both messages.
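
A sketch of the difference, assuming that only the wsa:ReplyTo subtree is being canonicalized: under XMLC14N every in-scope namespace declaration is emitted on the apex element, sorted by prefix, so the two messages would yield start tags along the following lines. The line breaks inside the start tags are added here for readability (the real canonical form separates the declarations with single spaces), the element content is elided, and the comments are annotations rather than part of the output.

```xml
<!-- first message -->
<wsa:ReplyTo xmlns:a="http://example.net/aggregator"
             xmlns:m="http://example.org/weather"
             xmlns:soap="http://www.w3.org/2003/05/soap-envelope"
             xmlns:wsa="http://schemas.xmlsoap.org/ws/2004/08/addressing">...</wsa:ReplyTo>

<!-- second message: the extra xmlns:wsse declaration changes the octets being digested -->
<wsa:ReplyTo xmlns:a="http://example.net/aggregator"
             xmlns:m="http://example.org/weather"
             xmlns:soap="http://www.w3.org/2003/05/soap-envelope"
             xmlns:wsa="http://schemas.xmlsoap.org/ws/2004/08/addressing"
             xmlns:wsse="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-secext-1.0.xsd">...</wsa:ReplyTo>
```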
 
So, if neither EXCC14N nor XMLC14N does the job, what are we to do? Well, I've not been entirely honest in my discussion of EXCC14N. In addition to the XML being canonicalized, the algorithm has a second input: a list of namespace prefixes (the InclusiveNamespaces PrefixList) that are to be included in the canonicalization regardless of whether they are visibly utilized. In the case of the first message, we would include soap and m in this input list, and this would prevent a change in value for the namespace declaration for m from slipping through undetected. If an intermediary later added the wsse:Security header (along with the namespace declaration) as in the second message, then the wsse prefix would still NOT be included in the canonicalization, so the signature would not break.
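
As a sketch of the effect, assuming the wsa:ReplyTo subtree is canonicalized with EXCC14N and a prefix list of "soap m": the listed prefixes are handled as they would be under XMLC14N, so their declarations are emitted on the apex element, while a is still declared only where it is visibly utilized. The line breaks inside the start tag are added here for readability.

```xml
<wsa:ReplyTo xmlns:m="http://example.org/weather"
             xmlns:soap="http://www.w3.org/2003/05/soap-envelope"
             xmlns:wsa="http://schemas.xmlsoap.org/ws/2004/08/addressing">
    <wsa:Address>http://example.net/aggregator</wsa:Address>
    <wsa:ReferenceParameters>
     <a:ReporterId xmlns:a="http://example.net/aggregator">45</a:ReporterId>
     <a:Filter xmlns:a="http://example.net/aggregator">m:Report/@Temp &gt; 30</a:Filter>
     <a:Filter xmlns:a="http://example.net/aggregator">m:Report/@Wind &gt; 10</a:Filter>
    </wsa:ReferenceParameters>
   </wsa:ReplyTo>
```

The binding for m is now part of the digested octets, so rebinding it changes the digest, whereas a wsse declaration added on an ancestor has no effect on the output.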
 
Software producing XML Digital Signatures over portions of SOAP messages needs to take care to populate the InclusiveNamespaces PrefixList correctly whenever digests and signatures are being computed. This is typically quite easy: at the point the digests and signature are computed, the [in-scope namespaces] are a known quantity, and the XML syntax for EXCC14N allows the prefix list to be carried in the signature itself, which is transmitted as part of the message.
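
For illustration, the prefix list travels as an InclusiveNamespaces element inside the transform (or canonicalization method) that names EXCC14N. The fragment below is a sketch only: the URI value #ReplyTo-1 is hypothetical and assumes the wsa:ReplyTo element carries a matching identifier, which the messages above do not show, and the digest value is left elided.

```xml
<ds:Reference xmlns:ds="http://www.w3.org/2000/09/xmldsig#"
              URI="#ReplyTo-1"> <!-- hypothetical identifier -->
  <ds:Transforms>
    <ds:Transform Algorithm="http://www.w3.org/2001/10/xml-exc-c14n#">
      <!-- prefixes to treat inclusively even though they are not visibly utilized -->
      <ec:InclusiveNamespaces xmlns:ec="http://www.w3.org/2001/10/xml-exc-c14n#"
                              PrefixList="soap m" />
    </ds:Transform>
  </ds:Transforms>
  <ds:DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1" />
  <ds:DigestValue>...</ds:DigestValue>
</ds:Reference>
```

A verifier reads the PrefixList from the signature and passes it to its EXCC14N implementation, so both sides canonicalize with the same set of prefixes.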
 
It would seem that software comparing EPRs needs to take similar care to account for the [in-scope namespaces] when performing such comparisons. However, when looking at an EPR in a SOAP message, in the absence of a digital signature carrying the prefix list, there is nothing to be done except use either XMLC14N or EXCC14N without the prefix list. Some software performing EPR comparison might know the data types of the reference parameters and therefore know that vanilla EXCC14N will yield correct results. Even in the absence of such knowledge, EXCC14N is still likely to yield the best results. It can, however, return false positives (reporting two EPRs as equal when they are not) where QNames appear in content and the namespace declarations for prefixes that are not visibly utilized differ. Minters of EPRs and comparators of the same should be aware of these issues and take appropriate care.

Posted Jan 31 2005, 05:05 AM by martin-gudgin
