The problem

Stored in a varchar column, payload_xml, we have xml that looks something like this:
<?xml version="1.0" encoding="utf-16"?>  
<ReceiveMoneyParameters 
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">    
  ...
  <PaymentArrangementExternalReference>
    JT00009084
  </PaymentArrangementExternalReference>    
  ...
</ReceiveMoneyParameters>
I need to return the underlined section as a varchar.

The solution

With the help of Tim Chapman's article Shred XML data with XQuery, I've arrived at the following code:
select cast(cast(cast(payload_xml as nvarchar(max)) as xml)
.query('data(//ReceiveMoneyParameters/PaymentArrangementExternalReference)')
as varchar(max)) as PaymentArrangementExternalReference,
Which is, admittedly a bit of a mess, but each part has its purpose, and it works!

Observations

Interesting observations from inside out:
  • cast as nvarchar so that the encoding matches "utf-16"
  • otherwise sql server fails with the cryptic 'XML parsing: ... unable to switch the encoding'
  • data() function returns the actual content of the node, not the whole node
  • I didn't need to use [] indexers because there is only one element.
Lesson learned: use the xml datatype for the column.

Update

To handle different namespaces, e.g.

  <?xml version="1.0" encoding="utf-16"?>
  <PremiumDistributionRequest 
   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
   xmlns:xsd="http://www.w3.org/2001/XMLSchema">
    <Payment>
      <PaymentId 
       xmlns="http://il.net.za/Agreement/Payment">
        14079
      </PaymentId>
 

add a namespace declaration to the query as follows:

select @x.query(
  'declare namespace ap="http://il.net.za/Agreement/Payment";
  data(//PremiumDistributionRequest/Payment/ap:PaymentId)')