Cory Foy

Wednesday, September 12, 2007

Getting a distinct list of values in XSL from a substring of data in a node

Today's post comes from a tutorial I'm working on for a customer who still does a lot of XML/XSL work. One of the more challenging things to do in XSLT 1.0 (which is what most browsers support) is getting a distinct list of values from a set of nodes. EXSLT has a good article on the subject, including a link to Jeni Tennison's XSLT template. In the back of my mind, I kept wondering if there was some way to do this with keys, and the customer I'm working with showed me a way they do it.

Let's say you have an XML document like:

<Addresses>
  <Address>
    <State>FL</State>
  </Address>
  <Address>
    <State>GA</State>
  </Address>
  <Address>
    <State>MN</State>
  </Address>
  <Address>
    <State>FL</State>
  </Address>
</Addresses>

And you want to output something like:

<States>
  <State>FL</State>
  <State>GA</State>
  <State>MN</State>
</States>

The way to do this with keys is to define an xsl:key like this:

<xsl:key name="distinctState" match="Addresses/Address" use="./State"/>

This lets us set up a key for looking up Address nodes by their State value. We can then get the distinct list in an xsl:for-each element by combining this with the generate-id function:

<xsl:for-each select="Addresses/Address[generate-id() = generate-id(key('distinctState', ./State))]">

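generate-id "generates a key that uniquely identifies a specified node". In this case, key('distinctState', ./State) returns the node-set of every Address that has the same State as the current one, and generate-id applied to a node-set returns the ID of its first node in document order. The predicate is therefore true only for the first Address of each distinct State value. So our full XSL would look something like: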

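<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<!-- Index each Address by its State value -->
<xsl:key name="distinctState" match="Addresses/Address" use="./State"/>
<xsl:template match="/">
<States>
<!-- Keep only the first Address for each distinct State -->
<xsl:for-each select="Addresses/Address[generate-id() = generate-id(key('distinctState', ./State))]">
  <State><xsl:value-of select="State"/></State>
</xsl:for-each>
</States>
</xsl:template>
</xsl:stylesheet>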
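
For comparison, the same first-node test is often written with count() and the union operator instead of generate-id. The sketch below is that variant, not the form the customer showed me. It assumes the Addresses document above and should produce the same list of states.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- Same key as above: index each Address by its State value -->
  <xsl:key name="distinctState" match="Addresses/Address" use="State"/>

  <xsl:template match="/">
    <States>
      <!-- The union of the current node and the first node in its key group
           contains one node only when they are the same node -->
      <xsl:for-each select="Addresses/Address[count(. | key('distinctState', State)[1]) = 1]">
        <State><xsl:value-of select="State"/></State>
      </xsl:for-each>
    </States>
  </xsl:template>
</xsl:stylesheet>
```

Which of the two predicates to use is mostly a matter of taste; both keep the first Address in document order for each State.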

Which is all well and good. However, in the tutorial example, state isn't in its own node - it's embedded in the Address like:

<Address>123 Sample Way, Tampa, FL</Address>

Which is a tad trickier. In the tutorial, we've allowed the assumption that the state will always be the last 2 characters of the Address field. So how can we get a distinct list of states with data like this?

Turns out that we can do it in a very similar (if complex) way. We start off by specifying the key:

<xsl:key name="distinctState" match="/Customers/Customer" use="substring(Address, string-length(Address)-1)"/>

Here, our Address node is a child of Customer, which is a child of Customers, the root element. So we are matching /Customers/Customer and using the last 2 characters of the Address as the key value. We then need to use the same expression in our for-each loop:

<xsl:for-each select="Customers/Customer[generate-id() = generate-id(key('distinctState', substring(Address, (string-length(Address)-1))))]">
  <xsl:call-template name="AggregateForState">
    <xsl:with-param name="state"
                    select="substring(Address, (string-length(Address)-1))"/>
  </xsl:call-template>
</xsl:for-each>
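
To check the substring arithmetic in isolation: XPath string positions start at 1, so substring(Address, string-length(Address)-1) begins at the second-to-last character and runs to the end of the string. A minimal sketch, using the sample address from earlier (the variable name addr is just for illustration):

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Hypothetical variable holding the sample address -->
    <xsl:variable name="addr" select="'123 Sample Way, Tampa, FL'"/>
    <!-- string-length($addr) is 25, so the substring starts at position 24 -->
    <xsl:value-of select="substring($addr, string-length($addr) - 1)"/>
    <!-- outputs: FL -->
  </xsl:template>
</xsl:stylesheet>
```

If the Address text might carry trailing whitespace, wrapping it in normalize-space() in both the key and the for-each would be one way to guard against that.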

So we are doing the unique select on the state value. However, the for-each returns the matching node, which is a Customer node rather than the state string, so to use the value (in our example here, as a parameter to a named template) we still have to substring it out of Address.
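
Putting the pieces together, a complete stylesheet might look like the sketch below. The body of AggregateForState isn't shown in this post, so the version here is purely an assumption for illustration: it just counts the customers in each state by looking them up through the same key. The States and State output elements and the abbr and customers attributes are made up as well.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- Index each Customer by the last 2 characters of its Address -->
  <xsl:key name="distinctState" match="/Customers/Customer"
           use="substring(Address, string-length(Address)-1)"/>

  <xsl:template match="/">
    <States>
      <!-- Visit only the first Customer for each distinct state value -->
      <xsl:for-each select="Customers/Customer[generate-id() = generate-id(key('distinctState', substring(Address, string-length(Address)-1)))]">
        <xsl:call-template name="AggregateForState">
          <xsl:with-param name="state"
                          select="substring(Address, string-length(Address)-1)"/>
        </xsl:call-template>
      </xsl:for-each>
    </States>
  </xsl:template>

  <!-- Hypothetical template body: reports how many customers share the state -->
  <xsl:template name="AggregateForState">
    <xsl:param name="state"/>
    <State abbr="{$state}" customers="{count(key('distinctState', $state))}"/>
  </xsl:template>
</xsl:stylesheet>
```

Run against a Customers document whose Customer elements each have an Address ending in a two-letter state, this should emit one State element per distinct state.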

Of course, this would probably be a better time to either see if you can get State into its own node, or do some sort of pre-processing to do that, but when you have neither of those options, this will work.
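
If pre-processing is an option, one way to do it is a small identity-style transform that copies the document and adds a State element next to each Address. This is only a sketch under the same last-2-characters assumption; the State element name and its placement as a sibling of Address are my choices, not something from the customer's data.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- Identity template: copy everything through unchanged by default -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- For each Address, copy it and then add a State sibling
       holding the last 2 characters of the address text -->
  <xsl:template match="Customer/Address">
    <xsl:copy-of select="."/>
    <State>
      <xsl:value-of select="substring(., string-length(.) - 1)"/>
    </State>
  </xsl:template>
</xsl:stylesheet>
```

After a pass like this, a key on Customer with use="State" would work the same way as in the first example, with no substring expressions in the grouping stylesheet.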

Thanks to Len and the SR team for the initial key idea.

Comments:

  • Please put ] at the end of the xsl:for-each select,
    before closing the xsl:for-each tag.
    Otherwise everything is working fine.
    Thanks a lot.
    Karishma

    By Anonymous Nicky, at 2:22 AM  
