Grouping/nesting sibling nodes in XSLT 1.0

Submitted by cbearden on 2009-07-12 22:04. Development, Markup
A simple example in XSLT 1.0 of flexibly grouping sibling nodes using sibling-to-sibling recursion.

As a result of design discussions last week, we decided to wrap all <subcollection>, <module>, and <segue> children of <collection> or <subcollection> in a <content> element. So we want to convert a document like this:

  <collection>
    <title>...</title>
    <metadata>...</metadata>
    <segue>...</segue>
    <subcollection>
      <title>...</title>
      <module>...</module>
      <segue>...</segue>
      <module>...</module>
    </subcollection>
    <module>...</module>
  </collection>

into a document like this:

  <collection>
    <title>...</title>
    <metadata>...</metadata>
    <content>
      <segue>...</segue>
      <subcollection>
        <title>...</title>
        <content>
          <module>...</module>
          <segue>...</segue>
          <module>...</module>
        </content>
      </subcollection>
      <module>...</module>
    </content>
  </collection>

I needed to introduce this change systematically in all 65+ test docs without otherwise altering their validity or particular form of invalidity. One natural way to modify XML files consistently is to specialize an XSLT identity transform to make just the modifications one wants. I've been using XSLT to make many of the prior changes to the test docs in ways that aren't blogworthy. But the need to modify the nesting of some but not all child nodes of an element gives me an occasion to illustrate a useful XSLT technique that isn't widely known.
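
For readers who haven't specialized an identity transform before, here is a minimal sketch of the general pattern. It is not one of the stylesheets actually used on the test docs: the element names old-name and new-name are hypothetical, chosen only to show how a single more specific template overrides the copy-everything default.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:col="http://cnx.rice.edu/collxml">

  <!-- Identity transform: copy every node and attribute unchanged. -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <!-- Specialization (hypothetical element names): rename col:old-name
       to new-name in the same namespace, keeping its attributes and
       children. This template wins over the identity template because
       a name test has a higher default priority than node(). -->
  <xsl:template match="col:old-name">
    <xsl:element name="new-name" namespace="http://cnx.rice.edu/collxml">
      <xsl:apply-templates select="node()|@*"/>
    </xsl:element>
  </xsl:template>

</xsl:stylesheet>
```

Everything the second template does not match passes through the first one untouched, which is what keeps the rest of each document exactly as it was.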

The standard identity transform template in the stylesheet below shows the way we usually think about node processing in XSLT: do something to the node itself (copy, modify, ignore), and within it do something to any attributes and to any child nodes. A parent element is "responsible" for initiating the processing of its children. Note, though, that the template that matches collection and subcollection processes only its first child node, and does so in a special mode named 'walk'. Instead of applying templates to all its child nodes, this template "lights the fuse", leaving the 'walk' mode template to process the rest of the siblings. In 'walk' mode, a node is "responsible" for processing its next following sibling as well as itself.

The last template matches all nodes in 'walk' mode. It first applies templates in the default mode to itself, and then it applies templates to one or more of its following siblings. There are two cases:

  1. If the next element sibling is one of the elements to be nested, then it
    • creates a content element and within it applies templates in default mode to all the nestable elements plus any nodes in between them;
    • and then it applies templates in 'walk' mode to the first node after the nestable elements.
  2. Otherwise, it applies templates in 'walk' mode to the next sibling node.

This stylesheet makes two assumptions:

  1. that the first child node of subcollection or collection will not be one of the elements to be contained by content; and
  2. that there is only one clump of subcollection, module, and segue child elements per subcollection or collection.

Given these two assumptions, there is probably a simpler way to accomplish this particular task. However, the technique illustrated here is quite flexible, and it can be applied to more complex situations, e.g. where elements may have several different groups of child elements, each of which must be nested separately. I'm fairly confident that it could be adapted to create correctly nested CNXML section elements from imported XHTML that has only h1..5 headers to mark section boundaries.

<?xml version="1.0"?>

<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:col="http://cnx.rice.edu/collxml"
>

  <xsl:output indent="yes" method="xml"/>

  <!-- Standard identity transform. -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

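  <!-- "Light the fuse": copy the element and its attributes, then hand
       only the first child node to 'walk' mode. -->
  <xsl:template match="/col:collection|col:subcollection">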
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <xsl:apply-templates select="node()[1]" mode="walk"/>
    </xsl:copy>
  </xsl:template>

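  <!-- 'walk' mode: process the current node in the default mode, then
       decide how to continue along its following siblings. -->
  <xsl:template match="node()" mode="walk">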
    <xsl:apply-templates select="self::node()"/>
    <xsl:choose>
      <xsl:when test="following-sibling::*[1] [self::col:subcollection or
                      self::col:module or self::col:segue]">
        <content xmlns="http://cnx.rice.edu/collxml">
<xsl:text>
  </xsl:text>
          <xsl:apply-templates select="following-sibling::col:subcollection|
                                       following-sibling::col:module|
                                       following-sibling::col:segue|
                                       following-sibling::node()[
                                         following-sibling::col:subcollection or
                                         following-sibling::col:module or
                                         following-sibling::col:segue
                                       ]"/>
<xsl:text>
  </xsl:text>
        </content>
        <!-- Resume the walk at the first sibling node that comes after
             the last nestable element. -->
        <xsl:apply-templates select="following-sibling::node()
              [not(self::col:subcollection or
                   self::col:module or self::col:segue)]
              [not(following-sibling::col:subcollection or 
                   following-sibling::col:module or 
                   following-sibling::col:segue)][1]" mode="walk"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:apply-templates select="following-sibling::node()[1]" mode="walk"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
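
The more general case mentioned above, where an element may have several separate clumps of nestable children, can be sketched along the same lines. This is a sketch under stated assumptions, not a tested replacement for the stylesheet above. The second mode name 'group' is introduced here for illustration. The sketch also uses xsl:strip-space, so whitespace-only text nodes are treated as insignificant and the output is re-indented; that is a simplification the original stylesheet does not make.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:col="http://cnx.rice.edu/collxml">

  <!-- Assumption: whitespace-only text is insignificant, so sibling
       elements become adjacent nodes in the source tree. -->
  <xsl:strip-space elements="*"/>
  <xsl:output indent="yes" method="xml"/>

  <!-- Standard identity transform. -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <!-- Light the fuse: only the first child enters 'walk' mode. -->
  <xsl:template match="/col:collection|col:subcollection">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <xsl:apply-templates select="node()[1]" mode="walk"/>
    </xsl:copy>
  </xsl:template>

  <!-- 'walk' mode, nestable element: it starts a new run. Open a
       content wrapper, let 'group' mode (hypothetical name) fill it,
       then resume the walk at the first node after the run. -->
  <xsl:template match="col:subcollection|col:module|col:segue" mode="walk">
    <xsl:element name="content" namespace="http://cnx.rice.edu/collxml">
      <xsl:apply-templates select="." mode="group"/>
    </xsl:element>
    <xsl:apply-templates mode="walk"
      select="following-sibling::node()
                [not(self::col:subcollection or
                     self::col:module or self::col:segue)][1]"/>
  </xsl:template>

  <!-- 'walk' mode, any other node: process it in the default mode and
       pass the walk on to the next sibling. -->
  <xsl:template match="node()" mode="walk">
    <xsl:apply-templates select="."/>
    <xsl:apply-templates select="following-sibling::node()[1]" mode="walk"/>
  </xsl:template>

  <!-- 'group' mode: process the current nestable element in the default
       mode, then continue only if the very next sibling node is also
       nestable. Otherwise the run, and the wrapper, ends here. -->
  <xsl:template match="col:subcollection|col:module|col:segue" mode="group">
    <xsl:apply-templates select="."/>
    <xsl:apply-templates mode="group"
      select="following-sibling::node()[1]
                [self::col:subcollection or
                 self::col:module or self::col:segue]"/>
  </xsl:template>

</xsl:stylesheet>
```

With this variant, a parent whose children run title, module, module, metadata, segue should come out as title, content (module, module), metadata, content (segue). Neither of the two assumptions listed earlier is needed. The trade-off is that any node between two nestable elements, including a comment or processing instruction, ends the current run and starts a new content element after it.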