Rhaptos Software Development

xpathgrep

Submitted by brentmh on 2005-10-12 16:21. Announcements, Development
I just checked in a useful script for searching through CNXML files with XPath patterns.

Every once in a while I (or someone else) need to scan through a bunch of XML files looking for a particular pattern, usually something like: find all of the modules that have <code> tags. If the pattern is simple enough you can usually get by with grep. Today, however, the topic of the <tgroup> tag came up, along with the fact that you can actually have multiple <tgroup>s in a single <table>. Unfortunately, the number of tgroup children of a table is not something that is easy to test with grep.

It is, however, easy to test with an XPath expression: //cnx:table[count(cnx:tgroup) > 1]. So I modified my XPath evaluator to create xpathgrep. Feed it an XPath expression and a list of CNXML files and it will tell you which files match the pattern and how many matches each one has. For example:

  <167 yoda:~/tmp/xmlpages > xpathgrep "//cnx:table[count(cnx:tgroup) > 1]" */index.cnxml
  m10184/index.cnxml: 1 matches
  m10511/index.cnxml: 1 matches
  m12131/index.cnxml: 1 matches

What do you know? Three modules make use of this. Of course our stylesheets currently turn them into 3 separate tables but that's a problem for another day.
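
For anyone without access to the script, here is a rough Python sketch of the same check. It is not the xpathgrep source. It uses the standard library's ElementTree, which only supports a subset of XPath and has no count() function, so the counting is done in plain Python instead. The function names, the sample document, and the CNXML namespace URI are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for CNXML; change it if your files declare another one.
CNXML_NS = "http://cnx.rice.edu/cnxml"
NS = {"cnx": CNXML_NS}


def count_multi_tgroup_tables(xml_text):
    """Count cnx:table elements that have more than one cnx:tgroup child.

    Stands in for //cnx:table[count(cnx:tgroup) > 1], which ElementTree
    cannot evaluate directly.
    """
    root = ET.fromstring(xml_text)
    # iter() visits the root and every descendant, like the leading // does.
    tables = root.iter("{%s}table" % CNXML_NS)
    return sum(1 for t in tables if len(t.findall("cnx:tgroup", NS)) > 1)


def report(path, xml_text):
    """Return a line in the 'file: N matches' style, or None when nothing matches."""
    n = count_multi_tgroup_tables(xml_text)
    return "%s: %d matches" % (path, n) if n else None


# Hypothetical sample document: one table with two tgroups, one with a single tgroup.
SAMPLE = """<document xmlns="http://cnx.rice.edu/cnxml">
  <content>
    <table id="t1"><tgroup cols="1"/><tgroup cols="2"/></table>
    <table id="t2"><tgroup cols="1"/></table>
  </content>
</document>"""

if __name__ == "__main__":
    print(report("sample/index.cnxml", SAMPLE))  # sample/index.cnxml: 1 matches
```

To run it over real files you would read each path given on the command line and pass the contents to report(), printing only the non-None results.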

If you want to make use of xpathgrep you'll find it in our Subversion repository. Just run: svn co svn+ssh://software.cnx.rice.edu/scripts/trunk scripts

Notes:

  • I taught it about the CNXML and MathML namespaces but not MDML, QML, or BibTeXML. Feel free to modify it as you need.
  • You need to specify cnx as the prefix for CNXML tags because our XPath evaluator doesn't understand default namespaces.
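
The prefix requirement in the last note is easy to see in a small experiment. The sketch below uses Python's ElementTree rather than our evaluator, and the namespace URI and helper name are assumptions for illustration. An unprefixed name only matches elements in no namespace, so it finds nothing in a document whose elements live in a default namespace, while the same path with a mapped cnx prefix does.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for CNXML; used here only to illustrate prefix mapping.
CNXML_NS = "http://cnx.rice.edu/cnxml"

# The document declares CNXML as its default namespace, so its tags carry no prefix.
DOC = '<document xmlns="http://cnx.rice.edu/cnxml"><table/><table/></document>'


def find_tables(xml_text, with_prefix):
    """Find table elements with or without a namespace prefix in the path."""
    root = ET.fromstring(xml_text)
    if with_prefix:
        # The prefix name is arbitrary; what matters is the URI it maps to.
        return root.findall(".//cnx:table", {"cnx": CNXML_NS})
    # An unprefixed name looks for elements that are in no namespace at all.
    return root.findall(".//table")


if __name__ == "__main__":
    print(len(find_tables(DOC, with_prefix=False)))  # 0
    print(len(find_tables(DOC, with_prefix=True)))   # 2
```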