A bit of documentation can save a lot of frustration. Take the following code, for example:
import javax.xml.xpath.*
import org.xml.sax.InputSource
def xml =
'''
<foo>
<bar>
<baz>qux</baz>
</bar>
</foo>
'''
def reader = new StringReader(xml)
def source = new InputSource(reader)
def xpath = XPathFactory.newInstance().newXPath()
def nodes = xpath.evaluate('/foo', source, XPathConstants.NODESET)
println nodes.getLength()
Looks reasonable, right? And when you run it, everything works!
1
Now let's say we want to evaluate a few more XPath expressions:
import javax.xml.xpath.*
import org.xml.sax.InputSource
def xml =
'''
<foo>
<bar>
<baz>qux</baz>
</bar>
</foo>
'''
def reader = new StringReader(xml)
def source = new InputSource(reader)
def xpath = XPathFactory.newInstance().newXPath()
def nodes = xpath.evaluate('/foo', source, XPathConstants.NODESET)
println nodes.getLength()
nodes = xpath.evaluate('//bar', source, XPathConstants.NODESET)
println nodes.getLength()
This yields:
1
Caught: javax.xml.xpath.XPathExpressionException at test.run(test.groovy:24)
What the-? Running in GroovyConsole provides a better clue:
javax.xml.xpath.XPathExpressionException ... Caused by: java.io.IOException: Stream closed
Ok, but why? The documentation on XPath.evaluate() doesn't mention any limitations on invoking the method multiple times. The title of this post may have tipped you off - the documentation, in this case, has a major bug.
When fed an InputSource, XPath.evaluate() will close() the underlying reader or stream after it has completed evaluation! When you try to call evaluate() again with the same InputSource, BOOM!
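To surface the root cause from a plain script rather than GroovyConsole, you can walk the exception's cause chain yourself. This is a minimal sketch: `rootCause` is a hypothetical helper name, and the exact exception message depends on the JAXP implementation bundled with your JDK.

```groovy
import javax.xml.xpath.XPathConstants
import javax.xml.xpath.XPathExpressionException
import javax.xml.xpath.XPathFactory
import org.xml.sax.InputSource

def source = new InputSource(new StringReader('<foo><bar><baz>qux</baz></bar></foo>'))
def xpath = XPathFactory.newInstance().newXPath()

// The first evaluation succeeds, but the reader behind the InputSource is closed afterwards.
xpath.evaluate('/foo', source, XPathConstants.NODESET)

// Hypothetical helper: follow getCause() until the innermost exception is reached.
def rootCause = { Throwable t ->
    while (t.cause != null) {
        t = t.cause
    }
    t
}

def root = null
try {
    // The second evaluation reuses the same, now closed, reader.
    xpath.evaluate('//bar', source, XPathConstants.NODESET)
} catch (XPathExpressionException e) {
    root = rootCause(e)
    println "${root.class.name}: ${root.message}"
}
```

On the setup described in this post, the innermost exception should be the java.io.IOException reported by GroovyConsole.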
The alternative is to feed in a DOM node:
import javax.xml.xpath.*
import groovy.xml.DOMBuilder
def xml =
'''
<foo>
<bar>
<baz>qux</baz>
</bar>
</foo>
'''
def reader = new StringReader(xml)
def doc = DOMBuilder.parse(reader)
def xpath = XPathFactory.newInstance().newXPath()
def nodes = xpath.evaluate('/foo', doc, XPathConstants.NODESET)
println nodes.getLength()
nodes = xpath.evaluate('//bar', doc, XPathConstants.NODESET)
println nodes.getLength()
Which runs correctly:
1
1
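If you would rather keep passing an InputSource, another option is to build a fresh one for every evaluation, so that no call ever sees a reader that an earlier call closed. The sketch below assumes the XML is available as a String; `freshSource` is a hypothetical helper name. Note that each call parses the document again, so this costs more than parsing once.

```groovy
import javax.xml.xpath.XPathConstants
import javax.xml.xpath.XPathFactory
import org.xml.sax.InputSource

def xml = '''
<foo>
  <bar>
    <baz>qux</baz>
  </bar>
</foo>
'''

def xpath = XPathFactory.newInstance().newXPath()

// Hypothetical helper: returns a brand-new InputSource (and reader) on every call.
def freshSource = { new InputSource(new StringReader(xml)) }

// Evaluate each expression against its own InputSource and collect the node counts.
def counts = ['/foo', '//bar'].collect { expr ->
    xpath.evaluate(expr, freshSource(), XPathConstants.NODESET).length
}
println counts
```

For more than a handful of expressions, or for large documents, parsing once into a DOM is likely the cheaper route.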
The moral of this episode? For code you're exposing to a wide audience, take the time to document gotchas. If your users don't thank you, at least they won't curse you!
Brownie points: use Groovy's DOMCategory to make dealing with the nodes returned by the XPath evaluation a similar experience to working with XmlParser or XmlSlurper:
import groovy.xml.dom.DOMCategory
...
def nodes = xpath.evaluate('/foo', doc, XPathConstants.NODESET)
println nodes.getLength()
nodes = xpath.evaluate('//bar', doc, XPathConstants.NODESET)
println nodes.getLength()
use (DOMCategory) {
println nodes.baz[0].text()
}
Prints:
1
1
qux
Rock on with XPath!