Using XPath to Retrieve Element Content

In addition to walking the document tree to find an element, Libxml2 supports XPath expressions for retrieving sets of nodes that match specified criteria. Full documentation of the XPath API is available in the Libxml2 API reference.

Note

A full discussion of XPath is beyond the scope of this document. For details on its use, see the XPath specification.

Full code for this example is at Appendix D, Code for XPath Example.

Using XPath requires setting up an xmlXPathContext and then supplying the XPath expression and the context to the xmlXPathEvalExpression function. The function returns an xmlXPathObjectPtr, which includes the set of nodes satisfying the XPath expression.

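	xmlXPathObjectPtr
	getnodeset (xmlDocPtr doc){

		xmlXPathContextPtr context;	/* 1 */
		xmlChar *xpath;
		xmlXPathObjectPtr result;

		xpath = (xmlChar *) "//keyword";	/* 2 */
		context = xmlXPathNewContext(doc);	/* 3 */
		if (context == NULL)
			return NULL;
		result = xmlXPathEvalExpression(xpath, context);	/* 4 */
		xmlXPathFreeContext(context);	/* the context is no longer needed */
		if (result == NULL)
			return NULL;
		if(xmlXPathNodeSetIsEmpty(result->nodesetval)){	/* 5 */
			xmlXPathFreeObject(result);
			printf("No result\n");
			return NULL;
		}
		return result;
	}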
      

1. First we declare our variables.

2. Next we set our XPath expression.

3. Initialize the context variable.

4. Apply the XPath expression.

5. Check the result.

The xmlXPathObjectPtr returned by the function contains a set of nodes and other information needed to iterate through the set and act on the results. For this example, our function returns the xmlXPathObjectPtr, and we use it to print the contents of the keyword nodes in our document. The node set object includes the number of elements in the set (nodeNr) and an array of nodes (nodeTab):

	for (i=0; i < nodeset->nodeNr; i++) {	/* 1 */
		keyword = xmlNodeListGetString(doc, nodeset->nodeTab[i]->xmlChildrenNode, 1);	/* 2 */
		printf("keyword: %s\n", keyword);
		xmlFree(keyword);	/* the returned string is a copy that we must free */
	}
      

1. The value of nodeset->nodeNr holds the number of elements in the node set. Here we use it to iterate through the array.

2. Here we print the contents of each of the nodes returned.

Note

We print the child node of each node that is returned, because the content of a keyword element is held in a child text node.