Wednesday, February 03, 2010

Understanding XPath

XML Path Language
XPath is designed to be used by XML applications such as XSLT and XPointer. XSLT (Extensible Stylesheet Language Transformations) is used to transform an XML document into another XML document. XPath uses a non-XML syntax to form expressions for use in Uniform Resource Identifiers (URIs) and XML attribute values. XPath is basically used for addressing parts of an XML document.
We will be using a small tool. You can download the b-cage XPath evaluator from the internet; it hardly takes 1 MB of space. Once you have downloaded it, just extract it and open the XPath Evaluator.htm file to get the evaluator screen.



Let's take a simple XML document and try to understand the functionality of XPath.
<?xml version="1.0"?>
<sales xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<sale dept="ECE">
<id>420</id>
<salesperson>Arpit</salesperson>
<date>2009-12-22</date>
<amount>1000</amount>
</sale>
<sale>
<id>210</id>
<salesperson>Ankit</salesperson>
<date>2009-12-22</date>
<amount>2000</amount>
<country>UK</country>
</sale>
<sale>
<amount>5000</amount>
<country>UK</country>
<salesperson>Krishna</salesperson>
<id>10295</id>
<date>2009-12-22</date>
</sale>
</sales>

Now let's suppose we want to get the values of all the salesperson elements. Since it is a small document, we can find the names manually, but with a very long XML document containing thousands of records it would be very difficult to find information about a particular element. Hence we will use XPath to get those values easily. In our XPath evaluator we will provide the following XPath expression to evaluate:
//sales/sale/salesperson
Once we provide this expression and click evaluate, we get the following output:
<salesperson>Arpit</salesperson>
<salesperson>Ankit</salesperson>
<salesperson>Krishna</salesperson>
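
For readers without the evaluator tool, the same lookup can be sketched in Python. This is only an illustration, assuming the standard library's xml.etree.ElementTree module, which implements a limited subset of XPath and evaluates paths relative to an element rather than from the document root. The variable names are made up for this sketch.

```python
import xml.etree.ElementTree as ET

# The sales document from above, compacted to one <sale> per line.
XML = """<sales xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<sale dept="ECE"><id>420</id><salesperson>Arpit</salesperson><date>2009-12-22</date><amount>1000</amount></sale>
<sale><id>210</id><salesperson>Ankit</salesperson><date>2009-12-22</date><amount>2000</amount><country>UK</country></sale>
<sale><amount>5000</amount><country>UK</country><salesperson>Krishna</salesperson><id>10295</id><date>2009-12-22</date></sale>
</sales>"""

root = ET.fromstring(XML)  # root is the <sales> element

# Paths are relative to <sales>, so "./sale/salesperson" plays the
# role of /sales/sale/salesperson in a full XPath engine.
names = [node.text for node in root.findall("./sale/salesperson")]
print(names)  # ['Arpit', 'Ankit', 'Krishna']
```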

Location path expressions
Absolute location path: it starts with a slash (/) character, followed by a relative location path.
Relative location path: it is made up of a sequence of one or more location steps separated by a slash character.

Relative path is: sales/sale/salesperson
Absolute path is: /sales/sale/salesperson
//sale/salesperson will also fetch the same result. A leading / starts at the root of the document, while // is shorthand for /descendant-or-self::node()/, so it selects matching nodes at any depth below the root. That is why //sale/salesperson gives the same result here, and the shorter expression //salesperson fetches the same result as well.
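
To check that these path forms really select the same nodes, here is a small sketch using Python's xml.etree.ElementTree. It is an approximation: ElementTree evaluates paths relative to the <sales> element, so ./ stands in for the leading /sales/ and .// for a leading //. The helper names() is hypothetical.

```python
import xml.etree.ElementTree as ET

# The sales document from above, compacted to one <sale> per line.
XML = """<sales xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<sale dept="ECE"><id>420</id><salesperson>Arpit</salesperson><date>2009-12-22</date><amount>1000</amount></sale>
<sale><id>210</id><salesperson>Ankit</salesperson><date>2009-12-22</date><amount>2000</amount><country>UK</country></sale>
<sale><amount>5000</amount><country>UK</country><salesperson>Krishna</salesperson><id>10295</id><date>2009-12-22</date></sale>
</sales>"""

root = ET.fromstring(XML)

def names(path):
    """Return the text of every node selected by an ElementTree path."""
    return [node.text for node in root.findall(path)]

child_steps = names("./sale/salesperson")       # like /sales/sale/salesperson
descendant_step = names(".//sale/salesperson")  # like //sale/salesperson
descendant_only = names(".//salesperson")       # like //salesperson

print(child_steps == descendant_step == descendant_only)  # True
```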

If we only want to get the text values of the elements, our XPath expression will be something like this:
//salesperson/text()
The result will be:
Arpit
Ankit
Krishna

To locate an attribute value, use the @ abbreviation for the attribute axis. The expression //sale/@dept will get us the result
dept="ECE"
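
The text and attribute lookups can be mirrored in Python as well. Note the assumption: xml.etree.ElementTree does not accept text() or a trailing @dept step in a path, so this sketch selects the elements with a path and then reads .text and .get("dept") from them.

```python
import xml.etree.ElementTree as ET

# The sales document from above, compacted to one <sale> per line.
XML = """<sales xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<sale dept="ECE"><id>420</id><salesperson>Arpit</salesperson><date>2009-12-22</date><amount>1000</amount></sale>
<sale><id>210</id><salesperson>Ankit</salesperson><date>2009-12-22</date><amount>2000</amount><country>UK</country></sale>
<sale><amount>5000</amount><country>UK</country><salesperson>Krishna</salesperson><id>10295</id><date>2009-12-22</date></sale>
</sales>"""

root = ET.fromstring(XML)

# Stand-in for //salesperson/text(): read the text of each element.
texts = [node.text for node in root.iter("salesperson")]
print(texts)  # ['Arpit', 'Ankit', 'Krishna']

# Stand-in for //sale/@dept: keep only <sale> elements that carry
# the attribute, then read its value.
depts = [sale.get("dept") for sale in root.findall("./sale[@dept]")]
print(depts)  # ['ECE']
```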

XPath Predicates
An XPath predicate is a boolean expression in square brackets. It is evaluated for each node selected by the location step it follows, and only the nodes for which it is true are kept.
Considering the same XML document, use the following XPath expression:
//sales/sale[salesperson="Arpit"]
It will fetch you the following result:
<sale dept="ECE">
<id>420</id>
<salesperson>Arpit</salesperson>
<date>2009-12-22</date>
<amount>1000</amount>
</sale>


Again, we can use the following XPath expressions:
//sales/sale[2] or //sales/sale[position()=2]
Both are equivalent and fetch the same result: they select the second sale node, so the output for the XPath expression will be

<sale>
<id>210</id>
<salesperson>Ankit</salesperson>
<date>2009-12-22</date>
<amount>2000</amount>
<country>UK</country>
</sale>
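
Both kinds of predicate shown above, a child-value test and a position test, are among the few that Python's xml.etree.ElementTree understands, so they can be tried directly. This is a sketch under that assumption; position()=2 itself is not accepted there, only the [2] form.

```python
import xml.etree.ElementTree as ET

# The sales document from above, compacted to one <sale> per line.
XML = """<sales xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<sale dept="ECE"><id>420</id><salesperson>Arpit</salesperson><date>2009-12-22</date><amount>1000</amount></sale>
<sale><id>210</id><salesperson>Ankit</salesperson><date>2009-12-22</date><amount>2000</amount><country>UK</country></sale>
<sale><amount>5000</amount><country>UK</country><salesperson>Krishna</salesperson><id>10295</id><date>2009-12-22</date></sale>
</sales>"""

root = ET.fromstring(XML)

# Like //sales/sale[salesperson="Arpit"]: keep sales whose
# <salesperson> child has exactly this text.
by_name = root.findall("./sale[salesperson='Arpit']")
print([sale.findtext("id") for sale in by_name])  # ['420']

# Like //sales/sale[2]: positions count from 1, not 0.
second = root.findall("./sale[2]")
print([sale.findtext("id") for sale in second])  # ['210']
```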

XPath expressions can be combined with logical operators such as and/or to form more specific XPath expressions.
====================================================
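Operators in XPath Expressions
The various operators in XPath expressions are
+, -, *, div, mod, =, !=, <, >, <=, >=, etc.
When using XPath in an XSLT stylesheet document, the < operator must be replaced with its entity representation &lt; because the expression sits inside an XML attribute value; > is conventionally written as &gt; as well.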
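
As a rough illustration of a comparison combined with a logical operator, the sketch below emulates //sale[amount > 1000 and country = 'UK'] in plain Python, because xml.etree.ElementTree does not evaluate such operators itself. It also shows one way to produce the entity form of an expression, assuming the standard xml.sax.saxutils.escape helper. The expressions are examples chosen for this sketch, not taken from the evaluator tool.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# The sales document from above, compacted to one <sale> per line.
XML = """<sales xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<sale dept="ECE"><id>420</id><salesperson>Arpit</salesperson><date>2009-12-22</date><amount>1000</amount></sale>
<sale><id>210</id><salesperson>Ankit</salesperson><date>2009-12-22</date><amount>2000</amount><country>UK</country></sale>
<sale><amount>5000</amount><country>UK</country><salesperson>Krishna</salesperson><id>10295</id><date>2009-12-22</date></sale>
</sales>"""

root = ET.fromstring(XML)

# Emulates //sale[amount > 1000 and country = 'UK'].
big_uk = [
    sale.findtext("salesperson")
    for sale in root.findall("./sale")
    if float(sale.findtext("amount")) > 1000
    and sale.findtext("country") == "UK"
]
print(big_uk)  # ['Ankit', 'Krishna']

# Entity form of an expression, as needed inside an XSLT attribute.
expr = "//sale[amount < 2000]"
escaped = escape(expr)
print(escaped)  # //sale[amount &lt; 2000]
```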

XPath functions
XPath functions are used in XPath predicates or expressions. Function names are case-sensitive and written in lowercase. The general format is
function-name(argument, ...)
There can be more than one argument in the function call.
The function call may return a boolean, string, node-set or number value.

Boolean Functions
boolean(o) - converts its argument to a boolean
not(b) - returns true if the argument is false, and vice versa
true() - returns true
false() - returns false
E.g. the following XPath expression
//sales/sale[not(amount="1000")] will fetch the sale nodes other than the one whose amount is 1000. So the output will be
<sale>
<id>210</id>
<salesperson>Ankit</salesperson>
<date>2009-12-22</date>
<amount>2000</amount>
<country>UK</country>
</sale>
<sale>
<amount>5000</amount>
<country>UK</country>
<salesperson>Krishna</salesperson>
<id>10295</id>
<date>2009-12-22</date>
</sale>

In a boolean context a node-set is true if it contains at least one node and false if it is empty. Non-zero numbers are converted to true, and zero (or NaN) is converted to false.
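
The not() example and the conversion rules can be approximated in Python. This is only a sketch: xml.etree.ElementTree has no not() or boolean(), so the predicate is written as an ordinary Python condition, and Python's own truthiness of lists and numbers stands in for boolean(). The element name refund is a made-up name that does not occur in the document.

```python
import xml.etree.ElementTree as ET

# The sales document from above, compacted to one <sale> per line.
XML = """<sales xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<sale dept="ECE"><id>420</id><salesperson>Arpit</salesperson><date>2009-12-22</date><amount>1000</amount></sale>
<sale><id>210</id><salesperson>Ankit</salesperson><date>2009-12-22</date><amount>2000</amount><country>UK</country></sale>
<sale><amount>5000</amount><country>UK</country><salesperson>Krishna</salesperson><id>10295</id><date>2009-12-22</date></sale>
</sales>"""

root = ET.fromstring(XML)

# Emulates //sales/sale[not(amount="1000")].
not_1000 = [
    sale for sale in root.findall("./sale")
    if sale.findtext("amount") != "1000"
]
ids = [sale.findtext("id") for sale in not_1000]
print(ids)  # ['210', '10295']

# A non-empty selection is truthy, an empty one is falsy.
has_sales = bool(root.findall("./sale"))      # True
has_refunds = bool(root.findall("./refund"))  # False, no such element

# Numbers: zero is false, anything non-zero is true.
print(has_sales, has_refunds, bool(0), bool(3.5))  # True False False True
```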

Number Functions
number(o) - converts the argument to a number, e.g.
number('10.23') is converted to 10.23
sum() - accepts a node-set as an argument and returns the sum of its nodes. Each node is first converted to a string and then to a number, and the numbers are summed.
ceiling() - returns the smallest integer that is not less than its argument, e.g. ceiling(2.35) returns 3.
floor() - returns the largest integer that is not greater than its argument, e.g. floor(2.35) returns 2.
round() - returns the number rounded to its nearest integer, e.g. round(2.65) returns 3.
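
The number functions have close counterparts in Python's math module, which makes their behaviour easy to check. The sketch below is an emulation, not an XPath engine; xpath_round is a hypothetical helper written here because XPath rounds halves upward (round(2.5) is 3) while Python's built-in round() rounds halves to the nearest even number.

```python
import math
import xml.etree.ElementTree as ET

# The sales document from above, compacted to one <sale> per line.
XML = """<sales xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<sale dept="ECE"><id>420</id><salesperson>Arpit</salesperson><date>2009-12-22</date><amount>1000</amount></sale>
<sale><id>210</id><salesperson>Ankit</salesperson><date>2009-12-22</date><amount>2000</amount><country>UK</country></sale>
<sale><amount>5000</amount><country>UK</country><salesperson>Krishna</salesperson><id>10295</id><date>2009-12-22</date></sale>
</sales>"""

root = ET.fromstring(XML)

# Emulates sum(//amount): each node's text is converted to a number.
total = sum(float(node.text) for node in root.iter("amount"))
print(total)  # 8000.0

def xpath_round(x):
    """Round to the nearest integer, with halves going upward."""
    return math.floor(x + 0.5)

print(float("10.23"))     # number('10.23') -> 10.23
print(math.ceil(2.35))    # ceiling(2.35)   -> 3
print(math.floor(2.35))   # floor(2.35)     -> 2
print(xpath_round(2.65))  # round(2.65)     -> 3
print(xpath_round(2.5))   # round(2.5)      -> 3
```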

Node-set functions
last() returns the context size.
position() returns the context position.
id(o) returns elements by their unique ID.
count(s) returns the number of nodes in a node-set.
local-name(n) returns the local part of the node's name.
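
A few of these node-set functions can be imitated in Python to see what they return for the sales document. This is a sketch assuming xml.etree.ElementTree: it supports [last()] inside a path, while count(), position() and local-name() are approximated here with len(), enumerate() and the element's tag.

```python
import xml.etree.ElementTree as ET

# The sales document from above, compacted to one <sale> per line.
XML = """<sales xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<sale dept="ECE"><id>420</id><salesperson>Arpit</salesperson><date>2009-12-22</date><amount>1000</amount></sale>
<sale><id>210</id><salesperson>Ankit</salesperson><date>2009-12-22</date><amount>2000</amount><country>UK</country></sale>
<sale><amount>5000</amount><country>UK</country><salesperson>Krishna</salesperson><id>10295</id><date>2009-12-22</date></sale>
</sales>"""

root = ET.fromstring(XML)
sales = root.findall("./sale")

# count(//sale): number of nodes in the node-set.
count = len(sales)
print(count)  # 3

# //sale[last()]: the node whose position equals the context size.
last_sale = root.find("./sale[last()]")
print(last_sale.findtext("salesperson"))  # Krishna

# position(): context positions start at 1.
positions = [(pos, sale.findtext("salesperson"))
             for pos, sale in enumerate(sales, start=1)]
print(positions)  # [(1, 'Arpit'), (2, 'Ankit'), (3, 'Krishna')]

# local-name(/*): for an element without a namespace, the tag is the local name.
print(root.tag)  # sales
```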
==========================================
String functions
string(o) converts an object to a string.
concat(a, b, c, ...) concatenates its arguments.
substring(s, start, length) returns the substring of s that begins at position start (counting from 1) and is length characters long.
string-length(s) returns the length of a string.
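
The string functions map closely onto ordinary Python string operations. The sketch below is an emulation under a simplifying assumption: xpath_substring is a hypothetical helper that only handles whole-number arguments with start at 1 or higher, whereas real XPath also defines the result for fractional and out-of-range values.

```python
def xpath_substring(s, start, length):
    """Emulate substring(s, start, length) for integers with start >= 1.

    XPath counts characters from 1, Python from 0, hence the shift.
    """
    begin = start - 1
    return s[begin:begin + length]

# string(420) -> "420"
as_text = str(420)

# concat('Arpit', '-', 'ECE') -> "Arpit-ECE"
joined = "".join(["Arpit", "-", "ECE"])

# substring('Krishna', 2, 3) -> "ris"
part = xpath_substring("Krishna", 2, 3)

# string-length('Krishna') -> 7
size = len("Krishna")

print(as_text, joined, part, size)  # 420 Arpit-ECE ris 7
```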

There are a lot of functions and I have not covered many examples here. It is left to the reader to practice with all these functions; once you know them, you can easily write your own XPath expressions.

XPath and XSLT
XSLT transforms XML into plain text, HTML or XML. It specifies transformation rules in elements whose attributes use XPath expressions.

It will be clearer with the following example.

<?xml version="1.0" ?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="//salesperson">
<html>
<body>
<p><xsl:value-of select="."/></p>
<xsl:apply-templates/>
</body>
</html>
</xsl:template>
<xsl:template match="*/text()"/>
</xsl:stylesheet>
As you can see
<?xml version="1.0" ?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

This is the starting line for an XSLT stylesheet. You can consider it the standard opening that has to be used every time. Next,
<xsl:template match="//salesperson"> contains the XPath expression which defines the set of nodes to which the template will be applied. Next,
<xsl:value-of select="."/> has a select attribute containing an XPath expression. It outputs the text value of the specified node. The dot (.) refers to the current node.
<xsl:template match="*/text()"/> is an empty template that matches text nodes, so text that is not written explicitly by another template (such as the id, date and amount values) is left out of the output.
When we apply this XSL to our XML document, we will get the output
Arpit
Ankit
Krishna
I will be discussing it more in my next section, and then it will be clearer.
