Friday, February 05, 2010

Transforming an XML document using XSL transformations

XSL (Extensible Stylesheet Language) adds the capability of transforming an XML document into another document. The result can be an XML or an HTML document. XSL consists of two parts.
XSL Transformations (XSLT) - an XML application for writing transformation rules. An XSLT processor applies the rules contained in an XSLT stylesheet, and a stylesheet can be applied to any XML document.
XSL Formatting Objects (XSL-FO) - an XML application used for describing the precise layout of text on a page.
For BPEL purposes we need not concentrate on XSL-FO. We will put more emphasis on XSLT.
So how does it work? Let's suppose we have an XML document. When the document is passed to an XSLT processor, the processor reads the XSL stylesheet and generates a new document based on the rules in that stylesheet.
XSLT stylesheet
An XSLT stylesheet is an XML document that contains
1>An <xsl:stylesheet> root element, which declares the xsl namespace prefix and the mandatory namespace URI http://www.w3.org/1999/XSL/Transform.

2>One or more <xsl:template> elements and other xsl elements, which define the transformation rules.
A general XSL document will look something like this

<?xml version="1.0" ?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">….</xsl:template>
<xsl:template match="/">….</xsl:template>
</xsl:stylesheet>
An XSLT rule, called a template, contains a match pattern, specified as an XPath expression, that is compared against the nodes in the source XML document.
Now we will see a simple example of an XSLT stylesheet. It also needs some basic knowledge of HTML, which I believe you have. I will explain all the terms so that it is easy to understand. Let's take this XSL.
<?xml version="1.0" ?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<html>
<body>
<table border="2">
<tr>
<th>emp-id</th>
<th>emp-name</th>
<th>salary</th>
</tr>
<xsl:apply-templates/>
</table>
</body>
</html>
</xsl:template>
<xsl:template match="employee">
<tr>
<td><xsl:value-of select="employee_id"/></td>
<td><xsl:value-of select="employee_name"/></td>
<td><xsl:value-of select ="employee_salary"/></td>
</tr>
</xsl:template>
</xsl:stylesheet>
Now we will try to understand this program. The initial statements
<?xml version="1.0" ?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> are the prerequisite for defining any XSL document.
<xsl:template match="/">
It says the template rule is applied to the document root. That is because the slash (/) character is an XPath expression that selects the root node of the XML document.
Next is our HTML code, in which we define a table with three column headings: emp-id, emp-name and salary.
Now as soon as the processor finds the following statement <xsl:apply-templates/>
it processes the children of the current node and looks for a template whose match pattern fits each of them. The <employees> element has no template of its own, so the built-in rule simply moves on to its children, and the flow of execution reaches <xsl:template match="employee">
Here again we use HTML code, together with xsl:value-of select, to display the corresponding values in the HTML document. Each xsl:value-of is placed within a table data (td) tag pair. It tells the processor to insert the value of the source node selected by the XPath expression.
If you are facing some issues understanding this, I would recommend going through w3schools to get some idea of HTML coding.

Now we will see how to write a corresponding XML document for the XSL document. An XML document that uses the XSL should contain a processing instruction <?xml-stylesheet type… href…..?>
It carries two pseudo-attributes:
the type value, and the href value, which points to the XSLT stylesheet document. A simple XML document for this XSL should look like this.
<?xml version="1.0" ?>
<?xml-stylesheet type="text/xsl" href="emp.xsl"?>
<employees>
………………..
</employees>
Now let's design the XML document for the XSL document.
<?xml version="1.0" ?>
<?xml-stylesheet type="text/xsl" href="emp.xsl" ?>
<employees>
<employee>
<employee_id>123</employee_id>
<employee_name>Arpit</employee_name>
<employee_salary>650000</employee_salary>
</employee>
<employee>
<employee_id>1234</employee_id>
<employee_name>Ankit</employee_name>
<employee_salary>7000000</employee_salary>
</employee>
</employees>

Now try to open the XML file employees.xml in Internet Explorer or any other browser and check the output. You will get a bordered table with the headings emp-id, emp-name and salary and one row for each employee.



So now you will have a much clearer idea of what an XSL document is used for and how it works. Now let's take another look at the code and try to understand it.


====================================

============================================
Template Rules
The general syntax for a template rule is
<xsl:template match="XPath expression">
output-template
</xsl:template>
Template rules determine what the XSLT processor will output when an input node matches the template. The output template contains the instructions for formatting the result document. It can be empty, or it can be a combination of HTML, XML and so on. The example that we took to illustrate the functionality of XSL contains a template with match="/". This matches the root node only, not any other node of the input document. To process the child nodes as well, the template must include an <xsl:apply-templates/> instruction.

<xsl:value-of>
The xsl:value-of instruction has the general syntax
<xsl:value-of select="expression" disable-output-escaping="yes|no"/>
Let's take an example
<employee name="arpit">
<emp_id>420</emp_id>
<emp_salary>6.5</emp_salary>
</employee>
Now we will use the xsl:value-of instruction, with <employee> as the current node, to get an idea of what it fetches.
If we use <xsl:value-of select="emp_id"/> it will give us 420
If we use <xsl:value-of select="@name"/> it will give us arpit.
The disable-output-escaping attribute values are
1>yes, which writes special characters raw: an &amp; in the input is output as the & character and an &lt; as <
2>no (the default), which escapes them, so the output again contains &amp; and &lt;
So in short it controls whether special characters are written as entity references in the output.
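As a rough illustration of the two settings, here is a minimal sketch. The <note> element and its text are hypothetical, made up only for this example. Also keep in mind that a processor is not obliged to support disable-output-escaping, so treat the second result as typical rather than guaranteed.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Hypothetical input: <note>Tom &amp; Jerry</note> -->
  <xsl:template match="note">
    <result>
      <!-- Default (no): the ampersand is escaped again on output -->
      <escaped><xsl:value-of select="."/></escaped>
      <!-- yes: the ampersand is written raw, so the result may no
           longer be well-formed XML -->
      <raw><xsl:value-of select="." disable-output-escaping="yes"/></raw>
    </result>
  </xsl:template>
</xsl:stylesheet>
```

Because the raw form can break the result document, it is usually safer to leave the attribute at its default unless you really need unescaped output.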

<xsl:apply-templates/>
It is used to process the children of the current node, and it works recursively. Let's take an example
<xsl:template match="/">
<xsl:apply-templates/>
</xsl:template>
<xsl:template match="Arpit">
Describe Arpit
<xsl:apply-templates/>
</xsl:template>
<xsl:template match="Ankit">
Describe Ankit and Arpit
</xsl:template>
Here, assuming the source document has an <Arpit> element that contains an <Ankit> element, the processor starts looking for a matching template when it first encounters <xsl:apply-templates/>, so the execution reaches
<xsl:template match="Arpit">
It describes Arpit and encounters <xsl:apply-templates/> again, so it looks for a match once more and finds <xsl:template match="Ankit">, and thus it describes both Ankit and Arpit. The output produced by the child node templates is inserted at the location where the <xsl:apply-templates/> appears.
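To make the flow concrete, here is a minimal source document that the three templates above would fit. The nesting is an assumption, since the example does not show its input. The only thing that matters is that Ankit is a child of Arpit.

```xml
<?xml version="1.0"?>
<!-- Hypothetical input: the root node leads to Arpit, and Arpit leads to Ankit -->
<Arpit>
  <Ankit/>
</Arpit>
```

With this input the root template hands over to the Arpit template, and the <xsl:apply-templates/> inside it hands over to the Ankit template, so the Ankit text ends up nested inside the Arpit output.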


Controlling template activation
We have seen that <xsl:apply-templates/> activates the child nodes by default. However, we can control the flow by using a select attribute with apply-templates to direct the execution to particular nodes. This will be clearer with an example.
<?xml version="1.0" ?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<xsl:apply-templates select="//sale"/>
</xsl:template>
<xsl:template match="sales">
<b>Arpit</b>
</xsl:template>
<xsl:template match="sale">
<b>Ankit</b>
</xsl:template>
</xsl:stylesheet>

Here the output of this XSL will be Ankit in bold letters, because I have used HTML markup for it. It won't display the Arpit value, because that template is never activated: we have instructed the processor to find the matching template only for the nodes named in the select. If we had not defined the select in apply-templates, both values, Arpit and Ankit, would have been displayed.
The XML that we will be using is
<?xml version="1.0" ?>
<?xml-stylesheet type="text/xsl" href="emp.xsl" ?>
<employee>
<sales>Ankit</sales>
<sale>Arpit</sale>
</employee>

Note that although the sale element contains the text Arpit, in our XSL we have
<xsl:template match="sale">
<b>Ankit</b>
</xsl:template>
which says to print Ankit in bold, so the output of this XML if opened in any browser will be
Ankit
======================
Template rules and priorities
There are two rules of thumb
1>The rule with the highest priority is applied.
2>If the priorities are equal, the rule that appears last is used. (Strictly, XSLT 1.0 treats this as an error that a processor may report; a processor that recovers picks the last rule.)
When no priority is given, the default priority depends on the match pattern and lies between -0.5 and 0.5.
You can assign a priority explicitly as follows
<xsl:template match="/" priority="1">
Again there are a few points which you must keep in mind about the default priorities
1>A -0.5 value is applied if the match pattern is a bare node test such as *, node() or text().
2>A -0.25 value for patterns using a namespace prefix with a wildcard, for example, employees:*
3>A 0 value if the pattern is a single element or attribute name, with or without a namespace prefix, or processing-instruction() with a name, e.g. employee or employees:employee.
4>A 0.5 value for all other patterns, e.g. /employee, which uses an absolute XPath pattern.
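The sketch below shows how these priorities decide between templates that match the same node. The output text and the num attribute test are only illustrative. In XSLT 1.0 a plain element name gets a default priority of 0, and a pattern with a predicate falls under "all other patterns" with 0.5.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Plain name test: default priority 0 -->
  <xsl:template match="employee">
    <p>generic employee</p>
  </xsl:template>
  <!-- Pattern with a predicate: default priority 0.5, so it beats
       the plain name test for the first employee -->
  <xsl:template match="employee[1]">
    <p>first employee</p>
  </xsl:template>
  <!-- Explicit priority 2 beats both defaults for every employee
       that carries a num attribute -->
  <xsl:template match="employee[@num]" priority="2">
    <p>numbered employee</p>
  </xsl:template>
</xsl:stylesheet>
```

If you remove the explicit priority from the last template, it drops back to 0.5 and ties with employee[1] for the first employee, which is exactly the situation the second rule of thumb covers.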

Default Template rules
XSLT provides built-in template rules, which are applied to nodes that have no matching template rule in the XSL stylesheet. For example, for element and root nodes
<xsl:template match="*|/">
<xsl:apply-templates/>
</xsl:template>
We will try to understand this with our previous example
<?xml version="1.0" ?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<html>
<body>
<table border="2">
<tr>
<th>emp-id</th>
<th>emp-name</th>
<th>salary</th>
</tr>
<xsl:apply-templates/>
</table>
</body>
</html>
</xsl:template>
<xsl:template match="employee">
<tr>
<td><xsl:value-of select="employee_id"/></td>
<td><xsl:value-of select="employee_name"/></td>
<td><xsl:value-of select ="employee_salary"/></td>
</tr>
</xsl:template>
</xsl:stylesheet>
In this XSL the <xsl:template match="/"> matches the root node and
<xsl:template match="employee"> matches the employee elements. There is no template for the employees element, so when the <xsl:apply-templates/> in the first template reaches it, the processor falls back on the built-in rule
<xsl:template match="*|/">
<xsl:apply-templates/>
</xsl:template>
That is, the built-in rule is applied to the document element employees, the processor again finds <xsl:apply-templates/>, searches for a matching template for each child and hence reaches <xsl:template match="employee">.

Once the tree for a source document or stylesheet has been built, and before the processor works on it, text nodes that contain only white space may be stripped. Stripping a text node removes it from the tree. XSLT provides
1><xsl:strip-space elements="abc"/> to define a space-separated list of elements in which white-space-only text nodes are stripped.
2><xsl:preserve-space elements="abc"/> to define a space-separated list of elements in which they are preserved.
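A small sketch of how the two declarations can be combined. The element names come from the employee examples in this post. Using * to mean "every element" and then making an exception for employee_name is just one possible setup.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Strip white-space-only text nodes inside every source element -->
  <xsl:strip-space elements="*"/>
  <!-- The more specific name wins over *, so white space inside
       employee_name is kept -->
  <xsl:preserve-space elements="employee_name"/>
  <xsl:template match="employee">
    <xsl:value-of select="employee_name"/>
  </xsl:template>
</xsl:stylesheet>
```

Both elements must be direct children of <xsl:stylesheet>. Only text nodes made up entirely of white space are affected; text such as Arpit is never touched.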



Looping with <xsl:for-each>
It is used as a nested instruction within an <xsl:template> element.
It will be clearer with an example
<?xml version="1.0" ?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="employees">
<xsl:for-each select="employee">
<p><xsl:value-of select="."/></p>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>

Let's take an XML document which will be used for this XSL
<?xml version="1.0" ?>
<?xml-stylesheet type="text/xsl" href="emp.xsl" ?>
<employees>
<employee>
<employee_name>Arpit</employee_name>
<employee_id>420</employee_id>
</employee>
<employee>
<employee_name>Ankit</employee_name>
<employee_id>840</employee_id>
</employee>
</employees>
Here we are assuming the XSL document is named emp.xsl. Now if you open the XML file in a browser you will get the following output
Arpit 420
Ankit 840
Now let's understand why this result came.
As you can see, we have not defined a template for the root node, but the processor still applies the default template rule and so reaches the employees element. The employees template then runs the for-each statement, which tells the processor to visit every employee element that is a child of employees. There are two employee elements, so the body of the for-each is executed twice, and value-of select="." prints the text of each employee, giving the output shown above.


Output Formats
The <xsl:output> element specifies the output format of the result. It must be a child of <xsl:stylesheet>. In general it is defined as
<xsl:output method="xml" media-type="text/xml"/>
The method can be xml, html or text; a processor may also support other formats of its own.
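For instance, a stylesheet that produces a web page would typically ask for the html method. The following is only a sketch; the indent and encoding values are choices made for this example, not requirements.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Serialize the result as HTML, indented, in UTF-8 -->
  <xsl:output method="html" indent="yes" encoding="UTF-8"/>
  <xsl:template match="/">
    <html><body><p>Employee report</p></body></html>
  </xsl:template>
</xsl:stylesheet>
```

With method="html" the processor writes the result following HTML conventions, for example it does not emit an XML declaration.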


Attribute Value templates
This is one of the most important features of XSLT. It allows the value of an element in the source XML document to be placed into an attribute in the result document.
Let's take an example and try to understand this
Let's suppose we have the following root element
<employee>
<employee_name>Arpit</employee_name>
<employee_id>420</employee_id>
</employee>
and corresponding xsl document as
<xsl:template match="/">
<employees><xsl:apply-templates/></employees>
</xsl:template>
<xsl:template match="employee">
<employee id="{employee_id}" name="{employee_name}"/>
</xsl:template>

The resulting output document for the XML document, when processed with the given XSL, will be something like this
<employees>
<employee id="420" name="Arpit"/>
</employees>
Thus we can see that the elements in the XML document are converted to attributes in the resulting XML document.


Creating Elements with Attributes
To create an element, use
<xsl:element name="emp_name">Arpit</xsl:element>
This corresponds to the following in the XML document
<emp_name>Arpit</emp_name>
To create an attribute, use
<xsl:element name="emp_name">
<xsl:attribute name="id">420</xsl:attribute>Arpit</xsl:element>
So the corresponding output XML will be
<emp_name id="420">Arpit</emp_name>
Using an attribute set
<xsl:attribute-set name="employee_info">
<xsl:attribute name="id">420</xsl:attribute>
<xsl:attribute name="name">Arpit</xsl:attribute>
</xsl:attribute-set>
<xsl:template match="employee[1]">
<xsl:element name="{local-name()}" use-attribute-sets="employee_info">
</xsl:element>
</xsl:template>
This will fetch us the result
<employee id="420" name="Arpit"/>

==================================================================

Sorting an XML document
The <xsl:sort> instruction can be used to sort the output document in the order of the pattern specified. Let's take an example to understand this.
<?xml version="1.0" ?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/employees">
<xsl:for-each select="employee">
<xsl:sort select="last_name"/>
<p><xsl:value-of select="."/></p>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
Now let's apply this XSL to the XML document and try to find out the result
<?xml version="1.0" ?>
<?xml-stylesheet type="text/xsl" href="emp.xsl" ?>
<employees>
<employee num="1">
<employee_id>420</employee_id>
<last_name>Rahi</last_name>
</employee>
<employee num="2">
<employee_id>840</employee_id>
<last_name>Kumar</last_name>
</employee>
<employee num="3">
<employee_id>1260</employee_id>
<last_name>krishna</last_name>
</employee>
</employees>

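So if we try to open the XML file in any browser we will get the following output
1260 krishna
840 Kumar
420 Rahi

It will be in ascending order of the last name. Note that how upper-case and lower-case letters are ordered relative to each other can vary between processors.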
=================================================
Conditional processing
XSLT provides several conditional instructions for processing XML documents.
<xsl:if>. It evaluates an XPath expression in its test attribute and converts the result to a boolean, true or false. Any numeric value other than zero (and NaN) is true; a zero-length string is false and any other string is true; a node-set with at least one node is true and an empty node-set is false. Its general syntax is
<xsl:if test="test condition|element|nodes">
instructions to process
</xsl:if>

Similar to <xsl:if> we have one more conditional instruction, <xsl:choose>, which has one or more <xsl:when> elements and optionally one <xsl:otherwise>.
It is just like a conditional branch in Java, which says: when this is true do this, otherwise perform the default. Its general syntax is
<xsl:choose>
<xsl:when test="test condition">
instructions to process
</xsl:when>
<xsl:otherwise>
Do something else
</xsl:otherwise>
</xsl:choose>
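Here is a small sketch that puts both instructions to work on the employees document used earlier in this post (the one with employee_name and employee_salary). The salary limits and the band names are hypothetical, chosen only to show the branching.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/employees">
    <html>
      <body>
        <xsl:for-each select="employee">
          <p>
            <xsl:value-of select="employee_name"/>
            <!-- xsl:if has no else branch: the text is added only
                 when the test is true -->
            <xsl:if test="employee_salary &gt; 1000000">
              <xsl:text> (above the hypothetical limit)</xsl:text>
            </xsl:if>
            <xsl:text>: </xsl:text>
            <!-- xsl:choose uses the first xsl:when whose test is true -->
            <xsl:choose>
              <xsl:when test="employee_salary &gt;= 5000000">band A</xsl:when>
              <xsl:when test="employee_salary &gt;= 500000">band B</xsl:when>
              <xsl:otherwise>band C</xsl:otherwise>
            </xsl:choose>
          </p>
        </xsl:for-each>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```

With the salaries 650000 and 7000000 from that document, Arpit falls in band B and Ankit, who is also above the limit, falls in band A. The order of the xsl:when elements matters, because only the first one that is true is used.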

Modes
Modes in XSL allow the same input XML element to be processed more than once. They are applied using the mode attribute in <xsl:template> and <xsl:apply-templates>
<xsl:template match="employee" mode="toc">
<xsl:template match="employee" mode="body">
and referenced as
<xsl:apply-templates select="employee" mode="toc"/>
<xsl:apply-templates select="employee" mode="body"/>
The mode attribute value here is either toc, i.e. table of contents, or body. These are just names chosen by the stylesheet author; each apply-templates call activates only the templates declared with the same mode.
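Putting those fragments into a complete stylesheet might look like the sketch below. It assumes the employees document with employee_name and employee_id used earlier in this post; the HTML layout is only an illustration.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/employees">
    <html>
      <body>
        <!-- First pass: one list entry per employee -->
        <ul><xsl:apply-templates select="employee" mode="toc"/></ul>
        <!-- Second pass over the same nodes: the details -->
        <xsl:apply-templates select="employee" mode="body"/>
      </body>
    </html>
  </xsl:template>
  <xsl:template match="employee" mode="toc">
    <li><xsl:value-of select="employee_name"/></li>
  </xsl:template>
  <xsl:template match="employee" mode="body">
    <p><xsl:value-of select="employee_name"/>: <xsl:value-of select="employee_id"/></p>
  </xsl:template>
</xsl:stylesheet>
```

Each employee element is visited twice, once by the toc template and once by the body template, which is not possible with two templates that have the same match pattern and no mode.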


Calling templates by name
You can name a template by specifying the name attribute
<xsl:template name="employee">
Now you can call it anywhere in your stylesheet by the name employee. The general syntax for calling a template is
<xsl:call-template name="employee"/>

Creating and using parameters
Parameters can be defined using <xsl:param>
<xsl:param name="emp_id"/>
It can then be used in an instruction as
<xsl:for-each select="//employee[id=$emp_id]">

We can pass the parameter using <xsl:with-param> in an <xsl:apply-templates> or <xsl:call-template>
<xsl:with-param name="emp_id" select="employee/id"/>
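The pieces above can be combined with a named template, roughly as in this sketch. The template name show_employee, the default value and the literal id 420 are made up for the example, and it assumes the employees document with employee_id and employee_name used earlier, so the predicate tests employee_id rather than id.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/employees">
    <!-- Call the named template and pass the id as a string -->
    <xsl:call-template name="show_employee">
      <xsl:with-param name="emp_id" select="'420'"/>
    </xsl:call-template>
  </xsl:template>
  <xsl:template name="show_employee">
    <!-- xsl:param must come first in the template; the select
         gives a default used when no value is passed -->
    <xsl:param name="emp_id" select="'0'"/>
    <xsl:for-each select="//employee[employee_id = $emp_id]">
      <p><xsl:value-of select="employee_name"/></p>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Note the inner single quotes in select="'420'". Without them the processor would read the value as an XPath expression rather than a string, which is a common source of confusion with parameters.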

Using the ORAXSL utility
The oraxsl utility is a command-line utility which transforms XML documents with an XSLT stylesheet. It requires two things: the java executable in the PATH and the xmlparserv2.jar file in the CLASSPATH.
The general syntax for calling the oraxsl utility is
java oracle.xml.parser.v2.oraxsl [opts] source stylesheet [result]
opts represents zero or more options
source represents the source XML document
stylesheet represents the XSL document
result is the output file, which can be text, XML or HTML. If it is not specified, the result is written to the command window (standard output).

So if I take the previous example, it will be



As I have not specified the destination (result) location, the output is produced in the command console itself. You can get the help for the command
