Thursday, April 25, 2013

XSLT - Tips to make you a better XSLT programmer

The combination of XML and XSLT is growing in popularity with webmasters of medium-sized and large Web sites. Prior to XSLT, changing the presentation of a Web site was a major undertaking: one had to revisit and change every page on the site. XSLT automates the process, leading to significant time savings.
Tip 1: Cascading style sheets, tables, and XSLT
It may seem strange to start with a tip on CSS in an article on XSLT, but I'm often asked "Are the two style sheet languages compatible?" The answer is a resounding yes.
This is illustrated in Listing 1 with products.xml, a list of products written in XML. Take a minute to become familiar with products.xml because I'll use it throughout the five tips.

Listing 1. products.xml, a list of products in XML
<?xml version="1.0"?>
<products>
   <product href="http://www.playfield.com/text">
      <name>Playfield Text</name>
      <price currency="usd">299</price>
      <description>Faster than the competition.</description>
      <version>1.0</version>
   </product>
   <product href="http://www.playfield.com/virus">
      <name>Playfield Virus</name>
      <price currency="eur">199</price>
      <description>
         Protect yourself against malicious code.
      </description>
      <version>5.0</version>
   </product>
   <product href="http://www.playfield.com/calc">
      <name>Playfield Calc</name>
      <price currency="usd">299</price>
      <description>Clear picture on your data.</description>
      <version>1.5</version>
   </product>
   <product href="http://www.playfield.com/db">
      <name>Playfield DB</name>
      <price currency="cad">599</price>
      <description>Organize your data.</description>
   </product>
</products>

To format this document in HTML, you could use table.xsl in Listing 2. Applying table.xsl to products.xml produces the page shown in Figure 1. As you can see, every other line has a gray background, which improves readability. In table.xsl, this is achieved through a cascading style sheet.

Listing 2. table.xsl, an XSLT style sheet that uses CSS
<?xml version="1.0"?>
<xsl:stylesheet
   version="1.0"
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="html" indent="no"/>
   
<xsl:template match="/products">
   <html>
      <head>
         <title>Cascading Style Sheet</title>
         <link rel="stylesheet" type="text/css" href="table.css" 
              title="Style"/>
      </head>
      <body>
        <table>
          <tr class="header">
             <td>Name</td>
             <td>Price</td>
             <td>Description</td>
          </tr>
          <xsl:apply-templates/>
        </table>
      </body>
   </html>
</xsl:template>

<xsl:template match="product[position() mod 2 = 1]">
   <tr class="odd">
      <td><xsl:value-of select="name"/></td>
      <td><xsl:value-of select="price"/></td>
      <td><xsl:value-of select="description"/></td>
   </tr>
</xsl:template>

<xsl:template match="product">
   <tr class="even">
      <td><xsl:value-of select="name"/></td>
      <td><xsl:value-of select="price"/></td>
      <td><xsl:value-of select="description"/></td>
   </tr>
</xsl:template>

</xsl:stylesheet>

Microtip: Smaller HTML files

Note the indent="no" attribute in the <xsl:output> element of table.xsl in Listing 2. The attribute tells the XSLT processor not to indent the HTML document, which typically results in smaller HTML files that download faster.

Figure 1. Product list in HTML
Name              Price   Description
Playfield Text    299     Faster than the competition.
Playfield Virus   199     Protect yourself against malicious code.
Playfield Calc    299     Clear picture on your data.
Playfield DB      599     Organize your data.

How does it work? The table.xsl style sheet inserts an HTML <link> element in the output. The <link> loads a cascading style sheet:
<link rel="stylesheet" type="text/css" href="table.css" title="Style"/>
No conflict exists between CSS and XSLT because they are not used at the same time. The XSLT style sheet is applied first to products.xml, the XML document. The result is an HTML document that is passed to the browser. The fact that the HTML document loads a cascading style sheet is irrelevant at this stage. It is only when the browser loads the HTML document that the CSS is put to use.
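In table.xsl, the class attribute of each row is set to odd or even, depending on which template's match attribute selects the product: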
<xsl:template match="product[position() mod 2 = 1]">
   <tr class="odd">
      <td><xsl:value-of select="name"/></td>
      <td><xsl:value-of select="price"/></td>
      <td><xsl:value-of select="description"/></td>
   </tr>
</xsl:template>

The simple cascading style sheet in Listing 3, table.css, places a gray background on even lines of the table.

Listing 3. table.css, the CSS for the table in Figure 1
.header { background-color: #999999; font-weight: bold; }
.odd { background-color: transparent; }
.even { background-color: #dfdfdf; }
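
The two near-identical templates in Listing 2 can also be folded into one. The following is a minimal sketch, not part of the original listings, that computes the class attribute inside a single template. It assumes the same products.xml and table.css. Note that it selects only product elements in <xsl:apply-templates>; otherwise position() would also count the whitespace text nodes between products and every row would come out even.

```xml
<?xml version="1.0"?>
<xsl:stylesheet
   version="1.0"
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="html" indent="no"/>

<xsl:template match="/products">
   <html>
      <head>
         <title>Cascading Style Sheet</title>
         <link rel="stylesheet" type="text/css" href="table.css"
               title="Style"/>
      </head>
      <body>
         <table>
            <tr class="header">
               <td>Name</td>
               <td>Price</td>
               <td>Description</td>
            </tr>
            <!-- select only product elements so that position()
                 counts products and nothing else -->
            <xsl:apply-templates select="product"/>
         </table>
      </body>
   </html>
</xsl:template>

<xsl:template match="product">
   <tr>
      <!-- xsl:attribute must come before any child element of tr -->
      <xsl:attribute name="class">
         <xsl:choose>
            <xsl:when test="position() mod 2 = 1">odd</xsl:when>
            <xsl:otherwise>even</xsl:otherwise>
         </xsl:choose>
      </xsl:attribute>
      <td><xsl:value-of select="name"/></td>
      <td><xsl:value-of select="price"/></td>
      <td><xsl:value-of select="description"/></td>
   </tr>
</xsl:template>

</xsl:stylesheet>
```

The trade-off is a matter of taste: two templates keep the row markup free of conditionals, while a single template avoids repeating the cells.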

When should you mix CSS and XSLT? I think the combination is attractive in the following cases:
  • To gain more control over display since CSS is more powerful than raw HTML
  • To produce smaller HTML files that download faster
Note, however, that you don't gain much in maintainability when you combine both types of style sheets. Indeed, designers were originally attracted to CSS because one file controls the design of the entire Web site. In that respect, CSS is redundant with XSLT, which already makes it easy to reformat several documents at once.
Beware! table.xsl and the other style sheets in this article need a standard-compliant XSLT processor. The one that ships with Internet Explorer 5.0 and 5.5 is not standard compliant. If you need a compliant XSLT processor, try Xalan, the XSLT processor from the Apache project, or upgrade the processor in IE to MSXML 3.0 (see Resources).
Tip 2: HTML entities
Another common question developers ask about writing XSLT style sheets is how to insert HTML entities, in particular the &nbsp; entity (nonbreaking space). Among other things, &nbsp; spaces are used to create non-empty cells in tables.
Unfortunately, the obvious solution does not work:
<td>&nbsp;</td>
Why? The XSLT style sheet is an XML document. When it is parsed, entities are resolved, and the parser tries to resolve &nbsp; as an XML entity. Since &nbsp; is defined in HTML but not in XML, it results in an error.
The workaround is illustrated in nbsp.xsl, in Listing 4; the relevant element is the <xsl:text> in the last template. As you can see, writing an entity in HTML takes more characters than you might like to type each time you need one, but once you know the code you can simply cut and paste it where appropriate.

Listing 4. nbsp.xsl, showing how to insert HTML entities in style sheets
<?xml version="1.0"?>

<xsl:stylesheet
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
   version="1.0">

<xsl:output method="html" indent="no"/>

<xsl:template match="/">
  <html>
    <head><title>HTML Entities</title></head>
      <body>
        <table border="1">
          <tr>
            <td>Name</td>
            <td>Price</td>
            <td>Description</td>
            <td>Version</td>
          </tr>
          <xsl:apply-templates/>
       </table>
    </body>
  </html>
</xsl:template>

<xsl:template match="product[version]">
  <tr>
    <td><xsl:value-of select="name"/></td>
    <td><xsl:value-of select="price"/></td>
    <td><xsl:value-of select="description"/></td>
    <td><xsl:value-of select="version"/></td>
  </tr>
</xsl:template>

<xsl:template match="product">
  <tr>
    <td><xsl:value-of select="name"/></td>
    <td><xsl:value-of select="price"/></td>
    <td><xsl:value-of select="description"/></td>
    <td>
      <xsl:text disable-output-escaping="yes">&amp;nbsp;</xsl:text>
    </td>
  </tr>
</xsl:template>

</xsl:stylesheet>


Microtip: Multiple output documents

What about the opposite case, when you need to create several output XML documents from one style sheet? Unfortunately, at the time of writing, there is no standard solution. Most XSLT processors, Xalan among them, offer some processor-specific support for multiple output files. And don't despair of a standard solution; XSLT 1.1 (currently under development) will support multiple output documents.
The correct code for the nonbreaking space entity, as excerpted from Listing 4, is:
<xsl:text disable-output-escaping="yes">&amp;nbsp;</xsl:text>
What's happening in that line of code? The trick is to use an <xsl:text> element with the disable-output-escaping="yes" attribute. <xsl:text> creates text in the output HTML, and the attribute tells the processor not to escape that text. The text itself is simply &nbsp;, with the & character escaped as &amp; for XML.
Again, the style sheet is read as an XML document, so the parser turns &amp;nbsp; into the literal text &nbsp;. Because you told the processor not to escape the & in the output HTML, it writes &nbsp; as is.
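
If you would rather not rely on disable-output-escaping, another approach is to declare the entity yourself. The sketch below is an illustration, not one of the original listings. It declares nbsp in the internal DTD subset of the style sheet as the character reference &#160; (the nonbreaking space character), so that &nbsp; becomes legal XML. It assumes that your XML parser processes the internal DTD subset. Whether the output contains &nbsp;, &#160;, or the raw character depends on the processor and the output encoding; all three render the same way in a browser.

```xml
<?xml version="1.0"?>
<!-- declare nbsp so the XML parser can resolve it -->
<!DOCTYPE xsl:stylesheet [
   <!ENTITY nbsp "&#160;">
]>
<xsl:stylesheet
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
   version="1.0">

<xsl:output method="html" indent="no"/>

<xsl:template match="/">
  <html>
    <head><title>HTML Entities</title></head>
    <body>
      <table border="1">
        <tr>
          <td>Name</td>
          <td>Version</td>
        </tr>
        <xsl:apply-templates select="products/product"/>
      </table>
    </body>
  </html>
</xsl:template>

<xsl:template match="product">
  <tr>
    <td><xsl:value-of select="name"/></td>
    <td>
      <xsl:choose>
        <xsl:when test="version"><xsl:value-of select="version"/></xsl:when>
        <!-- a nonbreaking space is not XML whitespace,
             so the processor does not strip it -->
        <xsl:otherwise>&nbsp;</xsl:otherwise>
      </xsl:choose>
    </td>
  </tr>
</xsl:template>

</xsl:stylesheet>
```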
Tip 3: Multiple input documents
A typical XSLT style sheet transforms one XML document into another XML document, or into an HTML document. Sometimes, this is too limiting. (For a word on how to handle the reverse, see the sidebar, Multiple output documents.)
For example, notice in products.xml that the <price> element has a currency attribute. The currency is not spelled out but given as a code (for example, usd for U.S. dollars or cad for Canadian dollars). You will want to translate the codes before displaying them.
With more than a hundred currencies around the world and many applications that handle currencies, you really should store the list of currencies in its own XML document. An extract of the file may look like codes.xml in Listing 5.

Listing 5. codes.xml, an XML document with currency codes
<?xml version="1.0"?>
<currencies>
   <currency>
      <code>eur</code>
      <name>Euros</name>
   </currency>
   <currency>
      <code>usd</code>
      <name>Dollars</name>
   </currency>
   <currency>
      <code>cad</code>
      <name>Canadian dollars</name>
   </currency>
</currencies>

In effect, now the example has two XML files, products.xml and codes.xml, that you need to combine to create the HTML document. Fortunately, XSLT makes it easy to combine several input XML files, as multi.xsl in Listing 6 illustrates.
There are two important steps in multi.xsl. First the style sheet opens codes.xml (with the document() function) and assigns it to the currencies variable:
<xsl:variable name="currencies" select="document('codes.xml')/currencies"/>
Then the style sheet can extract information from the code list through the variable. Of course, you can use XPath to query the currencies document:
<xsl:value-of select="$currencies/currency[code=$currency]/name"/>

Listing 6. multi.xsl, a style sheet that combines several XML documents
<?xml version="1.0"?>
<xsl:stylesheet
   version="1.0"
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="html" indent="no"/>

<xsl:variable
   name="currencies"
   select="document('codes.xml')/currencies"/>

<xsl:template match="/">
   <html>
      <head><title>Multiple documents</title></head>
      <body>
         <table>
            <tr bgcolor="#999999">
               <td>Name</td>
               <td>Price</td>
               <td>Description</td>
               <td>Version</td>
            </tr>
            <xsl:apply-templates/>
         </table>
      </body>
   </html>
</xsl:template>

<xsl:template match="product">
   <xsl:variable name="currency" select="price/@currency"/>
   <tr>
      <td><xsl:value-of select="name"/></td>
      <td>
         <xsl:value-of select="price"/>
         <xsl:text> </xsl:text>
         <xsl:value-of select="$currencies/currency[code=$currency]/name"/>
      </td>
      <td><xsl:value-of select="description"/></td>
      <td><xsl:value-of select="version"/></td>
   </tr>
</xsl:template>

</xsl:stylesheet>
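
One case multi.xsl does not handle is a currency code that is missing from codes.xml: the lookup then selects nothing and the price is printed without a currency. The following sketch, a variation added for illustration, falls back to the raw code in that case. The variable name label is an assumption of this example; everything else follows Listing 6.

```xml
<?xml version="1.0"?>
<xsl:stylesheet
   version="1.0"
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="html" indent="no"/>

<xsl:variable
   name="currencies"
   select="document('codes.xml')/currencies"/>

<xsl:template match="/">
   <html>
      <head><title>Multiple documents</title></head>
      <body>
         <table>
            <xsl:apply-templates select="products/product"/>
         </table>
      </body>
   </html>
</xsl:template>

<xsl:template match="product">
   <xsl:variable name="currency" select="price/@currency"/>
   <!-- label (a name chosen for this sketch) is an empty node-set
        when the code is not listed in codes.xml -->
   <xsl:variable name="label"
      select="$currencies/currency[code=$currency]/name"/>
   <tr>
      <td><xsl:value-of select="name"/></td>
      <td>
         <xsl:value-of select="price"/>
         <xsl:text> </xsl:text>
         <xsl:choose>
            <!-- a non-empty node-set tests as true -->
            <xsl:when test="$label">
               <xsl:value-of select="$label"/>
            </xsl:when>
            <!-- otherwise print the code itself, for example usd -->
            <xsl:otherwise>
               <xsl:value-of select="$currency"/>
            </xsl:otherwise>
         </xsl:choose>
      </td>
   </tr>
</xsl:template>

</xsl:stylesheet>
```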

Tip 4: XSLT and client-side JavaScript
Although XSLT is powerful, in some cases it is not enough. For example, you may want to use client-side scripts, such as JavaScript, JScript, or VBScript.
As we saw with CSS, XSLT places no limits on the HTML being generated -- and that includes the ability to use scripts. Furthermore, as javascript.xsl in Listing 7 illustrates, it is possible to pass values from XSLT to JavaScript.

Listing 7. javascript.xsl, a style sheet that generates client-side JavaScript
<?xml version="1.0"?>

<xsl:stylesheet
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
   version="1.0">

<xsl:output method="html" indent="no"/>

<xsl:template match="/">
   <html>
      <head>
         <title>JavaScript</title>
         <script language="JavaScript"><xsl:comment>
// creates and initializes an array of product descriptions
var urls = new Array()
<xsl:for-each select="products/product">
urls[<xsl:value-of select="position()"/>] = 
       "<xsl:value-of select="@href"/>"
</xsl:for-each>
// user function
function doSelect(i)
{
   open(urls[i])
}
         // </xsl:comment></script>
      </head>
      <body>
         <ul>
            <xsl:for-each select="products/product">
               <li><a href="javascript:doSelect({position()})">
                  <xsl:value-of select="name"/>
               </a></li>
            </xsl:for-each>

         </ul>
      </body>
   </html>
</xsl:template>

</xsl:stylesheet>

Again, to read javascript.xsl, you need to remember what happens where. XSLT is applied first and generates an HTML file. The file includes a <script> element whose content was generated by XSLT. Next the browser loads the HTML file and executes the script. Remember that the script is executed by the browser, not by the style sheet.
For example, in the <xsl:for-each> loop inside the <script> element of javascript.xsl, the style sheet initializes a JavaScript array, in effect passing values to the script.
Later, the style sheet generates calls to the doSelect() function and, again, passes a value to the script (through the XSLT position() function) in the href attribute of each <a> element in Listing 7.
Listing 8 shows what gets generated in HTML. This script is executed by the browser when the user clicks an <a> tag, as shown:

Listing 8. HTML generated from javascript.xsl
<ul>
   <li><a href="javascript:doSelect(1)">Playfield Text</a></li>
   <li><a href="javascript:doSelect(2)">Playfield Virus</a></li>
   <li><a href="javascript:doSelect(3)">Playfield Calc</a></li>
   <li><a href="javascript:doSelect(4)">Playfield DB</a></li>
</ul>
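
When the script needs only one value per link, the array is not strictly necessary. As a hedged variation on javascript.xsl, the sketch below passes the URL straight into the generated call through an attribute value template. It assumes that the href values contain no apostrophes or backslashes, which would break the JavaScript string literal. The void() wrapper is there so that the javascript: link does not replace the current page with the return value of open().

```xml
<?xml version="1.0"?>
<xsl:stylesheet
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
   version="1.0">

<xsl:output method="html" indent="no"/>

<xsl:template match="/">
   <html>
      <head><title>JavaScript</title></head>
      <body>
         <ul>
            <xsl:for-each select="products/product">
               <!-- {@href} is replaced by the product's href attribute
                    before the browser ever sees the page -->
               <li><a href="javascript:void(open('{@href}'))">
                  <xsl:value-of select="name"/>
               </a></li>
            </xsl:for-each>
         </ul>
      </body>
   </html>
</xsl:template>

</xsl:stylesheet>
```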

Could you call JavaScript scripts from within the XSLT style sheet instead of from the HTML document? Yes, using XSLT extensions. Unfortunately, XSLT extensions are not fully standardized in XSLT 1.0. XSLT 1.1 will improve the support.
Tip 5: Automating style sheet creation
This tip is the most ambitious so far in this gang of five tips.
In some cases, you have to write so many style sheets that it makes sense to use a style sheet to create them. This is not as difficult as you may think, and it is particularly useful for Web sites written in different languages or sites with many pages that differ only in details.
This is illustrated in the start.xsl style sheet in Listing 9. At first sight it looks like a typical XSLT style sheet. If you take a closer look, you will see that it uses special elements such as <link> and <para> instead of HTML elements (which would have been <a> or <p>). Furthermore, the <xsl:output> element lacks the method attribute.

Listing 9. start.xsl, a generic XSLT style sheet
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

<xsl:output/>

<xsl:template match="/">
   <page title="XSLT Through Generation">
      <xsl:for-each select="products/product">
         <para>
            <link href="{@href}"><xsl:value-of select="name"/></link>
         </para>
      </xsl:for-each>
   </page>
</xsl:template>

</xsl:stylesheet>

The trick is to use another style sheet, generate_html.xsl, in Listing 10 to turn start.xsl into a more typical style sheet.

Listing 10. generate_html.xsl, a style sheet to adapt start.xsl for HTML
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

<xsl:template match="xsl:output">
  <xsl:copy>
     <xsl:attribute name="method">html</xsl:attribute>
  </xsl:copy>
</xsl:template>

<xsl:template match="page">
   <html>
      <head><title><xsl:value-of select="@title"/></title></head>
      <body>
         <h1><xsl:value-of select="@title"/></h1>
         <xsl:apply-templates/>
      </body>
   </html>
</xsl:template>

<xsl:template match="para">
   <p><xsl:apply-templates/></p>
</xsl:template>

<xsl:template match="link">
   <a href="{@href}"><xsl:apply-templates/></a>
</xsl:template>

<xsl:template match="@*|node()">
  <xsl:copy>
    <xsl:apply-templates select="@*|node()"/>
  </xsl:copy>
</xsl:template>

</xsl:stylesheet>

The generate_html.xsl style sheet treats start.xsl as an XML document and transforms it into another XML document. The fact that start.xsl is itself an XSLT style sheet is of no concern to generate_html.xsl. For example, the following rule turns the <link> element into <a>:
<xsl:template match="link">
   <a href="{@href}"><xsl:apply-templates/></a>
</xsl:template>

Likewise, the following rule copies the <xsl:output> element and adds the method attribute to it:
<xsl:template match="xsl:output">
  <xsl:copy>
     <xsl:attribute name="method">html</xsl:attribute>
  </xsl:copy>
</xsl:template>

Everything else, including XSLT elements such as <xsl:template> and <xsl:for-each>, is copied unchanged by the last rule:
<xsl:template match="@*|node()">
  <xsl:copy>
    <xsl:apply-templates select="@*|node()"/>
  </xsl:copy>
</xsl:template>

Applying generate_html.xsl to start.xsl produces a style sheet, generated.xsl. It is generated.xsl, not start.xsl, that will create the HTML document.
This tip is not about a dog running after his tail, but an example of the sort of benefits gained from pushing the logic of XSLT to its maximum. Indeed, since XSLT is great for transforming XML documents into other XML documents, it stands to reason that it is ideal for transforming XSLT style sheets themselves.

Listing 11. generated.xsl, demonstrating how start.xsl has been specialized for HTML
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
    version="1.0">

<xsl:output method="html"/>

<xsl:template match="/">
  <html><head><title>XSLT Through Generation</title></head>
  <body><h1>XSLT Through Generation</h1>
     <xsl:for-each select="products/product">
        <p>
           <a href="{@href}"><xsl:value-of select="name"/></a>
        </p>
     </xsl:for-each>
  </body></html>
</xsl:template>

</xsl:stylesheet>
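
The rule for <xsl:output> in Listing 10 works by copying an element that already exists in start.xsl. If a generator has to write XSLT instructions from scratch, the standard XSLT 1.0 mechanism is <xsl:namespace-alias>. The following sketch is an illustration under assumptions, not part of the original listings: the out prefix and its urn:example:generated-xslt namespace URI are made-up placeholders. Elements written with the out prefix are emitted in the XSLT namespace instead of being executed. Some processors keep the out prefix in the result rather than xsl, which is harmless because the namespace URI is what counts.

```xml
<?xml version="1.0"?>
<xsl:stylesheet
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
   xmlns:out="urn:example:generated-xslt"
   version="1.0">

<!-- elements in the (hypothetical) out namespace are written
     to the result as XSLT elements -->
<xsl:namespace-alias stylesheet-prefix="out" result-prefix="xsl"/>

<!-- replace the xsl:output of the input style sheet with a new one -->
<xsl:template match="xsl:output">
   <out:output method="html" indent="no"/>
</xsl:template>

<!-- copy everything else unchanged -->
<xsl:template match="@*|node()">
  <xsl:copy>
    <xsl:apply-templates select="@*|node()"/>
  </xsl:copy>
</xsl:template>

</xsl:stylesheet>
```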

You will want to apply this tip when your Web site requires the creation of many style sheets. For example, by replacing generate_html.xsl with generate_wml.xsl, you can automatically adapt start.xsl for WML (WML is a markup language for wireless smartphones).

Listing 12. generate_wml.xsl, which adapts start.xsl to WML
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

<xsl:template match="xsl:output">
<xsl:copy>
 <xsl:attribute name="method">xml</xsl:attribute>
 <xsl:attribute name="doctype-public">-//WAPFORUM//DTD WML 1.1//EN</xsl:attribute>
 <xsl:attribute name="doctype-system">http://www.wapforum.org/DTD/wml_1.1.xml</xsl:attribute>
</xsl:copy>
</xsl:template>

<xsl:template match="page">
   <wml>
      <card title="{@title}">
         <xsl:apply-templates/>
      </card>
   </wml>
</xsl:template>

<xsl:template match="para">
   <p><xsl:apply-templates/></p>
</xsl:template>

<xsl:template match="link">
   <anchor><go href="{@href}"/><xsl:apply-templates/></anchor>
</xsl:template>

<xsl:template match="@*|node()">
  <xsl:copy>
    <xsl:apply-templates select="@*|node()"/>
  </xsl:copy>
</xsl:template>

</xsl:stylesheet>

The moral of the story: If you are after maximum automation for a large Web site, it is not enough to generate HTML documents automatically from XML documents; you ought to generate XSLT style sheets automatically as well.

Resources
  • The Xalan XSLT processor, from the Apache project, is distributed as open source. It is available in Java and C++.
  • XT is another popular XSLT processor written in Java; a beta release is available.
  • To use standard-compliant style sheets with Internet Explorer, you need to upgrade to MSXML 3.0.
  • XSLT 1.0 is the latest official version of XSLT at the time of writing.
  • XSLT 1.1 is currently under development. It features a few needed improvements.
  • The W3C XSLT page offers many pointers to XSLT resources.

Reference

http://www.ibm.com/developerworks/library/x-xslt5/index.html
