Formatting Objects considered harmful

This article was published in 1999. Since then, XSLT and XSL-FO have become separate Recommendations (XSLT 1.0, XSLT 2.0, XSL-FO 1.0, XSL-FO 1.1), but have not seen much use on the web. This article has been updated to use the terminology established by the Recommendations. Also, the style sheet has been tweaked, the contact information has been updated and some links have been revised. The original article is here

Abstract

The W3C Working Group on XSL is currently producing two specifications: a transformation language (XSLT) and a set of formatting objects written in XML (XSL-FO). The idea is for XSLT to transform XML data and documents into a set of formatting objects which can subsequently be rendered. On the ladder of abstraction from presentation to semantics, XSL-FO is at the level of presentational HTML elements. A Web of XSL-FO documents can be compared to a Web of HTML documents with only FONT and BR tags. Although XSL-FO is not intended to be used on the Web, it's unlikely that such use can be prevented. XSL-FO is therefore a threat to accessibility, device-independence and the dream of a semantic Web. The note ends with some suggestions on how to solve the problem.

XSLT and XSL-FO

XML holds the promise of being the cornerstone in the building of a semantic Web. By capturing semantics which is outside the scope of HTML, new formats written in XML will facilitate, amongst other things, better document cataloging and discovery services for authors and users.

The XSL Working Group is working on two specifications which, if successful, will change this picture. The first is a transformation language (XSLT), and the second is a DTD for formatting objects written in XML (XSL-FO).

A common use of XSLT is to transform XML data and documents into HTML on the server side. Several experimental implementations support XSLT and they allow content providers to use their favorite DTDs internally while serving HTML to the huge installed base of Web browsers. XSLT provides a declarative way of specifying simple transformations, and this is a good thing.

XSLT can also be used to generate XSL-FO. Formatting objects describe how chunks of information are formatted before being presented to a human user. The push for XSL-FO comes from vendors with a noble goal: they would like to improve the quality of printed material from the Web. Unfortunately, when transforming documents into XSL-FO, all semantics is removed and only the human presentation is left. Moreover, the presentation is tied to a certain output medium (which most likely is visual).

If XSL-FO is deployed on the Web, accessibility, device-independence and semantics will be the victims. It's important to note that this problem only arises when XSL-FO documents are shipped across the Web. When contained within formatters, formatting objects are not harmful.

Code examples

This section will give three examples of how XSLT can be used. The first example transforms from XML to HTML, the second transforms from XML to XSL-FO and the third transforms from XML to HTML/CSS. All examples use this simple XML element as input:

Example 1: XML to HTML

The resulting HTML is at a high enough level of abstraction that device-independence and accessibility are preserved. What is lacking is information about how to present it.
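To make the mapping concrete, here is a minimal sketch in Python that imitates the effect of such a transformation with the standard library's ElementTree. It is not an XSLT sheet, and the input element name `headline` and its content are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: the element name "headline" is an assumption.
SOURCE = "<headline>Formatting Objects considered harmful</headline>"

def to_html(xml_text: str) -> str:
    """Map the assumed <headline> element onto an HTML h1 element."""
    src = ET.fromstring(xml_text)
    h1 = ET.Element("h1")   # h1 carries the "headline of level 1" meaning
    h1.text = src.text
    return ET.tostring(h1, encoding="unicode")

print(to_html(SOURCE))
# prints: <h1>Formatting Objects considered harmful</h1>
```

The output says what the content is (a top-level headline) and nothing about how it should look.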

Example 2: XML to XSL-FO

In this example, the XSLT sheet transforms the XML element into a formatting object:

The difference between example 1 and example 2 is one of semantics vs. presentation. When transformed into HTML, the semantics of the XML is preserved since the H1 element is globally recognized as being a headline of level 1. When transformed into XSL-FO, semantics is removed and replaced by presentational properties.
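The loss can be illustrated with a small Python sketch that imitates such a transformation using the standard library's ElementTree. It is not an XSLT sheet; the input element name `headline` and the property values (24pt, bold) are assumptions chosen for illustration.

```python
import xml.etree.ElementTree as ET

FO_NS = "http://www.w3.org/1999/XSL/Format"  # XSL-FO namespace
ET.register_namespace("fo", FO_NS)

# Hypothetical input: the element name "headline" is an assumption.
SOURCE = "<headline>Formatting Objects considered harmful</headline>"

def to_fo(xml_text: str) -> str:
    """Replace the assumed <headline> element with a styled fo:block."""
    src = ET.fromstring(xml_text)
    # The property values below are illustrative, not taken from the article.
    block = ET.Element(
        f"{{{FO_NS}}}block",
        {"font-size": "24pt", "font-weight": "bold"},
    )
    block.text = src.text
    return ET.tostring(block, encoding="unicode")

print(to_fo(SOURCE))
```

Nothing in the resulting fo:block says that the text is a headline; a consumer can only infer it from the font size and weight.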

Example 3: XML to HTML/CSS

The last example transforms XML into an HTML element with associated CSS stylistic properties:

The result preserves the semantics while also containing information on how to present the content. This is the best of both worlds.
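As a rough illustration, the following Python sketch imitates such a transformation with the standard library's ElementTree. It is not an XSLT sheet; the input element name `headline` and the CSS declarations are assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: the element name "headline" is an assumption.
SOURCE = "<headline>Formatting Objects considered harmful</headline>"

def to_html_css(xml_text: str) -> str:
    """Emit an h1 element that keeps its meaning and carries inline CSS."""
    src = ET.fromstring(xml_text)
    # Illustrative declarations; any CSS properties could be attached here.
    h1 = ET.Element("h1", {"style": "font-size: 24pt; font-weight: bold"})
    h1.text = src.text
    return ET.tostring(h1, encoding="unicode")

print(to_html_css(SOURCE))
# prints: <h1 style="font-size: 24pt; font-weight: bold">Formatting Objects considered harmful</h1>
```

A client that ignores the style attribute still knows it is looking at a level 1 headline.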

(When authoring with CSS, one would normally move the stylistic properties into a separate style sheet. This eases maintenance and makes documents smaller. However, both forms are valid and one can programmatically convert between the two.)
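One direction of that conversion can be sketched in a few lines of Python. This is a simplified illustration, not a complete converter: the function name `extract_styles` is hypothetical, and the sketch assumes that all elements of the same type carry the same inline style, so that one rule per element name is enough.

```python
import xml.etree.ElementTree as ET

def extract_styles(html_text: str) -> tuple[str, str]:
    """Move inline style attributes into a style sheet keyed by element name.

    Simplifying assumption: elements of the same type share one style.
    If they differ, the last one seen wins; a real converter would
    generate class selectors instead.
    """
    root = ET.fromstring(html_text)
    rules = {}
    for el in root.iter():
        style = el.attrib.pop("style", None)  # strip the inline form
        if style:
            rules[el.tag] = style
    css = "\n".join(f"{tag} {{ {decl} }}" for tag, decl in rules.items())
    return ET.tostring(root, encoding="unicode"), css

html, css = extract_styles('<h1 style="font-size: 24pt">Title</h1>')
print(html)  # prints: <h1>Title</h1>
print(css)   # prints: h1 { font-size: 24pt }
```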

A demonstration

XSL-FO was not designed to be used over the Web, and most people interested enough to read this far will agree that the use of XSLT+XSL-FO described in this document is abuse. However, it seems that there is no way to stop the abuse. Rather, it seems like conforming implementations are required to support XSL-FO on the Web.

So, straight out of the box, XSLT+XSL-FO browsers will display XSL-FO documents from the Web.

Analysis

When document semantics is removed and replaced with presentational properties, the content moves downwards on the ladder of abstraction and important information is lost. For example, generating an aural presentation based on the output of example 2 is much harder than basing the aural rendition on the semantic markup which is present in the output from examples 1 and 3. In a scenario where only visual formatting objects are published on the Web, aural renderers are pushed back into the role of the "screen reader", i.e. special software that uses heuristics to decode information meant for another medium. Also, other services which take advantage of semantic markup -- e.g. search engines -- will perform worse.
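The difference shows up as soon as a program tries to find the headlines in a document. The Python sketch below contrasts the two cases; the sample fragments, the function names and the 18pt threshold are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

FO = "{http://www.w3.org/1999/XSL/Format}"

# Hypothetical fragments, not complete documents.
HTML_DOC = "<body><h1>Title</h1><p>Some text.</p></body>"
FO_DOC = (
    '<fo:flow xmlns:fo="http://www.w3.org/1999/XSL/Format">'
    '<fo:block font-size="24pt" font-weight="bold">Title</fo:block>'
    '<fo:block font-size="12pt">Some text.</fo:block>'
    "</fo:flow>"
)

def headings_from_html(text: str) -> list:
    """Semantic markup: headlines are simply the h1 elements."""
    return [el.text for el in ET.fromstring(text).iter("h1")]

def headings_from_fo(text: str, threshold_pt: float = 18.0) -> list:
    """Formatting objects: guess that large blocks are headlines."""
    found = []
    for block in ET.fromstring(text).iter(FO + "block"):
        size = block.get("font-size", "")
        if size.endswith("pt") and float(size[:-2]) >= threshold_pt:
            found.append(block.text)
    return found

print(headings_from_html(HTML_DOC))  # prints: ['Title']
print(headings_from_fo(FO_DOC))      # prints: ['Title']
```

Both functions give the same answer here, but the second one only guesses: it would also report a pull quote set in large type, and it would miss a headline styled at 16pt.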

For these reasons, I believe W3C should encourage authors to publish documents in semantically rich HTML and XML [2] with attached style sheets. The style sheets should be evaluated on the client side. This gives us the best of both worlds: rich applications and rich presentations.

FAQ

This section contains answers to questions that often come up in discussions about the use of XSL-FO on the Web.

The XSLT to XSL-FO transformations don't have to take place on the server side. Can't you preserve semantics by performing the transformation in the client?

Yes. Given the XML source and the XSLT sheet the transformation can take place on the client side. This preserves semantics, and the number of bytes sent over the Web will generally be smaller. In this scenario, however, there is no need for an XML vocabulary to express formatting objects since the client will both transform and present the content. This highlights an important point which isn't clear from the title of the document: it's not formatting objects per se that are harmful (any system that does formatting uses some kind of formatting objects). The harm is done when formatting objects are stored and shipped over the Web.

Ok, but even if transformations take place on the server, accessibility can still be preserved. By defining formatting objects for all media types, presentations for all sorts of devices can be generated, no?

In theory, yes. In practice, no. For example, to successfully present content aurally, there are four prerequisites:

there must be a specification for aural formatting objects
there must be implementations of aural formatting objects
the fact that the user has an aural client must be known to the server
all web sites must install XSLT sheets to transform content into aural formatting objects

Among these, the first two will require much time and work. The third is undesirable, while the fourth is impossible in practice. Besides, caching suffers.

RDF is important, but will not replace semantic markup. That has never been the intention of W3C Metadata efforts.

But, in order to do high-quality printing we need transformations. CSS doesn't have transformations and is therefore unusable. Isn't that why XSLT and XSL-FO were developed?

High-quality printing is very hard, and can't be done without looking at the shape of the glyphs. Neither XSL-FO nor CSS takes this approach. Instead, they both have the same property/value model and as long as the properties and values are the same, their potential for improving printing is the same. Transformations are, if not a prerequisite of printing, at least a helpful tool. The transformation step comes before the styling part, and XSLT can equally well be used with CSS as it can with XSL-FO.

In my organization, I have thousands of legacy word processing documents where styles have been used inconsistently. Don't you think it's better to use XSL-FO and admit that there is no semantics rather than using HTML and claim there is?

Use presentational HTML or PDF for your documents. We can't risk losing the semantic Web due to legacy documents.

W3C is developing SVG and the elements defined in the SVG WD don't have much semantics. They're more like formatting objects. Aren't they just as harmful?

No. Compared to the GIF images SVG will replace, the move represents an upwards climb on the ladder of abstraction. XSL-FO, on the other hand, represents a steep downwards step compared to a CSS-based solution.

Most of the HTML documents on the Web are presentational. Why do you think XSL-FO can make the situation worse?

W3C has been working hard to deprecate presentational elements in HTML, but it's true that they are still widely used. However, the crucial difference between using HTML and XSL-FO is that HTML has the ability to represent semantics as well. HTML allows authors to do the right thing, but this is not the case with XSL-FO.

But the semantics HTML can capture is so shallow. What's the point of using it?

Consider one example from Braille renderings. Since Braille characters take up much space, words are often contracted to fit more text on one page. However, some words -- for example program variables -- should not be contracted. HTML gives you the ability to express this (using the VAR element) and this is crucial to improve Braille renderings. XSL-FO, on the other hand, gives access to the text but information that can be used to decide if a word can be contracted or not is lost.
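A minimal Python sketch of the idea follows. The contraction table is a toy stand-in and not a real Braille code, and the function names are hypothetical; the point is only that the VAR element tells the renderer which words to leave alone.

```python
import xml.etree.ElementTree as ET

# Toy contraction table for illustration only; not a real Braille code.
CONTRACTIONS = {"and": "&", "the": "!"}

def contract(text: str) -> str:
    """Contract whole words; splitting on single spaces preserves spacing."""
    return " ".join(CONTRACTIONS.get(word, word) for word in text.split(" "))

def render(html_text: str) -> str:
    """Contract running text but leave the content of <var> untouched.

    Handles one level of child elements only, which is enough for a sketch.
    """
    root = ET.fromstring(html_text)
    parts = [contract(root.text or "")]
    for child in root:
        body = child.text or ""
        parts.append(body if child.tag == "var" else contract(body))
        parts.append(contract(child.tail or ""))
    return "".join(parts)

print(render("<p>Set <var>and</var> to the sum</p>"))
# prints: Set and to ! sum
```

Given only styled text, the renderer would have no way to tell the variable named "and" from the ordinary word.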

Avoiding disaster

Here are some commented ideas on how to avoid the disaster scenario which has been outlined above:

Conclusion

The use of XSL-FO on the Web is a threat to accessibility, device-independence and a semantic Web. Although XSL-FO was not designed for this use, the current architecture does not prevent it. Technical barriers should be put in place to prevent XSL-FO from being used as a document format on the Web. If technical barriers aren't possible, XSL-FO should not become a W3C Recommendation.

Footnotes

[1] I call it an "XSLT sheet" rather than an "XSLT style sheet" since the XSLT language has no notion of style.

[2] Publishing semantically rich XML should be encouraged when the semantics is globally known, e.g. MathML. Publishing arbitrary XML should be discouraged.