Information exchange in MultiTorg

By Dag Solvoll, Geir Ivarsøy, Håkon W Lie, and Per E Dybvik

1 Introduction
2 Information models; structure and distribution
2.1 Information structures and formats
2.2 Translation among document formats
2.3 Hyperdocuments
3 Standardisation of document architectures
3.1 ODA
3.1.1 Description
3.1.2 Document Application Profiles
3.1.3 Extensions
3.2 SGML based standards
3.2.1 SGML
3.2.2 HTML
3.2.3 HyTime
3.3 MHEG
4 The MultiTorg project
4.1 The World-Wide Web (WWW) model
4.2 Information included in MultiTorg
4.3 Why is MultiTorg based on Internet protocols and WWW?
4.4 Unanswered questions
5 Discussion and conclusion
6 Acknowledgement

1 Introduction

The creation of national and global information highways is being discussed in many fora. There are, however, a number of problems that have to be solved before a world-wide information network will offer the required quality of service.

This paper discusses problems and options when exchanging information in heterogeneous networks. The basis of the discussion is the everlasting 'Tower of Babel' problem; to exchange information, one needs a common language. Similarly, when computers communicate, they too need a common basis for the exchange of information. Establishing common formats between applications is one of the major challenges in promoting communication.

Section 2 discusses the relationship between information formats and the internal information structure used by applications.

Section 3 discusses some of the emerging document formats in distributed multimedia applications.

Section 4 contains a description of an information system set up as a part of the MultiTorg project at Norwegian Telecom Research (NTR). We describe some of the options when setting up a distributed information system in a heterogeneous environment, and the resulting system.

Finally, the paper ends with some concluding remarks about the future for MultiTorg and information systems in general.

2 Information models; structure and distribution

When writing a book, the author structures the text in logical parts, e.g. chapters; when making a movie, the director structures the film into scenes. In order to represent these human created structures in a digital computer, the computer needs data structures. These data structures are described using abstract grammars, and represented using specific formats, either for computer storage or as transmission formats.

2.1 Information structures and formats

Every system contains both structure and process. The process changes the structure and the structure represents the state of the system. In a computer application the structure is represented internally as a data structure and externally as a data format. In order to exchange information, the internal data structure has to be written into the external format.

A document is a piece of information that can be processed as one unit. Examples of documents are business reports, invoices, movies and newspapers. A document might be described using several structures, but we can distinguish between two main types: structures regarding the whole document, and structures regarding a particular content type, e.g. graphics, text, or audio. These types are denoted document structures and content structures, respectively.

Different document structures denote different views of the document. Reading an office document, the user may see the 'logical' structure defining the chapters, sections and paragraphs, or the 'layout' structure defining pages, columns and blocks.

Formatting is the process of transforming the logical structure into a layout structure. Presentation is the process of transforming the layout structure to an image structure, e.g. a bitmap image on a screen or printer. See figure 1.
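
The two steps can be illustrated with a minimal sketch. It assumes a logical structure that is just a list of paragraph strings, a layout structure made of pages holding fixed-width lines, and a plain-text 'image'. The function names and the width and page-length parameters are hypothetical and chosen only for illustration.

```python
import textwrap


def format_document(paragraphs, width=40, lines_per_page=10):
    """Formatting: turn a logical structure (paragraphs) into a
    layout structure (pages, each a list of fixed-width lines)."""
    lines = []
    for paragraph in paragraphs:
        lines.extend(textwrap.wrap(paragraph, width=width))
    return [lines[i:i + lines_per_page]
            for i in range(0, len(lines), lines_per_page)]


def present(pages):
    """Presentation: turn the layout structure into an image, here
    simply the text that would appear on a terminal."""
    rendered = []
    for number, page in enumerate(pages, start=1):
        rendered.append(f"--- page {number} ---")
        rendered.extend(page)
    return "\n".join(rendered)


# Small, assumed sample input.
pages = format_document(["aaa bbb ccc ddd", "eee"], width=7, lines_per_page=2)
print(present(pages))
```

Note that the paragraph boundaries are no longer visible in the layout structure; only lines and pages remain.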

In order to communicate, the internal document structure needs an external representation, called a document format. To describe this format it is necessary to use both an abstract syntax of the format, e.g. defined using BNF, and the coded representation of this syntax.

Figure 2 shows a sample document structure description (a) and its coded representation (b), both given in an ad hoc BNF. Part (c) contains a sample document that follows the grammar in (a) and the coding in (b).
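
The difference between an internal document structure and its external format can also be sketched in code. The example below is an illustration only: the class names, the tag names and the encoding are assumptions and are not taken from figure 2 or from any of the standards discussed later.

```python
from dataclasses import dataclass, field


@dataclass
class Paragraph:
    text: str


@dataclass
class Section:
    # Internal (in-memory) logical structure: a titled list of paragraphs.
    title: str
    paragraphs: list = field(default_factory=list)


def encode(section):
    """Write the internal structure into a tagged external format.
    The tag names are hypothetical."""
    lines = ["<section>", f"<title>{section.title}</title>"]
    for paragraph in section.paragraphs:
        lines.append(f"<para>{paragraph.text}</para>")
    lines.append("</section>")
    return "\n".join(lines)


doc = Section("Introduction",
              [Paragraph("First paragraph."), Paragraph("Second paragraph.")])
encoded = encode(doc)
print(encoded)
```

Two applications that agree on the encoding can exchange the document even if their internal data structures differ.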

Many applications operate with several structures simultaneously. FrameMaker, a desktop publishing system, uses a logical structure including paragraphs, pictures, tables, and maths. FrameMaker also uses style-sheets to indicate the formatting of each paragraph. Layout processing produces a layout structure from the logical structure and the style-sheets. This layout structure is then imaged onto the physical screen or printer, giving the final image structure.

The TeX system is another example; the user creates the document using the abstract TeX grammar and a conventional text editor, e.g. emacs. The document is then formatted by the TeX formatter creating a layout format (DVI), which can be presented as a bitmap on the screen or on a printer (see figure 3).

The content of a document might be of several types; text, bitmap, video, etc. Each content type has its own structure and format. For example text has a format containing character codes and sets, fonts and other characteristics associated with text.

The structural elements of a bitmap, on the other hand, are called picture elements (pixels). Compression techniques, which are quite common, are also parts of a bitmap structure.

2.2 Translation among document formats

Exchanging documents among heterogeneous applications requires translation. Translation is the operation of transforming a document in format A into the same document in format B. If the two formats A and B are different but derived from the same structure, the translation is easy. This is the case in many document formats based on a logical mark-up language. In such documents the text is interspersed with tags (mark-ups) determining a structured element (figure 2(c) shows a sample of a logical mark-up document). So, when the structure in A and B is similar, and only the tags differ, the translation process is trivial.
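
A minimal sketch of the trivial case follows, assuming two mark-up formats that share the same structure and differ only in tag names. The tag names and the mapping table are hypothetical; the sketch handles simple tags without attributes and is not one of the MultiTorg translators.

```python
import re

# Hypothetical mapping from the tags of format A to the tags of format B.
TAG_MAP = {"chapter": "h1", "para": "p"}


def translate(document, tag_map):
    """Rename start-tags and end-tags, leaving the content untouched."""
    def swap(match):
        slash, name = match.group(1), match.group(2)
        # Tags without an entry in the map are passed through unchanged.
        return f"<{slash}{tag_map.get(name, name)}>"
    return re.sub(r"<(/?)([A-Za-z][A-Za-z0-9]*)>", swap, document)


source = "<chapter>Intro</chapter><para>Some text.</para>"
translated = translate(source, TAG_MAP)
print(translated)
```

Because the structure is the same on both sides, the translation needs no knowledge of the content, only a table of corresponding tags.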

If both format and structure of A and B differ, the translation process becomes difficult, sometimes even impossible. Translation from layout structure (e.g. a Postscript document) to a logical structure (e.g. LaTeX) is in general impossible. The translation has to be performed on similar structures, i.e. logical to logical, or layout to layout.

For a description of some of the translation utilities we have developed in MultiTorg, see section 4.

2.3 Hyperdocuments

References are essential in documents. Both footnotes and pointers to other documents are commonly found in paper-based documents. In electronic documents, hypertext implements references and lowers the threshold of accessing the referenced information to a mouse click.

In the 80s, hypertext systems typically referenced themselves in closed loops. With the emerging global computer networks, hyperlinks will offer access to documents outside your disk or LAN, and they are still only a mouse click away.

Documents on paper can be retraced due to referencing conventions and numbering schemes like ISBN, which provides a unique identity. Using this system, the source of the referenced document is uniquely determined. Electronic documents need similar functionality, and one emerging option is the Universal Resource Locator discussed in 3.2.2.

3 Standardisation of document architectures

In order to communicate information in a heterogeneous world it is necessary to have a standardised interchange format. The simplest - and most widely used - is ASCII, which is useful for unformatted text but cannot represent document structures.

Page Description Languages, e.g. Postscript, take another approach to document interchange. They allow you to exchange electronic documents preserving layout information, but the documents are not processable - you cannot edit the content of a document.

In order to handle the interchange of processable documents with various content types (text, bitmaps, audio speech, etc.) the standardisation organisations have developed standards like ODA and SGML. These will be described in the following sections.

There is also ongoing work to provide standards for the representation of documents with continuous content types like audio and video, and with hypermedia functionality. These are HyTime (based on SGML), MHEG, and several extensions to ODA.

3.1 ODA

The Open Document Architecture (ODA) is an ISO and CCITT standard designed to facilitate transmission of compound documents between open systems. It has its focus on blind interchange - the originator need not know anything about the recipient's system. An ODA document may easily be transferred from one word processor to another (if both support ODA). In this case the document is said to be in a 'revisable' or 'processable' form. ODA also supports a 'final' or 'formatted' form, i.e. the receiver cannot edit the document.

The ODA standard addresses the interchange of documents in a typical office environment. Examples of documents that can be handled are memoranda, letters, invoices, forms, and reports. Documents may include graphics and images. In order to meet the increasing interest in new data types, there is also ongoing work to add hypermedia functionality.

3.1.1 Description

The ODA standard divides information into three main categories: logical information, layout information, and content. Further, ODA also has three more components: generic structures, styles, and a document profile.

The logical information is the structuring of the content in terms of hierarchy and order. E.g. a chapter may be seen as a sequence of sections, which in turn may be a sequence of paragraphs. This is called the logical structure.

The layout information organises the physical appearance (size and positioning) of the content on a presentation medium (typically paper or computer screens). ODA defines a hierarchy of layout components called page sets, pages, frames and blocks. This organisation is called the layout structure.

Content is organised in portions of either text, images or graphics. These content portions are referenced from both the logical and the layout structure, which makes it possible to have either a logical view or a layout view of the content. Different views of the document facilitate different applications. A printing application would only need the layout view in order to construct an image of the document, while a word processor would use the logical view in the editing process. See figure 4.
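
The idea of two structures sharing the same content portions can be sketched as follows. This is a simplified illustration and not ODA's actual data model or encoding: the identifiers, the component names and the flat two-level trees are assumptions.

```python
# Content portions, stored once and addressed by a (hypothetical) identifier.
content_portions = {
    "c1": "A heading",
    "c2": "Text of a paragraph.",
}

# Both structures refer to the same portions instead of copying them.
logical_structure = ("chapter", [("heading", "c1"), ("paragraph", "c2")])
layout_structure = ("page", [("block", "c1"), ("block", "c2")])


def view(structure, portions):
    """Resolve the references of one structure into (component, content) pairs."""
    _root, children = structure
    return [(kind, portions[ref]) for kind, ref in children]


logical_view = view(logical_structure, content_portions)
layout_view = view(layout_structure, content_portions)
print(logical_view)
print(layout_view)
```

An editor would work on the logical view and a printing application on the layout view, while both see the same underlying content.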

Generic structures can be logical or layout. They are rules which define the class of a document. For example, a class 'article' may determine that the document must start with an abstract, followed by one or more numbered chapters, each consisting of only one level of sections.
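
As a rough sketch of such a rule, the top level of the class 'article' above can be checked against a sequence of element names. Expressing the rule as a regular expression is an assumption made for illustration; it is not how ODA encodes generic structures, and the sketch ignores the rule about sections within chapters.

```python
import re

# 'article': an abstract followed by one or more chapters (top level only).
ARTICLE_RULE = re.compile(r"abstract( chapter)+")


def conforms(elements):
    """Return True if the sequence of top-level elements matches the rule."""
    return ARTICLE_RULE.fullmatch(" ".join(elements)) is not None


print(conforms(["abstract", "chapter", "chapter"]))
print(conforms(["chapter", "abstract"]))
```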

ODA has divided the style concept into layout styles and presentation styles. Layout styles are associated with the logical structure. They can specify, for example, that a heading and the following paragraph both should appear on the same page. Presentation styles are concerned with the layout and imaging aspects of the content and are specified for the lowest level logical and layout components. There are different sets of presentation styles for different content types. For character content one may specify parameters like line spacing or which fonts to use.

The document profile contains information about the document as a whole. It has management information (e.g. title, name of the author, keywords), and technical information like which structures are present and which coding standards are used for different content types.

3.1.2 Document Application Profiles

ODA is a very complex standard because of its general applicability. However, it specifies a way to form subsets of the total set of features to implement different levels of user requirements. These subsets are called Document Application Profiles (DAP).

There exist three levels of profiles with increasing levels of features. The first level provides for documents containing character content only. The document may have sequences of paragraphs which are laid out in a single column of text. The second level profile supports documents with character, image and graphics content, which can be structured into chapters, sections, and paragraphs. The content may be laid out in multiple columns. The third level will provide support for more sophisticated word processing.

3.1.3 Extensions

The current version of ODA addresses documents in an office environment, but extensions to the standard are under way. Support for audio will be included in the next version of the standard, while the standardisation work on video has just begun.

Also, there is work going on to add support for hypermedia (HyperODA). In the HyperODA model, a document consists of one or more ODA documents containing links between arbitrary document elements. These links may be separated from the documents they refer to and interchanged independently.

3.2 SGML based standards

3.2.1 SGML

Standard Generalised Mark-up Language (SGML) is a language for defining structured documents and is an ISO standard. In contrast to ODA, SGML's primary concern is logical structuring of the content. The logical structuring is done by adding semantic mark-up to content parts.

Mark-up is text that is added to a document in order to convey semantic information. The mark-up serves two purposes: separating the logical elements of the document from the content, and specifying the processing function to be performed on those elements. The logical elements are marked by adding a generic identifier to the start of the element (start-tag) and to the end (end-tag).

It is possible to define classes of documents with a Document Type Definition (DTD). A DTD defines the mark-up structure permitted in the class. It is also used to minimise mark-up, i.e. to permit the omission of unnecessary tags. Neither SGML itself nor the DTD specifies how the document should be formatted - this is application dependent.

An SGML document is divided into three different parts. The first is the SGML declaration, which specifies the character set of the document, which characters have a special meaning to SGML, and which advanced features are used. The declaration can be omitted if the document only uses default features. The second part is the DTD, which specifies the document type and which tags can be used. The last part is the marked-up document itself, called a document instance.

The lack of layout information in the document format has both advantages and disadvantages. The advantage is that a single SGML document can be formatted or processed in many ways. This makes SGML very powerful and general. The disadvantage is that in order to use SGML as an interchange format, the communicating parties must agree upon the interpretation of the DTD used.

3.2.2 HTML

Hypertext Mark-up Language (HTML) is defined in terms of SGML and is a simple SGML DTD. It is capable of handling hyperlinks and simple formatting. The hyperlinks are implemented as tags with attributes giving the location of the end of the link. This location can be within the document itself as well as an external document. Figure 5a gives an example of a hyperlink in an HTML document.

The structure of the hyperlink is defined by a mechanism called 'Universal Resource Locator' (URL). This is in fact an address containing three sub-addresses; a protocol definition, a host address (IP-address) with port number and finally a file name. (Figure 5b contains a complete address.)
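
The three sub-addresses can be picked apart with Python's standard urllib.parse module, as in the sketch below. The port number and the file name in the sample URL are assumptions added for illustration; only the host name appears elsewhere in this paper.

```python
from urllib.parse import urlsplit

# Sample address; the port (80) and file name (index.html) are assumed.
url = "http://www.nta.no:80/index.html"

parts = urlsplit(url)
protocol = parts.scheme     # protocol definition
host = parts.hostname       # host address
port = parts.port           # port number on the host
file_name = parts.path      # file name on the server

print(protocol, host, port, file_name)
```

A client uses the protocol to decide how to talk to the server, the host and port to decide where to connect, and the file name to tell the server which document is requested.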

HTML is the format used in the World Wide Web (WWW) information system. WWW uses a client-server model, and the client is responsible for formatting and presenting HTML documents. Figure 6 shows an example of an SGML document using the HTML DTD. For a description of WWW, see section 4.1.

Because of the limits in the formatting capabilities in HTML, an extension of the format called HTML+ is under way. New features that will be added in HTML+ are support for tables, captioned pictures, and fill-in forms for querying remote databases.

3.2.3 HyTime

The Hypermedia/Time-based Structuring Language (HyTime) is a standardised infrastructure for the representation of integrated open hypermedia documents. It is based on SGML and defines constructs for making DTDs for hypermedia documents. HyTime can represent hypertext linking, time scheduling, and synchronisation. Links can be made both to documents that conform to HyTime and to other documents.

Objects in a HyTime hyperdocument can be formatted and unformatted documents, audio and video segments, still images, etc. The documents that constitute a HyTime hyperdocument can conform to any architecture and be represented in any notation permitted by that architecture.

HyTime is intended to be an interchange format for hypermedia applications and not used as an internal representation for such applications. It is highly expressive and may be difficult to optimise for runtime efficiency.

3.3 MHEG

MHEG is a draft standard for representing multimedia and hypermedia information objects, named after the ISO group developing it (Multimedia and Hypermedia information coding Expert Group). The standard does not define any content format, it only provides rules regarding the structuring of objects. MHEG accepts the use of any standard format for monomedia content.

MHEG categorises the objects in classes which share behaviour and characteristics. MHEG defines content classes for each relevant media type, a selection class for interaction, an action class for rendering objects, a link class for hyperlinks and a composite class for grouping related objects.

MHEG represents objects in a final form with the aim of direct presentation. It is thus unsuitable as an input format for hypermedia authoring applications. A potential approach is to use MHEG as the output format of a hypermedia application taking HyTime as input. This would benefit from both the expressive power of HyTime and the runtime efficiency of MHEG.

4 The MultiTorg project

At Norwegian Telecom Research, the MultiTorg project attempts to develop models for supporting a distributed, electronic marketplace for information. In this marketplace, information vendors will offer their electronic goods, and customers can choose from a variety of services and products.

The information marketplace includes three actors; information providers, network and marketplace operators, and finally the users (see figure 7). The marketplace will provide functions for both the information providers and the users. Examples of such functions are: communication protocols, marketing, translation among information formats, accounting, and security.

The MultiTorg project is not attempting to build a full-scale electronic market, rather, we are trying to build tools for demonstrating the potential of an electronic information marketplace. We have developed several prototypes serving different equipment and users.

One prototype is based on the World-Wide Web which is one of the fastest growing services on the Internet. Another prototype forwards personalised news and mail to pagers. In this paper, we will concentrate on describing our work within the context of WWW.

4.1 The World-Wide Web (WWW) model

In 1989, at the CERN particle physics laboratory, a group of people started working on what is now referred to as the World-Wide Web. The idea behind 'the web' was to use the rapidly growing Internet infrastructure to distribute information in an efficient manner throughout a campus network; later also to the Internet.

The WWW is a client-server architecture where the server is the protector of information. There are many servers in the network (see figure 8) and clients access a server by an address definition called a 'Universal Resource Locator' (URL).

Most software supporting the web is freely available. This includes servers as well as clients. At this point, all servers run under the UNIX operating system, taking advantage of native support for TCP/IP networking.

On the client side, implementations exist for X11, MS-Windows, text terminals, etc. If you have a WWW client, we welcome you to access one of our servers at http://www.nta.no/. From there, you can access most of the content described in section 4.2.

4.2 Information included in MultiTorg

The technology of WWW is general enough to be applied to many environments. With great success, various types of information have been made available using WWW, e.g. documentation, personnel databases, weather reports and news. We believe WWW is suited for commercial use; i.e. to mediate information between suppliers and consumers of information.

In order to gain experience with supporting information systems in heterogeneous environments, we gather information from various sources. The goal has been to offer information that is interesting enough for people to start using the system without other incentives. (See figure 9.)

Typically, an information vendor uses proprietary tools and formats for manipulating information. One of the major problems in establishing a prototype for an electronic marketplace is to transform these formats into HTML which is used by MultiTorg. In most cases, the information vendor will not be interested in introducing new tools supporting the HTML format. In order to ease the 'Tower of Babel', translators have been built.
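
The sketch below shows what a very simple translator of this kind might look like. It is not one of the MultiTorg translators: the input format (a headline followed by paragraphs separated by blank lines) and the function name are assumptions made for illustration.

```python
from html import escape


def news_to_html(item):
    """Translate a plain-text news item into simple HTML.
    Assumed input: headline first, then blank-line separated paragraphs."""
    blocks = [b.strip() for b in item.strip().split("\n\n") if b.strip()]
    headline, body = blocks[0], blocks[1:]
    parts = [f"<title>{escape(headline)}</title>",
             f"<h1>{escape(headline)}</h1>"]
    for paragraph in body:
        # Join hard line breaks and escape characters that are special in HTML.
        parts.append(f"<p>{escape(' '.join(paragraph.split()))}</p>")
    return "\n".join(parts)


sample = "Head\n\nA & B\nmore"
html_document = news_to_html(sample)
print(html_document)
```

With a translator like this on the marketplace side, the vendor can keep producing information in the existing format.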

4.3 Why is MultiTorg based on Internet protocols and WWW?

The Internet is a global computer network, and the number of connected computers is growing very fast. With millions of users, the network is an enormous test bed for new services.

Much of the core software of MultiTorg is developed by people strongly attached to the Internet, and their software is freely distributed there. World Wide Web is emerging as the preferred way of distributing and accessing information on the Internet for a number of reasons:

During the last year several thousand new WWW servers have been installed, of which MultiTorg is one. This global information network contains vast amounts of information, but the information is restricted to those of us that are 'on the net'.

WWW runs on top of the TCP/IP protocol suite; the basis of the Internet. As the Internet increasingly takes advantage of broadband connections, we will see the introduction of services based on continuous media like video and audio. We believe WWW is a foundation that will scale to also include these media types.

4.4 Unanswered questions

Having established an arena for experiments, we have located areas where a technical and social development should or will take place.

5 Discussion and conclusion

The number of computers connected in networks is growing fast. New broadband networks will provide a communication highway where new services will emerge. Information will be the basis for all these new services.

In a not too distant future, there will exist an open electronic marketplace for information. We believe the Internet and the emerging WWW technology should be seriously considered as foundations for such a market.

In MultiTorg, we have developed several prototypes for converting formats, accessing distributed documents, and presenting information through various media.

The 'Tower of Babel' on computer networks is an important and problematic issue, and applications will continue to use native formats. Although several candidates for universal formats exist, it is not likely that one format will be dominant in the next decade. Rather, we should concentrate on building translation utilities.

6 Acknowledgement

We would like to thank Kåre Andersen of NTB for productive co-operation.