XML AND WEB
SERVICE
VII SEM
PTCSE (MIT)
NOTES
PREPARED BY
N.VIJAYAKUMAR
7/7 PTCSE
MIT
UNIT I
XML – Benefits – Advantages of XML over HTML, EDI, Databases – XML Based Standards –
Structuring with Schemas – DTD, XML Schemas – XML Processing – DOM – XML Processing – SAX –
Presentation Technologies – XSL – XForms, XHTML – Transformation – XSLT – XLink, XPath –
XQuery
XML INTRODUCTION
XML stands for eXtensible Markup Language. XML is designed to transport and store data.
XML Document Example
<?xml version="1.0"?>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
Introduction to XML
XML was designed to transport and store data. HTML was designed to display data.
What is XML?
• XML stands for EXtensible Markup Language
• XML is a markup language much like HTML
• XML was designed to carry data, not to display data
• XML tags are not predefined. You must define your own tags
• XML is designed to be self‐descriptive
• XML is a W3C Recommendation
The Difference between XML and HTML
XML is not a replacement for HTML. XML and HTML were designed with different goals:
• XML was designed to transport and store data, with focus on what data is.
• HTML was designed to display data, with focus on how data looks.
HTML is about displaying information, while XML is about carrying information.
XML Does not DO Anything
Maybe it is a little hard to understand, but XML does not DO anything. XML was created to
structure, store, and transport information.
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
The note above is quite self-descriptive. It has sender and receiver information, and it also has a
heading and a message body. But still, this XML document does not DO anything. It is just pure
information wrapped in tags. Someone must write a piece of software to send, receive or
display it.
XML is Just Plain Text
XML is nothing special. It is just plain text. Software that can handle plain text can also handle
XML.
However, XML‐aware applications can handle the XML tags specially. The functional meaning of
the tags depends on the nature of the application.
With XML You Invent Your Own Tags
The tags in the example above (like <to> and <from>) are not defined in any XML standard.
These tags are "invented" by the author of the XML document.
That is because the XML language has no predefined tags.
The tags used in HTML (and the structure of HTML) are predefined. HTML documents can only
use tags defined in the HTML standard (like <p>, <h1>, etc.).
XML allows the author to define his own tags and his own document structure.
XML is Not a Replacement for HTML
XML is a complement to HTML.
It is important to understand that XML is not a replacement for HTML. In most web
applications, XML is used to transport data, while HTML is used to format and display the data.
My best description of XML is this:
XML is a software‐ and hardware‐independent tool for carrying information.
XML is a W3C Recommendation
XML became a W3C Recommendation on February 10, 1998.
XML is Everywhere
We have been participating in XML development since its creation. It has been amazing to see
how quickly the XML standard has developed, and how quickly a large number of software
vendors have adopted the standard.
XML is now as important for the Web as HTML was to the foundation of the Web.
XML is everywhere. It is the most common tool for data transmissions between all sorts of
applications, and is becoming more and more popular in the area of storing and describing
information.
How Can XML be Used?
XML is used in many aspects of web development, often to simplify data storage and sharing.
XML Separates Data from HTML
If you need to display dynamic data in your HTML document, it will take a lot of work to edit the
HTML each time the data changes.
With XML, data can be stored in separate XML files. This way you can concentrate on using
HTML for layout and display, and be sure that changes in the underlying data will not require
any changes to the HTML. With a few lines of JavaScript, you can read an external XML file and
update the data content of your HTML.
XML Simplifies Data Sharing
In the real world, computer systems and databases contain data in incompatible formats. XML
data is stored in plain text format. This provides a software‐ and hardware‐independent way of
storing data. This makes it much easier to create data that different applications can share.
XML Simplifies Data Transport
With XML, data can easily be exchanged between incompatible systems.
One of the most time‐consuming challenges for developers is to exchange data between
incompatible systems over the Internet.
Exchanging data as XML greatly reduces this complexity, since the data can be read by different
incompatible applications.
XML Simplifies Platform Changes
Upgrading to new systems (hardware or software platforms) is always very time consuming.
Large amounts of data must be converted and incompatible data is often lost.
XML data is stored in text format. This makes it easier to expand or upgrade to new operating
systems, new applications, or new browsers, without losing data.
XML Makes Your Data More Available
Since XML is independent of hardware, software and application, XML can make your data
more available and useful.
Different applications can access your data, not only in HTML pages, but also from XML data
sources.
With XML, your data can be available to all kinds of "reading machines" (Handheld computers,
voice machines, news feeds, etc), and make it more available for blind people, or people with
other disabilities.
XML is Used to Create New Internet Languages
A lot of new Internet languages are created with XML. Here are some examples:
• XHTML, an XML-based reformulation of HTML
• WSDL for describing available web services
• WML, the markup language used with WAP on handheld devices
• RSS languages for news feeds
• RDF and OWL for describing resources and ontology
• SMIL for describing multimedia for the web
If Developers Have Sense
If they DO have sense, future applications will exchange their data in XML.
The future might give us word processors, spreadsheet applications and databases that can
read each other's data in a pure text format, without any conversion utilities in between.
We can only pray that all the software vendors will agree.
XML Tree
XML documents form a tree structure that starts at "the root" and branches to "the leaves".
An Example XML Document
XML documents use a self‐describing and simple syntax:
<?xml version="1.0" encoding="ISO-8859-1"?>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
The first line is the XML declaration. It defines the XML version (1.0) and the encoding used
(ISO‐8859‐1 = Latin‐1/West European character set).
The next line describes the root element of the document (like saying: "this document is a
note"):
<note>
The next 4 lines describe 4 child elements of the root (to, from, heading, and body):
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
And finally the last line defines the end of the root element:
</note>
You can assume, from this example, that the XML document contains a note to Tove from Jani.
Don't you agree that XML is pretty self‐descriptive?
XML Documents Form a Tree Structure
XML documents must contain a root element. This element is "the parent" of all other
elements.
The elements in an XML document form a document tree. The tree starts at the root and
branches to the lowest level of the tree.
All elements can have sub elements (child elements):
<root>
<child>
<subchild>.....</subchild>
</child>
</root>
The terms parent, child, and sibling are used to describe the relationships between elements.
Parent elements have children. Children on the same level are called siblings (brothers or
sisters). All elements can have text content and attributes (just like in HTML).
Example:
The figure in the original notes (not reproduced here) represents one book in the XML below:
<bookstore>
<book category="COOKING">
<title lang="en">Everyday Italian</title>
<author>Giada De Laurentiis</author>
<year>2005</year>
<price>30.00</price>
</book>
<book category="CHILDREN">
<title lang="en">Harry Potter</title>
<author>J K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>
<book category="WEB">
<title lang="en">Learning XML</title>
<author>Erik T. Ray</author>
<year>2003</year>
<price>39.95</price>
</book>
</bookstore>
The root element in the example is <bookstore>. All <book> elements in the document are
contained within <bookstore>.
The <book> element has 4 children: <title>, <author>, <year>, <price>.
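The parent and child relationships can be made concrete with a small sketch. It models the first <book> of the example as plain JavaScript objects and prints one indented line per element. The { name, children } node shape and the outline function are assumptions made for illustration; they are not part of any XML API.

```javascript
// Hypothetical in-memory model of part of the bookstore example.
// Each node has a name and a list of child nodes.
const bookstore = {
  name: "bookstore",
  children: [
    {
      name: "book",
      children: [
        { name: "title", children: [] },
        { name: "author", children: [] },
        { name: "year", children: [] },
        { name: "price", children: [] }
      ]
    }
  ]
};

// Walk the tree depth-first and return one indented line per element.
function outline(node, depth = 0) {
  const line = "  ".repeat(depth) + node.name;
  const childLines = node.children.flatMap((child) => outline(child, depth + 1));
  return [line, ...childLines];
}

console.log(outline(bookstore).join("\n"));
```

The root appears with no indentation, its child <book> one level in, and the four siblings <title>, <author>, <year>, and <price> two levels in.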
XML Syntax Rules
The syntax rules of XML are very simple and logical. The rules are easy to learn, and easy to use.
All XML Elements Must Have a Closing Tag
In HTML, you will often see elements that don't have a closing tag:
<p>This is a paragraph
<p>This is another paragraph
In XML, it is illegal to omit the closing tag. All elements must have a closing tag:
<p>This is a paragraph</p>
<p>This is another paragraph</p>
Note: You might have noticed from the previous example that the XML declaration did not have
a closing tag. This is not an error. The declaration is not an XML element, so it has no
closing tag.
XML Tags are Case Sensitive
XML elements are defined using XML tags.
XML tags are case sensitive. With XML, the tag <Letter> is different from the tag <letter>.
Opening and closing tags must be written with the same case:
<Message>This is incorrect</message>
<message>This is correct</message>
Note: "Opening and closing tags" are often referred to as "Start and end tags". Use whatever
you prefer. It is exactly the same thing.
XML Elements Must be Properly Nested
In HTML, you might see improperly nested elements:
<b><i>This text is bold and italic</b></i>
In XML, all elements must be properly nested within each other:
<b><i>This text is bold and italic</i></b>
In the example above, "Properly nested" simply means that since the <i> element is opened
inside the <b> element, it must be closed inside the <b> element.
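The nesting rule can be checked with a stack: every start tag is pushed, and every end tag must match the most recently opened element. The sketch below is a deliberately minimal illustration, not a real XML parser. The function name isProperlyNested is hypothetical, and the regular expression ignores comments, CDATA sections, and attribute values that contain ">".

```javascript
// Minimal sketch: checks tag nesting only, not full well-formedness.
// isProperlyNested is a hypothetical helper, not part of any XML API.
function isProperlyNested(xml) {
  // Matches <name ...>, </name>, and <name ... />.
  const tagPattern = /<(\/?)([A-Za-z_][\w.-]*)[^<>]*?(\/?)>/g;
  const openElements = [];
  let match;
  while ((match = tagPattern.exec(xml)) !== null) {
    const isEndTag = match[1] === "/";
    const name = match[2];
    const isEmptyElement = match[3] === "/";
    if (isEmptyElement) continue;          // <br /> opens and closes itself
    if (!isEndTag) {
      openElements.push(name);             // start tag: remember it
    } else if (openElements.pop() !== name) {
      return false;                        // end tag does not match the last start tag
    }
  }
  return openElements.length === 0;        // nothing may be left open
}

console.log(isProperlyNested("<b><i>This text is bold and italic</i></b>")); // true
console.log(isProperlyNested("<b><i>This text is bold and italic</b></i>")); // false
```

Because the comparison is an exact string match, the same check also rejects tags that differ only in case, such as <Message>...</message>.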
XML Documents Must Have a Root Element
XML documents must contain one element that is the parent of all other elements. This
element is called the root element.
<root>
<child>
<subchild>.....</subchild>
</child>
</root>
XML Attribute Values Must be Quoted
XML elements can have attributes in name/value pairs just like in HTML.
In XML the attribute value must always be quoted. Study the two XML documents below. The
first one is incorrect, the second is correct:
<note date=12/11/2007>
<to>Tove</to>
<from>Jani</from>
</note>
<note date="12/11/2007">
<to>Tove</to>
<from>Jani</from>
</note>
The error in the first document is that the date attribute in the note element is not quoted.
Entity References
Some characters have a special meaning in XML.
If you place a character like "<" inside an XML element, it will generate an error because the
parser interprets it as the start of a new element. This will generate an XML error:
<message>if salary < 1000 then</message>
To avoid this error, replace the "<" character with an entity reference:
<message>if salary &lt; 1000 then</message>
There are 5 predefined entity references in XML:
&lt;    <    less than
&gt;    >    greater than
&amp;   &    ampersand
&apos;  '    apostrophe
&quot;  "    quotation mark
Note: Only the characters "<" and "&" are strictly illegal in XML. The greater than character is
legal, but it is a good habit to replace it.
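When XML is generated by a program, a small helper can apply the five predefined entity references before text is placed inside an element or an attribute value. The following is a sketch; the name escapeXml is an assumption made for illustration.

```javascript
// Hypothetical helper: replaces the five characters that have
// predefined entity references in XML.
function escapeXml(text) {
  const entities = {
    "<": "&lt;",
    ">": "&gt;",
    "&": "&amp;",
    "'": "&apos;",
    '"': "&quot;"
  };
  return String(text).replace(/[<>&'"]/g, (ch) => entities[ch]);
}

console.log("<message>" + escapeXml("if salary < 1000 then") + "</message>");
```

Each character is replaced in a single pass, so an ampersand produced by one replacement is never escaped a second time.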
Comments in XML
The syntax for writing comments in XML is similar to that of HTML.
<!-- This is a comment -->
White‐space is Preserved in XML
HTML collapses multiple white‐space characters into one single white‐space:
HTML: Hello           my name is Tove
Output: Hello my name is Tove.
With XML, the white‐space in a document is not collapsed.
XML Stores New Line as LF
In Windows applications, a new line is normally stored as a pair of characters: carriage return
(CR) and line feed (LF). The character pair bears some resemblance to the typewriter actions of
setting a new line. In Unix applications, including modern macOS, a new line is normally stored
as a single LF character, while classic Mac OS applications used a single CR. An XML parser
normalizes all of these line endings to a single LF before passing the text to the application.
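An XML parser performs this line-ending normalization itself. The sketch below only illustrates the idea on a plain string; normalizeNewlines is a hypothetical helper name.

```javascript
// Hypothetical helper: turns CR LF pairs and lone CR characters into LF.
function normalizeNewlines(text) {
  return text.replace(/\r\n?/g, "\n");
}

const mixed = "line one\r\nline two\rline three\nline four";
console.log(normalizeNewlines(mixed).split("\n").length); // 4
```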
XML Elements
An XML document contains XML Elements.
What is an XML Element?
An XML element is everything from (including) the element's start tag to (including) the
element's end tag.
An element can contain other elements, simple text or a mixture of both. Elements can also
have attributes.
<bookstore>
<book category="CHILDREN">
<title>Harry Potter</title>
<author>J K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>
<book category="WEB">
<title>Learning XML</title>
<author>Erik T. Ray</author>
<year>2003</year>
<price>39.95</price>
</book>
</bookstore>
In the example above, <bookstore> and <book> have element contents, because they contain
other elements. <author> has text content because it contains text.
In the example above only <book> has an attribute (category="CHILDREN").
XML Naming Rules
XML elements must follow these naming rules:
• Names can contain letters, digits, hyphens, underscores, and periods
• Names must start with a letter or underscore, not with a digit or another punctuation character
• Names cannot start with the letters xml (or XML, or Xml, etc)
• Names cannot contain spaces
Any name can be used, no words are reserved.
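The naming rules above can be approximated with a regular expression. The sketch below is a simplification and rests on two assumptions: it accepts ASCII letters only (XML itself also allows many non-English letters), and it rejects colons because they are reserved for namespaces. The name isValidXmlName is hypothetical.

```javascript
// Hypothetical helper: simplified, ASCII-only check of the naming rules.
function isValidXmlName(name) {
  // Must start with a letter or underscore; may continue with
  // letters, digits, underscores, hyphens, and periods (no spaces).
  if (!/^[A-Za-z_][A-Za-z0-9_.-]*$/.test(name)) return false;
  // Names beginning with "xml" in any letter case are reserved.
  return !/^xml/i.test(name);
}

console.log(isValidXmlName("first_name")); // true
console.log(isValidXmlName("1st_name"));   // false
console.log(isValidXmlName("first name")); // false
```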
Best Naming Practices
Make names descriptive. Names with an underscore separator are nice: <first_name>,
<last_name>.
Names should be short and simple, like this: <book_title> not like this:
<the_title_of_the_book>.
Avoid "‐" characters. If you name something "first‐name," some software may think you want
to subtract name from first.
Avoid "." characters. If you name something "first.name," some software may think that
"name" is a property of the object "first."
Avoid ":" characters. Colons are reserved to be used for something called namespaces (more
later).
XML documents often have a corresponding database. A good practice is to use the naming
rules of your database for the elements in the XML documents.
Non‐English letters like éòá are perfectly legal in XML, but watch out for problems if your
software vendor doesn't support them.
XML Elements are Extensible
XML elements can be extended to carry more information.
Look at the following XML example:
<note>
<to>Tove</to>
<from>Jani</from>
<body>Don't forget me this weekend!</body>
</note>
Let's imagine that we created an application that extracted the <to>, <from>, and <body>
elements from the XML document to produce this output:
MESSAGE
To: Tove
From: Jani
Don't forget me this weekend!
Imagine that the author of the XML document added some extra information to it:
<note>
<date>2008-01-10</date>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
Should the application break or crash?
No. The application should still be able to find the <to>, <from>, and <body> elements in the
XML document and produce the same output.
One of the beauties of XML is that it can often be extended without breaking applications.
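The behaviour described above can be sketched as follows. The application looks elements up by name, so elements it does not know about are simply ignored. The helpers getText and formatMessage are hypothetical, and the regular-expression lookup is only a stand-in for a real XML parser so that the sketch runs anywhere.

```javascript
// Hypothetical helper: text of the first <name> element, or null if absent.
// A regular expression stands in for a real XML parser here.
function getText(xml, name) {
  const match = new RegExp("<" + name + ">([^<]*)</" + name + ">").exec(xml);
  return match ? match[1] : null;
}

// Builds the MESSAGE output from the three elements the application knows.
function formatMessage(xml) {
  return [
    "MESSAGE",
    "To: " + getText(xml, "to"),
    "From: " + getText(xml, "from"),
    getText(xml, "body")
  ].join("\n");
}

const original =
  "<note><to>Tove</to><from>Jani</from>" +
  "<body>Don't forget me this weekend!</body></note>";
const extended =
  "<note><date>2008-01-10</date><to>Tove</to><from>Jani</from>" +
  "<heading>Reminder</heading><body>Don't forget me this weekend!</body></note>";

console.log(formatMessage(original) === formatMessage(extended)); // true
```

An application that instead relied on element positions (for example, "the first child is <to>") would break when <date> was added, which is why lookup by name is the safer design.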
XML Attributes
XML elements can have attributes in the start tag, just like HTML.
Attributes provide additional information about elements.
XML Attributes
From HTML you will remember this: <img src="computer.gif">. The "src" attribute provides
additional information about the <img> element.
In HTML (and in XML) attributes provide additional information about elements:
<img src="computer.gif">
<a href="demo.asp">
Attributes often provide information that is not a part of the data. In the example below, the
file type is irrelevant to the data, but important to the software that wants to manipulate the
element:
<file type="gif">computer.gif</file>
XML Attributes Must be Quoted
Attribute values must always be enclosed in quotes, but either single or double quotes can be
used. For a person's sex, the person tag can be written like this:
<person sex="female">
or like this:
<person sex='female'>
If the attribute value itself contains double quotes you can use single quotes, like in this
example:
<gangster name='George "Shotgun" Ziegler'>
or you can use character entities:
<gangster name="George &quot;Shotgun&quot; Ziegler">
XML Elements vs. Attributes
Take a look at these examples:
<person sex="female">
<firstname>Anna</firstname>
<lastname>Smith</lastname>
</person>
<person>
<sex>female</sex>
<firstname>Anna</firstname>
<lastname>Smith</lastname>
</person>
In the first example sex is an attribute. In the last, sex is an element. Both examples provide the
same information.
There are no rules about when to use attributes and when to use elements. Attributes are
handy in HTML. In XML my advice is to avoid them. Use elements instead.
My Favorite Way
The following three XML documents contain exactly the same information:
A date attribute is used in the first example:
<note date="10/01/2008">
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
A date element is used in the second example:
<note>
<date>10/01/2008</date>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
An expanded date element is used in the third: (THIS IS MY FAVORITE):
<note>
<date>
<day>10</day>
<month>01</month>
<year>2008</year>
</date>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
Avoid XML Attributes?
Some of the problems with using attributes are:
• attributes cannot contain multiple values (elements can)
• attributes cannot contain tree structures (elements can)
• attributes are not easily expandable (for future changes)
Attributes are difficult to read and maintain. Use elements for data. Use attributes for
information that is not relevant to the data. Don't end up like this:
<note day="10" month="01" year="2008"
to="Tove" from="Jani" heading="Reminder"
body="Don't forget me this weekend!">
</note>
XML Attributes for Metadata
Sometimes ID references are assigned to elements. These IDs can be used to identify XML
elements in much the same way as the ID attribute in HTML. This example demonstrates this:
<messages>
<note id="501">
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
<note id="502">
<to>Jani</to>
<from>Tove</from>
<heading>Re: Reminder</heading>
<body>I will not</body>
</note>
</messages>
The ID above is just an identifier, to identify the different notes. It is not a part of the note
itself. What I'm trying to say here is that metadata (data about data) should be stored as
attributes, and that data itself should be stored as elements.
XML Validation
XML with correct syntax is "Well Formed" XML.
XML validated against a DTD is "Valid" XML.
Well Formed XML Documents
A "Well Formed" XML document has correct XML syntax.
The syntax rules were described in the previous chapters:
• XML documents must have a root element
• XML elements must have a closing tag
• XML tags are case sensitive
• XML elements must be properly nested
• XML attribute values must be quoted
<?xml version="1.0" encoding="ISO-8859-1"?>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
Valid XML Documents
A "Valid" XML document is a "Well Formed" XML document, which also conforms to the rules
of a Document Type Definition (DTD):
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE note SYSTEM "Note.dtd">
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
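The DOCTYPE declaration in the example above is a reference to an external DTD file. The
declarations that make up the DTD are shown in the section below, written as an internal
DOCTYPE block; an external file such as Note.dtd would contain only the <!ELEMENT ...> lines.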
XML DTD
The purpose of a DTD is to define the structure of an XML document. It defines the structure
with a list of legal elements:
<!DOCTYPE note
[
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
]>
XML Schema
W3C supports an XML‐based alternative to DTD, called XML Schema:
<xs:element name="note">
<xs:complexType>
<xs:sequence>
<xs:element name="to" type="xs:string"/>
<xs:element name="from" type="xs:string"/>
<xs:element name="heading" type="xs:string"/>
<xs:element name="body" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
A General XML Validator
To help you check the syntax of your XML files, we have created an XML validator to syntax‐
check your XML.
Viewing XML Files
Raw XML files can be viewed in all major browsers.
Don't expect XML files to be displayed as HTML pages.
Viewing XML Files
<?xml version="1.0" encoding="ISO-8859-1"?>
‐ <note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
Look at this XML file: note.xml
The XML document will be displayed with color‐coded root and child elements. A plus (+) or
minus sign (‐) to the left of the elements can be clicked to expand or collapse the element
structure. To view the raw XML source (without the + and ‐ signs), select "View Page Source" or
"View Source" from the browser menu.
Note: Some browsers (older versions of Chrome, Opera, and Safari, for example) display only the
element text. To view the raw XML, right click the page and select "View Source".
Why Does XML Display Like This?
XML documents do not carry information about how to display the data.
Since XML tags are "invented" by the author of the XML document, browsers do not know if a
tag like <table> describes an HTML table or a dining table.
Without any information about how to display the data, most browsers will just display the
XML document as it is.
In the next chapters, we will take a look at different solutions to the display problem, using CSS,
XSLT and JavaScript.
Displaying XML with CSS
With CSS (Cascading Style Sheets) you can add display information to an XML document.
Displaying your XML Files with CSS?
It is possible to use CSS to format an XML document.
Below is an example of how to use a CSS style sheet to format an XML document:
Below is a fraction of the XML file. The second line links the XML file to the CSS file:
<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml-stylesheet type="text/css" href="cd_catalog.css"?>
<CATALOG>
<CD>
<TITLE>Empire Burlesque</TITLE>
<ARTIST>Bob Dylan</ARTIST>
<COUNTRY>USA</COUNTRY>
<COMPANY>Columbia</COMPANY>
<PRICE>10.90</PRICE>
<YEAR>1985</YEAR>
</CD>
<CD>
<TITLE>Hide your heart</TITLE>
<ARTIST>Bonnie Tyler</ARTIST>
<COUNTRY>UK</COUNTRY>
<COMPANY>CBS Records</COMPANY>
<PRICE>9.90</PRICE>
<YEAR>1988</YEAR>
</CD>
.
.
.
</CATALOG>
Formatting XML with CSS is not the most common method.
The W3C recommends using XSLT instead. See the next chapter.
Displaying XML with XSLT
With XSLT you can transform an XML document into HTML.
Displaying XML with XSLT
XSLT is the recommended style sheet language of XML. XSLT (eXtensible Stylesheet Language
Transformations) is far more sophisticated than CSS.
XSLT can be used to transform XML into HTML before it is displayed by a browser.
Transforming XML with XSLT on the Server
When an XML file is displayed with XSLT in this way, the transformation is done by the browser
when the browser reads the XML file.
Different browsers may produce different result when transforming XML with XSLT. To reduce
this problem the XSLT transformation can be done on the server.
The XMLHttpRequest Object
With an XMLHttpRequest you can communicate with your server from inside a web page.
What is the XMLHttpRequest Object?
The XMLHttpRequest object is the developer’s dream, because you can:
• Update a web page with new data without reloading the page
• Request and receive new data from a server after the page has loaded
• Communicate with a server in the background
XMLHttpRequest Example
A typical use is a suggestion box: when the user types in an input box, an HTTP request is sent
to the server and name suggestions are returned from a name list.
Creating an XMLHttpRequest Object
Creating an XMLHttpRequest object is done with one single line of JavaScript.
In all modern browsers:
var xmlhttp=new XMLHttpRequest()
In older Microsoft browsers (IE 5 and 6):
var xmlhttp=new ActiveXObject("Microsoft.XMLHTTP")
In the next chapter, we will use the XMLHttpRequest object to retrieve XML information from a
server.
The XMLHttpRequest object is supported in all modern browsers.
Is the XMLHttpRequest Object a W3C Standard?
The XMLHttpRequest object did not originate in a W3C Recommendation. It is now specified in the
WHATWG XMLHttpRequest Living Standard.
The W3C DOM Level 3 "Load and Save" specification contains some similar functionality, but it
was not widely implemented in browsers.
XML Parser
Most browsers have a built‐in XML parser to read and manipulate XML.
The parser converts XML into a JavaScript accessible object (the XML DOM).
XML Parser
The XML DOM contains methods (functions) to traverse XML trees, access, insert, and delete
nodes.
However, before an XML document can be accessed and manipulated, it must be loaded into
an XML DOM object.
An XML parser reads XML, and converts it into an XML DOM object that can be accessed with
JavaScript.
Most browsers have a built‐in XML parser.
Load an XML Document
The following JavaScript fragment loads an XML document ("books.xml"):
Example
if (window.XMLHttpRequest)
{
xhttp=new XMLHttpRequest();
}
else // Internet Explorer 5/6
{
xhttp=new ActiveXObject("Microsoft.XMLHTTP");
}
xhttp.open("GET","books.xml",false);
xhttp.send("");
xmlDoc=xhttp.responseXML;
Code explained:
• Create an XMLHTTP object
• Open the XMLHTTP object
• Send an XML HTTP request to the server
• Set the response as an XML DOM object
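The same four steps can also be carried out asynchronously, which is the approach browsers prefer today because a synchronous request (the false argument to open) blocks the page while it waits. The sketch below is one possible shape, not the method used in these notes. The function name loadXml, its callback signature, and the optional XhrImpl parameter (which only exists so the function can be exercised outside a browser) are all assumptions.

```javascript
// Hypothetical helper: loads an XML document without blocking the page.
// onDone(error, xmlDoc) is called once the request has finished.
function loadXml(url, onDone, XhrImpl) {
  const Xhr = XhrImpl || XMLHttpRequest;   // XhrImpl is only for testing
  const xhttp = new Xhr();                 // 1. create the object
  xhttp.onreadystatechange = function () {
    if (xhttp.readyState !== 4) return;    // 4 means the request is done
    if (xhttp.status === 200) {
      onDone(null, xhttp.responseXML);     // 4. response as an XML DOM object
    } else {
      onDone(new Error("HTTP status " + xhttp.status), null);
    }
  };
  xhttp.open("GET", url, true);            // 2. open; true = asynchronous
  xhttp.send();                            // 3. send the request
}

// Usage in a browser (skipped where XMLHttpRequest does not exist):
if (typeof XMLHttpRequest !== "undefined") {
  loadXml("books.xml", function (error, xmlDoc) {
    if (error) { console.log(error.message); return; }
    console.log(xmlDoc.documentElement.nodeName);
  });
}
```

The code that uses xmlDoc has to live inside the callback, because loadXml returns before the server has answered.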
Load an XML String
The following code loads and parses an XML string:
Example
if (window.DOMParser)
{
parser=new DOMParser();
xmlDoc=parser.parseFromString(text,"text/xml");
}
else // Internet Explorer
{
xmlDoc=new ActiveXObject("Microsoft.XMLDOM");
xmlDoc.async=false;
xmlDoc.loadXML(text);
}
Note: Internet Explorer uses the loadXML() method to parse an XML string, while other
browsers use the DOMParser object.
Access Across Domains
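For security reasons, modern browsers restrict access across domains (the same-origin policy).
This means that both the web page and the XML file it tries to load must be located on the
same server, unless the other server explicitly allows the request (for example, with CORS headers).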
The examples on W3Schools all open XML files located on the W3Schools domain.
If you want to use the example above on one of your web pages, the XML files you load must be
located on your own server.
The XML DOM
In the next chapter of this tutorial, you will learn how to access and retrieve data from the XML
document object (the XML DOM).
XML DOM
The DOM (Document Object Model) defines a standard way for accessing and manipulating
documents.
The XML DOM
The XML DOM (XML Document Object Model) defines a standard way for accessing and
manipulating XML documents.
The DOM views XML documents as a tree‐structure. All elements can be accessed through the
DOM tree. Their content (text and attributes) can be modified or deleted, and new elements
can be created. The elements, their text, and their attributes are all known as nodes.
In the examples below we use the following DOM reference to get the text from the <to>
element:
xmlDoc.getElementsByTagName("to")[0].childNodes[0].nodeValue
• xmlDoc ‐ the XML document created by the parser.
• getElementsByTagName("to")[0] ‐ the first <to> element
• childNodes[0] ‐ the first child of the <to> element (the text node)
• nodeValue ‐ the value of the node (the text itself)
The HTML DOM
The HTML DOM (HTML Document Object Model) defines a standard way for accessing and
manipulating HTML documents.
All HTML elements can be accessed through the HTML DOM.
In the examples below we use the following DOM reference to change the text of the HTML
element where id="to":
document.getElementById("to").innerHTML=
• document ‐ the HTML document
• getElementById("to") ‐ the HTML element where id="to"
• innerHTML ‐ the inner text of the HTML element
Load an XML File ‐ A Cross browser Example
The following code loads an XML document ("note.xml") into the XML parser:
Example
<html>
<body>
<h1>W3Schools Internal Note</h1>
<p><b>To:</b> <span id="to"></span><br />
<b>From:</b> <span id="from"></span><br />
<b>Message:</b> <span id="message"></span></p>
<script type="text/javascript">
if (window.XMLHttpRequest)
{
xhttp=new XMLHttpRequest()
}
else
{
xhttp=new ActiveXObject("Microsoft.XMLHTTP")
}
xhttp.open("GET","note.xml",false);
xhttp.send("");
xmlDoc=xhttp.responseXML;
document.getElementById("to").innerHTML=
xmlDoc.getElementsByTagName("to")[0].childNodes[0].nodeValue;
document.getElementById("from").innerHTML=
xmlDoc.getElementsByTagName("from")[0].childNodes[0].nodeValue;
document.getElementById("message").innerHTML=
xmlDoc.getElementsByTagName("body")[0].childNodes[0].nodeValue;
</script>
</body>
</html>
Important Note
To extract the text "Jani" from the XML, the syntax is:
getElementsByTagName("from")[0].childNodes[0].nodeValue
In the XML example there is only one <from> tag, but you still have to specify the index
[0], because the DOM method getElementsByTagName() returns an array-like list of all <from>
nodes.
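One pitfall of this access chain is that it throws an error when the element is missing or has no text, because [0] is then undefined. A small wrapper can guard against that. The sketch below assumes only that the document object offers getElementsByTagName() and that elements expose childNodes; the helper name firstText and the stand-in document are hypothetical.

```javascript
// Hypothetical helper: returns the text of the first <tagName> element,
// or "" when the element is missing or has no text node.
function firstText(xmlDoc, tagName) {
  const elements = xmlDoc.getElementsByTagName(tagName);
  if (elements.length === 0) return "";
  const textNode = elements[0].childNodes[0];
  return textNode ? textNode.nodeValue : "";
}

// Stand-in for a parsed note so the sketch also runs outside a browser.
// In a web page, pass the xmlDoc produced by the parser instead.
const fakeDoc = {
  getElementsByTagName(name) {
    const data = {
      from: [{ childNodes: [{ nodeValue: "Jani" }] }],
      heading: [{ childNodes: [] }]          // an empty <heading></heading>
    };
    return data[name] || [];
  }
};

console.log(firstText(fakeDoc, "from"));     // Jani
```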
Load an XML String ‐ A Cross browser Example
The following code loads and parses an XML string:
Example
<html>
<body>
<h1>W3Schools Internal Note</h1>
<p><b>To:</b> <span id="to"></span><br />
<b>From:</b> <span id="from"></span><br />
<b>Message:</b> <span id="message"></span></p>
<script>
text="<note>";
text=text+"<to>Tove</to>";
text=text+"<from>Jani</from>";
text=text+"<heading>Reminder</heading>";
text=text+"<body>Don't forget me this weekend!</body>";
text=text+"</note>";
if (window.DOMParser)
{
parser=new DOMParser();
xmlDoc=parser.parseFromString(text,"text/xml");
}
else // Internet Explorer
{
xmlDoc=new ActiveXObject("Microsoft.XMLDOM");
xmlDoc.async=false;
xmlDoc.loadXML(text);
}
document.getElementById("to").innerHTML=
xmlDoc.getElementsByTagName("to")[0].childNodes[0].nodeValue;
document.getElementById("from").innerHTML=
xmlDoc.getElementsByTagName("from")[0].childNodes[0].nodeValue;
document.getElementById("message").innerHTML=
xmlDoc.getElementsByTagName("body")[0].childNodes[0].nodeValue;
</script>
</body>
</html>
Note: Internet Explorer uses the loadXML() method to parse an XML string, while other
browsers use the DOMParser object.
XML to HTML
This chapter explains how to display XML data as HTML.
Display XML Data in HTML
In the example below, we loop through an XML file (cd_catalog.xml), and display each CD
element as an HTML table row:
Example
<html>
<body>
<script type="text/javascript">
if (window.XMLHttpRequest)
{
xhttp=new XMLHttpRequest();
}
else // Internet Explorer 5/6
{
xhttp=new ActiveXObject("Microsoft.XMLHTTP");
}
xhttp.open("GET","cd_catalog.xml",false);
xhttp.send("");
xmlDoc=xhttp.responseXML;
document.write("<table border='1'>");
var x=xmlDoc.getElementsByTagName("CD");
for (i=0;i<x.length;i++)
{
document.write("<tr><td>");
document.write(x[i].getElementsByTagName("ARTIST")[0].childNodes[0].nodeValue);
document.write("</td><td>");
document.write(x[i].getElementsByTagName("TITLE")[0].childNodes[0].nodeValue);
document.write("</td></tr>");
}
document.write("</table>");
</script>
</body>
</html>
Example explained
• We check the browser, and load the XML using the correct parser (explained in the
previous chapter)
• We create an HTML table with <table border="1">
• We use getElementsByTagName() to get all XML CD nodes
• For each CD node, we display data from ARTIST and TITLE as table data.
• We end the table with </table>
XML Application
This chapter demonstrates a small XML application built with HTML and JavaScript.
The XML Example Document
Look at the following XML document ("cd_catalog.xml"), that represents a CD catalog:
<?xml version="1.0" encoding="ISO-8859-1"?>
<CATALOG>
<CD>
<TITLE>Empire Burlesque</TITLE>
<ARTIST>Bob Dylan</ARTIST>
<COUNTRY>USA</COUNTRY>
<COMPANY>Columbia</COMPANY>
<PRICE>10.90</PRICE>
<YEAR>1985</YEAR>
</CD>
.
Load the XML Document
To load the XML document (cd_catalog.xml), we use the same code as we used in the XML
Parser chapter:
if (window.XMLHttpRequest)
{
xhttp=new XMLHttpRequest();
}
else // Internet Explorer 5/6
{
xhttp=new ActiveXObject("Microsoft.XMLHTTP");
}
xhttp.open("GET","cd_catalog.xml",false);
xhttp.send("");
xmlDoc=xhttp.responseXML;
After the execution of this code, xmlDoc is an XML DOM object, accessible by JavaScript.
Display XML Data as an HTML Table
The following code displays an HTML table filled with data from the XML DOM object:
Example
document.write("<table border='1'>");
var x=xmlDoc.getElementsByTagName("CD");
for (i=0;i<x.length;i++)
{
document.write("<tr><td>");
document.write(x[i].getElementsByTagName("ARTIST")[0].childNodes[0].nodeValue);
document.write("</td><td>");
document.write(x[i].getElementsByTagName("TITLE")[0].childNodes[0].nodeValue);
document.write("</td></tr>");
}
document.write("</table>");
For each CD element in the XML document, a table row is created. Each row contains two
table cells holding the ARTIST and TITLE of the current CD element.
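Calling document.write() once per fragment only works while the page is still loading. A common alternative is to assemble the rows as one string and assign it to an element's innerHTML. The sketch below assumes the ARTIST and TITLE values have already been read from the DOM into plain objects; the names escapeHtml and buildTable and the sample array are illustrative, not part of the tutorial.

```javascript
// Escape characters that have a special meaning in HTML.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Build the same two-column table as the loop above, as a single string.
function buildTable(cds) {
  const rows = cds.map(
    (cd) =>
      "<tr><td>" + escapeHtml(cd.artist) + "</td><td>" + escapeHtml(cd.title) + "</td></tr>"
  );
  return "<table border='1'>" + rows.join("") + "</table>";
}

// Sample data (first record of cd_catalog.xml).
const sample = [{ artist: "Bob Dylan", title: "Empire Burlesque" }];
const html = buildTable(sample);
```

In a page, the resulting string could be assigned to the innerHTML property of any container element.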
Display XML Data in any HTML Element
XML data can be copied into any HTML element that can display text.
The code below is part of the <head> section of the HTML file. It gets the XML data from the
first <CD> element and displays it in the HTML element with the id="show":
Example
var x=xmlDoc.getElementsByTagName("CD");
i=0;
function display()
{
artist=(x[i].getElementsByTagName("ARTIST")[0].childNodes[0].nodeValue);
title=(x[i].getElementsByTagName("TITLE")[0].childNodes[0].nodeValue);
year=(x[i].getElementsByTagName("YEAR")[0].childNodes[0].nodeValue);
txt="Artist: " + artist + "<br />Title: " + title + "<br />Year: "+ year;
document.getElementById("show").innerHTML=txt;
}
The body of the HTML document contains an onload event attribute that calls the display()
function when the page is loaded. It also contains a <div id='show'> element to receive the XML
data.
<body onload="display()">
<div id='show'></div>
</body>
In the example above, you will only see data from the first CD element in the XML document.
To navigate to the next CD element, you have to add some more code.
Add a Navigation Script
To add navigation to the example above, create two functions called next() and previous():
Example
function next()
{
if (i<x.length-1)
{
i++;
display();
}
}
function previous()
{
if (i>0)
{
i--;
display();
}
}
The next() function displays the next CD, unless you are on the last CD element.
The previous() function displays the previous CD, unless you are at the first CD element.
The next() and previous() functions are called by clicking next/previous buttons:
<input type="button" onclick="previous()" value="previous" />
<input type="button" onclick="next()" value="next" />
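The bounds checks in next() and previous() can be isolated from the page so that they are easy to reason about on their own. This is a minimal sketch, assuming a hypothetical createNavigator(count) helper that only tracks the index; in the page, display() would still be called after each move.

```javascript
// Hypothetical helper: tracks the current position among `count` CD elements.
function createNavigator(count) {
  let index = 0;
  return {
    current: () => index,
    // Move forward unless already on the last element.
    next() {
      if (index < count - 1) index++;
      return index;
    },
    // Move back unless already on the first element.
    previous() {
      if (index > 0) index--;
      return index;
    },
  };
}

const nav = createNavigator(3);
nav.previous(); // already at the first element, index stays 0
nav.next();     // index becomes 1
nav.next();     // index becomes 2
nav.next();     // already at the last element, index stays 2
```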
XML Namespaces
XML Namespaces provide a method to avoid element name conflicts.
Name Conflicts
In XML, element names are defined by the developer. This often results in a conflict when
trying to mix XML documents from different XML applications.
This XML carries HTML table information:
<table>
<tr>
<td>Apples</td>
<td>Bananas</td>
</tr>
</table>
This XML carries information about a table (a piece of furniture):
<table>
<name>African Coffee Table</name>
<width>80</width>
<length>120</length>
</table>
If these XML fragments were added together, there would be a name conflict. Both contain a
<table> element, but the elements have different content and meaning.
An XML parser will not know how to handle these differences.
Solving the Name Conflict Using a Prefix
Name conflicts in XML can easily be avoided using a name prefix.
This XML carries information about an HTML table, and a piece of furniture:
<h:table>
<h:tr>
<h:td>Apples</h:td>
<h:td>Bananas</h:td>
</h:tr>
</h:table>
<f:table>
<f:name>African Coffee Table</f:name>
<f:width>80</f:width>
<f:length>120</f:length>
</f:table>
In the example above, there will be no conflict because the two <table> elements have
different names.
XML Namespaces ‐ The xmlns Attribute
When using prefixes in XML, a so‐called namespace for the prefix must be defined.
The namespace is defined by the xmlns attribute in the start tag of an element.
The namespace declaration has the following syntax: xmlns:prefix="URI".
<root>
<h:table xmlns:h="http://www.w3.org/TR/html4/">
<h:tr>
<h:td>Apples</h:td>
<h:td>Bananas</h:td>
</h:tr>
</h:table>
<f:table xmlns:f="http://www.w3schools.com/furniture">
<f:name>African Coffee Table</f:name>
<f:width>80</f:width>
<f:length>120</f:length>
</f:table>
</root>
In the example above, the xmlns attribute in each <table> tag gives the h: and f: prefixes a
qualified namespace. When a namespace is defined for an element, all child elements with the
same prefix are associated with the same namespace.
Namespaces can be declared in the elements where they are used or in the XML root element:
<root
xmlns:h="http://www.w3.org/TR/html4/"
xmlns:f="http://www.w3schools.com/furniture">
<h:table>
<h:tr>
<h:td>Apples</h:td>
<h:td>Bananas</h:td>
</h:tr>
</h:table>
<f:table>
<f:name>African Coffee Table</f:name>
<f:width>80</f:width>
<f:length>120</f:length>
</f:table>
</root>
Note: The namespace URI is not used by the parser to look up information.
The purpose is to give the namespace a unique name. However, often companies use the
namespace as a pointer to a web page containing namespace information.
Uniform Resource Identifier (URI)
A Uniform Resource Identifier (URI) is a string of characters which identifies an Internet
Resource.
The most common URI is the Uniform Resource Locator (URL), which identifies an Internet
domain address. Another, less common type of URI is the Uniform Resource Name (URN).
Default Namespaces
Defining a default namespace for an element saves us from using prefixes in all the child
elements. It has the following syntax:
xmlns="namespaceURI"
This XML carries HTML table information:
<table xmlns="http://www.w3.org/TR/html4/">
<tr>
<td>Apples</td>
<td>Bananas</td>
</tr>
</table>
This XML carries information about a piece of furniture:
<table xmlns="http://www.w3schools.com/furniture">
<name>African Coffee Table</name>
<width>80</width>
<length>120</length>
</table>
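What a namespace-aware parser does with a prefix can be illustrated without a parser. The sketch below is an illustration only: resolveName and the scope map (prefix to URI, with the empty string standing for the default namespace) are names invented here. It shows that two elements both called table become distinct once each name is paired with its namespace URI.

```javascript
// Hypothetical helper: expand a qualified element name using the declarations in scope.
// `scope` is a Map of prefix -> namespace URI; the key "" holds the default namespace.
function resolveName(qname, scope) {
  const colon = qname.indexOf(":");
  const prefix = colon === -1 ? "" : qname.slice(0, colon);
  const localName = colon === -1 ? qname : qname.slice(colon + 1);
  if (prefix !== "" && !scope.has(prefix)) {
    throw new Error("Undeclared namespace prefix: " + prefix);
  }
  // An unprefixed element with no default namespace is in no namespace (null).
  const namespace = scope.has(prefix) ? scope.get(prefix) : null;
  return { namespace, localName };
}

const scope = new Map([
  ["h", "http://www.w3.org/TR/html4/"],
  ["f", "http://www.w3schools.com/furniture"],
]);
const htmlTable = resolveName("h:table", scope);
const furnitureTable = resolveName("f:table", scope);
// Same local name, different namespace URI: the two elements no longer collide.
```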
Namespaces in Real Use
XSLT is an XML language that can be used to transform XML documents into other formats, like
HTML.
In the XSLT document below, you can see that most of the tags are HTML tags.
The tags that are not HTML tags have the prefix xsl, identified by the namespace
xmlns:xsl="http://www.w3.org/1999/XSL/Transform":
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<html>
<body>
<h2>My CD Collection</h2>
<table border="1">
<tr>
<th align="left">Title</th>
<th align="left">Artist</th>
</tr>
<xsl:for-each select="catalog/cd">
<tr>
<td><xsl:value-of select="title"/></td>
<td><xsl:value-of select="artist"/></td>
</tr>
</xsl:for-each>
</table>
</body>
</html>
</xsl:template>
</xsl:stylesheet>
XML CDATA
All text in an XML document is normally parsed by the parser.
Text inside a CDATA section, however, is not parsed as markup; it is passed through as literal character data.
PCDATA ‐ Parsed Character Data
XML parsers normally parse all the text in an XML document.
When an XML element is parsed, the text between the XML tags is also parsed:
<message>This text is also parsed</message>
The parser does this because XML elements can contain other elements, as in this example,
where the <name> element contains two other elements (first and last):
<name><first>Bill</first><last>Gates</last></name>
and the parser will break it up into sub‐elements like this:
<name>
<first>Bill</first>
<last>Gates</last>
</name>
Parsed Character Data (PCDATA) is a term used about text data that will be parsed by the XML
parser.
CDATA ‐ (Unparsed) Character Data
The term CDATA is used about text data that should not be parsed by the XML parser.
Characters like "<" and "&" are illegal in element content unless they are escaped.
"<" will generate an error because the parser interprets it as the start of a new element.
"&" will generate an error because the parser interprets it as the start of an entity reference.
Some text, like JavaScript code, contains a lot of "<" or "&" characters. To avoid errors, script
code can be defined as CDATA.
Everything inside a CDATA section is treated as plain character data, not as markup.
A CDATA section starts with "<![CDATA[" and ends with "]]>":
<script>
<![CDATA[
function matchwo(a,b)
{
if (a < b && a < 0)
{
return 1;
}
else
{
return 0;
}
}
]]>
</script>
In the example above, the parser does not interpret anything inside the CDATA section as markup.
Notes on CDATA sections:
A CDATA section cannot contain the string "]]>". Nested CDATA sections are not allowed.
The "]]>" that marks the end of the CDATA section cannot contain spaces or line breaks.
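Because a CDATA section cannot contain "]]>", text that may include that sequence has to be split across two sections when it is wrapped. A small sketch of this widely used workaround follows; the function name toCdata is an assumption made for illustration.

```javascript
// Wrap arbitrary text in CDATA. Any "]]>" in the text is split so that
// "]]" ends one section and ">" starts the next one.
function toCdata(text) {
  return "<![CDATA[" + String(text).split("]]>").join("]]]]><![CDATA[>") + "]]>";
}

// Script code with "<" and "&" needs no escaping inside the section.
const wrapped = toCdata("if (a < b && a < 0) { return 1; }");

// Text containing "]]>" is emitted as two adjacent CDATA sections.
const split = toCdata("x[y[0]]>z");
```

A parser reads adjacent CDATA sections as one run of character data, so the original text is recovered unchanged.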
XML Encoding
XML documents can contain non ASCII characters, like Norwegian æ ø å , or French ê è é.
To avoid errors, specify the XML encoding, or save XML files as Unicode.
XML Encoding Errors
If you load an XML document, you can get two different errors indicating encoding problems:
An invalid character was found in text content.
You get this error if your XML contains non ASCII characters, and the file was saved as single‐
byte ANSI (or ASCII) with no encoding specified.
Switch from current encoding to specified encoding not supported.
You get this error if your XML file was saved as double-byte Unicode (UTF-16) but declares an
8-bit encoding (Windows-1252, ISO-8859-1, UTF-8).
You also get this error if your XML file was saved as single-byte ANSI (or ASCII) but declares a
double-byte encoding (UTF-16).
Windows Notepad
Windows Notepad saves files as single-byte ANSI (ASCII) by default.
If you select "Save as...", you can specify double‐byte Unicode (UTF‐16).
Save the XML file below as Unicode (note that the document does not contain any encoding
attribute):
<?xml version="1.0"?>
<note>
<from>Jani</from>
<to>Tove</to>
<message>Norwegian: æøå. French: êèé</message>
</note>
The file above, note_encode_none_u.xml, will NOT generate an error. But if you declare an
encoding that does not match the bytes the file was saved with, it will.
The following encoding declaration will give an error message:
<?xml version="1.0" encoding="windows-1252"?>
The following encoding declaration will give an error message:
<?xml version="1.0" encoding="ISO-8859-1"?>
The following encoding declaration will give an error message:
<?xml version="1.0" encoding="UTF-8"?>
The following encoding declaration will NOT give an error:
<?xml version="1.0" encoding="UTF-16"?>
Conclusion
• Always use the encoding attribute
• Use an editor that supports encoding
• Make sure you know what encoding the editor uses
• Use the same encoding in your encoding attribute
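The mismatch behind both error messages can be seen by looking at the bytes. This is a small sketch, assuming a Node.js environment (Buffer is a Node.js API and is not available in browsers). The same three Norwegian letters occupy a different number of bytes in each encoding, and bytes written as ISO-8859-1 do not decode back to the original text when read as UTF-8.

```javascript
const text = "æøå";

// The same text encoded three ways (Node.js Buffer API).
const latin1Bytes = Buffer.from(text, "latin1");  // ISO-8859-1: one byte per letter
const utf8Bytes = Buffer.from(text, "utf8");      // UTF-8: two bytes per letter for these characters
const utf16Bytes = Buffer.from(text, "utf16le");  // UTF-16 without a byte order mark: two bytes per letter

// Reading ISO-8859-1 bytes as if they were UTF-8 does not give the text back.
const misread = latin1Bytes.toString("utf8");

// Reading them with the encoding they were written in does.
const roundTrip = latin1Bytes.toString("latin1");
```

This is why the encoding attribute has to name the encoding the editor actually used when saving the file.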
XML on the Server
XML files are plain text files just like HTML files.
XML can easily be stored and generated by a standard web server.
Storing XML Files on the Server
XML files can be stored on an Internet server exactly the same way as HTML files.
Start Windows Notepad and write the following lines:
<?xml version="1.0" encoding="ISO-8859-1"?>
<note>
<from>Jani</from>
<to>Tove</to>
<message>Remember me this weekend</message>
</note>
Save the file on your web server with a proper name like "note.xml".
Generating XML with ASP
XML can be generated on a server without any installed XML software.
To generate an XML response from the server ‐ simply write the following code and save it as
an ASP file on the web server:
<%
response.ContentType="text/xml"
response.Write("<?xml version='1.0' encoding='ISO-8859-1'?>")
response.Write("<note>")
response.Write("<from>Jani</from>")
response.Write("<to>Tove</to>")
response.Write("<message>Remember me this weekend</message>")
response.Write("</note>")
%>
Note that the content type of the response must be set to "text/xml".
Generating XML with PHP
To generate an XML response from the server using PHP, use the following code:
<?php
header("Content-type: text/xml");
echo "<?xml version='1.0' encoding='ISO-8859-1'?>";
echo "<note>";
echo "<from>Jani</from>";
echo "<to>Tove</to>";
echo "<message>Remember me this weekend</message>";
echo "</note>";
?>
Note that the content type of the response header must be set to "text/xml".
Generating XML From a Database
XML can be generated from a database without any installed XML software.
To generate an XML database response from the server, simply write the following code and
save it as an ASP file on the web server:
<%
response.ContentType = "text/xml"
set conn=Server.CreateObject("ADODB.Connection")
conn.provider="Microsoft.Jet.OLEDB.4.0;"
conn.open server.mappath("/db/database.mdb")
sql="select fname,lname from tblGuestBook"
set rs=Conn.Execute(sql)
response.write("<?xml version='1.0' encoding='ISO-8859-1'?>")
response.write("<guestbook>")
while (not rs.EOF)
response.write("<guest>")
response.write("<fname>" & rs("fname") & "</fname>")
response.write("<lname>" & rs("lname") & "</lname>")
response.write("</guest>")
rs.MoveNext()
wend
rs.close()
conn.close()
response.write("</guestbook>")
%>
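The scripts above concatenate values straight into the output. If a value contains "&" or "<", for instance a name such as "Smith & Sons", the response is no longer well-formed XML. The usual remedy is to escape each value before writing it. The sketch below is written in JavaScript, like the browser examples earlier in these notes; escapeXml and guestToXml are illustrative names, and a server-side script would apply the same replacements in its own language.

```javascript
// Replace the characters that are special in XML text and attribute values.
function escapeXml(value) {
  return String(value)
    .replace(/&/g, "&amp;") // must run first, before other entities are added
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Build one <guest> element from a record with fname and lname fields.
function guestToXml(guest) {
  return (
    "<guest>" +
    "<fname>" + escapeXml(guest.fname) + "</fname>" +
    "<lname>" + escapeXml(guest.lname) + "</lname>" +
    "</guest>"
  );
}

const xml = guestToXml({ fname: "Tove", lname: "Smith & Sons" });
```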
Transforming XML with XSLT on the Server
This ASP transforms an XML file to XHTML on the server:
<%
'Load XML
set xml = Server.CreateObject("Microsoft.XMLDOM")
xml.async = false
xml.load(Server.MapPath("simple.xml"))
'Load XSL
set xsl = Server.CreateObject("Microsoft.XMLDOM")
xsl.async = false
xsl.load(Server.MapPath("simple.xsl"))
'Transform file
Response.Write(xml.transformNode(xsl))
%>
Example explained
• The first block of code creates an instance of the Microsoft XML parser (XMLDOM), and
loads the XML file into memory.
• The second block of code creates another instance of the parser and loads the XSL file
into memory.
• The last line of code transforms the XML document using the XSL document, and sends
the result as XHTML to your browser. Nice!
Saving XML To a File Using ASP
This ASP example creates a simple XML document and saves it on the server:
<%
text="<note>"
text=text & "<to>Tove</to>"
text=text & "<from>Jani</from>"
text=text & "<heading>Reminder</heading>"
text=text & "<body>Don't forget me this weekend!</body>"
text=text & "</note>"
set xmlDoc=Server.CreateObject("Microsoft.XMLDOM")
xmlDoc.async=false
xmlDoc.loadXML(text)
xmlDoc.Save(Server.MapPath("test.xml"))
%>
XML DOM Advanced
The XML DOM (Document Object Model) defines a standard way for accessing and
manipulating XML documents.
The XML DOM
The DOM views XML documents as a tree‐structure. All elements can be accessed through the
DOM tree. Their content (text and attributes) can be modified or deleted, and new elements
can be created. The elements, their text, and their attributes are all known as nodes.
In an earlier chapter of this tutorial we introduced the XML DOM, and used the XML DOM
getElementsByTagName() method to retrieve data from a DOM tree.
In this chapter we will describe some other commonly used XML DOM methods. In the
examples below, we have used the XML file: books.xml.
Get the Value of an Element
The following code retrieves the text value of the first <title> element:
Example
x=xmlDoc.getElementsByTagName("title")[0].childNodes[0];
txt=x.nodeValue;
Get the Value of an Attribute
The following code retrieves the text value of the "lang" attribute of the first <title> element:
Example
txt=xmlDoc.getElementsByTagName("title")[0].getAttribute("lang");
Change the Value of an Element
The following code changes the text value of the first <title> element:
Example
x=xmlDoc.getElementsByTagName("title")[0].childNodes[0];
x.nodeValue="Easy Cooking";
Change the Value of an Attribute
The setAttribute() method can be used to change the value of an existing attribute, or to create
a new attribute.
The following code adds a new attribute called "edition" (with the value "first") to each <book>
element:
Example
x=xmlDoc.getElementsByTagName("book");
for(i=0;i<x.length;i++)
{
x[i].setAttribute("edition","first");
}
Create an Element
The createElement() method creates a new element node.
The createTextNode() method creates a new text node.
The appendChild() method adds a child node to a node (after the last child).
To create a new element with text content, it is necessary to create both an element node and
a text node.
The following code creates an element (<edition>), and adds it to the first <book> element:
Example
newel=xmlDoc.createElement("edition");
newtext=xmlDoc.createTextNode("First");
newel.appendChild(newtext);
x=xmlDoc.getElementsByTagName("book");
x[0].appendChild(newel);
Example explained:
• Create an <edition> element
• Create a text node with value = "First"
• Append the text node to the <edition> element
• Append the <edition> element to the first <book> element
Remove an Element
The removeChild() method removes a specified node (or element).
The following code fragment will remove the first node in the first <book> element:
Example
x=xmlDoc.getElementsByTagName("book")[0];
x.removeChild(x.childNodes[0]);
Note: The result of the example above may be different depending on what browser you use.
Firefox treats new lines as empty text nodes, Internet Explorer does not. You can read more
about this and how to avoid it in the XML DOM tutorial.
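One way to sidestep the difference is to skip any node that is not an element before reading or removing it. This sketch relies on the standard DOM nodeType value 1 for element nodes. The helper name firstElementOf is made up here, and the plain objects below only imitate DOM nodes so that the logic can be tried outside a browser.

```javascript
// Return the first child that is an element node (nodeType 1), or null if there is none.
function firstElementOf(node) {
  for (let i = 0; i < node.childNodes.length; i++) {
    if (node.childNodes[i].nodeType === 1) {
      return node.childNodes[i];
    }
  }
  return null;
}

// Stand-in for a <book> element whose first child is a whitespace-only text node.
const book = {
  childNodes: [
    { nodeType: 3, nodeValue: "\n  " },  // text node (line break and indentation)
    { nodeType: 1, nodeName: "title" },  // element node
  ],
};
const first = firstElementOf(book);
```

With a real document, passing the result to removeChild() (when it is not null) removes the first child element regardless of how the browser treats line breaks.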
XML in Real Life
Some examples of how XML can be used to exchange information.
Example: XML News
XMLNews is a specification for exchanging news and other information.
Using such a standard makes it easier for both news producers and news consumers to
produce, receive, and archive any kind of news information across different hardware,
software, and programming languages.
An example XMLNews document:
<?xml version="1.0" encoding="ISO-8859-1"?>
<nitf>
<head>
<title>Colombia Earthquake</title>
</head>
<body>
<headline>
<hl1>143 Dead in Colombia Earthquake</hl1>
</headline>
<byline>
<bytag>By Jared Kotler, Associated Press Writer</bytag>
</byline>
<dateline>
<location>Bogota, Colombia</location>
<date>Monday January 25 1999 7:28 ET</date>
</dateline>
</body>
</nitf>
Example: XML Weather Service
An example of an XML national weather service from NOAA (National Oceanic and Atmospheric
Administration):
<?xml version="1.0" encoding="ISO-8859-1" ?>
<current_observation>
<credit>NOAA's National Weather Service</credit>
<credit_URL>http://weather.gov/</credit_URL>
<image>
<url>http://weather.gov/images/xml_logo.gif</url>
<title>NOAA's National Weather Service</title>
<link>http://weather.gov</link>
</image>
<location>New York/John F. Kennedy Intl Airport, NY</location>
<station_id>KJFK</station_id>
<latitude>40.66</latitude>
<longitude>-73.78</longitude>
<observation_time_rfc822>Mon, 11 Feb 2008 06:51:00 -0500 EST
</observation_time_rfc822>
<weather>A Few Clouds</weather>
<temp_f>11</temp_f>
<temp_c>-12</temp_c>
<relative_humidity>36</relative_humidity>
<wind_dir>West</wind_dir>
<wind_degrees>280</wind_degrees>
<wind_mph>18.4</wind_mph>
<wind_gust_mph>29</wind_gust_mph>
<pressure_mb>1023.6</pressure_mb>
<pressure_in>30.23</pressure_in>
<dewpoint_f>-11</dewpoint_f>
<dewpoint_c>-24</dewpoint_c>
<windchill_f>-7</windchill_f>
<windchill_c>-22</windchill_c>
<visibility_mi>10.00</visibility_mi>
<icon_url_base>http://weather.gov/weather/images/fcicons/</icon_url_base>
<icon_url_name>nfew.jpg</icon_url_name>
<disclaimer_url>http://weather.gov/disclaimer.html</disclaimer_url>
<copyright_url>http://weather.gov/disclaimer.html</copyright_url>
</current_observation>
XML Summary. What is Next?
XML Summary
XML can be used to exchange, share, and store data.
XML documents form a tree structure that starts at "the root" and branches to "the leaves".
XML has very simple syntax rules. XML with correct syntax is "Well Formed". Valid XML also
validates against a DTD.
XSLT is used to transform XML into other formats like HTML.
All modern browsers have a built‐in XML parser that can read and manipulate XML.
The DOM (Document Object Model) defines a standard way for accessing XML.
The XMLHttpRequest object provides a way to communicate with a server after a web page has
loaded.
XML Namespaces provide a method to avoid element name conflicts.
Text inside a CDATA section is not parsed as markup.
Our XML examples also represent a summary of this XML tutorial.
What to Study Next?
Our recommendation is to learn about the XML DOM and XSLT.
If you want to learn more about validating XML, we recommend DTD and XML Schema.
Below is a short description of each subject.
XML DOM (Document Object Model)
The XML DOM defines a standard way for accessing and manipulating XML documents.
The XML DOM is platform and language independent and can be used by any programming
language like Java, JavaScript, and VBScript.
If you want to learn more about the DOM, please visit our XML DOM tutorial.
XSLT (Extensible Stylesheet Language Transformations)
XSLT is the style sheet language for XML files.
With XSLT you can transform XML documents into other formats, like XHTML.
If you want to learn more about XSLT, please visit our XSLT tutorial.
XML DTD (Document Type Definition)
The purpose of a DTD is to define what elements, attributes and entities are legal in an XML
document.
With DTD, each of your XML files can carry a description of its own format with it.
DTD can be used to verify that the data you receive, and your own data, is valid.
If you want to learn more about DTD, please visit our DTD tutorial.
XML Schema
XML Schema is an XML based alternative to DTD.
Unlike DTD, XML Schema has support for datatypes, and XML Schemas use XML syntax.
XML - The Benefits
• Simplicity
Information coded in XML is easy to read and understand, plus it can be processed easily
by computers.
• Openness
XML is a W3C standard, endorsed by software industry market leaders.
• Extensibility
There is no fixed set of tags. New tags can be created as they are needed.
• Self‐description
In traditional databases, data records require schemas set up by the database
administrator. XML documents can be stored without such definitions, because they
contain meta data in the form of tags and attributes. XML provides a basis for author
identification and versioning at the element level. Any XML tag can possess an unlimited
number of attributes such as author or version.
• Contains machine‐readable context information
Tags, attributes and element structure provide context information that can be used to
interpret the meaning of content, opening up new possibilities for highly efficient search
engines, intelligent data mining, agents, etc.
This is a major advantage over HTML or plain text, where context information is difficult
or impossible to evaluate.
• Separates content from presentation
XML tags describe meaning not presentation. The motto of HTML is: "I know how it
looks", whereas the motto of XML is: "I know what it means, and you tell me how it
should look." The look and feel of an XML document can be controlled by XSL style
sheets, allowing the look of a document (or of a complete Web site) to be changed
without touching the content of the document. Multiple views or presentations of the
same content are easily rendered.
• Supports multilingual documents and Unicode
This is important for the internationalization of applications.
• Facilitates the comparison and aggregation of data
The tree structure of XML documents allows documents to be compared and
aggregated efficiently element by element.
• Can embed multiple data types
XML documents can contain any possible data type ‐ from multimedia data (image,
sound, video) to active components (Java applets, ActiveX).
• Can embed existing data
Mapping existing data structures like file systems or relational databases to XML is
simple. XML supports multiple data formats and can cover all existing data structures.
• Provides a 'one‐server view' for distributed data
XML documents can consist of nested elements that are distributed over multiple
remote servers. XML is currently the most sophisticated format for distributed data ‐ the
World Wide Web can be seen as one huge XML database.
• Rapid adoption by industry
Software AG, IBM, Sun, Microsoft, Netscape, DataChannel, SAP and many others have
already announced support for XML. Microsoft will use XML as the exchange format for
its Office product line, while both Microsoft's and Netscape's Web browsers support
XML. SAP has announced support of XML through the SAP Business Connector with R/3.
Software AG supports XML in its Bolero and Natural product lines and provides Tamino,
a native XML database.
Introduction
XML is structured
XML documents are easily committed to a persistence layer
XML is platform independent, textual information
XML is an open standard
XML is language independent
DOM and SAX are open, language‐independent set of interfaces
XML is web enabled
XML is totally extensible
XML supports shareable structure (using DTDs)
XML enables interoperability
Vision
Introduction
There is a lot of hype surrounding XML, and a lot of hype surrounding Java. Together these
technologies propose to solve many of the most common (and persistent) general computing
problems that have been around for the last 20 years. XML and Java are not revolutionary in
the approach to solving these problems of interoperability of code and data across and within
platform and application boundaries. Rather, XML and Java provide solutions to these problems
by using the most successful strategies and techniques that have been honed and refined over
the last 20 years of computing.
In the following paragraphs, I will highlight some of the most basic and important advantages
that XML and Java provide to almost any system that uses them properly. This is by no means a
comprehensive list of benefits, but items in this list should appear across just about any use of
XML and Java technologies.
I will take a break from my normal pragmatic approach to getting you (the programmer) started
with using XML and Java and just talk about the high level (design level) benefits of this
wonderful combination. A good design is important to a good implementation for any system.
XML is structured
When you create your data using an XML editor (that you can write), you can not only input the
content of your data, but also define the structural relationships that exist inside your data. By
allowing you to define your own tags and create the proper structural relationships in your
information (with a DTD), you can use any XML parser to check the validity and integrity of the
data stored in your XML documents. This makes it very easy to validate the structure and
content of your information when you use XML. Without XML, you could also provide this
validation feature at the expense of developing the code to do this yourself. XML is a great time
saver because most of the features that are available in XML are used by most programmers
when working on most projects.
By using XML and Java, you can quickly create and use information that is properly structured
and valid. By using (or creating) DTDs and storing your information in XML documents, you have
a cross‐platform and language independent data validation mechanism (for free) in all your
projects!
You might use XML to define file formats to store information that is generated and used by
your applications. This is another use of the structured nature of XML. The only limitation is
that binary information can't be embedded directly in the body of XML documents; it has to be
encoded as text first (for example, with Base64). For example, if you
wrote a word processor in Java, you might choose to save your word processor documents to
an XML (actually your ApplicationML) file. If you use a DTD then your word processor would
also get input file format validation as a feature for free. There are many other advantages to
using XML as a file storage format for your applications, which will be illustrated later in the
chapter.
Here are some benefits of the structured nature of XML:
• XML parsers make your application code more reliable and quick to develop by
providing validity checking on your XML documents (if you use a DTD).
• XML allows you to easily generate XML documents (that contain your information),
since it is so structured.
• XML parsers allow you to code faster by giving you a parser for all your XML
documents (with and without DTDs).
XML documents are easily committed to a persistence layer
XML documents may be stored in files or databases. When stored in files, XML documents are
simply plain text files with tags (and possibly DTDs). It is very easy to save your XML documents
to a text file and pass the text file around to other machines, platforms and programs (as long
as they can understand the data). In the worst case scenario, XML documents (files) can be
viewed in a text editor on just about any platform.
XML documents are also naturally committed to a database (relational or object) or any other
kind of XML document store. There are commercial products available which allow you to save
XML documents to an XML storage layer (which is not a database per se), like Datachannel’s
XStore and ODI’s eXcelon. These XML store solutions are quite expensive ($10,000 to $20,000
range).
XML documents are also quite naturally retrieved from a persistence layer (databases, file
systems, XML stores). This lends XML to be used in real world applications where the
information being used by different parts of a system is the most important thing.
XML is platform independent, textual information
Information in an XML document is stored in plain text. This might seem like a restriction if
you were thinking of embedding binary information in an XML document. There are several
advantages to keeping things plain text. First, it is easy to write parsers and all other XML
enabling technology on different platforms. Second, it makes everything very interoperable by
staying with the lowest common denominator approach. This is the whole reason the web is so
successful despite all its flaws. By accepting and sending information in plain text format,
programs running on disparate platforms can communicate with each other. This also makes it
easy to integrate new programs on top of older ones (without rewriting the old programs), by
simply making the interface between the new and old program use XML.
For example, if you have an address book document stored in an XML file, created on a Mac,
that you would like to share with someone who has a PC, you can simply email them the plain
text address book XML document. This can't be done with binary encoded information, which is
totally platform (and program) dependent.
Another example is web enabling legacy systems. It is very feasible to create a Java web
enablement application server that simply uses the services provided by the underlying legacy
system. Instead of rewriting the legacy system, if the system can be made to communicate
results and parameters through XML, the new and old system can work together without
throwing away a company’s investment in the legacy system.
XML is an open standard
Making the W3C the keeper of the XML standard ensures that no single vendor can
cause interoperability problems between systems that use the open standard.
This should be reassuring to most companies making an investment in this technology. Because it
is vendor neutral, this solution proposes to keep even small companies out of reach of big
companies choosing to change the standards on them. For example, if a big company chooses
to change the platform at its whim, then most other companies relying on that platform suffer.
By keeping all data in XML and using XML in communications protocols, companies can
maximize the lifetime of their investment in their products and solutions.
XML is language independent
By being language independent, XML bypasses the requirement to have a standard binary
encoding or storage format. Language independence also fosters immense interoperability
amongst heterogeneous systems. It is also good for future compatibility. For example, if in the
future a product needs to be changed in order to deal with a new computing paradigm or
network protocol, by keeping XML flowing through the system, addition of a new layer to deal
with this change is feasible.
DOM and SAX are open, language‐independent sets of interfaces
By defining a set of programming language independent interfaces for accessing and
modifying XML documents (the DOM), the W3C made it easier for programmers to deal with XML.
SAX is similarly open, although it is not a W3C specification; it was developed by members of
the XML‐DEV mailing list. Not only does XML address the need for a standard information
encoding and storage format, it also gives programmers a standard way to use that information.
SAX is a very low level, event‐driven API, but it is more than what was available before it.
DOM is a higher level API that even provides a default object model for all XML documents
(saving the time of creating one from scratch if the data you are using is document data).
SAX, DOM and XML are very developer friendly, which matters because developers are going to decide
whether this technology will be adopted by the majority and become a successful effort
towards the goal of interoperable, platform‐ and device‐independent computing.
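The difference between the two APIs can be sketched in a few lines. The example below is only an illustration using Python's standard xml.sax and xml.dom.minidom modules; the helper names (ElementNameCollector, sax_element_names, dom_element_text) are made up for this sketch, and the sample document is a shortened version of the note example used earlier in these notes.

```python
import xml.sax
from xml.dom import minidom

NOTE = "<note><to>Tove</to><from>Jani</from><heading>Reminder</heading></note>"


class ElementNameCollector(xml.sax.ContentHandler):
    """SAX style: the parser calls us back per event; no tree is kept."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        # Called once for every opening tag, in document order.
        self.names.append(name)


def sax_element_names(text):
    handler = ElementNameCollector()
    xml.sax.parseString(text.encode("utf-8"), handler)
    return handler.names


def dom_element_text(text, tag):
    """DOM style: the whole document becomes a tree we can navigate."""
    document = minidom.parseString(text)
    node = document.getElementsByTagName(tag)[0]
    return node.firstChild.data


print(sax_element_names(NOTE))
print(dom_element_text(NOTE, "from"))
```

With SAX the application has to build whatever data structure it needs from the stream of events, whereas DOM hands back a ready‐made object model at the cost of holding the whole document in memory.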
XML is web enabled
XML is derived from SGML, and so was HTML. So in essence, the current infrastructure available
today to deal with HTML content can be re‐used to work with XML. This is a very big advantage
towards delivering XML content using the software and networking infrastructure already in
place today. This should be a big plus in considering XML for use in any of your projects,
because XML naturally lends itself to being used over the web.
Even if clients don’t support XML natively, it is not a big hindrance. In fact, Java with Servlets
(on the server side) can convert XML with stylesheets to generate plain HTML that can be
displayed in all web browsers.
Using XML to pass parameters and return values on servers makes it very easy to allow these
servers to be web‐enabled. A thin server side Java layer might be added that interacts with web
browsers using HTML and translates the requests and responses from the client into XML, that
is then fed into the server.
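A thin translation layer of this kind might look roughly like the sketch below. It is written in Python rather than Java purely for brevity, and the element and attribute names (request, param, operation, response) are assumptions made for the illustration, not part of any standard.

```python
import xml.etree.ElementTree as ET


def request_to_xml(operation, params):
    """Turn web form parameters into an XML request for the back-end server."""
    root = ET.Element("request", {"operation": operation})
    for name, value in params.items():
        param = ET.SubElement(root, "param", {"name": name})
        param.text = str(value)
    return ET.tostring(root, encoding="unicode")


def xml_to_response(xml_text):
    """Turn the server's XML reply into a plain dictionary for HTML rendering."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}


print(request_to_xml("lookup", {"id": 12}))
print(xml_to_response("<response><status>ok</status><total>3</total></response>"))
```

The legacy server only ever sees XML, so the same server can later be fronted by a different kind of client without being changed.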
XML is totally extensible
By not predefining any tags in the XML Recommendation, the W3C allowed developers full
control over customizing their data as they see fit. This makes XML very attractive for encoding
data that already exists in legacy databases (by using database metadata, and other schema
information). This extensibility of XML makes it such a great fit when trying to get different
systems to work with each other.
XML supports shareable structure (using DTDs)
Since the structure of an XML document can be specified in a DTD, DTDs provide a simple way to
exchange XML documents that share a structure. For example, if two
software systems need to exchange information and both of the systems conform to one
DTD, the two systems can process information from each other. DTDs are not as powerful as
a full schema architecture for XML: they do not support the typing, subclassing, or
instantiation mechanisms that a schema architecture must have.
DTDs are a simple way to make sure that two or more XML documents are of the same “type”. It is
a very limited approach to making “typed” XML documents shareable across systems. The
W3C XML Schema language was designed to address these limits by adding data typing and
type derivation.
XML enables interoperability
The advantages of XML outlined so far all make interoperability possible. This is one of the
most important requirements for XML: to enable disparate systems to share
information easily.
By taking the lowest common denominator approach, by being web enabled, protocol
independent, network independent, platform independent and extensible, XML makes it
possible for new systems and old systems (that are all different) to communicate with each
other. Encoding information in plain text with tags is better than using proprietary and platform
dependent binary formats.
Vision
XML provides solutions for problems that have existed for the past 20 years. With most
applications and software services using the Internet as a target platform for deployment, XML
could not have come at a better time. With the web becoming so popular, a new paradigm of
computing has emerged for which XML supplies one of the most important pieces: platform,
vendor and application neutral data. Regardless of the programming language used to process
XML, it will enable this new networked computing world.
Java is also a key component of this new paradigm. On the server side, by working with XML, it
can more naturally integrate legacy systems and services. With XML, Java can do what it does
best, work very well on the server side, and web (and Internet) enable software systems.
The advantages of XML over HTML
1) HTML is limited to a finite set of tags, whereas XML lets you define customized tags.
2) Parsing an XML document is easier than parsing an HTML document.
3) XML can be used for pure XML pages as well as for embedding new kinds of
content in HTML, such as:
a) MathML ‐ the Mathematical Markup Language is an XML application for
including mathematical equations in Web pages.
b) SMIL ‐ the Synchronized Multimedia Integration Language is an XML
application for including timed multimedia, such as slide shows and subtitled
videos, on Web pages.
HTML on its own has no such feature; it can be used only to create plain HTML
documents.
4) An XML document can have a DTD (Document Type Definition) that imposes
conditions for a valid document, whereas HTML documents are typically displayed without
any such validation being enforced.
Advantages of XML over HTML
• By defining your own markup language, you can encode documents more precisely
• Reflects the structure and semantics of documents, which gives better searching and navigation
• Tagging/content is separate from display
• Allows a single document to be used in many ways
Advantages of XML
Using XML to exchange information offers many benefits, including the following:
• Uses human, not computer, language. XML is readable (and understandable, even by
novices) and no more difficult to code than HTML.
• Completely compatible with Java and 100% portable. Any application that can process
XML (on any platform) can use your information.
• Extendable. Create your own tags (or use tags created by others) that use the native
language of your domain, have the attributes you need, and make sense to you and
your users.
The following example illustrates, in a simplified way, the readability and extensibility of XML:
HTML example
<HTML>
<H1 ID="MN">State</H1>
<H2 ID="12">City</H2>
<DL>
<DT>Name</DT>
<DD>Johnson</DD>
<DT>Population</DT>
<DD>5000</DD>
</DL>
<H2 ID="15">City</H2>
<DL>
<DT>Name</DT>
<DD>Pineville</DD>
<DT>Population</DT>
<DD>60000</DD>
</DL>
<H2 ID="20">City</H2>
<DL>
<DT>Name</DT>
<DD>Lake Bell</DD>
<DT>Population</DT>
<DD>20</DD>
</DL>
</HTML>
XML example
<?xml version="1.0" standalone="yes"?>
<STATE STATEID="MN">
<CITY CITYID="12">
<NAME>Johnson</NAME>
<POPULATION>5000</POPULATION>
</CITY>
<CITY CITYID="15">
<NAME>Pineville</NAME>
<POPULATION>60000</POPULATION>
</CITY>
<CITY CITYID="20">
<NAME>Lake Bell</NAME>
<POPULATION>20</POPULATION>
</CITY>
</STATE>
HTML tag names reveal nothing about the meaning of their content. The example above uses
an HTML definition list, but the problems inherent in using HTML would also occur if the data were
contained in a table or some other kind of HTML tags. For example:
• Many of the HTML tags are abbreviations, so they are not as readable as common language.
• HTML tags represent data (in the above example, city names and populations) as items
to display, for example, as definitions in a list or cells in a table. This makes it difficult to
manipulate the data or exchange it between applications.
The XML tag names are readable and convey the meaning of the data. Each XML tag
immediately precedes the associated data, helping to make the information structure easily
discerned by both humans and computers. The data structure follows a noticeable and useful
pattern, making it easy to manipulate and exchange the data.
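To make the point about manipulation concrete, here is a minimal sketch that reads the state example with Python's standard xml.etree.ElementTree module. The function names are invented for the illustration, and the XML is a well‐formed copy of the example data (lowercase declaration, matching NAME tags).

```python
import xml.etree.ElementTree as ET

STATE_XML = """<?xml version="1.0" standalone="yes"?>
<STATE STATEID="MN">
  <CITY CITYID="12"><NAME>Johnson</NAME><POPULATION>5000</POPULATION></CITY>
  <CITY CITYID="15"><NAME>Pineville</NAME><POPULATION>60000</POPULATION></CITY>
  <CITY CITYID="20"><NAME>Lake Bell</NAME><POPULATION>20</POPULATION></CITY>
</STATE>"""


def city_populations(xml_text):
    """Map each city name to its population, using the tag names as keys."""
    state = ET.fromstring(xml_text)
    return {
        city.findtext("NAME"): int(city.findtext("POPULATION"))
        for city in state.findall("CITY")
    }


def total_population(xml_text):
    return sum(city_populations(xml_text).values())


print(city_populations(STATE_XML))
print(total_population(STATE_XML))
```

Doing the same with the HTML version would mean guessing which DD element follows which DT element, because nothing in the markup says what a value means.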
Uses of XML
XML has a variety of uses, including:
• Web publishing: XML allows you to create interactive pages, allows the customer to
customize those pages, and makes creating e‐commerce applications more intuitive.
With XML, you store the data once and then render that content for different viewers or
devices based on style sheet processing using an XSL/XSLT processor.
• Web searching and automating Web tasks: XML defines the type of information
contained in a document, making it easier to return useful results when searching the
Web:
o For example, using HTML to search for books authored by Tom Wolf is likely to
return instances of the term 'wolf' outside of the context of author. Using XML
restricts the search to the proper context (say, the information contained in the
<author> tag) and returns only the desired type of information. Using XML, Web
agents and robots (programs that automate Web searches or other tasks) will be
more efficient and produce more useful results.
• General applications: XML provides a standard method to access information, making it
easier for applications and devices of all kinds to use, store, transmit, and display data.
• e‐business applications: XML implementations make electronic data interchange (EDI)
more accessible for information interchange, business‐to‐business transactions, and
business‐to‐consumer transactions.
• Metadata applications: XML makes it easier to express metadata (Unified Modeling
Language design models or user interface properties, for example) in a portable,
reusable format.
• Pervasive computing: XML provides portable and structured information types for
display on pervasive (wireless) computing devices such as PDAs, cellular phones, and
others.
o For example, WML (Wireless Markup Language) and VoiceXML are currently
evolving standards for describing visual and speech‐driven wireless device
interfaces.
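The Web searching case above can be illustrated with a small sketch. The catalog, its element names and both book titles are hypothetical, made up only to show the difference between searching all text and searching within the author element.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog: one title merely contains "Wolf", one author is Tom Wolf.
CATALOG = """<catalog>
  <book><title>Wolf Tracks</title><author>Jane Doe</author></book>
  <book><title>Collected Essays</title><author>Tom Wolf</author></book>
</catalog>"""


def plain_text_search(xml_text, term):
    """Search every piece of text in a book, as a context-free search would."""
    root = ET.fromstring(xml_text)
    hits = []
    for book in root.findall("book"):
        if term.lower() in "".join(book.itertext()).lower():
            hits.append(book.findtext("title"))
    return hits


def author_search(xml_text, term):
    """Restrict the search to the content of the author element."""
    root = ET.fromstring(xml_text)
    return [
        book.findtext("title")
        for book in root.findall("book")
        if term.lower() in book.findtext("author", default="").lower()
    ]


print(plain_text_search(CATALOG, "wolf"))
print(author_search(CATALOG, "wolf"))
```

The context‐free search returns both books, while the search restricted to the author element returns only the one actually written by Tom Wolf.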
Disadvantages of XML
• More difficult, demanding, and precise than HTML
• Lack of browser support/end user applications
• Still experimental/not solidified
Disadvantages of XML
However awesome XML is, there are some drawbacks that have hindered it from
gaining widespread use since its inception. Let's look at the biggest drawback: the
lack of adequate processing applications.
For one, XML requires a processing application. That is, the nice thing about HTML
was that you knew that if you wrote an HTML document, anyone, anywhere in the
world, could read your document using Netscape. Well, with XML documents, that is
not yet the case. There are no XML browsers on the market yet (although the latest
version of IE does a pretty good job of incorporating XSL and XML documents
provided HTML is the output).
"While it's true that browser support is limited, IE
5 and Netscape 5 are expected to fully support
XML. Also, W3C's Amaya browser supports it
today, as does the JUMBO browser that was
created for the Chemical Markup Language.
XML isn't about display ‐‐ it's about structure. This
has implications that make the browser question
secondary. So the whole issue of what is to be
displayed and by what means is intentionally left
to other applications. You can target the same
XML (with different XSL) for different devices
(standard web browser, palm pilot, printer, etc.).
You should not get the impression that XML is
useless until browsers support it. This is definitely
not true ‐‐ we are using it at NASA in ways where
no browser plays any role." ‐ Ken Sall
Thus, XML documents must either be converted into HTML before distribution or
be converted to HTML on‐the‐fly by middleware. Barring translation, developers must
code their own processing applications.
The most common tactic used now is to write parsing routines in DHTML, Java, or
server‐side Perl to parse through an XML document, apply the formatting rules
specified by the style sheet, and "convert" it all to HTML.
However, this takes some magic, and the amount of work necessary even to print
"hello world" is sometimes enough to dissuade developers from adopting the
technology.
Nevertheless, parsing algorithms and tools continue to improve over time as more
and more people see the long‐term benefits of migrating their data to XML. The
backend part of XML will continue to become simpler and simpler. Already Internet
Explorer and Netscape provide a decent amount of built in XML parsing tools.
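The conversion tactic described above can be sketched as follows. This is not an XSL processor: as a simplifying assumption, a plain Python dictionary (RULES) stands in for the style sheet, mapping each XML element name to an HTML element name. The names RULES and to_html are invented for the illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

# Stand-in for a style sheet: XML element name -> HTML element name (assumed rules).
RULES = {"note": "div", "heading": "h1", "to": "p", "from": "p", "body": "p"}


def to_html(element, rules=RULES):
    """Recursively rewrite an XML element tree as an HTML string."""
    tag = rules.get(element.tag, "span")  # unknown elements fall back to <span>
    parts = [f"<{tag}>", escape(element.text or "", quote=False).strip()]
    for child in element:
        parts.append(to_html(child, rules))
        parts.append(escape(child.tail or "", quote=False).strip())
    parts.append(f"</{tag}>")
    return "".join(parts)


NOTE = "<note><heading>Reminder</heading><body>Don't forget me this weekend!</body></note>"
print(to_html(ET.fromstring(NOTE)))
```

Even this toy version shows why the approach is laborious: every formatting decision has to be written out by hand unless a real style sheet processor is available.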
EDI –ELECTRONIC DATA INTERCHANGE
EDI is short for Electronic Data Interchange: the transfer of data between different companies using
networks, such as VANs or the Internet. As more and more companies get connected to the
Internet, EDI is becoming increasingly important as an easy mechanism for companies to buy,
sell, and trade information. ANSI has approved a set of EDI standards known as the X12
standards.
Advantages of XML over EDI
• Explicit structure
• Easier validation
• Can easily use the Internet
• Cheaper to implement
• Can open up electronic commerce to small and medium‐size businesses (social agenda again)
Electronic Data Interchange (EDI) refers to the structured transmission of data between
organizations by electronic means. It is used to transfer electronic documents from one
computer system to another, i.e. from one trading partner to another trading partner. It is
more than mere E‐mail; for instance, organizations might replace bills of lading and even
Cheques with appropriate EDI messages. It also refers specifically to a family of standards,
including the X12 series. However, EDI also exhibits its pre‐Internet roots, and the standards
tend to focus on ASCII (American Standard Code for Information Interchange)‐formatted single
messages rather than the whole sequence of conditions and exchanges that make up an inter‐
organization business process.
Electronic data interchange (EDI) is the electronic movement of data between or within
organizations in a structured, computer‐retrievable data format that permits information to be
transferred from a computer program in one location to a computer program in another
location without rekeying. EDI includes the direct transmission of data between locations;
transmission using an intermediary such as a communication network; and the exchange of
computer tapes, disks, or other digital storage devices. In many cases, content‐related error
checking and some degree of processing of the information are also involved. EDI differs from
electronic mail in that an actual transaction is transmitted electronically, rather than a simple
message consisting primarily of text.
EDI is used for electronic funds transfer (EFT) between financial institutions, which facilitates
such common transactions as the direct deposit of payroll checks by employers, the direct debit
of consumer accounts to make mortgage or utility payments, and the electronic payment of
federal taxes by businesses. Another common application of EDI involves the direct exchange of
standard business transaction documents—such as purchase orders, invoices, and bills of
lading—from one business to another via computer. EDI is also used by retail businesses as part
of their electronic scanning and point‐of‐sale (POS) inventory replenishment systems. Overall,
EDI offers a number of benefits to businesses and—thanks to the rapid evolution of the related
technology—is becoming more readily available to small businesses all the time.
Benefits of EDI
"EDI saves money and time because transactions can be transmitted from one information
system to another through a telecommunications network, eliminating the printing and
handling of paper at one end and the inputting of data at the other," Kenneth C. Laudon and
Jane Price Laudon wrote in their book Management Information Systems: A Contemporary
Perspective. "EDI may also provide strategic benefits by helping a firm 'lock in' customers,
making it easier for customers or distributors to order from them rather than from
competitors." EDI was developed to solve the problems inherent in paper‐based transaction
processing and in other forms of electronic communication. In solving these problems, EDI is a
tool that enables organizations to reengineer information flows and business processes. It
directly addresses several problems long associated with paper‐based transaction systems:
• Time delays—Paper documents may take days to transport from one location to
another, while manual processing methodologies necessitate steps like keying and filing
that are rendered unnecessary through EDI.
• Labor costs—In non‐EDI systems, manual processing is required for data keying,
document storage and retrieval, sorting, matching, reconciling, envelope stuffing,
stamping, signing, etc. While automated equipment can help with some of these
processes, most managers will agree that labor costs for document processing represent
a significant proportion of their overhead. In general, labor‐based processes are much
more expensive in the long term than EDI alternatives.
• Accuracy—EDI systems are more accurate than their manual processing counterparts
because there are fewer points at which errors can be introduced into the system.
• Information Access—EDI systems permit myriad users access to a vast amount of
detailed transaction data in a timely fashion. In a non‐EDI environment, in which
information is held in offices and file cabinets, such dissemination of information is
possible only with great effort, and it cannot hope to match an EDI system's timeliness.
Because EDI data is already in computer‐retrievable form, it is subject to automated
processing and analysis. It also requires far less storage space.
Infrastructure for EDI
Several elements of infrastructure must exist in order to introduce an EDI system, including: 1)
format standards to facilitate automated processing by all users, 2) translation software to
translate from a user's proprietary format for internal data storage into the generic external
format and back again, 3) value‐added networks to solve the technical problems of sending
information between computers, 4) inexpensive microcomputers to bring all potential users—
even small ones—into the market, and 5) procedures for complying with legal rules. It has only
been in the past several years that all of these ingredients have fallen into place.
FORMAT STANDARDS. To permit the efficient use of computers, information must be highly
organized into a consistent data format. A format defines how information in a message is
organized: what data goes where, what data is mandatory, what is optional, how many
characters are permitted for each data field, how data fields are ordered, and what codes or
abbreviations are permitted.
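As a rough illustration of what such a format definition pins down, the sketch below describes a hypothetical purchase order format and checks a message against it. The field names, lengths and codes are invented for the example and are not taken from X12 or any other real standard.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FieldSpec:
    name: str
    mandatory: bool
    max_length: int
    allowed_codes: Optional[frozenset] = None


# Hypothetical format: the tuple order is the required field order.
PURCHASE_ORDER_FORMAT = (
    FieldSpec("order_id", mandatory=True, max_length=10),
    FieldSpec("currency", mandatory=True, max_length=3,
              allowed_codes=frozenset({"USD", "EUR"})),
    FieldSpec("note", mandatory=False, max_length=40),
)


def validate(values, fmt=PURCHASE_ORDER_FORMAT):
    """Return a list of problems found in a positional list of field values."""
    errors = []
    if len(values) > len(fmt):
        errors.append("too many fields")
    for position, spec in enumerate(fmt):
        value = values[position] if position < len(values) else ""
        if not value:
            if spec.mandatory:
                errors.append(f"{spec.name}: mandatory field is missing")
            continue
        if len(value) > spec.max_length:
            errors.append(f"{spec.name}: longer than {spec.max_length} characters")
        if spec.allowed_codes is not None and value not in spec.allowed_codes:
            errors.append(f"{spec.name}: code {value!r} is not permitted")
    return errors


print(validate(["PO1", "USD", ""]))
print(validate(["", "GBP"]))
```

Because both trading partners agree on the same definition, the receiver can reject a malformed message automatically instead of relying on a person to notice the problem.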
Early EDI efforts in the 1960s used proprietary formats developed by one firm for exclusive use
by its trading partners. This worked well until a firm wanted to exchange EDI documents with
other firms who wanted to use their own formats. Since the different formats were not
compatible, data exchange was difficult if not impossible. To facilitate the widespread use of
EDI, standard formats were developed so that an electronic message sent by one party could be
understood by any receiver that subscribes to that format standard. In the United States the
Transportation Data Coordinating Committee began in 1968 to design format standards for
transportation documents. The first document was approved in 1975. This group pioneered the
ideas that are used by all standards organizations today.
North American standards are currently developed and maintained by a volunteer organization
called ANSI (American National Standards Institute). The format for a document defined by
ANSI is broad enough to satisfy the needs of many different industries. Electronic documents
are typically of variable length and most of the information is optional. When a firm sends a
standard EDI purchase order to another firm, it is possible for the receiving firm to pass the
purchase order data through an EDI translation program directly to a business application
without manual intervention. In the late 1990s, international format standards were
established and introduced as well to facilitate international business activity.
TRANSLATION SOFTWARE. Translation software makes EDI work by translating data from the
sending firm's internal format into a generic EDI format. Translation software also receives a
sender's EDI message and translates it from the generic standard into the receiver's internal
format. There are currently translation software packages for almost all types of computers and
operating systems.
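The two directions of translation can be sketched as below. The generic format here (three fields joined by an asterisk) is a deliberately simplified assumption for illustration, not a real EDI syntax, and all field names are hypothetical.

```python
# Assumed generic format: fixed field order, "*" between data elements.
FIELD_ORDER = ("order_id", "currency", "amount")
SEPARATOR = "*"


def to_generic(internal_record, field_map):
    """Sender side: internal record -> generic message.

    field_map maps each generic field name to the sender's internal name.
    Values containing the separator are not handled in this sketch.
    """
    return SEPARATOR.join(
        str(internal_record.get(field_map[name], "")) for name in FIELD_ORDER
    )


def from_generic(message, field_map):
    """Receiver side: generic message -> record in the receiver's own names."""
    values = message.split(SEPARATOR)
    return {field_map[name]: value for name, value in zip(FIELD_ORDER, values)}


sender_map = {"order_id": "po_number", "currency": "cur", "amount": "total"}
receiver_map = {"order_id": "OrderNo", "currency": "Cur", "amount": "Total"}

message = to_generic({"po_number": "PO1", "cur": "USD", "total": 250}, sender_map)
print(message)
print(from_generic(message, receiver_map))
```

Each firm only maintains the mapping between its own internal format and the generic one, so neither needs to know how the other stores its data.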
VALUE‐ADDED NETWORKS (VANS). When firms first began using EDI, most communications of
EDI documents were directly between trading partners. Unfortunately, direct computer‐to‐
computer communications requires that both firms 1) use similar communication protocols, 2)
have the same transmission speed, 3) have phone lines available at the same time, and 4) have
compatible computer hardware. If these conditions are not met, then communication becomes
difficult if not impossible. A value‐added network (VAN) can solve these problems by providing
an electronic mailbox service. By using a VAN, an EDI sender need only learn to send and
receive messages to or from one party: the VAN. Since a VAN provides a very flexible computer
interface, it can talk to virtually any type of computer. This means that to conduct EDI with
hundreds of trading partners, an organization only has to talk to one party. In addition, VANs
provide important security elements for dissemination of information between parties.
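The electronic mailbox idea can be pictured with a toy store‐and‐forward class. This is only a sketch of the concept; the class and method names are invented, and a real VAN also handles protocol conversion and security, which are left out here.

```python
from collections import defaultdict


class ValueAddedNetwork:
    """Toy store-and-forward mailbox service (illustrative only)."""

    def __init__(self):
        self._mailboxes = defaultdict(list)

    def send(self, sender, recipient, document):
        # The sender talks only to the VAN, never directly to the recipient.
        self._mailboxes[recipient].append((sender, document))

    def collect(self, recipient):
        """Return and clear everything waiting for the recipient."""
        return self._mailboxes.pop(recipient, [])


van = ValueAddedNetwork()
van.send("buyer", "seller", "purchase order PO1")
van.send("carrier", "seller", "shipping notice SN7")
print(van.collect("seller"))
```

The sender and receiver never need to be online at the same time, which removes the requirement for matching line availability described above.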
INEXPENSIVE COMPUTERS. The fourth building block of EDI is inexpensive computers that
permit even small firms to implement EDI. Since microcomputers are now so prevalent, it is
possible for firms of all sizes to deal with each other using EDI.
PROCEDURES FOR COMPLYING WITH LEGAL RULES. Legal rules apply to the documents that
accompany a wide variety of business transactions. For example, some contracts must include a
signature or must be an original in order to be legal. If documents are to be transmitted via EDI,
companies must establish procedures to verify that messages are authentic and that they
comply with the agreed‐upon protocol. In addition, EDI requires companies to institute error‐
checking procedures as well as security measures to prevent unauthorized use of their
computer systems. Still, it is important to note that some sorts of business documents—such as
warranties or limitations of liability—are difficult to transmit legally using EDI.
EDI is considered to be a technical representation of a business conversation between two
entities, either internal or external. Note that there is a perception that "EDI" constitutes the entire
electronic data interchange paradigm, including the transmission, message flow, document
format, and software used to interpret the documents. Strictly, EDI is considered to describe only the
rigorously standardized format of electronic documents. EDI is very useful in supply chains.
The EDI standards were designed to be independent of communication and software
technologies. EDI can be transmitted using any methodology agreed to by the sender and
recipient. This includes a variety of technologies, including modem (asynchronous, and
bisynchronous), FTP, E‐mail, HTTP, AS1, AS2, etc. It is important to differentiate between the
EDI documents and the methods for transmitting them. When they compared the
bisynchronous protocol 2400 bit/s modems, CLEO devices, and value‐added networks used to
transmit EDI documents to transmitting via the Internet, some people equated the non‐Internet
technologies with EDI and predicted erroneously that EDI itself would be replaced along with
the non‐Internet technologies. These non‐internet transmission methods are being replaced by
Internet Protocols such as FTP, telnet, and E‐mail, but the EDI documents themselves still
remain.
There are four major sets of EDI standards:
• The UN‐recommended UN/EDIFACT is the only international standard and is
predominant outside of North America.
• The US standard ANSI ASC X12 (X12) is predominant in North America.
• The TRADACOMS standard developed by the ANA (Article Numbering Association) is
predominant in the UK retail industry.
• The ODETTE standard is used within the European automotive industry.
• 1. Name of Standard. Electronic Data Interchange (EDI) (FIPS PUB 161‐2).
• 2. Category of Standard. Electronic Data Interchange.
• 3. Explanation.
• 3.1. Definition and Use of EDI. EDI is the computer‐to‐computer interchange of strictly
formatted messages that represent documents other than monetary instruments. EDI
implies a sequence of messages between two parties, either of whom may serve as
originator or recipient. The formatted data representing the documents may be
transmitted from originator to recipient via telecommunications or physically
transported on electronic storage media.
• In EDI, the usual processing of received messages is by computer only. Human
intervention in the processing of a received message is typically intended only for error
conditions, for quality review, and for special situations. For example, the transmission
of binary or textual data is not EDI as defined here unless the data are treated as one or
more data elements of an EDI message and are not normally intended for human
interpretation as part of on‐line data processing.
• An example of EDI is a set of interchanges between a buyer and a seller. Messages from
buyer to seller could include, for example, request for quotation (RFQ), purchase order,
receiving advice and payment advice; messages from seller to buyer could include, similarly,
bid in response to RFQ, purchase order acknowledgment, shipping notice and
invoice. These messages may simply provide information, e.g., receiving advice or
shipping notice, or they may include data that may be interpreted as a legally binding
obligation, e.g., bid in response to RFQ or purchase order.
• EDI is being used also for an increasingly diverse set of concerns, for example, for
interchanges between healthcare providers and insurers, for travel and hotel bookings,
for education administration, and for government regulatory, statistical and tax
reporting.
• 3.2. Standards Required for EDI. From the point of view of the standards needed, EDI
may be defined as an interchange between computers of a sequence of standardized
messages taken from a predetermined set of message types. Each message is
composed, according to a standardized syntax, of a sequence of standardized data
elements. It is the standardization of message formats using a standard syntax, and the
standardization of data elements within the messages, that makes possible the
assembling, disassembling, and processing of the messages by computer.
• Implementation of EDI requires the use of a family of interrelated standards. Standards
are required for, at minimum: (a) the syntax used to compose the messages and
separate the various parts of a message, (b) types and definitions of application data
elements, most of variable length, (c) the message types, defined by the identification
and sequence of data elements forming each message, and (d) the definitions and
sequence of control data elements in message headers and trailers.
• Additional standards may define: (e) a set of short sequences of data elements called
data segments, (f) the manner in which more than one message may be included in a
single transmission, and (g) the manner of adding protective measures for integrity,
confidentiality, and authentication into transmitted messages.
• 3.3. Limited Coverage of this Standard. This FIPS covers only EDI. It does not cover
other forms of electronic interchange, for example, systems of interchange that do not
consist of messages taken from a predetermined set. Additionally, an interchange
application including only one or two predetermined message types using only fixed‐
length data elements is excluded from coverage of this FIPS. This FIPS also is not
intended to cover transmissions from medical, laboratory, or environment‐sensing
instrumentation.
• 3.4. The Long‐Range Goal for EDI Standards. There are several different EDI standards
in use today, but the achievement of a single universally‐used family of EDI standards is
a long‐range goal. A single universally‐used family of standards would make use of EDI
more efficient and minimize aggregate costs of use. Specifically, it would (a) minimize
needs for training of personnel in use and maintenance of EDI standards, (b) eliminate
duplication of functionality and the costs of achieving that duplication now existing in
different systems of standards, (c) minimize requirements for different kinds of
translation software, and (d) allow for a universal set of data elements that would ease
the flow of data among different but interconnected applications, and thereby maximize
useful information interchange.
• This FIPS PUB recognizes the reality that some families of EDI standards were developed
to provide solutions to immediate needs, and that inclusion of the goal of universality in
their development would have unacceptably delayed their availability. However, a
future is envisioned in which the benefits of universality outweigh the sunk costs in
specialized solutions, leading first to cooperation among standards developers, then to
harmonization of standards, and eventually to a single universally accepted family of EDI
standards.
• 3.5. Adoption of Specific Families of Standards. This FIPS PUB adopts, with specific
conditions specified below, the families of EDI standards known as X12, UN/EDIFACT
and HL7. This FIPS PUB does not mandate the implementation of EDI systems within the
Federal Government; rather it requires the use of the identified families of standards
with specified constraints when Federal departments or agencies implement EDI
systems.
DATABASE IN XML
Similarities with Database
• Storage: Tables vs. XML documents
• Schemas: Database schema vs. DTD, XML schema languages
• Query languages: SQL vs. XQL, XML‐QL, QUILT
• Programming interface: JDBC vs. SAX, DOM
Differences from Database
• Storage mechanism
• Indexing
• Built‐in security
• Transaction support
• Data integrity
• Multi‐user access
• Query across multiple data sources
XML Database Types
XML enabled database
o XML is used for input and output
o Relational tables internally
o Middleware converts between XML and relational database tables
Pros
o Proven database technology
Cons
o Conversion between XML and relational tables is needed
  - Conversion performance overhead
  - Complex XML hierarchy information is hard to convert to tables
  - Round‐tripping to exactly the same document is hard
o Might not handle Unicode well enough
Native XML database
o XML data is stored in its native format
Pros
o Preserves the XML hierarchy information
Cons
o Not proven yet
o Scalability concern
• When to use
o To integrate information from many different platforms and formats and send it
to business partners or customers
Mapping XML Structure to DB Structure
• Template‐driven mapping
• Model‐driven mapping
o Table model
o Data‐specific object model
Template‐driven Mapping
• No predefined mapping
• Embed commands in a template
• Processed by middleware
• Very flexible
Model‐Driven Mapping
• Data model is imposed on the structure of the XML document
• Not as flexible as template‐driven mapping
o Typically used with XSLT to compensate
• Simple
• Two models
o Table model
o Data‐specific object model
Issues of Storing XML data into Relational Database
• Data types
• Binary data
• Character set
• Processing instruction
• Storing markup
• DTD generation from DB schema and vice versa
Data Types
• All data in an XML document is text
o Needs to be translated to the data types of the database
• Issues
o Translation is not always easy
Binary Data
! Two ways of storing binary data in XML document
"Unparsed entities
"Base64 encoding
! Issues
"No XML standard notation for Base64‐encoded data
"Application specific notation is needed
# Could be lost during translation
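A small sketch of the Base64 route may make the issue concrete. The `ms` element name echoes the later example in these notes, while the `encoding="base64"` attribute is purely an application-specific convention assumed here; a reader that does not know the convention would see only text.

```python
import base64
import xml.etree.ElementTree as ET

payload = bytes([0x00, 0xFF, 0x10, 0x80])  # arbitrary binary data

# Encode: the "encoding" attribute is an application-specific convention
# (an assumption of this sketch); XML 1.0 itself defines no such notation.
elem = ET.Element("ms", {"encoding": "base64"})
elem.text = base64.b64encode(payload).decode("ascii")
xml_text = ET.tostring(elem, encoding="unicode")
print(xml_text)

# Decode: the reader must know the convention to recover the bytes.
parsed = ET.fromstring(xml_text)
if parsed.get("encoding") == "base64":
    decoded = base64.b64decode(parsed.text)
else:
    decoded = parsed.text.encode("utf-8")
```

If a translation step drops the attribute, the element content is indistinguishable from ordinary text, which is the loss the list above warns about.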
Character Set
• Unicode is the native encoding scheme of an XML document
• Issues
  - Databases might not support Unicode
Storing Markup
• Storing markup information of an XML document could be useful
• Issues
  - Metadata describing markup data needs to be created and maintained
Advanced Features
• Indexing when storing XML documents
  - Allows faster search
An XML database is a data persistence software system that allows data to be stored in XML
format. This data can then be queried, exported and serialized into the desired format.
Two major classes of XML database exist:
1. XML‐enabled: these map all XML to a traditional database (such as a relational
database[1]), accepting XML as input and rendering XML as output. This term implies
that the database does the conversion itself (as opposed to relying on middleware).
2. Native XML (NXD): the internal model of such databases depends on XML and uses XML
documents as the fundamental unit of storage, which are, however, not necessarily
stored in the form of text files.
Rationale for XML in databases
O'Connell (2005, 9.2) gives one reason for the use of XML in databases: the increasingly
common use of XML for data transport, which has meant that "data is extracted from
databases and put into XML documents and vice‐versa". It may prove more efficient (in terms
of conversion costs) and easier to store the data in XML format.
Native XML databases
The term "native XML database" (NXD) can lead to confusion. Many NXDs do not function as
standalone databases at all, and do not really store the native (text) form.
The formal definition from the XML:DB initiative states that a native XML database:[2]
• Defines a (logical) model for an XML document — as opposed to the data in that
document — and stores and retrieves documents according to that model. At a
minimum, the model must include elements, attributes, PCDATA, and document order.
Examples of such models include the XPath data model, the XML Infoset, and the
models implied by the DOM and the events in SAX 1.0.
• Has an XML document as its fundamental unit of (logical) storage, just as a relational
database has a row in a table as its fundamental unit of (logical) storage.
• Need not have any particular underlying physical storage model. For example, NXDs can
use relational, hierarchical, or object‐oriented database structures, or use a proprietary
storage format (such as indexed, compressed files).
XML representation of a relational database
A relational database consists of a set of tables, where each table is a set of records. A record in
turn is a set of fields and each field is a pair field‐name/field‐value. All records in a particular
table have the same number of fields with the same field‐names.
This article describes an application of (a simple subset of) XML that can be used to represent
such a database.
The relational data‐model also defines certain constraints on the tables and defines operations
on them. We are not concerned with the constraints and operations here. In other words, we
are not trying to create a query language or a data‐definition language, just a language that
captures the data in a database or in a particular view of the database.
Several such languages are possible, of course, and it is not hard to come up with alternatives
that are just as valid as the one described below.
Introduction
The description of the database above suggests a simple nesting of fields inside records inside
tables inside databases. Here is an example of a single database with two tables:
<!DOCTYPE mydata SYSTEM "http://www.w3.org/mydata">
<mydata>
  <authors>
    <author>
      <name>Robert Roberts</name>
      <address>10 Tenth St, Decapolis</address>
      <editor>Ella Ellis</editor>
      <ms type="blob">ftp://docs/rr-10</ms>
      <born>1960/05/26</born>
    </author>
    <author>
      <name>Tom Thomas</name>
      <address>2 Second Av, Duo-Duo</address>
      <editor>Ella Ellis</editor>
      <ms type="blob">ftp://docs/tt-2</ms>
    </author>
    <author>
      <name>Mark Marks</name>
      <address>1 Premier, Maintown</address>
      <editor>Ella Ellis</editor>
      <ms type="blob">ftp://docs/mm-1</ms>
    </author>
  </authors>
  <editors>
    <editor>
      <name>Ella Ellis</name>
      <telephone>7356</telephone>
    </editor>
  </editors>
</mydata>
The format is verbose, since XML is verbose. On the other hand, it compresses well with
standard compression tools. It is also easy to print the database (or a part of it) with standard
XML browsers and a simple style sheet.
The database
A relational database can be modeled as a hierarchy of depth four: the database consists of a set of
tables, which in turn consist of records, which in turn consist of fields.
We can model the database with a document node and its associated element node:
<!DOCTYPE name SYSTEM "url">
<name>
table1
table2
...
tablen
</name>
The name is arbitrary. The url is optional, but can be used to point to information about the
database. We don't define what it points to. [Or should we?]
The order of the tables is also arbitrary, since a relational database defines no ordering on
them.
The table
Each table of the database is represented by an element node with the records as its children:
<name>
record1
record2
...
recordm
</name>
The name is the name of the table. The order of the records is arbitrary, since the relational
data model defines no ordering on them.
The record
A record is also represented by an element node, with its fields as children:
<name>
field1
field2
...
fieldm
</name>
The name is arbitrary, since the relational data model doesn't define a name for a record type.
However, in XML it cannot be omitted. One scheme is to re‐use the name of the table, or, if the
table has a name that is a plural, to use the singular form (`persons' ‐> `person', `parts' ‐>
`part').
The order of the fields is again immaterial.
The field
A field is represented as an element node with a data node as its only child:
<name type="t">
d
</name>
If d is omitted, it means the value of the field is the empty string.
The value of t indicates the type of the value (such as string, number, boolean, date). [Should
we give a complete list?] If the type attribute is omitted, the type is assumed to be `string.'
Null values
Null values are represented by the absence of the field.
Note that this is different from leaving the field empty, which indicates that the field contains a
string of length zero. Null values have special properties in relational databases. For example,
two fields both with null values are not equal (in contrast to two fields with zero‐length strings,
which are).
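The nesting described above (database, table, record, field) and the null convention can be sketched as follows. This is an illustration under stated assumptions: an in-memory SQLite database stands in for the relational source, `database_to_xml` is a hypothetical helper, and the type attribute is left out for brevity.

```python
import sqlite3
import xml.etree.ElementTree as ET


def database_to_xml(conn, db_name, tables):
    """Serialize tables as database > table > record > field elements.

    `tables` maps a table name to the element name used for its records.
    Table names are trusted here; they are interpolated into the SQL text.
    """
    root = ET.Element(db_name)
    for table_name, record_name in tables.items():
        table_el = ET.SubElement(root, table_name)
        cursor = conn.execute(f"SELECT * FROM {table_name}")
        field_names = [d[0] for d in cursor.description]
        for row in cursor:
            record_el = ET.SubElement(table_el, record_name)
            for field, value in zip(field_names, row):
                if value is None:
                    continue  # null value: the field is simply absent
                field_el = ET.SubElement(record_el, field)
                field_el.text = str(value)  # "" stays an empty element
    return root


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE authors (name TEXT, address TEXT, born TEXT)")
conn.executemany(
    "INSERT INTO authors VALUES (?, ?, ?)",
    [
        ("Robert Roberts", "10 Tenth St, Decapolis", "1960/05/26"),
        ("Tom Thomas", "", None),  # empty address, unknown birth date
    ],
)
mydata = database_to_xml(conn, "mydata", {"authors": "author"})
print(ET.tostring(mydata, encoding="unicode"))
```

In the output, the second record keeps an empty address element (a zero-length string) but has no born element at all (a null), matching the distinction drawn above.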
Strong typing
Tim Bray has written a proposal for adding strong typing to XML, using a set of fixed attributes.
The above example would get attributes declared as follows:
<!DOCTYPE mydata SYSTEM "http://www.w3.org/mydata">
<mydata>
  <authors>
    <?xml default name
      xml-sqltype="varchar"
      xml-sqlsize="40"
    ?>
    <?xml default address
      xml-sqltype="varchar"
      xml-sqlsize="40"
    ?>
    ...
    <?xml default born
      xml-sqltype="date"
      xml-sqlmin="1900/01/01"
      xml-sqlmax="1990/01/01"
    ?>
    ...
  </authors>
  <editors>
    <?xml default name
      xml-sqltype="varchar"
      xml-sqlsize="40"
    ?>
    ...
  </editors>
</mydata>
etc. This will allow an application that knows about these attributes to check the content of
each field.
Introduction: XML and Data
XML stands for eXtensible Markup Language. XML is a meta-markup language developed by the
World Wide Web Consortium (W3C) to deal with a number of the shortcomings of HTML. As
more and more functionality was added to HTML to account for the diverse needs of users of
the Web, the language grew increasingly complex and unwieldy. A way to create
domain-specific markup languages that did not carry all the cruft of HTML became
increasingly necessary, and XML was born.
The main difference between HTML and XML is that whereas in HTML the semantics and syntax
of tags are fixed, in XML the author of the document is free to create tags whose syntax and
semantics are specific to the target application. Also, the semantics of a tag are not tied down but
instead depend on the context of the application that processes the document. The other
significant difference between HTML and XML is that an XML document must be well-formed.
Although the original purpose of XML was as a way to mark up content, it became clear that
XML also provided a way to describe structured data thus making it important as a data storage
and interchange format. XML provides many advantages as a data format over others,
including:
1. Built-in support for internationalization, because it uses Unicode.
2. Platform independence (for instance, no need to worry about endianness).
3. A human-readable format, which makes it easier for developers to locate and fix errors than with
previous data storage formats.
4. Extensibility in a manner that allows developers to add extra information to a format
without breaking applications that were based on older versions of the format.
5. A large number of off-the-shelf tools for processing XML documents already exist.
The worlds of traditional data storage and XML have never been closer together. To better
understand how data storage and retrieval work in an XML world, this paper will first discuss
the past, present, and future of structuring XML documents. Then we will delve into the
languages that add the ability to query an XML document similar to a traditional data store. This
will be followed by an exploration of how the most popular RDBMSs have recognized the
importance of this new data storage format and have integrated XML into their latest releases.
Finally the rise of new data storage and retrieval systems specifically designed for handling XML
will be shown.
Structuring XML: DTDs and XML Schemas
Since XML is a way to describe structured data there should be a means to specify the structure
of an XML document. Document Type Definitions (DTDs) and XML Schemas are different
mechanisms that are used to specify valid elements that can occur in a document, the order in
which they can occur and constrain certain aspects of these elements. An XML document that
conforms to a DTD or schema is considered to be valid. Below is a listing of the different means of
constraining the contents of an XML document.
SAMPLE XML FRAGMENT
<gatech_student gtnum="gt000x">
<name>George Burdell</name>
<age>21</age>
</gatech_student>
1. Document Type Definitions (DTD): DTDs were the original means of specifying the
structure of an XML document and a holdover from XML's roots as a subset of the
Standard Generalized Markup Language (SGML). DTDs have a different syntax from
XML and are used to specify the order and occurrence of elements in an XML document.
Below is a DTD for the above XML fragment.
DTD FOR SAMPLE XML FRAGMENT
<!ELEMENT gatech_student (name, age)>
<!ATTLIST gatech_student gtnum CDATA #IMPLIED>
<!ELEMENT name (#PCDATA)>
<!ELEMENT age (#PCDATA)>
The DTD specifies that the gatech_student element has two child elements, name and
age, that contain character data, as well as an optional gtnum attribute that contains
character data.
2. XML Data Reduced (XDR): DTDs proved to be inadequate for the needs of users of XML
for a number of reasons. The main criticisms of DTDs were that they used a different
syntax than XML and that they had no support for datatypes. XDR, a proposal for XML
schemas, was submitted to the W3C by the Microsoft Corporation as a potential XML
schema standard but was eventually rejected. XDR tackled some of the problems of DTDs
by being XML based as well as supporting a number of datatypes analogous to those used
in relational database management systems and popular programming languages. Below is
an XML schema, using XDR, for the above XML fragment.
XDR FOR SAMPLE XML FRAGMENT
<Schema name="myschema" xmlns="urn:schemas-microsoft-com:xml-data"
        xmlns:dt="urn:schemas-microsoft-com:datatypes">
  <ElementType name="age" dt:type="ui1" />
  <ElementType name="name" dt:type="string" />
  <AttributeType name="gtnum" dt:type="string" />
  <ElementType name="gatech_student" order="seq">
    <element type="name" minOccurs="1" maxOccurs="1"/>
    <element type="age" minOccurs="1" maxOccurs="1"/>
    <attribute type="gtnum" />
  </ElementType>
</Schema>
The above schema specifies types for a name element that contains a string as its
content, an age element that contains an unsigned integer value of size one byte (i.e.
between 0 and 255), and a gtnum attribute that is a string value. It also specifies a
gatech_student element that has one occurrence each of a name and an age element in
sequence as well as a gtnum attribute.
3. XML Schema Definitions (XSD): The W3C XML Schema recommendation provides a
sophisticated means of describing the structure and constraints on the content model of
XML documents. W3C XML Schema supports more datatypes than XDR, allows for the
creation of custom data types, and supports object-oriented programming concepts like
inheritance and polymorphism. At the time of writing, XDR is used more widely than W3C XML
Schema, but this is primarily because the XML Schema recommendation is fairly new and
will thus take time to become accepted by the software industry.
XSD FOR SAMPLE XML FRAGMENT
<schema xmlns="http://www.w3.org/2001/XMLSchema">
  <element name="gatech_student">
    <complexType>
      <sequence>
        <element name="name" type="string"/>
        <element name="age" type="unsignedInt"/>
      </sequence>
      <attribute name="gtnum">
        <simpleType>
          <restriction base="string">
            <pattern value="gt\d{3}[A-Za-z]{1}"/>
          </restriction>
        </simpleType>
      </attribute>
    </complexType>
  </element>
</schema>
The above schema specifies a gatech_student complex type (meaning it can have
elements as children) that contains a name and an age element in sequence as well as a
gtnum attribute. The name element has to have a string as content, the age element
has an unsigned integer value, and the gtnum attribute has to match a regular
expression that matches the letters "gt" followed by 3 digits and a letter.
The above examples show that DTDs give the least control over how one can constrain and
structure data within an XML document while W3C XML schemas give the most.
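To see what the XSD constraints amount to, they can be mirrored by hand. The sketch below is not a schema validator (the Python standard library does not include one); `check_student` is a hypothetical helper that re-implements the three rules stated above: name followed by age, age as an unsigned integer, and gtnum matching the pattern.

```python
import re
import xml.etree.ElementTree as ET

GTNUM_PATTERN = re.compile(r"gt\d{3}[A-Za-z]")  # same pattern as in the XSD
UNSIGNED_INT_MAX = 4294967295  # upper bound of the XSD unsignedInt type


def check_student(xml_text):
    """Return a list of constraint violations (empty if none were found)."""
    errors = []
    student = ET.fromstring(xml_text)
    if student.tag != "gatech_student":
        errors.append("root element must be gatech_student")
    children = [child.tag for child in student]
    if children != ["name", "age"]:
        errors.append("expected name followed by age")
    age = student.findtext("age", default="").strip()
    if not (age.isdigit() and age.isascii() and int(age) <= UNSIGNED_INT_MAX):
        errors.append("age must be an unsigned integer")
    gtnum = student.get("gtnum")
    if gtnum is not None and not GTNUM_PATTERN.fullmatch(gtnum):
        errors.append("gtnum must be 'gt', three digits, and a letter")
    return errors


sample = (
    '<gatech_student gtnum="gt000x">'
    "<name>George Burdell</name><age>21</age>"
    "</gatech_student>"
)
print(check_student(sample))
```

A DTD could express only the first of these rules; the other two need the datatype and pattern facilities that XDR and XSD add.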
XML Querying: XPath and XQuery
It is sometimes necessary to extract subsets of the data stored within an XML document. A
number of languages have been created for querying XML documents including Lorel, Quilt,
UnQL, XDuce, XML‐QL, XPath, XQL, XQuery and YaTL. Since XPath is already a W3C
recommendation while XQuery is on its way to becoming one, the focus of this section will be
on both these languages. Both languages can be used to retrieve and manipulate data from an
XML document.
1. XML Path Language (XPath): XPath is a language for addressing parts of an XML
document that utilizes a syntax that resembles hierarchical paths used to address parts
of a filesystem or URL. XPath also supports the use of functions for interacting with the
selected data from the document. It provides functions for accessing information
about document nodes as well as for the manipulation of strings, numbers and
booleans. XPath is extensible with regards to functions which allows developers to add
functions that manipulate the data retrieved by an XPath query to the library of
functions available by default. XPath uses a compact, non‐XML syntax in order to
facilitate the use of XPath within URIs and XML attribute values (this is important for
other W3C recommendations like XML schema and XSLT that use XPath within
attributes).
XPath operates on the abstract, logical structure of an XML document, rather than its
surface syntax. XPath is designed to operate on a single XML document which it views as
a tree of nodes and the values returned by an XPath query are considered conceptually
to be nodes. The types of nodes that exist in the XPath data model of a document are
text nodes, element nodes, attribute nodes, root nodes, namespace nodes, processing
instruction nodes, and comment nodes.
Sample XPath Queries Against Sample XML Fragment
a. /gatech_student/name
Selects all name elements that are children of the root element gatech_student.
b. //age
Selects all age elements in the document.
c. /gatech_student/*
Selects all child elements of the root element gatech_student.
d. /gatech_student[@gtnum]
Selects the root gatech_student element if it has a gtnum attribute. (To select the
attribute itself, use /gatech_student/@gtnum.)
e. //*[name()='age']
Selects all elements that are named "age".
f. /gatech_student/age/ancestor::*
Selects all ancestors of all the age elements that are children of the
gatech_student element (which should select the gatech_student element).
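Several of these queries can be tried with Python's standard xml.etree.ElementTree module, which implements only a limited subset of XPath. In this sketch the paths are written relative to the root element, and the ancestor axis and the name() function are left out because that subset does not cover them; the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# The sample fragment from the notes; `root` is the gatech_student element.
root = ET.fromstring(
    '<gatech_student gtnum="gt000x">'
    "<name>George Burdell</name><age>21</age>"
    "</gatech_student>"
)

names = root.findall("name")        # like /gatech_student/name
ages = root.findall(".//age")       # like //age
children = root.findall("*")        # like /gatech_student/*
has_gtnum = "gtnum" in root.attrib  # the test made by /gatech_student[@gtnum]
gtnum_value = root.get("gtnum")     # the value of /gatech_student/@gtnum

print([e.text for e in names], [e.text for e in ages])
print([e.tag for e in children], has_gtnum, gtnum_value)
```

A full XPath engine would be needed for the remaining queries; the point here is only that a path expression returns a list of nodes from the document tree.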
2. XML Query Language (XQuery): XQuery is an attempt to provide a query language that
provides the same breadth of functionality and underlying formalism as SQL does for
relational databases. XQuery is a functional language where each query is an expression.
XQuery expressions fall into seven broad types: path expressions, element constructors,
FLWR expressions, expressions involving operators and functions, conditional
expressions, quantified expressions, and expressions that test or modify datatypes. The
syntax and semantics of the different kinds of XQuery expressions vary significantly,
which is a testament to the numerous influences on the design of XQuery.
XQuery has a sophisticated type system based on XML Schema datatypes and, unlike
XPath, supports the manipulation of document nodes. Also, the data model of XQuery is
designed to operate not only on a single XML document but also on a well-formed
fragment of a document, a sequence of documents, or a sequence of document
fragments.
The W3C is also working towards creating an alternate version of XQuery, called XQueryX,
that has the same semantics but uses an XML-based syntax.
Sample XQuery Queries and Expressions Taken From W3C Working Draft
a. path expressions: XQuery supports path expressions that are a superset of those
currently being proposed for the next version of XPath.
i. //emp[name="Fred"]/salary * 12
From a document that contains employees and their monthly salaries,
extract the annual salary of the employee named "Fred".
ii. document("zoo.xml")//chapter[2 TO 5]//figure
Find all the figures in chapters 2 through 5 of the document named
"zoo.xml."
b. element constructors: In some situations, it is necessary for a query to create or
generate elements. Such elements can be embedded directly into a query in an
expression called an element constructor.
i. <emp empid = {$id}>
     {$name}
     {$job}
   </emp>
Generate an <emp> element that has an "empid" attribute. The value of
the attribute and the content of the element are specified by variables
that are bound in other parts of the query.
c. FLWR expressions: A FLWR (pronounced "flower") expression is a query
construct composed of FOR, LET, WHERE, and RETURN clauses. A FOR clause is
an iteration construct that binds a variable to a sequence of values returned by a
query (typically a path expression). A LET clause similarly binds variables to
values, but instead of a series of bindings only one occurs, similar to an
assignment statement in a programming language. A WHERE clause contains one
or more predicates that are applied to the nodes returned by preceding LET or FOR
clauses. The RETURN clause generates the output of the FLWR expression, which
may be any sequence of nodes or primitive values. The RETURN clause is
executed once for each node returned by the FOR and LET clauses that passes
the WHERE clause. The results of these multiple executions are concatenated and
returned as the result of the expression.
i. FOR $b IN document("bib.xml")//book
   WHERE $b/publisher = "Morgan Kaufmann"
   AND $b/year = "1998"
   RETURN $b/title
List the titles of books published by Morgan Kaufmann in 1998.
ii. <big_publishers>
    {
      FOR $p IN distinct(document("bib.xml")//publisher)
      LET $b := document("bib.xml")//book[publisher = $p]
      WHERE count($b) > 100
      RETURN $p
    }
    </big_publishers>
List the publishers who have published more than 100 books.
d. conditional expressions: A conditional expression evaluates a test expression
and then returns one of two result expressions. If the value of the test
expression is true, the value of the first result expression is returned; otherwise,
the value of the second result expression is returned.
i. FOR $h IN //holding
   RETURN
     <holding>
       {$h/title,
        IF ($h/@type = "Journal")
        THEN $h/editor
        ELSE $h/author
       }
     </holding>
   SORTBY (title)
Make a list of holdings, ordered by title. For journals, include the editor,
and for all other holdings, include the author.
e. quantified expressions: XQuery has constructs that are equivalent to quantifiers
used in mathematics and logic. The SOME clause is an existential quantifier used
for testing whether a series of values contains at least one node that satisfies a
predicate. The EVERY clause is a universal quantifier used to test whether all
nodes in a series of values satisfy a predicate.
i. FOR $b IN //book
   WHERE SOME $p IN $b//para SATISFIES
     (contains($p, "sailing") AND contains($p, "windsurfing"))
   RETURN $b/title
Find titles of books in which both sailing and windsurfing are mentioned
in the same paragraph.
ii. FOR $b IN //book
    WHERE EVERY $p IN $b//para SATISFIES
      contains($p, "sailing")
    RETURN $b/title
Find titles of books in which sailing is mentioned in every paragraph.
f. expressions involving user defined functions: Besides providing a core library of
functions similar to those in XPath, XQuery also allows user defined functions to
be used to extend the core function library.
i. NAMESPACE xsd = "http://www.w3.org/2001/XMLSchema"

   DEFINE FUNCTION depth($e) RETURNS xsd:integer
   {
     # An empty element has depth 1
     # Otherwise, add 1 to max depth of children
     IF (empty($e/*)) THEN 1
     ELSE max(depth($e/*)) + 1
   }

   depth(document("partlist.xml"))
Find the maximum depth of the document named "partlist.xml."
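For readers more at home in a general-purpose language, the semantics of the FLWR, quantified, and user-defined-function examples can be imitated in Python. This is an analogy, not XQuery: the inline document is a made-up stand-in for "bib.xml", and `text_of` and `depth` are helper names chosen for this sketch.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the "bib.xml" document used in the examples.
bib = ET.fromstring("""
<bib>
  <book><title>Data on the Web</title><publisher>Morgan Kaufmann</publisher>
    <year>1998</year><para>sailing and windsurfing</para></book>
  <book><title>Boats</title><publisher>Example Press</publisher>
    <year>1998</year><para>sailing</para><para>rowing</para></book>
</bib>
""")


def text_of(element):
    """String value of an element: all descendant text concatenated."""
    return "".join(element.itertext())


# FOR $b IN //book WHERE ... AND ... RETURN $b/title
titles_1998 = [
    b.findtext("title")                              # RETURN
    for b in bib.iter("book")                        # FOR
    if b.findtext("publisher") == "Morgan Kaufmann"  # WHERE
    and b.findtext("year") == "1998"
]

# WHERE SOME $p IN $b//para SATISFIES (...) corresponds to any()
both_sports = [
    b.findtext("title")
    for b in bib.iter("book")
    if any("sailing" in text_of(p) and "windsurfing" in text_of(p)
           for p in b.iter("para"))
]

# WHERE EVERY $p IN $b//para SATISFIES (...) corresponds to all()
always_sailing = [
    b.findtext("title")
    for b in bib.iter("book")
    if all("sailing" in text_of(p) for p in b.iter("para"))
]


def depth(element):
    """An element with no child elements has depth 1; otherwise 1 + max child depth."""
    return 1 + max((depth(child) for child in element), default=0)


print(titles_1998, both_sports, always_sailing, depth(bib))
```

The list comprehension plays the role of the FLWR expression: the loop is the FOR clause, the condition is the WHERE clause, and the leading expression is the RETURN clause, with the per-item results concatenated into one list.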
XML and Databases
As was mentioned in the introduction, there is a dichotomy in how XML is used in
industry. On one hand there is the document‐centric model of XML where XML is
typically used as a means of creating semi-structured documents with irregular content
that are meant for human consumption. An example of document‐centric usage of XML
is XHTML which is the XML based successor to HTML.
SAMPLE XHTML DOCUMENT
<html xmlns ="http://www.w3.org/1999/xhtml">
<head>
<title>Sample Web Page</title>
</head>
<body>
<h1>My Sample Web Page</h1>
<p> All XHTML documents must be well‐formed and valid. </p>
<img src="http://www.example.com/sample.jpg" height ="50" width = "25"/>
<br />
<br />
</body>
</html>
The other primary usage of XML is in a data‐centric model. In a data‐centric model, XML
is used as a storage or interchange format for data that is structured, appears in a
regular order and is most likely to be machine processed instead of read by a human. In
a data‐centric model, the fact that the data is stored or transferred as XML is typically
incidental since it could be stored or transferred in a number of other formats which
may or may not be better suited for the task depending on the data and how it is used.
An example of a data-centric usage of XML is SOAP. SOAP is an XML-based protocol used
for exchanging information in a decentralized, distributed environment. SOAP
consists of three parts: an envelope that defines a framework for describing
what is in a message and how to process it, a set of encoding rules for expressing
instances of application-defined datatypes, and a convention for representing remote
procedure calls and responses.
SAMPLE SOAP MESSAGE TAKEN FROM W3C SOAP RECOMMENDATION
<SOAP-ENV:Envelope
    xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
    SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
  <SOAP-ENV:Body>
    <m:GetLastTradePrice xmlns:m="Some-URI">
      <symbol>DIS</symbol>
    </m:GetLastTradePrice>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>
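Because the message is meant for machines, a receiver would typically pull out the method name and its arguments programmatically. The following sketch does this with Python's standard library; it only illustrates namespace-aware parsing of the sample message and is not a SOAP implementation. The variable names are assumptions.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"

message = """
<SOAP-ENV:Envelope
    xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
    SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
  <SOAP-ENV:Body>
    <m:GetLastTradePrice xmlns:m="Some-URI">
      <symbol>DIS</symbol>
    </m:GetLastTradePrice>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>
"""

envelope = ET.fromstring(message)
# ElementTree spells qualified names as "{namespace-uri}localname".
body = envelope.find(f"{{{SOAP_ENV}}}Body")
call = body[0]                      # first child of Body is the RPC request
method = call.tag.split("}", 1)[1]  # strip the "{Some-URI}" namespace part
symbol = call.findtext("symbol")    # the symbol element has no namespace
print(method, symbol)
```

The prefix SOAP-ENV never appears in the parsing code; only the namespace URI matters, which is why a sender is free to choose a different prefix.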
In both models where XML is used, it is sometimes necessary to store the XML in some
sort of repository or database that allows for more sophisticated storage and retrieval of
the data especially if the XML is to be accessed by multiple users. Below is a description
of storage options based on what model of XML usage is required.
1. Data-centric model: In a data-centric model where data is stored in a relational
database or similar repository, one may want to extract data from a database as XML, store
XML into a database, or both. For situations where one only needs to extract XML from the
database, one may use a middleware application or component that retrieves data from the
database and returns it as XML. Middleware components that transform relational data to XML
and back vary widely in the functionality they provide and how they provide it.
2. Document-centric model: Content management systems are typically the
tool of choice when considering storing, updating and retrieving various XML
documents in a shared repository. A content management system typically
consists of a repository that stores a variety of XML documents, an editor and an
engine that provides one or more of the following features:
• version, revision and access control
• ability to reuse documents in different formats
• collaboration
• web publishing facilities
• support for a variety of text editors (e.g. Microsoft Word, Adobe
FrameMaker, etc.)
• indexing and search capabilities
Content management systems have been primarily of benefit for workflow
management in corporate environments where information sharing is vital and
as a way to manage the creation of web content in a modular fashion, allowing
web developers and content creators to perform their tasks with less
interdependence than exists in a traditional web authoring environment.
Examples of XML based content management systems are SyCOMAX, Content@,
Frontier, Entrepid, XDisect, and SiberSafe.
3. Hybrid model: In situations where both document-centric and data-centric
models of XML usage will occur, the best data storage choice is usually a
native XML database. What actually constitutes a native XML database has been
a topic of some debate in various fora, which has been compounded by the
blurred lines that many see between XML-enabled databases, XML query
engines, XML servers and native XML databases. The most coherent definition
so far is one that was reached by consensus amongst members of the XML:DB
mailing list, which defines a native XML database as a database that has an XML
document as its fundamental unit of (logical) storage and defines a (logical)
model for an XML document, as opposed to the data in that document, and
stores and retrieves documents according to that model. At a minimum, the
model must include elements, attributes, PCDATA, and document order.
Described below are two examples of native XML databases with the intent of
showing the breadth of functionality and variety that can be expected in the
native XML database arena.
Querying the XML documents within the system is done using XPath and the
documents can be indexed to improve query performance.
dbXML is written in Java but supports access from other languages by exposing a
CORBA API thus allowing interaction with any language that supports a CORBA
binding. It also ships with a Java implementation of the XML:DB XML Database
API which is designed to be a vendor neutral API for XML databases. A number of
command line tools for managing documents and collections are also provided.
dbXML is mostly still in development (version at time of writing was 1.0 beta 2)
and does not currently support transactions or the use of schemas but these
features are currently being developed for future versions.
XML has evolved into a viable alternative for representing data. As more applications use XML,
the big question becomes how to combine XML with relational databases. Let’s dive deeper
into the issues involved in combining XML with databases and look at how all that data can be
stored and queried.
XML database types
There are two categories to consider when deciding which type of XML database fits a
particular application:
• Data‐centric: Products that actually store the data or content in non‐XML format
• Document‐centric: Products that store complete XML documents in relational tables or
on disk in file structures
Data‐centric databases store data separate from the XML schema, usually just transforming the
original content into relational tables. These products are referred to as XML‐enabled
databases. If an XML document is needed, the data stored in relational tables can be queried
and an XML document created. Most major relational databases (Sybase, Oracle, and SQL
Server) fall into this category.
Document‐centric databases store the entire XML document in a relational, text, or proprietary
format. These are called native XML databases. A couple of popular native XML databases are
the Xindice (zeen‐dee‐chay) open source product from Apache and eXist, which is also open
source.
Querying, XML style
Support for XPath or XML queries is a primary feature in XML databases. The major relational
database vendors provide XPath support, while native XML databases provide support for
querying with XPath, usually via the XML:DB API. Finding developers who understand XPath,
much less database administrators, is a problem. However, for a simply structured database or
hierarchical database, or for XML documents, XPath is more efficient than SQL. Unfortunately,
the necessary string and date functions to manipulate the results don’t exist in SQL (String and
Date functions are used in the XSLT code). For more complete queries, XML Query is more like
SQL but is less supported.
For example, the SQL query below can’t be represented with XPath:
SELECT left(name,3) from employees
However, the following SQL:
SELECT * FROM employees WHERE left(name,3) = 'hoo'
can be queried with something like:
//employees/name[starts-with(last,'hoo')]
and with this XML Query:
for $t in document("employeeList.xml")//employee/name where starts-with($t/text(), "hoo")
return $t
Bear in mind that XML:DB and XPath are more efficient at querying XML documents, not
relational data structures.
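The same prefix search can be run side by side against a table and an XML document. The sketch below uses made-up sample names and Python's standard library. SQLite has no left() function, so substr(name, 1, 3) stands in for it, and ElementTree's XPath subset has no starts-with(), so that predicate is applied in Python after the path selects the name elements.

```python
import sqlite3
import xml.etree.ElementTree as ET

names = ["hoover", "hood", "smith"]  # made-up sample data

# Relational side: substr(name, 1, 3) plays the role of left(name, 3).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT)")
conn.executemany("INSERT INTO employees VALUES (?)", [(n,) for n in names])
sql_hits = [row[0] for row in conn.execute(
    "SELECT name FROM employees WHERE substr(name, 1, 3) = 'hoo' ORDER BY name")]

# XML side: select the name elements with a path, then filter in Python.
employees = ET.Element("employees")
for n in names:
    ET.SubElement(ET.SubElement(employees, "employee"), "name").text = n
xml_hits = sorted(e.text for e in employees.findall("./employee/name")
                  if e.text.startswith("hoo"))

print(sql_hits, xml_hits)
```

Both routes return the same two names; the difference lies in where the filtering happens and which data model the engine is optimized for.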
XML support in relational databases
XML‐enabled applications support creating information as XML, and reading XML is an
important feature. Large vendors like Microsoft, Oracle, and IBM (and more) have succeeded in
transforming relational data into XML and have XPath or XML Query implementations. Each
platform also offers tools to complement its database offerings. Programming is often required
to maintain XML content, and SDKs (Software Development Kits) are available.
Most systems (excluding XML databases) lack methods to directly import or read XML
documents. You can program SQL Server 2000 with stored procedures to import XML content
directly into one or more tables. DB2 and Oracle have similar functionality.
XML‐centric applications, such as BizTalk Server and XML Spy, do a much better job at reading
XML documents since they act as a bridge between the XML and database. However, these
programs require a serious commitment and substantial investment. One reason that these
applications import XML so easily is that they support XML Schemas and document type definitions
(DTDs). As more mature XML applications are developed, translating XML data (reading and
writing) based on a DTD or Schema will allow much more flexibility in how the data can be
used, because the DTD or Schema is easily mapped to relational tables or the needed data
model.
Document content and Web pages
Content delivered on Web sites is still basically stored in static HTML pages or relational
databases, even though this type of "informational" content is probably best suited for XML.
One of the more popular products that stores such content in XML is Cocoon from Apache.
Enhydra is another Java/XML‐based application server, and the eXist and Xindice database
products easily integrate with Cocoon.
For catalogs, documents, and other data, XML delivers on the promise of an efficient data store
and transport. More content is available with native XML databases or XML‐enabled relational
databases. Web sites and online content will benefit most from a native XML database. For
more information, check out the XML:DB Web site.
Is XML a database?
An XML document is a collection of data, and in that respect it is no different from any
other file that stores data. As a database format, XML is self-describing, portable, and able to
describe data in a tree or graph structure. In a loose sense, XML together with its surrounding
technologies is a sort of Database Management System (DBMS).
XML provides storage, schemas, query languages, programming interfaces and so on. It lacks
features that a real database offers, such as triggers, queries across multiple documents, and
multi‐user access. The main advantage of XML is that the data is portable and can contain
nested entries.
Storing documents in a native XML database lets you preserve physical document structure, use
document-level transactions, and execute queries in an XML query language.
Data is transferred between XML documents and a database by mapping the XML document
schema to the database schema. Mappings between document schemas and database
schemas are performed on attributes and text. Two mappings are generally used to
map an XML document schema to the database schema:
i) TABLE-BASED MAPPING
ii) OBJECT-RELATIONAL MAPPING
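A table-based mapping can be sketched as follows. This is a minimal illustration under stated assumptions, not a description of any particular product: the document layout (one row element per table row, one child element per column), the table name and the column names are all hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical document: one <row> per table row, one child element per column.
DOC = """
<employees>
  <row><id>1</id><name>Tove</name></row>
  <row><id>2</id><name>Jani</name></row>
</employees>
"""

root = ET.fromstring(DOC)
conn = sqlite3.connect(":memory:")

# The root element corresponds to the table, the child element names to columns.
conn.execute("CREATE TABLE employees (id INTEGER, name TEXT)")
for row in root.findall("row"):
    conn.execute(
        "INSERT INTO employees (id, name) VALUES (?, ?)",
        (int(row.findtext("id")), row.findtext("name")),
    )

stored = conn.execute("SELECT id, name FROM employees ORDER BY id").fetchall()
print(stored)
```

The mapping works only because the document is shaped like a table. Deeper nesting or mixed content is where an object-relational mapping or a native XML database becomes the more natural choice.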
Native XML databases are designed especially to store XML documents. Data held in XML
documents can always be stored in a native XML database, and this is the usual choice when the
data is semi‐structured. Although this kind of data can also be stored in object-oriented and
hierarchical databases, a native XML database is generally a better fit and can retrieve such data
much faster than a relational database. Another reason to store data in a native XML
database is to exploit XML-specific capabilities, such as executing XML queries.
Advantages of web services built on XML based standards
Web services built on XML-based standards have many benefits over web services
that are based on RPC. RPC mechanisms are platform dependent, but web services built using the
XML standards are platform and language independent. Because of this, you can use them for
communication between any types of application residing on any platform.
The invocation information is passed to the service provider in the form of an XML document and
hence it is platform independent. The protocol used for such a transfer is HTTP, which is
supported by all browsers.
Hence you can pass the information about the object to be executed through
the browser itself using HTTP. A major advantage of XML-based web services is that,
because they use HTTP, they can easily pass through firewalls.
Standard Description and Comments
XML 1.0 (4th Ed.) The Extensible Markup Language (XML) is a subset
of SGML that is completely described in this
document from the W3C. Its goal is to enable
generic SGML to be served, received, and processed
on the Web in the way that is now possible with
HTML. XML has been designed for ease of
implementation and for interoperability with both
SGML and HTML.
Namespaces in XML 1.0 (2nd Ed.) XML namespaces provide a simple method for
qualifying element and attribute names used in
Extensible Markup Language (XML) documents by
associating them with namespaces identified by URI
references.
XML Infoset 1.0 (2nd Ed.) XML Information Set (Infoset) provides a set of
definitions for use in other specifications that need
to refer to the information in an XML document.
XSLT 1.0 XSLT 1.0 is designed for use as part of XSL, which is
a stylesheet language for XML. In addition to XSLT,
XSL includes an XML vocabulary for specifying
formatting. XSL specifies the styling of an XML
document by using XSLT to describe how the
document is transformed into another XML
document that uses the formatting vocabulary.
The rarely used elements xsl:strip‐space and
xsl:preserve‐space are currently ignored.
XSLT 2.0 XSLT 2.0 is the long‐awaited upgrade to XSLT 1.0
and includes important new schema‐aware
functions, grouping, aggregation, node‐set, "for"
loops, and much more.
The rarely used elements xsl:strip‐space and
xsl:preserve‐space are currently ignored. Also, the
attribute input‐type‐annotations is not yet
supported.
XPath 1.0 XPath 1.0 is a language for addressing parts of an
XML document, designed to be used by both XSLT
and XPointer.
XPath 2.0 XPath 2.0 is a superset of [XPath 1.0], with the
added capability to support a richer set of data
types, and to take advantage of the type
information that becomes available when
documents are validated using XML Schema.
XQuery 1.0 An extension of the XPath 2.0 specification, XQuery
is a language for extracting information from XML
documents and databases.
XInclude 1.0 (2nd Ed.) XInclude specifies a processing model and syntax
for general purpose inclusion. Inclusion is
accomplished by merging a number of XML
information sets into a single composite infoset.
Specification of the XML documents (infosets) to be
merged and control over the merging process is
expressed in XML‐friendly syntax (elements,
attributes, URI references).
XPointer 1.0 XML Pointer Language (XPointer) is the language to
be used as the basis for a fragment identifier for any
URI reference that locates a resource whose
Internet media type is one of text/xml,
application/xml, text/xml‐external‐parsed‐entity, or
application/xml‐external‐parsed‐entity.
XML Schema 1.0 (2nd Ed.) XML Schema specifies the XML Schema definition
language, which offers facilities for describing the
structure and constraining the contents of XML 1.0
documents, including those which exploit the XML
Namespace facility. The schema language, which is
itself represented in XML 1.0 and uses namespaces,
substantially reconstructs and considerably extends
the capabilities found in XML 1.0 document type
definitions (DTDs).
SOAP 1.2 SOAP is a lightweight protocol for exchange of
information in a decentralized, distributed
environment. It is an XML based protocol that
consists of three parts: an envelope that defines a
framework for describing what is in a message and
how to process it, a set of encoding rules for
expressing instances of application‐defined
datatypes, and a convention for representing
remote procedure calls and responses. SOAP can
potentially be used in combination with a variety of
other protocols; however, the only bindings defined
in this document describe how to use SOAP in
combination with HTTP and HTTP Extension
Framework.
WSDL 1.1 WSDL is an XML format for describing network
services as a set of endpoints operating on
messages containing either document‐oriented or
procedure‐oriented information. The operations
and messages are described abstractly, and then
bound to a concrete network protocol and message
format to define an endpoint. Related concrete
endpoints are combined into abstract endpoints
(services). WSDL is extensible to allow description
of endpoints and their messages regardless of what
message formats or network protocols are used to
communicate, however, the only bindings described
in this document describe how to use WSDL in
conjunction with SOAP 1.1, HTTP GET/POST, and
MIME.
RDF Resource Description Framework (RDF) is a family
of W3C specifications originally designed as a
metadata data model, that has come to be used as
a general method of modeling information through
a variety of syntax formats.
OWL The OWL Web Ontology Language is designed for
use by applications that need to process the
content of information instead of just presenting
information to humans. OWL facilitates greater
machine interpretability of Web content than that
supported by XML, RDF, and RDF Schema (RDF‐S) by
providing additional vocabulary along with a formal
semantics. OWL has three increasingly‐expressive
sublanguages: OWL Lite, OWL DL, and OWL Full.
XML Catalogs In order to make optimal use of the information
about an XML external resource, there needs to be
some interoperable way to map the information in
an XML external identifier into a URI reference for
the desired resource.
This OASIS XML Catalog Standard defines an entity
catalog that handles two simple cases:
• Mapping an external entity's public
identifier and/or system identifier to a URI
reference.
• Mapping the URI reference of a resource (a
namespace name, stylesheet, image, etc.) to
another URI reference.
Unicode 4.1.0 The Unicode Standard, Version 4.1.0, defined by:
The Unicode Standard, Version 4.0 (Boston, MA,
Addison‐Wesley, 2003. ISBN 0‐321‐18578‐1), as
amended by Unicode 4.0.1 and Unicode 4.1.0.
UML 2.2 UML is a graphical language for organizing,
analyzing, and planning object‐oriented or
component‐based software projects. The UML 2.2
specification defines thirteen major different
diagram types and over one thousand graphical and
textual language elements, as well as additional
extension mechanisms.
XMI 2.1 XMI is a model driven XML Integration framework
for defining, interchanging, manipulating and
integrating XML data and objects. XMI‐based
standards are in use for integrating tools,
repositories, applications and data warehouses. XMI
provides rules by which a schema can be generated
for any valid XMI‐transmissible MOF‐based
metamodel.
BPMN 1.0 The Business Process Modeling Notation (BPMN) is
a graphical notation that depicts the steps in a
business process. BPMN depicts the end to end flow
of a business process. The notation has been
specifically designed to coordinate the sequence of
processes and the messages that flow between
different process participants in a related set of
activities.
CSS 2.1 CSS 2.1 is a style sheet language that allows authors
and users to attach style (e.g., fonts and spacing) to
structured documents (e.g., HTML documents and
XML applications). By separating the presentation
style of documents from the content of documents,
CSS 2.1 simplifies Web authoring and site
maintenance.
HTML 4.01 HTML 4 supports more multimedia options,
scripting languages, style sheets, better printing
facilities, and documents that are more accessible
to users with disabilities. HTML 4 also takes great
strides towards the internationalization of
documents, with the goal of making the Web truly
World Wide.
JavaScript JavaScript is a scripting language that is often used
for client‐side Web development to write functions
that are embedded or included from HTML pages
for dynamic presentation features such as pop‐up
windows, form validation, and mouse‐over effects.
JavaScript is a superset of the ECMA‐262 Edition 3
(ECMAScript) standard scripting language, with only
mild differences from the published standard.
EDIFACT D 1993A - D 2007B EDIFACT is a set of United Nations rules for
Electronic Data Interchange for Administration,
Commerce and Transport. They comprise a set of
internationally agreed standards, directories and
guidelines for the electronic interchange of
structured data, and in particular that related to
trade in goods and services between independent,
computerized information systems.
X12 3040 - 5030 ASC X12 brings together business and industry
professionals in a cross-industry forum to develop
and support electronic data exchange standards
and related documents for the national and
international marketplace to enhance business
processes, reduce costs and expand organizational
reach.
WebDAV WebDAV stands for "Web‐based Distributed
Authoring and Versioning". It is a set of extensions
to the HTTP protocol which allows users to
collaboratively edit and manage files on remote
web servers.
SQL ISO/IEC 9075 defines the SQL database language.
The scope of SQL is the definition of data structure
and the operations on data stored in that structure.
ISO/IEC 9075‐1, ‐2 and ‐11 encompass the minimum
requirements of the language. Other parts define
extensions.
Output formats:
Standard Description and Comments
RTF 1.9 The Rich Text Format (RTF) Specification provides a format for text and
graphics interchange that can be used with different output devices,
operating environments, and operating systems. Version 1.9.1 of the
specification contains the latest updates introduced by Microsoft Office
Word 2007.
PDF 1.7 PDF is now a formal open standard known as ISO 32000. Maintained by the
International Organization for Standardization, ISO 32000 will continue to be
developed with the objective of protecting the integrity and longevity of PDF,
providing an open standard for the more than one billion PDF files in
existence today.
XSL:FO XSL:FO is a markup language for XML document formatting which is most
often used to generate PDFs. XSL:FO is part of XSL, a set of W3C technologies
designed for the transformation and formatting of XML data. The other parts
of XSL are XSLT and XPath.
DTD –DOCUMENT TYPE DEFINITION
The purpose of a DTD (Document Type Definition) is to define the legal building blocks of an
XML document.
A DTD defines the document structure with a list of legal elements and attributes.
DTD Newspaper Example
<!DOCTYPE NEWSPAPER [
<!ELEMENT NEWSPAPER (ARTICLE+)>
<!ELEMENT ARTICLE (HEADLINE,BYLINE,LEAD,BODY,NOTES)>
<!ELEMENT HEADLINE (#PCDATA)>
<!ELEMENT BYLINE (#PCDATA)>
<!ELEMENT LEAD (#PCDATA)>
<!ELEMENT BODY (#PCDATA)>
<!ELEMENT NOTES (#PCDATA)>
<!ATTLIST ARTICLE AUTHOR CDATA #REQUIRED>
<!ATTLIST ARTICLE EDITOR CDATA #IMPLIED>
<!ATTLIST ARTICLE DATE CDATA #IMPLIED>
<!ATTLIST ARTICLE EDITION CDATA #IMPLIED>
]>
Introduction to DTD
A Document Type Definition (DTD) defines the legal building blocks of an XML document. It
defines the document structure with a list of legal elements and attributes.
A DTD can be declared inline inside an XML document, or as an external reference.
Internal DTD Declaration
If the DTD is declared inside the XML file, it should be wrapped in a DOCTYPE definition with the
following syntax:
<!DOCTYPE root-element [element-declarations]>
Example XML document with an internal DTD:
<?xml version="1.0"?>
<!DOCTYPE note [
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
]>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend</body>
</note>
The DTD above is interpreted like this:
• !DOCTYPE note defines that the root element of this document is note
• !ELEMENT note defines that the note element contains four elements:
"to,from,heading,body"
• !ELEMENT to defines the to element to be of type "#PCDATA"
• !ELEMENT from defines the from element to be of type "#PCDATA"
• !ELEMENT heading defines the heading element to be of type "#PCDATA"
• !ELEMENT body defines the body element to be of type "#PCDATA"
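The interpretation above can be checked with a short sketch. It uses Python's xml.dom.minidom, which is an assumption made here for illustration; it is a non-validating parser, so it reads the DOCTYPE declaration but does not enforce the element declarations.

```python
from xml.dom import minidom

NOTE_XML = """<?xml version="1.0"?>
<!DOCTYPE note [
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
]>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend</body>
</note>"""

doc = minidom.parseString(NOTE_XML)

doctype_name = doc.doctype.name          # the name written after <!DOCTYPE
root_name = doc.documentElement.tagName  # the actual root element
child_names = [node.tagName
               for node in doc.documentElement.childNodes
               if node.nodeType == node.ELEMENT_NODE]

print(doctype_name, root_name, child_names)
```

The DOCTYPE name and the root element agree, and the four children appear in the order the note declaration lists them.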
External DTD Declaration
If the DTD is declared in an external file, the XML document refers to it in a DOCTYPE declaration with the
following syntax: <!DOCTYPE root-element SYSTEM "filename">
This is the same XML document as above, but with an external DTD:
<?xml version="1.0"?>
<!DOCTYPE note SYSTEM "note.dtd">
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
And this is the file "note.dtd" which contains the DTD:
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
Why Use a DTD?
With a DTD, each of your XML files can carry a description of its own format.
With a DTD, independent groups of people can agree to use a standard DTD for interchanging
data.
Your application can use a standard DTD to verify that the data you receive from the outside
world is valid. You can also use a DTD to verify your own data.
DTD ‐ XML Building Blocks
The main building blocks of both XML and HTML documents are elements.
The Building Blocks of XML Documents
Seen from a DTD point of view, all XML documents (and HTML documents) are made up of the
following building blocks:
• Elements
• Attributes
• Entities
• PCDATA
• CDATA
Elements
Elements are the main building blocks of both XML and HTML documents.
Examples of HTML elements are "body" and "table". Examples of XML elements could be "note"
and "message". Elements can contain text, other elements, or be empty. Examples of empty
HTML elements are "hr", "br" and "img".
Examples:
<body>some text</body>
<message>some text</message>
Attributes
Attributes provide extra information about elements.
Attributes are always placed inside the opening tag of an element. Attributes always come in
name/value pairs. The following "img" element has additional information about a source file:
<img src="computer.gif" />
The name of the element is "img". The name of the attribute is "src". The value of the attribute
is "computer.gif". Since the element itself is empty it is closed by a " /".
Entities
Some characters have a special meaning in XML, like the less than sign (<) that defines the start
of an XML tag.
Most of you know the HTML entity "&nbsp;". This non-breaking-space entity is used in HTML
to insert an extra space in a document. Entities are expanded when a document is parsed by an
XML parser.
The following entities are predefined in XML:
Entity Reference   Character
&lt;               <
&gt;               >
&amp;              &
&quot;             "
&apos;             '
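As a small illustration of the predefined entities (a sketch using Python's standard library, chosen here only for demonstration), escape() replaces &, < and > with entity references, and an XML parser expands them again when it reads the document:

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, unescape

raw = 'if a < b & b > c then "ok"'

# escape() rewrites &, < and > as &amp;, &lt; and &gt;.
escaped = escape(raw)

# The parser expands the entity references back into characters.
element = ET.fromstring("<body>" + escaped + "</body>")

print(escaped)
print(element.text)
print(unescape(escaped) == raw)
```

By default escape() leaves quotation marks alone, because they only need escaping inside attribute values.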
PCDATA
PCDATA means parsed character data.
Think of character data as the text found between the start tag and the end tag of an XML
element.
PCDATA is text that WILL be parsed by a parser. The text will be examined by the parser for
entities and markup.
Tags inside the text will be treated as markup and entities will be expanded.
However, parsed character data should not contain any &, <, or > characters; these need to be
represented by the &amp;, &lt;, and &gt; entities, respectively.
CDATA
CDATA means character data.
CDATA is text that will NOT be parsed by a parser. Tags inside the text will NOT be treated as
markup and entities will not be expanded.
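The difference between parsed and unparsed text can be seen in a short sketch. It uses a CDATA section (the <![CDATA[ ... ]]> form) inside a document as the unparsed case, which is an assumption about the example best suited here; Python's xml.etree.ElementTree is used only for illustration.

```python
import xml.etree.ElementTree as ET

# Parsed character data: the entity is expanded and <b> becomes a child element.
parsed = ET.fromstring("<body>1 &lt; 2 <b>bold</b></body>")

# CDATA section: the same characters are kept literally as text.
literal = ET.fromstring("<body><![CDATA[1 &lt; 2 <b>bold</b>]]></body>")

print(repr(parsed.text), [child.tag for child in parsed])
print(repr(literal.text), [child.tag for child in literal])
```

In the first element the parser finds one child element and expands the entity; in the second it finds no children and leaves the entity reference untouched.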
DTD ‐ Elements
In a DTD, elements are declared with an ELEMENT declaration.
Declaring Elements
In a DTD, XML elements are declared with an element declaration with the following syntax:
<!ELEMENT element-name category>
or
<!ELEMENT element-name (element-content)>
Empty Elements
Empty elements are declared with the category keyword EMPTY:
<!ELEMENT element-name EMPTY>
Example:
<!ELEMENT br EMPTY>
XML example:
<br />
Elements with Parsed Character Data
Elements with only parsed character data are declared with #PCDATA inside parentheses:
<!ELEMENT element-name (#PCDATA)>
Example:
<!ELEMENT from (#PCDATA)>
Elements with any Contents
Elements declared with the category keyword ANY, can contain any combination of parsable
data:
<!ELEMENT element-name ANY>
Example:
<!ELEMENT note ANY>
Elements with Children (sequences)
Elements with one or more children are declared with the name of the children elements inside
parentheses:
<!ELEMENT element-name (child1)>
or
<!ELEMENT element-name (child1,child2,...)>
Example:
<!ELEMENT note (to,from,heading,body)>
When children are declared in a sequence separated by commas, the children must appear in
the same sequence in the document. In a full declaration, the children must also be declared,
and the children can also have children. The full declaration of the "note" element is:
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
Declaring Only One Occurrence of an Element
<!ELEMENT element-name (child-name)>
Example:
<!ELEMENT note (message)>
The example above declares that the child element "message" must occur once, and only once
inside the "note" element.
Declaring Minimum One Occurrence of an Element
<!ELEMENT element-name (child-name+)>
Example:
<!ELEMENT note (message+)>
The + sign in the example above declares that the child element "message" must occur one or
more times inside the "note" element.
Declaring Zero or More Occurrences of an Element
<!ELEMENT element-name (child-name*)>
Example:
<!ELEMENT note (message*)>
The * sign in the example above declares that the child element "message" can occur zero or
more times inside the "note" element.
Declaring Zero or One Occurrences of an Element
<!ELEMENT element-name (child-name?)>
Example:
<!ELEMENT note (message?)>
The ? sign in the example above declares that the child element "message" can occur zero or
one time inside the "note" element.
Declaring either/or Content
Example:
<!ELEMENT note (to,from,header,(message|body))>
The example above declares that the "note" element must contain a "to" element, a "from"
element, a "header" element, and either a "message" or a "body" element.
Declaring Mixed Content
Example:
<!ELEMENT note (#PCDATA|to|from|header|message)*>
The example above declares that the "note" element can contain zero or more occurrences of
parsed character data, "to", "from", "header", or "message" elements.
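The occurrence indicators and the either/or syntax behave like a regular expression over the names of an element's children. The sketch below makes that concrete. It is an illustration under stated assumptions, not a validator: model_to_regex and children_match are hypothetical helper names, and only element-content models are handled (no #PCDATA, EMPTY, ANY or mixed content).

```python
import re
import xml.etree.ElementTree as ET

def model_to_regex(model):
    """Translate a children-only DTD content model into a compiled regex.

    Each child is represented by its tag name followed by one space.
    """
    compact = re.sub(r"\s+", "", model)
    # Element names become groups; ( ) | ? * + keep their DTD meaning.
    pattern = re.sub(
        r"[A-Za-z_][\w.-]*",
        lambda m: "(?:" + re.escape(m.group(0)) + " )",
        compact,
    )
    # A comma means "followed by", which is plain concatenation in a regex.
    return re.compile(pattern.replace(",", ""))

def children_match(xml_text, model):
    """Check the root element's direct children against the content model."""
    element = ET.fromstring(xml_text)
    tokens = "".join(child.tag + " " for child in element)
    return model_to_regex(model).fullmatch(tokens) is not None

EITHER_OR = "(to,from,header,(message|body))"
print(children_match("<note><to/><from/><header/><body/></note>", EITHER_OR))
print(children_match("<note><from/><to/><header/><body/></note>", EITHER_OR))
print(children_match("<note/>", "(message+)"))
print(children_match("<note/>", "(message*)"))
```

The second call fails because a comma-separated sequence fixes the order of the children, and the third fails because + requires at least one occurrence.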
DTD ‐ Attributes
In a DTD, attributes are declared with an ATTLIST declaration.
Declaring Attributes
An attribute declaration has the following syntax:
<!ATTLIST element-name attribute-name attribute-type default-value>
DTD example:
<!ATTLIST payment type CDATA "check">
XML example:
<payment type="check" />
The attribute‐type can be one of the following:
Type Description
CDATA The value is character data
(en1|en2|..) The value must be one from an enumerated list
ID The value is a unique id
IDREF The value is the id of another element
IDREFS The value is a list of other ids
NMTOKEN The value is a valid XML name
NMTOKENS The value is a list of valid XML names
ENTITY The value is an entity
ENTITIES The value is a list of entities
NOTATION The value is a name of a notation
xml: The value is a predefined xml value
The default‐value can be one of the following:
Value Explanation
value The default value of the attribute
#REQUIRED The attribute is required
#IMPLIED The attribute is not required
#FIXED value The attribute value is fixed
A Default Attribute Value
DTD:
<!ELEMENT square EMPTY>
<!ATTLIST square width CDATA "0">
Valid XML:
<square width="100" />
In the example above, the "square" element is defined to be an empty element with a "width"
attribute of type CDATA. If no width is specified, it has a default value of 0.
#REQUIRED
Syntax
<!ATTLIST element-name attribute-name attribute-type #REQUIRED>
Example
DTD:
<!ATTLIST person number CDATA #REQUIRED>
Valid XML:
<person number="5677" />
Invalid XML:
<person />
Use the #REQUIRED keyword if you don't have an option for a default value, but still want to
force the attribute to be present.
#IMPLIED
Syntax
<!ATTLIST element-name attribute-name attribute-type #IMPLIED>
Example
DTD:
<!ATTLIST contact fax CDATA #IMPLIED>
Valid XML:
<contact fax="555-667788" />
Valid XML:
<contact />
Use the #IMPLIED keyword if you don't want to force the author to include an attribute, and
you don't have an option for a default value.
#FIXED
Syntax
<!ATTLIST element-name attribute-name attribute-type #FIXED "value">
Example
DTD:
<!ATTLIST sender company CDATA #FIXED "Microsoft">
Valid XML:
<sender company="Microsoft" />
Invalid XML:
<sender company="W3Schools" />
Use the #FIXED keyword when you want an attribute to have a fixed value without allowing the
author to change it. If an author includes another value, the XML parser will return an error.
Enumerated Attribute Values
Syntax
<!ATTLIST element-name attribute-name (en1|en2|..) default-value>
Example
DTD:
<!ATTLIST payment type (check|cash) "cash">
XML example:
<payment type="check" />
or
<payment type="cash" />
Use enumerated attribute values when you want the attribute value to be one of a fixed set of
legal values.
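The four default-value kinds and the enumerated type can be summarised in a small sketch. It models the rules in plain Python rather than calling a real validating parser; apply_attlist and the declaration tuples are hypothetical names introduced only for this illustration.

```python
def apply_attlist(attrs, declarations):
    """Apply ATTLIST-style rules to an element's attributes.

    attrs: dict of attributes written on the element.
    declarations: name -> (allowed values or None, kind, value), where kind is
    "DEFAULT", "#REQUIRED", "#IMPLIED" or "#FIXED".
    """
    result = dict(attrs)
    for name, (allowed, kind, value) in declarations.items():
        if name not in result:
            if kind == "#REQUIRED":
                raise ValueError("missing required attribute " + name)
            if kind in ("DEFAULT", "#FIXED"):
                result[name] = value  # the declared value is supplied
            continue                  # #IMPLIED: the attribute stays absent
        if kind == "#FIXED" and result[name] != value:
            raise ValueError("attribute " + name + " must be " + value)
        if allowed is not None and result[name] not in allowed:
            raise ValueError("illegal value for attribute " + name)
    return result

# Declarations mirroring the DTD examples in this section.
PAYMENT = {"type": (("check", "cash"), "DEFAULT", "cash")}
PERSON = {"number": (None, "#REQUIRED", None)}
CONTACT = {"fax": (None, "#IMPLIED", None)}
SENDER = {"company": (None, "#FIXED", "Microsoft")}

print(apply_attlist({}, PAYMENT))
print(apply_attlist({"type": "check"}, PAYMENT))
print(apply_attlist({}, CONTACT))
print(apply_attlist({}, SENDER))
```

An omitted payment type falls back to cash, an omitted fax simply stays absent, and an omitted or mismatched value is rejected only for #REQUIRED, #FIXED and enumerated attributes.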
XML Elements vs. Attributes
In XML, there are no rules about when to use attributes, and when to use child elements.
Use of Elements vs. Attributes
Data can be stored in child elements or in attributes.
Take a look at these examples:
<person sex="female">
<firstname>Anna</firstname>
<lastname>Smith</lastname>
</person>
<person>
<sex>female</sex>
<firstname>Anna</firstname>
<lastname>Smith</lastname>
</person>
In the first example sex is an attribute. In the last, sex is a child element. Both examples provide
the same information.
There are no rules about when to use attributes, and when to use child elements. My
experience is that attributes are handy in HTML, but in XML you should try to avoid them. Use
child elements if the information feels like data.
My Favorite Way
I like to store data in child elements.
The following three XML documents contain exactly the same information:
A date attribute is used in the first example:
<note date="12/11/2002">
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
A date element is used in the second example:
<note>
<date>12/11/2002</date>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
An expanded date element is used in the third: (THIS IS MY FAVORITE):
<note>
<date>
<day>12</day>
<month>11</month>
<year>2002</year>
</date>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
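One practical reason to prefer the expanded date element is that a program can read each part without guessing the format. A minimal sketch, assuming Python's standard library and a shortened copy of the third example:

```python
import xml.etree.ElementTree as ET
from datetime import date

# Shortened copy of the third example; only the date part matters here.
NOTE = """
<note>
<date>
<day>12</day>
<month>11</month>
<year>2002</year>
</date>
<to>Tove</to>
</note>
"""

date_element = ET.fromstring(NOTE).find("date")

# Each component is addressed by name, so no string parsing is needed.
note_date = date(
    int(date_element.findtext("year")),
    int(date_element.findtext("month")),
    int(date_element.findtext("day")),
)

print(note_date.isoformat())
```

With the single string "12/11/2002" the program would first have to decide whether the value means 12 November or 11 December.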
Avoid using attributes?
Should you avoid using attributes?
Some of the problems with attributes are:
• attributes cannot contain multiple values (child elements can)
• attributes are not easily expandable (for future changes)
• attributes cannot describe structures (child elements can)
• attributes are more difficult to manipulate by program code
• attribute values are not easy to test against a DTD
If you use attributes as containers for data, you end up with documents that are difficult to read
and maintain. Try to use elements to describe data. Use attributes only to provide information
that is not relevant to the data.
Don't end up like this (this is not how XML should be used):
<note day="12" month="11" year="2002"
to="Tove" from="Jani" heading="Reminder"
body="Don't forget me this weekend!">
</note>
An Exception to my Attribute Rule
Rules always have exceptions.
My rule about attributes has one exception:
Sometimes I assign ID references to elements. These ID references can be used to access XML
elements in much the same way as the NAME or ID attributes in HTML. This example
demonstrates this:
<messages>
<note id="p501">
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
<note id="p502">
<to>Jani</to>
<from>Tove</from>
<heading>Re: Reminder</heading>
<body>I will not!</body>
</note>
</messages>
The ID in these examples is just a counter, or a unique identifier, to identify the different notes
in the XML file, and not a part of the note data.
What I am trying to say here is that metadata (data about data) should be stored as attributes,
and that data itself should be stored as elements.
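Using the id attribute as a handle can be sketched as follows. Python's xml.etree.ElementTree is assumed here for illustration; its path syntax supports attribute predicates such as [@id='p502'].

```python
import xml.etree.ElementTree as ET

MESSAGES = """
<messages>
<note id="p501">
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
<note id="p502">
<to>Jani</to>
<from>Tove</from>
<heading>Re: Reminder</heading>
<body>I will not!</body>
</note>
</messages>
"""

root = ET.fromstring(MESSAGES)

# Select one note by its identifier; the id is metadata, not note content.
reply = root.find("./note[@id='p502']")
heading = reply.findtext("heading")

# An index by id gives direct access to every note.
notes_by_id = {note.get("id"): note for note in root.iter("note")}

print(heading)
print(sorted(notes_by_id))
```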
DTD ‐ Entities
Entities are variables used to define shortcuts to standard text or special characters.
• Entity references are references to entities
• Entities can be declared internal or external
An Internal Entity Declaration
Syntax
<!ENTITY entity‐name "entity‐value">
DTD Example:
<!ENTITY writer "Donald Duck.">
<!ENTITY copyright "Copyright W3Schools.">
XML example: <author>&writer;&copyright;</author>
Note: An entity reference has three parts: an ampersand (&), an entity name, and a semicolon (;).
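Internal entity expansion can be observed with a short sketch. It assumes Python's xml.etree.ElementTree, whose underlying parser expands internal entities declared in the internal DTD subset; the two declarations are placed in a DOCTYPE so that the document is self-contained.

```python
import xml.etree.ElementTree as ET

DOC = """<?xml version="1.0"?>
<!DOCTYPE author [
<!ENTITY writer "Donald Duck.">
<!ENTITY copyright "Copyright W3Schools.">
]>
<author>&writer;&copyright;</author>"""

author = ET.fromstring(DOC)

# Both entity references have been replaced by their declared values.
print(author.text)
```

Because expansion happens while parsing, documents from untrusted sources should be parsed with entity expansion limits in mind.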
An External Entity Declaration
Syntax
<!ENTITY entity‐name SYSTEM "URI/URL">
Example
DTD Example:
<!ENTITY writer SYSTEM "http://www.w3schools.com/entities.dtd">
<!ENTITY copyright SYSTEM "http://www.w3schools.com/entities.dtd">
XML example:
<author>&writer;&copyright;</author>
DTD Validation
With Internet Explorer 5+ you can validate your XML against a DTD.
Validating With the XML Parser
If you try to open an XML document, the XML Parser might generate an error. By accessing the
parseError object, you can retrieve the error code, the error text, or even the line that caused
the error.
Note: The load( ) method is used for files, while the loadXML( ) method is used for strings.
Example
// Internet Explorer only: ActiveXObject is not available in other browsers.
var xmlDoc = new ActiveXObject("Microsoft.XMLDOM");
xmlDoc.async = false;           // load synchronously
xmlDoc.validateOnParse = true;  // validate against the DTD while parsing
xmlDoc.load("note_dtd_error.xml");
document.write("<br />Error Code: ");
document.write(xmlDoc.parseError.errorCode);
document.write("<br />Error Reason: ");
document.write(xmlDoc.parseError.reason);
document.write("<br />Error Line: ");
document.write(xmlDoc.parseError.line);
Turn Validation Off
Validation can be turned off by setting the XML parser's validateOnParse property to false.
Example
// Internet Explorer only: ActiveXObject is not available in other browsers.
var xmlDoc = new ActiveXObject("Microsoft.XMLDOM");
xmlDoc.async = false;            // load synchronously
xmlDoc.validateOnParse = false;  // check well-formedness only, skip DTD validation
xmlDoc.load("note_dtd_error.xml");
document.write("<br />Error Code: ");
document.write(xmlDoc.parseError.errorCode);
document.write("<br />Error Reason: ");
document.write(xmlDoc.parseError.reason);
document.write("<br />Error Line: ");
document.write(xmlDoc.parseError.line);
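Outside Internet Explorer, the same idea of inspecting an error object after a failed parse can be sketched with Python's standard library. This is an analogue and not an equivalent: xml.etree.ElementTree checks well-formedness only and does not validate against a DTD. describe_error is a hypothetical helper name, and the broken sample document is an assumption.

```python
import xml.etree.ElementTree as ET

# Hypothetical malformed document: <to> is closed with </from> on line 2.
BROKEN = "<note>\n<to>Tove</from>\n</note>"

def describe_error(xml_text):
    """Return (code, line, reason) for malformed XML, or None if it parses."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        line, _column = err.position  # where the parser detected the problem
        return err.code, line, str(err)
    return None

result = describe_error(BROKEN)
print(result)
print(describe_error("<note/>"))
```

The code, position and message play the roles of errorCode, line and reason in the parseError object shown above.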
DTD ‐ Examples from the internet
TV Schedule DTD
By David Moisan. Copied from http://www.davidmoisan.org/
<!DOCTYPE TVSCHEDULE [
<!ELEMENT TVSCHEDULE (CHANNEL+)>
<!ELEMENT CHANNEL (BANNER,DAY+)>
<!ELEMENT BANNER (#PCDATA)>
<!ELEMENT DAY (DATE,(HOLIDAY|PROGRAMSLOT+)+)>
<!ELEMENT HOLIDAY (#PCDATA)>
<!ELEMENT DATE (#PCDATA)>
<!ELEMENT PROGRAMSLOT (TIME,TITLE,DESCRIPTION?)>
<!ELEMENT TIME (#PCDATA)>
<!ELEMENT TITLE (#PCDATA)>
<!ELEMENT DESCRIPTION (#PCDATA)>
<!ATTLIST TVSCHEDULE NAME CDATA #REQUIRED>
<!ATTLIST CHANNEL CHAN CDATA #REQUIRED>
<!ATTLIST PROGRAMSLOT VTR CDATA #IMPLIED>
<!ATTLIST TITLE RATING CDATA #IMPLIED>
<!ATTLIST TITLE LANGUAGE CDATA #IMPLIED>
]>
Newspaper Article DTD
Copied from http://www.vervet.com/
<!DOCTYPE NEWSPAPER [
<!ELEMENT NEWSPAPER (ARTICLE+)>
<!ELEMENT ARTICLE (HEADLINE,BYLINE,LEAD,BODY,NOTES)>
<!ELEMENT HEADLINE (#PCDATA)>
<!ELEMENT BYLINE (#PCDATA)>
<!ELEMENT LEAD (#PCDATA)>
<!ELEMENT BODY (#PCDATA)>
<!ELEMENT NOTES (#PCDATA)>
<!ATTLIST ARTICLE AUTHOR CDATA #REQUIRED>
<!ATTLIST ARTICLE EDITOR CDATA #IMPLIED>
<!ATTLIST ARTICLE DATE CDATA #IMPLIED>
<!ATTLIST ARTICLE EDITION CDATA #IMPLIED>
<!ENTITY NEWSPAPER "Vervet Logic Times">
<!ENTITY PUBLISHER "Vervet Logic Press">
<!ENTITY COPYRIGHT "Copyright 1998 Vervet Logic Press">
]>
Product Catalog DTD
Copied from http://www.vervet.com/
<!DOCTYPE CATALOG [
<!ENTITY AUTHOR "John Doe">
<!ENTITY COMPANY "JD Power Tools, Inc.">
<!ENTITY EMAIL "jd@jd-tools.com">
<!ELEMENT CATALOG (PRODUCT+)>
<!ELEMENT PRODUCT
(SPECIFICATIONS+,OPTIONS?,PRICE+,NOTES?)>
<!ATTLIST PRODUCT
NAME CDATA #IMPLIED
CATEGORY (HandTool|Table|Shop-Professional) "HandTool"
PARTNUM CDATA #IMPLIED
PLANT (Pittsburgh|Milwaukee|Chicago) "Chicago"
INVENTORY (InStock|Backordered|Discontinued) "InStock">
<!ELEMENT SPECIFICATIONS (#PCDATA)>
<!ATTLIST SPECIFICATIONS
WEIGHT CDATA #IMPLIED
POWER CDATA #IMPLIED>
<!ELEMENT OPTIONS (#PCDATA)>
<!ATTLIST OPTIONS
FINISH (Metal|Polished|Matte) "Matte"
ADAPTER (Included|Optional|NotApplicable) "Included"
CASE (HardShell|Soft|NotApplicable) "HardShell">
<!ELEMENT PRICE (#PCDATA)>
<!ATTLIST PRICE
MSRP CDATA #IMPLIED
WHOLESALE CDATA #IMPLIED
STREET CDATA #IMPLIED
SHIPPING CDATA #IMPLIED>
<!ELEMENT NOTES (#PCDATA)>
]>
DTD Summary
This tutorial has taught you how to describe the structure of an XML document.
You have learned how to use a DTD to define the legal elements of an XML document, and how
a DTD can be declared inside your XML document, or as an external reference.
You have learned how to declare the legal elements, attributes, entities, and CDATA sections
for XML documents.
You have also seen how to validate an XML document against a DTD.
Now You Know DTD, What's Next?
The next step is to learn about XML Schema.
XML Schema is used to define the legal elements of an XML document, just like a DTD. We think
that very soon XML Schemas will be used in most Web applications as a replacement for DTDs.
XML Schema is an XML‐based alternative to DTD.
Unlike DTDs, XML Schemas support data types and namespaces.
Limitations / Problems with DTD
Top 15 reasons for avoiding DTD:
1. not itself using XML syntax (the SGML heritage can be very unintuitive + if using XML,
DTDs could potentially themselves be syntax checked with a "meta DTD")
2. mixed into the XML 1.0 spec (would be much less confusing if specified separately +
even non‐validating processors must look at the DTD)
3. no constraints on character data (if character data is allowed, any character data is
allowed)
4. too simple attribute value models (enumerations are clearly insufficient)
5. cannot mix character data and regexp content models (and the content models are
generally hard to use for complex requirements)
6. no support for Namespaces (of course, XML 1.0 was defined before Namespaces)
7. very limited support for modularity and reuse (the entity mechanism is too low‐level)
8. no support for schema evolution, extension, or inheritance of declarations (difficult to
write, maintain, and read large DTDs, and to define families of related schemas)
9. limited white‐space control (xml:space is rarely used)
10. no embedded, structured self-documentation (<!-- comments --> are not enough)
11. content and attribute declarations cannot depend on attributes or element context
(many XML languages use that, but their DTDs have to "allow too much")
12. too simple ID attribute mechanism (no points‐to requirements, uniqueness scope, etc.)
13. only defaults for attributes, not for elements (but that would often be convenient)
14. cannot specify "any element" or "any attribute" (useful for partial specifications and
during schema development)
15. defaults cannot be specified separate from the declarations (would be convenient to
have defaults in separate modules)
XML Schema
An XML Schema describes the structure of an XML document.
In this tutorial you will learn how to create XML Schemas, why XML Schemas are more powerful
than DTDs, and how to use XML Schema in your application.
XML Schema Example
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="note">
<xs:complexType>
<xs:sequence>
<xs:element name="to" type="xs:string"/>
<xs:element name="from" type="xs:string"/>
<xs:element name="heading" type="xs:string"/>
<xs:element name="body" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
Introduction to XML Schema
XML Schema is an XML‐based alternative to DTD.
An XML schema describes the structure of an XML document.
The XML Schema language is also referred to as XML Schema Definition (XSD).
What You Should Already Know
Before you continue you should have a basic understanding of the following:
• HTML / XHTML
• XML and XML Namespaces
• A basic understanding of DTD
What is an XML Schema?
The purpose of an XML Schema is to define the legal building blocks of an XML document, just
like a DTD.
An XML Schema:
• defines elements that can appear in a document
• defines attributes that can appear in a document
• defines which elements are child elements
• defines the order of child elements
• defines the number of child elements
• defines whether an element is empty or can include text
• defines data types for elements and attributes
• defines default and fixed values for elements and attributes
XML Schemas are the Successors of DTDs
We think that very soon XML Schemas will be used in most Web applications as a replacement
for DTDs. Here are some reasons:
• XML Schemas are extensible to future additions
• XML Schemas are richer and more powerful than DTDs
• XML Schemas are written in XML
• XML Schemas support data types
• XML Schemas support namespaces
XML Schema is a W3C Recommendation
XML Schema became a W3C Recommendation on 2 May 2001.
Why Use XML Schemas?
XML Schemas are much more powerful than DTDs.
XML Schemas Support Data Types
One of the greatest strengths of XML Schemas is the support for data types.
With support for data types:
• It is easier to describe allowable document content
• It is easier to validate the correctness of data
• It is easier to work with data from a database
• It is easier to define data facets (restrictions on data)
• It is easier to define data patterns (data formats)
• It is easier to convert data between different data types
XML Schemas use XML Syntax
Another great strength of XML Schemas is that they are written in XML.
Some benefits of XML Schemas being written in XML:
• You don't have to learn a new language
• You can use your XML editor to edit your Schema files
• You can use your XML parser to parse your Schema files
• You can manipulate your Schema with the XML DOM
• You can transform your Schema with XSLT
XML Schemas Secure Data Communication
When sending data from a sender to a receiver, it is essential that both parties have the same
"expectations" about the content.
With XML Schemas, the sender can describe the data in a way that the receiver will understand.
A date like "03-11-2004" will, in some countries, be interpreted as 3 November and in other
countries as 11 March.
However, an XML element with a data type like this:
<date type="date">2004-03-11</date>
ensures a mutual understanding of the content, because the XML data type "date" requires the
format "YYYY-MM-DD".
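In an actual schema, the data type is assigned by the element declaration rather than by a type attribute in the instance document. The fragments below are a minimal sketch of that arrangement, assuming an element named "date" as in the example above and the usual xs prefix binding:

```xml
<!-- Schema side: the declaration binds the element to the built-in xs:date type -->
<xs:element name="date" type="xs:date"/>

<!-- Instance side: validates, because the value uses the YYYY-MM-DD lexical form -->
<date>2004-03-11</date>

<!-- Instance side: fails validation, because 03-11-2004 is not a legal xs:date value -->
<date>03-11-2004</date>
```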
XML Schemas are Extensible
XML Schemas are extensible, because they are written in XML.
With an extensible Schema definition you can:
• Reuse your Schema in other Schemas
• Create your own data types derived from the standard types
• Reference multiple schemas in the same document
Well‐Formed is not Enough
A well‐formed XML document is a document that conforms to the XML syntax rules, like:
• it should begin with the XML declaration (optional in XML 1.0, but recommended)
• it must have one unique root element
• start-tags must have matching end-tags
• element names are case sensitive
• all elements must be closed
• all elements must be properly nested
• all attribute values must be quoted
• entities must be used for special characters
Even if documents are well‐formed they can still contain errors, and those errors can have
serious consequences.
Think of the following situation: you order 5 gross of laser printers, instead of 5 laser printers.
With XML Schemas, most of these errors can be caught by your validating software.
XSD How To?
XML documents can have a reference to a DTD or to an XML Schema.
A Simple XML Document
Look at this simple XML document called "note.xml":
<?xml version="1.0"?>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
A DTD File
The following example is a DTD file called "note.dtd" that defines the elements of the XML
document above ("note.xml"):
<!ELEMENT note (to, from, heading, body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
The first line defines the note element to have four child elements: "to, from, heading, body".
Lines 2-5 define the to, from, heading, and body elements to be of type "#PCDATA".
An XML Schema
The following example is an XML Schema file called "note.xsd" that defines the elements of the
XML document above ("note.xml"):
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://www.w3schools.com"
xmlns="http://www.w3schools.com"
elementFormDefault="qualified">
<xs:element name="note">
<xs:complexType>
<xs:sequence>
<xs:element name="to" type="xs:string"/>
<xs:element name="from" type="xs:string"/>
<xs:element name="heading" type="xs:string"/>
<xs:element name="body" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
The note element is a complex type because it contains other elements. The other elements
(to, from, heading, body) are simple types because they do not contain other elements. You
will learn more about simple and complex types in the following chapters.
A Reference to a DTD
This XML document has a reference to a DTD:
<?xml version="1.0"?>
<!DOCTYPE note SYSTEM
"http://www.w3schools.com/dtd/note.dtd">
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
A Reference to an XML Schema
This XML document has a reference to an XML Schema:
<?xml version="1.0"?>
<note
xmlns="http://www.w3schools.com"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.w3schools.com note.xsd">
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
XSD ‐ The <schema> Element
The <schema> element is the root element of every XML Schema.
The <schema> Element
The <schema> element is the root element of every XML Schema:
<?xml version="1.0"?>
<xs:schema>
...
...
</xs:schema>
The <schema> element may contain some attributes. A schema declaration often looks
something like this:
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://www.w3schools.com"
xmlns="http://www.w3schools.com"
elementFormDefault="qualified">
...
...
</xs:schema>
The following fragment:
xmlns:xs="http://www.w3.org/2001/XMLSchema"
indicates that the elements and data types used in the schema come from the
"http://www.w3.org/2001/XMLSchema" namespace. It also specifies that the elements and
data types that come from the "http://www.w3.org/2001/XMLSchema" namespace should be
prefixed with xs:
This fragment:
targetNamespace="http://www.w3schools.com"
indicates that the elements defined by this schema (note, to, from, heading, body) come from
the "http://www.w3schools.com" namespace.
This fragment:
xmlns="http://www.w3schools.com"
indicates that the default namespace is "http://www.w3schools.com".
This fragment:
elementFormDefault="qualified"
indicates that any elements used by the XML instance document which were declared in this
schema must be namespace qualified.
Referencing a Schema in an XML Document
This XML document has a reference to an XML Schema:
<?xml version="1.0"?>
<note xmlns="http://www.w3schools.com"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.w3schools.com note.xsd">
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
The following fragment:
xmlns="http://www.w3schools.com"
specifies the default namespace declaration. This declaration tells the schema‐validator that all
the elements used in this XML document are declared in the "http://www.w3schools.com"
namespace.
Once you have the XML Schema Instance namespace available:
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
you can use the schemaLocation attribute. This attribute has two values. The first value is the
namespace to use. The second value is the location of the XML schema to use for that
namespace:
xsi:schemaLocation="http://www.w3schools.com note.xsd"
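When a schema declares no targetNamespace, as in the first "note" schema at the start of this chapter, there is no namespace to pair with a location. The XML Schema Instance namespace provides the noNamespaceSchemaLocation attribute for that case. A minimal sketch, assuming the schema file is named note.xsd and sits next to the document:

```xml
<?xml version="1.0"?>
<!-- No default namespace is declared, so the elements belong to no namespace -->
<note xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:noNamespaceSchemaLocation="note.xsd">
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>
```

Both schemaLocation and noNamespaceSchemaLocation are hints to the processor; a validator may be configured to load its schemas from somewhere else.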
XSD Simple Elements
XML Schemas define the elements of your XML files.
A simple element is an XML element that contains only text. It cannot contain any other
elements or attributes.
What is a Simple Element?
A simple element is an XML element that can contain only text. It cannot contain any other
elements or attributes.
However, the "only text" restriction is quite misleading. The text can be of many different
types. It can be one of the types included in the XML Schema definition (boolean, string, date,
etc.), or it can be a custom type that you can define yourself.
You can also add restrictions (facets) to a data type in order to limit its content, or you can
require the data to match a specific pattern.
Defining a Simple Element
The syntax for defining a simple element is:
<xs:element name="xxx" type="yyy"/>
where xxx is the name of the element and yyy is the data type of the element.
XML Schema has a lot of built‐in data types. The most common types are:
• xs:string
• xs:decimal
• xs:integer
• xs:boolean
• xs:date
• xs:time
Example
Here are some XML elements:
<lastname>Refsnes</lastname>
<age>36</age>
<dateborn>1970-03-27</dateborn>
And here are the corresponding simple element definitions:
<xs:element name="lastname" type="xs:string"/>
<xs:element name="age" type="xs:integer"/>
<xs:element name="dateborn" type="xs:date"/>
Default and Fixed Values for Simple Elements
Simple elements may have a default value OR a fixed value specified.
A default value is automatically assigned to the element when no other value is specified.
In the following example the default value is "red":
<xs:element name="color" type="xs:string" default="red"/>
A fixed value is also automatically assigned to the element, and you cannot specify another
value.
In the following example the fixed value is "red":
<xs:element name="color" type="xs:string" fixed="red"/>
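The fragments below sketch how a validating processor is expected to treat the "color" element under each of the two declarations. Note that an element default applies when the element is present but empty; it does not add an element that is missing from the document.

```xml
<!-- With default="red" -->
<color/>             <!-- treated as if it contained "red" -->
<color>blue</color>  <!-- valid; the explicit value "blue" is kept -->

<!-- With fixed="red" -->
<color/>             <!-- treated as if it contained "red" -->
<color>red</color>   <!-- valid; matches the fixed value -->
<color>blue</color>  <!-- invalid; differs from the fixed value -->
```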
XSD Attributes
All attributes are declared as simple types.
What is an Attribute?
Simple elements cannot have attributes. If an element has attributes, it is considered to be of a
complex type. But the attribute itself is always declared as a simple type.
How to Define an Attribute?
The syntax for defining an attribute is:
<xs:attribute name="xxx" type="yyy"/>
where xxx is the name of the attribute and yyy specifies the data type of the attribute.
XML Schema has a lot of built‐in data types. The most common types are:
• xs:string
• xs:decimal
• xs:integer
• xs:boolean
• xs:date
• xs:time
Example
Here is an XML element with an attribute:
<lastname lang="EN">Smith</lastname>
And here is the corresponding attribute definition:
<xs:attribute name="lang" type="xs:string"/>
Default and Fixed Values for Attributes
Attributes may have a default value OR a fixed value specified.
A default value is automatically assigned to the attribute when no other value is specified.
In the following example the default value is "EN":
<xs:attribute name="lang" type="xs:string" default="EN"/>
A fixed value is also automatically assigned to the attribute, and you cannot specify another
value.
In the following example the fixed value is "EN":
<xs:attribute name="lang" type="xs:string" fixed="EN"/>
Optional and Required Attributes
Attributes are optional by default. To specify that the attribute is required, use the "use"
attribute:
<xs:attribute name="lang" type="xs:string" use="required"/>
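An attribute declaration is normally placed inside the complex type of the element that carries it. The following is a minimal sketch for the "lastname" element shown earlier, which holds text plus a required "lang" attribute; the choice of a simpleContent extension is one common way to write it, not the only one:

```xml
<xs:element name="lastname">
  <xs:complexType>
    <xs:simpleContent>
      <!-- The text content stays an xs:string; the extension adds the attribute -->
      <xs:extension base="xs:string">
        <xs:attribute name="lang" type="xs:string" use="required"/>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>
</xs:element>
```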
Restrictions on Content
When an XML element or attribute has a data type defined, it puts restrictions on the element's
or attribute's content.
If an XML element is of type "xs:date" and contains a string like "Hello World", the element will
not validate.
With XML Schemas, you can also add your own restrictions to your XML elements and
attributes. These restrictions are called facets. You can read more about facets in the next
chapter.
XSD Restrictions/Facets
Restrictions are used to define acceptable values for XML elements or attributes. Restrictions
on XML elements are called facets.
Restrictions on Values
The following example defines an element called "age" with a restriction. The value of age
cannot be lower than 0 or greater than 120:
<xs:element name="age">
<xs:simpleType>
<xs:restriction base="xs:integer">
<xs:minInclusive value="0"/>
<xs:maxInclusive value="120"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
Restrictions on a Set of Values
To limit the content of an XML element to a set of acceptable values, we would use the
enumeration constraint.
The example below defines an element called "car" with a restriction. The only acceptable
values are: Audi, Golf, BMW:
<xs:element name="car">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:enumeration value="Audi"/>
<xs:enumeration value="Golf"/>
<xs:enumeration value="BMW"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
The example above could also have been written like this:
<xs:element name="car" type="carType"/>
<xs:simpleType name="carType">
<xs:restriction base="xs:string">
<xs:enumeration value="Audi"/>
<xs:enumeration value="Golf"/>
<xs:enumeration value="BMW"/>
</xs:restriction>
</xs:simpleType>
Note: In this case the type "carType" can be used by other elements because it is not a part of
the "car" element.
Restrictions on a Series of Values
To limit the content of an XML element to a defined series of numbers or letters, we would use
the pattern constraint.
The example below defines an element called "letter" with a restriction. The only acceptable
value is ONE of the LOWERCASE letters from a to z:
<xs:element name="letter">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:pattern value="[a-z]"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
The next example defines an element called "initials" with a restriction. The only acceptable
value is THREE of the UPPERCASE letters from A to Z:
<xs:element name="initials">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:pattern value="[A-Z][A-Z][A-Z]"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
The next example also defines an element called "initials" with a restriction. The only
acceptable value is THREE of the LOWERCASE OR UPPERCASE letters from a to z:
<xs:element name="initials">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:pattern value="[a-zA-Z][a-zA-Z][a-zA-Z]"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
The next example defines an element called "choice" with a restriction. The only acceptable
value is ONE of the following letters: x, y, OR z:
<xs:element name="choice">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:pattern value="[xyz]"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
The next example defines an element called "prodid" with a restriction. The only acceptable
value is FIVE digits in a sequence, and each digit must be in a range from 0 to 9:
<xs:element name="prodid">
<xs:simpleType>
<xs:restriction base="xs:integer">
<xs:pattern value="[0-9][0-9][0-9][0-9][0-9]"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
Other Restrictions on a Series of Values
The example below defines an element called "letter" with a restriction. The acceptable value is
zero or more occurrences of lowercase letters from a to z:
<xs:element name="letter">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:pattern value="([a-z])*"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
The next example also defines an element called "letter" with a restriction. The acceptable
value is one or more pairs of letters, each pair consisting of a lower case letter followed by an
upper case letter. For example, "sToP" will be validated by this pattern, but not "Stop" or
"STOP" or "stop":
<xs:element name="letter">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:pattern value="([a-z][A-Z])+"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
The next example defines an element called "gender" with a restriction. The only acceptable
value is male OR female:
<xs:element name="gender">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:pattern value="male|female"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
The next example defines an element called "password" with a restriction. There must be
exactly eight characters in a row and those characters must be lowercase or uppercase letters
from a to z, or a number from 0 to 9:
<xs:element name="password">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:pattern value="[a-zA-Z0-9]{8}"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
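Repeated character classes can be written more compactly with a quantifier, as the "password" example does. The sketch below restates the earlier "initials" and "prodid" patterns in that form; these rewrites are added for illustration and are not part of the original examples:

```xml
<xs:element name="initials">
  <xs:simpleType>
    <xs:restriction base="xs:string">
      <!-- Same as [A-Z][A-Z][A-Z]: exactly three uppercase letters -->
      <xs:pattern value="[A-Z]{3}"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>

<xs:element name="prodid">
  <xs:simpleType>
    <xs:restriction base="xs:integer">
      <!-- Same as [0-9][0-9][0-9][0-9][0-9]: exactly five digits -->
      <xs:pattern value="[0-9]{5}"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>
```

An XML Schema pattern must match the whole value, so no ^ or $ anchors are needed.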
Restrictions on Whitespace Characters
To specify how whitespace characters should be handled, we would use the whiteSpace
constraint.
This example defines an element called "address" with a restriction. The whiteSpace constraint
is set to "preserve", which means that the XML processor WILL NOT remove any white space
characters:
<xs:element name="address">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:whiteSpace value="preserve"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
This example also defines an element called "address" with a restriction. The whiteSpace
constraint is set to "replace", which means that the XML processor WILL REPLACE all white
space characters (line feeds, tabs, spaces, and carriage returns) with spaces:
<xs:element name="address">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:whiteSpace value="replace"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
This example also defines an element called "address" with a restriction. The whiteSpace
constraint is set to "collapse", which means that the XML processor WILL REMOVE all white
space characters (line feeds, tabs, spaces, carriage returns are replaced with spaces, leading
and trailing spaces are removed, and multiple spaces are reduced to a single space):
<xs:element name="address">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:whiteSpace value="collapse"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
Restrictions on Length
To limit the length of a value in an element, we would use the length, maxLength, and
minLength constraints.
This example defines an element called "password" with a restriction. The value must be
exactly eight characters:
<xs:element name="password">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:length value="8"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
This example defines another element called "password" with a restriction. The value must be
at least five characters and at most eight characters long:
<xs:element name="password">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:minLength value="5"/>
<xs:maxLength value="8"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
Restrictions for Datatypes
Constraint      Description
enumeration     Defines a list of acceptable values
fractionDigits  Specifies the maximum number of decimal places allowed. Must be
                equal to or greater than zero
length          Specifies the exact number of characters or list items allowed.
                Must be equal to or greater than zero
maxExclusive    Specifies the upper bound for numeric values (the value must be
                less than this value)
maxInclusive    Specifies the upper bound for numeric values (the value must be
                less than or equal to this value)
maxLength       Specifies the maximum number of characters or list items allowed.
                Must be equal to or greater than zero
minExclusive    Specifies the lower bound for numeric values (the value must be
                greater than this value)
minInclusive    Specifies the lower bound for numeric values (the value must be
                greater than or equal to this value)
minLength       Specifies the minimum number of characters or list items allowed.
                Must be equal to or greater than zero
pattern         Defines the exact sequence of characters that are acceptable
totalDigits     Specifies the maximum number of digits allowed. Must be greater
                than zero
whiteSpace      Specifies how white space (line feeds, tabs, spaces, and carriage
                returns) is handled
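As an illustration of the numeric facets in the table, the following sketch declares a hypothetical "price" element (the name is not taken from the original examples) that allows at most six digits in total, of which at most two may follow the decimal point:

```xml
<xs:element name="price">
  <xs:simpleType>
    <xs:restriction base="xs:decimal">
      <!-- At most six digits overall -->
      <xs:totalDigits value="6"/>
      <!-- At most two digits after the decimal point -->
      <xs:fractionDigits value="2"/>
      <!-- No negative prices -->
      <xs:minInclusive value="0"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>
```

With this declaration, 1234.56 and 9.5 are accepted, while 1234567 (seven digits) and 1.234 (three decimal places) are rejected.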
XSD Complex Elements
A complex element contains other elements and/or attributes.
What is a Complex Element?
A complex element is an XML element that contains other elements and/or attributes.
There are four kinds of complex elements:
• empty elements
• elements that contain only other elements
• elements that contain only text
• elements that contain both other elements and text
Note: Each of these elements may contain attributes as well!
Examples of Complex Elements
A complex XML element, "product", which is empty:
<product pid="1345"/>
A complex XML element, "employee", which contains only other elements:
<employee>
<firstname>John</firstname>
<lastname>Smith</lastname>
</employee>
A complex XML element, "food", which contains only text:
<food type="dessert">Ice cream</food>
A complex XML element, "description", which contains both elements and text:
<description>
It happened on <date lang="norwegian">03.03.99</date> ....
</description>
How to Define a Complex Element
Look at this complex XML element, "employee", which contains only other elements:
<employee>
<firstname>John</firstname>
<lastname>Smith</lastname>
</employee>
We can define a complex element in an XML Schema in two different ways:
1. The "employee" element can be declared directly by naming the element, like this:
<xs:element name="employee">
<xs:complexType>
<xs:sequence>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
If you use the method described above, only the "employee" element can use the specified
complex type. Note that the child elements, "firstname" and "lastname", are surrounded by the
<sequence> indicator. This means that the child elements must appear in the same order as
they are declared. You will learn more about indicators in the XSD Indicators chapter.
2. The "employee" element can have a type attribute that refers to the name of the complex
type to use:
<xs:element name="employee" type="personinfo"/>
<xs:complexType name="personinfo">
<xs:sequence>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
</xs:sequence>
</xs:complexType>
If you use the method described above, several elements can refer to the same complex type,
like this:
<xs:element name="employee" type="personinfo"/>
<xs:element name="student" type="personinfo"/>
<xs:element name="member" type="personinfo"/>
<xs:complexType name="personinfo">
<xs:sequence>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
</xs:sequence>
</xs:complexType>
You can also base a complex element on an existing complex element and add some elements,
like this:
<xs:element name="employee" type="fullpersoninfo"/>
<xs:complexType name="personinfo">
<xs:sequence>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
</xs:sequence>
</xs:complexType>
<xs:complexType name="fullpersoninfo">
<xs:complexContent>
<xs:extension base="personinfo">
<xs:sequence>
<xs:element name="address" type="xs:string"/>
<xs:element name="city" type="xs:string"/>
<xs:element name="country" type="xs:string"/>
</xs:sequence>
</xs:extension>
</xs:complexContent>
</xs:complexType>
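With the extension above, an "employee" element must contain the inherited elements first, followed by the added ones. A sketch of a valid instance (the address values are invented):

```xml
<employee>
  <!-- Inherited from personinfo -->
  <firstname>John</firstname>
  <lastname>Smith</lastname>
  <!-- Added by fullpersoninfo -->
  <address>12 Example Street</address>
  <city>Springfield</city>
  <country>USA</country>
</employee>
```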
XSD Empty Elements
An empty complex element cannot have contents, only attributes.
Complex Empty Elements
An empty XML element:
<product prodid="1345" />
The "product" element above has no content at all. To define a type with no content, we must
define a type that allows elements in its content, but we do not actually declare any elements,
like this:
<xs:element name="product">
<xs:complexType>
<xs:complexContent>
<xs:restriction base="xs:anyType">
<xs:attribute name="prodid" type="xs:positiveInteger"/>
</xs:restriction>
</xs:complexContent>
</xs:complexType>
</xs:element>
In the example above, we define a complex type with complex content. The complexContent
element signals that we intend to restrict or extend the content model of a complex type, and
the restriction of anyType declares one attribute but does not introduce any element content.
However, it is possible to declare the "product" element more compactly, like this:
<xs:element name="product">
<xs:complexType>
<xs:attribute name="prodid" type="xs:positiveInteger"/>
</xs:complexType>
</xs:element>
Or you can give the complexType element a name, and let the "product" element have a type
attribute that refers to the name of the complexType (if you use this method, several elements
can refer to the same complex type):
<xs:element name="product" type="prodtype"/>
<xs:complexType name="prodtype">
<xs:attribute name="prodid" type="xs:positiveInteger"/>
</xs:complexType>
XSD Elements Only
An "elements-only" complex type contains only other elements.
Complex Types Containing Elements Only
An XML element, "person", that contains only other elements:
<person>
<firstname>John</firstname>
<lastname>Smith</lastname>
</person>
You can define the "person" element in a schema, like this:
<xs:element name="person">
<xs:complexType>
<xs:sequence>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
Notice the <xs:sequence> tag. It means that the elements defined ("firstname" and "lastname")
must appear in that order inside a "person" element.
Or you can give the complexType element a name, and let the "person" element have a type
attribute that refers to the name of the complexType (if you use this method, several elements
can refer to the same complex type):
<xs:element name="person" type="persontype"/>
<xs:complexType name="persontype">
<xs:sequence>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
</xs:sequence>
</xs:complexType>
XSD Text‐Only Elements
A complex text‐only element can contain text and attributes.
Complex Text‐Only Elements
This type contains only simple content (text and attributes), therefore we add a simpleContent
element around the content. When using simple content, you must define an extension OR a
restriction within the simpleContent element, like this:
<xs:element name="somename">
<xs:complexType>
<xs:simpleContent>
<xs:extension base="basetype">
....
....
</xs:extension>
</xs:simpleContent>
</xs:complexType>
</xs:element>
OR
<xs:element name="somename">
<xs:complexType>
<xs:simpleContent>
<xs:restriction base="basetype">
....
....
</xs:restriction>
</xs:simpleContent>
</xs:complexType>
</xs:element>
Tip: Use the extension/restriction element to expand or to limit the base simple type for the
element.
Here is an example of an XML element, "shoesize", that contains text‐only:
<shoesize country="france">35</shoesize>
The following example declares an element, "shoesize", with a complex type. The content is
defined as an integer value, and the "shoesize" element also contains an attribute named "country":
<xs:element name="shoesize">
<xs:complexType>
<xs:simpleContent>
<xs:extension base="xs:integer">
<xs:attribute name="country" type="xs:string" />
</xs:extension>
</xs:simpleContent>
</xs:complexType>
</xs:element>
We could also give the complexType element a name, and let the "shoesize" element have a
type attribute that refers to the name of the complexType (if you use this method, several
elements can refer to the same complex type):
<xs:element name="shoesize" type="shoetype"/>
<xs:complexType name="shoetype">
<xs:simpleContent>
<xs:extension base="xs:integer">
<xs:attribute name="country" type="xs:string" />
</xs:extension>
</xs:simpleContent>
</xs:complexType>
XSD Mixed Content
A mixed complex type element can contain attributes, elements, and text.
Complex Types with Mixed Content
An XML element, "letter", that contains both text and other elements:
<letter>
Dear Mr.<name>John Smith</name>.
Your order <orderid>1032</orderid>
will be shipped on <shipdate>2001-07-13</shipdate>.
</letter>
The following schema declares the "letter" element:
<xs:element name="letter">
<xs:complexType mixed="true">
<xs:sequence>
<xs:element name="name" type="xs:string"/>
<xs:element name="orderid" type="xs:positiveInteger"/>
<xs:element name="shipdate" type="xs:date"/>
</xs:sequence>
</xs:complexType>
</xs:element>
Note: To enable character data to appear between the child‐elements of "letter", the mixed
attribute must be set to "true". The <xs:sequence> tag means that the elements defined (name,
orderid and shipdate) must appear in that order inside a "letter" element.
We could also give the complexType element a name, and let the "letter" element have a type
attribute that refers to the name of the complexType (if you use this method, several elements
can refer to the same complex type):
<xs:element name="letter" type="lettertype"/>
<xs:complexType name="lettertype" mixed="true">
<xs:sequence>
<xs:element name="name" type="xs:string"/>
<xs:element name="orderid" type="xs:positiveInteger"/>
<xs:element name="shipdate" type="xs:date"/>
</xs:sequence>
</xs:complexType>
XSD Indicators
We can control HOW elements are to be used in documents with indicators.
Indicators
There are seven indicators:
Order indicators:
• All
• Choice
• Sequence
Occurrence indicators:
• maxOccurs
• minOccurs
Group indicators:
• Group name
• attributeGroup name
Order Indicators
Order indicators are used to define the order of the elements.
All Indicator
The <all> indicator specifies that the child elements can appear in any order, and that each child
element must occur only once:
<xs:element name="person">
<xs:complexType>
<xs:all>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
</xs:all>
</xs:complexType>
</xs:element>
Note: When using the <all> indicator, you can set minOccurs to 0 or 1 on each child element,
and maxOccurs can only be set to 1 (minOccurs and maxOccurs are described later).
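A minimal sketch of an <all> group with one optional child, assuming a hypothetical "nickname" element that is not part of the original example:

```xml
<xs:element name="person">
  <xs:complexType>
    <xs:all>
      <xs:element name="firstname" type="xs:string"/>
      <xs:element name="lastname" type="xs:string"/>
      <!-- hypothetical optional child: may be omitted, but may not repeat -->
      <xs:element name="nickname" type="xs:string" minOccurs="0"/>
    </xs:all>
  </xs:complexType>
</xs:element>
```

Under this declaration a "person" may list its children in any order, with or without "nickname".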
Choice Indicator
The <choice> indicator specifies that either one child element or another can occur:
<xs:element name="person">
<xs:complexType>
<xs:choice>
<xs:element name="employee" type="employee"/>
<xs:element name="member" type="member"/>
</xs:choice>
</xs:complexType>
</xs:element>
Sequence Indicator
The <sequence> indicator specifies that the child elements must appear in a specific order:
<xs:element name="person">
<xs:complexType>
<xs:sequence>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
Occurrence Indicators
Occurrence indicators are used to define how often an element can occur.
Note: For all "Order" and "Group" indicators (any, all, choice, sequence, group name, and group
reference) the default value for maxOccurs and minOccurs is 1.
maxOccurs Indicator
The maxOccurs indicator specifies the maximum number of times an element can occur:
<xs:element name="person">
<xs:complexType>
<xs:sequence>
<xs:element name="full_name" type="xs:string"/>
<xs:element name="child_name" type="xs:string" maxOccurs="10"/>
</xs:sequence>
</xs:complexType>
</xs:element>
The example above indicates that the "child_name" element can occur a minimum of one time
(the default value for minOccurs is 1) and a maximum of ten times in the "person" element.
minOccurs Indicator
The minOccurs indicator specifies the minimum number of times an element can occur:
<xs:element name="person">
<xs:complexType>
<xs:sequence>
<xs:element name="full_name" type="xs:string"/>
<xs:element name="child_name" type="xs:string"
maxOccurs="10" minOccurs="0"/>
</xs:sequence>
</xs:complexType>
</xs:element>
The example above indicates that the "child_name" element can occur a minimum of zero
times and a maximum of ten times in the "person" element.
Tip: To allow an element to appear an unlimited number of times, set
maxOccurs="unbounded".
A working example:
An XML file called "Myfamily.xml":
<?xml version="1.0" encoding="ISO-8859-1"?>
<persons xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="family.xsd">
<person>
<full_name>Hege Refsnes</full_name>
<child_name>Cecilie</child_name>
</person>
<person>
<full_name>Tove Refsnes</full_name>
<child_name>Hege</child_name>
<child_name>Stale</child_name>
<child_name>Jim</child_name>
<child_name>Borge</child_name>
</person>
<person>
<full_name>Stale Refsnes</full_name>
</person>
</persons>
The XML file above contains a root element named "persons". Inside this root element we have
defined three "person" elements. Each "person" element must contain a "full_name" element
and it can contain up to five "child_name" elements.
Here is the schema file "family.xsd":
<?xml version="1.0" encoding="ISO-8859-1"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
elementFormDefault="qualified">
<xs:element name="persons">
<xs:complexType>
<xs:sequence>
<xs:element name="person" maxOccurs="unbounded">
<xs:complexType>
<xs:sequence>
<xs:element name="full_name" type="xs:string"/>
<xs:element name="child_name" type="xs:string"
minOccurs="0" maxOccurs="5"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
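For contrast, here is a sketch of two "person" elements that a validating parser would be expected to reject against "family.xsd". The names are made up for illustration:

```xml
<person>
  <!-- rejected: full_name is required (minOccurs defaults to 1) -->
  <child_name>Cecilie</child_name>
</person>
<person>
  <full_name>Tove Refsnes</full_name>
  <!-- rejected: six child_name elements exceed maxOccurs="5" -->
  <child_name>Hege</child_name>
  <child_name>Stale</child_name>
  <child_name>Jim</child_name>
  <child_name>Borge</child_name>
  <child_name>Kai</child_name>
  <child_name>Ola</child_name>
</person>
```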
Group Indicators
Group indicators are used to define related sets of elements.
Element Groups
Element groups are defined with the group declaration, like this:
<xs:group name="groupname">
...
</xs:group>
You must define an all, choice, or sequence element inside the group declaration. The following
example defines a group named "persongroup", that defines a group of elements that must
occur in an exact sequence:
<xs:group name="persongroup">
<xs:sequence>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
<xs:element name="birthday" type="xs:date"/>
</xs:sequence>
</xs:group>
After you have defined a group, you can reference it in another definition, like this:
<xs:group name="persongroup">
<xs:sequence>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
<xs:element name="birthday" type="xs:date"/>
</xs:sequence>
</xs:group>
<xs:element name="person" type="personinfo"/>
<xs:complexType name="personinfo">
<xs:sequence>
<xs:group ref="persongroup"/>
<xs:element name="country" type="xs:string"/>
</xs:sequence>
</xs:complexType>
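An instance document that would satisfy the "personinfo" type might look like the sketch below. The values are illustrative only. The three elements from "persongroup" come first, in the order the group defines, followed by "country":

```xml
<person>
  <firstname>Hege</firstname>
  <lastname>Refsnes</lastname>
  <birthday>1970-03-27</birthday>
  <country>Norway</country>
</person>
```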
Attribute Groups
Attribute groups are defined with the attributeGroup declaration, like this:
<xs:attributeGroup name="groupname">
...
</xs:attributeGroup>
The following example defines an attribute group named "personattrgroup":
<xs:attributeGroup name="personattrgroup">
<xs:attribute name="firstname" type="xs:string"/>
<xs:attribute name="lastname" type="xs:string"/>
<xs:attribute name="birthday" type="xs:date"/>
</xs:attributeGroup>
After you have defined an attribute group, you can reference it in another definition, like this:
<xs:attributeGroup name="personattrgroup">
<xs:attribute name="firstname" type="xs:string"/>
<xs:attribute name="lastname" type="xs:string"/>
<xs:attribute name="birthday" type="xs:date"/>
</xs:attributeGroup>
<xs:element name="person">
<xs:complexType>
<xs:attributeGroup ref="personattrgroup"/>
</xs:complexType>
</xs:element>
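A matching instance could look like the sketch below (values are illustrative). Because the attribute declarations do not say use="required", each of the three attributes may also be left out:

```xml
<person firstname="Hege" lastname="Refsnes" birthday="1970-03-27"/>
```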
XSD The <any> Element
The <any> element enables us to extend the XML document with elements not specified by the
schema!
The following example is a fragment from an XML schema called "family.xsd". It shows a
declaration for the "person" element. By using the <any> element we can extend (after
<lastname>) the content of "person" with any element:
<xs:element name="person">
<xs:complexType>
<xs:sequence>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
<xs:any minOccurs="0"/>
</xs:sequence>
</xs:complexType>
</xs:element>
Now we want to extend the "person" element with a "children" element. In this case we can do
so, even if the author of the schema above never declared any "children" element.
Look at this schema file, called "children.xsd":
<?xml version="1.0" encoding="ISO-8859-1"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://www.w3schools.com"
xmlns="http://www.w3schools.com"
elementFormDefault="qualified">
<xs:element name="children">
<xs:complexType>
<xs:sequence>
<xs:element name="childname" type="xs:string"
maxOccurs="unbounded"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
The XML file below (called "Myfamily.xml"), uses components from two different schemas;
"family.xsd" and "children.xsd":
<?xml version="1.0" encoding="ISO-8859-1"?>
<persons xmlns="http://www.microsoft.com"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.microsoft.com family.xsd
http://www.w3schools.com children.xsd">
<person>
<firstname>Hege</firstname>
<lastname>Refsnes</lastname>
<children>
<childname>Cecilie</childname>
</children>
</person>
<person>
<firstname>Stale</firstname>
<lastname>Refsnes</lastname>
</person>
</persons>
The XML file above is valid because the schema "family.xsd" allows us to extend the "person"
element with an optional element after the "lastname" element.
The <any> and <anyAttribute> elements are used to make EXTENSIBLE documents! They allow
documents to contain additional elements that are not declared in the main XML schema.
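The <any> wildcard also accepts the namespace and processContents attributes, which narrow what may appear and how strictly it is checked. A minimal sketch, assuming we want to allow any number of extra elements from other namespaces and validate them only when a declaration is available:

```xml
<xs:element name="person">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="firstname" type="xs:string"/>
      <xs:element name="lastname" type="xs:string"/>
      <!-- namespace="##other": only elements outside the target namespace;
           processContents="lax": validate only if a declaration is found -->
      <xs:any namespace="##other" processContents="lax"
              minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
```

When these attributes are omitted, the defaults are namespace="##any" and processContents="strict".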
XSD The <anyAttribute> Element
The <anyAttribute> element enables us to extend the XML document with attributes not
specified by the schema!
The following example is a fragment from an XML schema called "family.xsd". It shows a
declaration for the "person" element. By using the <anyAttribute> element we can add any
number of attributes to the "person" element:
<xs:element name="person">
<xs:complexType>
<xs:sequence>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
</xs:sequence>
<xs:anyAttribute/>
</xs:complexType>
</xs:element>
Now we want to extend the "person" element with a "gender" attribute. In this case we can do
so, even if the author of the schema above never declared any "gender" attribute.
Look at this schema file, called "attribute.xsd":
<?xml version="1.0" encoding="ISO-8859-1"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://www.w3schools.com"
xmlns="http://www.w3schools.com"
elementFormDefault="qualified">
<xs:attribute name="gender">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:pattern value="male|female"/>
</xs:restriction>
</xs:simpleType>
</xs:attribute>
</xs:schema>
The XML file below (called "Myfamily.xml"), uses components from two different schemas;
"family.xsd" and "attribute.xsd":
<?xml version="1.0" encoding="ISO-8859-1"?>
<persons xmlns="http://www.microsoft.com"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.microsoft.com family.xsd
http://www.w3schools.com attribute.xsd">
<person gender="female">
<firstname>Hege</firstname>
<lastname>Refsnes</lastname>
</person>
<person gender="male">
<firstname>Stale</firstname>
<lastname>Refsnes</lastname>
</person>
</persons>
The XML file above is valid because the schema "family.xsd" allows us to add an attribute to the
"person" element.
The <any> and <anyAttribute> elements are used to make EXTENSIBLE documents! They allow
documents to contain additional elements that are not declared in the main XML schema.
XSD Element Substitution
With XML Schemas, one element can substitute another element.
Element Substitution
Let's say that we have users from two different countries: England and Norway. We would like
the ability to let the user choose whether he or she would like to use the Norwegian element
names or the English element names in the XML document.
To solve this problem, we could define a substitutionGroup in the XML schema. First, we
declare a head element and then we declare the other elements which state that they are
substitutable for the head element.
<xs:element name="name" type="xs:string"/>
<xs:element name="navn" substitutionGroup="name"/>
In the example above, the "name" element is the head element and the "navn" element is
substitutable for "name".
Look at this fragment of an XML schema:
<xs:element name="name" type="xs:string"/>
<xs:element name="navn" substitutionGroup="name"/>
<xs:complexType name="custinfo">
<xs:sequence>
<xs:element ref="name"/>
</xs:sequence>
</xs:complexType>
<xs:element name="customer" type="custinfo"/>
<xs:element name="kunde" substitutionGroup="customer"/>
A valid XML document (according to the schema above) could look like this:
<customer>
<name>John Smith</name>
</customer>
or like this:
<kunde>
<navn>John Smith</navn>
</kunde>
Blocking Element Substitution
To prevent other elements from substituting for a specified element, use the block attribute:
<xs:element name="name" type="xs:string" block="substitution"/>
Look at this fragment of an XML schema:
<xs:element name="name" type="xs:string" block="substitution"/>
<xs:element name="navn" substitutionGroup="name"/>
<xs:complexType name="custinfo">
<xs:sequence>
<xs:element ref="name"/>
</xs:sequence>
</xs:complexType>
<xs:element name="customer" type="custinfo" block="substitution"/>
<xs:element name="kunde" substitutionGroup="customer"/>
A valid XML document (according to the schema above) looks like this:
<customer>
<name>John Smith</name>
</customer>
BUT THIS IS NO LONGER VALID:
<kunde>
<navn>John Smith</navn>
</kunde>
Using substitutionGroup
The type of the substitutable elements must be the same as, or derived from, the type of the
head element. If the type of the substitutable element is the same as the type of the head
element you will not have to specify the type of the substitutable element.
Note that all elements in the substitutionGroup (the head element and the substitutable
elements) must be declared as global elements, otherwise it will not work!
What are Global Elements?
Global elements are elements that are immediate children of the "schema" element! Local
elements are elements nested within other elements.
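The difference can be sketched as follows. The "phone" element is hypothetical and added only to show a local declaration:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- global: an immediate child of xs:schema, so it can be a head element
       or be referenced elsewhere with ref -->
  <xs:element name="name" type="xs:string"/>
  <xs:element name="customer">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="name"/>
        <!-- local: declared inside the complex type, so it cannot take part
             in a substitutionGroup -->
        <xs:element name="phone" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```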
An XSD Example
This chapter will demonstrate how to write an XML Schema. You will also learn that a schema
can be written in different ways.
An XML Document
Let's have a look at this XML document called "shiporder.xml":
<?xml version="1.0" encoding="ISO-8859-1"?>
<shiporder orderid="889923"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="shiporder.xsd">
<orderperson>John Smith</orderperson>
<shipto>
<name>Ola Nordmann</name>
<address>Langgt 23</address>
<city>4000 Stavanger</city>
<country>Norway</country>
</shipto>
<item>
<title>Empire Burlesque</title>
<note>Special Edition</note>
<quantity>1</quantity>
<price>10.90</price>
</item>
<item>
<title>Hide your heart</title>
<quantity>1</quantity>
<price>9.90</price>
</item>
</shiporder>
The XML document above consists of a root element, "shiporder", that contains a required
attribute called "orderid". The "shiporder" element contains three different child elements:
"orderperson", "shipto" and "item". The "item" element appears twice, and it contains a "title",
an optional "note" element, a "quantity", and a "price" element.
The attribute xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" declares the XML Schema
instance namespace, which makes the xsi attributes available in the document. The attribute
xsi:noNamespaceSchemaLocation="shiporder.xsd" tells a validating parser WHERE the schema
resides (here it is in the same folder as "shiporder.xml").
Create an XML Schema
Now we want to create a schema for the XML document above.
We start by opening a new file that we will call "shiporder.xsd". To create the schema we could
simply follow the structure in the XML document and define each element as we find it. We will
start with the standard XML declaration followed by the xs:schema element that defines a
schema:
<?xml version="1.0" encoding="ISO-8859-1" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
...
</xs:schema>
In the schema above we use the conventional namespace prefix (xs), which is bound to the
namespace URI of the XML Schema language, http://www.w3.org/2001/XMLSchema.
Next, we have to define the "shiporder" element. This element has an attribute and it contains
other elements, therefore we consider it as a complex type. The child elements of the
"shiporder" element are surrounded by an xs:sequence element that defines an ordered sequence
of sub elements:
<xs:element name="shiporder">
<xs:complexType>
<xs:sequence>
...
</xs:sequence>
</xs:complexType>
</xs:element>
Then we have to define the "orderperson" element as a simple type (because it does not
contain any attributes or other elements). The type (xs:string) is prefixed with the namespace
prefix associated with XML Schema that indicates a predefined schema data type:
<xs:element name="orderperson" type="xs:string"/>
Next, we have to define two elements that are of the complex type: "shipto" and "item". We
start by defining the "shipto" element:
<xs:element name="shipto">
<xs:complexType>
<xs:sequence>
<xs:element name="name" type="xs:string"/>
<xs:element name="address" type="xs:string"/>
<xs:element name="city" type="xs:string"/>
<xs:element name="country" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
With schemas we can define the number of possible occurrences for an element with the
maxOccurs and minOccurs attributes. maxOccurs specifies the maximum number of
occurrences for an element and minOccurs specifies the minimum number of occurrences for
an element. The default value for both maxOccurs and minOccurs is 1!
Now we can define the "item" element. This element can appear multiple times inside a
"shiporder" element. This is specified by setting the maxOccurs attribute of the "item" element
to "unbounded" which means that there can be as many occurrences of the "item" element as
the author wishes. Notice that the "note" element is optional. We have specified this by setting
the minOccurs attribute to zero:
<xs:element name="item" maxOccurs="unbounded">
<xs:complexType>
<xs:sequence>
<xs:element name="title" type="xs:string"/>
<xs:element name="note" type="xs:string" minOccurs="0"/>
<xs:element name="quantity" type="xs:positiveInteger"/>
<xs:element name="price" type="xs:decimal"/>
</xs:sequence>
</xs:complexType>
</xs:element>
We can now declare the attribute of the "shiporder" element. Since this is a required attribute
we specify use="required".
Note: The attribute declarations must always come last:
<xs:attribute name="orderid" type="xs:string" use="required"/>
Here is the complete listing of the schema file called "shiporder.xsd":
<?xml version="1.0" encoding="ISO-8859-1" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="shiporder">
<xs:complexType>
<xs:sequence>
<xs:element name="orderperson" type="xs:string"/>
<xs:element name="shipto">
<xs:complexType>
<xs:sequence>
<xs:element name="name" type="xs:string"/>
<xs:element name="address" type="xs:string"/>
<xs:element name="city" type="xs:string"/>
<xs:element name="country" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element name="item" maxOccurs="unbounded">
<xs:complexType>
<xs:sequence>
<xs:element name="title" type="xs:string"/>
<xs:element name="note" type="xs:string" minOccurs="0"/>
<xs:element name="quantity" type="xs:positiveInteger"/>
<xs:element name="price" type="xs:decimal"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:sequence>
<xs:attribute name="orderid" type="xs:string" use="required"/>
</xs:complexType>
</xs:element>
</xs:schema>
Divide the Schema
The previous design method is very simple, but can be difficult to read and maintain when
documents are complex.
The next design method is based on defining all elements and attributes first, and then
referring to them using the ref attribute.
Here is the new design of the schema file ("shiporder.xsd"):
<?xml version="1.0" encoding="ISO-8859-1" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<!-- definition of simple elements -->
<xs:element name="orderperson" type="xs:string"/>
<xs:element name="name" type="xs:string"/>
<xs:element name="address" type="xs:string"/>
<xs:element name="city" type="xs:string"/>
<xs:element name="country" type="xs:string"/>
<xs:element name="title" type="xs:string"/>
<xs:element name="note" type="xs:string"/>
<xs:element name="quantity" type="xs:positiveInteger"/>
<xs:element name="price" type="xs:decimal"/>
<!-- definition of attributes -->
<xs:attribute name="orderid" type="xs:string"/>
<!-- definition of complex elements -->
<xs:element name="shipto">
<xs:complexType>
<xs:sequence>
<xs:element ref="name"/>
<xs:element ref="address"/>
<xs:element ref="city"/>
<xs:element ref="country"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element name="item">
<xs:complexType>
<xs:sequence>
<xs:element ref="title"/>
<xs:element ref="note" minOccurs="0"/>
<xs:element ref="quantity"/>
<xs:element ref="price"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element name="shiporder">
<xs:complexType>
<xs:sequence>
<xs:element ref="orderperson"/>
<xs:element ref="shipto"/>
<xs:element ref="item" maxOccurs="unbounded"/>
</xs:sequence>
<xs:attribute ref="orderid" use="required"/>
</xs:complexType>
</xs:element>
</xs:schema>
Using Named Types
The third design method defines classes or types, which enable us to reuse element definitions.
This is done by naming the simpleType and complexType elements, and then pointing to them
through the type attribute of the element.
Here is the third design of the schema file ("shiporder.xsd"):
<?xml version="1.0" encoding="ISO-8859-1" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:simpleType name="stringtype">
<xs:restriction base="xs:string"/>
</xs:simpleType>
<xs:simpleType name="inttype">
<xs:restriction base="xs:positiveInteger"/>
</xs:simpleType>
<xs:simpleType name="dectype">
<xs:restriction base="xs:decimal"/>
</xs:simpleType>
<xs:simpleType name="orderidtype">
<xs:restriction base="xs:string">
<xs:pattern value="[0-9]{6}"/>
</xs:restriction>
</xs:simpleType>
<xs:complexType name="shiptotype">
<xs:sequence>
<xs:element name="name" type="stringtype"/>
<xs:element name="address" type="stringtype"/>
<xs:element name="city" type="stringtype"/>
<xs:element name="country" type="stringtype"/>
</xs:sequence>
</xs:complexType>
<xs:complexType name="itemtype">
<xs:sequence>
<xs:element name="title" type="stringtype"/>
<xs:element name="note" type="stringtype" minOccurs="0"/>
<xs:element name="quantity" type="inttype"/>
<xs:element name="price" type="dectype"/>
</xs:sequence>
</xs:complexType>
<xs:complexType name="shipordertype">
<xs:sequence>
<xs:element name="orderperson" type="stringtype"/>
<xs:element name="shipto" type="shiptotype"/>
<xs:element name="item" maxOccurs="unbounded" type="itemtype"/>
</xs:sequence>
<xs:attribute name="orderid" type="orderidtype" use="required"/>
</xs:complexType>
<xs:element name="shiporder" type="shipordertype"/>
</xs:schema>
The restriction element indicates that the datatype is derived from a W3C XML Schema
namespace datatype. So, the following fragment means that the value of the element or
attribute must be a string value:
<xs:restriction base="xs:string">
The restriction element is more often used to apply restrictions to elements. Look at the
following lines from the schema above:
<xs:simpleType name="orderidtype">
<xs:restriction base="xs:string">
<xs:pattern value="[0-9]{6}"/>
</xs:restriction>
</xs:simpleType>
This indicates that the value of the element or attribute must be a string, it must be exactly six
characters long, and each of those characters must be a digit from 0 to 9.
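To illustrate, the sketch below shows how "orderid" values would be expected to fare against "orderidtype". A schema pattern must match the whole value, so shorter or longer strings fail. The child elements are elided with "...", and only the first value comes from the original document:

```xml
<!-- accepted: exactly six digits -->
<shiporder orderid="889923"> ... </shiporder>
<!-- rejected: only five digits -->
<shiporder orderid="88992"> ... </shiporder>
<!-- rejected: contains a letter -->
<shiporder orderid="88992A"> ... </shiporder>
```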
XSD String Data Types
String data types are used for values that contain character strings.
String Data Type
The string data type can contain characters, line feeds, carriage returns, and tab characters.
The following is an example of a string declaration in a schema:
<xs:element name="customer" type="xs:string"/>
An element in your document might look like this:
<customer>John Smith</customer>
Or it might look like this:
<customer> John Smith </customer>
Note: The XML processor will not modify the value if you use the string data type.
NormalizedString Data Type
The normalizedString data type is derived from the String data type.
The normalizedString data type also contains characters, but the XML processor will replace
line feeds, carriage returns, and tab characters with spaces.
The following is an example of a normalizedString declaration in a schema:
<xs:element name="customer" type="xs:normalizedString"/>
An element in your document might look like this:
<customer>John Smith</customer>
Or it might look like this:
<customer> John Smith </customer>
Note: In the example above the XML processor will replace the tabs with spaces.
Token Data Type
The token data type is also derived from the String data type.
The token data type also contains characters, but the XML processor will replace line feeds,
carriage returns, and tabs with spaces, remove leading and trailing spaces, and collapse
multiple spaces into a single space.
The following is an example of a token declaration in a schema:
<xs:element name="customer" type="xs:token"/>
An element in your document might look like this:
<customer>John Smith</customer>
Or it might look like this:
<customer> John Smith </customer>
Note: In the example above the XML processor will remove the tabs.
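The three behaviours described above correspond to the three values of the whiteSpace restriction. A minimal sketch, in which the type names are hypothetical:

```xml
<!-- behaves like xs:string: whitespace is kept as written -->
<xs:simpleType name="keepspace">
  <xs:restriction base="xs:string">
    <xs:whiteSpace value="preserve"/>
  </xs:restriction>
</xs:simpleType>

<!-- behaves like xs:normalizedString: line feeds, carriage returns,
     and tabs become spaces -->
<xs:simpleType name="replacespace">
  <xs:restriction base="xs:string">
    <xs:whiteSpace value="replace"/>
  </xs:restriction>
</xs:simpleType>

<!-- behaves like xs:token: as "replace", then leading and trailing spaces
     are removed and runs of spaces are reduced to one -->
<xs:simpleType name="collapsespace">
  <xs:restriction base="xs:string">
    <xs:whiteSpace value="collapse"/>
  </xs:restriction>
</xs:simpleType>
```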
String Data Types
Note that all of the data types below derive from the String data type (except for string itself
and QName, which is a primitive type in its own right)!
Name              Description
ENTITIES
ENTITY
ID                A string that represents the ID attribute in XML (only used with schema attributes)
IDREF             A string that represents the IDREF attribute in XML (only used with schema attributes)
IDREFS
language          A string that contains a valid language id
Name              A string that contains a valid XML name
NCName
NMTOKEN           A string that represents the NMTOKEN attribute in XML (only used with schema attributes)
NMTOKENS
normalizedString  A string that does not contain line feeds, carriage returns, or tabs
QName
string            A string
token             A string that does not contain line feeds, carriage returns, tabs, leading or trailing spaces, or multiple spaces
Restrictions on String Data Types
Restrictions that can be used with String data types:
• enumeration
• length
• maxLength
• minLength
• pattern (NMTOKENS, IDREFS, and ENTITIES cannot use this constraint)
• whiteSpace
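As a sketch of how these restrictions are written, the two hypothetical types below limit a string by length and by a fixed list of values. The type names and values are assumptions made for illustration:

```xml
<!-- between 3 and 12 characters long -->
<xs:simpleType name="usernametype">
  <xs:restriction base="xs:string">
    <xs:minLength value="3"/>
    <xs:maxLength value="12"/>
  </xs:restriction>
</xs:simpleType>

<!-- exactly one of the listed values -->
<xs:simpleType name="sizetype">
  <xs:restriction base="xs:token">
    <xs:enumeration value="small"/>
    <xs:enumeration value="medium"/>
    <xs:enumeration value="large"/>
  </xs:restriction>
</xs:simpleType>
```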
XSD Date and Time Data Types
Date and time data types are used for values that contain date and time.
Date Data Type
The date data type is used to specify a date.
The date is specified in the following form "YYYY-MM-DD" where:
• YYYY indicates the year
• MM indicates the month
• DD indicates the day
Note: All components are required!
The following is an example of a date declaration in a schema:
<xs:element name="start" type="xs:date"/>
An element in your document might look like this:
<start>2002-09-24</start>
Time Zones
To specify a time zone, you can either enter a date in UTC time by adding a "Z" behind the date
- like this:
<start>2002-09-24Z</start>
or you can specify an offset from the UTC time by adding a positive or negative time behind the
date - like this:
<start>2002-09-24-06:00</start>
or
<start>2002-09-24+06:00</start>
Time Data Type
The time data type is used to specify a time.
The time is specified in the following form "hh:mm:ss" where:
• hh indicates the hour
• mm indicates the minute
• ss indicates the second
Note: All components are required!
The following is an example of a time declaration in a schema:
<xs:element name="start" type="xs:time"/>
An element in your document might look like this:
<start>09:00:00</start>
Or it might look like this:
<start>09:30:10.5</start>
Time Zones
To specify a time zone, you can either enter a time in UTC time by adding a "Z" behind the time
- like this:
<start>09:30:10Z</start>
or you can specify an offset from the UTC time by adding a positive or negative time behind the
time - like this:
<start>09:30:10-06:00</start>
or
<start>09:30:10+06:00</start>
DateTime Data Type
The dateTime data type is used to specify a date and a time.
The dateTime is specified in the following form "YYYY-MM-DDThh:mm:ss" where:
• YYYY indicates the year
• MM indicates the month
• DD indicates the day
• T indicates the start of the required time section
• hh indicates the hour
• mm indicates the minute
• ss indicates the second
Note: All components are required!
The following is an example of a dateTime declaration in a schema:
<xs:element name="startdate" type="xs:dateTime"/>
An element in your document might look like this:
<startdate>2002-05-30T09:00:00</startdate>
Or it might look like this:
<startdate>2002-05-30T09:30:10.5</startdate>
Time Zones
To specify a time zone, you can either enter a dateTime in UTC time by adding a "Z" behind the
time - like this:
<startdate>2002-05-30T09:30:10Z</startdate>
or you can specify an offset from the UTC time by adding a positive or negative time behind the
time - like this:
<startdate>2002-05-30T09:30:10-06:00</startdate>
or
<startdate>2002-05-30T09:30:10+06:00</startdate>
Duration Data Type
The duration data type is used to specify a time interval.
The time interval is specified in the following form "PnYnMnDTnHnMnS" where:
• P indicates the period (required)
• nY indicates the number of years
• nM indicates the number of months
• nD indicates the number of days
• T indicates the start of a time section (required if you are going to specify hours,
minutes, or seconds)
• nH indicates the number of hours
• nM indicates the number of minutes
• nS indicates the number of seconds
The following is an example of a duration declaration in a schema:
<xs:element name="period" type="xs:duration"/>
An element in your document might look like this:
<period>P5Y</period>
The example above indicates a period of five years.
Or it might look like this:
<period>P5Y2M10D</period>
The example above indicates a period of five years, two months, and 10 days.
Or it might look like this:
<period>P5Y2M10DT15H</period>
The example above indicates a period of five years, two months, 10 days, and 15 hours.
Or it might look like this:
<period>PT15H</period>
The example above indicates a period of 15 hours.
Negative Duration
To specify a negative duration, enter a minus sign before the P:
<period>-P10D</period>
The example above indicates a period of minus 10 days.
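Because "M" stands for both months and minutes, the position of the "T" decides which one is meant. A few further illustrative values:

```xml
<period>P1M</period>             <!-- one month -->
<period>PT1M</period>            <!-- one minute -->
<period>P1Y2M3DT4H5M6S</period>  <!-- 1 year, 2 months, 3 days, 4 hours, 5 minutes, 6 seconds -->
<period>PT0.5S</period>          <!-- half a second -->
```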
Date and Time Data Types
Name        Description
date        Defines a date value
dateTime    Defines a date and time value
duration    Defines a time interval
gDay        Defines a part of a date - the day (DD)
gMonth      Defines a part of a date - the month (MM)
gMonthDay   Defines a part of a date - the month and day (MM-DD)
gYear       Defines a part of a date - the year (YYYY)
gYearMonth  Defines a part of a date - the year and month (YYYY-MM)
time        Defines a time value
Restrictions on Date Data Types
Restrictions that can be used with Date data types:
• enumeration
• maxExclusive
• maxInclusive
• minExclusive
• minInclusive
• pattern
• whiteSpace
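A minimal sketch of a range restriction on a date, assuming we want the "start" element to accept only dates in the year 2002 (the bounds are chosen for illustration):

```xml
<xs:element name="start">
  <xs:simpleType>
    <xs:restriction base="xs:date">
      <!-- both bounds are themselves allowed values -->
      <xs:minInclusive value="2002-01-01"/>
      <xs:maxInclusive value="2002-12-31"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>
```

With this declaration, <start>2002-09-24</start> would be accepted and <start>2003-01-01</start> would be rejected.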
XSD Numeric Data Types
Decimal data types are used for numeric values.
Decimal Data Type
The decimal data type is used to specify a numeric value.
The following is an example of a decimal declaration in a schema:
<xs:element name="prize" type="xs:decimal"/>
An element in your document might look like this:
<prize>999.50</prize>
Or it might look like this:
<prize>+999.5450</prize>
Or it might look like this:
<prize>-999.5230</prize>
Or it might look like this:
<prize>0</prize>
Or it might look like this:
<prize>14</prize>
Note: Schema processors are required to support at least 18 decimal digits, so values with more
digits may not be handled the same way everywhere.
Integer Data Type
The integer data type is used to specify a numeric value without a fractional component.
The following is an example of an integer declaration in a schema:
<xs:element name="prize" type="xs:integer"/>
An element in your document might look like this:
<prize>999</prize>
Or it might look like this:
<prize>+999</prize>
Or it might look like this:
<prize>-999</prize>
Or it might look like this:
<prize>0</prize>
Numeric Data Types
Note that all of the data types below derive from the Decimal data type (except for decimal
itself)!
Name                Description
byte                A signed 8-bit integer
decimal             A decimal value
int                 A signed 32-bit integer
integer             An integer value
long                A signed 64-bit integer
negativeInteger     An integer containing only negative values (..,-2,-1)
nonNegativeInteger  An integer containing only non-negative values (0,1,2,..)
nonPositiveInteger  An integer containing only non-positive values (..,-2,-1,0)
positiveInteger     An integer containing only positive values (1,2,..)
short               A signed 16-bit integer
unsignedLong        An unsigned 64-bit integer
unsignedInt         An unsigned 32-bit integer
unsignedShort       An unsigned 16-bit integer
unsignedByte        An unsigned 8-bit integer
Restrictions on Numeric Data Types
Restrictions that can be used with Numeric data types:
• enumeration
• fractionDigits
• maxExclusive
• maxInclusive
• minExclusive
• minInclusive
• pattern
• totalDigits
• whiteSpace
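To see two of these restrictions at work, the sketch below validates a prize element with the JDK's javax.xml.validation API. The schema is a hypothetical one written for this illustration (a non-negative decimal with at most two fraction digits), and the class name PrizeFacetDemo is likewise an assumption.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class PrizeFacetDemo {
    // Hypothetical schema: prize restricts xs:decimal with minInclusive and fractionDigits.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='prize'>"
      + "<xs:simpleType>"
      + "<xs:restriction base='xs:decimal'>"
      + "<xs:minInclusive value='0'/>"
      + "<xs:fractionDigits value='2'/>"
      + "</xs:restriction>"
      + "</xs:simpleType>"
      + "</xs:element>"
      + "</xs:schema>";

    // Returns true when <prize>value</prize> validates against the schema above.
    static boolean isValid(String value) {
        try {
            Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(XSD)));
            schema.newValidator().validate(
                    new StreamSource(new StringReader("<prize>" + value + "</prize>")));
            return true;
        } catch (SAXException e) {
            return false;   // the value violates the type or one of its restrictions
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("999.50"));   // true
        System.out.println(isValid("-999.52"));  // false: below minInclusive
        System.out.println(isValid("9.999"));    // false: three fraction digits
    }
}
```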
XSD Miscellaneous Data Types
Other miscellaneous data types are boolean, base64Binary, hexBinary, float, double, anyURI,
QName, and NOTATION.
Boolean Data Type
The boolean data type is used to specify a true or false value.
The following is an example of a boolean declaration in a schema:
<xs:attribute name="disabled" type="xs:boolean"/>
An element in your document might look like this:
<prize disabled="true">999</prize>
Note: Legal values for boolean are true, false, 1 (which indicates true), and 0 (which indicates
false).
Binary Data Types
Binary data types are used to express binary‐formatted data.
We have two binary data types:
• base64Binary (Base64‐encoded binary data)
• hexBinary (hexadecimal‐encoded binary data)
The following is an example of a hexBinary declaration in a schema:
<xs:element name="blobsrc" type="xs:hexBinary"/>
AnyURI Data Type
The anyURI data type is used to specify a URI.
The following is an example of an anyURI declaration in a schema:
<xs:attribute name="src" type="xs:anyURI"/>
An element in your document might look like this:
<pic src="http://www.w3schools.com/images/smiley.gif" />
Note: If a URI has spaces, replace them with %20.
Miscellaneous Data Types
Name Description
anyURI
base64Binary
boolean
double
float
hexBinary
NOTATION
QName
Restrictions on Miscellaneous Data Types
Restrictions that can be used with the other data types:
• enumeration (a Boolean data type cannot use this constraint)
• length (a Boolean data type cannot use this constraint)
• maxLength (a Boolean data type cannot use this constraint)
• minLength (a Boolean data type cannot use this constraint)
• pattern
• whiteSpace
XML Editors
If you are serious about XML, you will benefit from using a professional XML Editor.
XML is Text‐based
XML is a text‐based markup language.
One great thing about XML is that XML files can be created and edited using a simple text‐
editor like Notepad.
However, when you start working with XML, you will soon find that it is better to edit XML
documents using a professional XML editor.
Why Not Notepad?
Many web developers use Notepad to edit both HTML and XML documents because Notepad is
included with the most common OS and it is simple to use. Personally I often use Notepad for
quick editing of simple HTML, CSS, and XML files.
But, if you use Notepad for XML editing, you will soon run into problems.
Notepad does not know that you are writing XML, so it will not be able to assist you.
Why an XML Editor?
Today XML is an important technology, and development projects use XML‐based technologies
like:
• XML Schema to define XML structures and data types
• XSLT to transform XML data
• SOAP to exchange XML data between applications
• WSDL to describe web services
• RDF to describe web resources
• XPath and XQuery to access XML data
• SMIL to describe multimedia presentations
To be able to write error‐free XML documents, you will need an intelligent XML editor!
XML Editors
Professional XML editors will help you to write error‐free XML documents, validate your XML
against a DTD or a schema, and force you to stick to a valid XML structure.
An XML editor should be able to:
• Add closing tags to your opening tags automatically
• Force you to write valid XML
• Verify your XML against a DTD
• Verify your XML against a Schema
• Color code your XML syntax
XML Processing
• SAX (Simple API for XML). Low‐level approach viewing an XML document as a sequence
of tags to which actions are assigned.
<svg ...> <circle .../> <rect .../> <rect .../> </svg>
• DOM (Document Object Model) Views a document as a hierarchy of elements.
                <svg ...> </svg>
                       |
       +---------------+-------------+
       |               |             |
<circle .../>     <rect .../>   <rect .../>
• XSLT (Extensible Stylesheet Language Transformation) Provides a template‐oriented
instead of procedural‐oriented approach
• JDOM (Java Document Object Model): a variant of DOM streamlined for Java.
• JAXB (Java Architecture for XML Binding): maps XML documents to and from Java classes
<svg xmlns="http://www.w3.org/2000/svg" >
<circle cx="50" cy="50" r="40" fill="yellow"/>
<rect x="50" y="50" width="40" height="40" fill="green"/>
<rect x="10" y="10" width="40" height="40" fill="green"/>
</svg>
<..svg into java graphics..>
// Color.getColor() and Integer.getInteger() look up system properties, so they
// return null here; map the fill names directly and parse numbers with parseInt.
String fill = atts.getValue("fill");
g.setColor( "yellow".equals(fill) ? Color.YELLOW
          : "green".equals(fill)  ? Color.GREEN : Color.BLACK );
if( QName.equals( "rect" ) ){
g.fillRect(
Integer.parseInt( atts.getValue( "x" )),
Integer.parseInt( atts.getValue( "y" )),
Integer.parseInt( atts.getValue("width" )),
Integer.parseInt( atts.getValue("height"))
);
} else if( QName.equals( "circle" ) ){
int r = Integer.parseInt( atts.getValue("r"));
// (cx - r, cy - r) is the top-left corner of the circle's bounding box
g.fillOval(
Integer.parseInt( atts.getValue( "cx" )) - r,
Integer.parseInt( atts.getValue( "cy" )) - r,
2*r, 2*r
);
}
‐_‐_‐
<..MySVGBrowser.java..>
import javax.xml.parsers.*;
import org.xml.sax.XMLReader;
import java.io.File;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;
import java.awt.*;
import javax.swing.*;
class MySVGBrowser {
static public void main(String[] args) {
new WebPage( args[0]);
} }
class MyContentHandler extends DefaultHandler {
Graphics g;
MyContentHandler(Graphics g){ this.g = g; }
public void startElement(String namespace, String localName,
String QName, Attributes atts) {
<.svg into java graphics.>
}
}
class WebPage extends JFrame {
String fileName;
WebPage ( String fileName) {
this.fileName = fileName;
setSize(200,200); setVisible(true);
}
public void paint(Graphics g) {
try{
SAXParserFactory factory = SAXParserFactory.newInstance();
factory.setNamespaceAware( true );
SAXParser saxParser = factory.newSAXParser();
XMLReader xmlReader = saxParser.getXMLReader();
xmlReader.setContentHandler( new MyContentHandler(g) );
// File.toURL() is deprecated; toURI() yields a properly escaped system identifier
xmlReader.parse( new File(fileName).toURI().toString() );
} catch( Exception e ){
e.printStackTrace(); // report parse and I/O problems instead of hiding them
}
}
}
‐_‐_‐
The painting component is similar to that in the following program.
<..JavaPaint.java..>
import java.awt.*;
import javax.swing.*;
class JavaPaint {
public static void main(String args[]) {
new Pic();
}
}
class Pic extends JFrame {
Pic() { setSize(200,200); setVisible(true); }
public void paint(Graphics g) {
g.fillRect(0, 0, 50, 60);
}
}
Specifications
• Entities and Unicode‐‐data representation
• XML namespaces‐‐mixed vocabularies
• DTDs, XML schemas, RELAX NG‐‐structural constraints and data types
• XLinks, XPointers, XPath‐‐Linking and addressing
• CSS, XSL‐FO‐‐Presentation of XML
• Web Accessibility
<or:oranges xmlns:or="fruit://osu/or"
xmlns:ap="fruit://osu/ap" >
<ap:apples count="3" >...</ap:apples>
<cocktail or:count="5" ap:count="2" >...</cocktail>
</or:oranges>
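How the prefixes in the mixed-vocabulary example resolve can be inspected with a namespace-aware DOM parse. This is a sketch under the assumption that the document is held in a string; the class and method names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class MixedVocabularyDemo {
    static final String XML =
        "<or:oranges xmlns:or=\"fruit://osu/or\" xmlns:ap=\"fruit://osu/ap\">"
      + "<ap:apples count=\"3\">...</ap:apples>"
      + "<cocktail or:count=\"5\" ap:count=\"2\">...</cocktail>"
      + "</or:oranges>";

    static Document parse() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);   // without this, prefixes are not resolved
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(XML)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Looks up the attribute {ns}local on the first element with the given qualified name.
    static String attribute(String qName, String ns, String local) {
        Element element = (Element) parse().getElementsByTagName(qName).item(0);
        return element.getAttributeNS(ns, local);
    }

    public static void main(String[] args) {
        // The two count attributes on cocktail are distinct because their namespaces differ.
        System.out.println(attribute("cocktail", "fruit://osu/or", "count"));  // 5
        System.out.println(attribute("cocktail", "fruit://osu/ap", "count"));  // 2
        // An unprefixed attribute is in no namespace, even on a prefixed element.
        System.out.println(attribute("ap:apples", null, "count"));             // 3
    }
}
```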
How SAX processing works
SAX analyzes an XML stream as it goes by, much like an old tickertape. Consider the
following XML code snippet:
<?xml version="1.0"?>
<samples>
<server>UNIX</server>
<monitor>color</monitor>
</samples>
A SAX processor analyzing this code snippet would generate, in general, the following
events:
Start document
Start element (samples)
Characters (white space)
Start element (server)
Characters (UNIX)
End element (server)
Characters (white space)
Start element (monitor)
Characters (color)
End element (monitor)
Characters (white space)
End element (samples)
The SAX API allows a developer to capture these events and act on them.
SAX processing involves the following steps:
* Create an event handler.
* Create the SAX parser.
* Assign the event handler to the parser.
* Parse the document, sending each event to the handler.
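The four steps can be sketched with the JDK's built-in SAX parser. The class names SaxEventsDemo and Recorder are hypothetical, and the samples document is embedded as a string for convenience.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxEventsDemo {
    static final String XML = "<?xml version=\"1.0\"?>\n<samples>\n"
        + "<server>UNIX</server>\n<monitor>color</monitor>\n</samples>";

    // Step 1: an event handler that records every callback as one line of text.
    static class Recorder extends DefaultHandler {
        final List<String> events = new ArrayList<>();
        @Override public void startDocument() { events.add("Start document"); }
        @Override public void endDocument() { events.add("End document"); }
        @Override public void startElement(String uri, String local, String qName, Attributes atts) {
            events.add("Start element (" + qName + ")");
        }
        @Override public void endElement(String uri, String local, String qName) {
            events.add("End element (" + qName + ")");
        }
        @Override public void characters(char[] ch, int start, int length) {
            String text = new String(ch, start, length);
            events.add("Characters (" + (text.trim().isEmpty() ? "white space" : text) + ")");
        }
    }

    static List<String> events() {
        try {
            Recorder handler = new Recorder();
            // Steps 2 to 4: create the parser, hand it the handler, and parse.
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(XML)), handler);
            return handler.events;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        events().forEach(System.out::println);
    }
}
```

The printed trace follows the event list above and ends with an End document event. A parser is free to deliver character data in more than one characters() call, so a handler should not rely on receiving each text run in one piece.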
The pros and cons of event‐based processing
The advantages of this kind of processing are much like the advantages of streaming media;
analysis can get started immediately, rather than having to wait for all of the data to be
processed. Also, because the application is simply examining the data as it goes by, it
doesn't need to store it in memory. This is a huge advantage when it comes to large
documents. In general, SAX is also much faster than the alternative, the Document Object
Model.
On the other hand, because the application is not storing the data in any way, it is impossible
to make changes to it using SAX, or to move "backward" in the data stream.
Presented by developerWorks, your source for great tutorials ibm.com/developerWorks
Understanding SAX Page 4
DOM and tree‐based processing
The Document Object Model, or DOM, is the "traditional" way of handling XML data. With DOM
the data is loaded into memory in a tree‐like structure.
For instance, the same document used as an example in the preceding panel would be
represented as a tree of element nodes and text nodes. DOM uses a root node and
parent-child relationships. In this case, samples would be the root element with five
children: three text nodes (the white space), and the two element nodes, server and
monitor. One important thing to realize is that the server and monitor nodes actually
have values of null. Instead, they have text nodes for children, UNIX and color.
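The node structure can be verified with the JDK's DOM parser. This is a small sketch, assuming the samples document is embedded as a string; the class name DomTreeDemo is hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DomTreeDemo {
    static final String XML = "<?xml version=\"1.0\"?>\n<samples>\n"
        + "<server>UNIX</server>\n<monitor>color</monitor>\n</samples>";

    // Parses the document into a tree and returns its root element, samples.
    static Element root() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        NodeList children = root().getChildNodes();
        System.out.println(children.getLength());                   // 5: three text nodes, two elements
        Node server = children.item(1);
        System.out.println(server.getNodeValue());                  // null: an element has no value
        System.out.println(server.getFirstChild().getNodeValue());  // UNIX: held by the text child
    }
}
```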
Pros and cons of tree‐based processing
DOM, and by extension tree‐based processing, has several advantages. First, because the
tree is persistent in memory, it can be modified so an application can make changes to the
data and the structure. It can also work its way up and down the tree at any time, as opposed
to the "one‐shot deal" of SAX. DOM can also be much simpler to use.
On the other hand, there is a lot of overhead involved in building these trees in memory. It's
not unusual for large files to completely overrun a system's capacity. In addition, creating a
DOM tree can be a very slow process.
How to choose between SAX and DOM
Whether you choose DOM or SAX is going to depend on several factors:
* Purpose of the application: If you are going to have to make changes to the data and
output it as XML, then in most cases, DOM is the way to go. This is particularly true if
the changes are to the data itself, as opposed to a simple structural change that can be
accomplished with XSL transformations.
* Amount of data: For large files, SAX is a better bet.
* How the data will be used: If only a small amount of the data will actually be used, you
may be better off using SAX to extract it into your application. On the other hand, if you
know that you will need to refer back to information that has already been processed,
SAX is probably not the right choice.
* The need for speed: SAX implementations are normally faster than DOM
implementations.
It's important to remember that SAX and DOM are not mutually exclusive. You can use DOM
to create a SAX stream of events, and you can use SAX to create a DOM tree. In fact, most
parsers used to create DOM trees are actually using SAX to do it!
Disadvantages of SAX are:
* Keeps no record of the elements it has already processed
* Not easy to re-order elements
* Does not validate a document unless a validating parser is configured
* Cannot easily verify ID-REF links
DOM versus SAX parsing:
Practical differences are the following
1. DOM APIs map the XML document into an internal tree structure and allow you to
refer to the nodes and the elements in any way you want and as many times as you
want. This usually means less programming and planning ahead, but also worse
performance in terms of memory and CPU cycles.
2. SAX APIs, on the other hand, are event based, i.e. they traverse the XML document and
allow you to trap events as the parser passes through the document. You can trap the start
of the document, the start of an element and the start of any characters within an element.
This usually means more programming and planning on your part, but is compensated by
the fact that it takes less memory and fewer CPU cycles.
3. DOM performance may not be an issue if it is used in a batch environment, because the
performance impact will be felt once and may be negligible compared to the rest of the
batch process.
4. DOM performance may become an issue in an online transaction processing
environment, because the performance impact will be felt for each and every
transaction. It may not be negligible compared to the rest of the online processing,
since such transactions are by nature short-lived.
5. Elapsed time difference in DOM vs SAX
The test used an XML document 13 KB long with 2354 elements or tags. The message
represents accounting G/L entries sent from one banking system to another.
Environment | SAX version | DOM version
Windows 2000 running on a Pentium | 1 sec | 4 secs
IBM mainframe under CICS 1.3 | 2 secs | 10 secs
IBM mainframe under CICS 2.2 | 1 sec | 2 secs
The significant reduction under CICS 2.2 is due to the fact that the JVM is reusable and
that it uses JDK 1.3 rather than JDK 1.1.
6. Examples of the difference in coding
Sample XML Document
<?xml version="1.0"?>
<doc>
<para>3R Computer XML Help Page</para>
</doc>
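As an illustration of the coding difference, the sketch below reads the text of the para element from the sample document twice, once with DOM and once with SAX. It assumes the JDK's built-in parsers; the class and method names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class ParaReaderDemo {
    static final String XML =
        "<?xml version=\"1.0\"?>\n<doc>\n<para>3R Computer XML Help Page</para>\n</doc>";

    // DOM: build the whole tree first, then navigate to the node of interest.
    static String withDom() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            return doc.getElementsByTagName("para").item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // SAX: keep just enough state to know when the parser is inside <para>.
    static String withSax() {
        StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            boolean inPara;
            @Override public void startElement(String uri, String local, String qName, Attributes atts) {
                inPara = qName.equals("para");
            }
            @Override public void endElement(String uri, String local, String qName) {
                inPara = false;
            }
            @Override public void characters(char[] ch, int start, int length) {
                if (inPara) text.append(ch, start, length);  // text may arrive in several chunks
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(XML)), handler);
            return text.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(withDom());  // 3R Computer XML Help Page
        System.out.println(withSax());  // 3R Computer XML Help Page
    }
}
```

The DOM version is shorter because the tree does the bookkeeping; the SAX version has to track its own position in the document but never holds more than the text it needs.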
PRESENTATION TECHNOLOGIES IN XML
The Extensible Stylesheet Language Family (XSL)
XSL is a family of recommendations for defining XML document transformation and
presentation. It consists of three parts:
• XSL Transformations (XSLT): a language for transforming XML
• The XML Path Language (XPath): an expression language used by XSLT to access or refer to
parts of an XML document (XPath is also used by the XML Linking specification)
• XSL Formatting Objects (XSL-FO): an XML vocabulary for specifying formatting semantics
An XSLT stylesheet specifies the presentation of a class of XML documents by describing how an
instance of the class is transformed into an XML document that uses a formatting vocabulary,
such as (X)HTML or XSL-FO.
XSL is developed by the W3C XSL Working Group, whose charter is to develop the next version
of XSL. XSL is part of W3C's XML Activity.
Sample XSL file
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns="http://www.w3.org/TR/xhtml1/strict"
  version="1.0">
<xsl:strip-space elements="doc chapter section"/>
<xsl:output method="xml" indent="yes" encoding="iso-8859-1"/>
<xsl:template match="doc">
  <html>
    <head>
      <title>
        <xsl:value-of select="title"/>
      </title>
    </head>
    <body>
      <xsl:apply-templates/>
    </body>
  </html>
</xsl:template>
<xsl:template match="doc/title">
  <h1>
    <xsl:apply-templates/>
  </h1>
</xsl:template>
<xsl:template match="chapter/title">
  <h2>
    <xsl:apply-templates/>
  </h2>
</xsl:template>
<xsl:template match="section/title">
  <h3>
    <xsl:apply-templates/>
  </h3>
</xsl:template>
<xsl:template match="para">
  <p>
    <xsl:apply-templates/>
  </p>
</xsl:template>
<xsl:template match="note">
  <p class="note">
    <b>NOTE: </b>
    <xsl:apply-templates/>
  </p>
</xsl:template>
<xsl:template match="emph">
  <em>
    <xsl:apply-templates/>
  </em>
</xsl:template>
</xsl:stylesheet>
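A stylesheet like this can be applied from Java with the javax.xml.transform API. The sketch below uses a pared-down, hypothetical stylesheet (only the doc and para templates, without the XHTML default namespace) so that the result is easy to read; the class name XsltDemo is also an assumption.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XsltDemo {
    // Reduced stylesheet for illustration: one template per element of the sample document.
    static final String XSL =
        "<xsl:stylesheet xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\" version=\"1.0\">"
      + "<xsl:strip-space elements=\"doc\"/>"
      + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
      + "<xsl:template match=\"doc\">"
      + "<html><body><xsl:apply-templates/></body></html>"
      + "</xsl:template>"
      + "<xsl:template match=\"para\">"
      + "<p><xsl:apply-templates/></p>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    static final String XML = "<doc><para>3R Computer XML Help Page</para></doc>";

    // Compiles the stylesheet, runs it over the sample document, and returns the result text.
    static String transform() {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSL)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(XML)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform());
    }
}
```

Each template matches one kind of source element and emits the corresponding markup; xsl:apply-templates hands the children on to whichever templates match them, which is the template-oriented style described earlier.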
X FORMS
XForms is the next generation of HTML forms.
XForms uses XML to create input forms on the Web.
From XForms 1.1:
XForms is an XML application that represents the next generation of forms for the Web. XForms
is not a free‐standing document type, but is intended to be integrated into other markup
languages, such as XHTML, ODF or SVG. An XForms‐based web form gathers and processes XML
data using an architecture that separates presentation, purpose and content. The underlying
data of a form is organized into instances of data schema (though formal schema definitions are
not required). An XForm allows processing of data to occur using three mechanisms:
• a declarative model composed of formulae for data calculations and constraints, data
type and other property declarations, and data submission parameters
• a view layer composed of intent‐based user interface controls
• an imperative controller for orchestrating data manipulations, interactions between the
model and view layers, and data submissions.
Before you continue you should have a basic understanding of the following:
• HTML
• HTML Forms
• XHTML
• XML
What Is XForms?
• XForms is the next generation of HTML forms
• XForms is richer and more flexible than HTML forms
• XForms was planned as the forms standard in XHTML 2.0 (work on XHTML 2.0 was later discontinued)
• XForms is platform and device independent
• XForms separates data and logic from presentation
• XForms uses XML to define form data
• XForms stores and transports data in XML documents
• XForms contains features like calculations and validations of forms
• XForms reduces or eliminates the need for scripting
• XForms is a W3C Recommendation
XForms Is The Successor Of HTML Forms
Forms are an important part of many web applications today. An HTML form makes it possible
for web applications to accept input from a user.
Today, ten years after HTML forms became a part of the HTML standard, web users do complex
transactions that are starting to exceed the limitations of standard HTML forms.
XForms provides a richer, more secure, and device independent way of handling web input. We
should expect future web solutions to demand the use of XForms‐enabled browsers (All future
browsers should support XForms).
XForms Separate Data From Presentation
XForms uses XML for data definition and HTML or XHTML for data display. XForms separates
the data logic of a form from its presentation. This way the XForms data can be defined
independent of how the end‐user will interact with the application.
XForms Uses XML To Define Form Data
With XForms, the rules for describing and validating data are expressed in XML.
XForms Uses XML To Store And Transport Data
With XForms, the data displayed in a form is stored in an XML document, and the data
submitted from the form is transported over the internet using XML.
The data content is coded in, and transported as Unicode bytes.
XForms Is Device Independent
Separating data from presentation makes XForms device independent, because the data model
can be used for all devices. The presentation can be customized for different user interfaces,
like mobile phones, handheld devices, and Braille readers for the blind.
Since XForms is device independent and based on XML, it is also possible to add XForms
elements directly into other XML applications like VoiceXML (speaking web data), WML
(Wireless Markup Language), and SVG (Scalable Vector Graphics).
The XForms Framework
The purpose of an HTML form is to collect data. XForms has the same purpose.
With XForms, input data is described in two different parts:
• The XForm model ‐ defines what the form is, what it should do, what data it contains
• The XForm user interface ‐ defines the input fields and how they should be displayed
The XForms Model
The XForms model describes the data.
The XForms model defines a data model inside a model element:
<model>
<instance>
<person>
<fname/>
<lname/>
</person>
</instance>
<submission id="form1" action="submit.asp" method="get"/>
</model>
In the example above, the XForms model uses an instance element to define the XML‐template
for the data to be collected, and a submission element to describe how to submit the data.
Note: The XForms model does not say anything about the visual part of the form (the user
interface).
XForms Namespace
If you are missing the XForms namespace in these examples, or if you don't know what a
namespace is, it will be introduced in the next chapter.
The instance Element
The instance element defines the data to be collected.
XForms is always collecting data for an XML document. The instance element in the XForms
model defines the XML document.
In the example above the "data instance" (the XML document) the form is collecting data for
looks like this:
<person>
<fname/>
<lname/>
</person>
After collecting the data, the XML document might look like this:
<person>
<fname>John</fname>
<lname>Smith</lname>
</person>
The submission Element
The submission element describes how to submit the data.
The submission element defines a form and how it should be submitted.
In the example above, the id="form1" identifies a form, the action="submit.asp" defines the
URL to where the form should be submitted, and the method="get" attribute defines the
method to use when submitting the form data.
The XForms User Interface
The XForms user interface defines the input fields and how they should be displayed.
The user interface elements are called controls (or input controls):
<input ref="fname"><label>First Name</label></input>
<input ref="lname"><label>Last Name</label></input>
<submit submission="form1"><label>Submit</label></submit>
In the example above the two <input> elements define two input fields. The ref="fname" and
ref="lname" attributes point to the <fname> and <lname> elements in the XForms model.
The <submit> element has a submission="form1" attribute which refers to the <submission>
element in the XForms model. A submit element is usually displayed as a button.
Notice the <label> elements in the example. With XForms every input control element has a
required <label> element.
XForms Example
You can test XForms with Internet Explorer (XForms will not work in IE prior version 5).
Example
<xforms>
<model>
<instance>
<person>
<fname/>
<lname/>
</person>
</instance>
<submission id="form1" method="get"
action="submit.asp"/>
</model>
<input ref="fname">
<label>First Name</label></input><br />
<input ref="lname">
<label>Last Name</label></input><br /><br />
<submit submission="form1">
<label>Submit</label></submit>
</xforms>
HTML/XHTML Forms and XForms
Function | HTML/XHTML Forms | XForms
Validation and Calculation | Heavy reliance on scripting languages, both client-side (Javascript) and server-side | XPath, W3C XML Schema
User Feedback | Scripting languages | XML form model
Initializing Data | Server-side process to dynamically generate form | XML instance data
Host Language | HTML/XHTML | XHTML, XHTML Mobile Profile, SVG, etc.
Using XForms
• Browser
o Native
o Plugin
o Javascript
• XForms Player
• Server‐side processing to XHTML/JavaScript/Ajax
[Figures: HTML Forms; XForms]
What Are XForms?
Traditional HTML Web forms don't separate the purpose from the presentation of a form.
XForms, in contrast, are comprised of separate sections that describe what the form does, and
how the form looks. This allows for flexible presentation options, including classic XHTML
forms, to be attached to an XML form definition.
The following illustrates how a single device-independent XML form definition, called the
XForms Model, has the capability to work with a variety of standard or proprietary user
interfaces:
The XForms User Interface provides a standard set of visual controls that are targeted toward
replacing today's XHTML form controls. These form controls are directly usable inside XHTML
and other XML documents, like SVG. Other groups, such as the Voice Browser Working Group,
may also independently develop user interface components for XForms.
An important concept in XForms is that forms collect data, which is expressed as XML instance
data. Among other duties, the XForms Model describes the structure of the instance data. This
is important, since like XML, forms represent a structured interchange of data. Workflow,
auto-fill, and pre-fill form applications are supported through the use of instance data.
Finally, there needs to be a channel for instance data to flow to and from the XForms Processor.
For this, the XForms Submit Protocol defines how XForms send and receive data, including the
ability to suspend and resume the completion of a form.
The following illustration summarizes the main aspects of XForms:
Key Goals of XForms
• Support for structured form data
• Advanced forms logic without server round-tripping
• Dynamic access to server data sources during form execution
• Decoupled data, logic and presentation
• Seamless integration with other XML tag sets
• Richer user interface to meet the needs of business, consumer and device control
applications
• Support for handheld, television, and desktop browsers, plus printers and scanners
• Improved internationalization and accessibility
• Multiple forms per page, and pages per form
• Suspend and Resume capabilities
1. What is XHTML?
This section is informative.
XHTML is a family of current and future document types and modules that reproduce, subset,
and extend HTML 4 [HTML4]. XHTML family document types are XML based, and ultimately are
designed to work in conjunction with XML‐based user agents. The details of this family and its
evolution are discussed in more detail in [XHTMLMOD].
XHTML 1.0 (this specification) is the first document type in the XHTML family. It is a
reformulation of the three HTML 4 document types as applications of XML 1.0 [XML]. It is
intended to be used as a language for content that is both XML‐conforming and, if some simple
guidelines are followed, operates in HTML 4 conforming user agents. Developers who migrate
their content to XHTML 1.0 will realize the following benefits:
• XHTML documents are XML conforming. As such, they are readily viewed, edited, and
validated with standard XML tools.
• XHTML documents can be written to operate as well or better than they did before in
existing HTML 4‐conforming user agents as well as in new, XHTML 1.0 conforming user
agents.
• XHTML documents can utilize applications (e.g. scripts and applets) that rely upon either
the HTML Document Object Model or the XML Document Object Model [DOM].
• As the XHTML family evolves, documents conforming to XHTML 1.0 will be more likely
to interoperate within and among various XHTML environments.
The XHTML family is the next step in the evolution of the Internet. By migrating to XHTML
today, content developers can enter the XML world with all of its attendant benefits, while still
remaining confident in their content's backward and future compatibility.
1.1. What is HTML 4?
HTML 4 [HTML4] is an SGML (Standard Generalized Markup Language) application conforming
to International Standard ISO 8879, and is widely regarded as the standard publishing language
of the World Wide Web.
SGML is a language for describing markup languages, particularly those used in electronic
document exchange, document management, and document publishing. HTML is an example of
a language defined in SGML.
SGML has been around since the middle 1980's and has remained quite stable. Much of this
stability stems from the fact that the language is both feature‐rich and flexible. This flexibility,
however, comes at a price, and that price is a level of complexity that has inhibited its adoption
in a diversity of environments, including the World Wide Web.
HTML, as originally conceived, was to be a language for the exchange of scientific and other
technical documents, suitable for use by non‐document specialists. HTML addressed the
problem of SGML complexity by specifying a small set of structural and semantic tags suitable
for authoring relatively simple documents. In addition to simplifying the document structure,
HTML added support for hypertext. Multimedia capabilities were added later.
In a remarkably short space of time, HTML became wildly popular and rapidly outgrew its
original purpose. Since HTML's inception, there has been rapid invention of new elements for
use within HTML (as a standard) and for adapting HTML to vertical, highly specialized, markets.
This plethora of new elements has led to interoperability problems for documents across
different platforms.
1.2. What is XML?
XML™ is the shorthand name for Extensible Markup Language [XML].
XML was conceived as a means of regaining the power and flexibility of SGML without most of
its complexity. Although a restricted form of SGML, XML nonetheless preserves most of SGML's
power and richness, and yet still retains all of SGML's commonly used features.
While retaining these beneficial features, XML removes many of the more complex features of
SGML that make the authoring and design of suitable software both difficult and costly.
1.3. Why the need for XHTML?
The benefits of migrating to XHTML 1.0 are described above. Some of the benefits of migrating
to XHTML in general are:
• Document developers and user agent designers are constantly discovering new ways to
express their ideas through new markup. In XML, it is relatively easy to introduce new
elements or additional element attributes. The XHTML family is designed to
accommodate these extensions through XHTML modules and techniques for developing
new XHTML‐conforming modules (described in the XHTML Modularization
specification). These modules will permit the combination of existing and new feature
sets when developing content and when designing new user agents.
• Alternate ways of accessing the Internet are constantly being introduced. The XHTML
family is designed with general user agent interoperability in mind. Through a new user
agent and document profiling mechanism, servers, proxies, and user agents will be able
to perform best effort content transformation. Ultimately, it will be possible to develop
XHTML‐conforming content that is usable by any XHTML‐conforming user agent.
2. Definitions
This section is normative.
2.1. Terminology
The following terms are used in this specification. These terms extend the definitions in
[RFC2119] in ways based upon similar definitions in ISO/IEC 9945‐1:1990 [POSIX.1]:
May
With respect to implementations, the word "may" is to be interpreted as an optional
feature that is not required in this specification but can be provided. With respect to
Document Conformance, the word "may" means that the optional feature must not be
used. The term "optional" has the same definition as "may".
Must
In this specification, the word "must" is to be interpreted as a mandatory requirement
on the implementation or on Strictly Conforming XHTML Documents, depending upon
the context. The term "shall" has the same definition as "must".
Optional
See "May".
Reserved
A value or behavior is unspecified, but it is not allowed to be used by Conforming
Documents nor to be supported by Conforming User Agents.
Shall
See "Must".
Should
With respect to implementations, the word "should" is to be interpreted as an
implementation recommendation, but not a requirement. With respect to documents,
the word "should" is to be interpreted as recommended programming practice for
documents and a requirement for Strictly Conforming XHTML Documents.
Supported
Certain facilities in this specification are optional. If a facility is supported, it behaves as
specified by this specification.
Unspecified
When a value or behavior is unspecified, the specification defines no portability
requirements for a facility on an implementation even when faced with a document that
uses the facility. A document that requires specific behavior in such an instance, rather
than tolerating any behavior when using that facility, is not a Strictly Conforming XHTML
Document.
2.2. General Terms
Attribute
An attribute is a parameter to an element declared in the DTD. An attribute's type and
value range, including a possible default value, are defined in the DTD.
DTD
A DTD, or document type definition, is a collection of XML markup declarations that, as
a collection, defines the legal structure, elements, and attributes that are available for
use in a document that complies to the DTD.
Document
A document is a stream of data that, after being combined with any other streams it
references, is structured such that it holds information contained within elements that
are organized as defined in the associated DTD. See Document Conformance for more
information.
Element
An element is a document structuring unit declared in the DTD. The element's content
model is defined in the DTD, and additional semantics may be defined in the prose
description of the element.
Facilities
Facilities are elements, attributes, and the semantics associated with those elements
and attributes.
Implementation
See User Agent.
Parsing
Parsing is the act whereby a document is scanned, and the information contained within
the document is filtered into the context of the elements in which the information is
structured.
Rendering
Rendering is the act whereby the information in a document is presented. This
presentation is done in the form most appropriate to the environment (e.g. aurally,
visually, in print).
User Agent
A user agent is a system that processes XHTML documents in accordance with this
specification. See User Agent Conformance for more information.
Validation
Validation is a process whereby documents are verified against the associated DTD,
ensuring that the structure, use of elements, and use of attributes are consistent with
the definitions in the DTD.
Well‐formed
A document is well‐formed when it is structured according to the rules defined in
Section 2.1 of the XML 1.0 Recommendation [XML].
3. Normative Definition of XHTML 1.0
This section is normative.
3.1. Document Conformance
This version of XHTML provides a definition of strictly conforming XHTML 1.0 documents, which
are restricted to elements and attributes from the XML and XHTML 1.0 namespaces. See
Section 3.1.2 for information on using XHTML with other namespaces, for instance, to include
metadata expressed in RDF within XHTML documents.
3.1.1. Strictly Conforming Documents
A Strictly Conforming XHTML Document is an XML document that requires only the facilities
described as mandatory in this specification. Such a document must meet all of the following
criteria:
1. It must conform to the constraints expressed in one of the three DTDs found in DTDs
and in Appendix B.
2. The root element of the document must be html.
3. The root element of the document must contain an xmlns declaration for the XHTML
namespace [XMLNS]. The namespace for XHTML is defined to be
http://www.w3.org/1999/xhtml. An example root element might look like:
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
4. There must be a DOCTYPE declaration in the document prior to the root element. The
public identifier included in the DOCTYPE declaration must reference one of the three
DTDs found in DTDs using the respective Formal Public Identifier. The system identifier
may be changed to reflect local system conventions.
<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd">
5. The DTD subset must not be used to override any parameter entities in the DTD.
An XML declaration is not required in all XML documents; however XHTML document authors
are strongly encouraged to use XML declarations in all their documents. Such a declaration is
required when the character encoding of the document is other than the default UTF-8 or
UTF-16 and no encoding was determined by a higher-level protocol. Here is an example of an
XHTML document. In this example, the XML declaration is included.
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<title>Virtual Library</title>
</head>
<body>
<p>Moved to <a href="http://example.org/">example.org</a>.</p>
</body>
</html>
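The root-element, namespace and DOCTYPE criteria above can be checked mechanically. The following Python sketch is only an illustration and not a validator: it does not validate against the DTD, and the function name check_basic_criteria is an assumption made for this example.

```python
from xml.dom import minidom

XHTML_NS = "http://www.w3.org/1999/xhtml"
XHTML1_PUBLIC_IDS = {
    "-//W3C//DTD XHTML 1.0 Strict//EN",
    "-//W3C//DTD XHTML 1.0 Transitional//EN",
    "-//W3C//DTD XHTML 1.0 Frameset//EN",
}

def check_basic_criteria(source):
    """Return a list of problems; an empty list means the three checks passed.

    Hypothetical helper: it checks only the root element name, its
    namespace, and the public identifier of the DOCTYPE declaration.
    """
    doc = minidom.parseString(source)  # raises ExpatError if not well-formed
    problems = []
    root = doc.documentElement
    if root.localName != "html":
        problems.append("root element is not html")
    if root.namespaceURI != XHTML_NS:
        problems.append("root element is not in the XHTML namespace")
    if doc.doctype is None or doc.doctype.publicId not in XHTML1_PUBLIC_IDS:
        problems.append("missing or unknown XHTML 1.0 DOCTYPE")
    return problems
```

Calling check_basic_criteria on the example document above should return an empty list, while a bare <html/> with no namespace and no DOCTYPE should produce two problems.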
3.1.2. Using XHTML with other namespaces
The XHTML namespace may be used with other XML namespaces as per [XMLNS], although
such documents are not strictly conforming XHTML 1.0 documents as defined above. Work by
W3C is addressing ways to specify conformance for documents involving multiple namespaces.
For an example, see [XHTML+MathML].
The following example shows the way in which XHTML 1.0 could be used in conjunction with
the MathML Recommendation:
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<title>A Math Example</title>
</head>
<body>
<p>The following is MathML markup:</p>
<math xmlns="http://www.w3.org/1998/Math/MathML">
<apply> <log/>
<logbase>
<cn> 3 </cn>
</logbase>
<ci> x </ci>
</apply>
</math>
</body>
</html>
The following example shows the way in which XHTML 1.0 markup could be incorporated into
another XML namespace:
<?xml version="1.0" encoding="UTF-8"?>
<!-- initially, the default namespace is "books" -->
<book xmlns='urn:loc.gov:books'
xmlns:isbn='urn:ISBN:0-395-36341-6' xml:lang="en" lang="en">
<title>Cheaper by the Dozen</title>
<isbn:number>1568491379</isbn:number>
<notes>
<!-- make HTML the default namespace for a hypertext commentary -->
<p xmlns='http://www.w3.org/1999/xhtml'>
This is also available <a href="http://www.w3.org/">online</a>.
</p>
</notes>
</book>
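To see how a namespace-aware parser reads the book example, the Python sketch below parses the same markup with the standard library's ElementTree and lists the namespace each element ends up in. It is an illustration only; the variable names are chosen for this example.

```python
import xml.etree.ElementTree as ET

DOC = """<book xmlns='urn:loc.gov:books'
      xmlns:isbn='urn:ISBN:0-395-36341-6' xml:lang="en" lang="en">
  <title>Cheaper by the Dozen</title>
  <isbn:number>1568491379</isbn:number>
  <notes>
    <p xmlns='http://www.w3.org/1999/xhtml'>
      This is also available <a href="http://www.w3.org/">online</a>.
    </p>
  </notes>
</book>"""

root = ET.fromstring(DOC)

# ElementTree reports each tag as "{namespace-uri}local-name",
# visiting the elements in document order.
tags = [element.tag for element in root.iter()]
for tag in tags:
    print(tag)
```

The p element and the a element nested in it are reported in the XHTML namespace, because the default namespace declared on p is inherited by its descendants.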
3.2. User Agent Conformance
A conforming user agent must meet all of the following criteria:
1. In order to be consistent with the XML 1.0 Recommendation [XML], the user agent must
parse and evaluate an XHTML document for well‐formedness. If the user agent claims to
be a validating user agent, it must also validate documents against their referenced
DTDs according to [XML].
2. When the user agent claims to support facilities defined within this specification or
required by this specification through normative reference, it must do so in ways
consistent with the facilities' definition.
3. When a user agent processes an XHTML document as generic XML, it shall only
recognize attributes of type ID (i.e. the id attribute on most XHTML elements) as
fragment identifiers.
4. If a user agent encounters an element it does not recognize, it must process the
element's content.
5. If a user agent encounters an attribute it does not recognize, it must ignore the entire
attribute specification (i.e., the attribute and its value).
6. If a user agent encounters an attribute value it does not recognize, it must use the
default attribute value.
7. If it encounters an entity reference (other than one of the entities defined in this
recommendation or in the XML recommendation) for which the user agent has
processed no declaration (which could happen if the declaration is in the external subset
which the user agent hasn't read), the entity reference should be processed as the
characters (starting with the ampersand and ending with the semi‐colon) that make up
the entity reference.
8. When processing content, user agents that encounter characters or character entity
references that are recognized but not renderable may substitute another rendering
that gives the same meaning, or must display the document in such a way that it is
obvious to the user that normal rendering has not taken place.
9. White space is handled according to the following rules. The following characters are
defined in [XML] as white space characters:
o SPACE (&#x0020;)
o HORIZONTAL TABULATION (&#x0009;)
o CARRIAGE RETURN (&#x000D;)
o LINE FEED (&#x000A;)
The XML processor normalizes different systems' line end codes into one single LINE
FEED character, that is passed up to the application.
The user agent must use the definition from CSS for processing whitespace characters
[CSS2]. Note that the CSS2 recommendation does not explicitly address the issue of
whitespace handling in non‐Latin character sets. This will be addressed in a future
version of CSS, at which time this reference will be updated.
Note that in order to produce a Canonical XHTML document, the rules above must be applied
and the rules in [XMLC14N] must also be applied to the document.
4. Differences with HTML 4
This section is informative.
Due to the fact that XHTML is an XML application, certain practices that were perfectly legal in
SGML‐based HTML 4 [HTML4] must be changed.
4.1. Documents must be well‐formed
Well‐formedness is a new concept introduced by [XML]. Essentially this means that all elements
must either have closing tags or be written in a special form (as described below), and that all
the elements must nest properly.
Although overlapping is illegal in SGML, it is widely tolerated in existing browsers.
CORRECT: nested elements.
<p>here is an emphasized <em>paragraph</em>.</p>
INCORRECT: overlapping elements
<p>here is an emphasized <em>paragraph.</p></em>
4.2. Element and attribute names must be in lower case
XHTML documents must use lower case for all HTML element and attribute names. This
difference is necessary because XML is case-sensitive, e.g. <li> and <LI> are different tags.
4.3. For non‐empty elements, end tags are required
In SGML‐based HTML 4 certain elements were permitted to omit the end tag; with the
elements that followed implying closure. XML does not allow end tags to be omitted. All
elements other than those declared in the DTD as EMPTY must have an end tag. Elements that
are declared in the DTD as EMPTY can have an end tag or can use empty element shorthand
(see Empty Elements).
CORRECT: terminated elements
<p>here is a paragraph.</p><p>here is another paragraph.</p>
INCORRECT: unterminated elements
<p>here is a paragraph.<p>here is another paragraph.
4.4. Attribute values must always be quoted
All attribute values must be quoted, even those which appear to be numeric.
CORRECT: quoted attribute values
<td rowspan="3">
INCORRECT: unquoted attribute values
<td rowspan=3>
4.5. Attribute Minimization
XML does not support attribute minimization. Attribute‐value pairs must be written in full.
Attribute names such as compact and checked cannot occur in elements without their value
being specified.
CORRECT: unminimized attributes
<dl compact="compact">
INCORRECT: minimized attributes
<dl compact>
4.6. Empty Elements
Empty elements must either have an end tag or the start tag must end with />. For instance,
<br/> or <hr></hr>. See HTML Compatibility Guidelines for information on ways to ensure this
is backward compatible with HTML 4 user agents.
CORRECT: terminated empty elements
<br/><hr/>
INCORRECT: unterminated empty elements
<br><hr>
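The CORRECT and INCORRECT pairs in Sections 4.1 and 4.3 to 4.6 can be tried against an XML parser. The Python sketch below is an illustration under stated assumptions: the helper is_well_formed is a name invented here, and some snippets are wrapped in a td, dl or div element (or given an end tag) so that each one has a single root element.

```python
import xml.etree.ElementTree as ET

def is_well_formed(fragment):
    """Return True if an XML parser accepts the fragment."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False

# (section, correct form, incorrect form)
CASES = [
    ("4.1 nesting", "<p>an <em>emphasized</em> word.</p>",
     "<p>an <em>emphasized word.</p></em>"),
    ("4.3 end tags", "<div><p>one</p><p>two</p></div>",
     "<div><p>one<p>two</div>"),
    ("4.4 quoting", '<td rowspan="3"></td>', "<td rowspan=3></td>"),
    ("4.5 minimization", '<dl compact="compact"></dl>', "<dl compact></dl>"),
    ("4.6 empty elements", "<div><br/><hr/></div>", "<div><br><hr></div>"),
]

# Map each section to (correct form accepted?, incorrect form accepted?).
results = {name: (is_well_formed(good), is_well_formed(bad))
           for name, good, bad in CASES}
for name, outcome in results.items():
    print(name, outcome)
```

A check like this cannot catch upper-case names (Section 4.2), because <LI> is perfectly well-formed XML; that rule is enforced only by validating against the DTD.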
4.7. White Space handling in attribute values
When user agents process attributes, they do so according to Section 3.3.3 of [XML]:
• Strip leading and trailing white space.
• Map sequences of one or more white space characters (including line breaks) to a single
inter‐word space.
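The two steps above can be sketched in Python as follows. This is a simplified illustration of the listed steps, not a full implementation of attribute-value normalization in [XML]; the function name is an assumption made for this example.

```python
import re

# The XML white space characters: space, tab, carriage return, line feed.
_XML_WHITE_SPACE = re.compile(r"[ \t\r\n]+")

def normalize_attribute_value(value):
    """Collapse each run of white space to one space, then strip the ends."""
    return _XML_WHITE_SPACE.sub(" ", value).strip(" ")

print(repr(normalize_attribute_value("  first \n\t second  ")))
```

Collapsing before stripping gives the same result as stripping first, because any leading or trailing run has already been reduced to a single space.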
4.8. Script and Style elements
In XHTML, the script and style elements are declared as having #PCDATA content. As a result, <
and & will be treated as the start of markup, and entities such as &lt; and &amp; will be
recognized as entity references by the XML processor to < and & respectively. Wrapping the
content of the script or style element within a CDATA marked section avoids the expansion of
these entities.
<script type="text/javascript">
<![CDATA[
... unescaped script content ...
]]>
</script>
CDATA sections are recognized by the XML processor and appear as nodes in the Document
Object Model, see Section 1.3 of the DOM Level 1 Recommendation [DOM].
An alternative is to use external script and style documents.
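The effect of the CDATA section can be observed with a DOM parser. The Python sketch below uses the standard library's minidom; the script text and variable names are made up for this example.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

SCRIPT_BODY = "if (a < b && b < c) { run(); }"

# Wrapped in a CDATA section, the script text reaches the DOM unchanged.
wrapped = ('<script type="text/javascript"><![CDATA['
           + SCRIPT_BODY + "]]></script>")
node = minidom.parseString(wrapped).documentElement.firstChild
is_cdata_node = node.nodeType == node.CDATA_SECTION_NODE
content = node.data

# Without the CDATA section the same text is not well-formed XML,
# because < and & are read as the start of markup.
try:
    minidom.parseString('<script type="text/javascript">'
                        + SCRIPT_BODY + "</script>")
    bare_parses = True
except ExpatError:
    bare_parses = False
```

Here the first child of the script element is a CDATA section node holding the original text, while the unwrapped version is rejected by the parser.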
4.9. SGML exclusions
SGML gives the writer of a DTD the ability to exclude specific elements from being contained
within an element. Such prohibitions (called "exclusions") are not possible in XML.
For example, the HTML 4 Strict DTD forbids the nesting of an 'a' element within another 'a'
element to any descendant depth. It is not possible to spell out such prohibitions in XML. Even
though these prohibitions cannot be defined in the DTD, certain elements should not be
nested. A summary of such elements and the elements that should not be nested in them is
found in the normative Element Prohibitions.
4.10. The elements with 'id' and 'name' attributes
HTML 4 defined the name attribute for the elements a, applet, form, frame, iframe, img, and
map. HTML 4 also introduced the id attribute. Both of these attributes are designed to be used
as fragment identifiers.
In XML, fragment identifiers are of type ID, and there can only be a single attribute of type ID
per element. Therefore, in XHTML 1.0 the id attribute is defined to be of type ID. In order to
ensure that XHTML 1.0 documents are well‐structured XML documents, XHTML 1.0 documents
MUST use the id attribute when defining fragment identifiers on the elements listed above. See
the HTML Compatibility Guidelines for information on ensuring such anchors are backward
compatible when serving XHTML documents as media type text/html.
Note that in XHTML 1.0, the name attribute of these elements is formally deprecated, and will
be removed in a subsequent version of XHTML.
4.11. Attributes with pre‐defined value sets
HTML 4 and XHTML both have some attributes that have pre‐defined and limited sets of values
(e.g. the type attribute of the input element). In SGML and XML, these are called enumerated
attributes. Under HTML 4, the interpretation of these values was case‐insensitive, so a value of
TEXT was equivalent to a value of text. Under XML, the interpretation of these values is case‐
sensitive, and in XHTML 1 all of these values are defined in lower‐case.
4.12. Entity references as hex values
SGML and XML both permit references to characters by using hexadecimal values. In SGML
these references could be made using either &#Xnn; or &#xnn;. In XML documents, you must
use the lower-case version (i.e. &#xnn;).
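The difference is easy to observe with an XML parser. This short Python sketch, using the standard library's ElementTree, is only an illustration; the variable names are chosen for this example.

```python
import xml.etree.ElementTree as ET

# Lower-case x: a valid hexadecimal character reference (0x41 is "A").
lower_case_text = ET.fromstring("<p>&#x41;</p>").text

# Upper-case X: rejected by the XML parser as not well-formed.
try:
    ET.fromstring("<p>&#X41;</p>")
    upper_case_accepted = True
except ET.ParseError:
    upper_case_accepted = False
```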
5. Compatibility Issues
This section is normative.
Although there is no requirement for XHTML 1.0 documents to be compatible with existing user
agents, in practice this is easy to accomplish. Guidelines for creating compatible documents can
be found in Appendix C.
5.1. Internet Media Type
XHTML Documents which follow the guidelines set forth in Appendix C, "HTML Compatibility
Guidelines" may be labeled with the Internet Media Type "text/html" [RFC2854], as they are
compatible with most HTML browsers. Those documents, and any other document conforming
to this specification, may also be labeled with the Internet Media Type "application/xhtml+xml"
as defined in [RFC3236]. For further information on using media types with XHTML, see the
informative note [XHTMLMIME].
A. DTDs
This appendix is normative.
These DTDs and entity sets form a normative part of this specification. The complete set of DTD
files together with an XML declaration and SGML Open Catalog is included in the zip file and the
gzip'd tar file for this specification. Users looking for local copies of the DTDs to work with
should download and use those archives rather than using the specific DTDs referenced below.
A.1. Document Type Definitions
These DTDs approximate the HTML 4 DTDs. The W3C recommends that you use the
authoritative versions of these DTDs at their defined SYSTEM identifiers when validating
content. If you need to use these DTDs locally you should download one of the archives of this
version. For completeness, the normative versions of the DTDs are included here:
A.1.1. XHTML‐1.0‐Strict
The file DTD/xhtml1‐strict.dtd is a normative part of this specification. The annotated contents
of this file are available in this separate section for completeness.
A.1.2. XHTML‐1.0‐Transitional
The file DTD/xhtml1‐transitional.dtd is a normative part of this specification. The annotated
contents of this file are available in this separate section for completeness.
A.1.3. XHTML‐1.0‐Frameset
The file DTD/xhtml1‐frameset.dtd is a normative part of this specification. The annotated
contents of this file are available in this separate section for completeness.
A.2. Entity Sets
The XHTML entity sets are the same as for HTML 4, but have been modified to be valid XML 1.0
entity declarations. Note the entity for the Euro currency sign (&euro; or &#8364; or &#x20AC;)
is defined as part of the special characters.
A.2.1. Latin‐1 characters
The file DTD/xhtml‐lat1.ent is a normative part of this specification. The annotated contents of
this file are available in this separate section for completeness.
A.2.2. Special characters
The file DTD/xhtml‐special.ent is a normative part of this specification. The annotated contents
of this file are available in this separate section for completeness.
A.2.3. Symbols
The file DTD/xhtml‐symbol.ent is a normative part of this specification. The annotated contents
of this file are available in this separate section for completeness.
B. Element Prohibitions
This appendix is normative.
The following elements have prohibitions on which elements they can contain (see SGML
Exclusions). This prohibition applies to all depths of nesting, i.e. it contains all the descendant
elements.
a
must not contain other a elements.
pre
must not contain the img, object, big, small, sub, or sup elements.
button
must not contain the input, select, textarea, label, button, form, fieldset, iframe or
isindex elements.
label
must not contain other label elements.
form
must not contain other form elements.
C. HTML Compatibility Guidelines
This appendix is informative.
This appendix summarizes design guidelines for authors who wish their XHTML documents to
render on existing HTML user agents. Note that this recommendation does not define how
HTML conforming user agents should process HTML documents. Nor does it define the meaning
of the Internet Media Type text/html. For these definitions, see [HTML4] and [RFC2854]
respectively.
C.1. Processing Instructions and the XML Declaration
Be aware that processing instructions are rendered on some user agents. Also, some user
agents interpret the XML declaration to mean that the document is unrecognized XML rather
than HTML, and therefore may not render the document as expected. For compatibility with
these types of legacy browsers, you may want to avoid using processing instructions and XML
declarations. Remember, however, that when the XML declaration is not included in a
document, the document can only use the default character encodings UTF‐8 or UTF‐16.
C.2. Empty Elements
Include a space before the trailing / and > of empty elements, e.g. <br />, <hr /> and <img
src="karen.jpg" alt="Karen" />. Also, use the minimized tag syntax for empty elements, e.g. <br
/>, as the alternative syntax <br></br> allowed by XML gives uncertain results in many existing
user agents.
C.3. Element Minimization and Empty Element Content
Given an empty instance of an element whose content model is not EMPTY (for example, an
empty title or paragraph) do not use the minimized form (e.g. use <p> </p> and not <p />).
C.4. Embedded Style Sheets and Scripts
Use external style sheets if your style sheet uses < or & or ]]> or --. Use external scripts if your
script uses < or & or ]]> or --. Note that XML parsers are permitted to silently remove the
contents of comments. Therefore, the historical practice of "hiding" scripts and style sheets
within "comments" to make the documents backward compatible is likely to not work as
expected in XML‐based user agents.
C.5. Line Breaks within Attribute Values
Avoid line breaks and multiple white space characters within attribute values. These are
handled inconsistently by user agents.
C.6. Isindex
Don't include more than one isindex element in the document head. The isindex element is
deprecated in favor of the input element.
C.7. The lang and xml:lang Attributes
Use both the lang and xml:lang attributes when specifying the language of an element. The
value of the xml:lang attribute takes precedence.
C.8. Fragment Identifiers
In XML, URI‐references [RFC2396] that end with fragment identifiers of the form "#foo" do not
refer to elements with an attribute name="foo"; rather, they refer to elements with an
attribute defined to be of type ID, e.g., the id attribute in HTML 4. Many existing HTML clients
don't support the use of ID‐type attributes in this way, so identical values may be supplied for
both of these attributes to ensure maximum forward and backward compatibility (e.g., <a
id="foo" name="foo">...</a>).
Further, since the set of legal values for attributes of type ID is much smaller than for those of
type CDATA, the type of the name attribute has been changed to NMTOKEN. This attribute is
constrained such that it can only have the same values as type ID, or as the Name production in
XML 1.0 Section 2.3, production 5. Unfortunately, this constraint cannot be expressed in the
XHTML 1.0 DTDs. Because of this change, care must be taken when converting existing HTML
documents. The values of these attributes must be unique within the document, valid, and any
references to these fragment identifiers (both internal and external) must be updated should
the values be changed during conversion.
Note that the collection of legal values in XML 1.0 Section 2.3, production 5 is much larger than
that permitted to be used in the ID and NAME types defined in HTML 4. When defining
fragment identifiers to be backward-compatible, only strings matching the pattern
[A-Za-z][A-Za-z0-9:_.-]* should be used. See Section 6.2 of [HTML4] for more information.
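The backward-compatible pattern quoted above can be applied directly as a regular expression. A minimal Python sketch follows; the function name is an assumption made for this example.

```python
import re

# The pattern given in the text for backward-compatible fragment identifiers.
SAFE_FRAGMENT_ID = re.compile(r"[A-Za-z][A-Za-z0-9:_.-]*")

def is_backward_compatible_id(value):
    """Return True if the whole string matches the pattern."""
    return SAFE_FRAGMENT_ID.fullmatch(value) is not None

for candidate in ["foo", "section-2.1", "2nd", "naïve"]:
    print(candidate, is_backward_compatible_id(candidate))
```

Values such as "2nd" (leading digit) or "naïve" (non-ASCII letter) fail the check and are best avoided when the document must also work in HTML 4 user agents, whatever XML itself permits for ID values.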
Finally, note that XHTML 1.0 has deprecated the name attribute of the a, applet, form, frame,
iframe, img, and map elements, and it will be removed from XHTML in subsequent versions.
C.9. Character Encoding
Historically, the character encoding of an HTML document is either specified by a web server
via the charset parameter of the HTTP Content‐Type header, or via a meta element in the
document itself. In an XML document, the character encoding of the document is specified on
the XML declaration (e.g., <?xml version="1.0" encoding="EUC-JP"?>). In order to portably
present documents with specific character encodings, the best approach is to ensure that the
web server provides the correct headers. If this is not possible, a document that wants to set its
character encoding explicitly must include both an XML declaration with an encoding declaration
and a meta http-equiv statement (e.g., <meta http-equiv="Content-type" content="text/html;
charset=EUC-JP" />). In XHTML-conforming user agents, the value of the encoding declaration
of the XML declaration takes precedence.
Note: be aware that if a document must include the character encoding declaration in a meta
http‐equiv statement, that document may always be interpreted by HTTP servers and/or user
agents as being of the internet media type defined in that statement. If a document is to be
served as multiple media types, the HTTP server must be used to set the encoding of the
document.
C.10. Boolean Attributes
Some HTML user agents are unable to interpret boolean attributes when these appear in their
full (non‐minimized) form, as required by XML 1.0. Note this problem doesn't affect user agents
compliant with HTML 4. The following attributes are involved: compact, nowrap, ismap,
declare, noshade, checked, disabled, readonly, multiple, selected, noresize, defer.
C.11. Document Object Model and XHTML
The Document Object Model level 1 Recommendation [DOM] defines document object model
interfaces for XML and HTML 4. The HTML 4 document object model specifies that HTML
element and attribute names are returned in upper‐case. The XML document object model
specifies that element and attribute names are returned in the case they are specified. In
XHTML 1.0, elements and attributes are specified in lower‐case. This apparent difference can be
addressed in two ways:
1. User agents that access XHTML documents served as Internet media type text/html via
the DOM can use the HTML DOM, and can rely upon element and attribute names being
returned in upper‐case from those interfaces.
2. User agents that access XHTML documents served as Internet media types text/xml,
application/xml, or application/xhtml+xml can also use the XML DOM. Elements and
attributes will be returned in lower‐case. Also, some XHTML elements may or may not
appear in the object tree because they are optional in the content model (e.g. the tbody
element within table). This occurs because in HTML 4 some elements were permitted to
be minimized such that their start and end tags are both omitted (an SGML feature).
This is not possible in XML. Rather than require document authors to insert extraneous
elements, XHTML has made the elements optional. User agents need to adapt to this
accordingly. For further information on this topic, see [DOM2].
C.12. Using Ampersands in Attribute Values (and Elsewhere)
In both SGML and XML, the ampersand character ("&") declares the beginning of an entity
reference (e.g., &reg; for the registered trademark symbol "®"). Unfortunately, many HTML
user agents have silently ignored incorrect usage of the ampersand character in HTML
documents ‐ treating ampersands that do not look like entity references as literal ampersands.
XML‐based user agents will not tolerate this incorrect usage, and any document that uses an
ampersand incorrectly will not be "valid", and consequently will not conform to this
specification. In order to ensure that documents are compatible with historical HTML user
agents and XML‐based user agents, ampersands used in a document that are to be treated as
literal characters must be expressed themselves as an entity reference (e.g. "&amp;"). For
example, when the href attribute of the a element refers to a CGI script that takes parameters,
it must be expressed as http://my.site.dom/cgi-bin/myscript.pl?class=guest&amp;name=user
rather than as http://my.site.dom/cgi-bin/myscript.pl?class=guest&name=user.
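When markup is generated by a program, the escaping is best left to a library routine. The Python sketch below uses quoteattr from the standard library to write the href value and then parses the result back to confirm that the original URL is recovered; the surrounding markup is an assumption made for this example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr

url = "http://my.site.dom/cgi-bin/myscript.pl?class=guest&name=user"

# quoteattr escapes the ampersand and adds the surrounding quotes.
anchor = "<a href=%s>run script</a>" % quoteattr(url)
print(anchor)

# An XML parser turns the entity reference back into a literal "&".
round_tripped = ET.fromstring(anchor).get("href")
```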
C.13. Cascading Style Sheets (CSS) and XHTML
The Cascading Style Sheets level 2 Recommendation [CSS2] defines style properties which are
applied to the parse tree of the HTML or XML documents. Differences in parsing will produce
different visual or aural results, depending on the selectors used. The following hints will
reduce this effect for documents which are served without modification as both media types:
1. CSS style sheets for XHTML should use lower case element and attribute names.
2. In tables, the tbody element will be inferred by the parser of an HTML user agent, but
not by the parser of an XML user agent. Therefore you should always explicitly add a
tbody element if it is referred to in a CSS selector.
3. Within the XHTML namespace, user agents are expected to recognize the "id" attribute
as an attribute of type ID. Therefore, style sheets should be able to continue using the
shorthand "#" selector syntax even if the user agent does not read the DTD.
4. Within the XHTML namespace, user agents are expected to recognize the "class"
attribute. Therefore, style sheets should be able to continue using the shorthand "."
selector syntax.
5. CSS defines different conformance rules for HTML and XML documents; be aware that
the HTML rules apply to XHTML documents delivered as HTML and the XML rules apply
to XHTML documents delivered as XML.
C.14. Referencing Style Elements when serving as XML
In HTML 4 and XHTML, the style element can be used to define document‐internal style rules.
In XML, an XML stylesheet declaration is used to define style rules. In order to be compatible
with this convention, style elements should have their fragment identifier set using the id
attribute, and an XML stylesheet declaration should reference this fragment. For example:
<?xml-stylesheet href="http://www.w3.org/StyleSheets/TR/W3C-REC.css" type="text/css"?>
<?xml-stylesheet href="#internalStyle" type="text/css"?>
<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<title>An internal stylesheet example</title>
<style type="text/css" id="internalStyle">
code {
color: green;
font-family: monospace;
font-weight: bold;
}
</style>
</head>
<body>
<p>
This is text that uses our
<code>internal stylesheet</code>.
</p>
</body>
</html>
C.15. White Space Characters in HTML vs. XML
Some characters that are legal in HTML documents are illegal in XML documents. For example,
in HTML the Formfeed character (U+000C) is treated as white space; in XHTML, due to XML's
definition of characters, it is illegal.
C.16. The Named Character Reference &apos;
The named character reference &apos; (the apostrophe, U+0027) was introduced in XML 1.0
but does not appear in HTML. Authors should therefore use &#39; instead of &apos; to work as
expected in HTML 4 user agents.
1. Overview
This document defines VoiceXML, the Voice Extensible Markup Language. Its background, basic
concepts and use are presented in Section 1. The dialog constructs of form, menu and link, and
the mechanism (Form Interpretation Algorithm) by which they are interpreted are then
introduced in Section 2. User input using DTMF and speech grammars is covered in Section 3,
while Section 4 covers system output using speech synthesis and recorded audio. Mechanisms
for manipulating dialog control flow, including variables, events, and executable elements, are
explained in Section 5. Environment features such as parameters and properties as well as
resource handling are specified in Section 6. The appendices provide additional information
including the VoiceXML Schema, a detailed specification of the Form Interpretation Algorithm
and timing, audio file formats, and statements relating to conformance, internationalization,
accessibility and privacy.
VoiceXML originated in 1995 as an XML-based dialog design language intended to
simplify the speech recognition application development process within an AT&T project called
Phone Markup Language (PML). As AT&T reorganized, teams at AT&T, Lucent and Motorola
continued working on their own PML‐like languages.
In 1998, W3C hosted a conference on voice browsers. By this time, AT&T and Lucent had
different variants of their original PML, while Motorola had developed VoxML, and IBM was
developing its own SpeechML. Many other attendees at the conference were also developing
similar languages for dialog design; for example, HP's TalkML and PipeBeach's
VoiceHTML.
The VoiceXML Forum was then formed by AT&T, IBM, Lucent, and Motorola to pool their
efforts. The mission of the VoiceXML Forum was to define a standard dialog design language
that developers could use to build conversational applications. They chose XML as the basis for
this effort because it was clear to them that this was the direction technology was going.
In 2000, the VoiceXML Forum released VoiceXML 1.0 to the public. Shortly thereafter,
VoiceXML 1.0 was submitted to the W3C as the basis for the creation of a new international
standard. VoiceXML 2.0 is the result of this work based on input from W3C Member companies,
other W3C Working Groups, and the public.
Developers familiar with VoiceXML 1.0 are particularly directed to Changes from Previous
Public Version which summarizes how VoiceXML 2.0 differs from VoiceXML 1.0.
1.1 Introduction
VoiceXML is designed for creating audio dialogs that feature synthesized speech, digitized
audio, recognition of spoken and DTMF key input, recording of spoken input, telephony, and
mixed initiative conversations. Its major goal is to bring the advantages of Web‐based
development and content delivery to interactive voice response applications.
Here are two short examples of VoiceXML. The first is the venerable "Hello World":
<?xml version="1.0" encoding="UTF-8"?>
<vxml xmlns="http://www.w3.org/2001/vxml"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.w3.org/2001/vxml
   http://www.w3.org/TR/voicexml20/vxml.xsd"
  version="2.0">
  <form>
    <block>Hello World!</block>
  </form>
</vxml>
The top‐level element is <vxml>, which is mainly a container for dialogs. There are two types of
dialogs: forms and menus. Forms present information and gather input; menus offer choices of
what to do next. This example has a single form, which contains a block that synthesizes and
presents "Hello World!" to the user. Since the form does not specify a successor dialog, the
conversation ends.
Our second example asks the user for a choice of drink and then submits it to a server script:
<?xml version="1.0" encoding="UTF-8"?>
<vxml xmlns="http://www.w3.org/2001/vxml"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.w3.org/2001/vxml
   http://www.w3.org/TR/voicexml20/vxml.xsd"
  version="2.0">
  <form>
    <field name="drink">
      <prompt>Would you like coffee, tea, milk, or nothing?</prompt>
      <grammar src="drink.grxml" type="application/srgs+xml"/>
    </field>
    <block>
      <submit next="http://www.drink.example.com/drink2.asp"/>
    </block>
  </form>
</vxml>
A field is an input field. The user must provide a value for the field before proceeding to the
next element in the form. A sample interaction is:
C (computer): Would you like coffee, tea, milk, or nothing?
H (human): Orange juice.
C: I did not understand what you said. (a platform‐specific default message.)
C: Would you like coffee, tea, milk, or nothing?
H: Tea
C: (continues in document drink2.asp)
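The sample interaction can be mimicked with a short Python sketch. It is only a simulation under stated assumptions: the function name `collect_field`, the list-based "grammar", and the fixed no-match message are all illustrative and are not part of VoiceXML or of any platform API. It shows the behaviour described above, namely that the field keeps prompting until an utterance matches its grammar.

```python
def collect_field(prompt, grammar, utterances,
                  nomatch_message="I did not understand what you said."):
    """Replay caller utterances against a field until one matches the grammar.

    Returns (value, transcript). The value is None if the input runs out
    before anything matches. All names here are hypothetical.
    """
    transcript = []
    accepted = {phrase.lower() for phrase in grammar}
    pending = iter(utterances)
    while True:
        transcript.append("C: " + prompt)
        heard = next(pending, None)
        if heard is None:
            return None, transcript
        transcript.append("H: " + heard)
        # Crude normalization standing in for real speech recognition.
        normalized = heard.strip().rstrip(".").lower()
        if normalized in accepted:
            return normalized, transcript
        transcript.append("C: " + nomatch_message)


drink, log = collect_field(
    "Would you like coffee, tea, milk, or nothing?",
    ["coffee", "tea", "milk", "nothing"],
    ["Orange juice.", "Tea"],
)
print("\n".join(log))
print("drink =", drink)
```

Once the field is filled, a real interpreter would move on to the block that submits the value to the server script.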
1.2 Background
This section contains a high‐level architectural model, whose terminology is then used to
describe the goals of VoiceXML, its scope, its design principles, and the requirements it places
on the systems that support it.
1.2.1 Architectural Model
The architectural model assumed by this document has the following components:
Figure 1: Architectural Model
A document server (e.g. a Web server) processes requests from a client application, the
VoiceXML Interpreter, through the VoiceXML interpreter context. The server produces
VoiceXML documents in reply, which are processed by the VoiceXML interpreter. The VoiceXML
interpreter context may monitor user inputs in parallel with the VoiceXML interpreter. For
example, one VoiceXML interpreter context may always listen for a special escape phrase that
takes the user to a high‐level personal assistant, and another may listen for escape phrases that
alter user preferences like volume or text‐to‐speech characteristics.
The implementation platform is controlled by the VoiceXML interpreter context and by the
VoiceXML interpreter. For instance, in an interactive voice response application, the VoiceXML
interpreter context may be responsible for detecting an incoming call, acquiring the initial
VoiceXML document, and answering the call, while the VoiceXML interpreter conducts the
dialog after answer. The implementation platform generates events in response to user actions
(e.g. spoken or character input received, disconnect) and system events (e.g. timer expiration).
Some of these events are acted upon by the VoiceXML interpreter itself, as specified by the
VoiceXML document, while others are acted upon by the VoiceXML interpreter context.
1.2.2 Goals of VoiceXML
VoiceXML's main goal is to bring the full power of Web development and content delivery to
voice response applications, and to free the authors of such applications from low‐level
programming and resource management. It enables integration of voice services with data
services using the familiar client‐server paradigm. A voice service is viewed as a sequence of
interaction dialogs between a user and an implementation platform. The dialogs are provided
by document servers, which may be external to the implementation platform. Document
servers maintain overall service logic, perform database and legacy system operations, and
produce dialogs. A VoiceXML document specifies each interaction dialog to be conducted by a
VoiceXML interpreter. User input affects dialog interpretation and is collected into requests
submitted to a document server. The document server replies with another VoiceXML
document to continue the user's session with other dialogs.
VoiceXML is a markup language that:
• Minimizes client/server interactions by specifying multiple interactions per document.
• Shields application authors from low-level and platform-specific details.
• Separates user interaction code (in VoiceXML) from service logic (e.g. CGI scripts).
• Promotes service portability across implementation platforms. VoiceXML is a common
language for content providers, tool providers, and platform providers.
• Is easy to use for simple interactions, and yet provides language features to support
complex dialogs.
While VoiceXML strives to accommodate the requirements of a majority of voice response
services, services with stringent requirements may best be served by dedicated applications
that employ a finer level of control.
1.2.3 Scope of VoiceXML
The language describes the human‐machine interaction provided by voice response systems,
which includes:
• Output of synthesized speech (text‐to‐speech).
• Output of audio files.
• Recognition of spoken input.
• Recognition of DTMF input.
• Recording of spoken input.
• Control of dialog flow.
• Telephony features such as call transfer and disconnect.
The language provides means for collecting character and/or spoken input, assigning the input
results to document‐defined request variables, and making decisions that affect the
interpretation of documents written in the language. A document may be linked to other
documents through Universal Resource Identifiers (URIs).
1.2.4 Principles of Design
VoiceXML is an XML application [XML].
1. The language promotes portability of services through abstraction of platform
resources.
2. The language accommodates platform diversity in supported audio file formats, speech
grammar formats, and URI schemes. While producers of platforms may support various
grammar formats the language requires a common grammar format, namely the XML
Form of the W3C Speech Recognition Grammar Specification [SRGS], to facilitate
interoperability. Similarly, while various audio formats for playback and recording may
be supported, the audio formats described in Appendix E must be supported.
3. The language supports ease of authoring for common types of interactions.
4. The language has well‐defined semantics that preserves the author's intent regarding
the behavior of interactions with the user. Client heuristics are not required to
determine document element interpretation.
5. The language recognizes semantic interpretations from grammars and makes this
information available to the application.
6. The language has a control flow mechanism.
7. The language enables a separation of service logic from interaction behavior.
8. It is not intended for intensive computation, database operations, or legacy system
operations. These are assumed to be handled by resources outside the document
interpreter, e.g. a document server.
9. General service logic, state management, dialog generation, and dialog sequencing are
assumed to reside outside the document interpreter.
10. The language provides ways to link documents using URIs, and also to submit data to
server scripts using URIs.
11. VoiceXML provides ways to identify exactly which data to submit to the server, and
which HTTP method (GET or POST) to use in the submittal.
12. The language does not require document authors to explicitly allocate and deallocate
dialog resources, or deal with concurrency. Resource allocation and concurrent threads
of control are to be handled by the implementation platform.
1.2.5 Implementation Platform Requirements
This section outlines the requirements on the hardware/software platforms that will support a
VoiceXML interpreter.
Document acquisition. The interpreter context is expected to acquire documents for the
VoiceXML interpreter to act on. The "http" URI scheme must be supported. In some cases, the
document request is generated by the interpretation of a VoiceXML document, while other
requests are generated by the interpreter context in response to events outside the scope of
the language, for example an incoming phone call. When issuing document requests via http,
the interpreter context identifies itself using the "User-Agent" header variable with the value
"<name>/<version>", for example, "acme-browser/1.2".
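As a rough illustration of the User-Agent convention, the following Python sketch builds (but does not send) an HTTP request for a document. The helper name `build_document_request` and the document URI are assumptions made for the example; only the "<name>/<version>" header format comes from the text above.

```python
from urllib.request import Request


def build_document_request(uri, name, version):
    """Build an HTTP request for a VoiceXML document without sending it.

    The interpreter context identifies itself as "<name>/<version>".
    """
    request = Request(uri)
    request.add_header("User-Agent", f"{name}/{version}")
    return request


# Hypothetical document URI; the browser name and version follow the text.
request = build_document_request(
    "http://www.example.com/start.vxml", "acme-browser", "1.2"
)
# urllib stores header names capitalized as "User-agent".
print(request.get_header("User-agent"))
```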
Audio output. An implementation platform must support audio output using audio files and
text‐to‐speech (TTS). The platform must be able to freely sequence TTS and audio output. If an
audio output resource is not available, an error.noresource event must be thrown. Audio files
are referred to by a URI. The language specifies a required set of audio file formats which must
be supported (see Appendix E); additional audio file formats may also be supported.
Audio input. An implementation platform is required to detect and report character and/or
spoken input simultaneously and to control input detection interval duration with a timer
whose length is specified by a VoiceXML document. If an audio input resource is not available,
an error.noresource event must be thrown.
• It must report characters (for example, DTMF) entered by a user. Platforms must
support the XML form of DTMF grammars described in the W3C Speech Recognition
Grammar Specification [SRGS]. They should also support the Augmented BNF (ABNF)
form of DTMF grammars described in the W3C Speech Recognition Grammar
Specification [SRGS].
• It must be able to receive speech recognition grammar data dynamically. It must be able
to use speech grammar data in the XML Form of the W3C Speech Recognition Grammar
Specification [SRGS]. It should be able to receive speech recognition grammar data in
the ABNF form of the W3C Speech Recognition Grammar Specification [SRGS], and may
support other formats such as the JSpeech Grammar Format [JSGF] or proprietary
formats. Some VoiceXML elements contain speech grammar data; others refer to
speech grammar data through a URI. The speech recognizer must be able to
accommodate dynamic update of the spoken input for which it is listening through
either method of speech grammar data specification.
• It must be able to record audio received from the user. The implementation platform
must be able to make the recording available to a request variable. The language
specifies a required set of recorded audio file formats which must be supported (see
Appendix E); additional formats may also be supported.
Transfer. The platform should be able to support making a third party connection through a
communications network, such as the telephone.
1.3 Concepts
A VoiceXML document (or a set of related documents called an application) forms a
conversational finite state machine. The user is always in one conversational state, or dialog, at
a time. Each dialog determines the next dialog to transition to. Transitions are specified using
URIs, which define the next document and dialog to use. If a URI does not refer to a document,
the current document is assumed. If it does not refer to a dialog, the first dialog in the
document is assumed. Execution is terminated when a dialog does not specify a successor, or if
it has an element that explicitly exits the conversation.
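The two defaulting rules for transition URIs can be sketched in Python as follows. This is a minimal sketch, assuming the caller already knows the dialog ids of each document in document order; the function name `resolve_transition`, the `dialogs_by_document` mapping, and the example URIs are hypothetical.

```python
from urllib.parse import urldefrag, urljoin


def resolve_transition(current_document, next_uri, dialogs_by_document):
    """Return (document URI, dialog id) for a transition URI.

    A URI without a document part resolves against the current document;
    a URI without a fragment selects the first dialog of the target.
    """
    absolute = urljoin(current_document, next_uri)
    document, fragment = urldefrag(absolute)
    dialog = fragment or dialogs_by_document[document][0]
    return document, dialog


HELLO = "http://example.com/app/hello.vxml"
OTHER = "http://example.com/app/other.vxml"
dialogs_by_document = {
    HELLO: ["greeting", "say_goodbye"],
    OTHER: ["main", "extra"],
}
print(resolve_transition(HELLO, "#say_goodbye", dialogs_by_document))
print(resolve_transition(HELLO, "other.vxml", dialogs_by_document))
print(resolve_transition(HELLO, "other.vxml#extra", dialogs_by_document))
```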
1.3.1 Dialogs and Subdialogs
There are two kinds of dialogs: forms and menus. Forms define an interaction that collects
values for a set of form item variables. Each field may specify a grammar that defines the
allowable inputs for that field. If a form‐level grammar is present, it can be used to fill several
fields from one utterance. A menu presents the user with a choice of options and then
transitions to another dialog based on that choice.
A subdialog is like a function call, in that it provides a mechanism for invoking a new
interaction, and returning to the original form. Variable instances, grammars, and state
information are saved and are available upon returning to the calling document. Subdialogs can
be used, for example, to create a confirmation sequence that may require a database query; to
create a set of components that may be shared among documents in a single application; or to
create a reusable library of dialogs shared among many applications.
1.3.2 Sessions
A session begins when the user starts to interact with a VoiceXML interpreter context,
continues as documents are loaded and processed, and ends when requested by the user, a
document, or the interpreter context.
1.3.3 Applications
An application is a set of documents sharing the same application root document. Whenever
the user interacts with a document in an application, its application root document is also
loaded. The application root document remains loaded while the user is transitioning between
other documents in the same application, and it is unloaded when the user transitions to a
document that is not in the application. While it is loaded, the application root document's
variables are available to the other documents as application variables, and its grammars
remain active for the duration of the application, subject to the grammar activation rules
discussed in Section 3.1.4.
Figure 2 shows the transition of documents (D) in an application that share a common
application root document (root).
Figure 2: Transitioning between documents in an application.
1.3.4 Grammars
Each dialog has one or more speech and/or DTMF grammars associated with it. In machine
directed applications, each dialog's grammars are active only when the user is in that dialog. In
mixed initiative applications, where the user and the machine alternate in determining what to
do next, some of the dialogs are flagged to make their grammars active (i.e., listened for) even
when the user is in another dialog in the same document, or on another loaded document in
the same application. In this situation, if the user says something matching another dialog's
active grammars, execution transitions to that other dialog, with the user's utterance treated as
if it were said in that dialog. Mixed initiative adds flexibility and power to voice applications.
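The difference between machine directed and mixed initiative grammar activation can be illustrated with a small Python sketch. The data layout is an assumption made for the example: the keys `grammar` and `always_active` and the dialog names are hypothetical and do not correspond to VoiceXML attribute names.

```python
def route_utterance(current, utterance, dialogs):
    """Return the dialog that should handle an utterance, or None on no match.

    `dialogs` maps a dialog name to {"grammar": set of phrases,
    "always_active": bool}. Both keys are illustrative only.
    """
    phrase = utterance.strip().lower()
    # The current dialog's own grammars are always listened for.
    if phrase in dialogs[current]["grammar"]:
        return current
    # Other dialogs are considered only if flagged as active elsewhere.
    for name, dialog in dialogs.items():
        if name != current and dialog["always_active"] and phrase in dialog["grammar"]:
            return name
    return None


dialogs = {
    "order_drink": {"grammar": {"coffee", "tea", "milk"}, "always_active": False},
    "main_menu": {"grammar": {"main menu"}, "always_active": True},
    "billing": {"grammar": {"pay my bill"}, "always_active": False},
}
print(route_utterance("order_drink", "Tea", dialogs))          # stays in order_drink
print(route_utterance("order_drink", "Main menu", dialogs))    # moves to main_menu
print(route_utterance("order_drink", "Pay my bill", dialogs))  # billing is not active
```

With every `always_active` flag set to False the same function models a purely machine directed application.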
1.3.5 Events
VoiceXML provides a form‐filling mechanism for handling "normal" user input. In addition,
VoiceXML defines a mechanism for handling events not covered by the form mechanism.
Events are thrown by the platform under a variety of circumstances, such as when the user
does not respond, doesn't respond intelligibly, requests help, etc. The interpreter also throws
events if it finds a semantic error in a VoiceXML document. Events are caught by catch
elements or their syntactic shorthand. Each element in which an event can occur may specify
catch elements. Furthermore, catch elements are also inherited from enclosing elements "as if
by copy". In this way, common event handling behavior can be specified at any level, and it
applies to all lower levels.
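Catch inheritance can be pictured as a lookup that walks outward from the element where the event occurred. The Python sketch below is a simplified model: the `Scope` class is hypothetical, handlers are plain strings, and only exact event-name matching is modelled, which is an assumption of this sketch.

```python
class Scope:
    """A hypothetical element scope holding its own catch handlers."""

    def __init__(self, name, parent=None, catches=None):
        self.name = name
        self.parent = parent
        self.catches = dict(catches or {})

    def find_handler(self, event):
        """Search this scope first, then each enclosing scope in turn."""
        scope = self
        while scope is not None:
            if event in scope.catches:
                return scope.name, scope.catches[event]
            scope = scope.parent
        return None


document = Scope("document", catches={
    "help": "Say the name of a drink.",
    "noinput": "Are you still there?",
})
form = Scope("form", parent=document)
field = Scope("field", parent=form, catches={"noinput": "Please name a drink."})

print(field.find_handler("noinput"))  # the field's own handler wins
print(field.find_handler("help"))     # inherited from the document level
print(field.find_handler("error.semantic"))  # nothing declared in this sketch
```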
1.3.6 Links
A link supports mixed initiative. It specifies a grammar that is active whenever the user is in the
scope of the link. If user input matches the link's grammar, control transfers to the link's
destination URI. A link can be used to throw an event or go to a destination URI.
1.4 VoiceXML Elements
Element     Purpose                                                           Section
<form>      A dialog for presenting information and collecting data           2.1
<goto>      Go to another dialog in the same or different document            5.3.7
<initial>   Declares initial logic upon entry into a (mixed initiative) form  2.3.3
<link>      Specify a transition common to all dialogs in the link's scope    2.5
<menu>      A dialog for choosing amongst alternative destinations            2.2.1
<metadata>  Define metadata information using a metadata schema               6.2.2
<prompt>    Queue speech synthesis and audio output to the user               4.1
<reprompt>  Play a field prompt when a field is re-visited after an event     5.3.6
<script>    Specify a block of ECMAScript client-side scripting logic         5.3.12
Table 1: VoiceXML Elements
1.5 Document Structure and Execution
A VoiceXML document is primarily composed of top‐level elements called dialogs. There are
two types of dialogs: forms and menus. A document may also have <meta> and <metadata>
elements, <var> and <script> elements, <property> elements, <catch> elements, and <link>
elements.
1.5.1 Execution within One Document
Document execution begins at the first dialog by default. As each dialog executes, it determines
the next dialog. When a dialog doesn't specify a successor dialog, document execution stops.
Here is "Hello World!" expanded to illustrate some of this. It now has a document level variable
called "hi" which holds the greeting. Its value is used as the prompt in the first form. Once the
first form plays the greeting, it goes to the form named "say_goodbye", which prompts the user
with "Goodbye!" Because the second form does not transition to another dialog, it causes the
document to be exited.
<?xml version="1.0" encoding="UTF-8"?>
<vxml xmlns="http://www.w3.org/2001/vxml"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.w3.org/2001/vxml
   http://www.w3.org/TR/voicexml20/vxml.xsd"
  version="2.0">
  <meta name="author" content="John Doe"/>
  <meta name="maintainer" content="hello-support@hi.example.com"/>
  <var name="hi" expr="'Hello World!'"/>
  <form>
    <block>
      <value expr="hi"/>
      <goto next="#say_goodbye"/>
    </block>
  </form>
  <form id="say_goodbye">
    <block>
      Goodbye!
    </block>
  </form>
</vxml>
Alternatively the forms can be combined:
<?xml version="1.0" encoding="UTF-8"?>
<vxml xmlns="http://www.w3.org/2001/vxml"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.w3.org/2001/vxml
   http://www.w3.org/TR/voicexml20/vxml.xsd"
  version="2.0">
  <meta name="author" content="John Doe"/>
  <meta name="maintainer" content="hello-support@hi.example.com"/>
  <var name="hi" expr="'Hello World!'"/>
  <form>
    <block>
      <value expr="hi"/> Goodbye!
    </block>
  </form>
</vxml>
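The execution order within one document can be traced with a short Python sketch that parses a trimmed copy of the two-form example and follows its <goto>. This is a static approximation, not an interpreter: the function name `dialog_order` is hypothetical, only the first <goto> in each dialog is followed, and only same-document "#id" targets are handled.

```python
import xml.etree.ElementTree as ET

VXML_NS = "{http://www.w3.org/2001/vxml}"

# Trimmed copy of the expanded "Hello World!" document (schema attributes omitted).
DOCUMENT = """<vxml xmlns="http://www.w3.org/2001/vxml" version="2.0">
  <var name="hi" expr="'Hello World!'"/>
  <form>
    <block><value expr="hi"/><goto next="#say_goodbye"/></block>
  </form>
  <form id="say_goodbye">
    <block>Goodbye!</block>
  </form>
</vxml>"""


def dialog_order(source):
    """List the dialogs visited, starting at the first dialog by default."""
    root = ET.fromstring(source)
    dialogs = [el for el in root if el.tag in (VXML_NS + "form", VXML_NS + "menu")]
    by_id = {el.get("id"): el for el in dialogs if el.get("id")}
    visited = []
    current = dialogs[0] if dialogs else None
    # Stop when a dialog names no successor (or to avoid looping forever).
    while current is not None and current not in visited:
        visited.append(current)
        goto = current.find(".//" + VXML_NS + "goto")
        target = goto.get("next", "") if goto is not None else ""
        current = by_id.get(target[1:]) if target.startswith("#") else None
    return [el.get("id") or "(unnamed)" for el in visited]


print(dialog_order(DOCUMENT))
```

The second form has no <goto>, so the walk ends there, which matches the rule that execution stops when a dialog does not specify a successor.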
Attributes of <vxml> include:
Attribute    Description
version      The version of VoiceXML of this document (required). The current version number is 2.0.
xmlns        The designated namespace for VoiceXML (required). The namespace for VoiceXML is defined to be http://www.w3.org/2001/vxml.
xml:base     The base URI for this document as defined in [XML-BASE]. As in [HTML], a URI which all relative references within the document take as their base.
xml:lang     The language identifier for this document. If omitted, the value is a platform-specific default.
application  The URI of this document's application root document, if any.
Table 2: <vxml> Attributes
Language information is inherited down the document hierarchy: the value of "xml:lang" is
inherited by elements which also define the "xml:lang" attribute, such as <grammar> and
<prompt>, unless these elements specify an alternative value.
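The inheritance of "xml:lang" can be sketched in Python by carrying the nearest enclosing value down the element tree. The function name `effective_languages` and the sample document are assumptions made for this illustration; a result of None stands for the platform-specific default.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

DOCUMENT = """<vxml xmlns="http://www.w3.org/2001/vxml" version="2.0" xml:lang="en-US">
  <form>
    <field name="drink">
      <prompt>Would you like coffee or tea?</prompt>
      <prompt xml:lang="fr-FR">Bonjour</prompt>
      <grammar src="drink.grxml" type="application/srgs+xml"/>
    </field>
  </form>
</vxml>"""


def effective_languages(element, inherited=None, found=None):
    """Collect (element name, effective language) for prompts and grammars."""
    found = [] if found is None else found
    # An element's own xml:lang overrides the inherited value.
    language = element.get(XML_LANG, inherited)
    local_name = element.tag.rsplit("}", 1)[-1]
    if local_name in ("grammar", "prompt"):
        found.append((local_name, language))
    for child in element:
        effective_languages(child, language, found)
    return found


print(effective_languages(ET.fromstring(DOCUMENT)))
```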
1.5.2 Executing a Multi‐Document Application
Normally, each document runs as an isolated application. In cases where you want multiple
documents to work together as one application, you select one document to be the application
root document, and the rest to be application leaf documents. Each leaf document names the
root document in its <vxml> element.
When this is done, every time the interpreter is told to load and execute a leaf document in this
application, it first loads the application root document if it is not already loaded. The
application root document remains loaded until the interpreter is told to load a document that
belongs to a different application. Thus one of the following two conditions always holds during
interpretation:
• The application root document is loaded and the user is executing in it: there is no leaf
document.
• The application root document and a single leaf document are both loaded and the user
is executing in the leaf document.
If there is a chain of subdialogs defined in separate documents, then there may be more than
one leaf document loaded although execution will only be in one of these documents.
When a leaf document load causes a root document load, none of the dialogs in the root
document are executed. Execution begins in the leaf document.
There are several benefits to multi‐document applications.
• The root document's variables are available for use by the leaf documents, so that
information can be shared and retained.
• Root document <property> elements specify default values for properties used in the
leaf documents.
• Common ECMAScript code can be defined in root document <script> elements and used
in the leaf documents.
• Root document <catch> elements define default event handling for the leaf documents.
• Document‐scoped grammars in the root document are active when the user is in a leaf
document, so that the user is able to interact with forms, links, and menus in the root
document.
Here is a two‐document application illustrating this:
Application root document (app-root.vxml)
<?xml version="1.0" encoding="UTF-8"?>
<vxml xmlns="http://www.w3.org/2001/vxml"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.w3.org/2001/vxml
   http://www.w3.org/TR/voicexml20/vxml.xsd"
  version="2.0">
  <var name="bye" expr="'Ciao'"/>
  <link next="operator_xfer.vxml">
    <grammar type="application/srgs+xml" root="root" version="1.0">
      <rule id="root" scope="public">operator</rule>
    </grammar>
  </link>
</vxml>
Leaf document (leaf.vxml)
<?xml version="1.0" encoding="UTF-8"?>
<vxml xmlns="http://www.w3.org/2001/vxml"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.w3.org/2001/vxml
   http://www.w3.org/TR/voicexml20/vxml.xsd"
  version="2.0" application="app-root.vxml">
  <form id="say_goodbye">
    <field name="answer">
      <grammar type="application/srgs+xml" src="/grammars/boolean.grxml"/>
      <prompt>Shall we say <value expr="application.bye"/>?</prompt>
      <filled>
        <if cond="answer">
          <exit/>
        </if>
        <clear namelist="answer"/>
      </filled>
    </field>
  </form>
</vxml>
In this example, the application is designed so that leaf.vxml must be loaded first. Its application
attribute specifies that app-root.vxml should be used as the application root document. So,
app-root.vxml is then loaded, which creates the application variable bye and also defines a link
that navigates to operator_xfer.vxml whenever the user says "operator". The user starts out in
the say_goodbye form:
C: Shall we say Ciao?
H: Si.
C: I did not understand what you said. (a platform‐specific default message.)
C: Shall we say Ciao?
H: Ciao
C: I did not understand what you said.
H: Operator.
C: (Goes to operator_xfer.vxml, which transfers the caller to a human operator.)
Note that when the user is in a multi‐document application, at most two documents are loaded
at any one time: the application root document and, unless the user is actually interacting with
the application root document, an application leaf document. A root document's <vxml>
element does not have an application attribute specified. A leaf document's <vxml> element
does have an application attribute specified. An interpreter always has an application root
document loaded; it does not always have an application leaf document loaded.
The name of the interpreter's current application is the application root document's absolute
URI. The absolute URI includes a query string, if present, but it does not include a fragment
identifier. The interpreter remains in the same application as long as the name remains the
same. When the name changes, a new application is entered and its root context is initialized.
The application's root context consists of the variables, grammars, catch elements, scripts, and
properties in application scope.
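The naming rule can be made concrete with a small Python sketch: resolve the application attribute against the document's URI, keep any query string, and drop the fragment. The function name `application_name` and the example URIs are hypothetical.

```python
from urllib.parse import urldefrag, urljoin


def application_name(document_uri, application_attr=None):
    """Return the absolute URI that names a document's application.

    A leaf document names its root through the application attribute;
    a root document (no attribute) is named by its own URI. The query
    string is kept and the fragment identifier is dropped.
    """
    reference = application_attr or ""
    absolute = urljoin(document_uri, reference)
    return urldefrag(absolute).url


leaf = application_name(
    "http://example.com/app/leaf.vxml#say_goodbye",
    "app-root.vxml?lang=en#top",
)
root = application_name("http://example.com/app/app-root.vxml?lang=en#intro")
print(leaf)
print(root)
print(leaf == root)  # same name, so the interpreter stays in the same application
```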
During a user session an interpreter transitions from one document to another as requested by
<choice>, <goto>, <link>, <subdialog>, and <submit> elements. Some transitions are within an
application, others are between applications. The preservation or initialization of the root
context depends on the type of transition:
Root to Leaf Within Application
A root to leaf transition within the same application occurs when the current document
is a root document and the target document's application attribute's value resolves to
the same absolute URI as the name of the current application. The application root
document and its context are preserved.
Leaf to Leaf Within Application
A leaf to leaf transition within the same application occurs when the current document
is a leaf document and the target document's application attribute's value resolves to
the same absolute URI as the name of the current application. The application root
document and its context are preserved.
Leaf to Root Within Application
A leaf to root transition within the same application occurs when the current document
is a leaf document and the target document's absolute URI is the same as the name of
the current application. The current application root document and its context are
preserved when the transition is caused by a <choice>, <goto>, or <link> element. The
root context is initialized when a <submit> element causes the leaf to root transition,
because a <submit> always results in a fetch of its URI.
Root to Root
A root to root transition occurs when the current document is a root document and the
target document is a root document, i.e. it does not have an application attribute. The
root context is initialized with the application root document returned by the caching
policy in Section 6.1.2. The caching policy is consulted even when the name of the target
application and the current application are the same.
Subdialog
A subdialog invocation occurs when a root or leaf document executes a <subdialog>
element. As discussed in Section 2.3.4, subdialog invocation creates a new execution
context. The application root document and its context in the calling document's
execution context are preserved untouched during subdialog execution, and are used
again once the subdialog returns. A subdialog's new execution context has its own root
context and, possibly, leaf context. When the subdialog is invoked with a non‐empty URI
reference, the caching policy in Section 6.1.2 is used to acquire the root and leaf
documents that will be used to initialize the new root and leaf contexts. If a subdialog is
invoked with an empty URI reference and a fragment identifier, e.g. "#sub1", the root
and leaf documents remain unchanged, and therefore the current root and leaf
documents will be used to initialize the new root and leaf contexts.
Inter‐Application Transitions
All other transitions are between applications which cause the application root context
to be initialized with the next application's root document.
If a document refers to a non‐existent application root document, an error.badfetch event is
thrown. If a document's application attribute refers to a document that also has an application
attribute specified, an error.semantic event is thrown.
The following diagrams illustrate the effect of the transitions between root and leaf documents
on the application root context. In these diagrams, boxes represent documents, box texture
changes identify root context initialization, solid arrows symbolize transitions to the URI in the
arrow's label, dashed vertical arrows indicate an application attribute whose URI is the arrow's
label.
Figure 3: Transitions that Preserve the Root Context
In this diagram, all the documents belong to the same application. The transitions are identified
by the numbers 1‐4 across the top of the figure. They are:
1. A transition to URI A results in document 1, the application context is initialized from
document 1's content. Assume that this is the first document in the session. The current
application's name is A.
2. Document 1 specifies a transition to URI B, which yields document 2. Document 2's
application attribute equals URI A. The root is document 1 with its context preserved.
This is a root to leaf transition within the same application.
3. Document 2 specifies a transition to URI C, which yields another leaf document,
document 3. Its application attribute also equals URI A. The root is document 1 with its
context preserved. This is a leaf to leaf transition within the same application.
4. Document 3 specifies a transition to URI A using a <choice>, <goto>, or <link>.
Document 1 is used with its root context intact. This is a leaf to root transition within the
same application.
The next diagram illustrates transitions which initialize the root context.
Figure 4: Transitions that Initialize the Root Context
5. Document 1 specifies a transition to its own URI A. The resulting document 4 does not
have an application attribute, so it is considered a root document, and the root context
is initialized. This is a root to root transition.
6. Document 4 specifies a transition to URI D, which yields a leaf document 5. Its
application attribute is different: URI E. A new application is being entered. URI E
produces the root document 6. The root context is initialized from the content of
document 6. This is an inter‐application transition.
7. Document 5 specifies a transition to URI A. The cache check returns document 4 which
does not have an application attribute and therefore belongs to application A, so the
root context is initialized. Initialization occurs even though this application and this root
document were used earlier in the session. This is an inter‐application transition.
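The numbered transitions above can be replayed against a small classifier written in Python. This is a sketch of the rules as stated in this section, not an interpreter: the function name `classify_transition` and its parameters are hypothetical, URIs are reduced to the single letters used in the figures, subdialog invocation is left out, and step 1 (the first document of the session) is not a transition and is skipped.

```python
PRESERVING_ELEMENTS = {"choice", "goto", "link"}


def classify_transition(current_is_root, current_app, target_uri,
                        target_application=None, element="goto"):
    """Return (transition kind, True if the root context is preserved).

    `target_application` is the target's resolved application attribute,
    or None when the target is itself a root document.
    """
    if target_application is not None:
        if target_application == current_app:
            kind = "root to leaf" if current_is_root else "leaf to leaf"
            return kind, True
        return "inter-application", False
    if current_is_root:
        return "root to root", False
    if target_uri == current_app:
        # A <submit> always fetches its URI, so the context is initialized.
        return "leaf to root", element in PRESERVING_ELEMENTS
    return "inter-application", False


steps = [
    # (step, current_is_root, current_app, target_uri, target_application, element)
    (2, True, "A", "B", "A", "goto"),
    (3, False, "A", "C", "A", "goto"),
    (4, False, "A", "A", None, "goto"),
    (5, True, "A", "A", None, "goto"),
    (6, True, "A", "D", "E", "goto"),
    (7, False, "E", "A", None, "goto"),
]
for step, *args in steps:
    print(step, classify_transition(*args))
```

Changing the element in step 4 to "submit" turns the result into a leaf to root transition whose root context is initialized.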
1.5.3 Subdialogs
A subdialog is a mechanism for decomposing complex sequences of dialogs to better structure
them, or to create reusable components. For example, the solicitation of account information
may involve gathering several pieces of information, such as account number, and home
telephone number. A customer care service might be structured with several independent
applications that could share this basic building block, thus it would be reasonable to construct
it as a subdialog. This is illustrated in the example below. The first document, app.vxml, seeks to
adjust a customer's account, and in doing so must get the account information and then the
adjustment level. The account information is obtained by using a subdialog element that
invokes another VoiceXML document to solicit the user input. While the second document is
being executed, the calling dialog is suspended, awaiting the return of information. The second
document provides the results of its user interactions using a <return> element, and the
resulting values are accessed through the variable defined by the name attribute on the
<subdialog> element.
Customer Service Application (app.vxml)
<?xml version="1.0" encoding="UTF-8"?>
<vxml xmlns="http://www.w3.org/2001/vxml"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.w3.org/2001/vxml
   http://www.w3.org/TR/voicexml20/vxml.xsd"
  version="2.0">
  <form id="billing_adjustment">
    <var name="account_number"/>
    <var name="home_phone"/>
    <subdialog name="accountinfo" src="acct_info.vxml#basic">
      <filled>
        <!-- Note the variable defined by "accountinfo" is
          returned as an ECMAScript object and it contains two
          properties defined by the variables specified in the
          "return" element of the subdialog. -->
        <assign name="account_number" expr="accountinfo.acctnum"/>
        <assign name="home_phone" expr="accountinfo.acctphone"/>
      </filled>
    </subdialog>
    <field name="adjustment_amount">
      <grammar type="application/srgs+xml" src="/grammars/currency.grxml"/>
      <prompt>
        What is the value of your account adjustment?
      </prompt>
      <filled>
        <submit next="/cgi-bin/updateaccount"/>
      </filled>
    </field>
  </form>
</vxml>
Document Containing Account Information Subdialog (acct_info.vxml)
<?xml version="1.0" encoding="UTF-8"?>
<vxml xmlns="http://www.w3.org/2001/vxml"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.w3.org/2001/vxml
                          http://www.w3.org/TR/voicexml20/vxml.xsd"
      version="2.0">
  <form id="basic">
    <field name="acctnum">
      <grammar type="application/srgs+xml" src="/grammars/digits.grxml"/>
      <prompt> What is your account number? </prompt>
    </field>
    <field name="acctphone">
      <grammar type="application/srgs+xml" src="/grammars/phone_numbers.grxml"/>
      <prompt> What is your home telephone number? </prompt>
      <filled>
        <!-- The values obtained by the two fields are supplied
             to the calling dialog by the "return" element. -->
        <return namelist="acctnum acctphone"/>
      </filled>
    </field>
  </form>
</vxml>
Subdialogs add a new execution context when they are invoked. The subdialog could be a new
dialog within the existing document, or a new dialog within a new document.
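As a minimal sketch of the first case, the calling form and the subdialog can sit in one document, with the src attribute pointing at a form id through a fragment identifier. The form ids (main, get_pin), the field name pin and the use of the built-in digits field type are assumptions made for illustration; support for built-in types varies by platform.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml xmlns="http://www.w3.org/2001/vxml" version="2.0">
  <!-- Calling dialog: its context is suspended while the subdialog runs -->
  <form id="main">
    <subdialog name="result" src="#get_pin">
      <filled>
        <!-- The returned value is a property of the "result" object -->
        <prompt>You entered <value expr="result.pin"/>.</prompt>
      </filled>
    </subdialog>
  </form>

  <!-- Subdialog in the same document, executed in its own context -->
  <form id="get_pin">
    <field name="pin" type="digits">
      <prompt>Please say your PIN.</prompt>
      <filled>
        <return namelist="pin"/>
      </filled>
    </field>
  </form>
</vxml>
```

Moving get_pin into a separate document would only change the src value, as in the app.vxml example earlier in this section.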
Subdialogs can be composed of several documents. Figure 5 shows the execution flow where a
sequence of documents (D) transitions to a subdialog (SD) and then back.
Figure 5: Subdialog composed of several documents
returning from the last subdialog document.
The execution context in dialog D2 is suspended when it invokes the subdialog SD1 in document
sd1.vxml. This subdialog specifies that execution is to be transferred to the dialog in sd2.vxml (using
<goto>). Consequently, when the dialog in sd2.vxml returns, control is returned directly to
dialog D2.
Figure 6 shows an example of a multi‐document subdialog where control is transferred from
one subdialog to another.
Figure 6: Subdialog composed of several documents
returning from the first subdialog document.
The subdialog in sd1.vxml specifies that control is to be transferred to a second subdialog, SD2,
in sd2.vxml. When executing SD2, there are two suspended contexts: the dialog context in D2 is
suspended awaiting the return of SD1, and the dialog context in SD1 is suspended awaiting the
return of SD2. When SD2 returns, control is returned to SD1, which in turn returns control to dialog D2.
1.5.4 Final Processing
Under certain circumstances (in particular, while the VoiceXML interpreter is processing a
disconnect event) the interpreter may continue executing in the final processing state after
there is no longer a connection to allow the interpreter to interact with the end user. The
purpose of this state is to allow the VoiceXML application to perform any necessary final
cleanup, such as submitting information to the application server. For example, the following
<catch> element will catch the connection.disconnect.hangup event and execute in the final
processing state:
<catch event="connection.disconnect.hangup">
<submit namelist="myExit" next="http://mysite/exit.jsp"/>
</catch>
While in the final processing state the application must remain in the transitioning state and
may not enter the waiting state (as described in Section 4.1.8). Thus for example the application
should not enter <field>, <record>, or <transfer> while in the final processing state. The
VoiceXML interpreter must exit if the VoiceXML application attempts to enter the waiting state
while in the final processing state.
Aside from this restriction, execution of the VoiceXML application continues normally while in
the final processing state. Thus for example the application may transition between documents
while in the final processing state, and the interpreter must exit if no form item is eligible to be
selected (as described in Section 2.1.1).
1 Introduction
This specification defines the syntax and semantics of the XSLT language. A transformation in
the XSLT language is expressed as a well‐formed XML document [XML] conforming to the
Namespaces in XML Recommendation [XML Names], which may include both elements that are
defined by XSLT and elements that are not defined by XSLT. XSLT‐defined elements are
distinguished by belonging to a specific XML namespace (see [2.1 XSLT Namespace]), which is
referred to in this specification as the XSLT namespace. Thus this specification is a definition of
the syntax and semantics of the XSLT namespace.
A transformation expressed in XSLT describes rules for transforming a source tree into a result
tree. The transformation is achieved by associating patterns with templates. A pattern is
matched against elements in the source tree. A template is instantiated to create part of the
result tree. The result tree is separate from the source tree. The structure of the result tree can
be completely different from the structure of the source tree. In constructing the result tree,
elements from the source tree can be filtered and reordered, and arbitrary structure can be
added.
A transformation expressed in XSLT is called a stylesheet. This is because, in the case when XSLT
is transforming into the XSL formatting vocabulary, the transformation functions as a
stylesheet.
This document does not specify how an XSLT stylesheet is associated with an XML document. It
is recommended that XSL processors support the mechanism described in [XML Stylesheet].
When this or any other mechanism yields a sequence of more than one XSLT stylesheet to be
applied simultaneously to an XML document, then the effect should be the same as applying a
single stylesheet that imports each member of the sequence in order (see [2.6.2 Stylesheet
Import]).
A stylesheet contains a set of template rules. A template rule has two parts: a pattern which is
matched against nodes in the source tree and a template which can be instantiated to form
part of the result tree. This allows a stylesheet to be applicable to a wide class of documents
that have similar source tree structures.
A template is instantiated for a particular source element to create part of the result tree. A
template can contain elements that specify literal result element structure. A template can also
contain elements from the XSLT namespace that are instructions for creating result tree
fragments. When a template is instantiated, each instruction is executed and replaced by the
result tree fragment that it creates. Instructions can select and process descendant source
elements. Processing a descendant element creates a result tree fragment by finding the
applicable template rule and instantiating its template. Note that elements are only processed
when they have been selected by the execution of an instruction. The result tree is constructed
by finding the template rule for the root node and instantiating its template.
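The processing model described above can be sketched with a two-rule stylesheet. The source vocabulary (doc, title, para) is a hypothetical example, not something defined by XSLT.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Assumed input:
       <doc><title>Notes</title><para>First</para><para>Second</para></doc> -->

  <!-- Rule for the root node: its template builds the outer result structure -->
  <xsl:template match="/">
    <html>
      <body>
        <!-- Instruction that selects descendant elements for processing;
             each selected para is handled by its own template rule -->
        <xsl:apply-templates select="doc/para"/>
      </body>
    </html>
  </xsl:template>

  <!-- The pattern "para" is matched against source nodes; the template
       mixes a literal result element (p) with an XSLT instruction -->
  <xsl:template match="para">
    <p><xsl:value-of select="."/></p>
  </xsl:template>
</xsl:stylesheet>
```

The title element never appears in the result, because no instruction selects it; only the two para elements are processed.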
In the process of finding the applicable template rule, more than one template rule may have a
pattern that matches a given element. However, only one template rule will be applied. The
method for deciding which template rule to apply is described in [5.5 Conflict Resolution for
Template Rules].
A single template by itself has considerable power: it can create structures of arbitrary
complexity; it can pull string values out of arbitrary locations in the source tree; it can generate
structures that are repeated according to the occurrence of elements in the source tree. For
simple transformations where the structure of the result tree is independent of the structure of
the source tree, a stylesheet can often consist of only a single template, which functions as a
template for the complete result tree. Transformations on XML documents that represent data
are often of this kind (see [D.2 Data Example]). XSLT allows a simplified syntax for such
stylesheets (see [2.3 Literal Result Element as Stylesheet]).
When a template is instantiated, it is always instantiated with respect to a current node and a
current node list. The current node is always a member of the current node list. Many
operations in XSLT are relative to the current node. Only a few instructions change the current
node list or the current node (see [5 Template Rules] and [8 Repetition]); during the
instantiation of one of these instructions, the current node list changes to a new list of nodes
and each member of this new list becomes the current node in turn; after the instantiation of
the instruction is complete, the current node and current node list revert to what they were
before the instruction was instantiated.
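A small sketch of how the current node and current node list change and revert, using xsl:for-each (one of the instructions covered under [8 Repetition]). The list and item element names are assumptions for illustration.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/list">
    <!-- Here the current node is the list element -->
    <xsl:for-each select="item">
      <!-- Inside xsl:for-each the current node list is the selected item
           elements, and each item becomes the current node in turn -->
      <xsl:value-of select="position()"/>
      <xsl:text> of </xsl:text>
      <xsl:value-of select="last()"/>
      <xsl:text>: </xsl:text>
      <xsl:value-of select="."/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
    <!-- After xsl:for-each the current node is the list element again,
         so the relative path "item" is evaluated from list -->
    <xsl:value-of select="count(item)"/>
    <xsl:text> items&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

For an input such as `<list><item>a</item><item>b</item></list>`, this writes the lines "1 of 2: a", "2 of 2: b" and "2 items".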
XSLT makes use of the expression language defined by [XPath] for selecting elements for
processing, for conditional processing and for generating text.
XSLT provides two "hooks" for extending the language, one hook for extending the set of
instruction elements used in templates and one hook for extending the set of functions used in
XPath expressions. These hooks are both based on XML namespaces. This version of XSLT does
not define a mechanism for implementing the hooks. See [14 Extensions].
NOTE: The XSL WG intends to define such a mechanism in a future version of this specification
or in a separate specification.
The element syntax summary notation used to describe the syntax of XSLT‐defined elements is
described in [18 Notation].
The MIME media types text/xml and application/xml [RFC2376] should be used for XSLT
stylesheets. It is possible that a media type will be registered specifically for XSLT stylesheets; if
and when it is, that media type may also be used.
2 Stylesheet Structure
2.1 XSLT Namespace
The XSLT namespace has the URI http://www.w3.org/1999/XSL/Transform.
NOTE: The 1999 in the URI indicates the year in which the URI was allocated by the W3C. It does
not indicate the version of XSLT being used, which is specified by attributes (see [2.2 Stylesheet
Element] and [2.3 Literal Result Element as Stylesheet]).
XSLT processors must use the XML namespaces mechanism [XML Names] to recognize
elements and attributes from this namespace. Elements from the XSLT namespace are
recognized only in the stylesheet, not in the source document. The complete list of XSLT-defined
elements is specified in [B Element Syntax Summary]. Vendors must not extend the XSLT
namespace with additional elements or attributes. Instead, any extension must be in a separate
namespace. Any namespace that is used for additional instruction elements must be identified
by means of the extension element mechanism specified in [14.1 Extension Elements].
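The following sketch shows an extension element kept in its own namespace and declared through extension-element-prefixes. The ext prefix, its namespace URI and the ext:log element are hypothetical; xsl:fallback is the XSLT 1.0 element described in [15 Fallback].

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:ext="http://example.com/xslt-extensions"
                extension-element-prefixes="ext">
  <xsl:template match="/">
    <!-- ext:log is a made-up extension instruction in a separate namespace -->
    <ext:log message="starting">
      <!-- Instantiated only if the processor does not implement ext:log -->
      <xsl:fallback>
        <xsl:message>ext:log is not available</xsl:message>
      </xsl:fallback>
    </ext:log>
    <xsl:apply-templates/>
  </xsl:template>
</xsl:stylesheet>
```

Without the extension-element-prefixes declaration, ext:log would be treated as a literal result element and copied to the result tree.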
This specification uses a prefix of xsl: for referring to elements in the XSLT namespace.
However, XSLT stylesheets are free to use any prefix, provided that there is a namespace
declaration that binds the prefix to the URI of the XSLT namespace.
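For instance, a stylesheet may bind some other prefix to the XSLT namespace. In this sketch the prefix t is an arbitrary choice; what identifies the elements as XSLT is the namespace URI.

```xml
<t:stylesheet version="1.0"
              xmlns:t="http://www.w3.org/1999/XSL/Transform">
  <t:output method="text"/>
  <!-- Writes the name of the document element of the source document -->
  <t:template match="/">
    <t:value-of select="name(*)"/>
  </t:template>
</t:stylesheet>
```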
An element from the XSLT namespace may have any attribute not from the XSLT namespace,
provided that the expanded‐name of the attribute has a non‐null namespace URI. The presence
of such attributes must not change the behavior of XSLT elements and functions defined in this
document. Thus, an XSLT processor is always free to ignore such attributes, and must ignore
such attributes without giving an error if it does not recognize the namespace URI. Such
attributes can provide, for example, unique identifiers, optimization hints, or documentation.
It is an error for an element from the XSLT namespace to have attributes with expanded‐names
that have null namespace URIs (i.e. attributes with unprefixed names) other than attributes
defined for the element in this document.
NOTE: The conventions used for the names of XSLT elements, attributes and functions are that
names are all lower‐case, use hyphens to separate words, and use abbreviations only if they
already appear in the syntax of a related language such as XML or HTML.
2.2 Stylesheet Element
<xsl:stylesheet
  id = id
  extension-element-prefixes = tokens
  exclude-result-prefixes = tokens
  version = number>
  <!-- Content: (xsl:import*, top-level-elements) -->
</xsl:stylesheet>
<xsl:transform
  id = id
  extension-element-prefixes = tokens
  exclude-result-prefixes = tokens
  version = number>
  <!-- Content: (xsl:import*, top-level-elements) -->
</xsl:transform>
A stylesheet is represented by an xsl:stylesheet element in an XML document. xsl:transform is
allowed as a synonym for xsl:stylesheet.
An xsl:stylesheet element must have a version attribute, indicating the version of XSLT that the
stylesheet requires. For this version of XSLT, the value should be 1.0. When the value is not
equal to 1.0, forwards‐compatible processing mode is enabled (see [2.5 Forwards‐Compatible
Processing]).
The xsl:stylesheet element may contain the following types of elements:
• xsl:import
• xsl:include
• xsl:strip-space
• xsl:preserve-space
• xsl:output
• xsl:key
• xsl:decimal-format
• xsl:namespace-alias
• xsl:attribute-set
• xsl:variable
• xsl:param
• xsl:template
An element occurring as a child of an xsl:stylesheet element is called a top‐level element.
This example shows the structure of a stylesheet. Ellipses (...) indicate where attribute values or
content have been omitted. Although this example shows one of each type of allowed element,
stylesheets may contain zero or more of each of these elements.
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:import href="..."/>
  <xsl:include href="..."/>
  <xsl:strip-space elements="..."/>
  <xsl:preserve-space elements="..."/>
  <xsl:output method="..."/>
  <xsl:key name="..." match="..." use="..."/>
  <xsl:decimal-format name="..."/>
  <xsl:namespace-alias stylesheet-prefix="..." result-prefix="..."/>
  <xsl:attribute-set name="...">
    ...
  </xsl:attribute-set>
  <xsl:variable name="...">...</xsl:variable>
  <xsl:param name="...">...</xsl:param>
  <xsl:template match="...">
    ...
  </xsl:template>
  <xsl:template name="...">
    ...
  </xsl:template>
</xsl:stylesheet>
The order in which the children of the xsl:stylesheet element occur is not significant except for
xsl:import elements and for error recovery. Users are free to order the elements as they prefer,
and stylesheet creation tools need not provide control over the order in which the elements
occur.
In addition, the xsl:stylesheet element may contain any element not from the XSLT namespace,
provided that the expanded‐name of the element has a non‐null namespace URI. The presence
of such top‐level elements must not change the behavior of XSLT elements and functions
defined in this document; for example, it would not be permitted for such a top‐level element
to specify that xsl:apply‐templates was to use different rules to resolve conflicts. Thus, an XSLT
processor is always free to ignore such top‐level elements, and must ignore a top‐level element
without giving an error if it does not recognize the namespace URI. Such elements can provide,
for example,
• information used by extension elements or extension functions (see [14 Extensions]),
• information about what to do with the result tree,
• information about how to obtain the source tree,
• metadata about the stylesheet,
• structured documentation for the stylesheet.
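A sketch of both allowances, a top-level element and an attribute from a non-XSLT namespace. The meta prefix, its namespace URI and the names meta:info, meta:author and meta:reviewed are hypothetical.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:meta="http://example.com/stylesheet-meta"
                exclude-result-prefixes="meta">
  <!-- Top-level element with a non-null namespace URI: a processor that
       does not recognize the namespace must ignore it without error -->
  <meta:info>
    <meta:author>stylesheet author name goes here</meta:author>
  </meta:info>

  <!-- Namespaced attribute on an XSLT element: it must not change
       the behavior of xsl:template -->
  <xsl:template match="/" meta:reviewed="yes">
    <summary><xsl:value-of select="count(//*)"/></summary>
  </xsl:template>
</xsl:stylesheet>
```

The exclude-result-prefixes attribute keeps the meta namespace declaration from being copied onto the summary element in the result.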
2.3 Literal Result Element as Stylesheet
A simplified syntax is allowed for stylesheets that consist of only a single template for the root
node. The stylesheet may consist of just a literal result element (see [7.1.1 Literal Result
Elements]). Such a stylesheet is equivalent to a stylesheet with an xsl:stylesheet element
containing a template rule containing the literal result element; the template rule has a match
pattern of /. For example
<html xsl:version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
      xmlns="http://www.w3.org/TR/xhtml1/strict">
  <head>
    <title>Expense Report Summary</title>
  </head>
  <body>
    <p>Total Amount: <xsl:value-of select="expense-report/total"/></p>
  </body>
</html>
has the same meaning as
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns="http://www.w3.org/TR/xhtml1/strict">
  <xsl:template match="/">
    <html>
      <head>
        <title>Expense Report Summary</title>
      </head>
      <body>
        <p>Total Amount: <xsl:value-of select="expense-report/total"/></p>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
A literal result element that is the document element of a stylesheet must have an xsl:version
attribute, which indicates the version of XSLT that the stylesheet requires. For this version of
XSLT, the value should be 1.0; the value must be a Number. Other literal result elements may
also have an xsl:version attribute. When the xsl:version attribute is not equal to 1.0, forwards‐
compatible processing mode is enabled (see [2.5 Forwards‐Compatible Processing]).
The allowed content of a literal result element when used as a stylesheet is no different from
when it occurs within a stylesheet. Thus, a literal result element used as a stylesheet cannot
contain top‐level elements.
In some situations, the only way that a system can recognize that an XML document needs to
be processed by an XSLT processor as an XSLT stylesheet is by examining the XML document
itself. Using the simplified syntax makes this harder.
NOTE: For example, another XML language (AXL) might also use an axl:version on the document
element to indicate that an XML document was an AXL document that required processing by
an AXL processor; if a document had both an axl:version attribute and an xsl:version attribute,
it would be unclear whether the document should be processed by an XSLT processor or an AXL
processor.
Therefore, the simplified syntax should not be used for XSLT stylesheets that may be used in
such a situation. This situation can, for example, arise when an XSLT stylesheet is transmitted as
a message with a MIME media type of text/xml or application/xml to a recipient that will use
the MIME media type to determine how the message is processed.
2.4 Qualified Names
The name of an internal XSLT object, specifically a named template (see [6 Named Templates]),
a mode (see [5.7 Modes]), an attribute set (see [7.1.4 Named Attribute Sets]), a key (see [12.2
Keys]), a decimal‐format (see [12.3 Number Formatting]), a variable or a parameter (see [11
Variables and Parameters]) is specified as a QName. If it has a prefix, then the prefix is
expanded into a URI reference using the namespace declarations in effect on the attribute in
which the name occurs. The expanded‐name consisting of the local part of the name and the
possibly null URI reference is used as the name of the object. The default namespace is not
used for unprefixed names.
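To illustrate, the sketch below defines two named templates whose expanded-names differ only in namespace URI. The lib prefix, its URI and the doc/title source vocabulary are assumptions for illustration.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:lib="http://example.com/lib"
                xmlns="http://www.w3.org/1999/xhtml"
                exclude-result-prefixes="lib">
  <!-- Expanded-name: namespace URI http://example.com/lib, local part heading -->
  <xsl:template name="lib:heading">
    <h1><xsl:value-of select="title"/></h1>
  </xsl:template>

  <!-- Unprefixed name: the namespace URI is null even though a default
       namespace is declared, so this does not clash with lib:heading -->
  <xsl:template name="heading">
    <h2><xsl:value-of select="title"/></h2>
  </xsl:template>

  <xsl:template match="/doc">
    <html>
      <body>
        <xsl:call-template name="lib:heading"/>
        <xsl:call-template name="heading"/>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```

The default namespace still applies to the literal result elements (html, body, h1, h2); it is only the names of XSLT objects that ignore it.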
2.5 Forwards‐Compatible Processing
An element enables forwards‐compatible mode for itself, its attributes, its descendants and
their attributes if either it is an xsl:stylesheet element whose version attribute is not equal to
1.0, or it is a literal result element that has an xsl:version attribute whose value is not equal to
1.0, or it is a literal result element that does not have an xsl:version attribute and that is the
document element of a stylesheet using the simplified syntax (see [2.3 Literal Result Element
as Stylesheet]). A literal result element that has an xsl:version attribute whose value is equal to
1.0 disables forwards‐compatible mode for itself, its attributes, its descendants and their
attributes.
If an element is processed in forwards‐compatible mode, then:
• if it is a top‐level element and XSLT 1.0 does not allow such elements as top‐level
elements, then the element must be ignored along with its content;
• if it is an element in a template and XSLT 1.0 does not allow such elements to occur in
templates, then if the element is not instantiated, an error must not be signaled, and if
the element is instantiated, the XSLT processor must perform fallback for the element as
specified in [15 Fallback];
• if the element has an attribute that XSLT 1.0 does not allow the element to have or if
the element has an optional attribute with a value that XSLT 1.0 does not allow the
attribute to have, then the attribute must be ignored.
Thus, any XSLT 1.0 processor must be able to process the following stylesheet without error,
although the stylesheet includes elements from the XSLT namespace that are not defined in this
specification:
<xsl:stylesheet version="1.1"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <xsl:choose>
      <xsl:when test="system-property('xsl:version') &gt;= 1.1">
        <xsl:exciting-new-1.1-feature/>
      </xsl:when>
      <xsl:otherwise>
        <html>
          <head>
            <title>XSLT 1.1 required</title>
          </head>
          <body>
            <p>Sorry, this stylesheet requires XSLT 1.1.</p>
          </body>
        </html>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
NOTE: If a stylesheet depends crucially on a top-level element introduced by a version of XSL
after 1.0, then the stylesheet can use an xsl:message element with terminate="yes" (see [13
Messages]) to ensure that XSLT processors implementing earlier versions of XSL will not silently
ignore the top‐level element. For example,
<xsl:stylesheet version="1.5"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:important-new-1.1-declaration/>
  <xsl:template match="/">
    <xsl:choose>
      <xsl:when test="system-property('xsl:version') &lt; 1.1">
        <xsl:message terminate="yes">
          <xsl:text>Sorry, this stylesheet requires XSLT 1.1.</xsl:text>
        </xsl:message>
      </xsl:when>
      <xsl:otherwise>
        ...
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
  ...
</xsl:stylesheet>
If an expression occurs in an attribute that is processed in forwards‐compatible mode, then an
XSLT processor must recover from errors in the expression as follows:
• if the expression does not match the syntax allowed by the XPath grammar, then an
error must not be signaled unless the expression is actually evaluated;
• if the expression calls a function with an unprefixed name that is not part of the XSLT
library, then an error must not be signaled unless the function is actually called;
• if the expression calls a function with a number of arguments that XSLT does not allow
or with arguments of types that XSLT does not allow, then an error must not be signaled
unless the function is actually called.
2.6 Combining Stylesheets
XSLT provides two mechanisms to combine stylesheets:
• an inclusion mechanism that allows stylesheets to be combined without changing the
semantics of the stylesheets being combined, and
• an import mechanism that allows stylesheets to override each other.
2.6.1 Stylesheet Inclusion
<!-- Category: top-level-element -->
<xsl:include
  href = uri-reference />
An XSLT stylesheet may include another XSLT stylesheet using an xsl:include element. The
xsl:include element has an href attribute whose value is a URI reference identifying the
stylesheet to be included. A relative URI is resolved relative to the base URI of the xsl:include
element (see [3.2 Base URI]).
The xsl:include element is only allowed as a top‐level element.
The inclusion works at the XML tree level. The resource located by the href attribute value is
parsed as an XML document, and the children of the xsl:stylesheet element in this document
replace the xsl:include element in the including document. The fact that template rules or
definitions are included does not affect the way they are processed.
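As a sketch of inclusion at the XML tree level, consider two stylesheets. The file names common.xsl and main.xsl and the para element are hypothetical. First, common.xsl:

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="para">
    <p><xsl:apply-templates/></p>
  </xsl:template>
</xsl:stylesheet>
```

Then main.xsl, which includes it:

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Replaced by the children of the xsl:stylesheet element in common.xsl;
       the relative URI is resolved against the base URI of this element -->
  <xsl:include href="common.xsl"/>

  <xsl:template match="/">
    <html>
      <body><xsl:apply-templates/></body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```

The para rule behaves exactly as if it had been written directly in main.xsl.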
The included stylesheet may use the simplified syntax described in [2.3 Literal Result Element
as Stylesheet]. The included stylesheet is treated the same as the equivalent xsl:stylesheet
element.
It is an error if a stylesheet directly or indirectly includes itself.
NOTE: Including a stylesheet multiple times can cause errors because of duplicate definitions.
Such multiple inclusions are less obvious when they are indirect. For example, if stylesheet B
includes stylesheet A, stylesheet C includes stylesheet A, and stylesheet D includes both
stylesheet B and stylesheet C, then A will be included indirectly by D twice. If all of B, C and D
are used as independent stylesheets, then the error can be avoided by separating everything in
B other than the inclusion of A into a separate stylesheet B' and changing B to contain just
inclusions of B' and A, similarly for C, and then changing D to include A, B', C'.
2.6.2 Stylesheet Import
<xsl:import
  href = uri-reference />
An XSLT stylesheet may import another XSLT stylesheet using an xsl:import element. Importing
a stylesheet is the same as including it (see [2.6.1 Stylesheet Inclusion]) except that definitions
and template rules in the importing stylesheet take precedence over template rules and
definitions in the imported stylesheet; this is described in more detail below. The xsl:import
element has an href attribute whose value is a URI reference identifying the stylesheet to be
imported. A relative URI is resolved relative to the base URI of the xsl:import element (see [3.2
Base URI]).
The xsl:import element is only allowed as a top‐level element. The xsl:import element children
must precede all other element children of an xsl:stylesheet element, including any xsl:include
element children. When xsl:include is used to include a stylesheet, any xsl:import elements in
the included document are moved up in the including document to after any existing xsl:import
elements in the including document.
For example,
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:import href="article.xsl"/>
  <xsl:import href="bigfont.xsl"/>
  <xsl:attribute-set name="note-style">
    <xsl:attribute name="font-style">italic</xsl:attribute>
  </xsl:attribute-set>
</xsl:stylesheet>
The xsl:stylesheet elements encountered during processing of a stylesheet that contains
xsl:import elements are treated as forming an import tree. In the import tree, each
xsl:stylesheet element has one import child for each xsl:import element that it contains. Any
xsl:include elements are resolved before constructing the import tree. An xsl:stylesheet
element in the import tree is defined to have lower import precedence than another
xsl:stylesheet element in the import tree if it would be visited before that xsl:stylesheet
element in a post‐order traversal of the import tree (i.e. a traversal of the import tree in which
an xsl:stylesheet element is visited after its import children). Each definition and template rule
has import precedence determined by the xsl:stylesheet element that contains it.
For example, suppose
• stylesheet A imports stylesheets B and C in that order;
• stylesheet B imports stylesheet D;
• stylesheet C imports stylesheet E.
Then the order of import precedence (lowest first) is D, B, E, C, A.
NOTE: Since xsl:import elements are required to occur before any definitions or template rules,
an implementation that processes imported stylesheets at the point at which it encounters the
xsl:import element will encounter definitions and template rules in increasing order of import
precedence.
In general, a definition or template rule with higher import precedence takes precedence over a
definition or template rule with lower import precedence. This is defined in detail for each kind
of definition and for template rules.
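A sketch of import precedence for template rules, using two hypothetical stylesheets, base.xsl and custom.xsl, that both define a rule for a para element. First, base.xsl:

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="para">
    <p><xsl:apply-templates/></p>
  </xsl:template>
</xsl:stylesheet>
```

Then custom.xsl, which imports it:

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- xsl:import must precede all other children of xsl:stylesheet -->
  <xsl:import href="base.xsl"/>

  <!-- Same pattern as the imported rule; this rule has higher import
       precedence, so it is the one chosen for para elements -->
  <xsl:template match="para">
    <div class="para">
      <!-- Invokes the overridden rule from base.xsl
           (see [5.6 Overriding Template Rules]) -->
      <xsl:apply-imports/>
    </div>
  </xsl:template>
</xsl:stylesheet>
```

When custom.xsl is applied, each para yields a div wrapping the p produced by the imported rule.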
It is an error if a stylesheet directly or indirectly imports itself. Apart from this, the case where a
stylesheet with a particular URI is imported in multiple places is not treated specially. The
import tree will have a separate xsl:stylesheet for each place that it is imported.
NOTE: If xsl:apply-imports is used (see [5.6 Overriding Template Rules]), the behavior may be
different from the behavior if the stylesheet had been imported only at the place with the
highest import precedence.
2.7 Embedding Stylesheets
Normally an XSLT stylesheet is a complete XML document with the xsl:stylesheet element as
the document element. However, an XSLT stylesheet may also be embedded in another
resource. Two forms of embedding are possible:
• the XSLT stylesheet may be textually embedded in a non‐XML resource, or
• the xsl:stylesheet element may occur in an XML document other than as the document
element.
To facilitate the second form of embedding, the xsl:stylesheet element is allowed to have an ID
attribute that specifies a unique identifier.
NOTE: In order for such an attribute to be used with the XPath id function, it must actually be
declared in the DTD as being an ID.
The following example shows how the xml‐stylesheet processing instruction [XML Stylesheet]
can be used to allow a document to contain its own stylesheet. The URI reference uses a
relative URI with a fragment identifier to locate the xsl:stylesheet element:
<?xml-stylesheet type="text/xml" href="#style1"?>
<!DOCTYPE doc SYSTEM "doc.dtd">
<doc>
  <head>
    <xsl:stylesheet id="style1"
                    version="1.0"
                    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                    xmlns:fo="http://www.w3.org/1999/XSL/Format">
      <xsl:import href="doc.xsl"/>
      <xsl:template match="id('foo')">
        <fo:block font-weight="bold"><xsl:apply-templates/></fo:block>
      </xsl:template>
      <xsl:template match="xsl:stylesheet">
        <!-- ignore -->
      </xsl:template>
    </xsl:stylesheet>
  </head>
  <body>
    <para id="foo">
      ...
    </para>
  </body>
</doc>
NOTE: A stylesheet that is embedded in the document to which it is to be applied or that may
be included or imported into a stylesheet that is so embedded typically needs to contain a
template rule that specifies that xsl:stylesheet elements are to be ignored.
3 Data Model
The data model used by XSLT is the same as that used by XPath with the additions described in
this section. XSLT operates on source, result and stylesheet documents using the same data
model. Any two XML documents that have the same tree will be treated the same by XSLT.
Processing instructions and comments in the stylesheet are ignored: the stylesheet is treated as
if neither processing instruction nodes nor comment nodes were included in the tree that
represents the stylesheet.
3.1 Root Node Children
The normal restrictions on the children of the root node are relaxed for the result tree. The
result tree may have any sequence of nodes as children that would be possible for an element
node. In particular, it may have text node children, and any number of element node children.
When written out using the XML output method (see [16 Output]), it is possible that a result
tree will not be a well‐formed XML document; however, it will always be a well‐formed external
general parsed entity.
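A minimal sketch of a stylesheet whose result tree breaks the usual root-node restrictions. The element names first and second are arbitrary.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" omit-xml-declaration="yes"/>

  <xsl:template match="/">
    <!-- A text node and two element nodes, all children of the result root -->
    <xsl:text>Items: </xsl:text>
    <first/>
    <second/>
  </xsl:template>
</xsl:stylesheet>
```

The serialized output is the text "Items: " followed by two empty elements. It has no single document element, so it is not a well-formed XML document, but it could be referenced from another document as an external general parsed entity.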
When the source tree is created by parsing a well‐formed XML document, the root node of the
source tree will automatically satisfy the normal restrictions of having no text node children and
exactly one element child. When the source tree is created in some other way, for example by
using the DOM, the usual restrictions are relaxed for the source tree as for the result tree.
3.2 Base URI
Every node also has an associated URI called its base URI, which is used for resolving attribute
values that represent relative URIs into absolute URIs. If an element or processing instruction
occurs in an external entity, the base URI of that element or processing instruction is the URI of
the external entity; otherwise, the base URI is the base URI of the document. The base URI of
the document node is the URI of the document entity. The base URI for a text node, a comment
node, an attribute node or a namespace node is the base URI of the parent of the node.
3.3 Unparsed Entities
The root node has a mapping that gives the URI for each unparsed entity declared in the
document's DTD. The URI is generated from the system identifier and public identifier specified
in the entity declaration. The XSLT processor may use the public identifier to generate a URI for
the entity instead of the URI specified in the system identifier. If the XSLT processor does not
use the public identifier to generate the URI, it must use the system identifier; if the system
identifier is a relative URI, it must be resolved into an absolute URI using the URI of the
resource containing the entity declaration as the base URI [RFC2396].
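The mapping can be read from a stylesheet with the unparsed-entity-uri function, which XSLT 1.0 defines among its additional functions (it is not covered in this section). In this sketch the source vocabulary (figures, figure, src), the notation png and the entity logo are assumptions for illustration. A source document declaring an unparsed entity:

```xml
<!DOCTYPE figures [
  <!NOTATION png SYSTEM "image/png">
  <!ENTITY logo SYSTEM "images/logo.png" NDATA png>
  <!ELEMENT figures (figure*)>
  <!ELEMENT figure EMPTY>
  <!ATTLIST figure src ENTITY #REQUIRED>
]>
<figures>
  <figure src="logo"/>
</figures>
```

A stylesheet that turns the entity name into a URI:

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="figure">
    <!-- @src holds the entity name; the function looks up the URI
         recorded for that entity in the source document -->
    <img src="{unparsed-entity-uri(@src)}"/>
  </xsl:template>
</xsl:stylesheet>
```

Because the system identifier here is relative, the URI placed in the src attribute is expected to be absolute, resolved against the location of the document that contains the declaration.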
3.4 Whitespace Stripping
After the tree for a source document or stylesheet document has been constructed, but before
it is otherwise processed by XSLT, some text nodes are stripped. A text node is never stripped
unless it contains only whitespace characters. Stripping the text node removes the text node
from the tree. The stripping process takes as input a set of element names for which
whitespace must be preserved. The stripping process is applied to both stylesheets and source
documents, but the set of whitespace‐preserving element names is determined differently for
stylesheets and for source documents.
A text node is preserved if any of the following apply:
• The element name of the parent of the text node is in the set of whitespace‐preserving
element names.
• The text node contains at least one non‐whitespace character. As in XML, a whitespace
character is #x20, #x9, #xD or #xA.
• An ancestor element of the text node has an xml:space attribute with a value of
preserve, and no closer ancestor element has xml:space with a value of default.
Otherwise, the text node is stripped.
The xml:space attributes are not stripped from the tree.
NOTE: This implies that if an xml:space attribute is specified on a literal result element, it will be
included in the result.
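The three preservation conditions can be sketched as a small Python predicate. This is only an illustration under stated assumptions: the function name is_preserved and its parameters are invented for the example, and the caller is assumed to supply the xml:space values of the ancestors, nearest ancestor first.

```python
XML_WHITESPACE = " \t\r\n"  # #x20, #x9, #xD, #xA


def is_preserved(text, parent_name, xml_space_values, preserving_names):
    """Return True if a text node survives whitespace stripping.

    xml_space_values lists the xml:space values found on the ancestor
    elements, nearest ancestor first (None where the attribute is absent).
    """
    # Condition 1: the parent is in the set of whitespace-preserving names.
    if parent_name in preserving_names:
        return True
    # Condition 2: the node contains at least one non-whitespace character.
    if any(ch not in XML_WHITESPACE for ch in text):
        return True
    # Condition 3: the closest ancestor that sets xml:space says "preserve".
    for value in xml_space_values:
        if value == "preserve":
            return True
        if value == "default":
            break
    return False


print(is_preserved("\n  ", "xsl:text", [], {"xsl:text"}))    # parent is preserving
print(is_preserved("\n  ", "p", ["default", "preserve"], set()))  # closer "default" wins
```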
For stylesheets, the set of whitespace‐preserving element names consists of just xsl:text.
<!-- Category: top-level-element -->
<xsl:strip-space
elements = tokens />
<!-- Category: top-level-element -->
<xsl:preserve-space
elements = tokens />
For source documents, the set of whitespace‐preserving element names is specified by the
xsl:strip‐space and xsl:preserve‐space top‐level elements. These elements each have an elements
attribute whose value is a whitespace‐separated list of NameTests. Initially, the set of
whitespace‐preserving element names contains all element names. If an element name
matches a NameTest in an xsl:strip‐space element, then it is removed from the set of
whitespace‐preserving element names. If an element name matches a NameTest in an
xsl:preserve‐space element, then it is added to the set of whitespace‐preserving element
names. An element matches a NameTest if and only if the NameTest would be true for the
element as an XPath node test. Conflicts between matches to xsl:strip‐space and
xsl:preserve‐space elements are resolved the same way as conflicts between template rules (see
[5.5 Conflict Resolution for Template Rules]). Thus, the applicable match for a particular element
name is determined as follows:
• First, any match with lower import precedence than another match is ignored.
• Next, any match with a NameTest that has a lower default priority than the default
priority of the NameTest of another match is ignored.
It is an error if this leaves more than one match. An XSLT processor may signal the error; if it
does not signal the error, it must recover by choosing, from amongst the matches that are left,
the one that occurs last in the stylesheet.
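The resolution procedure for xsl:strip-space and xsl:preserve-space can be sketched in Python as follows. The class SpaceRule and the function names are hypothetical. The default priorities (0 for a plain name, -0.25 for prefix:*, -0.5 for *) are those given for NameTests in section 5.5 of XSLT 1.0. As a simplifying assumption, names are compared as written, prefix included, rather than as namespace-expanded names.

```python
from dataclasses import dataclass


@dataclass
class SpaceRule:
    preserve: bool          # True for xsl:preserve-space, False for xsl:strip-space
    name_test: str          # "*", "prefix:*", or an element name
    import_precedence: int  # higher value = higher import precedence
    position: int           # order of occurrence in the stylesheet


def default_priority(name_test):
    if name_test == "*":
        return -0.5
    if name_test.endswith(":*"):
        return -0.25
    return 0.0


def matches(name_test, element_name):
    if name_test == "*":
        return True
    if name_test.endswith(":*"):
        return element_name.startswith(name_test[:-1])  # keeps the "prefix:" part
    return name_test == element_name


def preserves_whitespace(element_name, rules):
    candidates = [r for r in rules if matches(r.name_test, element_name)]
    if not candidates:
        return True  # initially every element name is whitespace-preserving
    # Step 1: ignore matches with lower import precedence.
    top = max(r.import_precedence for r in candidates)
    candidates = [r for r in candidates if r.import_precedence == top]
    # Step 2: ignore matches whose NameTest has a lower default priority.
    best = max(default_priority(r.name_test) for r in candidates)
    candidates = [r for r in candidates if default_priority(r.name_test) == best]
    # More than one match left is an error; recover by taking the last one.
    return max(candidates, key=lambda r: r.position).preserve


rules = [SpaceRule(False, "*", 1, 1), SpaceRule(True, "pre", 1, 2)]
print(preserves_whitespace("pre", rules), preserves_whitespace("p", rules))
```

Here the stylesheet strips whitespace everywhere except inside pre, because the name test pre has a higher default priority than *.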
4 Expressions
XSLT uses the expression language defined by XPath [XPath]. Expressions are used in XSLT for a
variety of purposes including:
• selecting nodes for processing;
• specifying conditions for different ways of processing a node;
• generating text to be inserted in the result tree.
An expression must match the XPath production Expr.
Expressions occur as the value of certain attributes on XSLT‐defined elements and within curly
braces in attribute value templates.
In XSLT, an outermost expression (i.e. an expression that is not part of another expression) gets
its context as follows:
• the context node comes from the current node
• the context position comes from the position of the current node in the current node
list; the first position is 1
• the context size comes from the size of the current node list
• the variable bindings are the bindings in scope on the element which has the attribute in
which the expression occurs (see [11 Variables and Parameters])
• the set of namespace declarations are those in scope on the element which has the
attribute in which the expression occurs; this includes the implicit declaration of the
prefix xml required by the XML Namespaces Recommendation [XML Names]; the
default namespace (as declared by xmlns) is not part of this set
• the function library consists of the core function library together with the additional
functions defined in [12 Additional Functions] and extension functions as described in
[14 Extensions]; it is an error for an expression to include a call to any other function
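To make the first three items of the context concrete, the following Python sketch builds the context node, context position, and context size for each node in a current node list. It is a toy model only: the function name evaluation_contexts is an assumption, the nodes are plain strings, and variable bindings, namespaces, and the function library are left out.

```python
def evaluation_contexts(current_node_list):
    """Pair each node with the context position and size XSLT would supply."""
    size = len(current_node_list)  # context size = size of the current node list
    return [
        {"node": node, "position": position, "size": size}
        # Context positions are 1-based: the first position is 1.
        for position, node in enumerate(current_node_list, start=1)
    ]


for context in evaluation_contexts(["chapter-a", "chapter-b", "chapter-c"]):
    print(context)
```

In an XPath expression these values are what position() and last() return when the expression is evaluated for that node.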
1 Introduction
This specification defines the XML Linking Language (XLink), which allows elements to be
inserted into XML documents in order to create and describe links between resources.
XLink provides a framework for creating both basic unidirectional links and more complex
linking structures. It allows XML documents to:
• Assert linking relationships among more than two resources
• Associate metadata with a link
• Express links that reside in a location separate from the linked resources
An important application of XLink is in hypermedia systems that have hyperlinks. A simple case
of a hyperlink is an HTML A element, which has these characteristics:
• The hyperlink uses URIs as its locator technology.
• The hyperlink is expressed at one of its two ends.
• The hyperlink identifies the other end (although a server may have great freedom in
finding or dynamically creating that destination).
• Users can initiate traversal only from the end where the hyperlink is expressed to the
other end.
• The hyperlink's effect on windows, frames, go‐back lists, style sheets in use, and so on is
determined by user agents, not by the hyperlink itself. For example, traversal of A links
normally replaces the current view, perhaps with a user option to open a new window.
This set of characteristics is powerful, but the model that underlies them limits the range of
possible hyperlink functionality. The model defined in this specification shares with HTML the
use of URI technology, but goes beyond HTML in offering features, previously available only in
dedicated hypermedia systems, that make hyperlinking more scalable and flexible. Along with
providing linking data structures, XLink provides a minimal link behavior model; higher‐level
applications layered on XLink will often specify alternate or more sophisticated rendering and
processing treatments.
Integrated treatment of specialized links used in other technical domains, such as foreign keys
in relational databases and reference values in programming languages, is outside the scope of
this specification.
1.1 Origin and Goals
The design of XLink has been informed by knowledge of established hypermedia systems and
standards. The following standards have been especially influential:
• HTML [HTML]: Defines several element types that represent links.
• HyTime [ISO/IEC 10744]: Defines inline and inbound and third‐party link structures and
some semantic features, including traversal control and presentation of objects.
• Text Encoding Initiative Guidelines [TEI]: Provides structures for creating links, aggregate
objects, and link collections.
Many other linking systems have also informed the design of XLink, especially [Dexter], [FRESS],
[OHS], [MicroCosm], and [Intermedia].
See the XLink Requirements Document [XLREQ] for a thorough explanation of requirements for
the design of XLink.
2 XLink Concepts
This section describes the terms and concepts that are essential to understanding XLink,
without discussing the syntax used to create XLink constructs. A few additional terms are
introduced in later parts of this specification.
2.1 Links and Resources
[Definition: An XLink link is an explicit relationship between resources or portions of resources.]
[Definition: It is made explicit by an XLink linking element, which is an XLink‐conforming XML
element that asserts the existence of a link.] There are six XLink elements; only two of them are
considered linking elements. The others provide various pieces of information that describe the
characteristics of a link. (The term "link" as used in this specification refers only to an XLink link,
though nothing prevents non‐XLink constructs from serving as links.)
The notion of resources is universal to the World Wide Web. [Definition: As discussed in [IETF
RFC 2396], a resource is any addressable unit of information or service.] Examples include files,
images, documents, programs, and query results. The means used for addressing a resource is a
URI (Uniform Resource Identifier) reference (described more in 5.4 Locator Attribute (href)). It
is possible to address a portion of a resource. For example, if the whole resource is an XML
document, a useful portion of that resource might be a particular element inside the document.
Following a link to it might result, for example, in highlighting that element or scrolling to that
point in the document.
[Definition: When a link associates a set of resources, those resources are said to participate in
the link.] Even though XLink links must appear in XML documents, they are able to associate all
kinds of resources, not just XML‐encoded ones.
One of the common uses of XLink is to create hyperlinks. [Definition: A hyperlink is a link that is
intended primarily for presentation to a human user.] Nothing in XLink's design, however,
prevents it from being used with links that are intended solely for consumption by computers.
2.2 Arcs, Traversal, and Behavior
[Definition: Using or following a link for any purpose is called traversal.] Even though some
kinds of link can associate arbitrary numbers of resources, traversal always involves a pair of
resources (or portions of them); [Definition: the source from which traversal is begun is the
starting resource] and [Definition: the destination is the ending resource]. Note that the term
"resource" used in this fashion may at times apply to a resource portion, not a whole resource.
[Definition: Information about how to traverse a pair of resources, including the direction of
traversal and possibly application behavior information as well, is called an arc]. If two arcs in a
link specify the same pair of resources, but they switch places as starting and ending resources,
then the link is multidirectional, which is not the same as merely "going back" after traversing a
link.
2.3 Resources in Relation to the Physical Location of a Linking Element
[Definition: A local resource is an XML element that participates in a link by virtue of having as
its parent, or being itself, a linking element]. [Definition: Any resource or resource portion that
participates in a link by virtue of being addressed with a URI reference is considered a remote
resource, even if it is in the same XML document as the link, or even inside the same linking
element.] Put another way, a local resource is specified "by value," and a remote resource is
specified "by reference."
[Definition: An arc that has a local starting resource and a remote ending resource goes
outbound, that is, away from the linking element.] (Examples of links with such an arc are the
HTML A element, HyTime "clinks," and Text Encoding Initiative XREF elements.) [Definition: If an
arc's ending resource is local but its starting resource is remote, then the arc goes inbound.]
[Definition: If neither the starting resource nor the ending resource is local, then the arc is a
third‐party arc.] Though it is not required, any one link typically specifies only one kind of arc
throughout, and thus might be referred to as an inbound, outbound, or third‐party link.
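The three arc definitions reduce to a decision on whether each end of the arc is local. A minimal Python sketch, with a hypothetical function name, might look like this; the section defines no term for an arc whose ends are both local, so that case is reported as such.

```python
def classify_arc(start_is_local: bool, end_is_local: bool) -> str:
    """Name an arc from the locality of its starting and ending resources."""
    if start_is_local and not end_is_local:
        return "outbound"      # away from the linking element
    if end_is_local and not start_is_local:
        return "inbound"       # toward the linking element
    if not start_is_local and not end_is_local:
        return "third-party"   # neither end is local
    return "local-to-local (no term defined in this section)"


print(classify_arc(True, False))   # e.g. an HTML A element
print(classify_arc(False, False))  # e.g. an arc stored in a linkbase
```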
To create a link that emanates from a resource to which you do not have (or choose not to
exercise) write access, or from a resource that offers no way to embed linking constructs, it is
necessary to use an inbound or third‐party arc. When such arcs are used, the requirements for
discovery of the link are greater than for outbound arcs. [Definition: Documents containing
collections of inbound and third‐party links are called link databases, or linkbases.]
3 XLink Processing and Conformance
This section details processing and conformance requirements on XLink applications and
markup.
[Definition: The key words must, must not, required, shall, shall not, should, should not,
recommended, may, and optional in this specification are to be interpreted as described in
[IETF RFC 2119].]
3.1 Processing Dependencies
XLink processing depends on [XML], [XML Names], [XML Base], and [IETF RFC 2396] (as updated
by [IETF RFC 2732]).
3.2 Markup Conformance
An XML element conforms to XLink if:
1. it has a type attribute from the XLink namespace whose value is one of "simple",
"extended", "locator", "arc", "resource", "title", or "none", and
2. it adheres to the conformance constraints imposed by the chosen XLink element type,
as prescribed in this specification.
This specification imposes no particular constraints on DTDs; conformance applies only to
elements and attributes.
3.3 Application Conformance
An XLink application is any software module that interprets well‐formed XML documents
containing XLink elements and attributes, or XML information sets [XIS] containing information
items and properties corresponding to XLink elements and attributes. (This document refers to
elements and attributes, but all specifications herein apply to their information set equivalents
as well.) Such an application is conforming if:
1. it observes the mandatory conditions for applications ("must") set forth in this
specification, and
2. for any optional conditions ("should" and "may") it chooses to observe, it observes them
in the way prescribed, and
3. it performs markup conformance testing according to all the conformance constraints
appearing in this specification.
4 XLink Markup Design
This section describes the design of XLink's markup vocabulary.
Link markup needs to be recognized reliably by XLink applications in order to be traversed and
handled properly. XLink uses the mechanism described in the Namespaces in XML
Recommendation [XML Names] to accomplish recognition of the constructs in the XLink
vocabulary.
The XLink namespace defined by this specification has the following URI:
http://www.w3.org/1999/xlink
As dictated by [XML Names], the use of XLink elements and attributes requires declaration of
the XLink namespace. For example, the following declaration would make the prefix xlink
available within the myElement element to represent the XLink namespace:
<myElement
xmlns:xlink="http://www.w3.org/1999/xlink">
...
</myElement>
Note:
Most code examples in this specification do not show an XLink namespace declaration. The
xlink prefix is used throughout to stand for the declaration of the XLink namespace on elements
in whose scope the so‐marked attribute appears (on the same element that bears the attribute
or on some ancestor element), whether or not an XLink namespace declaration is present in the
example.
XLink's namespace provides global attributes for use on elements that are in any arbitrary
namespace. The global attributes are type, href, role, arcrole, title, show, actuate, label, from,
and to. Document creators use the XLink global attributes to make the elements in their own
namespace, or even in a namespace they do not control, recognizable as XLink elements. The
type attribute indicates the XLink element type (simple, extended, locator, arc, resource, or
title); the element type dictates the XLink‐imposed constraints that such an element must
follow and the behavior of XLink applications on encountering the element.
Following is an example of a crossReference element from a non‐XLink namespace that has
XLink global attributes:
<my:crossReference
xmlns:my="http://example.com/"
xmlns:xlink="http://www.w3.org/1999/xlink"
xlink:type="simple"
xlink:href="students.xml"
xlink:role="http://www.example.com/linkprops/studentlist"
xlink:title="Student List"
xlink:show="new"
xlink:actuate="onRequest">
Current List of Students
</my:crossReference>
Using global attributes always requires the use of namespace prefixes on the individual
attributes and the use of the type attribute on the element.
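One way to see why the namespace matters is to read the crossReference example with a namespace-aware parser. The sketch below uses Python's standard xml.etree.ElementTree, which exposes each attribute under its expanded name, {namespace URI}local-name. The helper xlink_attributes is a hypothetical name introduced for this illustration.

```python
import xml.etree.ElementTree as ET

XLINK_NS = "http://www.w3.org/1999/xlink"

SAMPLE = """<my:crossReference
  xmlns:my="http://example.com/"
  xmlns:xlink="http://www.w3.org/1999/xlink"
  xlink:type="simple"
  xlink:href="students.xml"
  xlink:role="http://www.example.com/linkprops/studentlist"
  xlink:title="Student List"
  xlink:show="new"
  xlink:actuate="onRequest">
Current List of Students
</my:crossReference>"""


def xlink_attributes(element):
    """Collect the attributes that belong to the XLink namespace."""
    prefix = "{" + XLINK_NS + "}"
    return {
        name[len(prefix):]: value
        for name, value in element.attrib.items()
        if name.startswith(prefix)
    }


element = ET.fromstring(SAMPLE)
attrs = xlink_attributes(element)
print(element.tag)                  # the element itself is in the my: namespace
print(attrs["type"], attrs["href"])
```

The prefix xlink is irrelevant to the parser; only the namespace URI identifies the attributes as XLink global attributes.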
4.1 XLink Attribute Usage Patterns
While the XLink attributes are considered global by virtue of their use of the namespace
mechanism, their allowed combinations on any one XLink element type depend greatly on the
value of the special type attribute (see 5.3 XLink Element Type Attribute (type) for more
information) for the element on which they appear. The conformance constraint notes in this
specification detail their allowed usage patterns. Following is a summary of the element types
(columns) on which the global attributes (rows) are allowed, with an indication of whether a
value is required (R) or optional (O):
attribute  simple    extended  locator   arc       resource  title
type       R         R         R         R         R         R
href       O                   R
role       O         O         O                   O
arcrole    O                             O
title      O         O         O         O         O
show       O                             O
actuate    O                             O
label                          O                   O
from                                     O
to                                       O
(See also B Sample DTD for a non‐normative DTD that illustrates the allowed patterns of
attributes.)
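The usage patterns can also be written down as data and checked mechanically. The following Python sketch is an assumption-laden illustration, not a conformance checker: the names ALLOWED and check_usage are invented here, the type entry is marked R for every element type because the type value must always be supplied, and attributes outside the pattern are merely reported, since XLink assigns them no meaning on that element type.

```python
# "R" = value required, "O" = value optional; absent = not part of the pattern.
ALLOWED = {
    "simple":   {"type": "R", "href": "O", "role": "O", "arcrole": "O",
                 "title": "O", "show": "O", "actuate": "O"},
    "extended": {"type": "R", "role": "O", "title": "O"},
    "locator":  {"type": "R", "href": "R", "role": "O", "title": "O", "label": "O"},
    "arc":      {"type": "R", "arcrole": "O", "title": "O", "show": "O",
                 "actuate": "O", "from": "O", "to": "O"},
    "resource": {"type": "R", "role": "O", "title": "O", "label": "O"},
    "title":    {"type": "R"},
}


def check_usage(xlink_type, attribute_names):
    """Return (missing required attributes, attributes outside the pattern)."""
    pattern = ALLOWED[xlink_type]
    present = set(attribute_names)
    missing = sorted(n for n, need in pattern.items() if need == "R" and n not in present)
    outside = sorted(present - set(pattern))
    return missing, outside


print(check_usage("locator", ["type", "role"]))   # href is required on a locator
print(check_usage("extended", ["type", "href"]))  # href is not in the extended pattern
```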
This specification uses the convention "xxx‐type element" to refer to elements that must
adhere to a named set of constraints associated with an XLink element type, no matter what
name the element actually has. For example, "locator‐type element" would refer to all of the
following elements:
<locator xlink:type="locator" ... />
<loc xlink:type="locator" ... />
<my:pointer xlink:type="locator" ... />
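In code, the "xxx-type element" convention means selecting elements by the value of their xlink:type attribute and ignoring the element name. A small Python sketch using the standard xml.etree.ElementTree module follows; the wrapper element links, the note element, and the href values are made up for the example.

```python
import xml.etree.ElementTree as ET

XLINK = "http://www.w3.org/1999/xlink"
TYPE = f"{{{XLINK}}}type"  # expanded name of the xlink:type attribute

DOC = f"""<links xmlns:xlink="{XLINK}" xmlns:my="http://example.com/">
  <locator xlink:type="locator" xlink:href="a.xml"/>
  <loc xlink:type="locator" xlink:href="b.xml"/>
  <my:pointer xlink:type="locator" xlink:href="c.xml"/>
  <note>not an XLink element</note>
</links>"""


def elements_of_type(root, xlink_type):
    """Return every element whose xlink:type has the given value."""
    return [el for el in root.iter() if el.get(TYPE) == xlink_type]


root = ET.fromstring(DOC)
names = [el.tag for el in elements_of_type(root, "locator")]
print(names)
```

All three differently named elements are returned, while note is skipped because it carries no xlink:type attribute.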
4.2 XLink Element Type Relationships
Various XLink element types have special meanings dictated by this specification when they
appear as direct children of other XLink element types. Following is a summary of the child
element types that play a significant role in particular parent element types. (Other
combinations have no XLink‐dictated significance.)
Parent type   Significant child types
simple        none
extended      locator, arc, resource, title
locator       title
arc           title
resource      none
title         none
4.3 Attribute Value Defaulting
Using XLink potentially involves using a large number of attributes for supplying important link
information. In cases where the values of the desired XLink attributes are unchanging across
individual instances in all the documents of a certain type, attribute value defaults (fixed or not)
may be added to a DTD so that the attributes do not have to appear physically on element
start‐tags. For example, if attribute defaults were provided for the xmlns:xlink, xmlns:my, type,
show, and actuate attributes in the example in the introduction to 4 XLink Markup Design, the
example would look as follows:
<my:crossReference
xlink:href="students.xml"
xlink:role="http://www.example.com/linkprops/studentlist"
xlink:title="Student List">
Current List of Students
</my:crossReference>
Information sets that have been created under the control of a DTD have all attribute values
filled in.
4.4 Integrating XLink Usage with Other Markup
This specification defines only attributes and attribute values in the XLink namespace. There is
no restriction on using non‐XLink attributes alongside XLink attributes. In addition, most XLink
attributes are optional and the choice of simple or extended link is up to the markup designer
or document creator, so a DTD that uses XLink features need not use or declare the entire set
of XLink's attributes. Finally, while this specification identifies the minimum constraints on XLink
markup, DTDs that use XLink are free to tighten these constraints. The use of XLink does not
absolve a valid document from conforming to the constraints expressed in its governing DTD.
Following is an example of a crossReference element with both XLink and non‐XLink attributes:
<my:crossReference
xmlns:my="http://example.com/"
my:lastEdited="2000-06-10"
xmlns:xlink="http://www.w3.org/1999/xlink"
xlink:type="simple"
xlink:href="students.xml">
Current List of Students
</my:crossReference>
4.5 Using XLink with Legacy Markup
Because XLink's global attributes require the use of namespace prefixes, non‐XLink‐based links
in legacy documents generally do not serve as conforming XLink constructs as they stand, even
if attribute value defaulting is used. For example, XHTML 1.0 has an a element with an href
attribute, but because the attribute is a local one attached to the a element in the XHTML
namespace, it is not the same as an xlink:href global attribute in the XLink namespace.
5 XLink Elements and Attributes
XLink offers two kinds of links:
Extended links
Extended links offer full XLink functionality, such as inbound and third‐party arcs, as well
as links that have arbitrary numbers of participating resources. As a result, their
structure can be fairly complex, including elements for pointing to remote resources,
elements for containing local resources, elements for specifying arc traversal rules, and
elements for specifying human‐readable resource and arc titles.
XLink defines a way to give an extended link special semantics for finding linkbases; used
in this fashion, an extended link helps an XLink application process other links.
Simple links
Simple links offer shorthand syntax for a common kind of link, an outbound link with
exactly two participating resources (into which category HTML‐style A and IMG links
fall). Because simple links offer less functionality than extended links, they have no
special internal structure.
While simple links are conceptually a subset of extended links, they are syntactically
different. For example, to convert a simple link into an extended link, several structural
changes would be needed.
The following sections define the XLink elements and attributes.
5.1 Extended Links (extended‐Type Element)
[Definition: An extended link is a link that associates an arbitrary number of resources. The
participating resources may be any combination of remote and local.]
The only kind of link that is able to have inbound and third‐party arcs is an extended link.
Typically, extended linking elements are stored separately from the resources they associate
(for example, in entirely different documents). Thus, extended links are important for situations
where the participating resources are read‐only, or where it is expensive to modify and update
them but inexpensive to modify and update a separate linking element, or where the resources
are in formats with no native support for embedded links (such as many multimedia formats).
The following diagram shows an extended link that associates five remote resources. This could
represent, for example, information about a student's course load: one resource being a
description of the student, another being a description of the student's academic advisor, two
resources representing courses that the student is attending, and the last resource representing
a course that the student is auditing.
Without the extended link, the resources might be entirely unrelated; for example, they might
be in five separate documents. The lines emanating from the extended link represent the
association it creates among the resources. However, notice that the lines do not have
directionality. Directionality is expressed with traversal rules; without such rules being
provided, the resources are associated in no particular order, with no implication as to whether
and how individual resources are accessed.
The following diagram shows an extended link that associates five remote resources and one
local resource (a special element inside the extended link element). This could represent the
same sort of course‐load example as described above, with the addition of the student's grade
point average stored locally. Again, the lines represent mere association of the six resources,
without traversal directions or behaviors implied.
The XLink element type for extended links is any element with an attribute in the XLink
namespace called type with a value of "extended".
The extended‐type element may contain a mixture of the following elements in any order,
possibly along with other content and markup:
• locator‐type elements that address the remote resources participating in the link
• arc‐type elements that provide traversal rules among the link's participating resources
• title‐type elements that provide human‐readable labels for the link
• resource‐type elements that supply local resources that participate in the link
It is not an error for an extended‐type element to associate fewer than two resources. If the
link has only one participating resource, or none at all, it is simply untraversable. Such a link
may still be useful, for example, to associate properties with a single resource by means of
XLink attributes, or to provide a placeholder for link information that will be populated
eventually.
Subelements of the simple or extended type anywhere inside a parent extended‐type element
have no XLink‐specified meaning. Subelements of the locator, arc, or resource type that are not
direct children of an extended‐type element have no XLink‐specified meaning.
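The "direct children only" rule can be illustrated with a short Python sketch that looks at the immediate children of an extended-type element and ignores anything nested deeper or of type simple or extended. The element names group, member, wrapper, inner, and caption are hypothetical.

```python
import xml.etree.ElementTree as ET

XLINK = "http://www.w3.org/1999/xlink"
TYPE = "{%s}type" % XLINK
SIGNIFICANT = {"locator", "arc", "resource", "title"}

DOC = """<group xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="extended">
  <member xlink:type="locator" xlink:href="a.xml"/>
  <wrapper>
    <member xlink:type="locator" xlink:href="ignored.xml"/>
  </wrapper>
  <inner xlink:type="simple" xlink:href="nested.xml"/>
  <caption xlink:type="title">A group</caption>
</group>"""


def significant_children(extended):
    """Direct children whose XLink type has meaning inside an extended link."""
    # Iterating an Element yields only its direct children, not descendants.
    return [child for child in extended if child.get(TYPE) in SIGNIFICANT]


root = ET.fromstring(DOC)
found = [(child.tag, child.get(TYPE)) for child in significant_children(root)]
print(found)
```

The locator nested inside wrapper and the nested simple link are both left out, matching the rule that they have no XLink-specified meaning in this position.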
The extended‐type element may have the semantic attributes role and title (see 5.5 Semantic
Attributes (role, arcrole, and title)). They supply semantic information about the link as a
whole; the role attribute indicates a property that the entire link has, and the title attribute
indicates a human‐readable description of the entire link. If other XLink attributes are present
on the element, they have no XLink‐specified relationship to the link. If both a title attribute
and one or more title‐type elements are present, they have no XLink‐specified relationship; a
higher‐level application built on XLink will likely want to specify appropriate treatment (for
example, precedence) in this case.
Example: Sample extended‐Type Element Declarations and Instance
Following is a non‐normative set of declarations for an extended‐type element and its
subelements. Parts of this example are reused throughout this specification. Note that the type
attribute and some other attributes are defaulted in the DTD in order to highlight the attributes
that are changing on a per‐instance basis.
<!ELEMENT courseload ((tooltip|person|course|gpa|go)*)>
<!ATTLIST courseload
xmlns:xlink CDATA #FIXED "http://www.w3.org/1999/xlink"
xlink:type (extended) #FIXED "extended"
xlink:role CDATA #IMPLIED
xlink:title CDATA #IMPLIED>
<!ELEMENT tooltip ANY>
<!ATTLIST tooltip
xlink:type (title) #FIXED "title"
xml:lang CDATA #IMPLIED>
<!ELEMENT person EMPTY>
<!ATTLIST person
xlink:type (locator) #FIXED "locator"
xlink:href CDATA #REQUIRED
xlink:role CDATA #IMPLIED
xlink:title CDATA #IMPLIED
xlink:label NMTOKEN #IMPLIED>
<!ELEMENT course EMPTY>
<!ATTLIST course
xlink:type (locator) #FIXED "locator"
xlink:href CDATA #REQUIRED
xlink:role CDATA #FIXED "http://www.example.com/linkprops/course"
xlink:title CDATA #IMPLIED
xlink:label NMTOKEN #IMPLIED>
<!-- GPA = "grade point average" -->
<!ELEMENT gpa ANY>
<!ATTLIST gpa
xlink:type (resource) #FIXED "resource"
xlink:role CDATA #FIXED "http://www.example.com/linkprops/gpa"
xlink:title CDATA #IMPLIED
xlink:label NMTOKEN #IMPLIED>
<!ELEMENT go EMPTY>
<!ATTLIST go
xlink:type (arc) #FIXED "arc"
xlink:arcrole CDATA #IMPLIED
xlink:title CDATA #IMPLIED
xlink:show (new
|replace
|embed
|other
|none) #IMPLIED
xlink:actuate (onLoad
|onRequest
|other
|none) #IMPLIED
xlink:from NMTOKEN #IMPLIED
xlink:to NMTOKEN #IMPLIED>
Following is how XML elements using these declarations might look.
<courseload>
<tooltip>Course Load for Pat Jones</tooltip>
<person
xlink:href="students/patjones62.xml"
xlink:label="student62"
xlink:role="http://www.example.com/linkprops/student"
xlink:title="Pat Jones" />
<person
xlink:href="profs/jaysmith7.xml"
xlink:label="prof7"
xlink:role="http://www.example.com/linkprops/professor"
xlink:title="Dr. Jay Smith" />
<!-- more remote resources for professors, teaching assistants, etc. -->
<course
xlink:href="courses/cs101.xml"
xlink:label="CS-101"
xlink:title="Computer Science 101" />
<!-- more remote resources for courses, seminars, etc. -->
<gpa xlink:label="PatJonesGPA">3.5</gpa>
<go
xlink:from="student62"
xlink:to="PatJonesGPA"
xlink:show="new"
xlink:actuate="onRequest"
xlink:title="Pat Jones's GPA" />
<go
xlink:from="CS-101"
xlink:arcrole="http://www.example.com/linkprops/auditor"
xlink:to="student62"
xlink:show="replace"
xlink:actuate="onRequest"
xlink:title="Pat Jones, auditing the course" />
<go
xlink:from="student62"
xlink:arcrole="http://www.example.com/linkprops/advisor"
xlink:to="prof7"
xlink:show="replace"
xlink:actuate="onRequest"
xlink:title="Dr. Jay Smith, advisor" />
</courseload>
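To show how the labels tie arcs to resources, the following Python sketch parses a shortened version of the courseload instance and lists the traversals its three arcs describe. Several things here are assumptions made for the example: the xlink:type attributes are written out explicitly because no DTD supplies the defaults, the function names are invented, and arcs that omit from or to are simply skipped.

```python
import xml.etree.ElementTree as ET

X = "{http://www.w3.org/1999/xlink}"

DOC = """<courseload xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="extended">
  <person xlink:type="locator" xlink:href="students/patjones62.xml" xlink:label="student62"/>
  <person xlink:type="locator" xlink:href="profs/jaysmith7.xml" xlink:label="prof7"/>
  <course xlink:type="locator" xlink:href="courses/cs101.xml" xlink:label="CS-101"/>
  <gpa xlink:type="resource" xlink:label="PatJonesGPA">3.5</gpa>
  <go xlink:type="arc" xlink:from="student62" xlink:to="PatJonesGPA"/>
  <go xlink:type="arc" xlink:from="CS-101" xlink:to="student62"/>
  <go xlink:type="arc" xlink:from="student62" xlink:to="prof7"/>
</courseload>"""


def endpoint(element):
    """Describe a resource: its href if remote, its element name if local."""
    if element.get(X + "type") == "locator":
        return element.get(X + "href")
    return "local <%s>" % element.tag


def traversals(link):
    """List (start, end) pairs for every arc among the link's direct children."""
    by_label = {}
    for child in link:
        if child.get(X + "type") in ("locator", "resource"):
            by_label.setdefault(child.get(X + "label"), []).append(child)
    pairs = []
    for child in link:
        if child.get(X + "type") != "arc":
            continue
        start, end = child.get(X + "from"), child.get(X + "to")
        if start is None or end is None:
            continue  # assumption: arcs lacking from/to are out of scope here
        # Several resources may share one label, so pair every match.
        for source in by_label.get(start, []):
            for target in by_label.get(end, []):
                pairs.append((endpoint(source), endpoint(target)))
    return pairs


for start, end in traversals(ET.fromstring(DOC)):
    print(start, "->", end)
```

The first arc is outbound to a local resource's neighbour in the link (a remote student record to the local gpa element), while the other two connect pairs of remote resources and are therefore third-party arcs.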
5.2 Simple Links (simple‐Type Element)
[Definition: A simple link is a link that associates exactly two resources, one local and one
remote, with an arc going from the former to the latter. Thus, a simple link is always an
outbound link.]
The purpose of a simple link is to be a convenient shorthand for the equivalent extended link. A
single simple linking element combines the basic functions of an extended‐type element, a
locator‐type element, an arc‐type element, and a resource‐type element.
The following diagram shows the characteristics of a simple link; it associates one local and one
remote resource, and implicitly provides a single traversal arc from the local resource to the
remote one. This could represent, for example, the name of a student appearing in text which,
when clicked, leads to information about the student.
Example: Simple Link Functionality Done with an Extended Link
A simple link could be represented by an extended link in approximately the following way:
<studentlink xlink:type="extended">
<resource
xlink:type="resource"
xlink:label="local">Pat Jones</resource>
<locator
xlink:type="locator"
xlink:href="..."
xlink:label="remote"
xlink:role="..."
xlink:title="..." />
<go
xlink:type="arc"
xlink:from="local"
xlink:to="remote"
xlink:arcrole="..."
xlink:show="..."
xlink:actuate="..." />
</studentlink>
A simple link combines all the features above (except for the types and labels) into a single
element. In cases where only this subset of features is required, the XLink simple linking
element is available as an alternative to the extended linking element. The features missing
from simple links are as follows:
• Supplying arbitrary numbers of local and remote resources
• Specifying an arc from its remote resource to its local resource
• Associating a title with the single hardwired arc
• Associating a role or title with the local resource
• Associating a role or title with the link as a whole
The XLink element for simple links is any element with an attribute in the XLink namespace
called type with a value of "simple". The simple equivalent of the above extended link would be
as follows:
<studentlink xlink:href="...">Pat Jones</studentlink>
The simple‐type element may have any content. The simple‐type element itself, together with
all of its content, is the local resource of the link, as if the element were a resource‐type
element. If a simple‐type element contains nested XLink elements, such contained elements
have no XLink‐specified relationship to the parent link. It is possible for a simple‐type element
to have no content; in cases where the link is expected to be traversed on request, interactive
XLink applications will typically generate some content in order to give the user a way to initiate
the traversal.
The simple‐type element effectively takes the locator attribute href and the semantic attributes
role and title from the locator‐type element, and the behavior attributes show and actuate and
the single semantic attribute arcrole from the arc‐type element.
It is not an error for a simple‐type element to have no locator (href) attribute value. If a value is
not provided, the link is simply untraversable. Such a link may still be useful, for example, to
associate properties with the resource by means of XLink attributes.
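The pieces that a simple link combines can be pulled apart again in a few lines of Python. This is a sketch under assumptions: the function name describe_simple_link and the dictionary keys are invented, the surrounding p element is made up, and xlink:type is written explicitly because no DTD is present to default it.

```python
import xml.etree.ElementTree as ET

X = "{http://www.w3.org/1999/xlink}"
NS_DECL = 'xmlns:xlink="http://www.w3.org/1999/xlink"'


def describe_simple_link(element):
    """Split a simple-type element into its local resource, remote end, and behavior."""
    if element.get(X + "type") != "simple":
        raise ValueError("not a simple-type element")
    href = element.get(X + "href")
    return {
        "local": "".join(element.itertext()),  # the element's own content
        "remote": href,                        # locator attribute
        "traversable": href is not None,       # no href: legal but untraversable
        "show": element.get(X + "show"),       # behavior attributes (may be absent)
        "actuate": element.get(X + "actuate"),
    }


doc = ET.fromstring(
    "<p %s>..., and <studentlink xlink:type='simple' "
    "xlink:href='students/patjones62.xml'>Pat Jones</studentlink> "
    "is popular around the student union.</p>" % NS_DECL
)
info = describe_simple_link(doc.find("studentlink"))
print(info["local"], "->", info["remote"])

bare = ET.fromstring("<studentlink %s xlink:type='simple'/>" % NS_DECL)
print(describe_simple_link(bare)["traversable"])
```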
Example: Sample simple‐Type Element Declarations and Instance
Following is a non‐normative set of declarations for a simple‐type element.
<!ELEMENT studentlink ANY>
<!ATTLIST studentlink
xlink:type (simple) #FIXED "simple"
xlink:href CDATA #IMPLIED
xlink:role CDATA #FIXED "http://www.example.com/linkprops/student"
xlink:arcrole CDATA #IMPLIED
xlink:title CDATA #IMPLIED
xlink:show (new
|replace
|embed
|other
|none) #IMPLIED
xlink:actuate (onLoad
|onRequest
|other
|none) #IMPLIED>
Following is how an XML document might use these declarations.
..., and <studentlink xlink:href="students/patjones62.xml">Pat
Jones</studentlink> is popular around the student union.
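As an illustration of how an application might locate simple links, the following Python sketch uses the standard xml.etree.ElementTree module. The wrapping para element is hypothetical, and xlink:type is written out explicitly in the instance because ElementTree does not read DTDs and so would not supply the #FIXED default declared above.

```python
import xml.etree.ElementTree as ET

XLINK_NS = "http://www.w3.org/1999/xlink"

# Hypothetical wrapper document around the instance shown in the text.
DOC = """<para xmlns:xlink="http://www.w3.org/1999/xlink">..., and
<studentlink xlink:type="simple" xlink:href="students/patjones62.xml">Pat
Jones</studentlink> is popular around the student union.</para>"""


def simple_links(root):
    """Return (href, text) for every element whose xlink:type is 'simple'."""
    links = []
    for elem in root.iter():
        if elem.get(f"{{{XLINK_NS}}}type") == "simple":
            # A missing href yields None: the link is simply untraversable.
            href = elem.get(f"{{{XLINK_NS}}}href")
            # The element and all of its content form the local resource;
            # collapse whitespace so the text prints on one line.
            text = " ".join("".join(elem.itertext()).split())
            links.append((href, text))
    return links


links = simple_links(ET.fromstring(DOC))
print(links)
```

The function returns one pair per simple link, so a caller can decide how to present each link or skip the ones with no href.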
5.3 XLink Element Type Attribute (type)
The attribute that identifies XLink element types is type.
Constraint: type Value
The value of the type attribute must be supplied. The value must be one of "simple",
"extended", "locator", "arc", "resource", "title", or "none".
When the value of the type attribute is "none", the element has no XLink‐specified meaning,
and any XLink‐related content or attributes have no XLink‐specified relationship to the element.
Example: Sample type Attribute Declarations
Following is a non‐normative attribute‐list declaration for type on an element intended to be
simple‐type.
<!ATTLIST xlink:simple
xlink:type (simple) #FIXED "simple"
...>
For an element that serves as an XLink element only on some occasions, one declaration might
be as follows, where the document creator sets the value to "simple" in some circumstances
and "none" in others. The use of "none" might be useful in helping XLink applications to avoid
checking for the presence of an href value.
<!ATTLIST commandname
xlink:type (simple|none) #REQUIRED
xlink:href CDATA #IMPLIED>
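The type constraint can be checked in application code as well. The sketch below is one possible way to do it in Python; the function names and the sample href value are assumptions made for illustration, not part of XLink.

```python
import xml.etree.ElementTree as ET

XLINK_NS = "http://www.w3.org/1999/xlink"
# The seven values the constraint permits.
XLINK_TYPES = {"simple", "extended", "locator", "arc", "resource", "title", "none"}


def xlink_type(elem):
    """Return the element's xlink:type, or None when the attribute is absent.

    Raises ValueError for a value outside the seven permitted ones.
    """
    value = elem.get(f"{{{XLINK_NS}}}type")
    if value is not None and value not in XLINK_TYPES:
        raise ValueError(f"invalid xlink:type value: {value!r}")
    return value


def is_traversable_simple_link(elem):
    """True only for simple-type elements that carry an href."""
    return (xlink_type(elem) == "simple"
            and elem.get(f"{{{XLINK_NS}}}href") is not None)


# Hypothetical instance: one commandname acts as a link, the other does not.
doc = ET.fromstring(
    '<doc xmlns:xlink="http://www.w3.org/1999/xlink">'
    '<commandname xlink:type="simple" xlink:href="cmds/ls.xml">ls</commandname>'
    '<commandname xlink:type="none">cd</commandname>'
    '</doc>'
)
results = [(c.text, is_traversable_simple_link(c)) for c in doc]
print(results)
```

An element typed "none" is rejected before any href lookup, which is the shortcut the paragraph above describes.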
1 Introduction
XPath is the result of an effort to provide a common syntax and semantics for functionality
shared between XSL Transformations [XSLT] and XPointer [XPointer]. The primary purpose of
XPath is to address parts of an XML [XML] document. In support of this primary purpose, it also
provides basic facilities for manipulation of strings, numbers and booleans. XPath uses a
compact, non‐XML syntax to facilitate use of XPath within URIs and XML attribute values. XPath
operates on the abstract, logical structure of an XML document, rather than its surface syntax.
XPath gets its name from its use of a path notation as in URLs for navigating through the
hierarchical structure of an XML document.
In addition to its use for addressing, XPath is also designed so that it has a natural subset that
can be used for matching (testing whether or not a node matches a pattern); this use of XPath
is described in XSLT.
XPath models an XML document as a tree of nodes. There are different types of nodes,
including element nodes, attribute nodes and text nodes. XPath defines a way to compute a
string‐value for each type of node. Some types of nodes also have names. XPath fully supports
XML Namespaces [XML Names]. Thus, the name of a node is modeled as a pair consisting of a
local part and a possibly null namespace URI; this is called an expanded‐name. The data model
is described in detail in [5 Data Model].
The primary syntactic construct in XPath is the expression. An expression matches the
production Expr. An expression is evaluated to yield an object, which has one of the following
four basic types:
• node‐set (an unordered collection of nodes without duplicates)
• boolean (true or false)
• number (a floating‐point number)
• string (a sequence of UCS characters)
Expression evaluation occurs with respect to a context. XSLT and XPointer specify how the
context is determined for XPath expressions used in XSLT and XPointer respectively. The
context consists of:
• a node (the context node)
• a pair of non‐zero positive integers (the context position and the context size)
• a set of variable bindings
• a function library
• the set of namespace declarations in scope for the expression
The context position is always less than or equal to the context size.
The variable bindings consist of a mapping from variable names to variable values. The value of
a variable is an object, which can be of any of the types that are possible for the value of an
expression, and may also be of additional types not specified here.
The function library consists of a mapping from function names to functions. Each function
takes zero or more arguments and returns a single result. This document defines a core
function library that all XPath implementations must support (see [4 Core Function Library]).
For a function in the core function library, arguments and result are of the four basic types.
Both XSLT and XPointer extend XPath by defining additional functions; some of these functions
operate on the four basic types; others operate on additional data types defined by XSLT and
XPointer.
The namespace declarations consist of a mapping from prefixes to namespace URIs.
The variable bindings, function library and namespace declarations used to evaluate a
subexpression are always the same as those used to evaluate the containing expression. The
context node, context position, and context size used to evaluate a subexpression are
sometimes different from those used to evaluate the containing expression. Several kinds of
expressions change the context node; only predicates change the context position and context
size (see [2.4 Predicates]). When the evaluation of a kind of expression is described, it will
always be explicitly stated if the context node, context position, and context size change for the
evaluation of subexpressions; if nothing is said about the context node, context position, and
context size, they remain unchanged for the evaluation of subexpressions of that kind of
expression.
XPath expressions often occur in XML attributes. The grammar specified in this section applies
to the attribute value after XML 1.0 normalization. So, for example, if the grammar uses the
character <, this must not appear in the XML source as < but must be quoted according to XML
1.0 rules by, for example, entering it as &lt;. Within expressions, literal strings are delimited by
single or double quotation marks, which are also used to delimit XML attributes. To avoid a
quotation mark in an expression being interpreted by the XML processor as terminating the
attribute value the quotation mark can be entered as a character reference (&quot; or &apos;).
Alternatively, the expression can use single quotation marks if the XML attribute is delimited
with double quotation marks or vice‐versa.
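The effect of this quoting can be observed with any XML parser. In the Python sketch below, the element name if and the attribute name test are hypothetical; only the escaping matters. The parser hands the application the normalized attribute value, which is the string the XPath grammar applies to.

```python
import xml.etree.ElementTree as ET

# The expression contains < and double quotes, so both are escaped in the source.
SOURCE = '<if test="price &lt; 10 and name = &quot;pen&quot;"/>'
expr = ET.fromstring(SOURCE).get("test")
print(expr)

# Alternative from the text: single quotes inside a double-quoted attribute.
SOURCE_ALT = """<if test="name = 'pen'"/>"""
expr_alt = ET.fromstring(SOURCE_ALT).get("test")
print(expr_alt)
```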
One important kind of expression is a location path. A location path selects a set of nodes
relative to the context node. The result of evaluating an expression that is a location path is the
node‐set containing the nodes selected by the location path. Location paths can recursively
contain expressions that are used to filter sets of nodes. A location path matches the
production LocationPath.
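A few location paths can be tried out with Python's standard xml.etree.ElementTree module. This is only a sketch: ElementTree implements a limited subset of XPath location paths (it returns elements only, not booleans, numbers or strings), and the student data is hypothetical. The predicates [1] and [last()] depend on the context position and context size described earlier.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document.
DOC = """<students>
  <student id="s1"><name>Pat Jones</name></student>
  <student id="s2"><name>Jani</name></student>
  <student id="s3"><name>Tove</name></student>
</students>"""

root = ET.fromstring(DOC)  # root is the context node for the paths below

# Child steps: every name element under every student child.
all_names = [n.text for n in root.findall("./student/name")]

# Positional predicates filter by context position and context size.
first_id = root.find("./student[1]").get("id")
last_id = root.find("./student[last()]").get("id")

# An attribute predicate filters the node-set by attribute value.
name_of_s2 = root.find("./student[@id='s2']/name").text

print(all_names, first_id, last_id, name_of_s2)
```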
In the following grammar, the non‐terminals QName and NCName are defined in [XML Names],
and S is defined in [XML]. The grammar uses the same EBNF notation as [XML] (except that
grammar symbols always have initial capital letters).
Expressions are parsed by first dividing the character string to be parsed into tokens and then
parsing the resulting sequence of tokens. Whitespace can be freely used between tokens. The
tokenization process is described in [3.7 Lexical Structure].
1 Introduction
As increasing amounts of information are stored, exchanged, and presented using XML, the
ability to intelligently query XML data sources becomes increasingly important. One of the great
strengths of XML is its flexibility in representing many different kinds of information from
diverse sources. To exploit this flexibility, an XML query language must provide features for
retrieving and interpreting information from these diverse sources.
XQuery is designed to meet the requirements identified by the W3C XML Query Working Group
[XML Query 1.0 Requirements] and the use cases in [XML Query Use Cases]. It is designed to be
a language in which queries are concise and easily understood. It is also flexible enough to
query a broad spectrum of XML information sources, including both databases and documents.
The Query Working Group has identified a requirement for both a non‐XML query syntax and
an XML‐based query syntax. XQuery is designed to meet the first of these requirements.
XQuery is derived from an XML query language called Quilt [Quilt], which in turn borrowed
features from several other languages, including XPath 1.0 [XPath 1.0], XQL [XQL], XML‐QL
[XML‐QL], SQL [SQL], and OQL [ODMG].
[Definition: XQuery operates on the abstract, logical structure of an XML document, rather than
its surface syntax. This logical structure, known as the data model, is defined in [XQuery/XPath
Data Model (XDM)].]
XQuery Version 1.0 is an extension of XPath Version 2.0. Any expression that is syntactically
valid and executes successfully in both XPath 2.0 and XQuery 1.0 will return the same result in
both languages. Since these languages are so closely related, their grammars and language
descriptions are generated from a common source to ensure consistency, and the editors of
these specifications work together closely.
XQuery also depends on and is closely related to the following specifications:
• [XQuery/XPath Data Model (XDM)] defines the data model that underlies all XQuery
expressions.
• [XQuery 1.0 and XPath 2.0 Formal Semantics] defines the static semantics of XQuery and
also contains a formal but non‐normative description of the dynamic semantics that
may be useful for implementors and others who require a formal definition.
• The type system of XQuery is based on [XML Schema].
• The built‐in function library and the operators supported by XQuery are defined in
[XQuery 1.0 and XPath 2.0 Functions and Operators].
• One requirement in [XML Query 1.0 Requirements] is that an XML query language have
both a human‐readable syntax and an XML‐based syntax. The XML‐based syntax for
XQuery is described in [XQueryX 1.0].
This document specifies a grammar for XQuery, using the same basic EBNF notation used in
[XML 1.0]. Unless otherwise noted (see A.2 Lexical structure), whitespace is not significant in
queries. Grammar productions are introduced together with the features that they describe,
and a complete grammar is also presented in the appendix [A XQuery Grammar]. The appendix
is the normative version.
In the grammar productions in this document, named symbols are underlined and literal text is
enclosed in double quotes. For example, the following production describes the syntax of a
function call:
[93] FunctionCall ::= QName "(" (ExprSingle ("," ExprSingle)*)? ")"
The production should be read as follows: A function call consists of a QName followed by an
open‐parenthesis. The open‐parenthesis is followed by an optional argument list. The argument
list (if present) consists of one or more expressions, separated by commas. The optional
argument list is followed by a close‐parenthesis.
Certain aspects of language processing are described in this specification as implementation‐
defined or implementation‐dependent.
• [Definition: Implementation‐defined indicates an aspect that may differ between
implementations, but must be specified by the implementor for each particular
implementation.]
• [Definition: Implementation‐dependent indicates an aspect that may differ between
implementations, is not specified by this or any W3C specification, and is not required to
be specified by the implementor for any particular implementation.]
This document normatively defines the dynamic semantics of XQuery. The static semantics of
XQuery are normatively defined in [XQuery 1.0 and XPath 2.0 Formal Semantics]. In this
document, examples and material labeled as "Note" are provided for explanatory purposes and
are not normative.
2 Basics
The basic building block of XQuery is the expression, which is a string of [Unicode] characters
(the version of Unicode to be used is implementation‐defined.) The language provides several
kinds of expressions which may be constructed from keywords, symbols, and operands. In
general, the operands of an expression are other expressions. XQuery allows expressions to be
nested with full generality. (However, unlike a pure functional language, it does not allow
variable substitution if the variable declaration contains construction of new nodes.)
Note:
This specification contains no assumptions or requirements regarding the character set
encoding of strings of [Unicode] characters.
Like XML, XQuery is a case‐sensitive language. Keywords in XQuery use lower‐case characters
and are not reserved—that is, names in XQuery expressions are allowed to be the same as
language keywords, except for certain unprefixed function‐names listed in A.3 Reserved
Function Names.
[Definition: In the data model, a value is always a sequence.] [Definition: A sequence is an
ordered collection of zero or more items.] [Definition: An item is either an atomic value or a
node.] [Definition: An atomic value is a value in the value space of an atomic type, as defined in
[XML Schema].] [Definition: A node is an instance of one of the node kinds defined in
[XQuery/XPath Data Model (XDM)].] Each node has a unique node identity, a typed value, and
a string value. In addition, some nodes have a name. The typed value of a node is a sequence
of zero or more atomic values. The string value of a node is a value of type xs:string. The name
of a node is a value of type xs:QName.
[Definition: A sequence containing exactly one item is called a singleton.] An item is identical to
a singleton sequence containing that item. Sequences are never nested—for example,
combining the values 1, (2, 3), and ( ) into a single sequence results in the sequence (1, 2, 3).
[Definition: A sequence containing zero items is called an empty sequence.]
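The flattening rule can be mimicked in a few lines of Python. This is a rough model only: it represents an XDM sequence as a Python tuple and an item as any non-tuple value, both of which are assumptions made for the sketch.

```python
def seq(*items):
    """Combine items into one flat sequence, modelled here as a tuple."""
    flat = []
    for item in items:
        if isinstance(item, tuple):
            # A nested sequence is spliced in, never kept as a single member.
            flat.extend(seq(*item))
        else:
            flat.append(item)
    return tuple(flat)


def is_singleton(sequence):
    """A singleton is a sequence containing exactly one item."""
    return len(sequence) == 1


combined = seq(1, (2, 3), ())
print(combined)
print(is_singleton(seq(5)), seq() == ())
```

Unlike XQuery, Python still distinguishes the item 5 from the tuple (5,), so the identity between an item and its singleton sequence is not captured by this model.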
[Definition: The term XDM instance is used, synonymously with the term value, to denote an
unconstrained sequence of nodes and/or atomic values in the data model.]
Names in XQuery are called QNames, and conform to the syntax in [XML Names]. [Definition:
Lexically, a QName consists of an optional namespace prefix and a local name. If the
namespace prefix is present, it is separated from the local name by a colon.] A lexical QName
can be converted into an expanded QName by resolving its namespace prefix to a namespace
URI, using the statically known namespaces [err:XPST0081]. [Definition: An expanded QName
consists of an optional namespace URI and a local name. An expanded QName also retains its
original namespace prefix (if any), to facilitate casting the expanded QName into a string.] The
namespace URI value is whitespace normalized according to the rules for the xs:anyURI type in
[XML Schema]. Two expanded QNames are equal if their namespace URIs are equal and their
local names are equal (even if their namespace prefixes are not equal). Namespace URIs and
local names are compared on a codepoint basis, without further normalization.
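The equality rule can be illustrated with Python's xml.etree.ElementTree, which stores an element name as the string {namespace-URI}local-name. The namespace URIs below are hypothetical. Note one difference from XQuery: ElementTree discards the original prefix, whereas an expanded QName retains it.

```python
import xml.etree.ElementTree as ET

# Same namespace URI and local name, different prefixes.
a = ET.fromstring('<p:item xmlns:p="http://example.com/ns"/>')
b = ET.fromstring('<q:item xmlns:q="http://example.com/ns"/>')
# Same prefix and local name, different namespace URI.
c = ET.fromstring('<p:item xmlns:p="http://example.com/other"/>')
# URI differing only in letter case: compared codepoint by codepoint, so not equal.
d = ET.fromstring('<p:item xmlns:p="http://EXAMPLE.com/ns"/>')

print(a.tag)
print(a.tag == b.tag, a.tag == c.tag, a.tag == d.tag)
```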
Certain namespace prefixes are predeclared by XQuery and bound to fixed namespace URIs.
These namespace prefixes are as follows:
• xml = http://www.w3.org/XML/1998/namespace
• xs = http://www.w3.org/2001/XMLSchema
• xsi = http://www.w3.org/2001/XMLSchema‐instance
• fn = http://www.w3.org/2005/xpath‐functions
• local = http://www.w3.org/2005/xquery‐local‐functions (see 4.15 Function
Declaration.)
In addition to the prefixes in the above list, this document uses the prefix err to represent the
namespace URI http://www.w3.org/2005/xqt‐errors (see 2.3.2 Identifying and Reporting
Errors). This namespace prefix is not predeclared and its use in this document is not normative.
Element nodes have a property called in‐scope namespaces. [Definition: The in‐scope
namespaces property of an element node is a set of namespace bindings, each of which
associates a namespace prefix with a URI, thus defining the set of namespace prefixes that are
available for interpreting QNames within the scope of the element. For a given element, one
namespace binding may have an empty prefix; the URI of this namespace binding is the default
namespace within the scope of the element.]
Note:
In [XPath 1.0], the in‐scope namespaces of an element node are represented by a collection of
namespace nodes arranged on a namespace axis, which is optional and deprecated in [XPath
2.0]. XQuery does not support the namespace axis and does not represent namespace bindings
in the form of nodes. However, where other specifications such as [XSLT 2.0 and XQuery 1.0
Serialization] refer to namespace nodes, these nodes may be synthesized from the in‐scope
namespaces of an element node by interpreting each namespace binding as a namespace node.
[Definition: Within this specification, the term URI refers to a Universal Resource Identifier as
defined in [RFC3986] and extended in [RFC3987] with the new name IRI.] The term URI has
been retained in preference to IRI to avoid introducing new names for concepts such as "Base
URI" that are defined or referenced across the whole family of XML specifications.
UNIT II
• Business Motivations For Web Services – B2B – B2C
• Technical Motivations
• Limitations Of CORBA And DCOM
• Service‐Oriented Architecture (SOA)
• Architecting Web Services
• Implementation View
• Web Services Technology Stack
• Logical View
• Composition Of Web Services
• Deployment View – From Application Server To Peer To Peer
• Process View
• Life In The Runtime
Introduction
A short history of Web services
The Internet began its success story in the early nineties, even though it had been used in the
academic world for many years before that. The main driver of the Internet’s success was the
World Wide Web, whose main innovation was easy access to information, from any place,
using standard Internet protocols and a simple data access protocol that enabled the
implementation of browsers on a variety of platforms. Together with the spread of the WWW,
the Internet and its related technologies became the de facto standard for connecting
computers all around the world.
With the spread of the Internet, it became clear that the infrastructure it introduced could be
used for more than retrieving information to be presented in a browser (called
human‐to‐application, H2A, scenarios).
There was also an increasing demand for application‐to‐application (A2A)
communication using the existing technologies, and it was hoped that the existing protocols
could be used for this purpose.
However, it soon became clear that this was not the case. HTTP had been designed with the
retrieval of information in mind, following a very simple access path that basically relies on
documents being linked together by means of hypertexts. The protocol does not provide for
complex operations that arise from A2A scenarios. And some of the protocols that were
defined at this time could not be used either, because they did not fit into the Web world or
because they were too restrictive.
In late 1999, Microsoft® published an XML‐based protocol called SOAP that could be used for
A2A scenarios. As it was one among many protocols suggested, it may have been the fact that IBM
started supporting SOAP in early 2000 that eventually led to public acceptance of SOAP by
the industry.
At this point in time, SOAP was just a protocol to perform complex A2A scenarios. However, it
quickly gained popularity and it was clear that there was a need for better describing and
finding the services that were implemented using SOAP. The term Web services was coined
several months later, when IBM, Microsoft, and Ariba jointly published the Web Services
Description Language (WSDL). Eventually, UDDI was also introduced, thus completing the set of
standards and protocols that make up the basis of Web services.
As Figure 5.1 shows, Web services builds on SOAP's capability for distributed, decentralized
network communication by adding new protocols and conventions that expose business
functions to interested parties over the Internet from any Web‐connected device. As we
discussed in Chapter 1, we're moving into a new computing paradigm based on the assembly of
constituent parts. SOAP, for example, is not a stand‐alone technology, but the result of
synergies between XML and HTTP. This phenomenon of emergence has not been lost on the
major industry players, who are actively working to update their existing infrastructures to keep
pace with the changes wrought by SOAP‐based messaging for the global Web.
Web services is a technology and process for discovery and connection.
Web services represents an industry‐wide response to the need for a flexible and efficient
business collaboration environment. Technically, it is a way to link loosely coupled systems
using technology that doesn't bind them to a particular programming language, component
model, or platform. Practically, it represents a discrete business process with supporting
protocols that functions by describing and exposing itself to users of the Web, being invoked by
a remote user, and returning a response. It includes:
• Describing: A Web service describes its functionality and attributes so that other
applications can figure out how to use it.
• Exposing: A Web service registers with a repository that contains white pages holding
basic service‐provider information, yellow pages listing services by category, and
green pages describing how to connect to and use the services.
• Being invoked: When a Web service has been located, a remote application can invoke
the service.
• Returning a response: When a service has been invoked, results are returned to the
requesting application.
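The four steps above can be sketched with a toy in-memory registry in Python. Everything here is hypothetical: the class names, the "finance" category and the fixed placeholder price are invented for illustration, and a real deployment would use WSDL, UDDI and SOAP rather than Python objects.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ServiceEntry:
    name: str                        # white-pages style identification
    category: str                    # yellow-pages style classification
    description: str                 # describing: what the service does
    handler: Callable[[dict], dict]  # green-pages style "how to call it"


class Registry:
    def __init__(self) -> None:
        self._entries: Dict[str, ServiceEntry] = {}

    def register(self, entry: ServiceEntry) -> None:
        """Exposing: publish the entry so that clients can discover it."""
        self._entries[entry.name] = entry

    def find_by_category(self, category: str) -> List[ServiceEntry]:
        """Discovery by category."""
        return [e for e in self._entries.values() if e.category == category]


def quote_handler(request: dict) -> dict:
    # Placeholder logic: echoes the symbol with a made-up constant price.
    return {"symbol": request["symbol"], "price": 42.0}


registry = Registry()
registry.register(ServiceEntry("StockQuote", "finance",
                               "Returns a price for a ticker symbol",
                               quote_handler))

found = registry.find_by_category("finance")[0]    # locating the service
response = found.handler({"symbol": "XYZ"})        # invoking it, getting a response
print(found.description, response)
```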
The driving force behind Web services is the desire to allow businesses to use the Internet to
publish, discover, and aggregate other Web services using the global underpinning of SOAP. The
fact that the delivery of Web services requires only the Internet means that legacy code and
data as well as object systems can plug into the Web services framework. This capability is
expected to result in new products, business processes, and value chains with global scope,
deliverable over wired or wireless networks. How these will emerge is anyone's guess. But the
track record of the Web, XML, and now SOAP indicates that new technologies will rapidly
emerge.
Business Motivation for Web Services
Web services are the design center that can enable your company to rapidly take advantage of
XML. Web services are based on XML and are the application model adopted by the giants of
the IT industry. Companies such as Microsoft, IBM, Sun, HP, BEA and Software AG are building
the technologies that will make the vision of Web services a reality. In this chapter, we will
explore what Web services are and how you can take advantage of them by building a simple
Web services architecture for your company. This architecture is entirely based on standards,
but at the same time allows you to reuse existing infrastructure.
By now you should have heard something about Web services. The topic is hard to avoid. The
media machine is busy cranking out the hype that has become so typical of our industry.
"Revolutionary." "A breakthrough." "Microsoft, SAP, HP, Sun and IBM joining ranks."
"Transforming software into services." The buzzwords keep flying. There is not an IT publication
today that does not spout forth a never‐ending flood of abbreviations like WSDL, UDDI, SOAP,
RDF, XSLT, etc. And, as with so many other technologies, the average reader is left with the
question: "What is in it for me?" This chapter is devoted to demystifying the topic of Web
services and to explaining what Web services are and how to apply them to increase revenue
for your corporation and to drive down the IT cost associated with building new products,
services and revenue channels. If the information in this chapter whets your appetite, there is
plenty of additional information available on the Internet. Just type "Web services" into your
favorite search engine and prepare to be blown away by the wealth of information available on
this topic. For now, the most important fact to remember is that Web services form the design
center around XML, the most important IT standard for the next two decades.
Let’s begin by looking at the term Web services. The use of the term "Web" implies that Web
services are based on the World Wide Web. While this is true, it is actually quite misleading.
Although the technologies for Web services are based on the standards that have evolved in
the World Wide Web Consortium (W3C), Web services are really not limited to the Web itself.
(Remember that the Web is really the graphical user interface that made the Internet easy to
use.) Web services are a new approach for building, extending, integrating and deploying
applications based on XML.
This new approach builds applications that can facilitate the communication process between
humans and machines (the Web), as well as the communication process between machines
(the Internet) and the communication process between applications (application integration).
By simplifying communications between humans, machines and applications based on a
common standard (XML), Web services are an important building block on the path to total
business integration and the unbounded enterprise. (See chapter 1.) Web services allow your IT
department to do a number of extremely significant things, including but not limited to:
• reuse existing applications.
• tie existing applications into a single view.
• make these applications available to employees, partners and customers.
• build application extensions that model your business process.
• flow information across departments, business units or corporations.
If this sounds interesting to you, you should also know that these things can be accomplished
very quickly and at a very low cost when compared to traditional approaches. The main reason
why Web services are finding such rapid approval in the IT industry is their ability to reduce IT
costs while delivering the capability of building new revenue models based on your core
business. In addition, Web services can be the model for new application types that need to be
deployed very rapidly. You might wonder: "Why is this different? Have we not seen these
claims before?" The answer is yes, we have seen the claims, but there has been a crucial
element missing in the equation. That element is called XML – the Extensible Markup Language.
XML has been accepted in the industry as the de facto standard for storing, exchanging and
publishing electronic documents. (More information about XML can be obtained in the book
"The XML Shockwave" by the same author.) So what does XML have to do with Web services?
Everything. Period.
Web services (and their benefits of cost reduction and speed of implementation) are based on
the assumption that every single element and every single buzzword in the family of Web
services technologies must work with XML and is, as a matter of fact, driven by the concept
of XML documents. To some extent, Web services are the "killer application" for XML, much in
the same way the Web browser was the killer application for the Internet. The Web browser
made the Internet easy to use and spawned an explosion of new applications that allow
average users to access information. There are a lot of benefits to this approach, and the fact
that more than 400 million people are using the Web proves that there is ample demand for this
kind of approach. There is a downside, however. The Web is all about graphical user interfaces
to applications. What the Web fails to accomplish is the ability for applications and machines to
communicate directly and automatically based on common standards. This is where a huge
potential for internal cost savings still remains untapped. According to research done at the
Massachusetts Institute of Technology ("The Unfinished Revolution," by Michael Dertouzos),
about 50 percent of the world economy is based on office work. This is a huge number. What
this means is that in order to gain more productivity, we need to build more automation into
our business processes. We cannot continue to assume that a human being sitting in front of a
Web browser is the answer to the productivity challenge every company faces. We need
simple, automatic, machine‐to‐machine and application‐to‐application communication. But
how? How can this be done without the huge cost that used to be associated with integration
software such as CORBA and EDI?
Enter XML. XML is the standard accepted worldwide that helps corporations and IT companies
define the meaning of data. Once you have defined the meaning of data, you can build
self‐describing XML documents that can be stored, exchanged and published with a
minimum of effort. XML is the technology that enables standards‐based machine‐to‐machine
communications and application integration, process integration and business automation. XML
is well accepted. According to Giga Corporation, about 95 percent of companies surveyed have
already started using XML, and about 30 percent of companies have put their first mission‐
critical projects in place based on XML. But XML is a basic technology that needs an approach, a
model and a design center to succeed in the long term. Web services are the design center for
XML based applications.
XML is taking the world of IT by storm, and the adoption of the Web services model is becoming
the killer application for XML. To understand this a bit more, we need to look at the second part
of the term Web services, i.e. services. Like the term Web in Web services, the word services is
correct in describing what Web services seek to accomplish, but it is also misleading.
A Web Service really is a self‐contained application that is fully XML enabled. A Web Service can
do the following things automatically, without requiring human intervention:
• Describe a business function that is available in your corporation, for example a request for
credit.
• Publish that business function to other applications or end users based on standard Internet
technology such as Web servers, application servers, integration servers or XML servers.
• Receive XML documents as valid input to this business function.
• Store those XML documents to preserve the integrity of the request and to enable auditing
and tracking.
• Evaluate and process that input.
• Route that input into a processing application that could be another Web service or a
traditional application.
• Produce a result, for example the approval of credit.
• Store that result to preserve the integrity of the output and to enable auditing and tracking.
• Route the result to the consumer of the business function, which could be a user, a device
such as a mobile phone, a PC, a mainframe or any other entity that can be reached over the
Internet.
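The steps listed above can be illustrated with a small Python sketch of the credit-request example. The element names (creditRequest, customer, amount, creditResponse, approved), the in-memory audit log and the approval threshold are all hypothetical; a real service would apply its own business rules and use persistent storage.

```python
import xml.etree.ElementTree as ET

audit_log = []  # stands in for persistent storage used for auditing and tracking


def handle_credit_request(request_xml: str) -> str:
    """Receive an XML request, store it, evaluate it, and return an XML result."""
    audit_log.append(("request", request_xml))        # store the input
    request = ET.fromstring(request_xml)              # receive and parse the input
    amount = float(request.findtext("amount"))        # evaluate the input
    approved = amount <= 5000                         # hypothetical approval rule

    result = ET.Element("creditResponse")             # produce a result
    ET.SubElement(result, "customer").text = request.findtext("customer")
    ET.SubElement(result, "approved").text = "true" if approved else "false"
    result_xml = ET.tostring(result, encoding="unicode")

    audit_log.append(("response", result_xml))        # store the result
    return result_xml                                 # route it back to the consumer


response = handle_credit_request(
    "<creditRequest><customer>C42</customer><amount>1200</amount></creditRequest>"
)
print(response)
```

Routing to another Web service or to a traditional application would replace the single approval line; the surrounding receive, store and respond steps would stay the same.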
Web Services
Introduction
This chapter covers the following:
1. What is a Web Service?
2. Why do we need a Web Service?
3. ASP.NET Web Services
4. Use Data in Web Services
What is a Web Service?
The Internet is quickly evolving from today's Web sites that just deliver user interface
pages to browsers to a next generation of programmable Web sites that directly link
organizations, applications, services, and devices with one another. These programmable
Web sites become more than passively accessed sites ‐ they become reusable, intelligent
Web Services. They allow different applications to share business logic over the network.
The technical definition of a Web Service is programmable application logic
accessible via Standard web protocols.
Programmable Application Logic: A Web Service is not tied to any particular application. The
application logic can be implemented by components, by Perl scripts, or by any other
mechanism that supports XML.
Standard Web Protocols: Web Services exchange messages, typically SOAP messages, over
standard Internet transport protocols such as HTTP and SMTP.
Why do we need a Web Service?
The server and client need to understand the following:
• Implementation details of a particular service,
• Service deployment,
• Security types and trusts, etc.
The common language runtime provides built‐in support for creating and exposing Web
Services, using a programming abstraction that is consistent and familiar to both ASP.NET
Web Forms developers and existing Visual Basic users. The resulting model is scalable and
extensible, and supports open Internet standards (HTTP, XML, SOAP, WSDL). Therefore it
can be accessed and consumed from any client or Internet‐enabled device.
One important feature of the Web services based computing model is that a client need not
know the language in which XML Web services are implemented. The client just needs to know
the location of an XML Web service and the methods that the client can call on the service.
Web services use XML‐based messaging to send and receive data, which enables
heterogeneous applications to interoperate with each other. You can use Web services to
integrate applications that are written in different programming languages and deployed on
different platforms. You can deploy Web services within an intranet as well as on the Internet.
While the Internet brings users closer to organizations, Web services allow organizations to
integrate their applications.
Web Services Infrastructure
The Web services infrastructure provides several components that enable client applications to
locate and consume Web services. These components include the following:
XML Web services directories (UDDI)
These directories provide a central place to store published information about Web
services. The Universal Description, Discovery, and Integration (UDDI) specifications
define the guidelines for publishing information about Web services. The XML schemas
associated with UDDI define four types of information that you must publish to make
your Web service accessible. This information includes business information, service
information, binding information, and service specifications. Microsoft provides one
such directory service, which is located at http://uddi.microsoft.com.
Web services discovery (DISCO)
Using this process, clients locate the documents that describe a Web service using
WSDL. The discovery process enables clients to know about the presence of a Web
service and about the location of a particular XML Web service.
Web services description (WSDL)
This component provides information that enables you to know which operations to
perform on a Web service. The Web service description is an XML document that
specifies the format of messages that a Web service can understand.
Web service wire formats.
To enable communication between disparate systems, Web services use open wire
formats. Open wire formats are the protocols that can be understood by any system
that is capable of supporting common Web standards, such as HTTP and SOAP. The
HTTP‐GET and HTTP‐POST protocols are the standard Web protocols that allow you to
send parameters as name‐value pairs. The HTTP‐GET protocol allows you to send URL‐
encoded parameters as name‐value pairs to an XML Web service. The HTTP‐GET
protocol requires you to append the parameter name‐value pairs to the URL of the Web
service. You can also use the HTTP‐POST protocol to URL‐encode and pass parameters
to the Web service as name‐value pairs. However, the parameters are passed inside the
actual request message and not appended to the URL of the Web service.
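The difference between the two HTTP bindings can be sketched in a few lines of Python. This is only an illustration: the service URL and the parameter names (`symbol`, `currency`) are hypothetical and do not come from these notes. The requests are built but never sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint and parameters, used only for illustration.
SERVICE_URL = "http://example.com/MyWebService.asmx/GetQuote"
params = {"symbol": "MSFT", "currency": "US dollar"}

# URL-encode the name-value pairs once; both bindings use the same encoding.
encoded = urlencode(params)

# HTTP-GET: the encoded pairs are appended to the URL of the Web service.
get_request = Request(SERVICE_URL + "?" + encoded, method="GET")

# HTTP-POST: the same pairs travel inside the request message body.
post_request = Request(
    SERVICE_URL,
    data=encoded.encode("ascii"),
    headers={"Content-Type": "application/x-www-form-urlencoded"},
    method="POST",
)

print(get_request.full_url)
print(post_request.full_url, post_request.data)
```

In both cases the parameters are encoded identically; only their position in the HTTP message changes.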
The SOAP protocol allows you to exchange structured and typed information between the
applications on the Internet. The SOAP protocol consists of four parts. The first part is
mandatory and defines the envelope that contains the message. The SOAP envelope is the basic
unit of exchange between the processors of SOAP messages. The second part defines the
optional data encoding rules that you use to encode application‐specific data types. The third
part defines the request/response pattern of message exchanges between Web services. The
fourth part, which is optional, defines the bindings between the SOAP and HTTP protocols.
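As a minimal sketch of the mandatory first part, the following Python builds a SOAP 1.1 envelope around a method call. The envelope namespace is the standard SOAP 1.1 one. The service namespace, the method name and the argument are assumptions chosen for illustration.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "http://example.com/temperature"            # hypothetical service namespace


def build_envelope(method, arguments):
    """Wrap a method call and its arguments in an Envelope/Body pair."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{method}")
    for name, value in arguments.items():
        ET.SubElement(call, f"{{{SERVICE_NS}}}{name}").text = str(value)
    return envelope


# Use the conventional "soap" prefix when the tree is written out.
ET.register_namespace("soap", SOAP_ENV)
envelope = build_envelope("getTemperature", {"zipcode": 98052})
print(ET.tostring(envelope, encoding="unicode"))
```

The envelope is the unit that SOAP processors exchange; the application-specific payload sits inside the Body element.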
How components of the XML Web services infrastructure enable clients to locate and call
methods on XML Web services.
When a client accesses a UDDI service to locate a Web service, the UDDI service returns a URL
to the discovery document of the Web service. A discovery document is a .disco file, which
contains the link to the resources that describe a Web service. A discovery file is an XML
document that enables programmatic discovery of a Web service. After the client receives the
URL to the discovery document, the client requests a server, which returns the discovery
document to the client. The contents of a sample discovery document are shown in the
following code.
<?xml version="1.0" ?>
<disco:discovery xmlns:disco="http://schemas.xmlsoap.org/disco"
    xmlns:wsdl="http://schemas.xmlsoap.org/disco/wsdl">
  <wsdl:contractRef ref="http://www.seed.com/MyWebService.asmx?WSDL"/>
</disco:discovery>
The client uses the information in the discovery document and requests a server to return the
service description of a Web service. The service description is a .wsdl file and enables a client
to interact with a Web service.
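A client could extract the service description URL from the sample discovery document roughly as follows. This is a sketch using Python's standard XML parser. The document text is the sample from these notes, and the helper name `contract_urls` is hypothetical.

```python
import xml.etree.ElementTree as ET

# The sample discovery document from the text, reproduced as a string.
DISCO_DOCUMENT = """<?xml version="1.0" ?>
<disco:discovery xmlns:disco="http://schemas.xmlsoap.org/disco"
                 xmlns:wsdl="http://schemas.xmlsoap.org/disco/wsdl">
  <wsdl:contractRef ref="http://www.seed.com/MyWebService.asmx?WSDL"/>
</disco:discovery>"""


def contract_urls(disco_xml):
    """Return the WSDL URLs referenced by every contractRef element."""
    namespaces = {"wsdl": "http://schemas.xmlsoap.org/disco/wsdl"}
    root = ET.fromstring(disco_xml)
    return [ref.get("ref") for ref in root.findall("wsdl:contractRef", namespaces)]


print(contract_urls(DISCO_DOCUMENT))
```

The client would then request each returned URL to obtain the .wsdl service description.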
Next the client invokes methods on an XML Web service.
The process of communication between a client and an XML Web service is similar to a remote
procedure call (RPC). The client uses a proxy object of the XML Web service on the local
computer to call methods on the Web service.
Figure shows the process of communication between a client and a Web service.
Client and Web service communication
As shown in Figure, the interaction between a client and a Web service consists of several
phases. The following tasks are performed during these phases:
1. The client creates an object of the Web service proxy class on the same computer on
which the client resides.
2. The client calls a method on the proxy object.
3. The Web services infrastructure on the client system serializes the method call and
arguments into a SOAP message and sends it to the Web service over the network.
4. The infrastructure on the server on which the Web service resides deserializes the SOAP
message and creates an instance of the Web service. The infrastructure then calls the
method with the arguments on the Web service.
5. The Web service executes the method and returns the value with any out parameters to
the infrastructure.
6. The infrastructure serializes the return value and any out parameters into a SOAP
message and sends them to the client over the network.
7. The infrastructure on the client computer deserializes the SOAP message containing the
return value and any out parameters and sends them to the proxy object.
8. The proxy object sends the return value and any out parameters to the client.
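The eight steps above can be sketched in a single Python process. This is a simplified illustration, not the ASP.NET infrastructure: the network is replaced by a direct function call, a plain XML message stands in for a full SOAP envelope, and the class names, method name and temperature values are all hypothetical.

```python
import xml.etree.ElementTree as ET


def serialize(name, values):
    """Turn a name and a dict of values into a small XML message."""
    root = ET.Element(name)
    for key, value in values.items():
        ET.SubElement(root, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


def deserialize(message):
    """Recover the name and the dict of (string) values from a message."""
    root = ET.fromstring(message)
    return root.tag, {child.tag: child.text for child in root}


class TemperatureService:
    """Hypothetical Web service that lives on the server."""

    def getTemperature(self, zipcode):
        return {"temperature": 72 if zipcode == "98052" else 60}


def server_infrastructure(request_message):
    # Step 4: deserialize the message, create the service, call the method.
    # (A real server would check the method name against an allowed list.)
    method, arguments = deserialize(request_message)
    service = TemperatureService()
    result = getattr(service, method)(**arguments)   # step 5: method executes
    return serialize(method + "Response", result)    # step 6: serialize result


class TemperatureProxy:
    """Client-side proxy object that hides the message exchange."""

    def __init__(self, transport):
        self._transport = transport  # stand-in for the network

    def getTemperature(self, zipcode):
        request = serialize("getTemperature", {"zipcode": zipcode})  # step 3
        response = self._transport(request)
        _, values = deserialize(response)                            # step 7
        return int(values["temperature"])                            # step 8


proxy = TemperatureProxy(transport=server_infrastructure)  # step 1
print(proxy.getTemperature("98052"))                       # step 2
```

From the caller's point of view, `proxy.getTemperature(...)` looks like a local method call, which is why the text compares the exchange to a remote procedure call.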
To build Web services that the clients can consume, you use the ASP.NET infrastructure, which
is an integral part of the .NET Framework. Visual Studio .NET provides tools to build, deploy,
and publish your Web services using ASP.NET.
Benefits of Web Services
Experts and visionaries believe that the benefits of XML Web services will be instrumental in
propelling explosive business growth over the next few years. One of the major benefits is Web
services' ease of integration. You will easily integrate your software with other pieces of
software. You can run on all kinds of machines, from the desktop to the mainframe, either
within your enterprise or at external sites. This ease of integration will enable tighter business
relationships and more efficient business processes.
With Web services readily available, and as the pool of XML Web services grows, you will be
able to find software modules and integrate them into your own application through
XML Web services. Integrate with existing Web services instead of
reinventing them. The bottom line is that you will be able to develop applications much faster
than before.
An integral part of the XML Web services programming model is the ease of integration with
external data sources. No longer does each application need to copy and maintain external data
sources. You can request and get information in real time, and transform it to your particular
format. This will allow you to deliver individualized software and services, while your
maintenance burden is reduced.
Consumers will enjoy ease of use when using XML Web services‐based applications. XML Web
services link applications, services, and devices together. Using Web services will be an
integrated experience that excels in its simplicity. XML Web services give users the ability to act
on information any time, any place, and from any smart device.
Businesses will love Web services because these services will force them to streamline their processes. All
suppliers will use the same language to describe their offerings. An XML Web service is a
simple, reliable way to blend existing systems with new applications and services.
Microsoft already cites several commercial applications that were written in record
time, thanks to the use of XML Web services. The first one is from Dollar Rent‐A‐Car. In two
weeks, programmers built, tested, and deployed a Web Services‐based solution that translates
reservation requests and data between the company's mainframe‐based reservation system
and an airline partner's UNIX servers. Dollar can now reuse the same integration model to link
their reservation system with other airline or hotel partners. The second case is expedia.com.
This is a site that will find you the lowest price itinerary. They are now in the process of
transforming itineraries into communication centers. These centers are Web services that users
can access to get relevant and timely information at their convenience. With this new service,
Expedia hopes to attract new customers and further strengthen the loyalty of their existing
customer base.
• B2B (Business‐to‐Business)
Companies doing business with each other such as manufacturers selling to distributors
and wholesalers selling to retailers. Pricing is based on quantity of order and is often
negotiable.
• B2C (Business‐to‐Consumer)
Businesses selling to the general public typically through catalogs utilizing shopping cart
software. By dollar volume, B2B takes the prize; however, B2C is really what the average
Joe has in mind with regards to ecommerce as a whole.
Constraints of Ecommerce
• Time for delivery of physical products
• Physical product, supplier & delivery uncertainty.
• Perishable goods
• Limited and selected sensory information
• Returning goods.
• Privacy, security, payment, identity, contract.
• Defined services & the unexpected.
• Personal service
What is B2C?
• B2C Commerce: Interactions relating to the purchase and sale of goods and services between
a business and consumer—retail transactions.
• “Novelty” is that retail transaction is done on the Internet, rather than a “brick and mortar”
store location.
• Technical evolution of B2C from “brick and mortar” model not new.
Problem 1: I want to use your Web Service.
• Where can I find it?
• What messages are accepted / generated? What syntax?
• How should they be encoded?
Problem 2: Many others also want to offer Web Services
• Need standard format for describing Web Services
Revenue Models
• Sell goods and services and take a cut (just like B&M retailers).(e.g., Amazon, E*Trade, Dell)
• Advertising
– Ads only (original Yahoo)
– Ads in combination with other sources
• Transaction fees
• Sell digital content through subscription. (e.g., WSJ online, Economist Intelligence
Wire)
Open Issues in E‐commerce
• Globalization
• Contractual and Financial Issues
• Ownership
• Privacy and Security
• Interconnectivity and Interoperability
• Deployment
• Main Attraction:
Lower Retail Prices
• “B2C Pure Plays” could eliminate intermediaries, storefront costs, some distribution
costs, etc.
• Archetype: www.amazon.com
Basic Problems Encountered Immediately
• “Customer‐Acquisition Costs” are huge.
• Service is technically commoditizable, and there are no significant network effects.
• Customers’ switching costs are tiny. (Lock‐in to online book‐buying is high. Lock‐in to
Amazon is low. Recall Netscape and IE.)
• Competition is fierce in almost all segments. Few e‐tailers are profitable.
• Investors have run out of money and patience.
Web Services Description Language
• Defines the contract/interface for using a Web Service:
o URI
o Messages accepted & generated (Syntax and datatypes)
o Access protocol
o Message encoding
• Standard format (use of XML)
Web Services Description examples
1. get a temperature for a given US town
2. set the temperature for a given US town
Web Services Description example
<definitions name="Temperature"
    targetNamespace="http://weather.com/USservice"
    [... xmlns declarations ...]>

  <types>
    <xsd:schema
        targetNamespace="http://weather.com/USservice.xsd">
      <xsd:element name="temperature"
          type="xsd:short" />
    </xsd:schema>
  </types>

  <message name="message1">
    <part name="zipcode"
        type="xsd:nonNegativeInteger"/>
  </message>
  <message name="message2">
    <part name="temperature" type="xus:temperature"/>
  </message>

  <interface name="Temperature1">
    <operation name="getTemperature">
      <input message="us:message1"/>
      <output message="us:message2"/>
    </operation>
  </interface>

  <interfaceBinding name="TemperatureGETBinding"
      type="us:Temperature1">
    <http:binding verb="GET"/>
    <operation name="us:getTemperature">
      <http:operation
          location="GetTemperature"/>
      <input>
        <http:urlEncoded />
      </input>
      <output>
        <mime:content type="application/xml"/>
      </output>
    </operation>
  </interfaceBinding>

  <service name="USTemperatureService">
    <endpoint name="get"
        binding="us:TemperatureGETBinding">
      <http:address
          location="http://weather.com/" />
    </endpoint>
  </service>

</definitions>

<definitions name="Temperature"
    targetNamespace="http://weather.com/USservice"
    [... xmlns declarations ...]>

  <message name="message3">
    <part name="zipcode"
        type="xsd:nonNegativeInteger"/>
    <part name="temperature" type="xsd:short"/>
  </message>

  <interface name="Temperature2">
    <operation name="setTemperature">
      <input message="us:message3"/>
    </operation>
  </interface>

  <interfaceBinding name="TemperaturePOSTBinding"
      type="us:Temperature2">
    <http:binding verb="POST"/>
    <operation name="us:setTemperature">
      <http:operation
          location="SetTemperature"/>
      <input>
        <mime:content
            type="application/x‐www‐form‐urlencoded"/>
      </input>
    </operation>
  </interfaceBinding>

  <service name="USTemperatureService">
    <endpoint name="set"
        binding="us:TemperaturePOSTBinding">
      <http:address
          location="http://weather.com/" />
    </endpoint>
  </service>

</definitions>
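To see how a client might turn these two descriptions into concrete HTTP requests, here is a hedged Python sketch. It assumes the usual reading of an HTTP binding: the operation location is resolved against the endpoint address, and the message parts are URL-encoded by part name. The function names and the sample zip code are hypothetical, and nothing is sent over the network.

```python
from urllib.parse import urlencode, urljoin

ADDRESS = "http://weather.com/"   # from <http:address location="..."/>
GET_LOCATION = "GetTemperature"   # from <http:operation location="..."/> (GET binding)
SET_LOCATION = "SetTemperature"   # from <http:operation location="..."/> (POST binding)


def get_temperature_url(zipcode):
    """GET binding: the 'zipcode' part of message1 is appended to the URL."""
    return urljoin(ADDRESS, GET_LOCATION) + "?" + urlencode({"zipcode": zipcode})


def set_temperature_request(zipcode, temperature):
    """POST binding: both parts of message3 travel in a form-encoded body."""
    url = urljoin(ADDRESS, SET_LOCATION)
    body = urlencode({"zipcode": zipcode, "temperature": temperature})
    return url, body


print(get_temperature_url(98052))
print(set_temperature_request(98052, 71))
```

The description gives the client everything the earlier "Problem 1" asked for: where the service is, which messages it accepts, and how they are encoded.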
What is CORBA?
Common Object Request Broker Architecture (CORBA) is a competing distributed systems
technology that offers greater portability than remote method invocation. Unlike RMI, CORBA
isn't tied to one language, and as such, can integrate with legacy systems of the past written in
older languages, as well as future languages that include support for CORBA. CORBA isn't tied to
a single platform (a property shared by RMI), and shows great potential for use in the future.
That said, for Java developers, CORBA offers less flexibility, because it doesn't allow executable
code to be sent to remote systems.
CORBA services are described by an interface, written in the Interface Definition Language (IDL).
IDL mappings to most popular languages are available, and mappings can be written for
languages written in the future that require CORBA support. CORBA allows objects to make
requests of remote objects (invoking methods), and allows data to be passed between two
remote systems. Remote method invocation, on the other hand, allows Java objects to be
passed and returned as parameters. This allows new classes to be passed across virtual
machines for execution (mobile code). CORBA only allows primitive data types, and structures
to be passed ‐ not actual code.
In communication between CORBA clients and CORBA services, method calls are passed to
Object Request Brokers (ORBs). These ORBs communicate via the Internet Inter‐ORB Protocol
(IIOP). IIOP transactions can take place over TCP streams, or via other protocols (such as HTTP),
in the event that a client or server is behind a firewall. The following diagram shows a client and
a servant communicating.
Limitations of CORBA
• Describing services requires the use of an interface definition language (IDL), which must be
learned.
• Implementing or using services requires an IDL mapping to your required language; writing one
for a language that isn't supported would take a large amount of work.
• IDL‐to‐language mapping tools create code stubs based on the interface; some tools may not
integrate new changes with existing code.
• CORBA does not support the transfer of objects or code.
• The future is uncertain: if CORBA fails to achieve sufficient adoption by industry, then CORBA
implementations become the legacy systems.
• Some training is still required, and CORBA specifications are still in a state of flux.
• Not all classes of applications need real‐time performance, and speed may be traded off against
ease of use for pure Java systems.
Service‐Oriented Architectures are an approach to distributed computing that thinks of
software resources as services available on a network. Such architectures are nothing new;
CORBA and DCOM are familiar examples. However, these older examples of service‐orientation
suffered from a few difficult problems. First, they were tightly coupled, which meant that both
ends of each distributed computing link had to agree on the details of the API. A code change
to a COM object, for example, required corresponding changes to the code accessing that
object. Secondly, such Service‐Oriented Architectures were proprietary. Microsoft unabashedly
controlled DCOM, and while CORBA was ostensibly a standards‐based effort, in practice,
implementing a CORBA architecture typically necessitated the decision to work with a single
vendor's implementation of the specification.
Web services is an evolutionary development that improves upon DCOM and CORBA's
weaknesses. What is new about today's Service‐Oriented Architectures built with Web services
is that they are standards‐based and loosely coupled. The reliance upon universally accepted
standards like XML and SOAP provides broad interoperability among different vendors'
solutions, and loose coupling separates the participants in distributed computing interactions
so that modifying the interface of one participant in the exchange does not break the other.
The combination of these two core principles means that companies can implement Web
services without having any knowledge of the consumers of those services, and vice versa. The
Service‐Oriented Architectures we will discuss are the standards‐based, loosely coupled kind,
which we will refer to as SOAs.
CORBA and DCOM: A Feature Comparison
In the following sections, we define each “enterprise” quality and compare the levels of support
currently available in CORBA and DCOM specifications and products. Although this list is not
comprehensive, it stands as a reasonable baseline for middleware comparison. These features are
not necessarily listed in order of priority. Instead, each is treated independently, though many
are highly interdependent. Finally, individual ratings are given at the end of each section to
indicate the relative levels of enterprise readiness. A “+” implies full readiness, “0” connotes
marginal status, and “‐” indicates a failure to meet the overall needs of the enterprise.
Interoperability
Cross‐Language Support
Cross‐language support is one part of the critical interoperability capabilities required of
enterprise systems. While languages such as C++, Visual Basic, and Java are on the rise, COBOL
is still often cited as the most widely used programming language, with an estimated three
million active programmers.
CORBA
CORBA was designed from the ground up to be language and platform independent through
the use of a common Interface Definition Language (IDL). Now an ISO standard, OMG IDL
provides a common notation for describing cross‐platform, cross‐language application program
interfaces (APIs). IDL is used to define the “interface” of the component, not the inner
workings. For this, other standard programming languages are used. IDL interfaces are translated to
standard languages through mappings. Currently, IDL has been mapped to C, C++, Smalltalk,
Ada, OLE (Visual Basic, PowerBuilder, Delphi, etc.), Java, and soon to Eiffel and Objective C. The
benefits of interoperability are not without costs however. IDL was never meant to substitute
for a general‐purpose language. Instead, it was designed to express generalized interfaces. IDL
limits the language data types to a least common denominator that can be supported by all
languages. Although some of the language‐specific data types are not directly usable, IDL does
permit an “any” type to overcome this obstacle.
DCOM
DCOM’s language portability (heterogeneity) is based upon a “binary standard.” Binary
compatibility is accomplished at the ones‐and‐zeros level, an area that has previously been the
domain of computer language compilers and interpreters. To guarantee compatibility at this
level, the way each language is translated to machine binary code must be controlled. This can
present a few obstacles, but also has its benefits. First, there are fundamental differences in
how languages are translated. Since some languages are compiled and others are interpreted,
“binary compatibility” requires that components support all possible translation variations.
Second, there are many compilers/interpreters for a given language, each taking unique
approaches to code translation. Finally, specifying compatibility at such a low hardware level
creates vulnerabilities due to advances in hardware itself.
Microsoft has been successful in controlling the mainstream development tools in the desktop
arena for DCOM’s predecessor, COM. COM is currently supported by the popular array of
Microsoft products as well as Java, PowerBuilder, Delphi, and Micro Focus COBOL. Distributed
COM, however, requires additional support from Microsoft or a third party that ports DCOM
(see Software AG below).
Summary: Both CORBA and DCOM provide extensive support for multiple programming
languages, though they use different techniques.
META Group Consulting
Enterprise Criterion Ratings
Interoperability          CORBA   DCOM
Cross‐Language Support    +       +
Cross‐Platform Support
The “middle” in middleware often refers to the synergistic connection between disparate
enterprise IT resources. Until it is feasible for all enterprise resources to be hosted entirely on
homogenous hardware platforms, middleware must support new and legacy platforms and the
freedom to mix them as required.
CORBA
Cross‐platform support has always been a central focus of CORBA. ORBs currently exist for
more than 30 platforms and support even more Microsoft operating systems than DCOM.
Orbix, one of the leading ORB products, supports 20 platforms itself.
DCOM
DCOM has approached cross‐platform support as an afterthought. In 1993, Microsoft
approached a German company, Software AG, to port DCOM to other platforms. Software AG
has ported DCOM to several Unix variants; however, the port does not include many of the
components of DCOM. For example, many critical supporting technologies have not been
ported, the most important of which are MTS and MSMQ. Without MTS and MSMQ, DCOM is
simply not a viable enterprise middleware. DCOM has also been ported to Macintoshes and
DEC Alphas that run Windows NT. Many other ports are currently in the works (Open VMS,
Digital Unix, HPUX, Solaris, IBM OS/390, IBM AIX, and Linux).
Summary: It should be clear by now that in order to cast either of these technologies into the
enterprise role, a comprehensive collection of critical infrastructure services must be
considered for each of the required platforms. For DCOM, this means exploiting the combined
synergies of COM, MTS, MSMQ, and clustering them together to fulfill the needs of enterprise
computing. Without MTS, for example, DCOM will be unable to fulfill these needs. By way of
comparison, CORBA‐based products typically provide each of their services on all supported
platforms. As such, ORBs are much further ahead in their support for heterogeneous enterprise
environments.
Enterprise Criterion Ratings
Interoperability          CORBA   DCOM
Cross‐Platform Support    +       ‐
Network Communications
Robust support for enterprise network communications requires that middleware seamlessly
provide interoperability with many disparate networked systems. To enable this, the
middleware should be “protocol transparent.”
CORBA
The predominant CORBA networking model for cross‐ORB communication is based on IIOP
(Internet Inter‐ORB Protocol), which carries ORB messages over TCP, a connection‐based transport. IIOP was
specifically designed to ensure that all ORBs use a common communications protocol. Internal
to a particular ORB, however, other protocols are possible. ORB products, similar to DCOM, are
usually remote procedure call (RPC)‐based.
DCOM
Initially, DCOM utilized UDP/DCOM, a connectionless protocol that is based on the OSF’s DCE
RPC specification with some changes. DCOM now offers a TCP protocol configuration as an
option, although by using this, many efficiency features are sacrificed.
Summary: CORBA has established the lead in common network protocol support through the
de facto IIOP standard. DCOM provides protocol options, but does not support them equally.
Enterprise Criterion Ratings
Interoperability          CORBA   DCOM
Network Communications    +       0
Common Services
Common services form the base infrastructure of the middleware. These services are married
to the individual patterns of business in an enterprise setting. For example, a banking model is
highly transaction oriented and requires secure transaction support as a fundamental
middleware service. To this end, most organizations require a number of key services. It should
be understood, however, that not all services are equally important to all enterprises. Where
more than one service is required, it should be fully compatible and interoperable with the
others. Using the OMG specification terminology, we consider the following services as a
minimal set for enterprise middleware: Transactions, Directory, Messaging, Security, and
interoperability. The CORBA road map provides ORB vendors with a path for service
interoperability. This interoperability is required to integrate the best available third‐party
services across platforms. Microsoft’s approach is less explicit, with service interoperability
implied for the NT platform only. CORBA and DCOM products support these basic services in
various degrees.
CORBA
The OMG has concentrated on the development and integration of key architectural services.
Their technology adoption process is specifically aimed at ensuring that services are
implemented in an interoperable manner. The CORBA specification defines 15 services, though
not all commercial ORB products support the complete set. One exception to this is IBM’s COS
(Common Object Services), a suite of the full 15 CORBA services that is compatible with DSOM
and other brokers.
DCOM
DCOM services are less defined from an architectural standpoint, though there are many
CORBA equivalents. DCOM currently offers a limited naming service, transactions, and security
integration with NT. Other services such as MSMQ and clustering are becoming available, but
are not formally integrated into the DCOM specification.
Summary: Full‐service support is not yet available from DCOM or CORBA products. At present,
CORBA products have the lead in the number, maturity, and scope of enterprise‐required
services that are made available to both new and legacy applications.
Enterprise Criterion Ratings
Interoperability          CORBA   DCOM
Common Services           0       ‐
Reliability
Transactions
Transaction support has been the focus of both middleware technologies in recent years.
During 1997, the gaps in both camps were significantly closed.
CORBA
CORBA’s Object Transaction Service (OTS) specification offers a range of services for distributed
transaction support. These services extend the range of traditional flat transactions to support
both flat and nested transactions (since nested transactions break up transactions into
hierarchies of sub‐transactions, this offers developers the flexibility for failures in a
subtransaction to be retried using an alternative method, while the main transaction can
succeed).
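The retry behavior that nested transactions allow can be illustrated with a toy in-memory model in Python. This is not the OTS interface: the `Transaction` class, the `book_flight` helper and the booking data are all hypothetical and exist only to show a failed sub-transaction being rolled back and retried while the main transaction still commits.

```python
class Transaction:
    """Toy transaction that buffers writes until commit."""

    def __init__(self, store, parent=None):
        self.store = store
        self.parent = parent
        self.pending = {}

    def begin_nested(self):
        """Start a sub-transaction whose fate is decided separately."""
        return Transaction(self.store, parent=self)

    def write(self, key, value):
        self.pending[key] = value

    def commit(self):
        if self.parent is not None:
            # A sub-transaction hands its writes to the enclosing transaction.
            self.parent.pending.update(self.pending)
        else:
            # Only the top-level commit changes the shared store.
            self.store.update(self.pending)
        self.pending = {}

    def rollback(self):
        # Discard this transaction's writes; the parent is unaffected.
        self.pending = {}


def book_flight(tx, airline):
    if airline == "A":
        raise RuntimeError("airline A is unavailable")
    tx.write("flight", airline)


store = {}
main = Transaction(store)
main.write("hotel", "booked")

# Try airline A first; if its sub-transaction fails, retry with airline B.
for airline in ("A", "B"):
    sub = main.begin_nested()
    try:
        book_flight(sub, airline)
        sub.commit()
        break
    except RuntimeError:
        sub.rollback()

main.commit()
print(store)
```

With a flat transaction, the failure of the first booking attempt would have forced the whole unit of work, including the hotel booking, to roll back.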
OTS enables both ORB and non‐ORB applications to participate in the same transaction, so that
object transactions and procedural transactions (that support X/Open’s DTP standard) can
interoperate. It also supports transactions across heterogeneous ORBs, so that multiple ORBs
can participate in the same transaction. Also, a single IDL interface supports both transactional
and non‐transactional implementations. To make an object transactional, developers use an
interface that inherits from an abstract OTS class. Taken together, the interfaces for OTS, plus
the Concurrency Control service, offer full commit, rollback, locking and
other capabilities, enabling ORB vendors supporting it to offer distributed transaction
capabilities. A number of the ORB implementers have offered links to tools from traditional TP
monitors, and OTS enables them to incorporate these capabilities directly into the ORB and
distribute them.
The goal of integrating best‐of‐breed transaction products has been widely realized in the ORB
marketplace over the last year. Tuxedo, the most scalable TP monitor for highly distributed
environments, has been successfully integrated with two prominent ORBs. In addition, Visigenic
and Hitachi have integrated TPBroker and Iona has integrated Transarc in OrbixOTS.
DCOM
Microsoft also has been aggressively attacking transaction support in the form of its Microsoft
Transaction Server (MTS). As a fully integrated transaction service, MTS has great potential for
at least the Wintel environment, and is positioned by Microsoft as an extension to DCOM. With
MTS, transactions are supported implicitly, thereby freeing the developer from the complexity
of dealing with transaction services directly. This enables MTS to preserve the
component model. In addition, MTS provides a declarative security model. MTS is in an early
state of maturity, however. Few examples are available to assess the relative scalability of MTS,
and it has not been offered for the cross‐platform environment to date.
Summary: We continue to believe that CORBA will remain the leading‐edge middleware
transaction model for networked objects, with DCOM MTS transaction support suitable for low‐
end processing but gaining ground quickly.
Enterprise Criterion Ratings
Reliability               CORBA   DCOM
Transactions              +       0
Messaging
Reliable transmission and receipt of messages is a foundational quality of distributed
middleware. Without it, the electronic commerce (EC) systems of tomorrow will ultimately fail
in delivering expedient and reliable services to the increasingly demanding marketplace.
Effective messaging requires four important qualities: reliability, user convenience, system
convenience, and performance.
In messaging, reliability means nothing less than guaranteed delivery. To guarantee delivery of
anything requires a reliable middleman, not unlike the US Postal Service. Rain or shine, the
postal service can be relied upon to deliver mail to its eventual destination. The operational
word here is “eventual.” If the weather becomes too severe, postal workers do not throw the
mail away; they hold onto it until the weather permits delivery. The same quality is required of
middleware.
User convenience, system convenience, and performance are highly interrelated qualities. User
convenience means that the sending and receiving parties are not forced to be at a particular
place and time to send and receive messages. This is known in technical terms as asynchronous
communication. With asynchronous communication, a sender or system does not have to wait
until the message is sent AND received before being able to do other work. This convenience
enables all parties — sender, system, and receiver — to continue performing useful work,
regardless of each other’s current situation. To support these needs, distributed middleware
requires a robust message queuing system. Message
queues support asynchronous transmission by providing a persistent queue (message queue) as
a temporary message holding area. Again, CORBA and DCOM approach messaging in different
ways, but both technologies are geared toward the same needs outlined above.
CORBA
The early CORBA specifications addressed messaging from a more primitive standpoint. The
Event Service was the basis of many messaging protocols such as push‐pull and pull‐push. ORBs
typically provided two avenues for messaging: the Event Service primitives or a proprietary
mechanism. The OMG recently addressed a more robust messaging model in the
CORBAservices specification. This specification addresses the asynchronous communication
option that is required by enterprise‐grade applications; however, it has not been adopted yet.
Many ORB products have implemented extensions to the CORBA Events service that provide
reliable messaging. For example, in early 1996, Orbix announced their OrbixTalk Reliable
Multicast Protocol, which provides reliable sequencing and delivery of messages. Some ORB
implementations have integrated an enterprise‐grade messaging service on par with
standalone Message‐Oriented Middleware (MOM). IBM’s ComponentBroker is one example of
integration with MQSeries, a leading MOM product. Iona has been successful in demonstrating
GIOP over MQSeries. BEA has also announced intentions to integrate its newly acquired
MessageQ into the Iceberg product.
DCOM
Formally, DCOM does not directly support asynchronous communication. Microsoft’s answer to
reliable messaging is a separate offering titled Microsoft Message Queue Server (MSMQ) or
Falcon. On the plus side, MSMQ promises to support each of the important qualities of reliable
messaging and more. Unfortunately, MSMQ is not a fully integrated part of DCOM at this time
and has the same interoperability limitations as MTS. Software AG’s EntireX product, a cross‐
platform port of DCOM, is integrated with the proprietary EntireX Message Broker service. This
service does not rely upon MSMQ and provides persistent storage of messages to enable
asynchronous communication between clients and servers.
Summary: Reliable messaging is now being recognized by both CORBA and DCOM as a critical
service for the enterprise. CORBA has been augmented with leading MOM products, but full
inter‐service integration has not yet been achieved. DCOM has also been augmented with early
MOM functionality, but also lacks full integration with other complementary services and is,
again, not available across a wide range of platforms.
Enterprise Criterion Ratings
Reliability CORBA DCOM
Messaging 0 ‐
Security
Clearly, security is one of the key considerations for enterprise computing. Most organizations
cringe at the prospect of opening up the mainframe to the Internet. Distributed applications
that are exposed to the Web simply cannot tolerate security breaches.
CORBA
The CORBA Security service is one of the most comprehensive security specifications available
for distributed computing. The 262‐page specification was jointly adopted with the Time
Service and covers nearly every conceivable aspect of security, including integrity,
accountability, availability, confidentiality, and non‐repudiation. It also recognizes that differing
levels of security needs exist in an enterprise environment. The service defines three (0‐2)
levels of security compliance, ranging from non‐aware ORB products to those that require the
entire range of services (access control, delegation, auditing, authentication, and policy
implementation).
ORB products differ widely in their support for security. For example, ICL’s DAIS product was
the first ORB to offer CORBA security conforming to Kerberos and the GSS API standards. Orbix
provides both the SSL‐IIOP standard (secure encrypted communications over the Internet) and
an implementation of the CORBA Security Level 1 service. Finally, Visigenic has recently
partnered with MITRE Corporation spin‐off Concept Five to deliver the first ORB complying with
CORBA Level 2 security.
DCOM
DCOM utilizes NT mechanisms as the basis of its security support. NT Version 3.5 has been
rated level C2 by the National Computer Security Center and ensures a comprehensive array of
security controls such as discretionary access, authentication, and auditing. DCOM also
provides a CryptoAPI to enable advanced encryption of information. This service requires the
support of a Cryptographic Service Provider (CSP) that is provided with NT. Without question,
the combination of NT, MTS, and COM can provide a comprehensively secure environment;
however, because DCOM’s security managers are NT dependent, this support is limited to
Windows/NT platforms.
Summary: CORBA and DCOM are both building comprehensive security mechanisms. To
CORBA’s credit, the recognition of a wide variety of security services will provide more solutions
to the differing needs of the enterprise. For DCOM, the cooperation of the operating system is
paramount to providing high levels of security. Although from different directions, both
middleware technologies are reaching critical mass in their support for secure distributed
computing.
Enterprise Criterion Ratings
Reliability CORBA DCOM
Security 0 0
Directory Service
An essential feature of any middleware is the ability to keep track of the location of key services
in the distributed network space. This lessens the burden of each application (provides location
transparency) and, more importantly, provides for load balancing and failover services.
Examples of working directory services include DNS, X.500, Novell NDS, and Microsoft NTDS,
though each is accessed by a specialized interface.
CORBA
The OMG has specified the Naming Service for just this purpose. Similar to a “white pages”
directory, the Naming Service permits a component to look up a service by name. The Naming
Service was designed to allow the use of conventional directory services such as those
identified above. These services are wrapped by a higher‐level service interface that masks
idiosyncrasies from the developer.
VisiBroker offers a CORBA‐compliant naming service that is fault tolerant (self‐recovering) and
persistent (survives shutdowns and abnormal failures), and supports federated name spaces.
Orbix also provides a fault‐tolerant naming service.
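The "white pages" idea can be illustrated with a toy name-to-reference table. This is a sketch only: the class NamingContext below is a hypothetical stand-in written for this example, not the OMG CosNaming interface, and the object references are plain strings.

```python
class NamingContext:
    """Toy white-pages directory: maps a service name to an object reference."""

    def __init__(self):
        self._bindings = {}

    def bind(self, name, obj_ref):
        # Refuse to overwrite an existing entry silently.
        if name in self._bindings:
            raise KeyError(f"already bound: {name}")
        self._bindings[name] = obj_ref

    def resolve(self, name):
        # Raises KeyError if the name was never bound.
        return self._bindings[name]

directory = NamingContext()
directory.bind("PayrollService", "host-a:9001")   # hypothetical reference
found = directory.resolve("PayrollService")
print(found)
```

The client only needs the name; where the service actually runs is hidden behind the lookup, which is the location transparency mentioned earlier.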
DCOM
Microsoft’s answer to this need is called the Active Directory Service (ADS). This service is said
to combine the best features of X.500 and DNS. Like the OMG Naming Service, ADS intends to
abstract differences between various directory services by providing one standardized
interface. ADS Version 1.0 is offered with NT 4.0, with the full ADS capability to be integrated in
NT 5.0. The ADS Interface (ADSI) is based on DCOM with specific offerings from directory
service providers being implemented as DCOM objects.
Summary: Both CORBA and DCOM are beginning to support sophisticated directory services on
par with previous “enterprise tested” incarnations such as NDS and DNS.
Enterprise Criterion Ratings
Reliability CORBA DCOM
Directory Service 0 0
Fault Tolerance
Middleware’s ability to “heal itself” in the event of reasonable failure is essential for most
enterprise applications. There are many supporting mechanisms that contribute to this
capability. Asynchronous messaging (discussed under Messaging) is one example. Service pools
and redundant failover mechanisms also enable graceful recovery and increase the fault
tolerance and reliability of middleware. Finally, a reliable directory service is needed to find and
connect backup services in the event of failure.
CORBA
The CORBA specification does not directly support fault‐tolerance services; however, many ORB
vendors have provided this support. For example, Visigenic’s VisiBroker provides symmetric
failover support to automatically bind to another object server on a separate host in the event
of service failure.
Most ORBs provide a simple timeout mechanism for detecting dead or disconnected clients.
This approach alone is not sufficient for highly fault‐tolerant applications.
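A failover bind of the kind described for VisiBroker can be sketched as "try each replica in turn". The code below is an assumption-laden illustration: bind_with_failover, fake_connect, and the host names are invented for the example and do not correspond to any ORB vendor's API.

```python
class ServiceUnavailable(Exception):
    """Raised when a host cannot be reached."""

def bind_with_failover(hosts, connect):
    """Try each host in order; return (host, connection) for the first that answers."""
    errors = []
    for host in hosts:
        try:
            return host, connect(host)
        except ServiceUnavailable as exc:
            errors.append((host, str(exc)))   # remember why this host failed
    raise ServiceUnavailable(f"no replica reachable: {errors}")

def fake_connect(host):
    # Hypothetical stand-in for an ORB bind call; "primary" is simulated as down.
    if host == "primary":
        raise ServiceUnavailable("primary is down")
    return f"connection-to-{host}"

bound_host, conn = bind_with_failover(["primary", "backup"], fake_connect)
print(bound_host, conn)
```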
DCOM
Basic support for fault tolerance is provided at the protocol level. DCOM utilizes reference
counts augmented by “keep alive” messages or pinging as an essential component of the DCOM
object life cycle. It requires the successful transmission and receipt of a heartbeat every two
minutes between a client and server. If three consecutive heartbeats are missed, the server
declares the client dead and decrements the reference count. According to a recent Web FAQ
(COM Reliability FAQ, http://akpublic.research.att.com/~ymwang/resources/COM‐R‐FAQ.htm)
maintained by AT&T Labs (updated Nov. 5, 1997), DCOM does not support configurable times,
so clients may not detect problems for a considerable period of time (six minutes). Further, if a
distributed component gets into a continuous loop, there is no automated way to detect a
problem, because the heartbeats will still be sent. Finally, this approach utilizes significant
network resources and may
not scale well for large numbers of connections. Microsoft has taken positive steps to
streamline this approach and has employed piggybacking, grouped pings, and delta pinging to
reduce network traffic. Anything beyond this generally requires extensive customization on
both the middleware and application side.
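The six-minute figure follows from the numbers in the text: a two-minute ping period multiplied by three consecutive misses. The sketch below simulates that bookkeeping on the server side. It is a simplified model under those stated numbers, with invented names (PingedReference, seconds_until_declared_dead), and is not the actual DCOM protocol implementation.

```python
PING_PERIOD_S = 120   # two minutes between heartbeats, per the text
MISSED_LIMIT = 3      # consecutive misses before the client is declared dead

class PingedReference:
    """Server-side view of one client reference kept alive by pings."""

    def __init__(self, ref_count=1):
        self.ref_count = ref_count
        self.missed = 0
        self.client_alive = True

    def on_period_elapsed(self, ping_received):
        if not self.client_alive:
            return
        if ping_received:
            self.missed = 0            # any ping resets the miss counter
            return
        self.missed += 1
        if self.missed >= MISSED_LIMIT:
            self.client_alive = False  # declare the client dead
            self.ref_count -= 1        # and release its reference

def seconds_until_declared_dead():
    """Simulate a client that vanishes and never pings again."""
    ref = PingedReference()
    elapsed = 0
    while ref.client_alive:
        elapsed += PING_PERIOD_S
        ref.on_period_elapsed(ping_received=False)
    return elapsed

print(seconds_until_declared_dead())
```

Because a single received ping resets the counter, a component stuck in a loop but still pinging is never declared dead, which is the weakness the text points out.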
Summary: Neither DCOM nor CORBA directly supports robust fault tolerance; however,
with sophisticated customization, it can be provided. For DCOM, it is not clear whether such
customization is possible across a heterogeneous platform environment, because most
workarounds currently require the support of NT or Windows 95 components.
Enterprise Criterion Ratings
Reliability CORBA DCOM
Fault Tolerance 0 0
Performance
Scalability
We generally define scalability as the middleware’s ability to perform when the size of the
problem increases. Middleware performance can be highly variable depending on how it is
used. For example, component granularity is one of the most significant drivers of performance
stress. In other words, as the pieces get smaller, their sheer number grows, causing the
middleware substrate mechanisms to work harder. As this occurs, the need arises for
middleware mechanism tuning in ways that conventional database products have supported.
Finally, middleware performance is very costly to measure, since only in vitro modeling can
provide a reasonable capacity estimation.
There must be compelling evidence, via current implementations or anecdotal data, that
indicates the middleware’s ability to scale through various scenarios. These scenarios may
include numbers of objects or users. Key areas where impacts are most likely are found in
services that commonly aggregate components such as naming services or interface
repositories. Finally, middleware must be able to support threads to allow parallel processing of
work.
CORBA
As a specification, CORBA does not address specific scalability services aside from providing for
the transparent distribution of processing. Instead, individual ORBs deal with this problem in
one of two ways:
Threading — Many ORBs provide thread‐safe libraries that use each operating system’s native
thread model. This enables threads to be created for clients, objects, or even specific method
calls of an object. In addition, several ORB products also support thread pools. Filters can also
be used to balance processing based on current loading.
Tuning — ORB products provide various internal tuning mechanisms to enable optimization for
specific situations. For example, internal memory representations can be changed to order
references by “most frequently used” or other criteria that suit the specific conditions.
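The thread-pool approach mentioned under Threading can be sketched with the Python standard library. This is a generic illustration, not any ORB's API: handle_request is a placeholder for a method call on a server object, and the pool size of 4 is an arbitrary assumption.

```python
from concurrent.futures import ThreadPoolExecutor

def handle_request(n):
    # Placeholder for the work done by one method call on a server object.
    return n * n

def serve(requests, pool_size=4):
    """Dispatch requests to a fixed pool of worker threads, keeping result order."""
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        return list(pool.map(handle_request, requests))

results = serve([1, 2, 3, 4, 5])
print(results)
```

A fixed pool bounds the number of threads regardless of how many requests arrive, which is why pools are preferred over creating one thread per call under heavy load.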
DCOM
DCOM offers similar scalability mechanisms such as parallel processing and threading. As with
CORBA, DCOM features are not transparently supported, and require detailed knowledge of
client/server interactions to implement.
Thread pools: DCOM utilizes thread pool managers to maximize scalability; however, Windows
NT symmetric multiprocessing is required to support this feature.
Summary: Both middleware products are relatively nascent in their support for highly scalable
enterprise applications.
There are, however, a growing number of large‐scale ORB implementation examples in the
investment, aerospace, and telecommunications industries. In addition, indirect evidence of
scalability can be inferred when combining ORBs with enterprise‐grade products such as
commercial TP monitors and MOM. Concrete evidence of large‐scale DCOM enterprise
applications is not readily available at this time.
Enterprise Criterion Ratings
Performance CORBA DCOM
Scalability 0 ‐
Viability
Product Maturity
Despite the current fragmented condition of middleware offerings, we believe IT organizations
should be consumers of middleware components, not producers. As of this writing, the best
way to manage the massive complexity of middleware is through the purchase and
customization of commercial frameworks that organize their flexibility into structured
application packages. Such frameworks go beyond the primitive and complex services that
CORBA and DCOM can provide but still require a minimum maturity level from each.
CORBA
Many commercial ORB products are in their third generation of development. As such, we are
beginning to see a critical mass of services (Directory, Messaging, Transactions, and Security) in
several leading products. Although the OMG specification is explicitly designed to ensure these
services are well integrated, no single ORB vendor has brought them all together in strict
CORBA compliance. Irrespective of this, ORBs are now being used for enterprise systems in
many demanding industries, including telecommunications, aerospace, and investment.
DCOM
The arrival of the predator services (Falcon, Viper, Wolfpack, and Active Directory) represents
Microsoft’s recognition of what must be in place in an enterprise setting. Like the leading ORB
products, these services are not fully integrated (even in the “NT only” environment) and are
not explicitly part of DCOM. What is worse, platform interoperability is only just appearing on
the DCOM radar screen and will likely be the last piece to fall into place.
Summary: Products from both DCOM and CORBA are only just beginning to aspire to the so‐
called “heavy lifting” demands of the enterprise. At the end of the day, representative products
from both middleware camps require a tremendous amount of financial fortitude and technical
expertise from the adopting enterprise to be successful.
Enterprise Criterion Ratings
Viability CORBA DCOM
Product Maturity 0 ‐
Vendor Outlook
Clearly, the trick is to buy and extend frameworks that are based on the likely winners of the
middleware framework wars, not the losers.
CORBA
CORBA differs from DCOM in an important way. While the CORBA specification is controlled by
the OMG standards body, ORBs are produced by a variety of vendors (though most belong to
the OMG). This separation has caused — and always will cause — a natural tension between
the need to differentiate product offerings and the need for interoperable standards
compliance. The OMG currently enjoys membership and backing from some 760 members. This
large membership will continue to uphold the OMG’s emphasis on interoperable and
standardized solutions but is often to blame for slow progress.
There are several leading vendors whose viability is sound for at least the next several
generations of enterprise technology. Users must insist on enterprise players that will not only
survive the middleware wars but also remain committed to ORBs as part of their long‐term,
strategic direction. Such vendors as IBM, BEA, and Orbix appear to fit this category.
DCOM
Clearly, Microsoft will be one of the survivors. Microsoft stands as testament to the difference
between technological elegance and marketing leadership, a distinction that should never be
overlooked. While there is no question about Microsoft’s intention to support its value
proposition — optimized product integration on lower‐cost NT platforms — overwhelming
support for competing platforms would not be a logical assumption. That being said, Microsoft
will continue to focus its attentions on real scalability, manageability, and low cost in the NT
environment.
Summary: Both technologies will continue to have significant market backing and support.
DCOM will continue to enjoy heavy independent software vendor (ISV) tool support while ORBs
will continue to be supported by corporate customers. It is important to note, however, that
non‐DCOM integrated services such as MTS and MSMQ have not been widely tested, and ISV
acceptance for these is yet to be determined.
Enterprise Criterion Ratings
Viability CORBA DCOM
Vendor Outlook + +
Web service technology stack
Discovery: fetch descriptions of providers. UDDI, WS‐Inspection.
Description: describe services. WSDL.
Packaging: serialization or marshaling of messages. SOAP.
Transport: application‐to‐application communication. HTTP, SMTP, TCP, Jabber.
Network: network layer. TCP/IP
Communications Layer
Web Services are essentially transport‐neutral.
A web service message can be transported using HTTP or HTTPS, as well as more
specialized transport mechanisms, such as JMS. Web services insulate the designer
from most of the details and implications of the message transport layer.
Messaging Layer
SOAP = Simple Object Access Protocol
A protocol to exchange structured information in a distributed environment.
SOAP extensions:
1) WS‐Reliable Messaging ‐ a standard for web services messaging to guarantee the
receipt of messages for WS requestors and providers
2) WS‐Transactions ‐ a series of standards related to WS invocations in transactions
(atomicity, consistency, isolation and durability semantics)
Descriptions Layer
WSDL = Web Services Description Language
A language that allows a service provider to specify the functional characteristic of its
web services.
WSDL extensions:
1) WS‐Policy ‐ augment WSDL with non‐functional constraints on WS
2) WS‐Resource Properties – describes how to define and access properties of resources
through WS
Processes Layer: Discovery
Discovery ‐ locating a machine‐processable description of a web service that may have
been previously unknown and that meets certain criteria.
UDDI = Universal Description, Discovery and Integration
UDDI defines a way to store and retrieve information about web services.
Processes Layer: Choreography
WS Interoperability
Web Services tackle the set of problems related to loosely coupled dynamically
configured heterogeneous distributed computing.
WS Specifications:
1) A series of smaller, purpose‐focused specifications dealing with narrow problems
(security, transactions, etc.) in isolation.
2) Each WS specification is designed to be composed with the others.
3) WS designers determine which specifications their system needs and implement
them accordingly.
WS‐I Organization
Web Services Interoperability organization (WS‐I):
1) WS‐I is to standardize combinations of WS specifications that can be used to increase
the level of interoperability between web services.
2) WS‐I promotes the Basic Profile ‐ implementation guidelines for how non‐proprietary
WS specifications, such as SOAP, WSDL, UDDI, should be used together for best
interoperability.
Current Web services stack
In this section, I will recap the standards and the tools used to implement them at each layer in
the Web services stack. I start from the bottommost layer, transport, and work my way up
one layer at a time until I reach the final layer, service flow.
Transport layer
As you can see in Figure 2, the transport layer is the foundation of the Web services stack. Web
services must be network accessible so that a service client can invoke them. Web
services that are publicly available run over the Internet. Only the authorized users within an
internal organization can view intranet‐available Web services, while unauthorized users in the
outside world cannot. It is possible to create extranet‐based Web services to allow legitimate
users access to them on more than one intranet.
Figure 2. Original Web services stack
The Internet protocols that can be supported by this stack are HTTP and, to a lesser extent,
SMTP (for electronic mail) and FTP (for file transfer). Intranet domains use middleware call
infrastructures, such as IBM's MQSeries, and CORBA (the Common Object Request Broker
Architecture). The latter relies on a protocol called the Internet Inter‐ORB Protocol (IIOP) for
remote objects.
MQSeries name changes
Note that the following MQSeries products have been renamed as part of the consolidation of
IBM's middleware product portfolio:
• MQSeries has been renamed WebSphere MQ.
• MQSeries Integrator has been renamed WebSphere MQ Integrator.
• WS BtoBi PAM has been renamed WebSphere Partner Agreement Manager.
XML‐based messaging layer
In this layer, SOAP is the messaging protocol for XML. It is built upon the lower layer ‐‐ transport
‐‐ meaning that SOAP is used singly or in combination with any transport protocols. All SOAP
messages support the publish, bind, and find operations in the Web services architecture. SOAP
comprises three parts: an envelope to describe the content of a message, a set of encoding
rules, and a mechanism for providing remote procedure calls (RPCs) and responses.
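To make the envelope and RPC parts concrete, the following sketch builds a minimal SOAP 1.1 request with the Python standard library. The envelope namespace is the SOAP 1.1 one; the application namespace urn:example:hotel, the method ReserveRoom, and its parameters are hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"   # SOAP 1.1 envelope namespace
APP_NS = "urn:example:hotel"                              # hypothetical application namespace

def build_request(method, params):
    """Wrap an RPC-style call in a SOAP envelope and return it as an XML string."""
    ET.register_namespace("soap", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{APP_NS}}}{method}")   # the procedure being called
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)       # one child element per argument
    return ET.tostring(envelope, encoding="unicode")

request_xml = build_request("ReserveRoom", {"hotel": "Grand", "nights": 2})
print(request_xml)
```

The same string could be handed to any transport, which is the sense in which SOAP sits on top of the transport layer.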
IBM, Microsoft, and others submitted SOAP to the W3C as the basis of the XML Protocol
Working Group. When the W3C releases a draft standard for the XML protocol, the Web
services architecture stack will migrate from SOAP to the XML protocol.
Service description layer
Service description provides the means of invoking Web services. WSDL is the basic standard
for defining, in XML format, the implementations and interfaces of services. This means that
WSDL divides a service description into two parts: service implementation and service interface.
You must create a service interface before you can implement WSDL.
Service publication layer
Service discovery layer
Service discovery relies on service publication; if a Web service is not or cannot be published, it
cannot be found or discovered. The service client can make the service description available to
an application at runtime. The service client, for example, can retrieve a WSDL document from
a local file obtained through a direct publish. This action is known as static discovery. The
service can also be discovered at design or run time using either a local WSDL registry or a
public or private UDDI registry.
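Static discovery from a local WSDL document can be illustrated as follows. This is a sketch under stated assumptions: the WSDL text is a made-up, heavily trimmed WSDL 1.1 fragment (HotelService, HotelPort, and the operation names are invented), embedded as a string instead of being read from a file, and list_operations is a helper written for this example.

```python
import xml.etree.ElementTree as ET

WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"   # WSDL 1.1 namespace

# Hypothetical, heavily trimmed service description standing in for a local file.
LOCAL_WSDL = """<?xml version="1.0"?>
<definitions name="HotelService"
             targetNamespace="urn:example:hotel"
             xmlns="http://schemas.xmlsoap.org/wsdl/">
  <portType name="HotelPort">
    <operation name="CheckAvailability"/>
    <operation name="ReserveRoom"/>
  </portType>
</definitions>
"""

def list_operations(wsdl_text):
    """Return {portType name: [operation names]} found in a WSDL document."""
    root = ET.fromstring(wsdl_text)
    operations = {}
    for port_type in root.findall(f"{{{WSDL_NS}}}portType"):
        names = [op.get("name") for op in port_type.findall(f"{{{WSDL_NS}}}operation")]
        operations[port_type.get("name")] = names
    return operations

print(list_operations(LOCAL_WSDL))
```

With a registry, the client would first query for the location of the WSDL document and then process it in the same way.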
Service flow layer
Web Services Flow Language (WSFL) is the standard for the service flow layer at the top of the
stack. It differs from other standards in the stack in that it looks at business process modeling
and workflows. WSFL is used to describe how Web services are to interact in workflows and
how they are to perform in service‐to‐service communications or collaborations. This means
that Web services are components of, or can be dynamically orchestrated into, workflows ‐‐
between a buyer, a seller, and a shipper, for instance.
For example, WSFL allows a workflow manager to call from a composite Web service each
individual Web service with its particular role in a business process; such processes could
include managing financial reports, supporting forecasts and budgets in a five‐year IT plan, or
making a hotel reservation. For instance, in making a hotel reservation, workflow components
could include:
• An enterprise's private Web services collaborating to present a single Web service
interface to the public.
• Different enterprises providing Web services in a collaborative effort to perform
business‐to‐business transactions.
You need a tool, such as IBM MQSeries Workflow (now called WebSphere Process Manager;
see the sidebar), to define business processes as a series of activities, and to vary the sequence
of these activities as the requirements for business processes change.
Suggested Web services stack
The original Web services stack needs to be updated to incorporate IBM's new standards. This
is achieved by the addition of several new layers: a service user interface/presentation layer, a
service agreement layer, and a service security layer. Figure 3 illustrates the suggested stack.
Figure 3. Suggested Web services stack
Let's start with the topmost layer: the service user interface/presentation layer. The standard
associated with this layer is called Web Services Experience Language (WSXL), and is used to
describe how user experiences should be delivered to end users (for example, through a
browser, or a portal, or by embedding into a third party interactive Web application). It is
independent of presentation markup.
WSXL logically sits atop the WSFL, as user experiences depend on how Web services are to
interact in workflows. The WSFL uses WSDL for the description of service interfaces and their
protocol bindings. The service description layer for WSDL consists of two components: the
service implementation layer and the service interface layer. The definition for the
implementation layer describes how a service provider implements a service interface. You
must create a service interface before you can implement WSDL.
WSDL, in turn, relies on the WS‐Security specification, which, as the specification itself states, is
used to describe "how to attach signature and encryption headers to SOAP messages." (See
Resources for the full specification.) Included in the security specifications are other types of
message attachments, such as X.509 and Kerberos tickets. IBM's WSCA 1.0 specifies how the
WSDL description of the service interface definition and the service implementation definition
can be derived from the UDDI entries. This means that UDDI is used as a service registry for
WSDL‐based services.
When not using UDDI, you may want to use the alternative WS‐Inspection as a parallel to the
service discovery layer. Both standards provide "directory assistance" via third‐party sources.
While WS‐Inspection primarily supports focused discovery (active search) patterns, along with
some unfocused (open‐ended browsing) ones, UDDI (static) is limited to focused discovery
patterns. Additionally, WS‐Inspection has two other characteristics that UDDI does not possess:
direct communication (via voice) and simple aggregate token (via business card), both of which
are disseminated by the originator.
As you can see in Figure 3, the service agreement layer has been inserted between the service
flow and service discovery layers. This new layer, TPA (short for Trading Partner Agreement),
describes an agreement between two partners (for example, business partners) on how they
should interact regarding a service, as specified in
the service flow layer. In a procurement/purchase order system, the seller (the service
provider), for instance, must have Web services to receive request for quote (RFQ) messages,
purchase order (PO) messages, invitation‐for‐bid (IFB) solicitation query messages, and
payment messages. The buyer (the service client) must have Web services to receive RFQ
quotes, invoice messages, bid award notification messages, and account summary messages.
Between the provider and the client are many steps involved in the workflow of exchanging
messages.
Web Services Stacks
To ensure interoperability when performing the publish, find and bind operations expressed in
the Service Oriented Architecture (SOA) diagram, conceptual and technical standards must be
defined for each role and type of interaction. This section will explore each of the roles and
interactions in order to identify each relevant stack of technologies.
There are overarching concerns involving security, management and quality of services that
must be addressed at each layer in the conceptual and technical stacks. The various solutions at
each layer may or may not be independent of one another. More of these overarching concerns
will emerge as the web services paradigm is adopted and evolved. What is most important is
building a framework through which all such concerns may be applied to each of the layers in
the stack so that as new concerns emerge they may be dealt with flexibly and consistently.
At the end of this section we assemble the independent stacks into a single stack where each
additional layer builds upon the capabilities provided by those below it. The vertical towers
represent the variety of overarching concerns that must be addressed at every level of each of
the stacks.
An important point is that, towards the bottom layers of the stack, the technologies and
concepts are relatively more mature and achieve a higher level of standardization than many of
the upper layers. The maturation and adoption of Web services will drive the continued
development and standardization of the higher levels of the stack and the overarching
concerns.
3.3.1 Wire "Stack"
The wire stack encapsulates the concepts and technologies dealing with the actual physical
exchange of information between any of the roles in the SOA diagram. This includes the variety
of network transport, message packaging and message extensions that may be utilized to facilitate
data exchange.
3.3.1.1 Transport
The foundation of the web services stack is the network. Web services must be network
accessible to be invoked by a service requestor. Web services that are publicly available on the
Internet use commonly deployed network protocols. Because of its ubiquity, HTTP is the de
facto standard network protocol for Internet‐available web services. Other Internet protocols
may be supported including SMTP and FTP. Intranet domains may use proprietary or platform
and vendor specific protocols such as MQSeries, CORBA, etc. The specific choice of network
protocol used in any given scenario depends entirely upon application requirements, including
concerns such as security, availability, performance, and reliability. This allows web services to
capitalize on existing higher function networking infrastructures and message oriented
middleware, such as MQSeries.
Within an enterprise with multiple types of network infrastructures, HTTP can be used as a
common, interoperable bridge to connect disparate systems. One of the benefits of web
services is that they provide a unified programming model for the development and usage of
private Intranet as well as public Internet services. As a result, the choice of network technology
can be made entirely transparent to the developer and consumer of the service.
3.3.1.2 Packaging
Moving up the Wire stack, the next layer, Packaging, represents the technologies that may be
used to package information being exchanged. XML has been broadly adopted as the basis for
Web service message packaging protocols.
SOAP is a simple and lightweight XML‐based mechanism for creating structured data packages
that can be exchanged between network applications. SOAP consists of four fundamental
components: an envelope that defines a framework for describing message structure, a set of
encoding rules for expressing instances of application‐defined data types, a convention for
representing remote procedure calls (RPC) and responses, and a set of rules for using SOAP
with HTTP. SOAP can be used in combination with a variety of network protocols, such as HTTP,
SMTP, FTP, RMI/IIOP, or a proprietary messaging protocol.
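The envelope component described above can be illustrated with a minimal Python sketch. The envelope namespace is the one defined by SOAP 1.1; the application namespace `urn:example:stockquote` and the `GetQuote` operation are hypothetical names chosen only for this illustration.

```python
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace.
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
# Hypothetical application namespace, used only for this illustration.
APP_NS = "urn:example:stockquote"


def build_request(symbol: str) -> bytes:
    """Wrap an RPC-style call to a hypothetical GetQuote operation in a SOAP envelope."""
    ET.register_namespace("soapenv", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{APP_NS}}}GetQuote")
    ET.SubElement(call, f"{{{APP_NS}}}symbol").text = symbol
    return ET.tostring(envelope, encoding="utf-8")


message = build_request("IBM")
print(message.decode("utf-8"))
```

The resulting bytes are independent of any transport; the same message could be carried over HTTP, SMTP, or a messaging product.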
SOAP is currently the de facto standard for XML messaging for a number of reasons. First, SOAP
is relatively simple, defining a thin layer that builds on top of existing network technologies
such as HTTP that are already broadly implemented. Second, SOAP is flexible and extensible in
that rather than trying to solve all of the various issues developers may face when constructing
Web services, it provides an extensible, composable framework that allows solutions to be
incrementally applied as needed. Third, SOAP is based on XML. Finally, SOAP enjoys broad
industry and developer community support.
3.3.1.3 Extensions
Building on the transport and packaging layers, the final layer in the Wire stack provides a
framework that allows additional information to be attached to Web service messages
representing a variety of additional concerns, such as context, routing, and policy. As a key part
of its envelope message structure, SOAP defines a mechanism to incorporate orthogonal
extensions (also known as features) to the message in the form of headers and encoding rules.
It is expected that as Web services are adopted and evolved, a broad collection of such
extensions will emerge and be standardized.
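As a rough illustration of the header mechanism, the following Python sketch attaches one extension block to a SOAP 1.1 envelope. The routing namespace `urn:example:routing` and the `Priority` block are invented for this example and do not correspond to any standardized extension.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
# Hypothetical extension namespace; not a real standard.
EXT_NS = "urn:example:routing"


def add_header_block(envelope: ET.Element, name: str, value: str, required: bool) -> ET.Element:
    """Attach an extension header block, creating the Header element if needed."""
    header = envelope.find(f"{{{SOAP_ENV}}}Header")
    if header is None:
        # In SOAP 1.1 the Header, when present, is the first child of the Envelope.
        header = ET.Element(f"{{{SOAP_ENV}}}Header")
        envelope.insert(0, header)
    block = ET.SubElement(header, f"{{{EXT_NS}}}{name}")
    block.text = value
    # mustUnderstand="1" asks the receiver to fault if it cannot process this block.
    block.set(f"{{{SOAP_ENV}}}mustUnderstand", "1" if required else "0")
    return block


envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
block = add_header_block(envelope, "Priority", "high", required=True)
print(ET.tostring(envelope, encoding="unicode"))
```

Because the block lives in the header, the body payload and the application code that produces it stay unchanged when an extension is added.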
3.3.2 XML Messaging with SOAP
[Figure 1: graphic missing in the source]
Typically, a SOAP Server running in a web application server performs these functions.
Alternatively, a programming language‐specific runtime library can be used that encapsulates
these functions within an API. Application integration with SOAP can be achieved by using four
basic steps:
• In Figure 1 above, at (1), a service requestor’s application creates a SOAP message.
This SOAP message is the request that invokes the web service operation provided by
the service provider. The XML document in the body of the message can be a SOAP RPC
request or a document‐centric message as indicated in the service description. The
service requestor presents this message together with the network address of the
service provider to the SOAP infrastructure (e.g. a SOAP client runtime). The SOAP client
runtime interacts with an underlying network protocol (e.g. HTTP) to send the SOAP
message out over the network.
• The network infrastructure delivers the message to the service provider’s SOAP runtime
(e.g. a SOAP server) (2). The SOAP server routes the request message to the service
provider's web service. The SOAP runtime is responsible for converting the XML
Message into programming language specific objects if required by the application. This
conversion is governed by the encoding schemes found within the message.
• The web service is responsible for processing the request message and formulating a
response. The response is also a SOAP message. The response SOAP message is
presented to the SOAP runtime with the service requestor as the destination (3). In the
case of synchronous request/response over HTTP, the underlying request/response
nature of the networking protocol is used to implement the request/response nature of
the messaging. The SOAP runtime sends the SOAP message response to the service
requestor over the network.
• The response message is received by the networking infrastructure on the service
requestor’s node. The message is routed through the SOAP infrastructure, potentially
converting the XML message into objects in a target programming language (4). The
response message is then presented to the application.
This example uses the request/response transmission primitive that is quite common in most
distributed computing environments. The request/response exchange may be synchronous or
asynchronous. Other transmission primitives, such as one‐way messaging (no response),
notification (push‐style response), and publish/subscribe, are also possible using SOAP.
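The four steps can be traced in a small in-process Python sketch. It is only an illustration under stated assumptions: the `Add` operation and the `urn:example:calc` namespace are hypothetical, and a direct function call stands in for the HTTP transport.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
APP_NS = "urn:example:calc"  # hypothetical application namespace


def wrap(payload: ET.Element) -> bytes:
    """Place an application payload inside a SOAP Envelope/Body."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body").append(payload)
    return ET.tostring(envelope)


def unwrap(message: bytes) -> ET.Element:
    """Return the first child of the SOAP Body."""
    body = ET.fromstring(message).find(f"{{{SOAP_ENV}}}Body")
    return body[0]


def add_service(a: int, b: int) -> int:
    """The web service implementation: plain application code."""
    return a + b


def soap_server(request: bytes) -> bytes:
    # (2) The server runtime converts the XML into language-level values...
    call = unwrap(request)
    a = int(call.findtext(f"{{{APP_NS}}}a"))
    b = int(call.findtext(f"{{{APP_NS}}}b"))
    # (3) ...invokes the service and wraps the result as a SOAP response.
    response = ET.Element(f"{{{APP_NS}}}AddResponse")
    ET.SubElement(response, f"{{{APP_NS}}}result").text = str(add_service(a, b))
    return wrap(response)


def soap_client(a: int, b: int, transport) -> int:
    # (1) The client runtime builds the request and hands it to the transport.
    call = ET.Element(f"{{{APP_NS}}}Add")
    ET.SubElement(call, f"{{{APP_NS}}}a").text = str(a)
    ET.SubElement(call, f"{{{APP_NS}}}b").text = str(b)
    reply = transport(wrap(call))
    # (4) The response is converted back into a language-level value.
    return int(unwrap(reply).findtext(f"{{{APP_NS}}}result"))


# The "network" here is a direct function call standing in for HTTP.
print(soap_client(2, 3, transport=soap_server))
```

Neither `add_service` nor the caller of `soap_client` touches XML; the conversion is confined to the two runtime functions, which is the role the text assigns to the SOAP infrastructure.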
3.3.2.1 Interactions
• One way: Message sent from requestor to provider. Provider may or may not return a
response. If the provider returns a response, the requester may have already stopped
‘listening’ for it or closed the communications channel. The response will be ‘thrown away’
and not processed by the requester.
• Conversational: Requester and Provider exchange multiple messages. Can be defined by
choreography language.
• Many‐to‐Many: Requester sends message to many providers. Or, service provider
responds to many requestors. Can be defined by choreography language.
3.3.3 Description "Stack"
It is through the service description that the service provider communicates all the
specifications for invoking the Web service to the service requestor. The service description is a
key contributor to making the Web services architecture loosely coupled and to reducing the
amount of required shared understanding, custom programming and integration between the
service provider and the service requestor. For example, neither the requestor nor the provider
must be aware of the other's underlying platform, programming language, or distributed object
model (if any). The service description combined with the underlying XML Message
infrastructure defined in the Wire stack sufficiently encapsulates this detail away from the
service requestor's application and the service provider’s Web service.
We describe the constituent parts of the service description used in the Web services
architecture in two groups, those used to fully describe one Web service and those used to
describe interactions or relationships between sets of Web services.
Discovery Agencies "Stack"
While the bottom three layers of the stack identify technologies for compliance and
interoperability, the service publication and service discovery can be implemented with a range
of solutions.
Since a web service cannot be discovered if it has not been published, service discovery
depends upon service publication. The variety of discovery mechanisms parallels the set of
publication mechanisms. Any mechanism that allows the service requestor to gain access to the
service description and make it available to the application at runtime qualifies as service
discovery. The simplest example is static discovery, in which the service
requestor retrieves a WSDL document from a local file. This is usually the WSDL document
obtained through a direct publish or the results of a previous find operation. Alternatively, the
service may be discovered at design time or run time using a local WSDL registry, or a public or
private registry such as UDDI. The variety of service discovery mechanisms is discussed in more
detail in the section titled Service Discovery.
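Static discovery can be sketched in Python as reading a WSDL document and pulling out what the requestor needs to bind. The WSDL text below is an abbreviated, hypothetical WSDL 1.1 document (messages, types, and the binding element are omitted), and a string stands in for the local file.

```python
import xml.etree.ElementTree as ET

# WSDL 1.1 namespaces.
NS = {
    "wsdl": "http://schemas.xmlsoap.org/wsdl/",
    "soap": "http://schemas.xmlsoap.org/wsdl/soap/",
}

# Abbreviated, hypothetical WSDL document standing in for a local file.
LOCAL_WSDL = """\
<definitions name="StockQuote"
    xmlns="http://schemas.xmlsoap.org/wsdl/"
    xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/">
  <portType name="StockQuotePortType">
    <operation name="GetQuote"/>
  </portType>
  <service name="StockQuoteService">
    <port name="StockQuotePort">
      <soap:address location="http://example.com/stockquote"/>
    </port>
  </service>
</definitions>
"""


def describe(wsdl_text: str):
    """Return the operation names and the endpoint address found in a WSDL document."""
    root = ET.fromstring(wsdl_text)
    operations = [op.get("name") for op in root.findall("wsdl:portType/wsdl:operation", NS)]
    address = root.find("wsdl:service/wsdl:port/soap:address", NS)
    return operations, address.get("location")


operations, endpoint = describe(LOCAL_WSDL)
print(operations, endpoint)
```

With a registry such as UDDI, only the way the WSDL text is obtained would change; the extraction step stays the same.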
3.1 OVERVIEW
Web services are a concept rather than a packaged solution to a well‐defined set of problems.
Web services are based on XML and form a design center around XML, which in itself comprises
a number of standards. It would go beyond the scope of a paper like this to list all of the
relevant technologies and standards components that might be included in an overall Web
services architecture. (Much more information is available on the Internet, and the author
strongly recommends keeping up‐to‐date with the rapidly evolving technologies related to Web
services.) Over the past two years, a number of companies have been involved in the
development of the Web services architecture, most notably IBM and Microsoft, who have
been playing a vital role in bringing the first three components of the Web services model to
the point of standardization. These three key Web service technologies are:
• A standardized way to describe Web services (Web Services Description Language, or WSDL).
• A standardized way to publish and discover Web services (Universal Description, Discovery,
and Integration, or UDDI).
• A standardized way to invoke Web services (Simple Object Access Protocol, or SOAP).
Service‐oriented architecture
SOA = Service‐Oriented Architecture
SOA is a software architecture where all software‐implemented tasks and processes are
designed as services to be consumed over a network.
Keywords:
1) architecture
2) service
SOA approach:
The focus of design is the service interface. A service:
1) has a well‐defined interface
2) can be potentially invoked over a network
3) can be reused in multiple business contexts
An application:
1) is integrated at the interface and not implementation level
2) is built to work with any implementation of a contract, resulting in a loosely coupled and
more flexible system
SOA Components
1) roles
a) service provider
b) service requestor
c) service registry
2) operations
a) publish
b) bind
c) find
SOA Roles: Service Provider
What does a service provider do?
1) creates a service description
2) deploys the service in a runtime environment to make it accessible to other entities
over the network
3) publishes the service description to one or more service registries
4) receives messages invoking the service from service requestors
SOA Roles: Service Requestor
What does a service requestor do?
1) finds a service description published in a service registry
2) applies the service description to bind and invoke the web service hosted by a service
provider
A service requestor can be any consumer of a web service.
Any entity that hosts a network‐available web service is a service provider.
SOA Roles: Service Registry
What does a service registry do?
1) accepts requests from service providers to publish and advertise web service descriptions
2) allows service requestors to search the collection of service descriptions contained within the
service registry
The role of service registry is to enable match‐making between service providers and service
requestors.
Once the match has been found, the interactions are carried out directly between the service
requestor and the service provider.
SOA Operations: Publish
The publish operation is an act of service registration or service advertisement. When a service
provider publishes its web service in a service registry, it is advertising the service to the whole
community of potential service requestors.
The details of the publish operation depend on how the service registry is implemented.
SOA Operations: Find
The find operation is an act of looking for a service satisfying certain conditions:
1) service requestor states search criteria, such as the type of the service, its quality, etc.
2) service registry matches the search criteria against the published web service descriptions
The result is a list of service descriptions that match the search criteria.
Details of the operation depend on the implementation of the service registry.
SOA Operations: Bind
The bind operation creates the client‐server relationship between service requestor and service
provider.
The operation can be:
1) dynamic ‐ creating a client‐side proxy on‐the‐fly based on the service description to invoke
the web service
2) static ‐ the developer hard‐codes the way the client invokes the web service
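The publish, find, and bind operations can be pictured with a small in-memory Python sketch. All class and function names here are hypothetical, and a Python callable stands in for the provider's network address; a real registry such as UDDI is far richer.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ServiceDescription:
    name: str
    category: str
    endpoint: Callable[[str], str]  # stands in for a network address


class ServiceRegistry:
    """Hypothetical in-memory registry supporting publish and find."""

    def __init__(self) -> None:
        self._entries: Dict[str, ServiceDescription] = {}

    def publish(self, description: ServiceDescription) -> None:
        # Provider side: advertise the service description.
        self._entries[description.name] = description

    def find(self, category: str) -> List[ServiceDescription]:
        # Requestor side: match search criteria against published descriptions.
        return [d for d in self._entries.values() if d.category == category]


def bind(description: ServiceDescription) -> Callable[[str], str]:
    """Dynamic bind: build a client-side proxy from the description at run time."""
    def proxy(request: str) -> str:
        return description.endpoint(request)
    return proxy


registry = ServiceRegistry()
registry.publish(ServiceDescription("shout", "utility", lambda text: text.upper()))
matches = registry.find("utility")
service = bind(matches[0])
print(service("hello"))
```

After `bind`, the registry is no longer involved: the proxy talks to the provider directly. A static bind would skip `find` and hard-code the endpoint in the client.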
SOA Properties 1
SOA is a form of distributed systems architecture.
It is characterized by:
1) logical view ‐ a service is an abstraction of what actual programs, databases, business
processes, etc. are able to do.
2) Message exchange – a service is defined in terms of the messages exchanged between
provider and requestor agents and not in terms of the properties of the agents themselves
3) abstraction – SOA hides the implementation details of the underlying languages, process and
database structures, etc.
4) meta‐data – a service is described by machine‐processable meta‐data
5) small number of operations – a service tends to rely on a small number of operations with
relatively large and complex messages
6) network orientation ‐ services are oriented to their use over a network
7) platform‐neutral ‐ messages are sent in a standardized format delivered through the
interfaces. XML is typically used.
SOA Benefits 1
SOA enables the agents participating in the message exchange to be loosely coupled, which in
turn allows for more flexibility:
1) a client is only coupled to a service, not to a server ‐ the integration of the server takes place
outside the scope of the client application
2) functional components and their interfaces are separated ‐ new interfaces can be easily
added
3) old and new functionality can be encapsulated as software components that provide and
receive services
4) the control of business processes can be isolated:
a) business‐rule engine can control the workflow of a business process
b) depending on the state, the engine invokes different services
5) services can be incorporated dynamically during runtime
6) service bindings are specified using configuration files and can be easily adapted to satisfy
new needs
Service Description in SOA
The key to SOA is service description:
1) it is published by the service provider in the service registry
2) it is returned to the service requestor as a result of the search operation
3) it specifies to the service requestor:
a) how to bind and invoke the web service
b) what information is returned as a result of the invocation
This section is a short introduction to a service‐oriented architecture, its key concepts, and
requirements. You will find a more thorough description of service‐oriented architectures in
Chapter 10, “Web services architectures” on page 181. (Be aware that, because the presented
architecture makes no statements about the infrastructure or protocols it uses, the
implementation of the service‐oriented architecture is not limited to Web technologies.)
A service‐oriented architecture consists of three basic components:
Service provider
Service broker
Service requestor
However, each component can also act as one of the two other components. For instance, if a
service provider needs some more information that it can only acquire from some other
service, it acts as a service requestor while still serving the original request. Figure 1‐1 shows the
operations each component can perform.
The service provider creates a Web service and possibly publishes its interface and access
information to the service broker. Each provider must decide which services to expose, how to
make trade‐offs between security and easy availability, and how to price the services (or, if they
are free, how to exploit them for other value). The provider also has to decide what category
the service should be listed in for a given broker service and what sort of trading partner
agreements are required to use the service.
The service broker (also known as service registry) is responsible for making the Web service
interface and implementation access information available to any potential service requestor.
The implementers of a broker have to decide about the scope of the broker. Public brokers are
available all over the Internet, while private brokers are only accessible to a limited audience,
for example, users of a company‐wide intranet. Furthermore, the width and breadth of the
offered information has to be decided. Some brokers will specialize in breadth of listings.
Others will offer high levels of trust in the listed services. Some will cover a broad landscape of
services, and others will focus within a given industry. Brokers will also arise that simply catalog
other brokers. Depending on the business model, a broker might attempt to maximize the look‐
up requests, number of listings, or accuracy of the listings.
The service requestor locates entries in the broker registry using various find operations and
then binds to the service provider in order to invoke one of its Web services.
One important issue for users of services is the degree to which services are statically chosen by
designers compared to those dynamically chosen at runtime. Even if most initial usage is largely
static, any dynamic choice opens up the issues of how to choose the best service provider and
how to assess quality of service. Another issue is how the user of services can assess the risk of
exposure to failures of service suppliers.
Characteristics
In this architecture, a client is not coupled to a server, but to a service. Thus, the integration of
the server to use takes place outside of the scope of the client application programs.
Old and new functional blocks are encapsulated into components that work as services.
Functional components and their interfaces are separated. Therefore, new interfaces can be
plugged in more easily.
Within complex applications, the control of business processes can be isolated. A business rule
engine can be incorporated to control the workflow of a defined business process. Depending
on the state of the workflow, the engine calls the respective services.
Services can be incorporated dynamically during runtime.
Bindings are specified using configuration files and can thus easily be adapted to new needs.
Requirements
For an efficient use of a service‐oriented architecture, a number of requirements have to be
fulfilled:
Interoperability between different systems and programming languages
The most important basis for a simple integration between applications on different platforms
is a communication protocol that is available for most systems and programming languages.
Clear and unambiguous description language
To use a service offered by a provider, it is not only necessary to be able to access the provider
system, but also the syntax of the service interface must be clearly defined in a
platform‐independent fashion.
Retrieval of the service
To allow a convenient integration at design time or even runtime of the system, we require a
mechanism that provides search facilities to retrieve suitable available services. Such services
should be classified into computer‐accessible, hierarchical categories, or taxonomies, based on
what the services in each category do and how they can be invoked.
Security
Protection of the services, including the information passed to and received from the service
against unauthorized and malicious access, must be supported by the platform to win the
confidence of the requestor (chain)—at the end the business customers. The type and extent of
security depends on the type and placement of the participants—service requestors and service
providers—and the services themselves. Service usage monitoring and security incident action
plans have to be in place to detect unauthorized access (attempts) and trigger counter
measures. Security is required to empower and retain authenticated and authorized
requestors/customers while fencing off everything and everyone else.
Web services
Web services are a relatively new technology that implements a service‐oriented architecture.
During the development of this technology, a major focus was put on making functional
building blocks accessible over standard Internet protocols that are independent from
platforms and programming languages.
If we had to describe Web services using just one sentence, we would say:
Web services are self‐contained, modular applications that can be described, published,
located, and invoked over a network.
Web services perform encapsulated business functions, ranging from simple request‐reply to
full business process interactions. These services can be new applications or just wrapped
around existing legacy systems to make them network‐enabled. Services can rely on other
services to achieve their goals.
Figure 1‐2 shows the relationship between the core elements of Web services in a service‐
oriented architecture (SOA).
The following core technologies are used for Web services. These technologies are covered in
detail in the subsequent chapters.
XML (Extensible Markup Language) is the markup language that underlies most of the
specifications used for Web services. XML is a generic language that can be used to describe
any kind of content in a structured way, separated from its presentation to a specific device.
SOAP (Simple Object Access Protocol) is a protocol that allows a client to call a remote service.
It is neutral with respect to network, transport, programming language, and platform. The
message format is XML.
WSDL (Web Services Description Language) is an XML‐based interface and
implementation description language. The service provider uses a WSDL document to
specify the operations a Web service provides and the parameters and data types of these
operations. A WSDL document also contains the service access information.
WSIL (Web Services Inspection Language, also WS‐Inspection) is an XML‐based specification
about how to locate Web services without the necessity of using UDDI. However, WSIL can be
also used together with UDDI, that is, it is orthogonal to UDDI and does not replace it.
Most business partners today do not find one another from UDDI registries; rather they are
based on existing relationships. That is where the Web Services Inspection Language fits in.
WSIL decentralizes the centralized model of service publication within a UDDI registry and
distributes the pieces such that each service provider itself can advertise its Web Services
offerings. WSIL thus facilitates the behavior that most businesses desiring to use Web Services
are most comfortable with today. Yet WSIL is less widely used today, as Web service
registries take its place.
UDDI (Universal Description, Discovery, and Integration) is both a client‐side API and a SOAP‐
based server implementation that can be used to store and retrieve information on service
providers and Web services.
Properties of Web services
All Web services share the following properties:
Web services are self‐contained. On the client side, no additional software is required. A
programming language with XML and HTTP client support is enough to get you started. On the
server side, merely an HTTP server and a SOAP server are required. It is possible to enable an
existing application for Web services without writing a single line of code.
Web services are self‐describing.
The definition of the message format travels with the message; no external metadata
repositories or code generation tools are required.
Web services can be published, located, and invoked across the Web. This technology uses
established lightweight Internet standards such as HTTP. It leverages the existing infrastructure.
Some additional standards that are required to do so include SOAP, WSDL, and UDDI.
Web services are modular.
Simple Web services can be aggregated to more complex ones, either using workflow
techniques or by calling lower‐layer Web services from a Web service implementation. Web
services can be chained together to perform higher‐level business functions. This shortens
development time and enables best‐of‐breed implementations.
Web services are language‐independent and interoperable. The client and server can be
implemented in different environments. Existing code does not have to be changed in order to
be Web service enabled. Basically, any language can be used to implement Web service clients
and servers.
Web services are inherently open and standard‐based.
XML and HTTP are the major technical foundation for Web services. A large part of the Web
service technology has been built using open‐source projects. Therefore, vendor independence
and interoperability are realistic goals.
Web services are loosely coupled.
Traditionally, application design has depended on tight interconnections at both ends. Web
services require a simpler level of coordination that allows a more flexible reconfiguration for
an integration of the services in question.
Web services are dynamic.
Dynamic e‐business can become reality using Web services, because with UDDI and WSDL, the
Web service description and discovery can be automated. In addition, Web services can be
implemented and deployed without disturbing clients that use them.
Web services provide programmatic access. The approach provides no graphical user interface;
it operates at the code level. Service consumers have to know the interfaces to Web services
but do not have to know the implementation details of services.
Web services provide the ability to wrap existing applications. Already existing stand‐alone
applications can easily be integrated into the service‐oriented architecture by implementing a
Web service as an interface.
Web services build on proven, mature technology. There are a lot of commonalities, as well as
a few fundamental differences, with other distributed computing frameworks. For example, the
transport protocol is text based and not binary.
Web Services Architecture‐‐Components, interactions and development cycle
Web Services Technologies
An architecture built around a client, a provider, and a registry.
Web services depend on several enabling technologies including XML, SOAP, UDDI, and WSDL.
In the following sections we examine UDDI and WSDL, key pieces in the Web services
framework.
The Web Services Architecture
As Figure 5.2 illustrates, there are three major aspects to Web services:
• A service provider provides an interface for software that can carry out a specified set of
tasks.
• A service requester discovers and invokes a software service to provide a business
solution. The requester will commonly invoke a remote procedure call on the service
provider, passing parameter data to the provider and receiving a result in reply.
• A repository or broker manages and publishes the service. Service providers publish
their services with the broker, and requesters access those services by creating bindings to
the service provider.
The Web services triad includes a broker, a service provider, and a service requester.
Key Technologies
Web services build on SOAP, UDDI, and WSDL.
Web services rely on several key underlying technologies, in particular, UDDI, WSDL, and
SOAP.
UDDI is a protocol for describing Web services components that allows businesses to register
with an Internet directory so they can advertise their services and companies can find each
other and carry out transactions over the Web.
WSDL is the proposed standard for describing a Web service. WSDL is built around an XML‐
based service Interface Definition Language that defines both the service interface and the
implementation details. WSDL details may be obtained from UDDI entries that describe the
SOAP messages needed to use a particular Web service.
SOAP is the protocol used to communicate with a UDDI service (see Figure 5.3), and more
generally with any Web service. SOAP allows applications to invoke object methods or functions
residing on remote servers. An advantage of SOAP is that it can use ubiquitous HTTP to make a
request and to receive a response. SOAP requests and responses use XML not only to identify
the remote method but also to package any data that the method requires.
Communication involving UDDI uses SOAP to package UDDI requests and replies to a Web
services repository.
WS Components
A web service includes three basic components:
1) a mechanism to find and register interest in a service
2) a definition of the service’s input and output parameters
3) a transport mechanism to access a service
Web services also include other technologies that can be used to provide additional features
such as security, transaction processing and others.
WS Process
1) A service provider publishes a service to an external repository
2) A client looks up a service in the repository
3) The repository returns information about the service:
Call format
Provider address
4) The client binds to the underlying service
5) The client calls and accesses the service
WS and Others
Web services do not introduce new functionality.
Similar functionality is provided by:
1) Sun RPC
2) DCOM
3) Enterprise JavaBeans
4) etc.
The difference is how this functionality is provided.
Web Service Application
Consider the same application, but built as a web service.
A client requests a file from the server. The server sends the file to the client.
When received, the client saves the file on the local machine.
The steps involved:
1) set up the SOAP server
2) develop the server
3) develop the client
4) start the web server
5) deploy the server as a web service
6) run the client application
Web Service Example 1
Running the application:
1) the server is running on the auxiliary PC ‐ start the web server
2) the client is running on the current PC ‐ run_client.bat
What happens if we enable a firewall on the server side?
Let’s try again to run the client application:
Web Service Example 2
Comparison: Communication
What is the observable difference between CORBA and WS applications?
With the firewall enabled, the CORBA application was unable to run. One advantage of SOAP is
that it explicitly defines an HTTP binding, which tunnels SOAP messages inside ordinary HTTP
messages.
This usually allows SOAP messages to pass through a firewall unimpeded: firewalls typically
allow HTTP traffic through port 80, while restricting other protocols or ports.
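To make the HTTP binding concrete, here is a minimal Python sketch that prepares (but does not send) a SOAP request as an ordinary HTTP POST. The endpoint URL, the `GetFile` operation, and the `urn:example:files` namespace are assumptions made for this illustration and are not taken from the example application.

```python
import urllib.request

# A SOAP 1.1 envelope carrying a hypothetical GetFile call.
envelope = (
    '<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">'
    '<soapenv:Body>'
    '<GetFile xmlns="urn:example:files"><name>report.txt</name></GetFile>'
    '</soapenv:Body>'
    '</soapenv:Envelope>'
).encode("utf-8")

# To a firewall this is just a POST to port 80 with an XML body.
request = urllib.request.Request(
    url="http://example.com:80/services/files",  # hypothetical endpoint
    data=envelope,
    headers={
        "Content-Type": "text/xml; charset=utf-8",
        "SOAPAction": '"urn:example:files#GetFile"',
    },
)

print(request.get_method(), request.full_url)
# Sending would be urllib.request.urlopen(request); it is skipped here
# because the endpoint is fictional.
```

Nothing in the request distinguishes it from other web traffic at the transport level, which is why a firewall that admits HTTP on port 80 typically lets it through.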
Comparison: Functionality
The same functionality in CORBA and WS.
The difference is how WS provides this functionality:
1) data is formatted for transfer using XML
2) data is passed using standard communication protocols
3) the exposed service is well defined in an XML vocabulary
4) services are found in standard ways with XML vocabularies
WS provides a more flexible design than CORBA.
Comparison: Standards
The main difference with past Distributed Computing Environments is adopted standards and
implementations:
1) a standard lookup service – UDDI
2) a standard definition mechanism – WSDL
3) a standard way for two parties to communicate – SOAP
The foundation technology for all three (and more) is XML.
Web Service Implementation
The standards used by web services are defined with little concern for the underlying
implementation mechanism.
Therefore: a web service written in C and running on Microsoft IIS can access a web
service written in Java, running on BEA WebLogic Server.
Web Service Environments
Several environments exist to build, deploy and access web services.
Best known:
1) Microsoft’s .Net Platform
2) Sun’s Java 2 Platform
We rely on Java Web Services.
Traditional Communication
Traditional system communication:
1) systems must be tightly bound
2) data must be transferred in such a way that two systems agree beforehand on the format
3) various “network normal forms” were created to decide how bytes, integers, etc. were to be
encoded for transfer
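The need for an agreed binary encoding can be seen in a short Python sketch using the standard `struct` module. The value 1025 is arbitrary; the point is that the same integer has different byte layouts depending on the convention.

```python
import struct

value = 1025  # 0x00000401

# "!" selects network byte order (big-endian), the agreed wire form.
wire = struct.pack("!i", value)
# "<" selects little-endian, the native layout on x86 processors.
little = struct.pack("<i", value)

print(wire.hex())    # 00000401
print(little.hex())  # 01040000

# A receiver decodes correctly only if it assumes the same convention.
(decoded,) = struct.unpack("!i", wire)
(misread,) = struct.unpack("!i", little)
print(decoded, misread)
```

Both sides must share this low-level agreement before any data can be exchanged, which is the tight binding the list above refers to.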
XML‐Based Communication
Before ‐ no common data‐definition mechanism.
With XML:
1) common, well‐defined data and representation
2) well‐defined set of validity and well‐formedness rules
Web service communication relies on XML syntax to write messages.
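A well-formedness check can be sketched in Python with the standard library parser. The helper name `is_well_formed` is made up for this example. Note that this covers well-formedness only; validity against a DTD or schema needs a validating parser, which `xml.etree.ElementTree` is not.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the text parses as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


print(is_well_formed("<note><to>Tove</to></note>"))  # True
print(is_well_formed("<note><to>Tove</note></to>"))  # False: tags are not properly nested
```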
WS ‐ Business Perspective
Web services and business processes/goals:
1) a web service is an implementation of a business process or a step within such a process
2) a web service is made available over a network to internal and/or external business partners
to achieve specific business goals
Web services promote integration of applications within an organization and between
different business partners.
Key feature: to allow for rapid construction of business applications by combining web services
built internally with those of business partners.
Web Service Usage
Two main scenarios of web service usage:
1) application integration
2) B2B partner integration over the Internet
WS Usage: B2B Integration
Business‐to‐business (B2B) partner integration over the Internet.
B2B integrates business systems of two or more companies to support
cross‐enterprise business processes, e.g. supply chain management.
WS Usage: Application Integration
Legacy systems can be wrapped as web services and made available for integration with other
systems.
Applications exposed as web services are accessible by other applications running on different
hardware platforms and written in different languages.
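Wrapping can be sketched in a few lines. The example below is only an illustration in Python: legacy_price_lookup stands in for existing business logic, and the element names priceRequest, partNumber, and priceResponse are hypothetical. The wrapper accepts an XML request and returns an XML response, so callers never see the legacy code.

```python
import xml.etree.ElementTree as ET

def legacy_price_lookup(part_number: str) -> float:
    """Stand-in for existing (legacy) business logic; hypothetical data."""
    prices = {"A100": 12.5, "B200": 7.0}
    return prices[part_number]

def handle_request(request_xml: str) -> str:
    """Service wrapper: XML request in, XML response out."""
    request = ET.fromstring(request_xml)
    part = request.findtext("partNumber")
    response = ET.Element("priceResponse")
    ET.SubElement(response, "partNumber").text = part
    ET.SubElement(response, "price").text = str(legacy_price_lookup(part))
    return ET.tostring(response, encoding="unicode")

reply = handle_request(
    "<priceRequest><partNumber>A100</partNumber></priceRequest>"
)
print(reply)
```

A real deployment would put this wrapper behind an HTTP endpoint and a SOAP envelope; the point here is only that the legacy function itself is left unchanged.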
Web Service Benefits
1) platform integration ‐ the platform‐neutrality of WS allows combining
business systems using different devices (PDAs, cell phones, desktops)
with service providers of all sizes and shapes
2) software integration ‐ systems supporting new or modified business processes can be rapidly
delivered by wrapping the existing functionality
3) standard technology ‐ open standards enable developers to choose among different
products, avoiding vendor‐dependence
4) small businesses integration ‐ the low cost of WS allows small businesses to deploy and
participate in WS applications
5) easy integration ‐ interface‐based development using web service descriptions reduces the
time to integrate applications
SOA and WS
Service‐Oriented Architecture:
1) Provides an approach for building systems focused on a loosely coupled set of components
(services) that can be dynamically composed
2) promotes seamless software integration as a business benefit
Web Services:
1) one approach to building SOA
2) provide a standard for a particular set of XML‐based technologies that can be used to build
SOA systems
WS‐Based Approach to SOA
Using SOA and WS
SOA and WS are most appropriate for the applications that:
1) can operate over the Internet, accepting that reliability and performance of communication
cannot be guaranteed in this case
2) do not require that all service requestors and providers are upgraded at the same time
3) consist of the components running remotely on different execution platforms and vendor
products
4) were designed using legacy technology but need to be exposed for use over a network, using
a web service wrapping
4+1 view model – logical view – process view – implementation view – deployment view
The 4 + 1 View Model describes software architecture using five concurrent views, each of
which addresses a specific set of concerns: The logical view describes the design's object model,
the process view describes the design's concurrency and synchronization aspects; the physical
view describes the mapping of the software onto the hardware and shows the system's
distributed aspects, and the development view describes the software's static organization in
the development environment. Software designers can organize the description of their
architectural decisions around these four views and then illustrate them with a few selected
use cases, or scenarios, which constitute a fifth view. The architecture is partially evolved from
these scenarios. The 4+1 View Model allows various stakeholders to find what they need in the
software architecture. System engineers can approach it first from the physical view, then the
process view; end users, customers, and data specialists can approach it from the logical view;
and project managers and software‐configuration staff members can approach it from the
development view.
• The logical view, which is the object model of the design (when an object‐oriented
design method is used),
• The process view, which captures the concurrency and synchronization aspects of the
design,
• The physical view, which describes the mapping(s) of the software onto the hardware
and reflects its distributed aspect,
• The development view, which describes the static organization of the software in its
development environment.
• Use‐case view: Describes functionality of the system, its external interfaces, and its
principal users. This view is mandatory when using the 4+1 Views, because all elements
of the architecture should be derived from requirements.
• Logical view: Describes how the system is structured in terms of units of
implementation. The elements are packages, classes, and interfaces. The relationship
between elements shows dependencies, interface realizations, part‐whole relationships,
and so forth. Note: This view is mandatory when using the 4+1 Views of Software
Architecture.
• Implementation view: Describes how development artifacts are organized in the file
system. The elements are files and directories (any configuration items). This includes
development artifacts and deployment artifacts. This view is optional when using the
4+1 Views.
• Process view: Describes how the run‐time system is structured as a set of elements that
have run‐time behavior and interactions. Run‐time structure often bears little
resemblance to the code structure. It consists of rapidly changing networks of
communication objects. The elements are components that have run‐time presence
(processes, threads, Enterprise JavaBeans™ (EJB™), servlets, DLLs, and so on), data
stores, and complex connectors, such as queues. Interaction between elements varies,
based on technology. This view is useful for thinking about run‐time system quality
attributes, such as performance and reliability. This view is optional when using the 4+1
Views.
• Deployment view: Describes how the system is mapped to the hardware. This view is
optional when using the 4+1 Views.
In addition, you may wish to represent the following:
• Data view: A specialization of the logical view. Use this view if persistence is a significant
aspect of the system, and the translation from the design model to the data model is
not done automatically by the persistence mechanism.
Underlying Architectural Framework
This architecture follows the “4+1” framework [Kruchten 1995] that defines a set
of “Views” of the architecture.
Correspondence Between the Views
The various views are not fully orthogonal or independent. Elements of one view are connected
to elements in other views, following certain design rules and heuristics.
From the logical to the process view
We identify several important characteristics of the classes of the logical architecture:
• Autonomy: are the objects active, passive, protected?
‐ an active object takes the initiative of invoking other objects’ operations or its own
operations, and has full control over the invocation of its own operations by other objects
‐ a passive object never spontaneously invokes any operations and has no control over the
invocation of its own operations by other objects
‐ a protected object never spontaneously invokes any operations but performs some arbitration
on the invocation of its operations
• Persistence: are the objects transient or permanent? Do they survive the failure of a process or
processor?
• Subordination: does the existence or persistence of an object depend on another object?
• Distribution: are the state or the operations of an object accessible from many nodes in the
physical architecture, or from several processes in the process architecture?
In the logical view of the architecture we consider each object as active, and potentially
“concurrent,” i.e., behaving “in parallel” with other objects, and we pay no more attention to
the exact degree of concurrency we need to achieve this effect. Hence the logical architecture
takes into account only the functional aspect of the requirements.
However when we come to defining the process architecture, implementing each object with
its own thread of control (e.g., its own Unix process or Ada task) is not quite practical in the
current state of technology, because of the huge overhead this imposes. Moreover, if objects
are concurrent, there must be some form of arbitration for invoking their operations.
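The difference between an active object and a protected object can be sketched as follows. This is only an illustration in Python threads, not something taken from the 4+1 paper; the class names ProtectedCounter and ActiveWorker are hypothetical. The active object owns a thread of control, while the protected object owns none and merely arbitrates between its callers with a lock.

```python
import threading

class ProtectedCounter:
    """Protected object: never acts on its own, but arbitrates between callers."""

    def __init__(self):
        self._lock = threading.Lock()
        self._value = 0

    def increment(self):
        with self._lock:            # arbitration: one caller at a time
            self._value += 1

    @property
    def value(self):
        with self._lock:
            return self._value

class ActiveWorker(threading.Thread):
    """Active object: has its own thread and takes the initiative."""

    def __init__(self, counter, times):
        super().__init__()
        self._counter = counter
        self._times = times

    def run(self):
        for _ in range(self._times):
            self._counter.increment()

counter = ProtectedCounter()
workers = [ActiveWorker(counter, 1000) for _ in range(4)]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()
print(counter.value)   # 4000
```

Only four objects here have their own thread; giving every object a thread, as the logical view implicitly assumes, is what the text describes as impractical.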
From logical to development
A class is usually implemented as a module, for example a type in the visible part of an Ada
package. Large classes are decomposed into multiple packages. Collections of closely related
classes—class categories—are grouped into subsystems. Additional constraints must be
considered for the definition of subsystems, such as team organization, expected magnitude of
code (typically 5K to 20K SLOC per subsystem), degree of expected reuse and commonality, and
strict layering principles (visibility issues), release policy and configuration management.
Therefore we usually end up with a view that does not have a one to one correspondence with
the logical view.
The logical and development views are very close, but address very different concerns. We
have found that the larger the project, the greater the distance between these views. Similarly
for the process and physical views: the larger the project, the greater the distance between the
views. For example, if we compare fig. 3b and fig. 6, there is no one to one mapping of the class
categories to the layers. If we take the ‘External interfaces—Gateway’ category, its
implementation is spread across several layers: communications protocols are in subsystems in
or below layer 1, general gateway mechanisms are in subsystems in layer 2, and the actual
specific gateways in layer 5 subsystems.
From process to physical
Processes and process groups are mapped onto the available physical hardware, in various
configurations for testing or deployment. Birman describes some very elaborate schemes for
this mapping in the Isis project [5]. The scenarios relate mostly to the logical view, in terms of
which classes are used, and to the process view when the interactions between objects involve
more than one thread of control.
A scenario‐driven approach
The most critical functionality of the system is captured in the form of scenarios (or use cases).
By critical we mean: functions that are the most important, the raison d’être of the system, or
that have the highest frequency of use, or that present some significant technical risk that must
be mitigated.
Start:
• A small number of the scenarios are chosen for an iteration based on risk and criticality.
Scenarios may be synthesized to abstract a number of user requirements.
• A strawman architecture is put in place. The scenarios are then “scripted” in order to identify
major abstractions (classes, mechanisms, processes, subsystems) as indicated by Rubin and
Goldberg [6], decomposed into sequences of pairs (object, operation).
• The architectural elements discovered are laid out on the 4 blueprints: logical, process,
development, and physical.
• This architecture is then implemented, tested, measured, and this analysis may detect some
flaws or potential enhancement.
• Lessons learned are captured.
Loop:
The next iteration can then start by:
• reassessing the risks,
• extending the palette of scenarios to consider
• selecting a few additional scenarios that will allow risk mitigation or greater architecture
coverage
Then:
• Try to script those scenarios in the preliminary architecture
• discover additional architectural elements, or sometimes significant architectural changes
that need to occur to accommodate these scenarios
• update the 4 main blueprints: logical, process, development, physical
• revise the existing scenarios based on the changes
• upgrade the implementation (the architectural prototype) to support the new extended set of
scenarios.
• Test. Measure under load, in real target environment if possible.
• All five blueprints are then reviewed to detect potential for simplification, reuse,
commonality.
• Design guidelines and rationale are updated.
• Capture the lessons learned.
Unit III
Web Services Building Block
Transport protocols for web services
Messaging with web services
Protocols
SOAP
Describing web services
WSDL
Anatomy of WSDL
Manipulating WSDL
Web service policy
Discovering web services
UDDI
Anatomy of UDDI
Web service inspection
Ad Hoc Discovery
Securing web services
Building a Web Service
Web Service Development Tasks
The following tasks are interdependent but not entirely sequential. For example, you can add a
method reference while creating a web service or afterward. Web service development is an
iterative process.
The tasks, in general order of development workflow, are:
Creating a SOAP RPC web service
Developing XML operations (if needed)
Adding method references to a web service
Resolving references to runtime objects
Adding environment entries
Generating runtime classes
Assembling and deploying a web service as a J2EE application
Creating a test client
Testing a web service
Working with UDDI registries
Web services
Web services are a relatively new technology that implements a service‐oriented architecture.
During the development of this technology, a major focus was put on making functional
building blocks accessible over standard Internet protocols that are independent from
platforms and programming languages.
If we had to describe Web services using just one sentence, we would say:
Web services are self‐contained, modular applications that can be described, published,
located, and invoked over a network.
Web services perform encapsulated business functions, ranging from simple request‐reply to
full business process interactions. These services can be new applications or just wrapped
around existing legacy systems to make them network‐enabled. Services can rely on other
services to achieve their goals.
Figure 1‐2 shows the relationship between the core elements of Web services in a
service‐oriented architecture (SOA).
The following core technologies (standards) are used for Web services.
XML (Extensible Markup Language) is the markup language that underlies most of the
specifications used for Web services. XML is a generic language that can be used to describe
any kind of content in a structured way, separated from its presentation to a specific device.
SOAP (Simple Object Access Protocol) is a protocol, neutral with respect to network, transport,
programming language, and platform, that allows a client to call a remote service. The message
format is XML.
SOAP is a transport‐independent messaging protocol. Each SOAP message is an XML document.
SOAP uses one‐way messages, although it’s possible to combine messages into request‐and‐
response sequences.
The SOAP specification defines the format of the XML message but not its content and how it’s
actually sent. SOAP does, however, specify how SOAP messages are routed over HTTP.
Each SOAP document has a root <Envelope> element. The root element, the first element in an
XML document, contains all the other elements in the document.
Within the “envelope” are two parts: a header and a body. The header contains routing or
context data. It can be empty. The body contains the actual message. It too can be empty.
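The envelope structure can be sketched in code. The example assumes Python's standard xml.etree.ElementTree module and the SOAP 1.1 envelope namespace; the payload element GetQuote and its namespace urn:example:quotes are hypothetical.

```python
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace (assumed version for this sketch)
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
ET.register_namespace("soap", SOAP_NS)

def build_envelope(payload: ET.Element) -> ET.Element:
    """Wrap a payload element in Envelope > (Header, Body)."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")    # routing/context data; left empty
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    body.append(payload)                               # the actual message
    return envelope

# Hypothetical application payload
payload = ET.Element("{urn:example:quotes}GetQuote")
ET.SubElement(payload, "{urn:example:quotes}symbol").text = "IBM"

envelope = build_envelope(payload)
print(ET.tostring(envelope, encoding="unicode"))
```

The header is created even though it is empty, to mirror the two‐part structure described above; SOAP also permits omitting it.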
WSDL (Web Services Description Language) is an XML‐based interface and implementation
description language. The service provider uses a WSDL document in order to specify the
operations a Web service provides and the parameters and data types of these operations. A
WSDL document also contains the service access information.
A web service is useless unless others can find out what it does and how to call it. Developers
must know enough about a web service so they can write a client program that calls
it. WSDL is an XML‐based language used to define web services and describe how to access
them. Specifically, it describes the data and message contracts a web service offers. By
examining a web service’s WSDL document, developers know what methods are available and
how to call them using the proper parameters.
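As a rough illustration of "examining a WSDL document", the sketch below lists the operations declared in a tiny WSDL 1.1 fragment. It assumes Python's standard XML parser; the service QuoteService and all of its names are hypothetical, and a real WSDL document would also contain types, binding, and service sections.

```python
import xml.etree.ElementTree as ET

WSDL_NS = {"wsdl": "http://schemas.xmlsoap.org/wsdl/"}   # WSDL 1.1 namespace

# Hypothetical, heavily trimmed WSDL document
SAMPLE_WSDL = """
<definitions name="QuoteService"
    targetNamespace="urn:example:quotes"
    xmlns="http://schemas.xmlsoap.org/wsdl/"
    xmlns:tns="urn:example:quotes">
  <message name="GetQuoteRequest"/>
  <message name="GetQuoteResponse"/>
  <portType name="QuotePortType">
    <operation name="GetQuote">
      <input message="tns:GetQuoteRequest"/>
      <output message="tns:GetQuoteResponse"/>
    </operation>
  </portType>
</definitions>
"""

def list_operations(wsdl_text: str) -> dict:
    """Map each operation name to its (input message, output message)."""
    root = ET.fromstring(wsdl_text.strip())
    operations = {}
    for port_type in root.findall("wsdl:portType", WSDL_NS):
        for op in port_type.findall("wsdl:operation", WSDL_NS):
            operations[op.get("name")] = (
                op.find("wsdl:input", WSDL_NS).get("message"),
                op.find("wsdl:output", WSDL_NS).get("message"),
            )
    return operations

print(list_operations(SAMPLE_WSDL))
```

Tools that generate client stubs from WSDL do essentially this walk, then continue into the message and type definitions to work out parameter types.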
WSIL (Web Services Inspection Language, also WS‐Inspection) is an XML‐based specification
about how to locate Web services without the necessity of using UDDI. However, WSIL can be
also used together with UDDI, that is, it is orthogonal to UDDI and does not replace it. Most
business partners today do not find one another from UDDI registries; rather they are based on
existing relationships. That is where the Web Services Inspection Language fits in. WSIL
decentralizes the centralized model of service publication within a UDDI registry and distributes
the pieces such that each service provider itself can advertise its Web Services offerings. WSIL
thus facilitates the behavior that most businesses desiring to use Web Services (today) are most
comfortable with (today). Yet WSIL is less widely used today, as Web Service Registries take
its place.
UDDI (Universal Description, Discovery, and Integration) is both a client‐side API and a SOAP‐
based server implementation that can be used to store and retrieve information on service
providers and Web services.
UDDI is an evolving standard for describing, publishing, and discovering the web services that a
business provides. It’s a specification for a distributed registry of information on web services.
Once a web service is developed and a WSDL document describing it is created, there needs to
be a way to get the WSDL information into the hands of the users who want to use the web
service it describes. When a web service is published in a UDDI registry, potential users have a
way to look up and learn about the web service’s existence.
The content of a UDDI registry is similar to a telephone directory. In the “white pages” section
of the registry is information such as the name, address, and telephone number of the business
that offers one or more web services. The “yellow pages” section identifies the business type
and categorizes it by industry. The “green pages” section provides the data about the web
services the business offers.
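The telephone‐directory analogy can be made concrete with a toy in‐memory model. This is not the UDDI API or its data model; the classes BusinessEntry and ToyRegistry, and all sample data, are hypothetical and only mirror the white, yellow, and green pages idea.

```python
from dataclasses import dataclass, field

@dataclass
class BusinessEntry:
    # "White pages": who the business is
    name: str
    address: str
    phone: str
    # "Yellow pages": how the business is categorized
    industry: str
    # "Green pages": technical data, here service name -> WSDL location
    services: dict = field(default_factory=dict)

class ToyRegistry:
    """A toy stand-in for a registry; real UDDI is accessed through SOAP calls."""

    def __init__(self):
        self._entries = []

    def publish(self, entry: BusinessEntry) -> None:
        self._entries.append(entry)

    def find_by_industry(self, industry: str) -> list:
        return [e for e in self._entries if e.industry == industry]

registry = ToyRegistry()
registry.publish(BusinessEntry(
    name="Example Freight Ltd",
    address="1 Dock Road",
    phone="555-0100",
    industry="logistics",
    services={"TrackShipment": "http://example.com/track?wsdl"},
))

for entry in registry.find_by_industry("logistics"):
    print(entry.name, entry.services)
```

A client would use the yellow‐pages lookup to find a provider, then follow the green‐pages entry to the WSDL document that describes how to call the service.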
Properties of Web services
All Web services share the following properties:
Web services are self‐contained.
On the client side, no additional software is required. A programming language with
XML and HTTP client support is enough to get you started. On the server side, merely an
HTTP server and a SOAP server are required. It is possible to enable an existing
application for Web services without writing a single line of code.
Web services are self‐describing.
The definition of the message format travels with the message; no external metadata
repositories or code generation tools are required.
Web services can be published, located, and invoked across the Web.
This technology uses established lightweight Internet standards such as HTTP. It
leverages the existing infrastructure. Some additional standards that are required to do
so include SOAP, WSDL, and UDDI.
Web services are modular.
Simple Web services can be aggregated to more complex ones, either using workflow
techniques or by calling lower‐layer Web services from a Web service implementation.
Web services can be chained together to perform higher‐level business functions. This
shortens development time and enables best‐of‐breed implementations.
Web services are language‐independent and interoperable.
The client and server can be implemented in different environments. Existing code does
not have to be changed in order to be Web service enabled. Basically, any language can
be used to implement Web service clients and servers.
Web services are inherently open and standard‐based.
XML and HTTP are the major technical foundation for Web services. A large part of the
Web service technology has been built using open‐source projects. Therefore, vendor
independence and interoperability are realistic goals.
Web services are loosely coupled.
Traditionally, application design has depended on tight interconnections at both ends.
Web services require a simpler level of coordination that allows a more flexible
reconfiguration for an integration of the services in question.
Web services are dynamic.
Dynamic e‐business can become reality using Web services, because with UDDI and
WSDL, the Web service description and discovery can be automated. In addition, Web
services can be implemented and deployed without disturbing clients that use them.
Web services provide programmatic access.
The approach provides no graphical user interface; it operates at the code level. Service
consumers have to know the interfaces to Web services but do not have to know the
implementation details of services.
Web services provide the ability to wrap existing applications.
Already existing stand‐alone applications can easily be integrated into the service‐
oriented architecture by implementing a Web service as an interface.
Web services build on proven, mature technology.
There are a lot of commonalities, as well as a few fundamental differences, with other
distributed computing frameworks. For example, the transport protocol is text based and not
binary.
Transport protocols for web services
Web services stack
The transport layer is the foundation of the Web services stack. Web services must be accessible
over a network so that a service client can invoke them. Web services that are publicly
available run over the Internet. Only the authorized users within an internal organization can
view intranet‐available Web services, while unauthorized users in the outside world cannot. It is
possible to create extranet‐based Web services to allow legitimate users access to them on
more than one intranet.
The Internet protocols that can be supported by this stack are HTTP and, to a lesser extent,
SMTP (for electronic mail) and FTP (for file transfer). Intranet domains use middleware call
infrastructures, such as IBM's MQSeries, and CORBA (the Common Object Request Broker
Architecture). The latter relies on a protocol called the Internet Inter‐ORB Protocol (IIOP) for
remote objects.
HTTP has a special status in the W3C Web Services Architecture for a number of both practical
and theoretical reasons. For one, early versions of SOAP (and XML‐RPC, which is still widely
used in existing web services) were explicitly tied to HTTP. SOAP 1.1 implied a certain amount of
protocol‐independence, and SOAP 1.2 makes this explicit, but HTTP is still the dominant means
of communicating between web services agents. Also, HTTP is
Other Protocols
Many believe that HTTP is the "native" protocol of the Web because it was designed to work
with the URIs that identify Web resources. While HTTP has become almost ubiquitous, and
many of the issues surrounding its earlier incarnations have been resolved in subsequent
versions of the standard and by "industrial strength" implementations, it is not the only
protocol upon which Web services can be built. For example:
• TCP
• UDP
• BEEP
• JMS Service Provider's protocol
• proprietary messaging systems
SOAP Transport Protocol
Client requests and web service responses are transmitted as Simple Object Access Protocol
(SOAP) messages over HTTP to enable a completely interoperable exchange between clients
and web services, all running on different platforms and at various locations on the Internet.
HTTP is a familiar request‐and‐response standard for sending messages over the Internet, and
SOAP is an XML‐based protocol that follows the HTTP request‐and‐response model.
The SOAP portion of a transported message handles the following:
• Defines an XML‐based envelope to describe what is in the message and how to process
the message
• Includes XML‐based encoding rules to express instances of application‐defined data
types within the message
• Defines an XML‐based convention for representing the request to the remote service
and the resulting response
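How a SOAP message rides on HTTP can be sketched with Python's standard urllib. The example assumes the SOAP 1.1 HTTP binding (a POST with a text/xml body and a SOAPAction header); the endpoint URL, the action URI, and the GetQuote payload are hypothetical. The request is only constructed, not sent.

```python
import urllib.request

# Hypothetical SOAP 1.1 request message
envelope = (
    '<?xml version="1.0"?>'
    '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">'
    '<soap:Body>'
    '<q:GetQuote xmlns:q="urn:example:quotes"><q:symbol>IBM</q:symbol></q:GetQuote>'
    '</soap:Body>'
    '</soap:Envelope>'
)

request = urllib.request.Request(
    url="http://example.com/quote",                     # hypothetical endpoint
    data=envelope.encode("utf-8"),                      # the SOAP message is the HTTP body
    headers={
        "Content-Type": "text/xml; charset=utf-8",      # SOAP 1.1 media type
        "SOAPAction": '"urn:example:quotes#GetQuote"',  # hypothetical action URI
    },
    method="POST",
)

# urllib stores header names in capitalized form ("Content-type", "Soapaction")
print(request.get_method())
print(request.get_header("Content-type"))
```

Passing the request to urllib.request.urlopen would perform the actual exchange; the HTTP response body would then carry the SOAP response envelope.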
Web Service Messaging
A client sends a request message to a web service, and receives a response message.
Messaging Protocols
The most common transport used for Web services interoperability is HTTP, and the most
common encodings used by Web services are XML‐based SOAP 1.1, SOAP 1.2, and Message
Transmission Optimization Mechanism (MTOM).
Web services are defined based on XML messaging, and the following three parameters
describe a given web service's messaging interaction.
1. Message exchange pattern
2. Synchronous and asynchronous client API
3. One‐way/two‐way behavior of the transport
Message Exchange Patterns
The sequence of messages involved in the Web service operation invocation is referred to as
the message exchange pattern.
When a client and a web service communicate they exchange messages. A request message
is sent from the client to the web service. The web service responds with a response
message. This is just like in ordinary HTTP, where a web browser sends an HTTP request to a
web server, and the web server replies with an HTTP response.
In the beginning the only web service message format available was SOAP. Later came REST‐type
web services, which use plain XML and HTTP. Following the REST movement came a wave of
people using JSON (JavaScript Object Notation) as the message format. Another very simple
remoting protocol is XML‐RPC (XML Remote Procedure Call). Of these, SOAP is the most
common; the other message formats are not covered in detail here.
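To give a feel for two of the lighter formats, the sketch below encodes the same call, add(2, 3), as an XML‐RPC message and as a JSON message. It uses Python's standard xmlrpc.client and json modules; the method name add and the shape of the JSON object are made up for illustration and do not follow any particular JSON standard.

```python
import json
import xmlrpc.client

# XML-RPC: the standard library can marshal a call into its XML form
xmlrpc_request = xmlrpc.client.dumps((2, 3), methodname="add")
print(xmlrpc_request)

# JSON: an ad hoc message shape (hypothetical)
json_request = json.dumps({"method": "add", "params": [2, 3]})
print(json_request)

# The receiving side decodes the XML-RPC message back into parameters and a method name
params, method = xmlrpc.client.loads(xmlrpc_request)
print(method, params)
```

The XML‐RPC message names the method and types each parameter explicitly, whereas the JSON form is shorter but relies on both sides agreeing on its shape.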
In the most abstract form, web service messaging is based on sending and receiving messages.
A given message is sent by one party, and received by another.
SOAP
SOAP Message Model
The XML syntax of a SOAP message is fairly simple. A SOAP message consists of an envelope
containing:
• an optional header containing zero or more header entries (sometimes ambiguously
referred to as headers),
• a body containing zero or more body entries, and
• zero or more additional, non‐standard elements.
The only body entry defined by SOAP is a SOAP fault which is used for reporting errors.
Some of the XML elements of a SOAP message define namespaces, each in terms of a URI and a
local name, and encoding styles, a standard one of which is defined by SOAP.
SOAP message model looks like this:
SOAP (Simple Object Access Protocol) is an XML based message format. Here is a simple SOAP
message:
<?xml version="1.0"?>
<soap:Envelope
    xmlns:soap="http://www.w3.org/2001/12/soap-envelope"
    soap:encodingStyle="http://www.w3.org/2001/12/soap-encoding">
  <soap:Header>
  </soap:Header>
  <soap:Body>
    ... message data ...
    <soap:Fault>
    </soap:Fault>
  </soap:Body>
</soap:Envelope>
As you can see a SOAP message consists of:
• Envelope
  o Header
  o Body
    ‐ Message Data
    ‐ Fault (optional)
The same SOAP message structure is used to send both requests and responses between client
and web service.
The Fault element inside the Body element is optional. A Fault element is only sent back if an
error occurs inside the web service. Otherwise the normal message data is sent back.
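On the client side, this usually means checking the Body for a Fault before using the response. The sketch below assumes Python, the SOAP 1.1 namespace, and SOAP 1.1 fault children (faultcode and faultstring); the SoapFault exception class and both sample responses are hypothetical.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"   # SOAP 1.1 (assumed)

class SoapFault(Exception):
    """Raised when the response Body carries a Fault element."""

def extract_body(response_xml: str) -> list:
    """Return the Body's child elements, or raise SoapFault on an error reply."""
    envelope = ET.fromstring(response_xml.strip())
    body = envelope.find(f"{{{SOAP_NS}}}Body")
    if body is None:
        raise ValueError("response has no SOAP Body")
    fault = body.find(f"{{{SOAP_NS}}}Fault")
    if fault is not None:
        raise SoapFault(fault.findtext("faultcode"), fault.findtext("faultstring"))
    return list(body)

OK_RESPONSE = """
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <q:GetQuoteResponse xmlns:q="urn:example:quotes"><q:price>101.5</q:price></q:GetQuoteResponse>
  </soap:Body>
</soap:Envelope>
"""

FAULT_RESPONSE = """
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <soap:Fault>
      <faultcode>soap:Server</faultcode>
      <faultstring>Database unavailable</faultstring>
    </soap:Fault>
  </soap:Body>
</soap:Envelope>
"""

print(extract_body(OK_RESPONSE)[0].tag)
try:
    extract_body(FAULT_RESPONSE)
except SoapFault as err:
    print("fault:", err.args)
```

Turning the Fault into an exception lets calling code treat a remote error much like a local one, while the normal path simply receives the message data.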
SOAP doesn't specify how a message gets from the client to the web service, although the most
common scenario is via HTTP.
Advantages
• SOAP is versatile enough to allow for the use of different transport protocols. The
standard stacks use HTTP as a transport protocol, but other protocols are also usable.
Disadvantages
• Because of the verbose XML format, SOAP can be considerably slower than competing
middleware technologies such as CORBA.
• When relying on HTTP as a transport protocol and not using WS‐Addressing, the roles of
the interacting parties are fixed. Only one party (the client) can use the services of the
other.
• Most uses of HTTP as a transport protocol are done in ignorance of how the operation
would be modeled in HTTP.
• When relying on HTTP as a transport protocol, a firewall designed to only allow web
browsing is forced to perform more detailed (and thus more costly) analysis of the HTTP
packets.
• Although SOAP is an open standard, not all languages offer appropriate support. Java,
Curl, Delphi, PHP, .NET and Flex offer excellent SOAP integration and/or IDE support.
Some Perl and Python support exists.
Protocols
Layers of the Web Services Protocol Stack
A web service protocol stack is a protocol stack (a stack of computer networking protocols)
that is used to define, locate, implement, and make Web services interact with each other. A
web service protocol stack typically stacks four protocols:
• (Service) Transport Protocol: responsible for transporting messages between network
applications and includes protocols such as HTTP, SMTP, FTP, as well as the more recent
Blocks Extensible Exchange Protocol (BEEP).
• (XML) Messaging Protocol: responsible for encoding messages in a common XML
format so that they can be understood at either end of a network connection. Currently,
this area includes such protocols as XML‐RPC, WS‐Addressing, and SOAP.
• (Service) Description Protocol: used for describing the public interface to a specific web
service. The WSDL interface format is typically used for this purpose.
• (Service) Discovery Protocol: centralizes services into a common registry such that
network web services can publish their location and description, and makes it easy to
discover what services are available on the network. Universal Description Discovery
and Integration was intended for this purpose, but it has not been widely adopted.
Web services consist of sets of internet protocols and standards for exchanging data between
applications. The Web Services Protocol Stack describes the layering of the set of internet
protocols or rules used to design, discover, and implement web services.
The major components or layers of a Web Service Protocol Stack include:
• Transport Layer—transports messages between applications
• XML Messaging Layer—encodes messages in XML that can be understood by both client
and server
• WSDL Layer—describes the service provided
• UDDI Layer—centralizes services with a common registry
Transport Layer
• The Transport layer is the first component in the stack and is responsible for moving
XML messages between applications. The Transport protocol most commonly used is
the standard HTTP protocol. Other commonly used Web protocols are SMTP and FTP.
XML Messaging
• The messaging layer in the protocol stack is based on an XML model. XML is widely used
in Web Services applications and is the foundation for all web services. XML is just one
of the standards enabling web services to map between technology domains.
• XML Messaging is a broadly‐defined umbrella under which a number of more specific
protocols are defined. SOAP is one of the more popular of these, and is one of the most
significant standards for communication between web services over the network. XML
messaging lets two disparate systems exchange requests and responses as XML
documents. SOAP allows the sender and the receiver of XML documents to support a
common data transfer protocol for effective networked communication.
WSDL Layer
• This layer represents a way of specifying a public interface for a web service. It contains
information on available functions, on data types for XML messaging, binding
information about the transport protocol, and the location of the specific web service.
• Any client application that wants to know about a service, what data it expects to
receive, whether or not it delivers any results, and the supported transport, uses WSDL
to find that information. When you create a Web Service, it must be described and
advertised to its potential customers before it can be used. WSDL provides a common
format for describing and publishing that web service information. Typically, WSDL is
used with SOAP, and the WSDL specification includes a SOAP binding.
UDDI Layer
• This layer represents a way to publish and find web services over the Web. You can
think of this layer as the White and Yellow Pages of your phonebook. The White pages
of web services provides general information about a specific company, for instance,
their business name, description, and address. The Yellow Pages includes the
classification of data for the services offered, for instance, industry type and products.
• The protocol you use to publish your web services is known as UDDI. The UDDI Business
Registry allows anyone to search existing UDDI data and enables you to register your
company and its services. With RAD Studio, your data automatically gets published to
the registry, or a distributed directory for business and web services.
SOAP
Table of Contents
What is SOAP?
This chapter explains what SOAP is and why it is useful.
SOAP Message Structure
This chapter describes the structure of a complete SOAP message.
SOAP Envelope
This chapter describes the Envelope element of a SOAP message.
SOAP Header
This chapter describes the Header element of a SOAP message.
SOAP Body
This chapter describes the Body element of a SOAP message.
SOAP Fault
This chapter describes the Fault element of a SOAP message.
SOAP Encoding
This chapter describes the built‐in set of rules for encoding various data types.
SOAP Transport
This chapter describes the transport protocols SOAP uses to exchange messages.
SOAP Examples
This chapter gives a simple SOAP example to illustrate the concepts.
SOAP Standards
This chapter gives links to the latest standards related to SOAP.
What is SOAP?
SOAP is an XML‐based protocol for exchanging information between computers.
SOAP is XML. That is, SOAP is an application of the XML specification.
The following statements are all true of SOAP:
• SOAP is an acronym for Simple Object Access Protocol.
• SOAP is a communication protocol.
• SOAP is designed to communicate via the Internet.
• SOAP can extend HTTP for XML messaging.
• SOAP provides data transport for Web services.
• SOAP can exchange complete documents or call a remote procedure.
• SOAP can be used for broadcasting a message.
• SOAP is platform and language independent.
• SOAP is the XML way of defining what information gets sent and how.
Although SOAP can be used in a variety of messaging systems and can be delivered via a variety
of transport protocols, the initial focus of SOAP is remote procedure calls transported via HTTP.
SOAP enables client applications to easily connect to remote services and invoke remote
methods.
Other frameworks, including CORBA, DCOM, and Java RMI, provide similar functionality to
SOAP, but SOAP messages are written entirely in XML and are therefore uniquely platform‐ and
language‐independent.
SOAP ‐ Recommended Knowledge
Before you proceed further, you should be familiar with XML and XML namespaces.
A SOAP message is an ordinary XML document containing the following elements:
Envelope: ( Mandatory )
Defines the start and the end of the message.
Header: ( Optional )
Contains any optional attributes of the message used in processing the message, either
at an intermediary point or at the ultimate end point.
Body: ( Mandatory )
Contains the XML data comprising the message being sent.
Fault: ( Optional )
An optional Fault element that provides information about errors that occurred while
processing the message
All these elements are declared in the default namespace for the SOAP envelope:
http://www.w3.org/2001/12/soap‐envelope
and the default namespace for SOAP encoding and data types is:
http://www.w3.org/2001/12/soap‐encoding
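NOTE: These specifications are subject to change, so check the W3C website for the latest
versions. The namespace URIs shown here come from a SOAP 1.2 working draft; the final SOAP 1.2
Recommendation uses http://www.w3.org/2003/05/soap-envelope for the envelope.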
A SOAP Message Structure
<?xml version="1.0"?>
<SOAP‐ENV:Envelope
xmlns:SOAP‐ENV="http://www.w3.org/2001/12/soap‐envelope"
SOAP‐ENV:encodingStyle="http://www.w3.org/2001/12/soap‐encoding">
<SOAP‐ENV:Header>
...
...
</SOAP‐ENV:Header>
<SOAP‐ENV:Body>
...
...
<SOAP‐ENV:Fault>
...
...
</SOAP‐ENV:Fault>
</SOAP‐ENV:Body>
</SOAP‐ENV:Envelope>
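The skeleton above can also be produced programmatically. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module; the helper name build_envelope and the example.org payload elements are hypothetical, and the envelope namespace is the working-draft URI used in these notes.

```python
import xml.etree.ElementTree as ET

# Envelope namespace exactly as used in these notes (a SOAP 1.2 working-draft URI).
SOAP_ENV = "http://www.w3.org/2001/12/soap-envelope"
ET.register_namespace("SOAP-ENV", SOAP_ENV)


def build_envelope(body_children, header_blocks=()):
    """Return an Envelope element holding an optional Header and a mandatory Body."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    if header_blocks:
        # The Header is optional, but when present it must be the first child.
        header = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Header")
        header.extend(header_blocks)
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    body.extend(body_children)
    return envelope


# Hypothetical payload and header block, used only for illustration.
ping = ET.Element("{http://www.example.org/demo}Ping")
trace = ET.Element("{http://www.example.org/demo}TraceId")
message = build_envelope([ping], header_blocks=[trace])
print(ET.tostring(message, encoding="unicode"))
```

Building the tree with a library rather than by string concatenation keeps the opening and closing tags matched automatically.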
SOAP Envelope
The SOAP envelope indicates the start and the end of the message so that the receiver knows
when an entire message has been received. The SOAP envelope solves the problem of knowing
when you're done receiving a message and are ready to process it. The SOAP envelope is
therefore basically a packaging mechanism.
The SOAP Envelope element can be summarized as follows:
• Every SOAP message has a root Envelope element.
• The Envelope element is a mandatory part of a SOAP message.
• Every Envelope element must contain exactly one Body element.
• If an Envelope contains a Header element, it must contain no more than one, and it
  must appear as the first child of the Envelope, before the Body.
• The envelope changes when SOAP versions change.
• The SOAP envelope is specified using the SOAP‐ENV namespace prefix and the Envelope
  element.
• The optional SOAP encoding is also specified using a namespace name and the optional
  encodingStyle attribute, which could also point to an encoding style other than the SOAP
  one.
• A v1.1‐compliant SOAP processor will generate a fault when receiving a message
  containing the v1.2 envelope namespace.
• A v1.2‐compliant SOAP processor generates a VersionMismatch fault if it receives a
  message that does not include the v1.2 envelope namespace.
An example for v1.2 is given below:
<?xml version="1.0"?>
<SOAP‐ENV:Envelope
xmlns:SOAP‐ENV="http://www.w3.org/2001/12/soap‐envelope"
SOAP‐ENV:encodingStyle="http://www.w3.org/2001/12/soap‐encoding">
...
Message information goes here
...
</SOAP‐ENV:Envelope>
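The version rule can be illustrated with a small check on the namespace of the root element. This is only a sketch, assuming Python's xml.etree.ElementTree; the function name check_envelope_version is hypothetical, and a real processor would return a complete Fault message rather than a bare code string.

```python
import xml.etree.ElementTree as ET

SOAP_11_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
SOAP_12_DRAFT_ENV = "http://www.w3.org/2001/12/soap-envelope"  # URI used in these notes


def check_envelope_version(xml_text, expected_ns):
    """Return None when the root is an Envelope in expected_ns, else a fault code."""
    root = ET.fromstring(xml_text)
    # ElementTree spells qualified names as "{namespace}local".
    if root.tag == f"{{{expected_ns}}}Envelope":
        return None
    if root.tag.endswith("}Envelope"):
        return "VersionMismatch"
    raise ValueError("root element is not a SOAP Envelope")


v11_message = f'<e:Envelope xmlns:e="{SOAP_11_ENV}"><e:Body/></e:Envelope>'
print(check_envelope_version(v11_message, SOAP_12_DRAFT_ENV))  # VersionMismatch
print(check_envelope_version(v11_message, SOAP_11_ENV))        # None
```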
The following example illustrates the use of a SOAP message within an HTTP POST operation, which
sends the message to the server. It shows the namespaces for the envelope schema definition
and for the schema definition of the encoding rules. The OrderEntry reference in the HTTP
request line is the name of the program to be invoked at the tutorialspoint.com Web site.
POST /OrderEntry HTTP/1.1
Host: www.tutorialspoint.com
Content‐Type: application/soap+xml; charset="utf‐8"
Content‐Length: nnnn
<?xml version="1.0"?>
<SOAP‐ENV:Envelope
xmlns:SOAP‐ENV="http://www.w3.org/2001/12/soap‐envelope"
SOAP‐ENV:encodingStyle="http://www.w3.org/2001/12/soap‐encoding">
...
Message information goes here
...
</SOAP‐ENV:Envelope>
NOTE: The HTTP binding specifies the location of the service.
SOAP Header
The optional Header element offers a flexible framework for specifying additional application‐
level requirements. For example, the Header element can be used to specify a digital signature
for password‐protected services; likewise, it can be used to specify an account number for pay‐
per‐use SOAP services.
The SOAP Header element can be summarized as follows:
• The Header element is an optional part of a SOAP message.
• A message has at most one Header element, but the Header can contain multiple header
  blocks.
• Headers are intended to add new features and functionality.
• The SOAP header contains header entries defined in a namespace.
• The header is encoded as the first immediate child element of the SOAP envelope.
• All immediate child elements of the SOAP header are interpreted as SOAP header blocks.
A SOAP header block can carry the following two attributes:
actor attribute:
The SOAP protocol defines a message path as a list of SOAP service nodes. Each of these
intermediate nodes can perform some processing and then forward the message to the
next node in the chain. By setting the actor attribute, the client can specify the recipient
of the SOAP header.
mustUnderstand attribute:
Indicates whether a header block is optional or mandatory. If it is set to true (i.e., 1), the
recipient must understand and process the header block according to its defined
semantics, or return a fault.
The following example shows how to use a Header in a SOAP message.
<?xml version="1.0"?>
<SOAP‐ENV:Envelope
xmlns:SOAP‐ENV="http://www.w3.org/2001/12/soap‐envelope"
SOAP‐ENV:encodingStyle="http://www.w3.org/2001/12/soap‐encoding">
<SOAP‐ENV:Header>
<t:Transaction
xmlns:t="http://www.tutorialspoint.com/transaction/"
SOAP‐ENV:mustUnderstand="true">5</t:Transaction>
</SOAP‐ENV:Header>
...
...
</SOAP‐ENV:Envelope>
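The mustUnderstand rule can be sketched as a check the receiver runs before processing the body. The code below is an illustration only, assuming Python's xml.etree.ElementTree; the function name not_understood and the idea of passing a set of understood tags are assumptions, not part of the SOAP specification.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://www.w3.org/2001/12/soap-envelope"  # URI used in these notes
MUST_UNDERSTAND = f"{{{SOAP_ENV}}}mustUnderstand"

MESSAGE = """<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://www.w3.org/2001/12/soap-envelope">
  <SOAP-ENV:Header>
    <t:Transaction xmlns:t="http://www.tutorialspoint.com/transaction/"
        SOAP-ENV:mustUnderstand="true">5</t:Transaction>
  </SOAP-ENV:Header>
  <SOAP-ENV:Body/>
</SOAP-ENV:Envelope>"""


def not_understood(xml_text, understood_tags):
    """List mandatory header blocks that this receiver does not know how to process."""
    root = ET.fromstring(xml_text)
    header = root.find(f"{{{SOAP_ENV}}}Header")
    if header is None:
        return []  # the Header is optional
    return [
        block.tag
        for block in header
        if block.get(MUST_UNDERSTAND) in ("1", "true")
        and block.tag not in understood_tags
    ]


TRANSACTION = "{http://www.tutorialspoint.com/transaction/}Transaction"
print(not_understood(MESSAGE, set()))          # receiver knows no header blocks
print(not_understood(MESSAGE, {TRANSACTION}))  # receiver understands Transaction
```

A non-empty result means the receiver should answer with a fault instead of processing the message.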
SOAP Body
The SOAP body is a mandatory element which contains the application‐defined XML data being
exchanged in the SOAP message. The body must be contained within the envelope and must
follow any headers that might be defined for the message. The body is defined as a child
element of the envelope, and the semantics for the body are defined in the associated SOAP
schema.
The body contains mandatory information intended for the ultimate receiver of the message.
For example:
<?xml version="1.0"?>
<SOAP‐ENV:Envelope
........
<SOAP‐ENV:Body>
<m:GetQuotation xmlns:m="http://www.tp.com/Quotation">
<m:Item>Computers</m:Item>
</m:GetQuotation>
</SOAP‐ENV:Body>
</SOAP‐ENV:Envelope>
The example above requests the quotation of computer sets. Note that the m:GetQuotation
and the Item elements above are application‐specific elements. They are not a part of the SOAP
standard.
Here is the response of above query:
<?xml version="1.0"?>
<SOAP‐ENV:Envelope
........
<SOAP‐ENV:Body>
<m:GetQuotationResponse xmlns:m="http://www.tp.com/Quotation">
<m:Quotation>This is Quotation</m:Quotation>
</m:GetQuotationResponse>
</SOAP‐ENV:Body>
</SOAP‐ENV:Envelope>
Normally, the application also defines a schema to contain semantics associated with the
request and response elements.
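The request and response pair above can be sketched as a tiny dispatcher that reads the body, calls a back-end function, and wraps the result. This is a minimal sketch assuming Python's xml.etree.ElementTree; get_quotation and handle are hypothetical names, and the text returned by get_quotation is made up for illustration.

```python
import xml.etree.ElementTree as ET

NS = {
    "env": "http://www.w3.org/2001/12/soap-envelope",  # URI used in these notes
    "m": "http://www.tp.com/Quotation",
}

REQUEST = """<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://www.w3.org/2001/12/soap-envelope">
  <SOAP-ENV:Body>
    <m:GetQuotation xmlns:m="http://www.tp.com/Quotation">
      <m:Item>Computers</m:Item>
    </m:GetQuotation>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>"""


def get_quotation(item):
    """Hypothetical back-end function standing in for the real implementation."""
    return f"Quotation for {item}"


def handle(request_xml):
    """Map the request body to a function call and wrap the result in a response."""
    root = ET.fromstring(request_xml)
    item = root.findtext("env:Body/m:GetQuotation/m:Item", namespaces=NS)
    envelope = ET.Element(f"{{{NS['env']}}}Envelope")
    body = ET.SubElement(envelope, f"{{{NS['env']}}}Body")
    response = ET.SubElement(body, f"{{{NS['m']}}}GetQuotationResponse")
    ET.SubElement(response, f"{{{NS['m']}}}Quotation").text = get_quotation(item)
    return ET.tostring(envelope, encoding="unicode")


print(handle(REQUEST))
```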
The Quotation service might be implemented using an EJB running in an application server; if
so, the SOAP processor would be responsible for mapping the body information as parameters
into and out of the EJB implementation of the GetQuotationResponse service. The SOAP
processor could also be mapping the body information to a .NET object, a CORBA object, a
COBOL program, and so on.
SOAP Fault
When an error occurs during processing, the response to a SOAP message is a SOAP Fault
element in the body of the message, and the fault is returned to the sender of the SOAP
message.
The SOAP fault mechanism returns specific information about the error, including a predefined
code, a description, and the address of the SOAP processor that generated the fault.
• A SOAP message can carry only one fault block.
• The Fault element is an optional part of a SOAP message.
• For the HTTP binding, a successful response is linked to the 200 to 299 range of status
  codes.
• A SOAP fault is linked to the 500 to 599 range of status codes.
The SOAP Fault element has the following sub‐elements:
Sub‐element      Description
<faultcode>      A text code used to indicate a class of errors. See the next table for a
                 listing of predefined fault codes.
<faultstring>    A text message explaining the error.
<faultactor>     A text string indicating who caused the fault. This is useful if the SOAP
                 message travels through several nodes in the SOAP message path, and the
                 client needs to know which node caused the error. A node that does not act
                 as the ultimate destination must include a faultactor element.
<detail>         An element used to carry application‐specific error messages. The detail
                 element can contain child elements, called detail entries.
SOAP Fault Codes
The faultcode values defined below must be used in the faultcode element when describing
faults:
Error                      Description
SOAP‐ENV:VersionMismatch   Found an invalid namespace for the SOAP Envelope element.
SOAP‐ENV:MustUnderstand    An immediate child element of the Header element, with the
                           mustUnderstand attribute set to "1", was not understood.
SOAP‐ENV:Client            The message was incorrectly formed or contained incorrect
                           information.
SOAP‐ENV:Server            There was a problem with the server, so the message could not
                           proceed.
SOAP Fault Example
The following code is a sample Fault. The client has requested a method named
ValidateCreditCard, but the service does not support such a method. This represents a client
request error, and the server returns the following SOAP response:
<?xml version='1.0' encoding='UTF‐8'?>
<SOAP‐ENV:Envelope
xmlns:SOAP‐ENV="http://schemas.xmlsoap.org/soap/envelope/"
xmlns:xsi="http://www.w3.org/1999/XMLSchema‐instance"
xmlns:xsd="http://www.w3.org/1999/XMLSchema">
<SOAP‐ENV:Body>
<SOAP‐ENV:Fault>
<faultcode xsi:type="xsd:string">SOAP‐ENV:Client</faultcode>
<faultstring xsi:type="xsd:string">
Failed to locate method (ValidateCreditCard) in class
(examplesCreditCard) at /usr/local/ActivePerl‐5.6/lib/
site_perl/5.6.0/SOAP/Lite.pm line 1555.
</faultstring>
</SOAP‐ENV:Fault>
</SOAP‐ENV:Body>
</SOAP‐ENV:Envelope>
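A client can detect and unpack a fault like the one above with a few lines of code. The sketch below assumes Python's xml.etree.ElementTree and a shortened copy of the sample response; parse_fault is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

NS = {"env": "http://schemas.xmlsoap.org/soap/envelope/"}  # SOAP 1.1 envelope

FAULT_RESPONSE = """<SOAP-ENV:Envelope
    xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
  <SOAP-ENV:Body>
    <SOAP-ENV:Fault>
      <faultcode>SOAP-ENV:Client</faultcode>
      <faultstring>Failed to locate method (ValidateCreditCard)</faultstring>
    </SOAP-ENV:Fault>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>"""


def parse_fault(xml_text):
    """Return the fault's sub-elements as a dict, or None if the body holds no Fault."""
    root = ET.fromstring(xml_text)
    fault = root.find("env:Body/env:Fault", NS)
    if fault is None:
        return None
    # faultcode, faultstring and faultactor carry no namespace prefix in SOAP 1.1.
    names = ("faultcode", "faultstring", "faultactor")
    return {name: fault.findtext(name) for name in names}


fault = parse_fault(FAULT_RESPONSE)
print(fault["faultcode"], "-", fault["faultstring"])
```

Sub-elements that are absent from the message, such as faultactor here, come back as None.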
SOAP Encoding
SOAP includes a built‐in set of rules for encoding data types. This enables the SOAP message to
indicate specific data types, such as integers, floats, doubles, or arrays.
• SOAP data types are divided into two broad categories: scalar types and compound
  types.
• Scalar types contain exactly one value, such as a last name, price, or product
  description.
• Compound types contain multiple values, such as a purchase order or a list of stock
  quotes.
• Compound types are further subdivided into arrays and structs.
• The encoding style for a SOAP message is set via the SOAP‐ENV:encodingStyle attribute.
• To use SOAP 1.1 encoding, use the value http://schemas.xmlsoap.org/soap/encoding/
• To use SOAP 1.2 encoding, use the value http://www.w3.org/2001/12/soap‐encoding
• The latest SOAP specification adopts all the built‐in types defined by XML Schema. Still, SOAP
  maintains its own convention for defining constructs not standardized by XML Schema,
  such as arrays and references.
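The difference between scalar and compound values can be shown with a small decoder driven by the xsi:type attribute. This is a simplified sketch assuming Python's xml.etree.ElementTree: it looks only at the local part of the type name, handles a handful of types, and the order element with its members is a made-up example.

```python
import xml.etree.ElementTree as ET

# xsi namespace as it appears in the examples in these notes (the 1999 schema draft).
XSI_TYPE = "{http://www.w3.org/1999/XMLSchema-instance}type"

CONVERTERS = {
    "string": str,
    "int": int,
    "float": float,
    "double": float,
    "boolean": lambda text: text in ("1", "true"),
}


def decode_scalar(element):
    """Convert an element's text according to its xsi:type (defaults to string)."""
    # Simplification: only the local part of the type name is inspected.
    type_name = element.get(XSI_TYPE, "xsd:string").split(":")[-1]
    return CONVERTERS[type_name](element.text or "")


def decode_struct(element):
    """Decode a compound value whose children are named scalar members."""
    return {child.tag: decode_scalar(child) for child in element}


# Hypothetical struct, used only for illustration.
ORDER = """<order xmlns:xsi="http://www.w3.org/1999/XMLSchema-instance"
                  xmlns:xsd="http://www.w3.org/1999/XMLSchema">
  <item xsi:type="xsd:string">Computers</item>
  <quantity xsi:type="xsd:int">3</quantity>
  <price xsi:type="xsd:double">499.5</price>
</order>"""

print(decode_struct(ET.fromstring(ORDER)))
```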
SOAP Transport
• SOAP is not tied to any one transport protocol.
• SOAP can be transported via SMTP, FTP, IBM's MQSeries, or Microsoft Message Queuing
  (MSMQ).
• The SOAP specification includes details on HTTP only.
• HTTP remains the most popular SOAP transport protocol.
SOAP via HTTP
Quite logically, SOAP requests are sent via an HTTP request and SOAP responses are returned
within the content of the HTTP response. While SOAP requests can be sent via an HTTP GET, the
specification includes details on HTTP POST only.
Additionally, both HTTP requests and responses are required to set their content type to
text/xml.
The SOAP specification mandates that the client must provide a SOAPAction header, but the
actual value of the SOAPAction header is dependent on the SOAP server implementation.
For example, to access the AltaVista BabelFish Translation service, hosted by XMethods, you
must specify the following as a SOAPAction header.
urn:xmethodsBabelFish#BabelFish
Even if the server does not require a full SOAPAction header, the client must specify an empty
string (""), or a null value. For example:
SOAPAction: ""
SOAPAction:
Here is a sample request sent via HTTP to the XMethods Babelfish Translation service:
POST /perl/soaplite.cgi HTTP/1.0
Host: services.xmethods.com
Content‐Type: text/xml; charset=utf‐8
Content‐Length: 538
SOAPAction: "urn:xmethodsBabelFish#BabelFish"
<?xml version='1.0' encoding='UTF‐8'?>
<SOAP‐ENV:Envelope
xmlns:SOAP‐ENV="http://schemas.xmlsoap.org/soap/envelope/"
xmlns:xsi="http://www.w3.org/1999/XMLSchema‐instance"
xmlns:xsd="http://www.w3.org/1999/XMLSchema">
<SOAP‐ENV:Body>
<ns1:BabelFish
xmlns:ns1="urn:xmethodsBabelFish"
SOAP‐ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
<translationmode xsi:type="xsd:string">en_fr</translationmode>
<sourcedata xsi:type="xsd:string">Hello, world!</sourcedata>
</ns1:BabelFish>
</SOAP‐ENV:Body>
</SOAP‐ENV:Envelope>
Note the content type and the SOAPAction header. Also note that the BabelFish method
requires two String parameters. The translation mode en_fr will translate from English to
French.
Here is the response from XMethods:
HTTP/1.1 200 OK
Date: Sat, 09 Jun 2001 15:01:55 GMT
Server: Apache/1.3.14 (Unix) tomcat/1.0 PHP/4.0.1pl2
SOAPServer: SOAP::Lite/Perl/0.50
Cache‐Control: s‐maxage=60, proxy‐revalidate
Content‐Length: 539
Content‐Type: text/xml
<?xml version="1.0" encoding="UTF‐8"?>
<SOAP‐ENV:Envelope
xmlns:SOAP‐ENC="http://schemas.xmlsoap.org/soap/encoding/"
SOAP‐ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
xmlns:xsi="http://www.w3.org/1999/XMLSchema‐instance"
xmlns:SOAP‐ENV="http://schemas.xmlsoap.org/soap/envelope/"
xmlns:xsd="http://www.w3.org/1999/XMLSchema">
<SOAP‐ENV:Body>
<namesp1:BabelFishResponse xmlns:namesp1="urn:xmethodsBabelFish">
<return xsi:type="xsd:string">Bonjour, monde!</return>
</namesp1:BabelFishResponse>
</SOAP‐ENV:Body>
</SOAP‐ENV:Envelope>
SOAP responses delivered via HTTP are required to follow the same HTTP status codes. For
example, a status code of 200 OK indicates a successful response. A status code of 500 Internal
Server Error indicates that there is a server error and that the SOAP response includes a Fault
element.
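The HTTP rules above (POST, a text/xml content type, a SOAPAction header, and status codes) can be sketched with Python's standard urllib.request module. The helper names make_soap_request and classify_status are hypothetical, and the request is only constructed here, not sent, so no live service is needed.

```python
import urllib.request


def make_soap_request(url, envelope_xml, soap_action=""):
    """Build (but do not send) an HTTP POST carrying a SOAP 1.1 envelope."""
    return urllib.request.Request(
        url,
        data=envelope_xml.encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "text/xml; charset=utf-8",
            # The value is a quoted string; an empty action is sent as "".
            "SOAPAction": f'"{soap_action}"',
        },
    )


def classify_status(status):
    """Map an HTTP status code to the SOAP outcome described in these notes."""
    if 200 <= status <= 299:
        return "success"
    if 500 <= status <= 599:
        return "fault"
    return "other"


request = make_soap_request(
    "http://services.xmethods.com/perl/soaplite.cgi",
    "<placeholder/>",  # stands in for a full SOAP envelope
    soap_action="urn:xmethodsBabelFish#BabelFish",
)
print(request.get_method(), request.full_url)
print(request.header_items())
```

Passing the request object to urllib.request.urlopen would transmit it; that step is left out because it depends on a reachable server.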
SOAP Examples
In the example below, a GetQuotation request is sent to a SOAP Server over HTTP. The request
has a QuotationName parameter, and a Quotation will be returned in the response.
The namespace for the function is "http://www.xyz.org/quotation".
Here is the SOAP request:
POST /Quotation HTTP/1.0
Host: www.xyz.org
Content‐Type: text/xml; charset=utf‐8
Content‐Length: nnn
<?xml version="1.0"?>
<SOAP‐ENV:Envelope
xmlns:SOAP‐ENV="http://www.w3.org/2001/12/soap‐envelope"
SOAP‐ENV:encodingStyle="http://www.w3.org/2001/12/soap‐encoding">
<SOAP‐ENV:Body xmlns:m="http://www.xyz.org/quotation">
<m:GetQuotation>
<m:QuotationName>Microsoft</m:QuotationName>
</m:GetQuotation>
</SOAP‐ENV:Body>
</SOAP‐ENV:Envelope>
A corresponding SOAP response looks like this:
HTTP/1.0 200 OK
Content‐Type: text/xml; charset=utf‐8
Content‐Length: nnn
<?xml version="1.0"?>
<SOAP‐ENV:Envelope
xmlns:SOAP‐ENV="http://www.w3.org/2001/12/soap‐envelope"
SOAP‐ENV:encodingStyle="http://www.w3.org/2001/12/soap‐encoding">
<SOAP‐ENV:Body xmlns:m="http://www.xyz.org/quotation">
<m:GetQuotationResponse>
<m:Quotation>Here is the quotation</m:Quotation>
</m:GetQuotationResponse>
</SOAP‐ENV:Body>
</SOAP‐ENV:Envelope>
SOAP Standards
SOAP 1.1 was originally submitted to the W3C in May 2000. Official submitters included large
companies, such as Microsoft, IBM, and Ariba, and smaller companies, such as UserLand
Software and DevelopMentor.
In July 2001, the XML Protocol Working Group released a "working draft" of SOAP 1.2. Within
the W3C, this document is officially a work in progress, meaning that the document is likely to
be updated many times before it is finalized.
SOAP Version 1.1 is available online at
http://www.w3.org/TR/SOAP/
The working draft of SOAP Version 1.2 is available at
http://www.w3.org/TR/soap12/
Note that the W3C also hosts a submission for "SOAP Messages with Attachments", which is
separate from the core SOAP specification. This specification enables SOAP messages to
include binary attachments, such as images and sound files. For full details, see the W3C Note
at http://www.w3.org/TR/SOAP‐attachments.
SOAP Implementations
Dozens of SOAP implementations now freely exist on the Internet. Here are four of the most
popular and widely cited implementations.
Apache SOAP (http://xml.apache.org/soap/)
Open source Java implementation of the SOAP protocol; based on the IBM SOAP4J
implementation
Microsoft SOAP ToolKit 2.0 (http://msdn.microsoft.com/soap/ )
COM implementation of the SOAP protocol for C#, C++, Visual Basic, or other COM‐
compliant languages
SOAP::Lite for Perl (http://www.soaplite.com/)
Perl implementation of the SOAP protocol, written by Paul Kulchenko, that includes
support for WSDL and UDDI
GLUE from the Web Methods (http://www.webmethods.com)
Java implementation of the SOAP protocol that includes support for WSDL and UDDI
Describing web services
1. Describing Web Services ‐ users need to find a service to perform a given task
2. Interoperation ‐ joining services together frequently requires shims /mediators
WSDL Basics
The Web Services Description Language (WSDL) specification was created to describe and
publish the formats and protocols of a Web service in a standard way. Web service interface
standards are needed to ensure that you don't have to create special interactions with each
server on the Web, as you would today, using the extended URL approach from a browser.
WSDL establishes a common format for describing and publishing Web service information
WSDL elements contain a description of the data, typically using one or more XML schemas, to
be passed to the Web service so that both the sender and the receiver understand the data
being exchanged. The WSDL elements also contain a description of the operations to be
performed on that data, so that the receiver of a message knows how to process it, and a
binding to a protocol or transport, so that the sender knows how to send it. Typically, WSDL is
used with SOAP, and the WSDL specification includes a SOAP binding.
WSDL elements describe data and operations on it
WSDL was developed by Microsoft, Ariba, and IBM, and v1.1 of the specification was submitted
to the W3C, which accepted WSDL as a note and published it on the W3C Web site. Twenty‐two
other companies joined the submission, comprising at that time the largest number of W3C
members ever to support a joint submission. WSDL therefore already enjoys broad‐based
support, and many companies offer implementations of WSDL in their Web services products.
In fact, with such near unanimity within the vendor community, it could be said that the WSDL
specification provides the de facto definition of a Web service description. However, it is very
likely that a W3C working group will nonetheless make significant improvements and changes.
WSDL was developed collaboratively by IBM, Microsoft, and Ariba
Both parties that participate in a Web Services “conversation” or interaction must have access
to the same WSDL to be able to understand each other. In other words, both the sender and
the receiver of a message involved in a Web service interaction must have access to the same
XML schema. The sender needs to know how to format the output message correctly, and the
receiver needs to understand how to interpret the input message correctly. As long as both
parties to the interaction have the same WSDL file, the implementations behind the Web
services can be anything. This is the magic of WSDL: It provides a common format to encode
and to decode messages to and from virtually any back‐end application, such as CORBA, COM,
EJB, JMS, MQ Series, ERP systems, and so on.
What is the transport protocol you use to call a Web service?
SOAP.
Transport Protocols: It is essential for the acceptance of Web Services that they are
based on established Internet infrastructure. This in effect imposes the use of the HTTP,
SMTP, and FTP protocols, which are based on the TCP/IP family of transports.
Messaging Protocol: The format of messages exchanged between Web Services clients and Web
Services should be vendor neutral and should not carry details about the technology used to
implement the service. Also, the message format should allow for extensions and different
bindings to specific transport protocols. SOAP and ebXML Transport are specifications that
fulfill these requirements. We expect the W3C XML Protocol Working Group to define a
successor standard.
WSDL
Web Services Description Language is the standard format for describing a
web service in XML format.
In this tutorial you will learn what WSDL is and why and how to use it.
WSDL is very easy to learn and very important for Web Services.
WSDL Introduction
WSDL Abstract:
• WSDL stands for Web Services Description Language.
• WSDL is an XML‐based language for describing services in decentralized and
  distributed environments.
• WSDL is the standard format for describing a web service.
• A WSDL definition describes how to access a web service and what operations it will
  perform.
• WSDL is a language for describing how to interface with XML‐based services.
• WSDL is an integral part of UDDI, an XML‐based worldwide business registry.
• WSDL is the language that UDDI uses.
• WSDL was developed jointly by Microsoft, IBM, and Ariba.
• WSDL is pronounced 'wiz‐dull' and spelled out as 'W‐S‐D‐L'.
WSDL Usage:
WSDL is often used in combination with SOAP and XML Schema to provide web services over
the Internet. A client program connecting to a web service can read the WSDL to determine
what functions are available on the server. Any special datatypes used are embedded in the
WSDL file in the form of XML Schema. The client can then use SOAP to actually call one of the
functions listed in the WSDL.
History of WSDL
WSDL 1.1 was submitted as a W3C Note by Ariba, IBM, and Microsoft in March 2001, for
describing services for the W3C XML Activity on XML Protocols.
WSDL 1.1 has not been endorsed by the World Wide Web Consortium (W3C). However, on May 11th,
2005 the W3C released a draft of version 2.0, which is intended to become a Recommendation (an
official standard) and thus be endorsed by the W3C.
The structure of a WSDL document.
Anatomy of WSDL
The following is the structure of the information in a WSDL file:
[Figure omitted: structure of the information in a WSDL file]
A WSDL file contains the following parts:
• Web service interface definition
This part contains the elements and the namespaces.
• Web service implementation
This part contains the definition of the service and ports.
WSDL Elements
WSDL breaks down Web services into three specific, identifiable elements that can be
combined or reused once defined.
The three major elements of WSDL, which can be defined separately, are:
Types
Operations
Binding
A WSDL document has various elements, but they are contained within these three main
elements, which can be developed as separate documents and then they can be combined or
reused to form complete WSDL files.
Following are the elements of WSDL document. Within these elements are further
subelements, or parts:
Definitions: the root element of every WSDL document. It defines the name of the web
service, declares the namespaces used throughout the remainder of the document, and
contains all the service elements described here.
Data types: the data types ‐ in the form of XML schemas or possibly some other
mechanism ‐ to be used in the messages
Message: an abstract definition of the data, in the form of a message presented either
as an entire document or as arguments to be mapped to a method invocation.
Operation: the abstract definition of the operation for a message, such as naming a
method, message queue, or business process, that will accept and process the message
Port type : an abstract set of operations mapped to one or more end points, defining
the collection of operations for a binding; the collection of operations, because it is
abstract, can be mapped to multiple transports through various bindings.
Binding: the concrete protocol and data formats for the operations and messages
defined for a particular port type.
Port: a combination of a binding and a network address, providing the target address of
the service communication.
Service: a collection of related end points encompassing the service definitions in the
file; the services map the binding to the port and include any extensibility definitions.
In addition to these major elements, the WSDL specification also defines the following utility
elements:
Documentation: this element provides human‐readable documentation and can be
included inside any other WSDL element.
Import: this element is used to import other WSDL documents or XML Schemas.
NOTE: WSDL parts usually are generated automatically using Web services‐aware tools.
A WSDL file describes a Web service with the following elements:
portType
The description of the operations and associated messages. The portType element defines
abstract operations.
<portType name="EightBall">
<operation name="getAnswer">
<input message="ebs:IngetAnswerRequest"/>
<output message="ebs:OutgetAnswerResponse"/>
</operation>
</portType>
message
The description of input and output parameters and return values.
<message name="IngetAnswerRequest">
<part name="meth1_inType" type="ebs:questionType"/>
</message>
<message name="OutgetAnswerResponse">
<part name="meth1_outType" type="ebs:answerType"/>
</message>
types
The schema for describing XML types used in the messages.
<types>
<xsd:schema targetNamespace="...">
<xsd:complexType name="questionType">
<xsd:element name="question" type="string"/>
</xsd:complexType>
<xsd:complexType name="answerType">
...
</types>
binding
The bindings describe the protocol that is used to access a portType, as well as the data formats
for the messages that are defined by a particular portType element.
<binding name="EightBallBinding" type="ebs:EightBall">
<soap:binding style="rpc" transport="http://schemas.xmlsoap.org/soap/http"/>
<operation name="ebs:getAnswer">
<soap:operation soapAction="urn:EightBall"/>
<input>
<soap:body namespace="urn:EightBall" ... />
...
Service
The services and ports define the location of the Web service.
The service contains the Web service name and a list of ports.
Ports
The ports contain the location of the Web service and the binding used for service access.
<service name="EightBall">
<port binding="ebs:EightBallBinding" name="EightBallPort">
<soap:address location="http://localhost:8080/axis/EightBall"/>
</port>
</service>
The WSDL Document Structure
The main structure of a WSDL document looks like this:
<definitions>
<types>
definition of types........
</types>
<message>
definition of a message....
</message>
<portType>
<operation>
definition of a operation.......
</operation>
</portType>
<binding>
definition of a binding....
</binding>
<service>
definition of a service....
</service>
</definitions>
A WSDL document can also contain other elements, like extension elements and a service
element that makes it possible to group together the definitions of several web services in one
single WSDL document.
Proceed further to analyze an example of WSDL Document.
Manipulating WSDL
WSDL Document Example
The following WSDL file demonstrates a simple WSDL document.
Assume the service provides a single publicly available function called sayHello. This function
expects a single string parameter and returns a single string greeting. For example, if you pass
the parameter world, the sayHello function returns the greeting "Hello, world!".
Content of HelloService.wsdl file
<definitions name="HelloService"
targetNamespace="http://www.examples.com/wsdl/HelloService.wsdl"
xmlns="http://schemas.xmlsoap.org/wsdl/"
xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
xmlns:tns="http://www.examples.com/wsdl/HelloService.wsdl"
xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<message name="SayHelloRequest">
<part name="firstName" type="xsd:string"/>
</message>
<message name="SayHelloResponse">
<part name="greeting" type="xsd:string"/>
</message>
<portType name="Hello_PortType">
<operation name="sayHello">
<input message="tns:SayHelloRequest"/>
<output message="tns:SayHelloResponse"/>
</operation>
</portType>
<binding name="Hello_Binding" type="tns:Hello_PortType">
<soap:binding style="rpc"
transport="http://schemas.xmlsoap.org/soap/http"/>
<operation name="sayHello">
<soap:operation soapAction="sayHello"/>
<input>
<soap:body
encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
namespace="urn:examples:helloservice"
use="encoded"/>
</input>
<output>
<soap:body
encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
namespace="urn:examples:helloservice"
use="encoded"/>
</output>
</operation>
</binding>
<service name="Hello_Service">
<documentation>WSDL File for HelloService</documentation>
<port binding="tns:Hello_Binding" name="Hello_Port">
<soap:address
location="http://www.examples.com/SayHello/"/>
</port>
</service>
</definitions>
Analysis of the Example
Definition : HelloService
Type: Uses built‐in data types, which are defined in XML Schema.
Message:
1. SayHelloRequest: firstName parameter
2. SayHelloResponse: greeting return value
Port Type: sayHello operation that consists of a request and a response message.
Binding: Direction to use the SOAP HTTP transport protocol.
Service: Service available at http://www.examples.com/SayHello/.
Port: Associates the binding with the URI http://www.examples.com/SayHello/ where
the running service can be accessed.
A detailed description of these elements is given in subsequent sections of the tutorial.
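The analysis above can be reproduced by reading the WSDL with a parser, much as a client tool would. The following is a minimal sketch assuming Python's xml.etree.ElementTree and a trimmed copy of HelloService.wsdl with the binding left out; the function name summarize is hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {
    "wsdl": "http://schemas.xmlsoap.org/wsdl/",
    "soap": "http://schemas.xmlsoap.org/wsdl/soap/",
}

# Trimmed copy of HelloService.wsdl (binding omitted to keep the sketch short).
HELLO_WSDL = """<definitions name="HelloService"
    targetNamespace="http://www.examples.com/wsdl/HelloService.wsdl"
    xmlns="http://schemas.xmlsoap.org/wsdl/"
    xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
    xmlns:tns="http://www.examples.com/wsdl/HelloService.wsdl"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <message name="SayHelloRequest"><part name="firstName" type="xsd:string"/></message>
  <message name="SayHelloResponse"><part name="greeting" type="xsd:string"/></message>
  <portType name="Hello_PortType">
    <operation name="sayHello">
      <input message="tns:SayHelloRequest"/>
      <output message="tns:SayHelloResponse"/>
    </operation>
  </portType>
  <service name="Hello_Service">
    <port binding="tns:Hello_Binding" name="Hello_Port">
      <soap:address location="http://www.examples.com/SayHello/"/>
    </port>
  </service>
</definitions>"""


def summarize(wsdl_text):
    """Collect message parts, operations, and endpoint addresses from a WSDL 1.1 file."""
    root = ET.fromstring(wsdl_text)
    messages = {
        msg.get("name"): [part.get("name") for part in msg.findall("wsdl:part", NS)]
        for msg in root.findall("wsdl:message", NS)
    }
    operations = {
        op.get("name"): (
            op.find("wsdl:input", NS).get("message"),
            op.find("wsdl:output", NS).get("message"),
        )
        for op in root.findall("wsdl:portType/wsdl:operation", NS)
    }
    endpoints = [
        address.get("location")
        for address in root.findall("wsdl:service/wsdl:port/soap:address", NS)
    ]
    return {
        "name": root.get("name"),
        "messages": messages,
        "operations": operations,
        "endpoints": endpoints,
    }


summary = summarize(HELLO_WSDL)
print(summary)
```

Because the definitions element declares a default namespace, unprefixed elements such as message and portType must be looked up with the WSDL namespace, which is why the search paths use the wsdl prefix.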
WSDL Definition Element
The <definitions> element must be the root element of all WSDL documents. It defines the
name of the web service.
Here is the piece of code from the previous section that uses the definitions element.
<definitions name="HelloService"
targetNamespace="http://www.examples.com/wsdl/HelloService.wsdl"
xmlns="http://schemas.xmlsoap.org/wsdl/"
xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
xmlns:tns="http://www.examples.com/wsdl/HelloService.wsdl"
xmlns:xsd="http://www.w3.org/2001/XMLSchema">
................................................
</definitions>
From the above example we can conclude the following points:
The definitions element is a container of all the other elements.
The definitions element specifies that this document is the HelloService.
The definitions element specifies a targetNamespace attribute. The targetNamespace is
a convention of XML Schema that enables the WSDL document to refer to itself. In this
example we have specified a targetNamespace of
http://www.examples.com/wsdl/HelloService.wsdl.
The definitions element specifies a default namespace:
xmlns="http://schemas.xmlsoap.org/wsdl/". All elements without a namespace prefix,
such as message or portType, are therefore assumed to be part of the default WSDL
namespace.
It also specifies numerous namespaces that will be used throughout the remainder of
the document.
NOTE: The namespace specification does not require that the document actually exist at the
given location. The important point is that you specify a value that is unique, different from all
other namespaces that are defined.
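To make the namespace rules concrete, here is a small sketch in Python, assuming the standard library xml.etree.ElementTree module. It shows that an unprefixed element such as message ends up in the default WSDL namespace, and how a prefixed name such as tns:SayHelloRequest expands against the targetNamespace. The helper resolve_qname is a hypothetical illustration, not a library function.

```python
import xml.etree.ElementTree as ET

doc = """<definitions name="HelloService"
    targetNamespace="http://www.examples.com/wsdl/HelloService.wsdl"
    xmlns="http://schemas.xmlsoap.org/wsdl/"
    xmlns:tns="http://www.examples.com/wsdl/HelloService.wsdl">
  <message name="SayHelloRequest"/>
</definitions>"""

root = ET.fromstring(doc)
message = root[0]

# ElementTree reports every tag as "{namespace-uri}localname".
print(root.tag)
print(message.tag)


def resolve_qname(qname, prefixes):
    """Expand a prefixed name such as "tns:SayHelloRequest" (hypothetical helper)."""
    prefix, _, local = qname.partition(":")
    return "{%s}%s" % (prefixes[prefix], local)


# In this document the tns prefix is bound to the targetNamespace.
prefixes = {"tns": root.get("targetNamespace")}
resolved = resolve_qname("tns:SayHelloRequest", prefixes)
print(resolved)
```

Nothing is fetched from either URI. They serve only as unique names, which is the point of the note above.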
WSDL Types Element
A Web service needs to define its inputs and outputs and how they are mapped into and out of
services. The WSDL <types> element takes care of defining the data types that are used by the web
service. Types are XML documents, or document parts.
The piece of code that follows these points is taken from the W3C specification. It depicts how a
types element can be used within a WSDL document.
The types element describes all the data types used between the client and server.
WSDL is not tied exclusively to a specific typing system
WSDL uses the W3C XML Schema specification as its default choice to define data types.
If the service uses only XML Schema built‐in simple types, such as strings and integers,
then types element is not required.
WSDL allows the types to be defined in separate elements so that the types are reusable
with multiple Web services.
<types>
<schema targetNamespace="http://example.com/stockquote.xsd"
xmlns="http://www.w3.org/2000/10/XMLSchema">
<element name="TradePriceRequest">
<complexType>
<all>
<element name="tickerSymbol" type="string"/>
</all>
</complexType>
</element>
<element name="TradePrice">
<complexType>
<all>
<element name="price" type="float"/>
</all>
</complexType>
</element>
</schema>
</types>
Data types address the problem of how to identify the data types and formats you intend to use
with your Web services. Type information is shared between sender and receiver. The
recipients of messages therefore need access to the information you used to encode your data
and must understand how to decode the data.
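A receiver can recover the shared type information by reading the schema inside the types element. The sketch below, in Python with the standard library xml.etree.ElementTree module, lists each declared element and its fields. It assumes the types element sits in the WSDL namespace and handles only the simple element / complexType / all layout of the example. The helper name declared_types is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace used by the schema in the example (the 2000/10 draft of XML Schema).
XSD = "{http://www.w3.org/2000/10/XMLSchema}"

types_xml = """<types xmlns="http://schemas.xmlsoap.org/wsdl/">
  <schema targetNamespace="http://example.com/stockquote.xsd"
      xmlns="http://www.w3.org/2000/10/XMLSchema">
    <element name="TradePriceRequest">
      <complexType><all>
        <element name="tickerSymbol" type="string"/>
      </all></complexType>
    </element>
    <element name="TradePrice">
      <complexType><all>
        <element name="price" type="float"/>
      </all></complexType>
    </element>
  </schema>
</types>"""


def declared_types(types_text):
    """Map each top-level element name to its {field: type} pairs (hypothetical helper)."""
    schema = ET.fromstring(types_text).find(XSD + "schema")
    result = {}
    for element in schema.findall(XSD + "element"):
        fields = element.findall(
            XSD + "complexType/" + XSD + "all/" + XSD + "element")
        result[element.get("name")] = {f.get("name"): f.get("type") for f in fields}
    return result


declared = declared_types(types_xml)
print(declared)
```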
WSDL Message Element
The <message> element describes the data being exchanged between the Web service
providers and consumers.
Each Web Service has two messages: input and output.
The input describes the parameters for the Web Service and the output describes the
return data from the Web Service.
Each message contains zero or more <part> parameters, one for each parameter of the
Web Service's function.
Each <part> parameter associates with a concrete type defined in the <types> container
element.
Let's take a piece of code from the example above:
<message name="SayHelloRequest">
<part name="firstName" type="xsd:string"/>
</message>
<message name="SayHelloResponse">
<part name="greeting" type="xsd:string"/>
</message>
Here, two message elements are defined. The first represents a request message
SayHelloRequest, and the second represents a response message SayHelloResponse.
Each of these messages contains a single part element. For the request, the part specifies the
function parameters; in this case, we specify a single firstName parameter. For the response,
the part specifies the function return values; in this case, we specify a single greeting return
value.
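The relationship between messages and parts can be sketched as a simple lookup table. The Python fragment below, assuming the standard library xml.etree.ElementTree module, maps every message name to its list of (part name, type) pairs. The helper name message_parts is hypothetical.

```python
import xml.etree.ElementTree as ET

WSDL_NS = {"wsdl": "http://schemas.xmlsoap.org/wsdl/"}

fragment = """<definitions xmlns="http://schemas.xmlsoap.org/wsdl/"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <message name="SayHelloRequest">
    <part name="firstName" type="xsd:string"/>
  </message>
  <message name="SayHelloResponse">
    <part name="greeting" type="xsd:string"/>
  </message>
</definitions>"""


def message_parts(wsdl_text):
    """Return {message name: [(part name, part type), ...]} (hypothetical helper)."""
    root = ET.fromstring(wsdl_text)
    parts = {}
    for message in root.findall("wsdl:message", WSDL_NS):
        parts[message.get("name")] = [
            (part.get("name"), part.get("type"))
            for part in message.findall("wsdl:part", WSDL_NS)
        ]
    return parts


parts = message_parts(fragment)
print(parts)
```

The request message yields the function parameter firstName and the response message yields the return value greeting, both typed as xsd:string.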
WSDL portType Element
The <portType> element combines multiple message elements to form a complete one-way or
round-trip operation.
For example, a <portType> can combine one request and one response message into a single
request/response operation. This is most commonly used in SOAP services. A portType can
define multiple operations.
Let's take a piece of code from the example above:
<portType name="Hello_PortType">
<operation name="sayHello">
<input message="tns:SayHelloRequest"/>
<output message="tns:SayHelloResponse"/>
</operation>
</portType>
The portType element defines a single operation, called sayHello.
The operation itself consists of a single input message SayHelloRequest
The operation itself consists of a single output message SayHelloResponse
Patterns of Operation
WSDL supports four basic patterns of operation:
One‐way :
The service receives a message. The operation therefore has a single input element. The
grammar for a one‐way operation is:
<wsdl:definitions .... >
<wsdl:portType .... > *
<wsdl:operation name="nmtoken">
<wsdl:input name="nmtoken"? message="qname"/>
</wsdl:operation>
</wsdl:portType >
</wsdl:definitions>
Request‐response:
The service receives a message and sends a response. The operation therefore has one input
element, followed by one output element. To encapsulate errors, an optional fault element can
also be specified. The grammar for a request‐response operation is:
<wsdl:definitions .... >
<wsdl:portType .... > *
<wsdl:operation name="nmtoken" parameterOrder="nmtokens">
<wsdl:input name="nmtoken"? message="qname"/>
<wsdl:output name="nmtoken"? message="qname"/>
<wsdl:fault name="nmtoken" message="qname"/>*
</wsdl:operation>
</wsdl:portType >
</wsdl:definitions>
Solicit‐response:
The service sends a message and receives a response. The operation therefore has one output
element, followed by one input element. To encapsulate errors, an optional fault element can
also be specified. The grammar for a solicit‐response operation is:
<wsdl:definitions .... >
<wsdl:portType .... > *
<wsdl:operation name="nmtoken" parameterOrder="nmtokens">
<wsdl:output name="nmtoken"? message="qname"/>
<wsdl:input name="nmtoken"? message="qname"/>
<wsdl:fault name="nmtoken" message="qname"/>*
</wsdl:operation>
</wsdl:portType >
</wsdl:definitions>
Notification :
The service sends a message. The operation therefore has a single output element. The
grammar for a notification operation is:
<wsdl:definitions .... >
<wsdl:portType .... > *
<wsdl:operation name="nmtoken">
<wsdl:output name="nmtoken"? message="qname"/>
</wsdl:operation>
</wsdl:portType >
</wsdl:definitions>
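Since the four patterns differ only in which of the input and output elements appear and in what order, a tool can classify an operation by inspecting its children. The following is a minimal sketch in Python, assuming the standard library xml.etree.ElementTree module. The function operation_pattern and the logHello operation are hypothetical and added only for illustration.

```python
import xml.etree.ElementTree as ET

WSDL = "{http://schemas.xmlsoap.org/wsdl/}"

# Order of input/output children -> name of the operation pattern.
PATTERNS = {
    ("input",): "one-way",
    ("input", "output"): "request-response",
    ("output", "input"): "solicit-response",
    ("output",): "notification",
}


def operation_pattern(operation):
    """Classify a wsdl:operation element by the order of its input/output children."""
    # Keep only input and output children, in document order; faults are ignored.
    order = [child.tag[len(WSDL):] for child in operation
             if child.tag in (WSDL + "input", WSDL + "output")]
    return PATTERNS.get(tuple(order), "unknown")


port_type = ET.fromstring("""<portType name="Hello_PortType"
    xmlns="http://schemas.xmlsoap.org/wsdl/"
    xmlns:tns="http://www.examples.com/wsdl/HelloService.wsdl">
  <operation name="sayHello">
    <input message="tns:SayHelloRequest"/>
    <output message="tns:SayHelloResponse"/>
  </operation>
  <operation name="logHello">
    <input message="tns:SayHelloRequest"/>
  </operation>
</portType>""")

results = {op.get("name"): operation_pattern(op) for op in port_type}
print(results)
```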
WSDL Binding Element
The <binding> element provides specific details on how a portType operation will
actually be transmitted over the wire.
The bindings can be made available via multiple transports, including HTTP GET, HTTP
POST, or SOAP.
The bindings provide concrete information on what protocol is being used to transfer
portType operations.
The bindings provide information where the service is located.
For SOAP protocol, the binding is <soap:binding>, and the transport is SOAP messages
on top of HTTP protocol.
You can specify multiple bindings for a single portType.
The binding element has two attributes ‐ the name attribute and the type attribute.
<binding name="Hello_Binding" type="tns:Hello_PortType">
The name attribute defines the name of the binding, and the type attribute points to the port
type for the binding, in this case the "tns:Hello_PortType" port type.
SOAP Binding
WSDL 1.1 includes built-in extensions for SOAP 1.1. This enables you to specify SOAP-specific
details, including SOAP headers, SOAP encoding styles, and the SOAPAction HTTP header. The
SOAP extension elements include:
soap:binding
This element indicates that the binding will be made available via SOAP. The style attribute
indicates the overall style of the SOAP message format. A style value of rpc specifies an RPC
format.
The transport attribute indicates the transport of the SOAP messages. The value
http://schemas.xmlsoap.org/soap/http indicates the SOAP HTTP transport, whereas
http://schemas.xmlsoap.org/soap/smtp indicates the SOAP SMTP transport.
soap:operation
This element indicates the binding of a specific operation to a specific SOAP implementation.
The soapAction attribute specifies that the SOAPAction HTTP header be used for identifying the
service.
soap:body
This element enables you to specify the details of the input and output messages. In the case of
HelloWorld, the body element specifies the SOAP encoding style and the namespace URN
associated with the specified service.
Here is the piece of code from Example section:
<binding name="Hello_Binding" type="tns:Hello_PortType">
<soap:binding style="rpc"
transport="http://schemas.xmlsoap.org/soap/http"/>
<operation name="sayHello">
<soap:operation soapAction="sayHello"/>
<input>
<soap:body
encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
namespace="urn:examples:helloservice"
use="encoded"/>
</input>
<output>
<soap:body
encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
namespace="urn:examples:helloservice"
use="encoded"/>
</output>
</operation>
</binding>
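To show what this binding implies on the wire, here is a rough sketch in Python, assuming the standard library xml.etree.ElementTree module, of a SOAP 1.1 request for sayHello. With style="rpc" the body contains a wrapper element named after the operation, placed in the namespace given by soap:body, and the soapAction value travels in the SOAPAction HTTP header. The function names are hypothetical, and the xsi:type annotations that encoded messages often carry are left out for brevity.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
SOAP_ENC = "http://schemas.xmlsoap.org/soap/encoding/"   # encodingStyle from soap:body
SERVICE_NS = "urn:examples:helloservice"                 # namespace from soap:body


def build_say_hello_request(first_name):
    """Build a SOAP 1.1 envelope for the sayHello operation (hypothetical helper)."""
    ET.register_namespace("SOAP-ENV", SOAP_ENV)
    envelope = ET.Element("{%s}Envelope" % SOAP_ENV)
    body = ET.SubElement(envelope, "{%s}Body" % SOAP_ENV)
    # RPC style: the wrapper element carries the operation name.
    call = ET.SubElement(body, "{%s}sayHello" % SERVICE_NS)
    call.set("{%s}encodingStyle" % SOAP_ENV, SOAP_ENC)
    # One child per message part; here the single firstName part.
    part = ET.SubElement(call, "firstName")
    part.text = first_name
    return envelope


def http_headers(soap_action):
    """HTTP headers for a SOAP 1.1 POST; the SOAPAction value is quoted."""
    return {
        "Content-Type": "text/xml; charset=utf-8",
        "SOAPAction": '"%s"' % soap_action,
    }


request = build_say_hello_request("World")
headers = http_headers("sayHello")
print(ET.tostring(request, encoding="unicode"))
print(headers)
```

The sketch only builds the message. Posting it to the service address is left to whichever HTTP client is in use.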
WSDL Ports Element
A <port> element defines an individual endpoint by specifying a single address for a binding.
Here is the grammar to specify a port:
<wsdl:definitions .... >
<wsdl:service .... > *
<wsdl:port name="nmtoken" binding="qname"> *
<!-- extensibility element (1) -->
</wsdl:port>
</wsdl:service>
</wsdl:definitions>
The port element has two attributes ‐ the name attribute and the binding attribute.
The name attribute provides a unique name among all ports defined within the
enclosing WSDL document.
The binding attribute refers to the binding using the linking rules defined by WSDL.
Binding extensibility elements (1) are used to specify the address information for the
port.
A port MUST NOT specify more than one address.
A port MUST NOT specify any binding information other than address information.
Here is the piece of code from the example above:
<service name="Hello_Service">
<documentation>WSDL File for HelloService</documentation>
<port binding="tns:Hello_Binding" name="Hello_Port">
<soap:address
location="http://www.examples.com/SayHello/"/>
</port>
</service>
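The port rules can be expressed as a small check. The sketch below, in Python with the standard library xml.etree.ElementTree module, collects the binding and address of every port in a service and rejects a port that does not carry exactly one soap:address. Treating a missing address as an error is an assumption of this sketch. The helper name port_endpoints is hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {
    "wsdl": "http://schemas.xmlsoap.org/wsdl/",
    "soap": "http://schemas.xmlsoap.org/wsdl/soap/",
}

service_xml = """<service name="Hello_Service"
    xmlns="http://schemas.xmlsoap.org/wsdl/"
    xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
    xmlns:tns="http://www.examples.com/wsdl/HelloService.wsdl">
  <documentation>WSDL File for HelloService</documentation>
  <port binding="tns:Hello_Binding" name="Hello_Port">
    <soap:address location="http://www.examples.com/SayHello/"/>
  </port>
</service>"""


def port_endpoints(service_text):
    """Return {port name: (binding QName, address)} (hypothetical helper)."""
    service = ET.fromstring(service_text)
    endpoints = {}
    for port in service.findall("wsdl:port", NS):
        addresses = port.findall("soap:address", NS)
        # A port must not specify more than one address; this sketch also
        # rejects a port with no address at all (an assumption).
        if len(addresses) != 1:
            raise ValueError(
                "port %r must specify exactly one address" % port.get("name"))
        endpoints[port.get("name")] = (
            port.get("binding"), addresses[0].get("location"))
    return endpoints


endpoints = port_endpoints(service_xml)
print(endpoints)
```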
WSDL Service Element
The <service> element defines the ports supported by the Web service. For each of the
supported protocols, there is one port element. The service element is a collection of
ports.
Web service clients can learn from the service element where to access the service,
through which port to access the Web service, and how the communication messages
are defined.
The service element includes a documentation element to provide human‐readable
documentation.
Here is a piece of code from the example above:
<service name="Hello_Service">
<documentation>WSDL File for HelloService</documentation>
<port binding="tns:Hello_Binding" name="Hello_Port">
<soap:address
location="http://www.examples.com/SayHello/"/>
</port>
</service>
The binding attribute of the port element associates the address of the service with a binding
element defined in the Web service. In this example, this is Hello_Binding:
<binding name="Hello_Binding" type="tns:Hello_PortType">
<soap:binding style="rpc"
transport="http://schemas.xmlsoap.org/soap/http"/>
<operation name="sayHello">
<soap:operation soapAction="sayHello"/>
<input>
<soap:body
encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
namespace="urn:examples:helloservice"
use="encoded"/>
</input>
<output>
<soap:body
encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
namespace="urn:examples:helloservice"
use="encoded"/>
</output>
</operation>
</binding>
Web Services
Web services are Web applications based on open standards (XML, SOAP, HTTP, etc.) that interact
with other Web applications for the purpose of exchanging data.
UDDI
UDDI is an XML‐based standard for describing, publishing, and finding Web services.
SOAP
SOAP is a simple XML‐based protocol that allows applications to exchange information over
HTTP.
How WSDL works
The following diagram illustrates how a Web service is registered, found and called in a scenario
based on Java technology. In this diagram, the Web service is registered in a UDDI repository
using the Java API for XML Registries (JAXR), where a business partner or other system can find
the service. The registry information from UDDI is used to locate a WSDL document that details
the call semantics for the Web service. With the WSDL document in hand, the Java programmer
can then feed it to a tool that can generate a Java object proxy to the Web service, or simply
use it as a reference document along with a lower‐level SOAP API.
Web service Policy
Introduction
The Web Service Policy Framework (WS-Policy) specification defines a syntax and semantics for
service providers and service requestors to describe their requirements, preferences, and
capabilities. The syntax provides a flexible and concise way of expressing the needs of each
domain in the form of policies. A domain in this context is a generic field of interest that applies
to the service, such as the following:
• Security
• Privacy
• Application priorities
• User account priorities
• Traffic control
What is WS‐Policy?
• Allows a Web service to have a set of rules that must be met by the client before it
accesses the Web service.
• Clients that access the Web service check to see whether or not they can adhere to
these policies.
> Example: a Web service has a policy that "all messages be encrypted or signed in a certain
way"; a client cannot access the service without meeting this policy requirement.
> Example: a Web service has a policy requiring that every message has a
timestamp.
Also defines processing models for these policies that operate independently of the domains.
• There are three defined operations for processing policies
• Normalize
• Merge
• Intersect
WS‐Policy Terms
• Policy
> A collection of policy alternatives
• Policy alternative
> A collection of assertions
> In normal form, a policy contains a list of policy alternatives specified in wsp:All tags
• Policy assertion
> Represents a requirement or capability.
> For example, a policy assertion could require that a certain type of encryption be used in
encrypting transmitted data.
• Policy assertion type
> A class of policy assertions
• Policy expression
> An XML representation of a policy
• Policy subject
> An entity to which a policy can be applied
> Examples: an endpoint, message, resource, interaction
• Policy scope
> A set of policy subjects
• Policy attachment
> A mechanism for associating policy with one or more policy scopes
The specification also describes processing models for these policies that operate
independently of the domains. There are three defined operations for processing policies;
Normalize, Merge, and Intersect. This paper discusses these functions and highlights some of
the less obvious implications of this processing.
It is worth taking a moment to cover some of the basic terminology that the WS‐Policy
framework specification uses. The words Assertion and Alternative are used extensively in this
paper, so the following sections describe these terms so that you are familiar with them.
The assertion
An assertion is the basic unit of policy. It can be thought of as an instruction to a policy
processing infrastructure. For example, the assertion could declare that the message be
encrypted; the actual definition of this assertion would be in the WS‐Security Policy domain
specification.
Thus, the meaning of individual assertions is specific to each domain and identified in their own
separate specifications; the details of each domain-specific assertion are beyond the scope of the
WS-Policy framework. The WS-Policy framework itself treats each assertion as opaque.
Each assertion is identified by its Qualified Name (QName). An assertion can be a simple string
or can be a complex object with many sub elements and attributes. However, it is only the root
XML element QName that is involved in any WS‐Policy framework processing.
The alternative
A policy is built up using assertions and nested combinations of the operators <wsp:All>,
<wsp:ExactlyOne>, and the attribute Optional. This policy syntax is used to describe acceptable
combinations of assertions to form a complete set of instructions to the policy processing
infrastructure for a given Web service invocation. Each set of assertions is termed an
alternative.
Note that only a policy which includes the Optional attribute or the <wsp:ExactlyOne> operator
has more than one alternative. Put the other way round, a policy that is built up using only
assertions and the <wsp:All> operator can be collapsed into a single alternative.
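The way operators expand into alternatives can be sketched in a few lines of Python. In this sketch a policy is a nested tuple rather than XML, and the assertion names (ex:Timestamp, ex:Encrypt, ex:Sign) are hypothetical placeholders, not assertions from any real domain specification. An optional assertion is treated as a choice between the assertion and nothing, <wsp:ExactlyOne> collects the alternatives of its children, and <wsp:All> combines one alternative from each child.

```python
from itertools import product


def alternatives(node):
    """Return the alternatives of a policy node as a list of frozensets of names.

    A node is either ("assertion", name, optional) or (operator, [children])
    with operator "All" or "ExactlyOne". This tuple form is an illustration only.
    """
    kind = node[0]
    if kind == "assertion":
        _, name, optional = node
        alts = [frozenset([name])]
        if optional:
            # Optional="true": one alternative with the assertion, one without.
            alts.append(frozenset())
        return alts
    children = [alternatives(child) for child in node[1]]
    if kind == "ExactlyOne":
        # Any single child alternative satisfies the operator.
        return [alt for child in children for alt in child]
    if kind == "All":
        # Pick one alternative from every child and merge them.
        return [frozenset().union(*combo) for combo in product(*children)]
    raise ValueError("unknown node kind: %r" % (kind,))


policy = ("All", [
    ("assertion", "ex:Timestamp", True),
    ("ExactlyOne", [
        ("assertion", "ex:Encrypt", False),
        ("assertion", "ex:Sign", False),
    ]),
])
all_only = ("All", [
    ("assertion", "ex:Encrypt", False),
    ("assertion", "ex:Sign", False),
])

policy_alts = alternatives(policy)
all_only_alts = alternatives(all_only)
print(len(policy_alts), "alternatives:", [sorted(a) for a in policy_alts])
print(len(all_only_alts), "alternative:", [sorted(a) for a in all_only_alts])
```

The first policy uses both Optional and ExactlyOne and therefore yields several alternatives, while the policy built only from assertions and All collapses into a single alternative, as described above.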
Discovering Web Services
Web services provide access to software systems over the Internet using standard protocols. In
a minimalistic scenario there exists at least a Web service provider that publishes some service
such as a weather service and a Web service consumer that uses this service. Web service
discovery is the process of finding a suitable Web service for a given task.
Publishing a Web service involves, at a bare minimum, creating the software artifact and
making it accessible to potential consumers. So that a consumer can use a service, providers
usually augment a Web service endpoint with an interface description using the Web Services
Description Language (WSDL).
Optionally a provider can explicitly register a service with a Web services registry such as UDDI
or publish additional documents intended to facilitate discovery such as Web Services
Inspection Language (WSIL) documents. The service users or consumers need to search Web
services manually or automatically. The implementation of UDDI servers and WSIL engines
should provide simple search APIs or web‐based GUI to help find Web services.
Web services may also be discovered using multicast mechanisms like WS‐Discovery, thus
reducing the need for centralized registries in smaller networks.
Web services can be located through a public business registry, a private business registry, or a
WSIL document. UDDI manages the discovery of Web services by relying on a distributed
registry of businesses and their service descriptions implemented in a common XML format.
Prerequisites:
1. Register with a registry
2. Launch the Web Services Explorer
3. Add the registry to the Web Services Explorer
Service registries and discovery
Universal Description, Discovery and Integration
Web Services Inspection Language (WSIL) is a service discovery mechanism that is an
alternative to UDDI as well as complementary to UDDI. WSIL allows you to go directly to the
service provider and ask for the services it provides. For more information on the Web Services
Inspection Language specification, refer to
You can discover a Web service in two ways using the Web Services Explorer:
• Discovering a Web service from a UDDI registry
• Discovering a Web service from a WSIL document
UDDI
UDDI is an XML‐based standard for describing, publishing, and finding Web services.
UDDI stands for Universal Description, Discovery and Integration.
In this tutorial you will learn what UDDI is, and why and how to use it.
What is UDDI?
UDDI is a specification for a distributed registry of Web services.
UDDI is a platform-independent, open framework.
UDDI can communicate via SOAP, CORBA, and the Java RMI protocol.
UDDI uses WSDL to describe interfaces to web services.
UDDI is seen with SOAP and WSDL as one of the three foundation standards of web
services.
UDDI is an open industry initiative enabling businesses to discover each other and
define how they interact over the Internet.
UDDI has two parts:
A registry of all a web service's metadata including a pointer to the WSDL description of
a service
A set of WSDL port type definitions for manipulating and searching that registry
History of UDDI
UDDI 1.0 was originally announced by Microsoft, IBM, and Ariba in September 2000.
Since the initial announcement, the UDDI initiative has grown to include more than 300
companies including Dell, Fujitsu, HP, Hitachi, IBM, Intel, Microsoft, Oracle, SAP, and
Sun.
In May 2001, Microsoft and IBM launched the first UDDI operator sites and turned the
UDDI registry live.
In June 2001, UDDI Version 2.0 was announced.
As of this writing, the Microsoft and IBM sites implement the 1.0 specification and plan
2.0 support in the near future.
Currently, UDDI is sponsored by OASIS.
Partner Interface Processes ‐ PIPs
Partner Interface Processes (PIPs) are XML-based interfaces that enable two trading partners to
exchange data. Dozens of PIPs already exist. A few are listed here:
PIP2A2 : Enables a partner to query another for product information.
PIP3A2 : Enables a partner to query the price and availability of specific products.
PIP3A4 : Enables a partner to submit an electronic purchase order and receive
acknowledgment of the order
PIP3A3 : Enables a partner to transfer the contents of an electronic shopping cart.
PIP3B4 : Enables a partner to query status on a specific shipment.
The Organization of UDDI
Directory | Operation | Information
[...] number, and other contact information of a given business | [...] services registers itself | [...] categories, contacts, URLs, and other things necessary to interact with a given business.
Yellow pages: Categories of businesses based on existing (nonelectronic) standards | Find: How an application finds a particular Web service | Service information: Describes a group of Web services. These are contained in a businessService object.
Green pages: Technical information about the Web services provided by a given business | Bind: How an application connects to and interacts with Web services after it's been found | Binding information: The technical details necessary to invoke Web services. This includes URLs, information about method names, argument types, and so on. The bindingTemplate object represents this data. Service specification detail: This is metadata about the various specifications implemented by a given Web service. These are called tModels in the UDDI specification.
Private UDDI Registries
As an alternative to using the public federated network of UDDI registries available on the
Internet, companies or industry groups may choose to implement their own private UDDI
registries.
These exclusive services would be designed for the sole purpose of allowing members of the
company or of the industry group to share and advertise services amongst themselves.
However, whether the UDDI registry is part of the global federated network or a privately
owned and operated registry, the one thing that ties it all together is a common web services
API for publishing and locating businesses and services advertised within the UDDI registry.
Anatomy of UDDI
UDDI Elements
A business or company can register three types of information in a UDDI registry. This
information is contained in three elements of UDDI.
These three elements are:
(1) White pages:
This category contains:
Basic information about the Company and its business.
Basic contact information including business name, address, contact phone number etc.
Unique identifiers for the company, such as tax IDs. This information allows others to discover
your web service based upon your business identification.
(2) Yellow pages:
This category contains:
More details about the company, including descriptions of the kind of
electronic capabilities the company can offer to anyone who wants to do business with
it.
It uses commonly accepted industrial categorization schemes, industry codes, product
codes, business identification codes and the like to make it easier for companies to
search through the listings and find exactly what they want.
(3) Green pages:
This category contains technical information about a web service. This is what allows someone
to bind to a Web service after it's been found. This includes:
The various interfaces
The URL locations
Discovery information and similar data required to find and run the Web service.
NOTE: UDDI is not restricted to describing web services based on SOAP. Rather, UDDI can be
used to describe any service, from a single web page or email address all the way up to SOAP,
CORBA, and Java RMI services.
UDDI Technical Architecture
The UDDI technical architecture consists of three parts:
UDDI data model:
An XML Schema for describing businesses and web services. The data model is described in
detail in the "UDDI Data Model" section.
UDDI API Specification:
A Specification of API for searching and publishing UDDI data.
UDDI cloud services:
These are operator sites that provide implementations of the UDDI specification and synchronize
all data on a scheduled basis.
The UDDI Business Registry (UBR), also known as the Public Cloud, is a conceptually single
system built from multiple nodes that have their data synchronized through replication.
The current cloud services provide a logically centralized, but physically distributed, directory.
This means that data submitted to one root node will automatically be replicated across all the
other root nodes. Currently, data replication occurs every 24 hours.
UDDI cloud services are currently provided by Microsoft and IBM. Ariba had originally planned
to offer an operator as well, but has since backed away from the commitment. Additional
operators from other companies, including Hewlett-Packard, are planned for the near future.
It is also possible to set up private UDDI registries. For example, a large company may set up its
own private UDDI registry for registering all internal web services. As these registries are not
automatically synchronized with the root UDDI nodes, they are not considered part of the UDDI
cloud.
UDDI Data Model
UDDI includes an XML Schema that describes five data structures:
businessEntity
businessService
bindingTemplate
tModel
publisherAssertion
businessEntity data structure:
The business entity structure represents the provider of web services. Within the UDDI registry,
this structure contains information about the company itself, including contact information,
industry categories, business identifiers, and a list of services provided.
Here is an example of a fictitious business's UDDI registry entry:
<businessEntity businessKey="uuid:C0E6D5A8‐C446‐4f01‐99DA‐70E212685A40"
operator="http://www.ibm.com"
authorizedName="John Doe">
<name>Acme Company</name>
<description>
We create cool Web services
</description>
<contacts>
<contact useType="general info">
<description>General Information</description>
<personName>John Doe</personName>
<phone>(123) 123‐1234</phone>
<email>jdoe@acme.com</email>
</contact>
</contacts>
<businessServices>
...
</businessServices>
<identifierBag>
<keyedReference
tModelKey="UUID:8609C81E‐EE1F‐4D5A‐B202‐3EB13AD01823"
name="D‐U‐N‐S"
value="123456789" />
</identifierBag>
<categoryBag>
<keyedReference
tModelKey="UUID:C0B9FE13‐179F‐413D‐8A5B‐5004DB8E5BB2"
name="NAICS"
value="111336" />
</categoryBag>
</businessEntity>
businessService data structure:
The business service structure represents an individual web service provided by the business
entity. Its description includes information on how to bind to the web service, what type of web
service it is, and what taxonomical categories it belongs to:
Here is an example of a business service structure for the Hello World web service
<businessService serviceKey="uuid:D6F1B765‐BDB3‐4837‐828D‐8284301E5A2A"
businessKey="uuid:C0E6D5A8‐C446‐4f01‐99DA‐70E212685A40">
<name>Hello World Web Service</name>
<description>A friendly Web service</description>
<bindingTemplates>
...
</bindingTemplates>
<categoryBag />
</businessService>
Notice the use of the Universally Unique Identifiers (UUIDs) in the businessKey and serviceKey
attributes. Every business entity and business service is uniquely identified in all UDDI registries
through the UUID assigned by the registry when the information is first entered.
bindingTemplate data structure:
Binding templates are the technical descriptions of the web services represented by the
business service structure. A single business service may have multiple binding templates. The
binding template represents the actual implementation of the web service.
Here is an example of a binding template for Hello World
<bindingTemplate serviceKey="uuid:D6F1B765‐BDB3‐4837‐828D‐8284301E5A2A"
bindingKey="uuid:C0E6D5A8‐C446‐4f01‐99DA‐70E212685A40">
<description>Hello World SOAP Binding</description>
<accessPoint URLType="http">
http://localhost:8080
</accessPoint>
<tModelInstanceDetails>
<tModelInstanceInfo
tModelKey="uuid:EB1B645F‐CF2F‐491f‐811A‐4868705F5904">
<instanceDetails>
<overviewDoc>
<description>
references the description of the
WSDL service definition
</description>
<overviewURL>
http://localhost/helloworld.wsdl
</overviewURL>
</overviewDoc>
</instanceDetails>
</tModelInstanceInfo>
</tModelInstanceDetails>
</bindingTemplate>
Because a business service may have multiple binding templates, the service may specify
different implementations of the same service, each bound to a different set of protocols or a
different network address.
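A client that has retrieved a binding template needs two things from it: the access point to call and a pointer to the WSDL description. The following sketch in Python, assuming the standard library xml.etree.ElementTree module, pulls both out of the Hello World example (with the inner description element trimmed). The helper name describe_binding is hypothetical, and the sketch follows the example as printed, without a UDDI namespace.

```python
import xml.etree.ElementTree as ET

binding_xml = """<bindingTemplate serviceKey="uuid:D6F1B765-BDB3-4837-828D-8284301E5A2A"
    bindingKey="uuid:C0E6D5A8-C446-4f01-99DA-70E212685A40">
  <description>Hello World SOAP Binding</description>
  <accessPoint URLType="http">
    http://localhost:8080
  </accessPoint>
  <tModelInstanceDetails>
    <tModelInstanceInfo tModelKey="uuid:EB1B645F-CF2F-491f-811A-4868705F5904">
      <instanceDetails>
        <overviewDoc>
          <overviewURL>
            http://localhost/helloworld.wsdl
          </overviewURL>
        </overviewDoc>
      </instanceDetails>
    </tModelInstanceInfo>
  </tModelInstanceDetails>
</bindingTemplate>"""


def describe_binding(text):
    """Extract the call details from a bindingTemplate (hypothetical helper)."""
    binding = ET.fromstring(text)
    access_point = binding.find("accessPoint")
    return {
        "service_key": binding.get("serviceKey"),
        "url_type": access_point.get("URLType"),
        # Element text keeps the surrounding whitespace, so strip it.
        "access_point": access_point.text.strip(),
        "tmodel_keys": [info.get("tModelKey")
                        for info in binding.iter("tModelInstanceInfo")],
        "wsdl_urls": [url.text.strip() for url in binding.iter("overviewURL")],
    }


details = describe_binding(binding_xml)
print(details)
```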
tModel data structure:
The tModel is the last core data type, but potentially the most difficult to grasp. tModel stands
for technical model.
A tModel is a way of describing the various business, service, and template structures stored
within the UDDI registry. Any abstract concept can be registered within UDDI as a tModel. For
instance, if you define a new WSDL port type, you can define a tModel that represents that port
type within UDDI. Then, you can specify that a given business service implements that port type
by associating the tModel with one of that business service's binding templates.
Here is an example of a tModel representing the HelloWorldInterface port type:
<tModel tModelKey="uuid:xyz987..."
operator="http://www.ibm.com"
authorizedName="John Doe">
<name>HelloWorldInterface Port Type</name>
<description>
An interface for a friendly Web service
</description>
<overviewDoc>
<overviewURL>
http://localhost/helloworld.wsdl
</overviewURL>
</overviewDoc>
</tModel>
publisherAssertion data structure:
This is a relationship structure that associates two or more businessEntity structures
according to a specific type of relationship, such as subsidiary or department.
The publisherAssertion structure consists of the three elements fromKey (the first businessKey),
toKey (the second businessKey) and keyedReference.
The keyedReference designates the asserted relationship type in terms of a keyName keyValue
pair within a tModel, uniquely referenced by a tModelKey.
<element name="publisherAssertion" type="uddi:publisherAssertion" />
<complexType name="publisherAssertion">
<sequence>
<element ref="uddi:fromKey" />
<element ref="uddi:toKey" />
<element ref="uddi:keyedReference" />
</sequence>
</complexType>
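An instance that follows this schema can be sketched as below, in Python with the standard library xml.etree.ElementTree module. The three children are emitted in the order the sequence requires. The second business key, the tModelKey, and the keyName/keyValue pair are placeholder values invented for illustration; they are not taken from any registry, and the UDDI namespace is omitted to match the other examples in these notes.

```python
import xml.etree.ElementTree as ET


def build_publisher_assertion(from_key, to_key, tmodel_key, key_name, key_value):
    """Build a publisherAssertion element (hypothetical helper)."""
    assertion = ET.Element("publisherAssertion")
    # The schema is a sequence, so the order of the children matters.
    ET.SubElement(assertion, "fromKey").text = from_key
    ET.SubElement(assertion, "toKey").text = to_key
    ET.SubElement(assertion, "keyedReference", {
        "tModelKey": tmodel_key,
        "keyName": key_name,
        "keyValue": key_value,
    })
    return assertion


assertion = build_publisher_assertion(
    from_key="uuid:C0E6D5A8-C446-4f01-99DA-70E212685A40",   # Acme Company (from the example)
    to_key="uuid:00000000-0000-0000-0000-000000000001",     # placeholder second business
    tmodel_key="uuid:00000000-0000-0000-0000-000000000002", # placeholder relationship tModel
    key_name="subsidiary",                                  # placeholder relationship name
    key_value="parent-child",                               # placeholder relationship value
)
print(ET.tostring(assertion, encoding="unicode"))
```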
UDDI Interfaces
A registry is of no use without some way to access it. The UDDI standard version 2.0 specifies two
interfaces for service consumers and service providers to interact with the registry.
Service consumers use Inquiry Interface to find a service, and service providers use Publisher
Interface to list a service.
The core of the UDDI interfaces is the UDDI XML Schema definitions.These define the
fundamental UDDI data types through which all the information flows.
The Publisher Interface:
The Publisher interface defines sixteen operations for a service provider managing its entries in
the UDDI registry:
get_authToken: Retrieves an authorization token. All of the other Publisher interface
operations require that a valid authorization token be submitted with the request.
discard_authToken: Tells the UDDI registry to no longer accept a given authorization
token. This step is equivalent to logging out of the system.
save_business: Creates or updates a business entity's information contained in the UDDI
registry.
save_service: Creates or updates information about the web services that a business
entity provides.
save_binding: Creates or updates the technical information about a web service's
implementation.
save_tModel: Creates or updates the registration of abstract concepts managed by the
UDDI registry.
delete_business: Removes the given business entities from the UDDI registry
completely.
delete_service: Removes the given web services from the UDDI registry completely.
delete_binding: Removes the given web service technical details from the UDDI registry.
delete_tModel: Removes the specified tModels from the UDDI registry.
get_registeredInfo: Returns a summary of everything the UDDI registry is currently
keeping track of for the user, including all businesses, all services, and all tModels.
set_publisherAssertions: Manages all of the tracked relationship assertions associated
with an individual publisher account.
add_publisherAssertions: Causes one or more publisherAssertions to be added to an
individual publisher's assertion collection.
delete_publisherAssertions: Causes one or more publisherAssertion elements to be
removed from a publisher's assertion collection.
get_assertionStatusReport: Provides administrative support for determining the status
of current and outstanding publisher assertions that involve any of the business
registrations managed by the individual publisher account.
get_publisherAssertions: Obtains the full set of publisher assertions that is associated
with an individual publisher account.
The Inquiry Interface:
The inquiry interface defines ten operations for searching the UDDI registry and retrieving
details about specific registrations:
find_binding: Returns a list of web services that match a particular set of criteria based
on the technical binding information.
find_business: Returns a list of business entities that match a particular set of criteria.
find_service: Returns a list of web services that match a particular set of criteria.
find_tModel: Returns a list of tModels that match a particular set of criteria.
get_bindingDetail: Returns the complete registration information for a particular web
service binding template.
get_businessDetail: Returns the registration information for a business entity, including
all services that entity provides.
get_businessDetailExt: Returns the complete registration information for a business
entity.
get_serviceDetail: Returns the complete registration information for a web service.
get_tModelDetail: Returns the complete registration information for a tModel.
find_relatedBusinesses: Discovers businesses that have been related via the
uddi‐org:relationships model.
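The split between the two interfaces can be summarized in a small lookup table. The sketch below restates the sixteen Publisher and ten Inquiry operation names; the helper names interface_for and requires_auth_token are made up for this example, and treating every inquiry operation as token-free is an assumption (the notes state the token requirement only for the Publisher interface).

```python
PUBLISHER_OPERATIONS = (
    "get_authToken", "discard_authToken",
    "save_business", "save_service", "save_binding", "save_tModel",
    "delete_business", "delete_service", "delete_binding", "delete_tModel",
    "get_registeredInfo",
    "set_publisherAssertions", "add_publisherAssertions",
    "delete_publisherAssertions", "get_assertionStatusReport",
    "get_publisherAssertions",
)

INQUIRY_OPERATIONS = (
    "find_binding", "find_business",
    "find_service",  # the UDDI name of the service search call
    "find_tModel",
    "get_bindingDetail", "get_businessDetail", "get_businessDetailExt",
    "get_serviceDetail", "get_tModelDetail", "find_relatedBusinesses",
)


def interface_for(operation):
    """Return 'publisher' or 'inquiry' for a known UDDI v2 operation name."""
    if operation in PUBLISHER_OPERATIONS:
        return "publisher"
    if operation in INQUIRY_OPERATIONS:
        return "inquiry"
    raise ValueError(f"unknown UDDI operation: {operation}")


def requires_auth_token(operation):
    """Publisher calls need a token; get_authToken is the call that issues it."""
    return interface_for(operation) == "publisher" and operation != "get_authToken"


for name in ("get_authToken", "save_business", "find_business"):
    print(name, interface_for(name), requires_auth_token(name))
```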
UDDI Usage Example
Suppose a company, XYZ, wants to register its contact information, service description, and
online service access information with UDDI. The following steps are necessary:
1. Choose an operator with which to work. Each operator has different terms and
conditions for authorizing access to its replica of the registry.
2. Build or otherwise obtain a UDDI client, such as those provided by the operators.
3. Obtain an authentication token from the operator.
4. Register information about the business. Include as much information as might be
helpful to those searching for matches.
5. Release the authentication token.
6. Use the inquiry APIs to test the retrieval of the information, including binding template
information, to ensure that someone who obtains it can use it successfully to interact
with your service.
7. Fill in the tModel information in case someone wants to search for a given service and
find your business as one of the service providers.
8. Update the information as necessary to reflect changing business contact information
and new service details, obtaining and releasing a new authentication token from the
operator each time. Whenever you need to update or to modify the data you've
registered, you have to go back to the operator with which you entered the data.
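The token handling in steps 3 to 5 and step 8 can be sketched as follows. FakeRegistryClient is an in-memory stand-in invented for this example; it is not a real UDDI client library, and its method names and the credentials are assumptions. The point of the sketch is only the pattern: obtain a token, publish, and always release the token, with a fresh token for each later update.

```python
import uuid
from contextlib import contextmanager


class FakeRegistryClient:
    """In-memory stand-in for a UDDI operator; invented for this sketch."""

    def __init__(self):
        self.active_tokens = set()
        self.businesses = {}

    def get_auth_token(self, user, password):
        token = f"authToken:{uuid.uuid4()}"
        self.active_tokens.add(token)
        return token

    def discard_auth_token(self, token):
        self.active_tokens.discard(token)

    def save_business(self, token, name, business_key=""):
        if token not in self.active_tokens:
            raise PermissionError("invalid or discarded authorization token")
        # A blank key means "create": the operator assigns the key.
        key = business_key or str(uuid.uuid4())
        self.businesses[key] = name
        return key


@contextmanager
def publishing_session(client, user, password):
    """Obtain a token, hand it to the caller, and always release it afterwards."""
    token = client.get_auth_token(user, password)
    try:
        yield token
    finally:
        client.discard_auth_token(token)


client = FakeRegistryClient()
with publishing_session(client, "xyz-admin", "secret") as token:
    key = client.save_business(token, "XYZ, Pvt Ltd.")

# A later update obtains and releases a new token (step 8).
with publishing_session(client, "xyz-admin", "secret") as token:
    client.save_business(token, "XYZ, Pvt Ltd. (updated)", business_key=key)

print(client.businesses[key])
```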
The following examples show how the XYZ Company would register its information and how a
distributor interested in carrying XYZ's product line might find out how to contact the
company and place an order, using the XYZ.com Web services.
Creating Registry:
After obtaining an authentication token from one of the operators (Microsoft, for example), the
XYZ.com developers decide what information to publish to the registry and use one of the UDDI
tools provided by Microsoft. If necessary, the developers can also write a Java, C#, or VB.NET
program to generate the appropriate SOAP messages. Here is an example.
POST /save_business HTTP/1.1
Host: www.XYZ.com
Content-Type: text/xml; charset="utf-8"
Content-Length: nnnn
SOAPAction: "save_business"

<?xml version="1.0" encoding="UTF-8" ?>
<Envelope xmlns="http://schemas.xmlsoap.org/soap/envelope/">
  <Body>
    <save_business generic="2.0" xmlns="urn:uddi-org:api_v2">
      <businessEntity businessKey="">
        <name>
          XYZ, Pvt Ltd.
        </name>
        <description>
          Company is involved in giving State-of-the-art....
        </description>
        <identifierBag> ... </identifierBag>
        ...
      </businessEntity>
    </save_business>
  </Body>
</Envelope>
This example illustrates a SOAP message requesting to register a UDDI business entity for XYZ
Company. The businessKey is left blank because the operator automatically generates the UUID
key for the data structure. Most fields are omitted for the sake of showing a simple example.
Company XYZ can always execute another save_business operation to add to the basic
information required to create a business entity.
Retrieving Information:
After XYZ Company has updated its UDDI entry with the relevant information, companies that
want to become XYZ distributors can look up contact information in the UDDI registry and
obtain the service descriptions and the access points for the two Web services that XYZ.com
publishes for online order entry: preseason bulk orders and in‐season restocking orders.
This example illustrates a sample SOAP request to obtain business detail information about the
XYZ Company. Once you know the UUID, or key, for the specific business that's been registered,
you can use it in the get_businessDetail API to return specific information about that business.
POST /get_businessDetail HTTP/1.1
Host: www.XYZ.com
Content-Type: text/xml; charset="utf-8"
Content-Length: nnnn
SOAPAction: "get_businessDetail"

<?xml version="1.0" encoding="UTF-8" ?>
<Envelope xmlns="http://schemas.xmlsoap.org/soap/envelope/">
  <Body>
    <get_businessDetail generic="2.0" xmlns="urn:uddi-org:api_v2">
      <businessKey>C90D731D-772HSH-4130-9DE3-5303371170C2</businessKey>
    </get_businessDetail>
  </Body>
</Envelope>
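A message like the get_businessDetail request above can also be generated programmatically rather than written by hand. The following is a minimal Python sketch, assuming only the standard library; the function name build_get_business_detail is invented for this example, the business key is the sample key from these notes, and nothing is actually sent to a registry.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
UDDI_NS = "urn:uddi-org:api_v2"
XML_DECLARATION = b'<?xml version="1.0" encoding="UTF-8"?>\n'


def build_get_business_detail(business_key):
    """Return (headers, payload) for a UDDI v2 get_businessDetail request."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{UDDI_NS}}}get_businessDetail", {"generic": "2.0"})
    ET.SubElement(call, f"{{{UDDI_NS}}}businessKey").text = business_key

    payload = XML_DECLARATION + ET.tostring(envelope, encoding="utf-8")
    headers = {
        "Content-Type": 'text/xml; charset="utf-8"',
        # Content-Length is the byte length of the serialized envelope.
        "Content-Length": str(len(payload)),
        "SOAPAction": '"get_businessDetail"',
    }
    return headers, payload


headers, payload = build_get_business_detail("C90D731D-772HSH-4130-9DE3-5303371170C2")
print(headers)
print(payload.decode("utf-8"))
```

The serializer emits namespace prefixes such as ns0 and ns1 instead of default namespaces; the resulting document is equivalent to the hand-written one. Computing Content-Length from the serialized bytes avoids the nnnn placeholder.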
UDDI with WSDL
The UDDI data model defines a generic structure for storing information about a
business and the Web services it publishes. The UDDI data model is completely
extensible, including several repeating sequence structures of information.
However, WSDL is used to describe the interface of a web service. WSDL is fairly
straightforward to use with UDDI.
WSDL is represented in UDDI using a combination of businessService, bindingTemplate,
and tModel information.
As with any service registered in UDDI, generic information about the service is stored in
the businessService data structure, and information specific to how and where the
service is accessed is stored in one or more associated bindingTemplate structures. Each
bindingTemplate structure includes an element that contains the network address of
the service and has associated with it one or more tModel structures that describe and
uniquely identify the service.
When UDDI is used to store WSDL information, or pointers to WSDL files, the tModel
should be referred to by convention as type wsdlSpec, meaning that the overviewDoc
element is clearly identified as pointing to a WSDL service interface definition.
For UDDI, WSDL contents are split into two major elements: the interface file and the
implementation file.
The Hertz reservation system web service provides a concrete example of how UDDI and WSDL
work together. Here is the <tModel> for this web service:
<tModel authorizedName="..." operator="..." tModelKey="...">
<name>HertzReserveService</name>
<description xml:lang="en">
WSDL description of the Hertz reservation service interface
</description>
<overviewDoc>
<description xml:lang="en">
WSDL source document.
</description>
<overviewURL>
http://mach3.ebphost.net/wsdl/hertz_reserve.wsdl
</overviewURL>
</overviewDoc>
<categoryBag>
<keyedReference
tModelKey="uuid:C1ACF26D‐9672‐4404‐9D70‐39B756E62AB4"
keyName="uddi‐org:types" keyValue="wsdlSpec"/>
</categoryBag>
</tModel>
The key points are:
The overviewURL element gives the URL where the service interface definition WSDL
file can be found. This allows humans and UDDI/WSDL‐aware tooling to locate the
service interface definition.
The purpose of the keyedReference element in the categoryBag is to make sure that this
tModel is categorized as a WSDL specification document.
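The two key points can be checked mechanically. The sketch below, using only the Python standard library, parses a trimmed copy of the Hertz tModel and returns the overviewURL only when the categoryBag marks the tModel as wsdlSpec. The function name wsdl_location is invented for this example, and the tModel is parsed without an XML namespace, as it is printed in these notes; a tModel returned by a real registry would carry the UDDI namespace.

```python
import xml.etree.ElementTree as ET

# tModelKey of the uddi-org:types taxonomy, as shown in the tModel above.
UDDI_TYPES_TMODEL = "uuid:C1ACF26D-9672-4404-9D70-39B756E62AB4"

TMODEL_XML = """
<tModel authorizedName="..." operator="..." tModelKey="...">
  <name>HertzReserveService</name>
  <overviewDoc>
    <overviewURL>
      http://mach3.ebphost.net/wsdl/hertz_reserve.wsdl
    </overviewURL>
  </overviewDoc>
  <categoryBag>
    <keyedReference
        tModelKey="uuid:C1ACF26D-9672-4404-9D70-39B756E62AB4"
        keyName="uddi-org:types" keyValue="wsdlSpec"/>
  </categoryBag>
</tModel>
"""


def wsdl_location(tmodel):
    """Return the overviewURL if the tModel is categorized as wsdlSpec, else None."""
    for ref in tmodel.iterfind("categoryBag/keyedReference"):
        same_taxonomy = ref.get("tModelKey", "").lower() == UDDI_TYPES_TMODEL.lower()
        if same_taxonomy and ref.get("keyValue") == "wsdlSpec":
            url = tmodel.findtext("overviewDoc/overviewURL")
            return url.strip() if url else None
    return None


tmodel = ET.fromstring(TMODEL_XML)
print(wsdl_location(tmodel))
```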
UDDI Implementations
A number of UDDI implementations are currently available. These implementations make it
easier to search or publish UDDI data, without getting mired in the complexities of the UDDI
API. Here is a brief synopsis of the main UDDI implementations available.
Java Implementations:
There are two UDDI implementations for Java.
UDDI4J (UDDI for Java): UDDI4J was originally created by IBM. In January 2001, IBM
turned over the code to its own open source site. UDDI4J is a Java class library that
provides an API to interact with a UDDI registry.
jUDDI: jUDDI is an open source Java implementation of a UDDI registry and a toolkit for
accessing UDDI services.
Perl Implementation:
UDDI::Lite : provides a basic UDDI client for inquiry and publishing.
Ruby Implementation:
UDDI4r: provides a basic UDDI client for inquiry and publishing.
Python Implementation:
UDDI4Py: UDDI4Py is a Python package that allows the sending of requests to and
processing of responses from the UDDI Version 2 APIs.
UDDI Specifications
The UDDI project also defines a set of XML Schema definitions that describe the data formats
used by the various specification APIs. These documents are all available for download at
www.uddi.org. The current version of all specification groups is Version 2.0.
The specifications include:
UDDI Replication:
This document describes the data replication processes and interfaces to which a registry
operator must conform to achieve data replication between sites. This specification is not a
programmer's API; it defines the replication mechanism used among UBR nodes.
UDDI Operators:
This document outlines the behavior and operational parameters required by UDDI node
operators. This specification defines data management requirements to which operators must
adhere.
UDDI Programmer's API:
This specification defines a set of functions that all UDDI registries support for inquiring about
services hosted in a registry and for publishing information about a business or a service to a
registry. This specification defines a series of SOAP messages containing XML documents that a
UDDI registry accepts, parses, and responds to. This specification, along with the UDDI XML API
schema and the UDDI Data Structure specification, makes up a complete programming
interface to a UDDI registry.
UDDI Data Structures:
This specification covers the specifics of the XML structures contained within the SOAP
messages defined by the UDDI Programmer's API. This specification defines five core data
structures and their relationships to one another.
The UDDI XML API schema is not contained in a specification; rather, it is stored as an XML
Schema document that defines the structure and datatypes of the UDDI data structures.
SOAP
SOAP is a simple XML‐based protocol that allows applications to exchange information over
HTTP.
WSDL
WSDL is the standard format for describing a web service in XML format.
WSDL is an integral part of UDDI.
Web Services
Web services can convert your applications into web‐applications.
Web service inspection
WSIL (Web Services Inspection Language, also WS‐Inspection) is an XML‐based specification
for locating Web services without the necessity of using UDDI. However, WSIL can also be
used together with UDDI; that is, it is orthogonal to UDDI and does not replace it. Most
business partners today do not find one another through UDDI registries; rather, they rely on
existing relationships. That is where the Web Services Inspection Language fits in. WSIL
decentralizes the centralized model of service publication within a UDDI registry and
distributes the pieces so that each service provider can advertise its own Web services
offerings. WSIL thus supports the behavior that most businesses wishing to use Web services
are most comfortable with today. Yet WSIL is less widely used today, as Web service registries
take its place.
WS‐Inspection: Web Services Inspection Language (WSIL)
WS‐Inspection describes how to locate Web service descriptions on a server and how this
information needs to be structured. As such, WSIL can be viewed as a lightweight UDDI.
Web Services Inspection Language (WSIL)
WSIL, like UDDI, provides a method of service
discovery for web services. Unlike UDDI, WSIL uses a decentralized, distributed model, rather
than a centralized model. WSIL documents, which are essentially pointers to lists of services,
allow consumers of web services to browse available services on web sites. The WSIL
specification provides standards for using XML‐formatted documents to inspect a site for
services and a set of rules for how the information is made available. A WSIL document gathers
multiple references to pre‐existing service description documents in one document. The WSIL
document is then hosted by the provider of the service, so consumers can find out about
available services.
An overview of the Web Services Inspection Language
Summary: Service discovery defines a process for locating service providers and retrieving
service description documents, and is a key component of the overall Web services model.
Service discovery is a very broad concept, which means that it is unlikely to have one solution
that addresses all of its requirements. The Universal Description, Discovery and Integration
(UDDI) specification addresses a subset of the overall requirements by using a centralized
service discovery model. This article provides an overview of the Web Services Inspection
Language (WS‐Inspection), another related service discovery mechanism that addresses a
different subset of requirements using a distributed usage model. The WS‐Inspection
specification is designed around an XML‐based model for building an aggregation of references
to existing Web service descriptions, which are exposed using standard Web server technology.
The Web services architecture is based upon the interactions between three primary roles:
service provider, service registry, and service requestor. These roles interact using publish, find,
and bind operations. The service provider is the business that provides access to the Web
service and publishes the service description in a service registry. The service requestor finds
the service description in a service registry and uses the information in the description to bind
to a service. A logical view of the Web services architecture is shown in Figure 1. In this view of
the Web services architecture, the service registry provides a centralized location for storing
service descriptions. A UDDI registry is an example of this type of service registry.
Figure 1: Web services architecture
Although it is important, the centralized service registry is not the only model for Web service
discovery. The simplest form of service discovery is to request a copy of the service description
from the service provider. After receiving the request, the service provider can simply e‐mail
the service description as an attachment or provide it to the service requestor on transferable
media, such as a diskette. Although this type of service discovery is simple, it is not very
efficient since it requires prior knowledge of the Web service, as well as the contact information
for the service provider.
Between these two extremes, there is a need for a distributed service discovery method that
provides references to service descriptions at the service provider's point‐of‐offering. The Web
Services Inspection Language provides this type of distributed discovery method, by specifying
how to inspect a Web site for available Web services. The WS‐Inspection specification defines
the locations on a Web site where you could look for Web service descriptions.
Since the Web Services Inspection Language focuses on distributed service discovery, the WS‐
Inspection specification complements UDDI by facilitating the discovery of services available on
Web sites but not yet listed in a UDDI registry. Additional information on the
relationship between the Web Services Inspection Language and UDDI can be found in The
WS‐Inspection and UDDI Relationship.
Inspection overview
The WS‐Inspection specification does not define a service description language. WS‐Inspection
documents provide a method for aggregating different types of service descriptions. Within a
WS‐Inspection document, a single service can have more than one reference to a service
description. For example, a single Web service might be described using both a WSDL file and
within a UDDI registry. References to these two service descriptions should be put into a WS‐
Inspection document. If multiple references are available, it is beneficial to put all of them in
the WS‐Inspection document so that the document consumer can select the type of service
description that they are capable of understanding and want to use. Figure 2 provides an
overview of how WS‐Inspection documents are used.
Figure 2: WS‐Inspection overview
The WS‐Inspection specification provides two primary functions, which are discussed in more
detail in the next two sections.
• It defines an XML format for listing references to existing service descriptions.
• It defines a set of conventions so that it is easy to locate WS‐Inspection documents.
WS‐Inspection document format
A WS‐Inspection document provides an aggregation of references to service descriptions. These
service descriptions can be defined in any service description format, such as WSDL, UDDI, or
plain HTML. As mentioned previously, a WS‐Inspection document is generally made available at
the point‐of‐offering for the services that are referenced within the document.
A WS‐Inspection document can contain a list of references to service descriptions, as well as
references to other WS‐Inspection documents. A WS‐Inspection document will contain one or
more <service> and <link> elements. A <service> element will contain one or more references
to different types of service descriptions for the same Web service. The <link> element may
contain references to only one type of service description, but these service descriptions do not
have to reference the same Web service.
Listing 1 contains a simple example of a WS‐Inspection document. This example contains two
references to different service descriptions, and a single reference to another WS‐Inspection
document. The first <service> element contains only one service description, and it is a
reference to a WSDL document. The second <service> element also contains only one service
description reference. This reference is to a business service entry in a UDDI registry. The UDDI
service key identifies one unique business service. The UDDI service reference also contains
extensibility elements which are discussed in the next section. The <link> element is used to
reference a collection of service descriptions. In this case, it is referencing another WS‐
Inspection document.
<?xml version="1.0"?>
<inspection xmlns="http://schemas.xmlsoap.org/ws/2001/10/inspection/"
            xmlns:wsiluddi="http://schemas.xmlsoap.org/ws/2001/10/inspection/uddi/">
  <service>
    <description referencedNamespace="http://schemas.xmlsoap.org/wsdl/"
                 location="http://example.com/exampleservice.wsdl" />
  </service>
  <service>
    <description referencedNamespace="urn:uddi-org:api">
      <wsiluddi:serviceDescription location="http://example.com/uddi/inquiryapi">
        <wsiluddi:serviceKey>52946BB0-BC28-11D5-A432-0004AC49CC1E</wsiluddi:serviceKey>
      </wsiluddi:serviceDescription>
    </description>
  </service>
  <link referencedNamespace="http://schemas.xmlsoap.org/ws/2001/10/inspection/"
        location="http://example.com/tools/toolservices.wsil"/>
</inspection>
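To make the document format concrete, here is a minimal Python sketch that reads a WS-Inspection document and lists its service description references and its links. It uses only the standard library; the function name summarize is invented for this example, and the embedded document is a shortened copy of Listing 1 (the UDDI entry is left out to keep the sketch small).

```python
import xml.etree.ElementTree as ET

WSIL_NS = "http://schemas.xmlsoap.org/ws/2001/10/inspection/"
NS = {"wsil": WSIL_NS}

WSIL_XML = """<?xml version="1.0"?>
<inspection xmlns="http://schemas.xmlsoap.org/ws/2001/10/inspection/">
  <service>
    <description referencedNamespace="http://schemas.xmlsoap.org/wsdl/"
                 location="http://example.com/exampleservice.wsdl"/>
  </service>
  <link referencedNamespace="http://schemas.xmlsoap.org/ws/2001/10/inspection/"
        location="http://example.com/tools/toolservices.wsil"/>
</inspection>
"""


def summarize(wsil_text):
    """Return (descriptions, links) as lists of (referencedNamespace, location)."""
    root = ET.fromstring(wsil_text)
    descriptions = [
        (d.get("referencedNamespace"), d.get("location"))
        for d in root.findall("wsil:service/wsil:description", NS)
    ]
    links = [
        (link.get("referencedNamespace"), link.get("location"))
        for link in root.findall("wsil:link", NS)
    ]
    return descriptions, links


descriptions, links = summarize(WSIL_XML)
print("descriptions:", descriptions)
print("links:", links)
```

The referencedNamespace value tells a consumer what kind of document each location points to, so it can skip description types it does not understand.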
WS‐Inspection document extensibility
The WS‐Inspection specification does not limit the type of service descriptions that can be
referenced. Both the <description> and <link> element may contain extensibility elements that
represent information for a specific service description technology. The WS‐Inspection
specification defines a set of standard extensibility elements for both WSDL and UDDI. Since the
<description> element is used to reference a single service description and the <link> element
is used to reference one or more sets of service descriptions, any extensibility elements that are
defined for these elements should follow this same pattern.
WSDL service descriptions can only be referenced from within a <description> element. The
WSDL extensibility elements can be used to indicate whether or not the WSDL document
contains an endpoint specification. If there is more than one service element in the WSDL
document, then the <wsilwsdl:referencedService> element should be used to indicate which
one is associated with the entry in the WS‐Inspection document. One or more
<wsilwsdl:implementedBinding> elements may appear in a WSDL service description reference.
Each of these elements references a binding that is implemented by the WSDL document.
Listing 2 contains an example of a WS‐Inspection document that contains all of the WSDL
extensibility elements.
<?xml version="1.0"?>
<inspection xmlns="http://schemas.xmlsoap.org/ws/2001/10/inspection/"
            xmlns:wsilwsdl="http://schemas.xmlsoap.org/ws/2001/10/inspection/wsdl/">
  ...
  <service>
    <name xml:lang="en-US">StockQuoteService</name>
    <description referencedNamespace="http://schemas.xmlsoap.org/wsdl/"
                 location="http://localhost:8080/webservices/wsdl/stockquote/sqs.wsdl">
      <wsilwsdl:reference endpointPresent="true">
        <wsilwsdl:referencedService
            xmlns:tns="http://www.getquote.com/StockQuoteService">
          tns:StockQuoteService
        </wsilwsdl:referencedService>
        <wsilwsdl:implementedBinding
            xmlns:interface="http://www.getquote.com/StockQuoteService-interface">
          interface:StockQuoteServiceBinding
        </wsilwsdl:implementedBinding>
      </wsilwsdl:reference>
    </description>
  </service>
  ...
</inspection>
The UDDI extensibility elements may appear within either the <link> or <description> elements.
The elements used within the <link> element can only reference a UDDI business entity. Since
this element references a UDDI business entity, resolving this reference will result in one or
more service descriptions. The elements used within the <description> element may only
reference a single UDDI business service. Listing 3 contains an example of the UDDI bindings for
a WS‐Inspection document.
The <wsiluddi:businessDescription> element is used within a <link> element to specify a
reference to a UDDI business entity. The businessDescription element may contain either a
discoveryURL, or a businessKey, or both. If a businessKey is specified, then the location
attribute on the businessDescription element must contain an inquiry URL for a UDDI registry.
This URL is used to send a get_businessDetail message to the UDDI registry using the
businessKey that was specified.
The <wsiluddi:serviceDescription> element can only be used within the <description> element,
and can reference only one service description. Within the serviceDescription element, a
discoveryURL, a serviceKey, or both can be specified. The location attribute on the
serviceDescription must contain the inquiry URL for a UDDI registry, when the serviceKey is
specified.
For both the businessDescription and serviceDescription elements, if both the discoveryURL
and the businessKey or serviceKey are specified, then the person who is processing the WS‐
Inspection document can select which one they want to use. The discoveryURL will always
return a UDDI business entity. So when it is used with the serviceDescription element, the
serviceKey must be used to locate the individual service description within the business entity.
<?xml version="1.0"?>
<inspection targetNamespace="http://schemas.xmlsoap.org/ws/2001/10/inspection/"
            xmlns:wsiluddi="http://schemas.xmlsoap.org/ws/2001/10/inspection/uddi/"
            xmlns="http://schemas.xmlsoap.org/ws/2001/10/inspection/">
  <link referencedNamespace="urn:uddi-org:api">
    <wsiluddi:businessDescription location="http://www.getquote.com/uddi/inquiryapi">
      <wsiluddi:businessKey>3BF0ACC0-BC28-11D5-A432-0004AC49CC1E</wsiluddi:businessKey>
      <wsiluddi:discoveryURL useType="businessEntity">
        http://www.getquote.com/uddi?businessKey=3BF0ACC0-BC28-11D5-A432-0004AC49CC1E
      </wsiluddi:discoveryURL>
    </wsiluddi:businessDescription>
  </link>
  <service>
    <name>UDDI Service Description</name>
    <description referencedNamespace="urn:uddi-org:api">
      <wsiluddi:serviceDescription location="http://www.getquote.com/uddi/inquiryapi">
        <wsiluddi:serviceKey>52946BB0-BC28-11D5-A432-0004AC49CC1E</wsiluddi:serviceKey>
        <wsiluddi:discoveryURL useType="businessEntity">
          http://www.getquote.com/uddi?businessKey=3BF0ACC0-BC28-11D5-A432-0004AC49CC1E
        </wsiluddi:discoveryURL>
      </wsiluddi:serviceDescription>
    </description>
  </service>
</inspection>
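The rules for the businessDescription element can be sketched in Python as follows. The function name business_lookup_options is invented for this example, and the embedded document is a trimmed copy of the link entry from Listing 3. The sketch only lists the ways a consumer could resolve the reference; it does not contact a registry.

```python
import xml.etree.ElementTree as ET

NS = {
    "wsil": "http://schemas.xmlsoap.org/ws/2001/10/inspection/",
    "wsiluddi": "http://schemas.xmlsoap.org/ws/2001/10/inspection/uddi/",
}

WSIL_XML = """<inspection xmlns="http://schemas.xmlsoap.org/ws/2001/10/inspection/"
    xmlns:wsiluddi="http://schemas.xmlsoap.org/ws/2001/10/inspection/uddi/">
  <link referencedNamespace="urn:uddi-org:api">
    <wsiluddi:businessDescription location="http://www.getquote.com/uddi/inquiryapi">
      <wsiluddi:businessKey>3BF0ACC0-BC28-11D5-A432-0004AC49CC1E</wsiluddi:businessKey>
      <wsiluddi:discoveryURL useType="businessEntity">
        http://www.getquote.com/uddi?businessKey=3BF0ACC0-BC28-11D5-A432-0004AC49CC1E
      </wsiluddi:discoveryURL>
    </wsiluddi:businessDescription>
  </link>
</inspection>
"""


def business_lookup_options(wsil_text):
    """List (method, url, key) choices for each businessDescription in a link."""
    root = ET.fromstring(wsil_text)
    options = []
    for desc in root.findall("wsil:link/wsiluddi:businessDescription", NS):
        key = desc.findtext("wsiluddi:businessKey", default="", namespaces=NS).strip()
        url = desc.findtext("wsiluddi:discoveryURL", default="", namespaces=NS).strip()
        if key:
            inquiry_url = desc.get("location")
            if not inquiry_url:
                # A businessKey is only usable together with an inquiry URL.
                raise ValueError("businessKey given without a location attribute")
            options.append(("get_businessDetail", inquiry_url, key))
        if url:
            options.append(("discoveryURL", url, None))
    return options


options = business_lookup_options(WSIL_XML)
for option in options:
    print(option)
```

When both options are present, the consumer is free to pick either one, which is why the function returns a list instead of a single choice.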
Linking to WS‐Inspection documents
One important feature of the WS‐Inspection specification is the ability to link a WS‐Inspection
document to one or more different WS‐Inspection documents. This feature can be used to
manage service description references by grouping them into different documents. Using the
<link> element, a hierarchy of WS‐Inspection documents can be built using these individual
documents. For example, separate WS‐Inspection documents can be created for different
categories of services, and one primary WS‐Inspection document can link all of them together.
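A hierarchy of linked documents can be walked recursively. The sketch below is a hypothetical illustration in Python: the three documents, their URLs, and the function name collect_descriptions are all invented, and a dictionary lookup stands in for fetching documents over HTTP. The set of visited URLs guards against documents that link back to one another.

```python
import xml.etree.ElementTree as ET

WSIL_NS = "http://schemas.xmlsoap.org/ws/2001/10/inspection/"
WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"
NS = {"wsil": WSIL_NS}


def _doc(inner):
    return f'<inspection xmlns="{WSIL_NS}">{inner}</inspection>'


def _link(location):
    return f'<link referencedNamespace="{WSIL_NS}" location="{location}"/>'


def _service(location):
    return (f'<service><description referencedNamespace="{WSDL_NS}" '
            f'location="{location}"/></service>')


# Hypothetical documents keyed by URL; a stand-in for fetching over HTTP.
DOCUMENTS = {
    "http://example.com/inspection.wsil": _doc(
        _link("http://example.com/tools/inspection.wsil")
        + _link("http://example.com/billing/inspection.wsil")
    ),
    "http://example.com/tools/inspection.wsil": _doc(
        _service("http://example.com/tools/convert.wsdl")
    ),
    "http://example.com/billing/inspection.wsil": _doc(
        _service("http://example.com/billing/invoice.wsdl")
        + _link("http://example.com/inspection.wsil")  # link back to the primary document
    ),
}


def collect_descriptions(url, fetch, seen=None):
    """Return description locations reachable from url, following <link> elements."""
    seen = set() if seen is None else seen
    if url in seen:
        return []
    seen.add(url)
    root = ET.fromstring(fetch(url))
    found = [d.get("location") for d in root.findall("wsil:service/wsil:description", NS)]
    for link in root.findall("wsil:link", NS):
        # Only follow links that point at other WS-Inspection documents.
        if link.get("referencedNamespace") == WSIL_NS and link.get("location"):
            found.extend(collect_descriptions(link.get("location"), fetch, seen))
    return found


locations = collect_descriptions("http://example.com/inspection.wsil", DOCUMENTS.__getitem__)
print(locations)
```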
Finding WS‐Inspection documents
The second primary function provided by the WS‐Inspection specification is to define the
locations where WS‐Inspection documents can be accessed. Two conventions were created to
make the location and retrieval of WS‐Inspection documents easy:
• Fixed name WS‐Inspection documents.
• Linked WS‐Inspection documents.
The fixed name for WS‐Inspection documents is inspection.wsil. A document with this name can
be placed at common entry points for a Web site. For example, if the common entry point is
http://example.com or http://example.com/services, then the location of the WS‐Inspection
document would be http://example.com/inspection.wsil or
http://example.com/services/inspection.wsil, respectively.
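The fixed-name convention amounts to appending inspection.wsil to each common entry point. A minimal Python sketch, with the function name inspection_url invented for this example:

```python
from urllib.parse import urljoin

FIXED_NAME = "inspection.wsil"


def inspection_url(entry_point):
    """Return the fixed-name WS-Inspection document URL for a site entry point."""
    # Treat the entry point as a directory so urljoin appends instead of replacing.
    base = entry_point if entry_point.endswith("/") else entry_point + "/"
    return urljoin(base, FIXED_NAME)


for entry in ("http://example.com", "http://example.com/services"):
    print(inspection_url(entry))
```

The trailing slash matters: without it, urljoin would replace the last path segment (services) instead of appending to it.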
References to WS‐Inspection documents may also appear within different content documents,
such as HTML pages. When putting entries in an HTML page, a META tag may be used to convey
the location of a WS‐Inspection document. Listing 4 contains an example of an HTML page that
contains the same WS‐Inspection document references listed above. The HTML page that
contains these references should be one that is widely visited. This could be the root document for a Web
server, or it could be a Web page that describes, in a human readable format, one or more Web
services that appear in the WS‐Inspection document.
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<META name="serviceInspection" content="http://example.com/inspection.wsil">
<META name="serviceInspection" content="http://example.com/services/inspection.wsil">
</head>
...
</html>
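A consumer can pull these references out of a page with an ordinary HTML parser. The following Python sketch uses the standard library html.parser module; the class name ServiceInspectionFinder is invented for this example, and the page is a well-formed version of Listing 4 with an empty body added.

```python
from html.parser import HTMLParser


class ServiceInspectionFinder(HTMLParser):
    """Collect the content of every <META name="serviceInspection"> tag."""

    def __init__(self):
        super().__init__()
        self.locations = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <META> arrives as "meta".
        if tag != "meta":
            return
        attributes = dict(attrs)
        if attributes.get("name") == "serviceInspection" and attributes.get("content"):
            self.locations.append(attributes["content"])


PAGE = """<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<META name="serviceInspection" content="http://example.com/inspection.wsil">
<META name="serviceInspection" content="http://example.com/services/inspection.wsil">
</head>
<body></body>
</html>
"""

finder = ServiceInspectionFinder()
finder.feed(PAGE)
finder.close()
print(finder.locations)
```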
Web Services Toolkit support for the WS‐Inspection specification
The Web Services Toolkit includes integrated support for the Web Services Inspection
Language. This support includes a demonstration of how to use WS‐Inspection documents, and
a Java API that allows you to parse existing WS‐Inspection documents and programmatically
create new documents.
Most of the toolkit demos provide an option to use WS‐Inspection technology or UDDI as the
service discovery mechanism. If the WS‐Inspection option is used, the demos request a WS‐
Inspection document from the Web server configured using the toolkit configuration utility.
This request is submitted using the fixed name for the WS‐Inspection document. This document
name is set up to invoke a Java servlet. This servlet will dynamically create the WS‐Inspection
document, by searching for WSDL service description documents within the toolkit directory
structure.
Figure 3 contains an overview of this process:
1. The WS‐Inspection document proxy is used to request the contents of the WS‐
Inspection document using a fixed name.
2. The URL that is used to retrieve the WS‐Inspection document maps to a servlet. This
servlet will search through the local filesystem for all WSDL service descriptions. A
reference to each service description will be put into the WS‐Inspection document.
3. The dynamically generated WS‐Inspection document is returned to the client.
Figure 3: WS‐Inspection document support in the Web Services Toolkit
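Step 2 of this process can be approximated in a few lines. The toolkit servlet itself is written in Java and its source is not shown in these notes, so the following Python sketch is only an analogy under stated assumptions: the function name build_inspection, the temporary directory layout, and the base URL mapping are invented for illustration.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path

WSIL_NS = "http://schemas.xmlsoap.org/ws/2001/10/inspection/"
WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"


def build_inspection(root_dir, base_url):
    """Return a WS-Inspection document referencing every *.wsdl file under root_dir."""
    root_dir = Path(root_dir)
    inspection = ET.Element(f"{{{WSIL_NS}}}inspection")
    for wsdl in sorted(root_dir.rglob("*.wsdl")):
        # Map the file's path relative to root_dir onto the public base URL.
        relative = wsdl.relative_to(root_dir).as_posix()
        service = ET.SubElement(inspection, f"{{{WSIL_NS}}}service")
        ET.SubElement(
            service,
            f"{{{WSIL_NS}}}description",
            {
                "referencedNamespace": WSDL_NS,
                "location": base_url.rstrip("/") + "/" + relative,
            },
        )
    return ET.tostring(inspection, encoding="unicode")


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "wsdl" / "stockquote"
    target.mkdir(parents=True)
    (target / "sqs.wsdl").write_text("<definitions/>")
    document = build_inspection(tmp, "http://localhost:8080/webservices")

print(document)
```

Generating the document on each request means newly deployed WSDL files show up without anyone editing inspection.wsil by hand.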
Listing 5 contains a portion of the WS‐Inspection document that is returned by the WS‐
Inspection servlet. The service name is set from the name attribute on the definitions element
within the WSDL document. The entry that appears in this listing is for the stock quote demo.
<?xml version="1.0"?>
<inspection xmlns="http://schemas.xmlsoap.org/ws/2001/10/inspection/"
            xmlns:wsilwsdl="http://schemas.xmlsoap.org/ws/2001/10/inspection/wsdl/">
  ...
  <service>
    <name xml:lang="en-US">StockQuoteService</name>
    <description referencedNamespace="http://schemas.xmlsoap.org/wsdl/"
                 location="http://localhost:8080/webservices/wsdl/stockquote/sqs.wsdl">
      <wsilwsdl:reference endpointPresent="true">
        <wsilwsdl:implementedBinding
            xmlns:interface="http://www.getquote.com/StockQuoteService-interface">
          interface:StockQuoteServiceBinding
        </wsilwsdl:implementedBinding>
      </wsilwsdl:reference>
    </description>
  </service>
  ...
</inspection>
Using the Web Services Inspection Language for Java API
The Web Services Inspection Language for Java API (WSIL4J) provides a Java interface, which
can be used to parse existing WS‐Inspection documents or programmatically create new WS‐
Inspection documents. Most of the WSIL4J classes represent the elements that can appear in a
WS‐Inspection document. For example, the <inspection> element is represented by the
Inspection class, and the <service> element is represented by the Service class. There are also
utility classes that make it easy to read and parse a WS‐Inspection document, as well as write
out the contents of the WSIL4J objects as an XML document.
Listing 6 contains an example of how to use this API. In this sample code, a WS‐Inspection
document is read and the service elements are searched for references to WSDL service
descriptions. When a WSDL service description is found, its location is saved in a list which is
displayed on the console. You can view and download the complete WSInspectionExample
application (see Resources). If you have installed the toolkit, you can use the wstkenv command
to set up the classpath that is needed to compile and run these examples. This command is
located in the WSTK bin directory. The purpose of this command is to define a set of
environment variables. One of the environment variables is named WSTK_CP. This environment
variable contains the classpath that is required to compile and run the examples.
...
// Create a new instance of a WS‐Inspection document
WSILDocument document = WSILDocument.newInstance();
// Read and parse the WS‐Inspection document
document.read(wsinspectionURL);
// Get the inspection element from the document
Inspection inspection = document.getInspection();
// Obtain a list of all service elements
Service[] services = inspection.getServices();
// Display purpose of list
System.out.println("Display list of WSDL service description references...");
// Process each service element to find all WSDL document references
for (int serviceCount = 0; serviceCount < services.length; serviceCount++)
{
// Get the next set of description elements
descriptions = services[serviceCount].getDescriptions();
// Process each description to find the WSDL references
for (int descCount = 0; descCount < descriptions.length; descCount++)
{
// If the referenced namespace is for WSDL, then save the location reference
if (descriptions[descCount].getReferencedNamespace().equals(WSDLConstants.NS_URI_WSDL))
{
// Add WSDL location to the list
wsdlList.add(descriptions[descCount].getLocation());
}
}
// If this service has WSDL service descriptions, then display the list
if (wsdlList.size() > 0)
{
// Get service name
serviceName = (services[serviceCount].getServiceNames().length == 0) ?
"[no service name]" : services[serviceCount].getServiceNames()[0].getText();
// Display service name
System.out.println(" Service: " + serviceName);
// Display list
Iterator iterator = wsdlList.iterator();
for (int count = 1; iterator.hasNext(); count++)
{
System.out.println(" [" + count + "] " + ((String) iterator.next()));
}
}
// Clear the list
wsdlList.clear();
}
...
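Without the toolkit on the classpath, the same search can be sketched with the JDK's built‐in DOM parser alone. The sketch below is not WSIL4J: the class and method names (WsdlReferenceFinder, findWsdlLocations) are illustrative assumptions, and the sample document is a trimmed version of the service entries shown in this section. It collects the location attribute of every description element whose referencedNamespace is the WSDL namespace.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of WSIL4J: plain DOM parsing of a WS-Inspection document.
final class WsdlReferenceFinder {
    static final String WSIL_NS = "http://schemas.xmlsoap.org/ws/2001/10/inspection/";
    static final String WSDL_NS = "http://schemas.xmlsoap.org/wsdl/";

    /** Returns the location of every description that references the WSDL namespace. */
    static List<String> findWsdlLocations(String inspectionXml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // needed to match elements by namespace
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(inspectionXml)));

        List<String> locations = new ArrayList<>();
        NodeList descriptions = doc.getElementsByTagNameNS(WSIL_NS, "description");
        for (int i = 0; i < descriptions.getLength(); i++) {
            Element description = (Element) descriptions.item(i);
            // Keep only the descriptions that point at WSDL documents
            if (WSDL_NS.equals(description.getAttribute("referencedNamespace"))) {
                locations.add(description.getAttribute("location"));
            }
        }
        return locations;
    }

    public static void main(String[] args) throws Exception {
        String sample =
              "<inspection xmlns=\"" + WSIL_NS + "\">"
            + "<service>"
            + "<description referencedNamespace=\"" + WSDL_NS + "\""
            + " location=\"http://localhost:8080/webservices/wsdl/stockquote/sqs.wsdl\"/>"
            + "<description referencedNamespace=\"http://www.xmethods.net/\"/>"
            + "</service>"
            + "</inspection>";
        for (String location : findWsdlLocations(sample)) {
            System.out.println("WSDL reference: " + location);
        }
    }
}
```

Unlike the WSIL4J version, this sketch does not group the references by service; it only shows that the referencedNamespace attribute is what distinguishes a WSDL reference from any other description.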
Using the WS‐Inspection proxy
The WSIL4J API also provides a WSILProxy class which can be used to easily access certain types
of information within a WS‐Inspection document. The proxy interface reads the
WS‐Inspection document and then lets you directly access the WSDL documents or UDDI
business services that you need. Listing 7 contains a portion of an application that shows how
to use the WS‐Inspection proxy to get a list of WSDL documents for a given service name. You
can also view and download the complete WSInspectionProxyExample application (see
Resources).
...
// Create a new instance of a WS‐Inspection document proxy
WSILProxy proxy = new WSILProxy(wsinspectionURL);
// Get all of the WSDL documents using the input service name
WSDLDocument[] wsdlDocuments =
proxy.getWSDLDocumentByServiceName(serviceName);
// Display purpose of list
System.out.println("Display contents of WSDL service "
+ "description documents for service name [" + serviceName + "]...");
// Process each WSDL document reference
for (int wsdlCount = 0; wsdlCount < wsdlDocuments.length; wsdlCount++)
{
// Display contents of the document
System.out.println("[" + wsdlCount + "] "
+ wsdlDocuments[wsdlCount].serializeToXML());
}
...
XMethods usage of WS‐Inspection
One example of a Web site that has implemented a WS‐Inspection interface is the XMethods site,
which provides a list of publicly available Web services. The main page for the XMethods site is
http://www.xmethods.net, and the WS‐Inspection interface is accessible at
http://www.xmethods.net/inspection.wsil.
This WS‐Inspection document contains a list of all of the Web service descriptions that are
listed at this Web site. This document also shows how to use WS‐Inspection extension
elements. Listing 8 shows a portion of the WS‐Inspection document. The XMethods defined
extension elements are identified by the wsilxmethods namespace prefix.
<?xml version="1.0"?>
<inspection xmlns="http://schemas.xmlsoap.org/ws/2001/10/inspection/" ...>
<service>
<abstract>Get a random lyrical phrase from one of the world's
best singer/songwriters, Neil Finn</abstract>
<description referencedNamespace='http://schemas.xmlsoap.org/wsdl/'
location='http://www.nickhodge.com/nhodge/finnwords/finnwords.wsdl'/>
<description referencedNamespace='http://www.xmethods.net/'>
<wsilxmethods:serviceDetailPage
location='http://www.xmethods.net/ve2/ViewListing.po?serviceid=90601'>
<wsilxmethods:serviceID>90601</wsilxmethods:serviceID>
</wsilxmethods:serviceDetailPage>
</description>
</service>
...
</inspection>
WSIL4J contribution to Apache Axis
The latest version of WSIL4J has been contributed to the Apache Software Foundation. The
WSIL4J source code will be worked on under the Apache XML project as a part of the Apache
Axis work. This is important since it will allow the open source community to extend the
capabilities associated with WS‐Inspection.
This contribution should also help promote the implementation of a WS‐Inspection interface in
Apache Axis. This interface would return the current list of services deployed using Axis in the
format of a WS‐Inspection document. For example, if you invoke Axis‐based services at
http://hostname:80/axis/services, then http://hostname:80/axis/inspection.wsil might be the
location where you could get a WS‐Inspection document that contained a list of deployed
services.
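The relationship between the two example URLs can be sketched with java.net.URI. This is only an illustration of the convention suggested above: the source says the document "might be" published at that location, and the class and method names here are assumptions.

```java
import java.net.URI;

// Hypothetical helper: derives the suggested inspection document URL from a services URL.
final class InspectionLocation {
    // Assumed document name, taken from the example URL in the text
    static final String DOCUMENT_NAME = "inspection.wsil";

    /** Replaces the last path segment of the services URL with the document name. */
    static URI guessInspectionUri(URI servicesUri) {
        // Relative resolution drops the final segment ("services") before appending.
        // A services URL that ends in "/" would keep that segment instead.
        return servicesUri.resolve(DOCUMENT_NAME);
    }

    public static void main(String[] args) {
        URI services = URI.create("http://hostname:80/axis/services");
        System.out.println(guessInspectionUri(services));
    }
}
```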
Summary
The Web Services Inspection Language provides a simple, distributed service
discovery method for any type of Web service description document. The WS‐Inspection
technology is complementary to existing service discovery methods, such as UDDI, because it
defines a process for inspecting a Web site for service descriptions.
We are now beginning to see real world usages of WS‐Inspection for service discovery. The WS‐
Inspection interface for the Web service descriptions listed on the XMethods Web site is just
one example. Based on the contribution of WSIL4J to Apache Axis, we may also see a WS‐
Inspection interface for it soon. In the future, we may also see this technology used for other
applications, such as a Web service crawler. A service crawler would search through Web sites
for WS‐Inspection documents and then aggregate the service description references from
multiple sites. Both the current and future applications of this technology show that the Web
Services Inspection Language is an important part of the overall Web services usage model.
Adhoc Discovery
UNIT IV
B2B & B2C Applications
Different Types of B2B Interaction
Components of E‐Business XML Systems
ebXML
RosettaNet
Applied XML in Vertical Industry
Web Services for Mobile Devices
B2B & B2C Applications
B2B is the use of web‐based technologies to conduct business between two or more
companies. B2B applications support electronic trade among suppliers, customers, and business
partners. B2B applications are used to:
* Increase supply chain efficiency at lower cost
* Improve customer service
* Provide total supply chain management (SCM), from the initial ordering process to the
distribution of the final product
The most obvious difference between B2B and B2C is the customer requirement. B2C focuses
on individual customer transactions, whereas B2B treats other businesses as the customer, so
the needs of the two kinds of application differ.
B2B transaction
B2B stands for business‐to‐business: transactions in goods or services that are carried out
between businesses.
Such B2B relationships can exist when one category of products or services is exchanged.
For example, an enterprise dealing in a particular product range will transact with a
bulk buyer limited to the same category.
Formerly, this term was used to describe electronic communication between businesses
and enterprises, to distinguish it from B2C transactions, but now it is also used for marketing
purposes.
Formerly the term tended to describe industrial marketing or capital goods marketing only.
However, today it is widely used to describe all products and services used by enterprises.
B2B applications are essentially used when large processes are required between trading partners,
and in much higher volumes than business‐to‐consumer (B2C) applications.
For B2B, standards such as UN/EDIFACT and ANSI ASC X12 are the most widely used.
B2C transaction
B2C is a Business‐to‐consumer transaction. It is a form of electronic commerce in which
products or services are sold from a firm/business or company to a consumer directly.
Advantages of B2C application
B2C e‐commerce has the following advantages:
Purchasing is faster and more convenient.
Prices change instantaneously as per the market requirements.
Challenges faced by B2C application
The difficulties faced by B2C e‐commerce are building traffic and retaining customers. It is
more difficult for smaller firms to enter the market, sustain themselves, and remain
competitive. It is also very difficult for them to acquire new customers online, because
customers have to be attracted every time with price discounts.
Difference between B2B and B2C
Customer requirement: B2C emphasizes individual customer transactions, while B2B
focuses on other businesses as the customer. This difference generates different needs for B2B
applications.
Type of order: Repeat and standing orders are a common requirement of B2B
transactions, whereas the opposite is true of B2C transactions.
Type of payment: The type of payment also differs in B2B. Purchases are made using
varied forms of payment such as lines of credit and open orders, and B2B applications must
be specially designed to support these requirements.
Type of search function: The type of search function also differs; a B2B application
does not necessarily require a catalog.
Type of connection: In B2B dealings, purchases are made over a connection to
one partner or to several trusted partners. Since the dealings are generally with a static list of
trading partners, virtual private network (VPN) technology can be used to provide secure access to
selected applications inside the firewall. This avoids the need to duplicate the data and
applications outside the firewall.
Complexity: B2B marketing is generally considered more complex and difficult than B2C
marketing because, from the buyer's perspective, more than one decision‐maker is often
involved in a B2B sale.
Business‐to‐business (B2B) describes commerce transactions between businesses, such as
between a manufacturer and a wholesaler, or between a wholesaler and a retailer. Contrasting
terms are business‐to‐consumer (B2C) and business‐to‐government (B2G).
The volume of B2B transactions is much higher than the volume of B2C transactions. The
primary reason for this is that in a typical supply chain there will be many B2B transactions
involving subcomponents or raw materials, and only one B2C transaction, specifically the sale of the
finished product to the end customer. For example, an automobile manufacturer makes several
B2B transactions such as buying tires, glass for windscreens, and rubber hoses for its vehicles.
The final transaction, a finished vehicle sold to the consumer, is a single (B2C) transaction.
B2B
On the Internet, B2B (business‐to‐business), also known as e‐biz, is the exchange of
products, services, or information between businesses rather than between businesses and
consumers. Although early interest centered on the growth of retailing on the Internet
(sometimes called e‐tailing), forecasts are that B2B revenue will far exceed business‐to‐
consumers (B2C) revenue in the near future. According to studies published in early 2000, the
money volume of B2B exceeds that of e‐tailing by 10 to 1. Over the next five years, B2B is
expected to have a compound annual growth of 41%. The Gartner Group estimates B2B
revenue worldwide to be $7.29 trillion by 2004. In early 2000, the volume of investment
in B2B by venture capitalists was reported to be accelerating sharply although profitable B2B
sites were not yet easy to find.
B2B Web sites can be sorted into:
• Company Web sites, since the target audience for many company Web sites is other
companies and their employees. Company sites can be thought of as round‐the‐clock
mini‐trade exhibits. Sometimes a company Web site serves as the entrance to an
exclusive extranet available only to customers or registered site users. Some company
Web sites sell directly from the site, effectively e‐tailing to other businesses.
• Product supply and procurement exchanges, where a company purchasing agent can
shop for supplies from vendors, request proposals, and, in some cases, bid to make a
purchase at a desired price. Sometimes referred to as e‐procurement sites, some serve
a range of industries and others focus on a niche market.
• Specialized or vertical industry portals which provide a "subWeb" of information,
product listings, discussion groups, and other features. These vertical portal sites have a
broader purpose than the procurement sites (although they may also support buying
and selling).
• Brokering sites that act as an intermediary between someone wanting a product or
service and potential providers. Equipment leasing is an example.
• Information sites (sometimes known as infomediary), which provide information about
a particular industry for its companies and their employees. These include specialized
search sites and trade and industry standards organization sites.
Many B2B sites may seem to fall into more than one of these groups. Models for B2B sites are
still evolving.
Another type of B2B enterprise is software for building B2B Web sites, including site building
tools and templates, database, and methodologies as well as transaction software.
B2B is e‐commerce between businesses. An earlier and much more limited kind of online B2B
prior to the Internet was Electronic Data Interchange (EDI), which is still widely used.
What is B2C E‐Commerce?
B2C (Business‐to‐Consumer) is basically a concept of online marketing and distribution of
products and services over the Internet. It is a natural progression for many retailers and
marketers who sell directly to the consumer. The general idea is that if you can reach more
customers, serve them better, and make more sales while spending less to do it, you have the
formula for success in implementing a B2C e‐commerce infrastructure.
For the consumer, it is relatively easy to appreciate the importance of e‐commerce. Why waste
time fighting the very real crowds in supermarkets when, from the comfort of home, one can
shop on‐line at any time in virtual Internet shopping malls and have the goods delivered
directly to one's home?
Who should use B2C E‐Commerce?
• Manufacturers ‐ to sell to retail and business buyers
• Distributors ‐ to take orders from the merchants they supply
• Publishers ‐ to sell subscriptions and books
• Direct Sales Firms ‐ as another channel to reach buyers
• Entertainment Firms ‐ to promote new products and sell copies
• Information Providers ‐ to take payment for downloaded materials
• Specialty Retailers ‐ niche marketers of products ranging from candles and coffees to
specialty foods and books use it to broaden their customer reach
• Insurance Firms ‐ on‐line rate quotes and premium payments have made it easier for
this industry to attract and retain customers
In fact, virtually any business that can deliver its products or provide its services outside
its doors is a potential user.
Advantages Of E‐Business Applications:
Catalog flexibility and Online fast updating
• Direct "link" capabilities to content information and visual displays already existing on
other client web sites. You can update your E‐Catalog anytime, whether by adding new
products or adjusting prices, without the expense and time of a traditional print
catalog.
• Extensive search capabilities by item, corporate name, division name, location,
manufacturer, partner, price or any other specified need.
Shrinks the Competition Gap
• Reduced marketing/advertising expenses, compete on equal footing with much bigger
companies; easily compete on quality, price, and availability.
Unlimited Market Place and Business Access Which Extend Customer Base
• The Internet gives customers the opportunity to browse and shop at their convenience
and at their place. They can access your services from home, office, or on the road, 24
hours a day, 7 days a week.
• The Internet allows you to reach people around the world, offering your products to a
global customer base.
A 24‐Hour Store and a Reduced Sales Cycle
• Reduce unnecessary phone calls and mailings.
Lower Cost of Doing Business
• Reduce inventory, employees, purchasing costs, order processing costs associated with
faxing, phone calls, and data entry, and even eliminate physical stores. Reduce
transaction costs.
Eliminate Middlemen
• Sell directly to your customers.
Easier Business Administration
• With the right software, store inventory levels, shipping and receiving logs, and other
business administration tasks can be automatically stored, categorized and updated in
real‐time, and accessed on demand.
Frees Your Staff
• Reduce customer service and sales support.
Customers will love it
• Gives customers control of sales process. Builds loyalty.
More Efficient Business Relationships
• Better way to deal with dealers and suppliers.
Workflow automation
• Shipping, real time inventory accounting system which adjusts stock levels and site,
location availability instantaneously
• Secured, automated registration verification, account entry and transaction
authorization features
• Automated RFP and RTQ features for vendor bid development and selection.
• Banking and accounting features customized for pre‐approved third party direct sales,
vendor, consignment or internal transfer transactions.
Secure Payment Systems
• Recent advancements in payment technologies allow encrypted, secure payment online.
Disadvantages Of Traditional Business Applications:
Catalog Inflexibility
• The catalog has to be regenerated every time new information or items are added.
High Marketing / Advertising Expenses
• Marketing and advertising expenses are high, which makes it hard to compete on equal
footing with much bigger companies on quality, price, and availability.
Limited Market Place
• Customers are normally local and limited to a certain area.
High Sale Cycle
• Usually, a lot of phone calls and mailings are needed.
Higher Cost of Doing Business
• Costs arise from inventory, employees, purchasing, and order processing
associated with faxing, phone calls, and data entry, and even from physical stores.
These in turn increase transaction costs.
May Require Middlemen
• Some sales or transactions may take place indirectly, going through a third party before
reaching your customers.
Inefficient Business Administration
• Store inventory levels, shipping and receiving logs, and other business administration
tasks may have to be categorized and updated manually, and only when time
permits. As a result, the information may not be up to date.
Need to Employ a Number of Staff
• Staff are needed to provide customer service and sales support.
XML agreements
XML provides a standard framework for making agreements about communication:
Industry DTDs
Industry schemas
Industry namespaces
But it doesn't make those agreements by itself!
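What such an agreement buys in practice can be sketched with the JDK's validating parser: once two partners have agreed on a DTD, either side can mechanically check a document against it. The DTD below (order, buyer, item) is a made‐up stand‐in for an industry DTD, and the class name AgreedDtdCheck is likewise an assumption for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical example: checks a document against a DTD that two partners agreed on.
final class AgreedDtdCheck {
    // Made-up DTD standing in for an industry DTD
    static final String ORDER_DTD =
          "<!DOCTYPE order [\n"
        + "  <!ELEMENT order (buyer, item+)>\n"
        + "  <!ELEMENT buyer (#PCDATA)>\n"
        + "  <!ELEMENT item (#PCDATA)>\n"
        + "]>\n";

    /** Returns true if the order document is well-formed and valid against ORDER_DTD. */
    static boolean conforms(String orderXml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true); // turn on DTD validation
        DocumentBuilder builder = factory.newDocumentBuilder();
        // Treat validity errors as failures instead of letting the parser only report them
        builder.setErrorHandler(new ErrorHandler() {
            @Override public void warning(SAXParseException e) { }
            @Override public void error(SAXParseException e) throws SAXException { throw e; }
            @Override public void fatalError(SAXParseException e) throws SAXException { throw e; }
        });
        try {
            builder.parse(new InputSource(new StringReader(ORDER_DTD + orderXml)));
            return true;
        } catch (SAXException invalid) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        String good = "<order><buyer>ComputerCompany</buyer><item>processor</item></order>";
        String bad = "<order><item>processor</item></order>"; // buyer is missing
        System.out.println("good conforms: " + conforms(good));
        System.out.println("bad conforms: " + conforms(bad));
    }
}
```

The parser can enforce the agreement, but the partners still have to decide between themselves what the DTD contains.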
Example: the eCo Architecture
Ecommerce definition and types of ecommerce
Ecommerce (e‐commerce) or electronic commerce, a subset of ebusiness, is the purchasing,
selling, and exchanging of goods and services over computer networks (such as the Internet)
through which transactions or terms of sale are performed electronically. Contrary to popular
belief, ecommerce is not just on the Web. In fact, ecommerce was alive and well in business to
business transactions before the Web back in the 70s via EDI (Electronic Data Interchange)
through VANs (Value‐Added Networks). Ecommerce can be broken into four main categories:
B2B, B2C, C2B, and C2C.
• B2B (Business‐to‐Business)
Companies doing business with each other such as manufacturers selling to distributors
and wholesalers selling to retailers. Pricing is based on quantity of order and is often
negotiable.
• B2C (Business‐to‐Consumer)
Businesses selling to the general public, typically through catalogs utilizing shopping cart
software. By dollar volume, B2B takes the prize; however, B2C is really what the average
Joe has in mind with regard to ecommerce as a whole.
Having a hard time finding a book? Need to purchase a custom, high‐end computer
system? How about a first class, all‐inclusive trip to a tropical island? With the advent
of ecommerce, all three things can be purchased literally in minutes without human
interaction. Oh how far we've come!
• C2B (Consumer‐to‐Business)
A consumer posts his project with a set budget online and within hours companies
review the consumer's requirements and bid on the project. The consumer reviews the
bids and selects the company that will complete the project. Elance empowers
consumers around the world by providing the meeting ground and platform for such
transactions.
• C2C (Consumer‐to‐Consumer)
There are many sites offering free classifieds, auctions, and forums where individuals
can buy and sell thanks to online payment systems like PayPal where people can send
and receive money online with ease. eBay's auction service is a great example of where
person‐to‐person transactions have taken place every day since 1995.
Companies using internal networks to offer their employees products and services online (not
necessarily on the Web) are engaging in B2E (Business‐to‐Employee) ecommerce.
G2G (Government‐to‐Government), G2E (Government‐to‐Employee), G2B (Government‐to‐
Business), B2G (Business‐to‐Government), G2C (Government‐to‐Citizen), and C2G (Citizen‐to‐
Government) are other forms of ecommerce that involve transactions with the government,
from procurement to filing taxes to business registrations to renewing licenses. There are other
categories of ecommerce out there, but they tend to be superfluous.
Different types of B2B interactions
Business‐to‐Business (B2B) technologies pre‐date the Web. They have existed for at least as
long as the Internet. B2B applications were among the first to take advantage of advances in
computer networking. The Electronic Data Interchange (EDI) business standard is an illustration
of such an early adoption of the advances in computer networking. The ubiquity and the
affordability of the Web have made it possible for the masses of businesses to automate their
B2B interactions. However, several issues related to scale, content exchange, autonomy,
heterogeneity, and other concerns still need to be addressed. In this paper, we survey the main
techniques, systems, products, and standards for B2B interactions. We propose a set of criteria
for assessing the different B2B interaction techniques, standards, and products.
The Web offers a unique opportunity for E‐commerce to take center stage in the fast‐growing
online economy [8,16,28]. With the advent of the Web, the first generation of Web‐based
E‐commerce was born: Business‐to‐Customer (B2C) applications. Examples of B2C applications
include virtual malls, customized news delivery, traffic monitoring, and route planning. Another,
quieter E‐commerce revolution with far more dramatic economic implications has been taking
place away from the spotlight: Business‐to‐Business (B2B) E‐commerce. Examples of B2B
applications include procurement, Customer Relationship Management (CRM), billing, accounting,
human resources, supply chain, and manufacturing. B2B E‐commerce far exceeds B2C E‐commerce
both in the volume of transactions and in rate of growth [34]. Despite the dot‐com debacle that
shook the US economy, B2B E‐commerce is still strong, and predictions agree that its future
looks even brighter [34]. While B2B E‐commerce has been around for at least as long as the
Internet, it reached its full potential with the emergence of the Web as a conduit for efficient
B2B transacting. Numerous organizations started using the Web as a means to automate
relationships with their business partners. This has elicited the formation of alliances in which
businesses joined their applications, databases, and systems to share costs, skills, and resources
in offering value‐added services. The ultimate goal of B2B E‐commerce is therefore to have
inter‐ and intra‐enterprise applications evolve independently, yet allow them to effectively and
conveniently use each other’s functionality.
An important challenge in B2B E‐commerce is interaction. Interaction is defined as consisting of
interoperation and integration with both internal and external enterprise applications [35]. This
has been a central concern because B2B applications are composed of autonomous,
heterogeneous, and distributed components. Interaction among loosely coupled and tightly
coupled systems has been, over the past 20 years, an active research topic in areas such as
databases, knowledge‐based systems, and digital libraries [15]. Interactions in B2B E‐commerce
offer unique challenges because of issues such as scalability, volatility (dynamism), autonomy,
heterogeneity, and legacy systems. B2B E‐commerce requires the integration and interoperation
of both applications and data. Disparate data representations between partners’ systems must
be dealt with. Interaction is also required at a higher level for connecting (i) front‐end with
back‐end systems, (ii) proprietary/legacy data sources, applications, processes, and workflows
to the Web, and (iii) trading partners’ systems.
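One common way to deal with disparate data representations is to translate one partner's XML vocabulary into the other's with an XSLT stylesheet. The sketch below uses the JDK's javax.xml.transform API. Both vocabularies (PurchaseOrder on one side, Order on the other) and the class name OrderTranslator are invented for illustration and do not correspond to any real B2B standard.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical example: maps one partner's order format onto another partner's format.
final class OrderTranslator {
    // Invented stylesheet: <PurchaseOrder> (sender's format) becomes <Order> (receiver's format)
    static final String STYLESHEET =
          "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:template match=\"/PurchaseOrder\">"
        + "<Order id=\"{@number}\">"
        + "<Sku><xsl:value-of select=\"Item/PartNumber\"/></Sku>"
        + "<Qty><xsl:value-of select=\"Item/Quantity\"/></Qty>"
        + "</Order>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Applies the stylesheet to the sender's document and returns the receiver's form. */
    static String translate(String purchaseOrderXml) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(
                new StreamSource(new StringReader(purchaseOrderXml)),
                new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String incoming =
              "<PurchaseOrder number=\"42\">"
            + "<Item><PartNumber>CPU-1</PartNumber><Quantity>10</Quantity></Item>"
            + "</PurchaseOrder>";
        System.out.println(translate(incoming));
    }
}
```

Keeping the mapping in a stylesheet rather than in Java code means that a change in either partner's format only requires a new stylesheet.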
2 Overview of B2B interaction frameworks
In the first part of this section, we present a typical architecture of a B2B interaction
framework. We then identify the different layers that make up such framework. Finally, we
define the dimensions for assessing B2B architectures across these layers. These dimensions
are used as a benchmark for evaluating B2B E‐commerce interaction solutions.
2.1 Architecture of a B2B interaction framework
B2B applications refer to the use of computerized systems (e.g., Web servers, networking
services, databases) for conducting business (e.g., exchanging documents, selling products)
among different partners [16]. The building blocks for B2B applications are provided through a
B2B interaction framework (Fig. 1). These include modules for: (1) defining and managing
internal and external business processes, (2) integrating those processes, and (3) supporting
interactions with back‐end application systems such as ERPs (Enterprise Resource Planning)
[17]. A business process is defined as a multistep activity that supports an organization’s mission,
such as manufacturing a product or processing insurance claims [17]. We depict in Fig. 1 the
main components of a B2B interaction framework. Translation facilities (e.g., application
adapters) may be used to interconnect back‐end systems (e.g., databases, ERPs) and internal
business processes (e.g., workflows, applications). An external business process implements
the business logic of an organization with regard to its external partners, such as processing
messages sent by trading partners’ systems. Interactions between partners’ external business
processes may be carried out based on a specific B2B standard (e.g., EDI [67,101], RosettaNet
[76]) or bilateral agreements.
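Before an external business process can apply any business logic to a message from a trading partner, it has to work out what kind of document it has received. A minimal sketch of that first step, assuming the document kind can be read from the root element name, follows. The element names PurchaseOrder and RequestForQuote and the class IncomingDocumentRouter are hypothetical and not taken from any B2B standard.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical example: first step of an external business process handling a partner message.
final class IncomingDocumentRouter {
    enum Kind { PURCHASE_ORDER, REQUEST_FOR_QUOTE, UNKNOWN }

    /** Classifies an incoming XML message by the local name of its root element. */
    static Kind classify(String messageXml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // so getLocalName() ignores any namespace prefix
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(messageXml)));
        String root = doc.getDocumentElement().getLocalName();
        switch (root) {
            case "PurchaseOrder":   return Kind.PURCHASE_ORDER;
            case "RequestForQuote": return Kind.REQUEST_FOR_QUOTE;
            default:                return Kind.UNKNOWN; // hand over to manual processing
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(classify("<PurchaseOrder number=\"42\"/>"));
        System.out.println(classify("<RequestForQuote/>"));
        System.out.println(classify("<Invoice/>"));
    }
}
```

A framework that supports several B2B standards would need one such mapping per standard, since each standard names its documents differently.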
B2B standards define the format and semantics of messages (e.g., request for quote), bindings
to communication protocols (e.g., HTTP, FTP), business process conversations (e.g., joint
business process), security mechanisms (e.g., encryption, non‐repudiation), etc. A B2B
framework may have to support several B2B standards and proprietary interaction protocols.
2.2 Layers of B2B interaction frameworks
Interactions in B2B applications occur in three layers: communication, content, and business
process layers. For example,
ComputerCompany and ProcessorProvider need to agree on their joint business process (e.g.,
delivery mode, contracts). ProcessorProvider needs also to “understand” the content of the
purchase order sent by ComputerCompany. Finally, there must be an agreed upon
communication protocol to exchange messages between ComputerCompany and
ProcessorProvider. The communication layer provides protocols for exchanging messages
among remotely located partners (e.g.,
HTTP, SOAP). It is possible that partners use different proprietary communication protocols. In
this case, gateways should be used to translate messages between heterogeneous protocols.
For example, ComputerCompany and ProcessorProvider may use Java RMI (Remote Method
Invocation) [82] and IBM’s MQSeries [48], respectively, for internal communications. The
objective of integration at this
layer is to achieve a seamless integration of the communication protocols. The content layer
provides languages and models to describe and organize information in such a way that it can
be understood and used. Content interactions require that the involved systems understand
the semantics of content and types of business documents. For instance, if ProcessorProvider
receives a message that contains
a document, it must determine whether the document represents a purchase order or request
for quotation. Information translation, transformation, and integration capabilities are needed
to provide for reconciliation among disparate representations, vocabularies, and semantics. The
objective of interactions at this layer is to achieve a seamless integration of data formats, data
models, and languages. For example, if ComputerCompany uses xCBL (XML Common Business
Library) [32] to represent business documents and ProcessorProvider expects documents in
cXML (Commerce XML) [23], there is a need for a conversion between these two formats. The
business process layer is concerned with the conversational interactions (i.e., joint business
process) among services. Before engaging in a transaction, ComputerCompany and
ProcessorProvider need to agree on the procedures of their joint business process. The
semantics of interactions among ComputerCompany and ProcessorProvider must be well
defined, such that there is no ambiguity as to what a message may mean, what actions are
allowed, what responses are expected, etc. The objective of interactions at this layer is to allow
autonomous and heterogeneous partners to come online, advertise their terms and
capabilities, and engage in peer‐to‐peer interactions with any other partners. Interoperability at
this higher level is a challenging issue because it requires the understanding of the semantics of
partner business processes [58].
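The content-layer conversion described above can be sketched as a simple vocabulary mapping. This is only an illustration: the element names on both sides are invented for the example and are not real xCBL or cXML element names, and a real conversion would also have to reconcile structure and semantics, not just tag names.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from a buyer-side vocabulary to a supplier-side one.
# These names are invented for illustration; they are NOT real xCBL or cXML tags.
TAG_MAP = {
    "PurchaseOrder": "OrderRequest",
    "BuyerName": "From",
    "ItemCode": "PartID",
    "Quantity": "Qty",
}


def translate(element: ET.Element) -> ET.Element:
    """Recursively copy an element tree, renaming tags found in TAG_MAP."""
    new = ET.Element(TAG_MAP.get(element.tag, element.tag), dict(element.attrib))
    new.text = element.text
    new.tail = element.tail
    for child in element:
        new.append(translate(child))
    return new


# A purchase order as ComputerCompany might send it (hypothetical format).
source = ET.fromstring(
    "<PurchaseOrder><BuyerName>ComputerCompany</BuyerName>"
    "<ItemCode>CPU-1</ItemCode><Quantity>100</Quantity></PurchaseOrder>"
)
target = translate(source)
print(ET.tostring(target, encoding="unicode"))
```

Tags that are missing from the mapping are copied unchanged, so unknown content is passed through rather than dropped.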
B2B E‐commerce covers a wide spectrum of interactions among business partners. The types of
interactions depend on the usage scenarios, parties involved, and business requirements. Each
framework makes specific tradeoffs with regard to the requirements of B2B interactions. It is
therefore important to determine the relevant requirements and understand the related
tradeoffs when evaluating models of interactions. In this section, we identify a set of
dimensions to study interaction issues in B2B E‐commerce. We consider the following
dimensions: coupling among partners, heterogeneity, autonomy, external manageability,
adaptability, security, and scalability.
• Coupling among partners: this dimension refers to the
degree of tightness and duration of coupling among business partners. Two partners are tightly
coupled if they are strongly dependent on each other. For example, one partner may control
the other, or they may control one another. Loosely coupled partners exchange business
information on demand. The duration of a B2B relationship may be transient (also called
dynamic) or long term. In transient relationships, businesses may need to form a fast and short
term partnership (e.g., for one transaction), and then disband when it is no longer profitable to
stay together. Businesses need to dynamically discover partners to team up with to deliver the
required service. In long term relationships, businesses assume an a priori defined partnership.
• Heterogeneity: heterogeneity refers to the degree of dissimilarity among business partners.
The need to access data across multiple types of systems has arisen due to the increased level
of connectivity and increased complexity of the data types. Applications use different data
structures (e.g., XML, relational databases), standard or proprietary semantics (e.g., standardized
ontologies). There may also be structural heterogeneity at the business process layer (e.g., use
of APIs, document exchange protocols, inter‐enterprise workflows). In addition, organizations
may, from a semantic point of view, use different strategies for conducting business that
depend on business laws and practices [18].
• Autonomy: autonomy refers to the degree of
compliance of a partner to the global control rules. Partner systems may
be autonomous in their design, communication, and execution. This means that individual
partners select the process and content description models, programming models, interaction
models with the outside world, etc. In a fully autonomous collaboration, each partner is viewed
as a black box, that is able to exchange information (i.e., send and receive messages). Partners
interact via well‐defined interfaces allowing them to have more local control over
implementation and operation of services, and flexibility
to change their processes without affecting each other. Usually, a completely autonomous
collaboration may be difficult to achieve because it may require sophisticated translation
facilities.
• External manageability: this dimension refers to the degree of external visibility and
manageability of partners’ applications. In order to be effectively monitored by external
partners, an application must be defined in a way that facilitates the supervision and control of
its execution, measurement of its performance, and prediction of its status and availability. For
example, ComputerCompany may need to get the status (e.g., pending, approved) of the
purchase order sent to ProcessorProvider. This requires that ProcessorProvider exposes
sufficient information pertaining to measurements and control points to be used by
ComputerCompany. While desirable in principle, high visibility may require complex
descriptions of partners’ applications. However, the overhead to provide such descriptions may
be well justified if it provides other advantages such as Quality of Service (QoS).
• Adaptability:
adaptability refers to the degree to which an application is able to quickly adapt to changes.
B2B applications operate in a highly dynamic environment where new services could come on‐
line, existing services might be removed, and the content and capabilities of services may be
updated. For example, ComputerCompany may decide to partner with a new processor
provider for QoS purposes (e.g., cost, time). Businesses must be able to respond rapidly to
changes whereby both operational (e.g., server load) and market (e.g., changes of availability
status, changes of user’s requirements) environment are not predictable. For example, if
ProcessorProvider decides to stop its supply activities (e.g., for local maintenance),
ComputerCompany would then need to adapt to such change. Changes may be initiated to
adapt applications to actual business climate (e.g., economic, policy, or organizational changes).
They may also be initiated to take advantage of new business opportunities. Since applications
interact with both local back‐end systems and partner applications, it is important to consider
the impact of changes in both local and external applications to ensure local and global
consistency. In general, the impact of changes depends on the degree of tightness among
applications.
• Security: security is a major concern for inter‐enterprise applications. Before B2B
E‐commerce reaches its real potential, sophisticated security measures must be in place to
boost E‐commerce partner’s confidence that their transactions are safely handled [103]. For
instance, ProcessorProvider may need to check the authenticity of the purchase order before
processing it. B2B applications must support mutual authentication, fine grain authentication,
communication integrity, confidentiality, non‐repudiation, and authorization. B2B interactions
may be based on limited mutual trust, little or no prior knowledge of partners, and transient
collaborative agreements.
Shared information may include limited capabilities of services.
• Scalability: scalability refers to
the ability of a system to grow in one or more dimensions such as the volume of accessible
data, the number of transactions that can be supported in a given unit of time, and the number
of relationships that can be supported. More importantly, changes in business climate are
forcing organizations to merge in order to be effective in the global market. Thus, the cost and
effort to support new relationships is an important criterion to consider when evaluating
interaction solutions in B2B E‐commerce. Clearly, a low cost establishment of new relationships
is desirable. However, in case of long term relationships, the cost of establishing a new
relationship is not of great significance.
EBXML
The ebXML initiative
Joint effort of OASIS and UN/CEFACT
Goal: "An open technical framework to enable XML to be utilized in a consistent and uniform
manner for the exchange of Electronic Business (EB) data in application to application,
application to human and human to application environments, thus creating a single global
market"
http://www.ebxml.org/
ebXML project teams
Business Process
Core Components
Transport/Routing and Packaging
Trading Partners
Registry and Repository
Requirements
Technical Architecture
Security
Proof of Concept
Quality Review
Marketing
ebXML (Electronic Business using eXtensible Markup Language) is a modular suite of
specifications that enables enterprises of any size and in any geographical location to conduct
business over the Internet. Using ebXML, companies now have a standard method to exchange
business messages, conduct trading relationships, communicate data in common terms and
define and register business processes.
• ebXML Value
o Provides the only globally developed open XML‐based Standard built on a rich
heritage of electronic business experience.
o Creates a Single Global Electronic Market. Enables all parties, irrespective of size,
to engage in Internet‐based electronic business. Provides for plug and play
shrink‐wrapped solutions.
o Enables parties to complement and extend current EC/EDI investment and expand
electronic business to new and existing trading partners.
o Facilitates convergence of current and emerging XML efforts.
• ebXML delivers the value by
o Developing technical specifications for the open ebXML infrastructure.
o Creating the technical specifications with the world's best experts.
o Collaborating with other initiatives and standards development organizations.
o Building on the experience and strengths of existing EDI knowledge.
o Enlisting industry leaders to participate and adopt ebXML infrastructure.
o Realizing the commitment by ebXML participants to implement the ebXML
technical specifications.
ebXML was started in 1999 as an initiative of OASIS and the United Nations/ECE agency CEFACT.
The original project envisioned and delivered five layers of substantive data specification,
including XML standards for:
• Business processes
• Core data components
• Collaboration protocol agreements
• Messaging
• Registries and repositories
Summary: ebXML is a big project with a lot of pieces. In this article David Mertz outlines how
the pieces all fit together. This overview provides an introduction to the ebXML concept and
then looks a bit more specifically at the representation of business processes, an important
starting point for ebXML implementations. Two short bits of sample code demonstrate the
ProcessSpecification DTD and a package of collaborations.
When you read about ebXML, it's difficult to get a handle on exactly what it is ‐‐ and on what it
isn't. The 'eb' in ebXML stands for "electronic business," and you can pronounce the phrase as
"electronic business XML," "e‐biz XML," "e‐business XML," or simply "ee‐bee‐ex‐em‐el."
What is ebXML?
On one hand, ebXML seems to promise a grand unification of everything businesses do to
communicate with each other. On the other hand, one could be forgiven for thinking that
ebXML amounts to little more than a pious, but vacuous, declaration that existing standards are
worth following. As with every "next big thing," the truth lies somewhere in the middle.
ebXML terminology
Registry: A central server that stores a variety of data necessary to make ebXML work. Amongst
the information a Registry makes available in XML form are: Business Process & Information
Meta Models, Core Library, Collaboration Protocol Profiles, and Business Library. Basically,
when a business wants to start an ebXML relationship with another business, it queries a
Registry in order to locate a suitable partner and to find information about requirements for
dealing with that partner.
Business Processes: Activities that a business can engage in (and for which it would generally
want one or more partners). A Business Process is formally described by the Business Process
Specification Schema (a W3C XML Schema and also a DTD), but may also be modeled in UML.
Collaboration Protocol Profile (CPP): A profile filed with a Registry by a business wishing to
engage in ebXML transactions. The CPP will specify some Business Processes of the business, as
well as some Business Service Interfaces it supports.
Business Service Interface: The ways that a business is able to carry out the transactions
necessary in its Business Processes. The Business Service Interface also includes the kinds of
Business Messages the business supports and the protocols over which these messages might
travel.
Business Messages: The actual information communicated as part of a business transaction. A
message will contain multiple layers. At the outside layer, an actual communication protocol
must be used (such as HTTP or SMTP). SOAP is an ebXML recommendation as an envelope for a
message "payload." Other layers may deal with encryption or authentication.
Core Library: A set of standard "parts" that may be used in larger ebXML elements. For
example, Core Processes may be referenced by Business Processes. The Core Library is
contributed by the ebXML initiative itself, while larger elements may be contributed by specific
industries or businesses.
Collaboration Protocol Agreement (CPA): In essence, a contract between two or more
businesses that can be derived automatically from the CPPs of the respective companies. If a
CPP says "I can do X," a CPA says "We will do X together."
Simple Object Access Protocol (SOAP): A W3C protocol for exchange of information in a
distributed environment endorsed by the ebXML initiative. Of interest for ebXML is SOAP's
function as an envelope that defines a framework for describing what is in a message and how
to process it.
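The layering of a Business Message described in the terminology above (a payload wrapped in a SOAP envelope, which is then carried over a protocol such as HTTP or SMTP) can be sketched as follows. This is a minimal illustration under stated assumptions: the namespace is the SOAP 1.1 envelope namespace, while the `MessageId` header entry and the `PurchaseOrder` payload are hypothetical names. The ebXML Message Service specification defines its own header elements and packaging rules, which this sketch does not reproduce.

```python
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace.
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"


def wrap_in_envelope(payload: ET.Element, message_id: str) -> ET.Element:
    """Wrap a business payload in a SOAP-style envelope (header + body)."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    header = ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")
    # Hypothetical header entry, NOT the real ebXML message header.
    ET.SubElement(header, "MessageId").text = message_id
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    body.append(payload)
    return envelope


# Hypothetical payload: the innermost layer of the message.
order = ET.Element("PurchaseOrder")
ET.SubElement(order, "Item").text = "CPU-1"

message = wrap_in_envelope(order, "msg-0001")
print(ET.tostring(message, encoding="unicode"))
```

The resulting string is what would be handed to the outer transport layer; encryption or authentication layers, where used, would be applied in addition to this.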
The ebXML.org homepage offers this brief characterization:
ebXML is a set of specifications that together enable a modular electronic business framework.
The vision of ebXML is to enable a global electronic marketplace where enterprises of any size
and in any geographical location can meet and conduct business with each other through the
exchange of XML‐based messages.
Or in other words, ebXML hopes to succeed Electronic Data Interchange, more often known by
its abbreviation, EDI. (Official descriptions tend to emphasize learning from EDI rather than
throwing it out.)
Sorting out ebXML involves a few steps. Perhaps the first thing necessary for understanding the
details of ebXML is to digest an alphabet soup of new acronyms and other special terms. There
are a number of these terms in the ebXML terminology list above to consider before
looking at the whole "vision" of ebXML interactions. Additional terms fit into the entire system,
but these particular terms make a good starting point. With this new vocabulary in mind, and a
bit of the following background on where ebXML comes from, you can begin to make sense of
how all of the differing processes in ebXML hold together.
After describing what ebXML does (at least in outline) at the beginning of this article, a final
section looks in more detail at the Business Process Specification Schema, which makes up one
of the most important elements of ebXML's underlying infrastructure.
Background
ebXML is an initiative whose participants and endorsers consist of just about every big company
and association of government standards worldwide that you can think of. Well, maybe not
every one you can think of, but certainly hundreds of large companies and bodies.
Computer/technology companies are not the only entities that endorse ebXML; backers include
a large number of industrial, shipping, banking, and other general‐interest companies. The
direct sponsors of ebXML are OASIS (Organization for the Advancement of Structured
Information Standards) and UN/CEFACT (United Nations Centre for Trade Facilitation and
Electronic Business). Lots of standards bodies also have a finger in the pie, including NIST
(National Institute of Standards and Technology) and W3C (World Wide Web Consortium).
With such a collection of supporters, it would seem that ebXML is destined to take over the
world. I tend to have a cynical attitude toward industry buzzwords and hype. In the case of
ebXML, however, I mostly expect it to live up to its billing as a global protocol for most business
transactions within the next five years.
In my opinion, ebXML will succeed in becoming universal by incorporating into the
specifications more and more of what businesses do anyway as much as it will by actually
getting businesses to do business differently. I'm not sure if my estimation is cynical or if it is
encouragement at the openness of ebXML specifications, but the ebXML initiative clearly holds
an embrace‐existing‐standards‐and‐methods attitude.
The Classic ebXML model
This then forms the basis for what we can call ‘classic ebXML’, ebMS with CPAs controlling
transaction exchange based processes between partners. While there is some limited
involvement of Registry services in certain deployments, the majority of implementations are
done without using a formal Registry; instead, websites perform the role of registry
facilitation. The classic ebXML approach has proved its worth by also being the basis of a
formal certification program for ebMS implementations. UCCNet provides this certification in
cooperation with the eBusinessReady.org service. Now customers are able to purchase solutions
that are guaranteed to be interoperable with each other. This is a critical advantage that
ebXML has today.
Using this classic ebXML model, implementers create two‐player business exchanges.
An archetypal exchange is that of purchase orders, shipping notices and invoices between a
buyer and a seller. In figure 2 below we see the activity model for such a Requester /Responder
configuration that is supported using the classic ebXML components.
The individual main steps are ‘Create Order’ and ‘Order Fulfillment’, along with the business
transactions that enable those. There is an initiating request from the requester partner, and
then the responder replies with a selection of transactions depending on the business state of
the interaction, either rejecting or confirming the order accordingly.
The ‘join’ indicates that the process will only proceed when both an order confirmation and a
ship delivery notice have been received. The ‘fork’ allows more than one action depending on a
condition. In this case either a payment notice has to be created or not,
based on the requirement of the particular supplier’s application system (if it can reconcile
electronic payments, or requires information to be able to reconcile them).
An illustration (Figure 1) based on the ebXML Technical Architecture Specification (see
Resources) will probably go a long way toward sorting out what ebXML means for business.
Company A in Figure 1 below will first review the contents of an ebXML Registry, especially the
Core Library which may be downloaded or viewed there. The Core Library (and maybe other
registered Business Processes) will allow Company A to determine the requirements for their
own implementation of ebXML (and whether ebXML is appropriate for their business needs).
Figure 1: High‐level overview of ebXML interaction between two companies
Based on a review of the information available from an ebXML Registry, Company A can build
or buy an ebXML implementation suitable for its anticipated ebXML transactions. The hope of
the ebXML initiative is that vendors will support all of the elements of ebXML. At such time, an
"ebXML system" might be little more than a prepackaged desktop application. Or maybe, more
realistically, the ebXML system will at least be as manageable as a commercial database system
(which still needs a DBA). Figure 1 suggests that the hypothetical Company B uses something
like this prepackaged application.
Either way, the next step is for Company A to create and register a CPP with the Registry.
Company A might wish to contribute new Business Processes to the Registry, or simply
reference available ones. The CPP will contain the information necessary for a potential partner
to determine the business roles in which Company A is interested, and the type of protocols it is
willing to engage in for these roles.
Once Company A is registered, Company B can look at Company A's CPP to determine that it is
compatible with Company B's CPP and requirements. At that point, Company B should be able
to negotiate a CPA automatically with Company A, based on the conformance of the CPPs, plus
agreement protocols, given as ebXML standards or recommendations.
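The idea that a CPA can be derived from two compatible CPPs ("I can do X" becoming "We will do X together") can be sketched as a set intersection. The `Profile` class and its fields are hypothetical simplifications introduced for this example; a real CPP is an XML document with far more detail, and real CPA negotiation involves more than intersecting capabilities.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    """Hypothetical, heavily simplified stand-in for a CPP."""
    party: str
    processes: frozenset   # business processes the party supports
    transports: frozenset  # transport protocols the party accepts


def propose_agreement(a: Profile, b: Profile):
    """Return a CPA-like dict of shared capabilities, or None if incompatible."""
    processes = a.processes & b.processes
    transports = a.transports & b.transports
    if not processes or not transports:
        return None  # nothing both parties can do over a common transport
    return {
        "parties": (a.party, b.party),
        "processes": sorted(processes),
        "transports": sorted(transports),
    }


company_a = Profile("Company A", frozenset({"Ordering", "Invoicing"}),
                    frozenset({"HTTP", "SMTP"}))
company_b = Profile("Company B", frozenset({"Ordering"}), frozenset({"HTTP"}))

agreement = propose_agreement(company_a, company_b)
print(agreement)
```

Returning `None` when there is no overlap mirrors the case where Company B inspects Company A's CPP and finds it incompatible with its own requirements.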
Finally, the two companies begin actual transactions. These transactions are likely to involve
Business Messages conforming to further ebXML standards and recommendations. At some
point in all of this, however, "real‐world" activities will probably occur (for example, the
shipment of goods from one place to another, or the rendering of services). ebXML will have
helped in agreeing to, monitoring, and verifying these real‐world activities. Of course, in our
"information economy," a lot of what goes on might stay within the realm of ebXML ‐‐ maybe
everything within a particular business relationship.
The Business Process Schema
The UN/CEFACT Modeling Methodology (UMM), which utilizes UML, may be instrumental in
modeling the ebXML Business Processes. However, such modeling is simply a recommendation,
not a requirement. In any case, since this article targets XML developers and does not address
OOD (object‐oriented design), it is more interesting herein to look at the representation of the
models in XML documents conformant to the Business Process Specification DTD and XML
Schema. The DTD (named "ebXMLProcessSpecification-v1.00.dtd") appears, at this time, to be
the primary rule representation. Both this DTD and a W3C XML Schema, which is (presumably)
semantically and syntactically compatible, may be found in the EbXML_BPschema_1.0
recommendation (see Resources).
ebXML process specifications have a root element ProcessSpecification. A particular process
specification may contain subnode references to other process specifications, as well as to
document specifications and other information. The DTD declaration for ProcessSpecification
provides an overview of the structure of a Business Process document:
Listing 1: ProcessSpecification DTD declaration
<!ELEMENT ProcessSpecification
(Documentation*,
(Include* | DocumentSpecification* |
ProcessSpecification* | Package |
BinaryCollaboration | BusinessTransaction |
MultiPartyCollaboration)*)>
<!ATTLIST ProcessSpecification
name ID #REQUIRED
version CDATA #REQUIRED
uuid CDATA #REQUIRED >
The attribute uuid is a globally unique identifier for a process specification; the name and
version are specific to the model represented (the name should not collide with nested process
specifications).
Within a process specification, a Package defines a set of collaborations that may be either
MultiPartyCollaboration elements or BinaryCollaboration elements. Collaborations, in turn,
contain a variety of roles for the parties. An excerpt from the sample process specification
contained in the EbXML_BPschema_1.0 recommendation (see Resources) is helpful in sorting
out this structure:
Listing 2: A package of collaborations
<Package name="Ordering">
<!-- First the overall MultiParty Collaboration -->
<MultiPartyCollaboration name="DropShip">
<BusinessPartnerRole name="Customer">
<Performs authorizedRole="requestor"/>
<Performs authorizedRole="buyer"/>
<Transition fromBusinessState="Catalog Request"
toBusinessState="Create Order"/>
</BusinessPartnerRole>
<BusinessPartnerRole name="Retailer">
<Performs authorizedRole="provider"/>
<Performs authorizedRole="seller"/>
<Performs authorizedRole="Creditor"/>
<Performs authorizedRole="buyer"/>
<Performs authorizedRole="Payee"/>
[...]
<BinaryCollaboration name="Request Catalog">
<AuthorizedRole name="requestor"/>
<AuthorizedRole name="provider"/>
<BusinessTransactionActivity name="Catalog Request"
businessTransaction="Catalog Request"
fromAuthorizedRole="requestor"
toAuthorizedRole="provider"/>
</BinaryCollaboration>
[...]
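To see how the structure in Listing 2 can be read programmatically, the sketch below parses a small, well-formed fragment modeled on the "Request Catalog" collaboration and lists its roles and activity flows. The fragment is a shortened reconstruction for illustration (the elided parts of Listing 2 are not filled in), and `summarize` is a hypothetical helper, not part of any ebXML toolkit.

```python
import xml.etree.ElementTree as ET

# Shortened fragment modeled on Listing 2; only the BinaryCollaboration is kept.
DOC = """
<Package name="Ordering">
  <BinaryCollaboration name="Request Catalog">
    <AuthorizedRole name="requestor"/>
    <AuthorizedRole name="provider"/>
    <BusinessTransactionActivity name="Catalog Request"
        businessTransaction="Catalog Request"
        fromAuthorizedRole="requestor"
        toAuthorizedRole="provider"/>
  </BinaryCollaboration>
</Package>
"""


def summarize(package_xml: str) -> dict:
    """Map each BinaryCollaboration name to its roles and activity flows."""
    root = ET.fromstring(package_xml)
    summary = {}
    for collab in root.iter("BinaryCollaboration"):
        roles = [r.get("name") for r in collab.findall("AuthorizedRole")]
        flows = [
            (a.get("name"), a.get("fromAuthorizedRole"), a.get("toAuthorizedRole"))
            for a in collab.findall("BusinessTransactionActivity")
        ]
        summary[collab.get("name")] = {"roles": roles, "flows": flows}
    return summary


summary = summarize(DOC)
print(summary)
```

Each flow is reported as (activity name, sending role, receiving role), which is the information a partner needs to know who initiates each Business Transaction.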
ebXML stands for Electronic Business XML.
ebXML is a modular suite of specifications that gives businesses of any size the ability to
conduct business over the internet.
This tutorial will give you an understanding of ebXML architecture and its components.
Businesses inevitably talk to each other in a variety of ways. For some time, and still today,
many large companies have communicated automatically through EDI, which allows two companies to
communicate using predetermined signals.
The trouble with EDI (Electronic Data Interchange) is that it is very expensive and was
originally created for the mainframe world. ebXML is intended to replace EDI.
What is ebXML?
• ebXML stands for Electronic Business Extensible Markup Language.
• ebXML is a global standard for electronic business.
• ebXML is an end‐to‐end B2B XML Framework.
• ebXML is a set of specifications that together enable a modular framework.
• ebXML enables anyone, anywhere to do business with anyone else over the Internet.
• ebXML relies on the Internet's existing standards such as HTTP, TCP/IP, MIME, SMTP,
FTP, UML, and XML.
• ebXML can be implemented and deployed on virtually any computing platform.
• ebXML provides concrete specifications to enable dynamic B2B collaborations.
ebXML Vision:
ebXML is designed to create a global electronic market place where enterprises of any size,
anywhere can:
• Find each other electronically.
• Conduct Business ‐
o Using exchange of XML messages
o According to standard business process sequences.
o With clear business semantics.
o Using off the shelf purchased business applications.
o According to mutually agreed trading partner protocol agreements.
Why ebXML?
• Existing B2B Frameworks are not adequate:
o EDI and RosettaNet are too heavy‐weight and too rigid.
o BizTalk is proprietary, single‐vendor, single‐platform.
• SOAP, WSDL and UDDI alone are not adequate:
o WSDL does not address business collaboration
o SOAP in its basic form does not provide secure and reliable message delivery
o UDDI does not provide repository capability for business objects.
• Need for standardizing business collaboration to address the following:
o What are the business processes?
o Who are the parties involved in business collaboration? What are their roles?
o What and how do XML documents get exchanged in the business collaborations?
o What are the security, reliability, quality of service requirements of this business
collaboration?
o All these are addressed by ebXML.
Founding organizations:
ebXML is a joint initiative by OASIS and UN/CEFACT.
UN/CEFACT:
• Stands for United Nations Centre for Trade Facilitation and Electronic Business.
• Created and maintains the UN/EDIFACT standards for Electronic Data Interchange (EDI).
OASIS:
• Stands for Organization for Advancement of Structured Information Standards.
• Creates and maintains XML interoperability specifications, broad industry support.
ebXML Architecture
By definition, the iterative life cycle of B2B collaboration includes the following steps:
• Process Definition
• Partner Discovery
• Partner Sign‐up
• Electronic Plug‐in
• Process Execution
• Process Management
• Process Evolution
The overall ebXML specifications are intended to cover almost the entire process of B2B
collaboration and are designed to meet the needs described above.
ebXML architecture as defined by ebXML team provides:
• A way to define business processes and their associated messages and content.
• A way to register and discover business process sequences with related message
exchanges.
• A way to define company profiles.
• A way to define trading partner agreements.
• A uniform message transport layer.
Consequently, the technical architecture of ebXML is composed of five modules:
1. Business Process Specifications
2. Partner Profile and Agreements
3. Registry and Repository
4. Core Components
5. Messaging Service
These modules are covered in the next five chapters. The diagram below shows a
simplified architecture of ebXML.
ebXML Business Process
The Business Process and Information model defines how to describe the basic information
elements used in business messages and to describe business processes.
A Business Process is something that a business does, such as buying computer parts or selling a
professional service. It involves the exchange of information between two or more Trading
Partners in some predictable way.
The specification for business process definition enables an organization to express its business
processes so that they are understandable by other organizations. This enables the integration
of business processes within a company, or between companies.
The ebXML Business Process Specification Schema (BPSS) provides the definition of an XML
document that describes how an organization conducts its business. An ebXML BPSS is a
declaration of the partners, roles, collaborations, choreography and business document
exchanges that make up a business process.
The following diagram gives a conceptual view of a Business Process.
Business Collaborations:
A Business Collaboration is a choreographed set of Business Transaction Activities, in which two
or more Trading Partners exchange documents.
The most common one is a Binary Collaboration, in which two partners exchange documents. A
Multiparty Collaboration takes place when information is exchanged between more than two
parties.
Multiparty Collaborations are actually choreographed Binary Collaborations.
At its lowest level, a Business Collaboration can be broken down into Business Transactions.
Business Transactions:
A Business Transaction is the atomic level of work in a Business Process. It either succeeds or
fails completely.
Business Transactions are transactions in which Trading Partners actually transfer Business
Documents.
Business Document flows:
A business transaction is realized as Business Document flows between the requesting and
responding roles. There is always a requesting Business Document, and optionally a responding
Business Document, depending on the desired transaction semantics, e.g. one‐way notification
vs. two‐way conversation.
Actual document definition is achieved using the ebXML core component specifications, or by
some methodology external to ebXML but resulting in a DTD or Schema that an ebXML Business
Process Specification can point to.
Choreography:
The choreography is expressed in terms of states and the transitions between them. A Business
Activity is known as an abstract state, with Business Collaborations and Business Transaction
Activities known as concrete states. The choreography is described in the ebXML Business
Process Specification Schema using activity diagram concepts such as start state, completion
state etc.
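A choreography expressed as states and transitions can be sketched as a small transition table. The state and event names below are hypothetical; they loosely follow the 'Create Order' and 'Order Fulfillment' steps mentioned earlier in these notes and are not taken from an actual BPSS document.

```python
# Hypothetical transition table: (current state, event) -> next state.
TRANSITIONS = {
    ("Start", "order_sent"): "Create Order",
    ("Create Order", "order_confirmed"): "Order Fulfillment",
    ("Create Order", "order_rejected"): "Failure",
    ("Order Fulfillment", "goods_delivered"): "Success",
}


def run(events, state="Start"):
    """Replay events against the table and return the final state.

    Raises ValueError if an event is not allowed in the current state.
    """
    for event in events:
        key = (state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"event {event!r} not allowed in state {state!r}")
        state = TRANSITIONS[key]
    return state


print(run(["order_sent", "order_confirmed", "goods_delivered"]))
print(run(["order_sent", "order_rejected"]))
```

"Start", "Success" and "Failure" play the part of the start and completion states; any event that is not listed for the current state is treated as a violation of the agreed choreography.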
Business Documents:
The Business Documents are composed of Business Information Objects, or smaller chunks of
information that have previously been identified.
These chunks, or components, don't carry any information, of course. They are merely
structures, such as an XML Schema or a DTD, that define information and how it must be
presented. The end result is a predictable structure into which information is placed, so that the
receiver of the final document can interpret it to extract the information.
Business Process Specification Example:
A partial example of a Business Process Specification is given below:
<BusinessTransaction name="Create Order">
  <RequestingBusinessActivity name=""
      isNonRepudiationRequired="true"
      timeToAcknowledgeReceipt="P2D"
      timeToAcknowledgeAcceptance="P3D">
    <DocumentEnvelope BusinessDocument="Purchase Order"/>
  </RequestingBusinessActivity>
  <RespondingBusinessActivity name=""
      isNonRepudiationRequired="true"
      timeToAcknowledgeReceipt="P5D">
    <DocumentEnvelope isPositiveResponse="true"
        BusinessDocument="PO Acknowledgement"/>
  </RespondingBusinessActivity>
</BusinessTransaction>
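As a rough illustration of how such a fragment could be consumed, the sketch below parses a well-formed copy of the "Create Order" transaction with Python's standard `xml.etree.ElementTree`. The `days` and `summarize` helpers are hypothetical names introduced here, and `days` handles only whole-day durations such as "P2D".

```python
import re
import xml.etree.ElementTree as ET

BPSS_FRAGMENT = """
<BusinessTransaction name="Create Order">
  <RequestingBusinessActivity name=""
      isNonRepudiationRequired="true"
      timeToAcknowledgeReceipt="P2D"
      timeToAcknowledgeAcceptance="P3D">
    <DocumentEnvelope BusinessDocument="Purchase Order"/>
  </RequestingBusinessActivity>
  <RespondingBusinessActivity name=""
      isNonRepudiationRequired="true"
      timeToAcknowledgeReceipt="P5D">
    <DocumentEnvelope isPositiveResponse="true"
        BusinessDocument="PO Acknowledgement"/>
  </RespondingBusinessActivity>
</BusinessTransaction>
"""


def days(duration):
    """Convert a whole-day duration such as 'P2D' to an int (other forms are not handled)."""
    match = re.fullmatch(r"P(\d+)D", duration)
    if match is None:
        raise ValueError(f"unsupported duration: {duration!r}")
    return int(match.group(1))


def summarize(xml_text):
    """Return the transaction name plus, per activity, its document and receipt deadline."""
    root = ET.fromstring(xml_text)
    summary = {"transaction": root.get("name"), "activities": []}
    for activity in root:
        envelope = activity.find("DocumentEnvelope")
        summary["activities"].append({
            "role": activity.tag,
            "document": envelope.get("BusinessDocument"),
            "ack_receipt_days": days(activity.get("timeToAcknowledgeReceipt")),
        })
    return summary


summary = summarize(BPSS_FRAGMENT)
print(summary)
```

The summary shows the requesting document (always present) and the responding document (optional in general, present in this example).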
Conclusion:
A Business Process Specification:
• Describes collaboration between two partners
• Defines roles, relationships and responsibilities
• Defines choreography of business documents.
• Expressed in platform and vendor neutral format.
• Can be modeled with UMM (UN/CEFACT Modeling Methodology).
• Formally described by Business Process Specification Schema (BPSS).
• Referenced by CPP and CPA.
• Refers to Business Document Definitions.
ebXML CPP and CPA
Collaboration Protocol Profile (CPP):
A CPP provides all the information needed to understand how a particular trading partner intends
to do electronic business. A CPP defines the following attributes of a trading partner:
• Business capabilities, expressed through the Business Processes it supports.
• The role (for example, buyer or insurer) it plays within a collaboration.
• The delivery channels and transport protocols (HTTP, SMTP, etc.).
• The way business documents are packaged.
• Security constraints (SSL, digital certificates).
• Per-party configuration for business process specifications.
A CPP is stored in an ebXML registry with a Globally Unique Identifier (GUID), and business
partners can find each other's CPPs through the registry.
The information within the CPP is available to be searched on, so a potential Trading Partner
can determine whether the organization has the capabilities to do business.
Structure of a CPP
A CPP defines namespaces on its root element and a version to distinguish any subsequent
changes. The structure of a CPP consists of a root CollaborationProtocolProfile element with the
following elements:
• PartyInfo: The PartyInfo element provides information about the organization.
• Packaging: The Packaging element provides information about the way in which
messages are actually constructed. Messages are processed as SOAP messages.
• Signature: An optional part of the document.
• Comment: Optional Comment elements can be included.
<CollaborationProtocolProfile
    xmlns="http://www.ebxml.org/namespaces/tradePartner"
    xmlns:ds="http://www.w3.org/2000/09/xmldsig#"
    xmlns:xlink="http://www.w3.org/1999/xlink"
    version="1.1">
  <PartyInfo>
    ...
    <!--REQUIRED, Repeatable-->
    ...
  </PartyInfo>
  <Packaging id="ID">
    ...
    <!--REQUIRED-->
    ...
  </Packaging>
  <ds:Signature>
    ...
    <!--OPTIONAL-->
    ...
  </ds:Signature>
  <Comment>
    ...
    <!-- OPTIONAL -->
    ...
  </Comment>
</CollaborationProtocolProfile>
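A minimal sketch of how the required parts of a CPP could be checked, assuming the namespace shown above and using only Python's standard library. The skeleton document and the `missing_required` helper are illustrative; a real implementation would validate against the CPP schema instead.

```python
import xml.etree.ElementTree as ET

# Default namespace of the CPP, mapped to a prefix for lookups.
NS = {"tp": "http://www.ebxml.org/namespaces/tradePartner"}

CPP_SKELETON = """
<CollaborationProtocolProfile
    xmlns="http://www.ebxml.org/namespaces/tradePartner"
    xmlns:ds="http://www.w3.org/2000/09/xmldsig#"
    version="1.1">
  <PartyInfo/>
  <Packaging id="ID"/>
  <Comment>illustrative skeleton only</Comment>
</CollaborationProtocolProfile>
"""


def missing_required(cpp_xml):
    """Return the names of required child elements that are absent from the CPP."""
    root = ET.fromstring(cpp_xml)
    return [name for name in ("PartyInfo", "Packaging")
            if root.find(f"tp:{name}", NS) is None]


print(missing_required(CPP_SKELETON))
```

Because the CPP declares a default namespace, the child elements have to be looked up with a namespace prefix; a plain `root.find("PartyInfo")` would not match.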
Trading Partner Agreement (TPA):
A trading partner agreement (TPA) is a contract defining both the legal terms and conditions
and the technical specifications for both partners in the trading relationship. In ebXML, the
technical part of this agreement is expressed as a Collaboration Protocol Agreement (CPA),
which is derived from the CPPs of the trading partners.
The rules specified by the electronic TPA are independent of the business processes at either
party. A technical description of the terms and conditions from the TPA is expressed in an XML
document, which configures each party's IT systems to operate under the agreement rules.
TPA properties include its name, partner names, starting and ending dates, roles, and other
parameters. Typically, one party generates a CPA and offers it to the other party for approval.
Once both sides have reached agreement, they each take an electronic copy of the same CPA
and use it to configure their systems.
The CPA may also be added to the registry for reference, but this is not a standard requirement.
Structure of a CPA
A CPA defines namespaces on its root element and a version to distinguish any subsequent
changes. The structure of a CPA consists of a root CollaborationProtocolAgreement element
with the following elements:
• Start and End: These elements represent, in Coordinated Universal Time, the beginning
and end of the period during which this CPA is active.
• PartyInfo: The PartyInfo element provides information about the organization. Here
PartyInfo elements are included for both parties involved in the agreement.
• Packaging: The Packaging element provides information about the way in which
messages are actually constructed. Messages are processed as SOAP messages.
• Signature: An optional part of the document.
• Comment: Optional Comment elements can be included.
<CollaborationProtocolAgreement
    xmlns="http://www.ebxml.org/namespaces/tradePartner"
    xmlns:ds="http://www.w3.org/2000/09/xmldsig#"
    xmlns:xlink="http://www.w3.org/1999/xlink"
    cpaid="http://www.example.com/cpas/CPAS"
    version="1.7">
  <Status value="proposed"/>
  <Start>1998-04-07T18:50:00</Start>
  <End>1999-04-07T18:50:00</End>
  <ConversationConstraints invocationLimit="150"
      concurrentConversations="10"/>
  <PartyInfo>
    ...
    <!--REQUIRED, repeatable-->
    ...
  </PartyInfo>
  <PartyInfo>
    ...
    <!--REQUIRED, repeatable-->
    ...
  </PartyInfo>
  <Packaging id="N20">
    ...
    <!--REQUIRED, repeatable-->
    ...
  </Packaging>
  <ds:Signature>
    <!--OPTIONAL-->
  </ds:Signature>
  <Comment xml:lang="en-gb">
    <!--OPTIONAL-->
  </Comment>
</CollaborationProtocolAgreement>
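The Start and End elements lend themselves to a simple check of whether an agreement is in force at a given moment. The sketch below assumes a cut-down CPA with the same namespace and dates as the example; the `is_active` helper is a hypothetical name, and the timestamps are treated as naive UTC values.

```python
from datetime import datetime
import xml.etree.ElementTree as ET

NS = {"tp": "http://www.ebxml.org/namespaces/tradePartner"}

CPA_SKELETON = """
<CollaborationProtocolAgreement
    xmlns="http://www.ebxml.org/namespaces/tradePartner"
    cpaid="http://www.example.com/cpas/CPAS"
    version="1.7">
  <Status value="proposed"/>
  <Start>1998-04-07T18:50:00</Start>
  <End>1999-04-07T18:50:00</End>
  <PartyInfo/>
  <PartyInfo/>
</CollaborationProtocolAgreement>
"""


def is_active(cpa_xml, moment):
    """Return True if `moment` (a naive UTC datetime) falls within the CPA's Start/End period."""
    root = ET.fromstring(cpa_xml)
    start = datetime.fromisoformat(root.findtext("tp:Start", namespaces=NS))
    end = datetime.fromisoformat(root.findtext("tp:End", namespaces=NS))
    return start <= moment <= end


print(is_active(CPA_SKELETON, datetime(1998, 6, 1)))
print(is_active(CPA_SKELETON, datetime(2000, 1, 1)))
```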
ebXML Registry and Repository Service
What is Registry and Repository:
An ebXML registry serves as the index and application gateway for a repository to the outside
world, and it contains the API that governs how parties interact with the repository. An ebXML
repository is the store that actually holds the registered items.
• The ebXML registry is central to the ebXML architecture.
• The registry can also be viewed as an API to the database of items that supports e‐
business with ebXML.
• The ebXML registry serves as a database for sharing of relevant company information
for ebXML business transactions, such as corporate capabilities, business process,
technical blueprints, order forms, invoices, and so on.
• Items in the repository are created, updated, or deleted through requests made to the
registry.
• Repositories provide trading partners with the shared business semantics.
• The ebXML registry is an interface for accessing and discovering shared business
semantics.
• The registry interface is designed to be independent of the underlying network protocol
stack, such as HTTP or SMTP over TCP/IP.
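The split between registry and repository can be illustrated with a toy model. This is a sketch under simplifying assumptions, not the ebXML Registry Services API: the class names, method names and metadata fields are all invented for the example.

```python
import uuid


class Repository:
    """Holds the stored items themselves, keyed by identifier."""

    def __init__(self):
        self._items = {}

    def store(self, object_id, content):
        self._items[object_id] = content

    def load(self, object_id):
        return self._items[object_id]

    def delete(self, object_id):
        del self._items[object_id]


class Registry:
    """Index and gateway: keeps metadata and mediates all access to the repository."""

    def __init__(self, repository):
        self._repository = repository
        self._metadata = {}

    def submit(self, content, object_type, description):
        object_id = str(uuid.uuid4())  # stand-in for a globally unique identifier
        self._repository.store(object_id, content)
        self._metadata[object_id] = {"type": object_type, "description": description}
        return object_id

    def find(self, object_type):
        return [oid for oid, meta in self._metadata.items() if meta["type"] == object_type]

    def retrieve(self, object_id):
        return self._repository.load(object_id)

    def remove(self, object_id):
        del self._metadata[object_id]
        self._repository.delete(object_id)


registry = Registry(Repository())
cpp_id = registry.submit("<CollaborationProtocolProfile/>", "CPP", "Company A profile")
print(registry.find("CPP") == [cpp_id])
```

Clients never touch the repository directly: items are created, found and deleted only through requests to the registry, which also owns the metadata used for discovery.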
The registry provides a stable, persistent store of submitted content, which includes XML
schema and documents, process descriptions, core components, context descriptions, UML
models, information about parties, and even software components. This can be represented as
a software stack of services, as shown in the figure below:
Goals of ebXML Registry:
To enable sharing of information between interested parties for the purpose of enabling
business process integration between such parties.
Benefits of ebXML registry:
An ebXML registry provides the following benefits:
• Discovery and maintenance of registered content.
• Support for collaborative development, where users can create XML content and submit
it to the registry for use and potential enhancement by the authorized parties.
• Persistence of Web Services Business Process Execution Language (WS‐BPEL), WSDL,
and business documents during interactions between trading partners.
• Secure version control of registered content.
• Federation of cooperating registries to provide a single view of registered content by
seamless querying, synchronization, and relocation of registered content.
• Event notification via email or Web services.
Compliance:
According to the ebXML Registry Services Specification, a registry implementation complies
with the ebXML specification if it meets the following conditions:
• It supports the ebXML Registry Information Model.
• It supports the syntax and semantics of the registry interfaces and security.
• It supports the ebXML registry DTD.
• Support of the syntax and semantics of SQL query in the registry is optional.
A registry client implementation complies with the ebXML specification if it meets the following
conditions:
• It supports the ebXML CPA and bootstrapping process.
• It supports the syntax and semantics of the registry client interfaces.
• It supports the ebXML error message DTD.
• It supports the ebXML registry DTD.
Registry Objects and Metadata:
Registry objects ‐
• Refers to an object that has been submitted to a Registry for storage and safekeeping.
• Called a "Repository item".
• XML document or DTD, business process models, CPPs, etc.
Metadata ‐
• Used by registry to classify and manage registry objects
• Represented by RegistryEntry
Registry Information Model (RIM)
The Registry Information Model (RIM) provides a high-level blueprint for metadata in the
ebXML registry. This can be represented as a software stack of services or as a service pyramid,
as shown in the figure below. The elements of the information model represent metadata
about the content, not the content itself in the repository. The registry information model
defines what types of objects are stored and organized in the registry.
The information model is a roadmap to the types of metadata and the relationships between
them. The registry information model may be mapped to a relational database schema,
object database schema, or some other physical schema.
ebXML Core Components
Definition from ebXML "Core Component and Business Process Document Overview"
"A Core Component captures information about a real world business concept, and
relationships between that concept and other business concepts. A Core Component can be
either an individual piece of business information, or a family of business information pieces. It
is core because it occurs in many different areas of industry/business information exchange"
A core component is a basic, reusable building block that contains information representing a
business concept. Some examples of core components for parts of a purchase order are "Date
of Purchase Order," "Sales Tax," and "Total Amount."
In general, core components are used in many different domains, industries, and business
processes. In the ebXML environment, core components are the building blocks for XML
semantics and business vocabulary that are used in messages and documents.
From a specific business document in a business process, we can refer to a core component,
which holds a minimal set of e‐business information. If the business processes are the verbs in
e‐business terms, the core components represent the nouns and adjectives.
A core component can be used across several business sectors, but it also can become context‐
specific to a business domain, such as an individual industry area.
A core component works with a registry, since it is storable and retrievable using a standard
ebXML registry. A central core component library serves as a reference document for common
business practices across industry business processes.
Tools and References:
The following is a list of essential references and tools for core components provided by ebXML
for business and technical analysts.
• Context and the Re‐usability of Core Components: This document contains context
definitions, the sources of classification value lists, and a pictorial model of core
component and context descriptor relationships.
• Catalog of Context Drivers: This document provides a catalog of context drivers.
• Document Assembly and Context Rules: This describes the procedures and schemas for
assembling documents using contextually driven core components.
• Core Components Dictionary: This document is divided into sections. Each section
begins with information on the applicable category and core component type.
• Core Components Editor and Browser: These tools help analysts browse existing core
components and integrate them to define the format of the XML messages exchanged
between trading partners and to properly define and apply the context rules.
Core Components Examples:
• Core component A:
o Vendor (Industry 1)
o Manufacturer (Industry 2)
o Supplier (Industry 3)
• Core component B:
o Distributor (Industry 1)
o Wholesaler (Industry 2)
o Merchant (Industry 3)
• Core component C:
o Store (Industry 1)
o Outlet (Industry 2)
o Retailer (Industry 3)
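The list above can be read as one core component carrying different context-specific names in different industries. A minimal sketch of that mapping follows; the dictionary layout and the `component_of` and `translate` helpers are assumptions made for illustration, not part of any ebXML specification.

```python
# Each core component maps an industry context to the term used in that industry.
CORE_COMPONENTS = {
    "A": {"Industry 1": "Vendor", "Industry 2": "Manufacturer", "Industry 3": "Supplier"},
    "B": {"Industry 1": "Distributor", "Industry 2": "Wholesaler", "Industry 3": "Merchant"},
    "C": {"Industry 1": "Store", "Industry 2": "Outlet", "Industry 3": "Retailer"},
}


def component_of(term):
    """Return the core component that a context-specific term belongs to, or None."""
    for component, names in CORE_COMPONENTS.items():
        if term in names.values():
            return component
    return None


def translate(term, target_industry):
    """Map a term from one industry to the equivalent term in another industry."""
    component = component_of(term)
    if component is None:
        raise KeyError(term)
    return CORE_COMPONENTS[component][target_industry]


print(translate("Vendor", "Industry 3"))
```

Two partners from different industries can therefore agree on the underlying component even though each keeps its own vocabulary.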
Conclusion:
Core Components are:
• Uniquely identifiable.
• Reusable low-level data structures:
o e.g., party, address, phone, date, currency
o Context-sensitive
• Used to define business process and information models.
• Designed to facilitate interoperability between disparate systems.
• Composable: a core component in ebXML can contain another core component.
ebXML Messaging Service
A complete message is called the message package, which is a Multipurpose Internet Mail
Extensions (MIME) object. The message package contains two principal parts:
• SOAP Message Container: This is a required part of the message and contains the SOAP
extension elements for ebXML, such as routing information, trading partner
information, message identification, and delivery semantics information.
• Payload Containers: These are an optional part of the message and can contain any type of
information that is to be exchanged between parties.
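The two-part message package can be approximated with Python's standard `email` package, which builds MIME objects. This is a sketch only: the SOAP envelope is an empty SOAP 1.1 envelope without any of the ebXML extension elements, and the Content-ID values and the `build_message_package` helper are invented for the example.

```python
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# Bare SOAP 1.1 envelope; the ebXML header extensions are deliberately omitted.
SOAP_ENVELOPE = (
    '<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">'
    "<SOAP-ENV:Header/><SOAP-ENV:Body/>"
    "</SOAP-ENV:Envelope>"
)


def build_message_package(soap_xml, payloads):
    """Build a multipart/related package: one SOAP part followed by optional payload parts."""
    package = MIMEMultipart("related", type="text/xml")
    soap_part = MIMEText(soap_xml, "xml", "utf-8")  # required SOAP message container
    soap_part["Content-ID"] = "<soap-envelope@example.com>"
    package.attach(soap_part)
    for index, data in enumerate(payloads, start=1):
        part = MIMEApplication(data)  # optional payload container, any content type
        part["Content-ID"] = f"<payload-{index}@example.com>"
        package.attach(part)
    return package


package = build_message_package(SOAP_ENVELOPE, [b"<PurchaseOrder/>"])
print(package.get_content_type())
```

Passing an empty payload list still yields a valid package, which reflects the rule that only the SOAP message container is mandatory.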
Messaging Design Criteria:
According to the Messaging Service Specification, the design goals for the ebXML message
service are to:
• Leverage existing standards wherever possible.
• Be simple to implement.
• Support enterprises of all sizes.
• Support a wide variety of communication protocols (HTTP, SMTP, FTP, etc.)
• Support payloads of any type (XML, EDI transactions, binary data, etc.)
• Support reliable messaging.
• Ensure security.
Messaging Architecture:
The ebXML message service was designed to work within the overall context of the ebXML
initiative. However, the ebXML technical architecture is modular, and the message service can
be used independently of ebXML.
The ebXML message service has three logical architectural levels between the business
application and the network protocols:
• The Message Service Interface (MSI): is an application interface for business
applications to invoke message handler functionality for sending and receiving
messages. Similar to ODBC, JDBC, and other abstract service interfaces, it exposes the
message handler functionality as a defined set of APIs for business application
developers.
• The Message Service Handler (MSH): has basic services, such as header processing,
header parsing, security services, reliable messaging services, message packing, and
error handling.
• The Message Transport Interface (MTI): is designed to send messages over a variety of
network and application-level communication protocols. The transport interface
transforms ebXML-specific data to other forms carried by network services and
protocols. This involves a complete exchange between two parties, piggybacking on top
of existing protocols in the network stack.
The ebXML Messaging Architecture is shown in the following diagram.
Message Formatting:
An ebXML message has to be formatted according to the ebXML Message Service Specification
and must conform to the MIME syntax, format, and encoding rules. The definitions of the XML
elements are provided by an XML schema, which extends SOAP to define the ebXML message
header, trace header, manifest, status, and acknowledgment.
Conclusion:
The ebXML messaging service:
• Uses SOAP with Attachments as the payload envelope.
• Supports the higher-level semantics needed in business transactions (security and
reliability).
• Runs over various communication protocols such as HTTP, SMTP, and FTP.
ebXML Usage Example
The following figure shows an ebXML scenario, which makes it easier to grasp the concept of
ebXML. The example is taken from the Technical Architecture Specification.
The example shows how organisations prepare for ebXML, search for new trading partners and
then engage in electronic business.
1. Company A browses the ebXML registry to see what is available online. At best,
company A can reuse all the existing business processes, documents, and core
components common to its industry that are already stored in the ebXML registry.
Otherwise company A designs the missing parts, stores them in the ebXML registry and
makes them available for its industry partners.
2. Company A decides to do electronic business the ebXML way and considers
implementing a local ebXML compliant application. An ebXML Business Service Interface
(BSI) provides the link between the company and the outside ebXML world. The
company has to create a Collaboration Protocol Profile (CPP) which describes the
supported business process capabilities, constraints and technical ebXML information
such as choice of encryption algorithms, encryption certificates and choice of transport
protocols.
3. Company A submits its CPP to an ebXML registry. From that point on, company A is
publicly listed in the ebXML registry and is likely to be discovered by other companies
querying for new trading partners.
4. Company B is already registered at the ebXML registry and is looking for new trading
partners. Company B queries the ebXML registry and receives the CPP of company A.
Company B then has two CPPs: Company A's CPP and its own. The two companies have
to come to an agreement on how to do business, which is called a Collaboration
Protocol Agreement (CPA) in ebXML terminology. Company B uses an ebXML CPA
formation tool to derive a CPA from the requirements of the two CPPs.
5. In this scenario company B communicates with company A directly and sends the newly
created CPA for acceptance to company A. Upon agreement of the CPA by company A,
both companies are ready for electronic business.
6. The companies then use the underlying ebXML framework and exchange business
documents conforming to the CPA. This means that both companies follow the business
processes defined in the CPA.
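Step 4 above, deriving a CPA from two CPPs, amounts to finding what both profiles have in common. The sketch below shows that idea with plain dictionaries; the field names, the sample profiles and the `form_cpa` function are hypothetical and much simpler than a real CPA formation tool.

```python
def form_cpa(cpp_a, cpp_b):
    """Return a proposed CPA covering what both CPPs support, or None if nothing overlaps."""
    processes = sorted(set(cpp_a["processes"]) & set(cpp_b["processes"]))
    # Keep the first party's preference order when choosing a shared transport.
    transports = [t for t in cpp_a["transports"] if t in cpp_b["transports"]]
    if not processes or not transports:
        return None
    return {
        "status": "proposed",
        "parties": [cpp_a["party"], cpp_b["party"]],
        "processes": processes,
        "transport": transports[0],
    }


cpp_company_a = {
    "party": "Company A",
    "processes": ["Create Order", "Invoice"],
    "transports": ["HTTP", "SMTP"],
}
cpp_company_b = {
    "party": "Company B",
    "processes": ["Create Order"],
    "transports": ["SMTP"],
}

# Company B derives the CPA and would then send it to company A for acceptance.
cpa = form_cpa(cpp_company_b, cpp_company_a)
print(cpa)
```

If the profiles share no business process or no transport, no agreement can be proposed and the function returns None.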
ebXML Summary
This tutorial taught you what ebXML is and what its various elements are. You have seen the
complete architecture of ebXML technology. You have seen that the value proposition of
ebXML is that it provides consistent business semantics and a standard technical
infrastructure for exchanges between businesses. You have explored a real example of the usage
of ebXML.
From this tutorial we concluded:
• ebXML is a worldwide project to standardize the exchange of electronic business data.
• ebXML is a group of related specifications that cover analysis of Business Processes and
Business Documents.
• ebXML aims to enable consistent, secure, and interoperable message exchange over an
XML-based infrastructure.
• ebXML is supported by hundreds of industry consortia, standards bodies, companies,
and individuals from around the world.
RosettaNet
Founded in 1998, RosettaNet is an independent, self-funded, non-profit consortium
dedicated to the development of XML-based standard electronic commerce interfaces
to align the processes between supply chain partners on a global basis
The RosettaNet consortium includes IT companies like IBM, Microsoft, EDS, Netscape,
Oracle, SAP, Cisco Systems, Compaq and Intel
RosettaNet Framework
Partner Interface Processes (PIPs)
A Master Dictionary, made up of a Business Dictionary and a Technical Dictionary
A messaging system
Partner Interface Processes (PIPs)
The sequence of steps required to execute an atomic business process between
two supply chain partners
The activities involved
The roles of the partners
The business documents exchanged
The security, authentication, time-outs of messages exchanged
Some Example PIPs
RosettaNet PIPs
More than 100 PIPs grouped into clusters and then into segments
For example, Cluster 3 is Order Management and Segment 3A in this cluster is about
Quote and Order Entry
An example of the PIPs in this segment is PIP3A4: Manage Purchase Order
PIP 3A4: Manage Purchase Order
Buyer creates a Purchase Order and sends it to the Seller
Seller receives the Purchase Order and returns a Purchase Order Acceptance
The Buyer determines success or failure based on message content
RosettaNet Business Process Flow Diagram for PIP3A4
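The PIP 3A4 exchange described above can be sketched as three small functions, one per step. Everything here is illustrative: the message fields, the stock check used by the seller and the function names are assumptions, not the actual PIP 3A4 message definitions.

```python
def create_purchase_order(order_id, item, quantity):
    """Step 1: the buyer creates a Purchase Order to send to the seller."""
    return {"type": "Purchase Order", "order_id": order_id, "item": item, "quantity": quantity}


def seller_respond(purchase_order, stock):
    """Step 2: the seller receives the order and returns a Purchase Order Acceptance."""
    available = stock.get(purchase_order["item"], 0)
    accepted = available >= purchase_order["quantity"]
    return {
        "type": "Purchase Order Acceptance",
        "order_id": purchase_order["order_id"],
        "status": "accepted" if accepted else "rejected",
    }


def buyer_evaluate(purchase_order, response):
    """Step 3: the buyer determines success or failure from the message content."""
    return (response["order_id"] == purchase_order["order_id"]
            and response["status"] == "accepted")


order = create_purchase_order("PO-1", "capacitor", 100)
response = seller_respond(order, stock={"capacitor": 500})
print(buyer_evaluate(order, response))
```

The outcome is decided purely by inspecting the returned message, which is the point made in the last step of the PIP description.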
RosettaNet Business Dictionary
Contains information about the trading partners like
Business Properties (e.g. business address),
Business Data Entities (like ActionIdentity), and
Fundamental Business Data Entities (e.g. BusinessTaxIdentifier, AccountNumber)
There is only one business dictionary that encompasses all supply chains like Electronic
Components (EC), Information Technology (IT), etc.
RosettaNet Technical Dictionary and PIPs
Provide properties for describing products and services
The RosettaNet framework enables supply chain business partners to execute
interoperable electronic business (e‐business) processes by developing and maintaining
PIP implementation guidelines
RosettaNet distributes PIPs to the trading partners, who use these guidelines as a road
map to develop their own software applications
PIPs include all business logic, message flow, and message contents to enable alignment
of two processes
Doing Business through RosettaNet
In order to do electronic business within the RosettaNet framework, there are a number
of steps the partners have to go through
First, the supply chain partners come together and analyze their common inter‐
company business scenarios (i.e., public processes), that is, how they interact to do
business with each other, which documents they exchange and in what sequence
These inter‐company processes are in fact, the “as‐is” scenarios of their way of doing
business with each other
Then they re‐engineer these processes to define the electronic processes to be
implemented within the scope of the RosettaNet Framework
An electronic business process includes both the interactions between partner
companies, and the private processes within the company
RosettaNet provides guidelines only for PIPs which are the public part of the inter‐
company processes
Necessary to differentiate:
Public Business Processes: The processes among the trading partners
RosettaNet defines and fixes Public Business Processes in terms of PIPs
Private Business Processes: The business processes internal to the company
An Example Public Business Process
Product Categorization and Classification in RosettaNet
Product categorization and classification in RosettaNet is achieved through the RosettaNet
Technical Dictionary (RTD)
The RTD specifies classes of products with their properties in XML DTD
That is, associated with a product type, there is a collection of predefined XML tags
Each product class also has a corresponding Universal Standard Products and Services
Classification (UNSPSC) code, basically used to differentiate the products in the catalogs
that do not fall into the IT domain
Global Trade Item Number (GTIN)
In RosettaNet, PIPs use the Global Trade Item Number (GTIN) to identify products
In contrast to the product numbering that has been traditionally used within the
Electronic Components supply chain, GTINs do not contain embedded information to
describe products
Traditional product numbers are split into "segments", each representing specific
product characteristics
RosettaNet, on the other hand, in order to streamline the information exchange
throughout the supply chain, defines GTINs to be used by PIPs to identify products
In this way proprietary manufacturer and customer product numbers are avoided
In RosettaNet product information details can be obtained by querying a supply chain
partner's catalog by using the standard tags through “PIP2A5/EC Query Technical
Information” to return one or more GTINs along with product data
Hence RTD is used in associating the product data with GTINs
To implement the Technical Dictionary, an organization must categorize all saleable
products according to the product classes and class properties specified in the
Technical Dictionary
RosettaNet Messaging Structure
Execution of PIPs involves exchanging messages between the parties, and RosettaNet
provides a Business Message structure for this purpose
RosettaNet business messages (also termed as action or service messages) consist of a
message header and a message body
Both the header and the body are complete, valid XML documents
The header and the body are encoded within a multipart/related MIME message
The message content is specified in individual PIPs
Each PIP has one or more "actions" that are described by means of an individual DTD or
schema
RosettaNet Implementation Framework (RNIF) specifies and provides for a consistent
mechanism to digitally sign and/or encrypt all RosettaNet messages (as needed),
independent of the transfer protocol, PIP and the specific business document being
exchanged
It also specifies a reliable messaging mechanism based on "Acknowledgements"
RosettaNet Transport
The PIP business message is encapsulated into a RosettaNet protocol message termed
a "RosettaNet Object"
The RosettaNet Object is composed of
a version and content length header,
content comprising a business action message, and
a digital signature length followed by a digital signature trailer
The "RosettaNet Object" is encapsulated into an HTTP protocol message and sent as a
direct HTTP message
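The layout of the RosettaNet Object (header, content, signature trailer) can be illustrated with Python's `struct` module. The field widths and byte order used here, 4-byte big-endian integers for the version and both lengths, are assumptions made for the sketch and should be checked against the RosettaNet Implementation Framework before relying on them.

```python
import struct

# Assumed layout: 4-byte version and 4-byte content length, big-endian.
HEADER = struct.Struct(">II")
# Assumed layout: 4-byte signature length preceding the signature bytes.
SIGNATURE_LENGTH = struct.Struct(">I")


def pack_object(version, content, signature):
    """Frame a business action message as header + content + signature trailer."""
    return (HEADER.pack(version, len(content)) + content
            + SIGNATURE_LENGTH.pack(len(signature)) + signature)


def unpack_object(blob):
    """Split a framed object back into (version, content, signature)."""
    version, content_length = HEADER.unpack_from(blob, 0)
    offset = HEADER.size
    content = blob[offset:offset + content_length]
    offset += content_length
    (signature_length,) = SIGNATURE_LENGTH.unpack_from(blob, offset)
    offset += SIGNATURE_LENGTH.size
    signature = blob[offset:offset + signature_length]
    return version, content, signature


blob = pack_object(1, b"<PurchaseOrderRequest/>", b"signature-bytes")
print(unpack_object(blob))
```

Length-prefixed framing lets the receiver separate the content from the signature without parsing the XML first.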
RosettaNet Implementation Successes
Compaq & Delta (Taiwan): Implemented PIP 3A4 and PIP 3A7. Reduced the order processing
lead time to several minutes; enabled Compaq to receive the P.O. acknowledgement
immediately and allowed departments involved in the process to receive related
information in real time.
Intel & WPI (Taiwan): Implemented PIP 3A4 and PIP 3A7. Automated process reduced the
cycle time for order processing between the companies.
RosettaNet Partners
Applied XML in Vertical Industry
Topic: Vertical Industries
Resources
Advertising
XML implementations in the advertising industry.
Bioinformatics
Work being done with XML in the field of Bioinformatics: the application of computational and
analytical methods to biological problems.
Business Consortia
Consortiums supporting the open development of industry‐specific XML frameworks and
vocabularies.
Commerce
XML languages and technologies specifically designed to facilitate electronic commerce.
DTDs
Links direct to the DTDs of various XML vocabularies.
Food
XML vocabularies for the food and restaurant industry.
Industrial
Vocabularies and software designed to facilitate trading within the industrial sector (Gas or
Metals, etc.).
Insurance
XML vocabularies and initiatives in the Insurance industry.
Legal
Vocabularies describing legal information and analysis of the legal implications surrounding the
use of: metadata registries, digital transactions, schemas, agents, linking, and other XML‐
related technologies.
Medical
XML applications within the medical industry.
Music
Music‐related XML vocabularies designed to express everything from musical scores, to basic
notation, to synthesis diagrams and more.
OASIS
Informational links and specifications from OASIS (Organization for the Advancement of
Structured Information Standards).
Real Estate
XML applications within the real estate industry.
Science
XML implementations in the sciences.
Space Exploration
XML vocabularies and software in the field of space exploration and astronomy.
Syndication
XML implementations dealing with content repurposing and syndication.
Telecommunications
Specifications and technologies in the telecommunications, broadcast (tv and radio), and cable
industries.
Travel
Vocabularies and software dealing with the vertical industry of Travel.
Weather
Vocabularies and software dealing with the vertical industry of "weather".
Example
Realising XML benefits in Life Insurance
XML is impacting the life insurance industry in a big way and has taken hold at all major life
insurance companies. Most companies are in the process of exploring XML usage and the need
for a standard vocabulary in order to facilitate communication. The widespread adoption and
success of XML in the industry comes from the proliferation of several different versions of XML
that are being created by vendors, institutions, and other organisations. For organisations that
are looking at XML, the key is to understand how the technology works and the specific business
benefits that it can bring. From there, a life insurer can begin investigating the technology
components it will need in order to implement and support XML within its own operations. This
paper takes a close look at how life insurance organisations can benefit from the use of XML
in their technology initiatives, both to sustain themselves and to gain competitive advantage in
a changing marketplace.
The Promise of XML
XML is a text‐based mark‐up language that can be used to format data for storage or
transmission between and along computer networks. It allows a new level of automated
activity over networks by offering a rich, content‐oriented description of data. XML can
facilitate the sharing and manipulation of vast amounts of data that is stored in the legacy
applications distributed across disparate, incompatible databases in the insurance industry.
XML and Hypertext Markup Language (HTML) both derive from the Standard Generalised
Markup Language (SGML), an ISO standard published in 1986 for marking up structured
documents: XML is a simplified subset of SGML, while HTML is an application of it. HTML was
designed to make it easier to present data on the Web. XML was created to enhance the value
of this data by allowing users to utilise data in various ways.
The W3C is addressing the rapidly growing need of the Internet users to take their Web
experience from merely viewing data to manipulating it in the way they want. XML allows the
identification, exchange, and processing of the data in a manner that is mutually understood by
users’ systems, with custom formats for particular applications if required. Extensible Markup
Language (XML) is a flexible, standards-based data format developed by the World Wide Web
Consortium (W3C). XML represents a significant improvement over current Internet
technologies like HTML in describing and exchanging data over the Internet. While HTML has
been instrumental in popularising the Web and making it accessible to millions of people by
providing a universal method of viewing data, XML promises to take Internet use a step further
by providing a universal method to represent and manipulate data.
Over the last few years, the W3C has released specifications for XML as an industry standard. A
number of financial services companies are exploring the possibility of using XML to solve their
business and technical problems and thereby enhance service delivery. In the retail banking
sector in the US, XML is already being used as the basis for several bill payment and bill
presentment initiatives. In the securities industry, it has been used to define an underlying
data format for the Financial Information Exchange (FIX) protocol, which helps investment
managers exchange trade instructions with broker/dealers. A move is also underway in the
wholesale banking industry to make XML the medium for electronic data interchange (EDI) and
business-to-business e-commerce, which will facilitate its use by smaller firms that now find it
too expensive to adopt EDI.
How does the insurance industry as a whole benefit from adopting XML? With the need to
streamline processes and improve customer service, and an increasing focus on the B2B approach,
there is certainly a perceptible business case, and there are organisations and
committees working towards common industry standards.
Benefits of XML
Apart from the general benefits observed across the industries, XML delivers significant
strategic advantage to the insurance industry. The need for a robust and flexible format is
particularly evident to allow disparate systems and users to access and utilise data stored in
multiple formats. The potential impact of XML is enormous. With XML, every participant in the
value chain—agents, insurance companies, brokers, and underwriters—can streamline
their paper‐intensive processes and standardise the way in which they interact and exchange
information. For example, insurance underwriting requires multiple document exchanges
among many parties, and many decision points at which coverage can be approved or declined.
XML is a critical technology for standardising and automating such information exchange
processes. This is due to the main benefits of using XML, which include:
Common data format: Very important for integrating different applications and standards not
only within the organisation but across organisations using proprietary systems owing to the
evolution of formats and standards within organisations and by industry groups and
associations.
Separation of content from presentation: Allows data content to be altered without affecting
the presentation of data thereby, providing the much needed flexibility in standardisation of
presentation and ‘personalised’ experience for the end customers. At the same time, this helps
support the specification of deep structures that are needed in representing database schemas
or object‐oriented hierarchies.
System and vendor independence: Because XML is platform‐independent, it does not matter
which operating systems or platforms you or your trading partners are using. Thus, diverse
organisations such as insurance companies, agents and underwriters do not need to share the
same systems or platforms; they only need to agree on standard definitions for the document
types they exchange.
Low entry costs: The cost of entry for XML is also much lower than other data exchange
alternatives, such as Electronic Data Interchange (EDI). Although EDI is a documented standard, the
technologies needed to implement it are expensive and require extensive customisation and
programming. Thus, EDI is used mostly in very large organisations with sizeable IT budgets and
internal resources. As opposed to EDI, XML is far less expensive and cumbersome, opening the
door for small, low‐tech companies to participate in online data exchange with their partners.
Opportunity for standardisation: XML enables the creation of document type definitions
(DTDs) that define the data structure and fields of a given document type. Assuming all the
trading partners in an industry or vertical market use the same DTDs, these partners can trade
data seamlessly, with little need for additional customisation or integration. To this end,
insurance companies have formed consortiums to define standard DTDs for documents
that are commonly exchanged among their trading partners.
Validation: Like SGML, XML allows documents to be “parsed”, i.e. checked for structural
validity with respect to a defined DTD, when received by the application. Validation ensures
smooth, accurate, and efficient exchange of data between the sending and the receiving
applications.
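As a rough illustration of the validation step described above, the sketch below parses an incoming document and checks it for a set of required fields. It uses Python's standard xml.etree.ElementTree module, which checks well‐formedness only and does not validate against a DTD, so the required‐field check merely stands in for DTD validation. The element names (policy, holder, type, premium) are hypothetical and are not taken from any industry DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical required child elements of a <policy> document.
REQUIRED_FIELDS = ("holder", "type", "premium")


def check_policy(xml_text):
    """Return a list of problems found; an empty list means the document passed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        # The document is not even well-formed XML.
        return ["not well-formed: %s" % exc]
    problems = []
    if root.tag != "policy":
        problems.append("unexpected root element: %s" % root.tag)
    for name in REQUIRED_FIELDS:
        if root.find(name) is None:
            problems.append("missing element: %s" % name)
    return problems


good = "<policy><holder>A. Kumar</holder><type>life</type><premium>1200</premium></policy>"
bad = "<policy><holder>A. Kumar</holder></policy>"
print(check_policy(good))
print(check_policy(bad))
```

A receiving application would run such a check before passing the data on, so that structurally incomplete documents are rejected at the boundary.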
These strengths have enabled XML to handle the following broad categories of application,
which are handled poorly by HTML:
• A Web client operates between multiple heterogeneous databases. For example, an independent
agent will need to enter policy information just once in order to submit policy applications
across multiple insurance providers.
• A Web client provides personalised views of data to different users, for example, a company
Web site offering tailored product information based on the customer’s profile.
• Information needs to be mined and tailored to the needs of individual users. For example, an
Intranet search engine that captures customer information relevant to cross selling and
product upgrades for sales and marketing representatives.
• Processing load migrates from the Web server to the Web client (e.g., an agency management
system that can act as a remote client of an insurer’s system and still integrate all agency
activity in a single environment).
With these benefits, XML will have a profound impact on three major areas of insurance
business: channel automation, syndication and integration.
Channel Automation
It is increasingly evident that online channels form a viable market for the distribution and
servicing of insurance products. As insurers have recognised the potential benefits of these
channels, the amount of attention that they receive has increased rapidly. Using Internet
technologies like XML in support of an online distribution model, or
business‐to‐customer (B2C) interaction in general, is a first step. As a second step, the benefits
of XML will have solid impact on the nexus between consumers and insurance provider legacy
systems for servicing, between agencies and providers for distribution, and for networked data
transmission within the providers themselves. The issues that arise when adopting XML in the
distribution support are indicative of the challenges experienced across industries in developing
XML. They include the need for standards‐setting organisations and committees, the progress
of the initiatives led by life insurance companies and vendors, and the danger of an industry
fragmentation around competing XML specifications.
Automating the front‐end processing of these lines involves, first and foremost, automated
quotes on new contracts and valuation of the existing policy. The Internet offers the possibility
of direct interface with an insurance company’s quotation engine, which can mean the
difference between an agent presenting a prospect with an indicative quote and an actual one.
Real‐time projections and quotations present major opportunities for increased operational
efficiencies, apart from increased customer satisfaction.
An Internet link between agent and insurer condenses the whole data submission process.
Besides the efficiency of legacy rating, agents can eliminate the time normally taken in mailing
or storing‐and‐forwarding of the policy amendments and riders. The benefits of real‐time data
transmittal in this case are obvious: it can reduce process time to hours or minutes, that
previously required days or weeks. The real benefit for the agent stems from the consolidation
of many front‐end access points through which the agent communicates with its product
providers. The movement to unify the agency interface in this manner has been developed
under the banner of Single‐Entry, Multiple Carrier Interface (SEMCI) in US. Through much
iteration over time, the SEMCI movement has haltingly attempted to simplify the agency
environment by leveraging the standards in electronic data interchange. Proponents of the
SEMCI initiative believe that real progress in agency automation involves the insurance
company supporting transactions in the agency system, not a Web browser. This would involve
reassessing the role of the Internet in agency automation. Generally, insurance companies are
reluctant to let agents process business without direct approval from a company underwriter.
Automating the rating and issuance of a policy would
omit the opportunity for direct underwriter review and approval of risk. Consequently,
insurance providers have been slow to relinquish the underwriting authority and the
accompanying policy information to the agent, especially over a world‐wide, open network.
These concerns can be addressed by developing and incorporating company‐unique business
and underwriting rules into Internet‐based interface architecture of the life insurer. The
propensity for writing bad business can also be avoided by rewarding the agents on the basis of
the quality of business they write rather than the quantity. Another approach to using XML in
agency‐product provider networks is to adopt the application service provider (ASP) model.
The ASP business model has emerged as the Internet’s version of the
service bureau, which helps institutions offset the hardware and maintenance costs of in‐house
IT applications. This may be more significant and cost‐effective for small to middle‐market
businesses.
Syndication
The other big impact XML will have on insurance companies is the ability to easily syndicate
content out to multiple marketplaces and commerce sites hosted by outside parties. Insurance
exchanges are web‐based business centres for buyers and sellers of insurance products and
services. The exchange would allow prequalified participants to conduct transactions in a
neutral environment and help to determine pricing information dynamically. The potential of
such online exchanges lies primarily in the extra commoditised lines of life products such as
protection or new generation pension products, though transactional conduit would be weak
for the other lines of products requiring advice and customisation. B2B exchanges impact all
lines of business, however, by electronically linking business
partners in the insurance supply chain: sellers, underwriters, and service providers. Whether
the exchange is third‐party‐to‐company, single‐provider, or multiple‐provider, each can use
XML to share information, thanks to its
advantages of data standardisation and validation. For any organisation, the end goal of
syndication is to get its content, products or services in front of as many potential
customers as possible. With literally hundreds of e‐marketplaces in operation today, it makes
sense for organisations to leverage them as a key sales channel. For the marketplaces
themselves, accepting syndicated content from multiple suppliers gives them highly
differentiated sites that will attract more customers and generate more revenue.
For example, if a company formats product information pieces in XML, the various
marketplaces to which this information is sent can ensure that the defined fields (policy type,
state, price, etc.) are used appropriately on their sites. Manually formatting data and content
for different distribution points is highly cumbersome; XML lets you do it once, and syndicate
the information to any number of distribution points.
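The "format once, syndicate many times" idea above can be sketched as follows. One XML product record is built a single time and then rendered in two different ways, as two marketplaces might. The element names (product, policyType, state, price) and both rendering functions are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET


def build_product(policy_type, state, price):
    """Format a product record as XML once, at the source."""
    product = ET.Element("product")
    ET.SubElement(product, "policyType").text = policy_type
    ET.SubElement(product, "state").text = state
    ET.SubElement(product, "price").text = "%.2f" % price
    return ET.tostring(product, encoding="unicode")


def render_listing(product_xml):
    """Hypothetical marketplace A: a one-line text listing."""
    root = ET.fromstring(product_xml)
    return "%s (%s): %s" % (
        root.findtext("policyType"),
        root.findtext("state"),
        root.findtext("price"),
    )


def render_row(product_xml):
    """Hypothetical marketplace B: a field-to-value mapping for its own database."""
    root = ET.fromstring(product_xml)
    return {child.tag: child.text for child in root}


product_xml = build_product("term life", "NY", 19.5)
print(render_listing(product_xml))
print(render_row(product_xml))
```

Because both consumers read the same named fields, the supplier does not need to reformat the record for each distribution point.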
Integration and Interoperability
Use of a common XML data format within an insurance company can make information
accessible and easy to update across the enterprise. By logically linking separate applications
and lines of business, this can deliver to the insurers greater success in customer relationship
management (CRM). The reduction of the efforts and error can dramatically shrink the costs
associated with data integration and database management. Integration across internal
application environment of an insurance company can be achieved using B2B integration
solutions, including any internal middleware solution (i.e., EAI) that the firm may currently
utilise to integrate applications in‐house. It is critical that the integration solution be able to
support the use of XML to access data stored in the range of back‐office systems.
Internally, the usage of XML makes it easier for data in multiple disparate systems to be used
together or shared in a common application. For example, organisations may have one legacy
system for customer and policy information, and another for accounting and agency
management functions. By generating XML output from a given system, the data can be shared
with different systems or custom applications in order to bring the disparate systems together.
XML can save significant time and effort compared with integrating internal systems via traditional
programming and custom integration methods. For example, if an insurer needs to import data
from multiple customer databases into a single repository, or if there is a need to integrate the
systems of two different companies brought together by a merger, XML is a great approach.
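A minimal sketch of the internal integration scenario above: two systems each produce an XML export, and the exports are combined into a single repository keyed by customer. The two export layouts, the identifiers and the field names are all hypothetical; real legacy systems would need their own mappings.

```python
import xml.etree.ElementTree as ET

# Hypothetical export from the customer and policy system.
POLICY_EXPORT = """<customers>
  <customer id="C1"><name>R. Iyer</name><policy>LIFE-01</policy></customer>
  <customer id="C2"><name>S. Rao</name><policy>PEN-07</policy></customer>
</customers>"""

# Hypothetical export from the accounting system.
ACCOUNTS_EXPORT = """<accounts>
  <account customer="C1"><balance>250.00</balance></account>
  <account customer="C2"><balance>0.00</balance></account>
</accounts>"""


def merge_exports(policy_xml, accounts_xml):
    """Combine both exports into one dictionary keyed by customer id."""
    balances = {
        account.get("customer"): account.findtext("balance")
        for account in ET.fromstring(accounts_xml).iter("account")
    }
    merged = {}
    for customer in ET.fromstring(policy_xml).iter("customer"):
        customer_id = customer.get("id")
        merged[customer_id] = {
            "name": customer.findtext("name"),
            "policy": customer.findtext("policy"),
            "balance": balances.get(customer_id),  # None if accounting has no record
        }
    return merged


repository = merge_exports(POLICY_EXPORT, ACCOUNTS_EXPORT)
print(repository["C1"])
```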
Externally, XML enables companies to extract data from their back‐end systems and to put it
into a standard format that external parties can easily ingest directly into their own back‐end
systems. For example, a company could take customer details from its channels over the
extranet and send real‐time quotes and illustration on its latest products or can provide billing
or commission information online to its channel partners and the receiving company can direct
the information to its own information systems. In the past, this level of integration would
require costly custom development for every trading partner. Interoperability is one of the
biggest bangs for the buck that XML offers. This includes enabling internal systems to share
data with each other, as well as enabling companies to share data with external business
partners. In practice, however, an insurance company’s existing technology infrastructure is
often a limiting factor in the move towards automation and integration. Real‐time transactions are
possible only if core processing systems and data warehouses at the insurance company can
support them. Restructuring of the legacy systems that are ill prepared for e‐commerce can be
a large and costly affair.
Case Study: Policy servicing through the Web
An example of use of XML is providing self‐service capabilities for policyholders to effect change
to their policy attributes themselves. These policy changes may include standard product
service features like termination, fund switching, partial surrender or withdrawals. Changes can
become complex and result in a new quote and change of contract like changing effective or
maturity dates, assured sum, premiums, contract type or reinstating or continuations to a new
policy etc. Providing the ability to the customers to effect such changes directly would improve
customer service quality and offer proactive support as well as reduced operational processing
costs. For a large insurance company this would mean mediating between multiple
heterogeneous policy and customer databases. The processing load for such a service can also
be high, so it would help to distribute processing to the Web client. Also, a policyholder
may have multiple policies like life, pensions, and saving products and that would mean
multiple options and multiple ways of presenting and viewing the same data. All these
scenarios point towards the use of XML in making various types of data available through the
Web client.
1.1 eCustoms as a Main Pillar of the Pan‐European eGovernment Strategy
Leveraging information and communication technology (ICT) in the field of public
administration (eGovernment) is regarded as a mission critical factor for achieving growth and
competitiveness in Europe [88] p. 2. Global competitiveness of businesses is significantly
influenced by transaction costs incurred in dealing with public administrations [17] p. 21. In the
process of creating the prerequisites for better and more efficient public administration,
eGovernment is considered to be an enabler as it incorporates the use of information and
communication technologies combined with organizational change in order to improve public
services [17] p. 7.
At best, successful eGovernment solutions are intended to be beneficial to both public
administrations and businesses [92] p. 345. However, eGovernment initiatives have to face the
challenge of Business‐to‐Government (B2G) and Government‐to‐Government (G2G)
integration, comprising seamless exchange of information, interoperation of independent
eGovernment information systems, and coordination of governmental processes on the one
side with information systems and processes of the economic operators on the other side [70]
p. 889. This requires interoperability within or between organizations (be it public or private),
nationally or across Europe [17] p. 19. In the following we will put our focus on seamless B2G
integration. Regarding B2G, Krcmar and Wolf have introduced the notion of 'Collaborative
eGovernment', which postulates seamless integration of eBusiness infrastructures of
enterprises with information systems of administrations from an end‐to‐end process
perspective [91] p. 179. More specifically, we will inquire into the interaction between economic
operators and government agencies in the context of European eCustoms procedures.
Regarding eCustoms procedures, Baida et al. point out that crucial EU control procedures are
still paper based [3] p. 12. Hence, new customs control procedures are required which can be
supported much more effectively and efficiently by the use of IT. However, designing and
implementing changes in customs control procedures has to take into account technological,
financial and political issues and has to balance between greater security demands in
international trade and reduction of administrative work. Meeting both these objectives is
generally perceived as a dilemma (cp. [3] p. 1) that can only be dissolved by process integration
on the basis of seamless exchange of customs information between economic operators and
national customs authorities as well as among national customs authorities.
Therefore, the European Commission has issued the Electronic Customs Multi‐Annual Strategic
Plan (MASP) [18], pointing out the objectives of a European wide eCustoms strategy. On the
one hand, the EU aims at operational efficiency by reducing administrative burdens and
improving clearance times of customs procedures. On the other hand, the EU focuses on
achieving control effectiveness in increasing security of trade and safety of goods as well as
enhancing health and environmental protection while safeguarding intellectual property and
preventing fiscal fraud [18] p. 5. The centerpiece of MASP is the so‐called Single Window Access
(SWA) concept, which means that traders have access to a single electronic point for import,
transit, excise and export transactions, irrespective of the member state in which the
transaction starts or ends [18] p. 6. The SWA concept provides a 'single point of access' to
existing and future computerized customs systems of the respective member states. This can
only be achieved by integration of existing community customs procedures and systems, such
as the New Computerised Transit System (NCTS) and future systems such as the Automated
Export System (AES), the Automated Import System (AIS) and the Excise Movement and Control
System (EMCS) with the ERP systems of the economic operators.
1.2 Information Technology for Adoption and Intelligent Design for eGovernment (ITAIDE)
The ITAIDE project is a European research project funded by the European Commission under
the 6th Framework Information Society Technology (IST) programme with the objective to
develop innovative eCustoms solutions [40]. ITAIDE started in January 2006 and will end in June
2010. The ITAIDE consortium consists of tax and customs administrations, IT providers, multi‐
nationals from different sectors, the standardization body of the United Nations, and European
universities from the Netherlands, Germany, Denmark, and Finland. The ITAIDE project aims at
developing recommendations, models, methods, and tools for a more effective and efficient
exchange of taxation and customs information. The goal is to lower the administrative burden
for economic operators while at the same time meeting tightened control, security and
transparency requirements of public administration in international trade. From a business
perspective, research is being conducted by developing a procedure redesign methodology for
simplified customs and taxation procedures and elaborating a network collaboration model for
developing private‐public partnerships between customs and taxation offices, business and
technology providers for more effective provisioning of public administration eServices [40] p.
6.
From an information system perspective, a common information model for electronic trade
documents based on existing international standards from UN/CEFACT, World Customs
Organization (WCO), and EU‐DG Taxation & Customs in combination with service oriented
concepts and standards will form the foundation for ensuring a technically and semantically
interoperable exchange of business documents between different eCustoms applications
throughout Europe [40] p. 8. As depicted in Figure 2, the underlying premise in achieving this
aim is pan‐European interoperability enforcing a common understanding and mutual
agreement on a technical, procedural and organizational level between businesses and
governments as well as between governmental authorities of the different member states.
In order to validate the theoretical research artifacts elaborated within the project, ITAIDE
avails itself of the Living Lab concept [78]. Four European Living Labs provide the real‐life
settings in which the eCustoms solutions are being developed and their broader influence on
diffusion and adoption is being investigated. One of these Living Labs is the Beer Living Lab
(BLL), where the shipment of beer outside the EU (export) and within the EU (intra‐community
supplies) is being examined. Products such as alcohol, tobacco or energy products are
subject to excise duties, which are indirect taxes on consumption of certain products that are
raised in the country of consumption. In our article we refer to the application scenario for
intra‐EU movement of excise goods as the beer transport is subject to duty suspension in the
country of origin.
Web services for mobile devices
Introduction
‣ Many Web Services and APIs were originally developed with server to server or server
to browser in mind, not mobile applications
‣ Mobile platforms have their own set of challenges given:
- Bandwidth
- Memory and CPU Availability
- Storage Capacity
- Connectivity Options and Issues
- Security
- User Interaction and Display
Mobile Integration Challenges
API Developer Programs
‣ Is mobile access allowed?
‣ Other considerations:
‣ Call Limitations (# per second, total per hour/day, pricing above)
‣ Caching and Storage of Data
‣ Persistence of Data – Length of time stored
‣ Freshness of Data – Length of time before refreshing
Caching and Storage Limits
‣ Persistence Limitations
‣ You may otherwise store Amazon Properties generally (other than pricing or
availability information) for caching purposes for up to 24 hours. However, you
may store the following Amazon Properties for caching purposes for up to 1
month:…
From the Amazon Web Services User Agreement
‣ Refreshing Requirements
‣ You must refresh and re‐display any Amazon Properties (other than pricing or
availability information) at least once every 24 hours or once every month, as
applicable, by making a call to AWS and refreshing your Application’s contents
immediately after the call.
From the Amazon Web Services User Agreement
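The persistence and refresh limits quoted above amount to a time‐to‐live (TTL) cache on the device: data may be kept for a bounded period and must be fetched again once that period has passed. The sketch below is one way to express this, assuming a 24‐hour limit as in the quoted excerpt. The TtlCache class, the fetch callback and the injectable clock are all hypothetical and do not correspond to any provider's API.

```python
import time


class TtlCache:
    """Keep fetched values for at most ttl_seconds, then fetch them again."""

    def __init__(self, fetch, ttl_seconds, clock=time.time):
        self._fetch = fetch      # function that performs the real API call
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so the example can simulate time
        self._entries = {}       # key -> (time stored, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]      # still fresh: no network round trip
        value = self._fetch(key)
        self._entries[key] = (now, value)
        return value


# Simulated API call and clock, for illustration only.
calls = []


def fake_fetch(key):
    calls.append(key)
    return "details for " + key


now = [0.0]
cache = TtlCache(fake_fetch, ttl_seconds=24 * 3600, clock=lambda: now[0])
cache.get("item-1")        # first request goes to the API
cache.get("item-1")        # served from the cache
now[0] = 24 * 3600 + 1     # just over 24 hours later
cache.get("item-1")        # stale entry is refreshed
print(len(calls))
```

Data with a different permitted lifetime, such as the one‐month category in the excerpt, would simply use a second cache instance with a longer TTL.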
Application Verification
‣ Certification of your Mobile Application
‣ “If any of your Facebook Platform Applications are client‐resident (including on a
mobile device), you agree to furnish a copy of such Facebook Platform
Applications and any supporting documentation upon request for the purpose of
verifying your compliance with this Agreement; and …”
From the Facebook Developer Terms of Service
‣ Some API sets require certification on top of distributor certification
‣ Determine Costs / Timeframe / Effort given a mobile app
‣ Prepare for testing – typically looking for error handling, API abuse
‣ Certification is a positive!
‣ Credibility, validation, marketing
‣ Keeps the neighborhood safe
Mobile Techniques
Authentication vs. Authorization
‣ The Difference
Authentication from the API provider – API Key
Authorization from the user – authToken
‣ Session Key
- Providing the combination of API Key and authToken lets the application receive the session
key
- What is the shelf life of the session key?
‣ Authorization will commonly affect user’s experience on mobile
‣ eBay authentication and authorization screens
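The session key flow above can be sketched as follows: the application reuses a session key while it is still valid and exchanges the API key and authToken for a new one only after it expires. Everything here is hypothetical, including the Session class, the exchange callback and the one‐hour lifetime; a real provider defines its own endpoint and shelf life.

```python
import time


class Session:
    """A session key plus the time at which it stops being valid."""

    def __init__(self, key, expires_at):
        self.key = key
        self.expires_at = expires_at


def ensure_session(session, api_key, auth_token, exchange, now=None):
    """Reuse a live session; otherwise trade the API key and authToken for a new one."""
    now = time.time() if now is None else now
    if session is not None and now < session.expires_at:
        return session
    return exchange(api_key, auth_token, now)


# Stand-in for the provider's session endpoint (hypothetical, one-hour lifetime).
exchanges = []


def fake_exchange(api_key, auth_token, now):
    exchanges.append(now)
    return Session("session-%d" % len(exchanges), now + 3600)


s1 = ensure_session(None, "app-key", "user-token", fake_exchange, now=0)
s2 = ensure_session(s1, "app-key", "user-token", fake_exchange, now=1800)
s3 = ensure_session(s2, "app-key", "user-token", fake_exchange, now=3600)
print(s1.key, s2.key, s3.key)
```

Keeping the session alive for as long as the provider allows avoids sending the user back through the authorization screens, which is where the mobile experience tends to suffer.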
Creating Web Services
‣ Try to serve lowest common device
‣ Balance flexibility with overhead in calls:
- Filtering of criteria and paging of data functionality
- Split out high traffic calls versus critical requests
- Easy but secure authentication and authorization for both the consumer and
application
‣ Benefit for mobile consumers, application programmers and web service providers
- Decreased round trips
- Increased efficiency of calls and applications
- Better use of call volume restrictions
- User experience improves
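The filtering and paging functionality mentioned above could look like the following on the service side. The function name, parameters and response fields are assumptions for illustration; the point is that the client receives only the slice it asked for, together with enough metadata to request the next page.

```python
def page_of(items, page, page_size, predicate=None):
    """Filter items, then return one page of matches plus paging metadata."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    matches = [item for item in items if predicate is None or predicate(item)]
    start = (page - 1) * page_size
    total_pages = (len(matches) + page_size - 1) // page_size  # ceiling division
    return {
        "items": matches[start:start + page_size],
        "page": page,
        "total_pages": total_pages,
        "total_items": len(matches),
    }


catalogue = list(range(1, 24))  # 23 sample items
result = page_of(catalogue, page=3, page_size=5, predicate=lambda n: n % 2 == 0)
print(result)
```

A small page size chosen by the user keeps each response short, which suits limited bandwidth and memory on the device.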
Wrap Up
‣ Provide user preference options that translate into better tuned API calls, for example the
number of items to show on a page
‣ This will translate to data efficiency, fewer calls and greater use of data retrieved
‣ Minimize the number of trips; combine in single calls when possible
‣ Leverage caching and data storage when allowed
‣ Balance data granularity with length of returns for parsing
Final thought to ponder
The mobile device that is typically serving as a web service client could, now or in the
future, be a web service provider…
A Web Services Strategy for Mobile Phones
In most web services presentations, the speaker has a slide of a mobile phone, a PDA, a
computer, and other devices communicating with a web service via SOAP and HTTP. You quickly
envision a utopia of universal access but overlook the fact that your old Nokia doesn't do XML
web services. If you have a J2ME‐enabled phone connected to the Internet, it's very possible to
interact with web services directly. However, the majority of mobile phone users do not have
these phones, which means an alternative mode of access must be provided.
Some developers assume that deploying web services is all about publishing a WSDL file. This
might be enough for integrators to use your web service, but in order to facilitate widespread
adoption of your service, some type of UI for regular users should be deployed to abstract away
the technical details. It is possible to dynamically generate HTML web pages with input forms
from WSDL files (SOAPClient is a good example of this). This offers a good environment for
testing web services and, in some cases, might be all you really need to offer your users.
Creating an HTML interface to your web service ensures access from any computer, but what
about mobile phones?
Mobile web services
Wireless carriers currently offer services that allow information to be "pushed" to your phone
or accessed from your phone such as weather, stock quotes, news, traffic, and sports updates.
With web services, phones now have the potential to actually consume useful services. But
before developing a mobile client, you might want to think twice before taking the SOAP/HTTP
route. First of all, turning your phone into a SOAP client might have some performance costs
related to slow data speeds and processing both HTTP commands and XML. Secondly, most
phones don't come with web services support built in. Finally, you can hide the web services
complexity and leverage existing technologies to make use of their widespread availability. This
would require a gateway to sit in between the phone and the web service to handle the passing
and conversion of messages but you no longer have to worry about client‐side performance
issues or even deploying a client.
The gateway would take care of all the SOAP/HTTP request and response handling and then
return results to the mobile phone in a supported format. There are really only a few
means you can rely on being available: text‐messaging, voice, and data services. For text‐
messaging, you can deploy a bot that sits between your client and your service. For voice, you
can allow your clients to call a number and issue voice commands by creating a VoiceXML
wrapper around your web service. A good example of this is Tellme's service that gives voice
driving directions using Microsoft's MapPoint web service. For data services, you can have a
WAP gateway that takes requests from a WML page, calls the service, and then returns the
results back to a WAP browser in the WML format. Some phones have XHTML browsers that
you can take advantage of and some even offer development environments that allow you to
call the web service directly from the phone. Even though you cannot guarantee that your users
will have phones that offer such features now, using data services along with a SOAP messaging
library might be a common solution for deploying mobile clients in the future. Development
environments such as J2ME and the .NET Compact Framework give developers robust
platforms for developing advanced mobile clients. However, for simple clients that mainly
access text‐based information, deploying a gateway is sufficient.
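The conversion step performed by such a gateway can be sketched as below: the gateway receives the SOAP response on behalf of the phone and reduces it to a short text reply. The response payload (GetQuoteResponse, symbol, price) and its namespace are invented for the example. The envelope namespace is the SOAP 1.1 one, and the 160‐character limit is that of a single SMS using the default 7‐bit alphabet.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SMS_LIMIT = 160  # characters in a single SMS (default 7-bit alphabet)

# Hypothetical response from a quote service.
SAMPLE_RESPONSE = """<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <GetQuoteResponse xmlns="urn:example:quotes">
      <symbol>XYZ</symbol>
      <price>42.10</price>
    </GetQuoteResponse>
  </soap:Body>
</soap:Envelope>"""


def soap_to_sms(response_xml):
    """Flatten the first element of the SOAP body into 'name=value' text."""
    envelope = ET.fromstring(response_xml)
    body = envelope.find("{%s}Body" % SOAP_NS)
    if body is None or len(body) == 0:
        return "Service returned no data"
    payload = body[0]
    parts = []
    for child in payload:
        local_name = child.tag.split("}")[-1]  # drop the namespace prefix
        parts.append("%s=%s" % (local_name, (child.text or "").strip()))
    return ", ".join(parts)[:SMS_LIMIT]


print(soap_to_sms(SAMPLE_RESPONSE))
```

The phone never sees XML at all; only the gateway needs a SOAP stack, which is the availability argument made above.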
Deploy Gateways for Maximum Availability
When deciding what kind of client to deploy, your final decision should be based on making the
client available to as many users as possible. This makes deploying a gateway the best solution.
You no longer have to worry about technology support or even client application installation.
For example, if you deploy a Short Message Service (SMS) gateway, you have just turned every
mobile phone into a potential client. By using SMS you can also take advantage of its "store and
forward" features which guarantee message delivery. Sending an SMS message from a client is
pretty straightforward, but messages sent between service providers and SMS gateways are
quite the opposite. The SMS Forum recently announced a plan to encode these messages using
SOAP and HTTP to ensure interoperability ‐‐ another testament to the adoption of web services.
VoiceXML is a language for building voice applications much like you hear when calling
customer service hotlines. It is an XML‐based standard developed by the W3C's Voice Browser
Working Group. Most VoiceXML developer portals give you access to a phone number for
testing your application; however, VoiceXML is not limited to phones and can actually be
accessed by any VoiceXML‐enabled client. This client can be the usual phone, but it could also
be an existing Web browser with a built in VoiceXML interpreter. A good example of this is the
multimodal browser being developed by IBM and Opera based on the XHTML+Voice (X+V)
proposed specification. The term "multimodal" simply refers to multiple modes of interaction
by extending user interfaces to include input from speech, keyboards, pointing devices, touch
pads, electronic pens, and any other type of input device. The W3C also has a Multimodal
Interaction Working Group that is developing standards to turn the concept of universal
accessibility into a reality. The basic concept of VoiceXML is to issue prompts to a user and then
have that user respond using their voice. Once the user's voice is captured, the voice
application can perform a specific task and return the results. Due to the inaccuracies of voice
recognition, I highly recommend writing applications that accept a predefined set of commands
in combination with very little dynamic speech input from the user.
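The advice about a predefined set of commands can be sketched in a few lines of Python. The sketch below generates a small VoiceXML form whose single field accepts only a fixed list of spoken options. The element names (vxml, form, field, prompt, option) follow VoiceXML 2.0; the field name "command", the prompt text, and the command list are invented for illustration and are not taken from any particular voice portal.

```python
import xml.etree.ElementTree as ET

VXML_NS = "http://www.w3.org/2001/vxml"

def build_command_form(prompt_text, commands):
    """Return a VoiceXML document that accepts only the given spoken commands."""
    vxml = ET.Element("vxml", {"version": "2.0", "xmlns": VXML_NS})
    form = ET.SubElement(vxml, "form", {"id": "main"})
    # "command" is a hypothetical field name chosen for this example.
    field = ET.SubElement(form, "field", {"name": "command"})
    ET.SubElement(field, "prompt").text = prompt_text
    for word in commands:
        # Each <option> adds one allowed utterance to the field.
        option = ET.SubElement(field, "option", {"value": word})
        option.text = word
    return ET.tostring(vxml, encoding="unicode")

doc = build_command_form("Say weather, news, or stocks.",
                         ["weather", "news", "stocks"])
print(doc)
```

Keeping the vocabulary this small trades flexibility for recognition accuracy, which is the trade-off recommended above.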
The Wireless Application Protocol (WAP) is a set of standards to enable wireless access to
Internet services from resource‐constrained mobile devices. WAP provides an entire
architecture to make a mini‐Web possible by defining standards such as the Wireless Markup
Language (WML) and WMLScript. Think of WML and WMLScript as HTML and JavaScript,
respectively, optimized for the mobile phone world. WAP development is fairly easy and you
can start serving up WML pages instantly by setting the appropriate MIME types on your web
server. Wireless carriers provide WAP gateways that translate the HTTP to the equivalent WAP
optimizations. Since you are using a web server to serve up content, you can use APIs on the
server‐side to call web services and return the results in WML format.
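As a rough illustration of returning results in WML format, the following Python sketch wraps a text result in a single-card WML deck and pairs it with the WML MIME type (text/vnd.wap.wml). The function name render_wml, the card id, and the sample values are assumptions made for this example; a real application would hand the headers and body to whatever web server or framework it runs on.

```python
from xml.sax.saxutils import escape

WML_MIME_TYPE = "text/vnd.wap.wml"

WML_TEMPLATE = """<?xml version="1.0"?>
<!DOCTYPE wml PUBLIC "-//WAPFORUM//DTD WML 1.1//EN" "http://www.wapforum.org/DTD/wml_1.1.xml">
<wml>
  <card id="result" title="{title}">
    <p>{body}</p>
  </card>
</wml>
"""

def render_wml(title, body):
    """Wrap a web service result in a one-card WML deck.

    Returns (headers, deck). The title goes into an attribute, so double
    quotes are escaped as well as the usual &, < and >.
    """
    deck = WML_TEMPLATE.format(
        title=escape(title, {'"': "&quot;"}),
        body=escape(body),
    )
    headers = {"Content-Type": WML_MIME_TYPE}
    return headers, deck

headers, deck = render_wml("Quote", "AT&T: 21.50")
print(headers["Content-Type"])
print(deck)
```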
Use J2ME for Advanced UIs
J2ME is gaining a lot of momentum as new J2ME‐enabled phones hit the market. With every
major manufacturer embedding Java on some of their phones, Java in the mobile space is no
longer hype. An interesting note on J2ME is that all Mobile Information Device Profile (MIDP)
implementations must provide support for the HTTP protocol. This guarantees the availability
of HTTP as a transport mechanism for web services. There is currently no standardized web
services support for J2ME, but JSR 172 defines a J2ME Web Services Specification that will
eventually provide standard access from J2ME to web services. In the meantime, you can use a
third‐party library such as kSOAP. J2ME also has SMS support through the Wireless Messaging
API (WMA) optional package.
Text‐messaging
Text‐messaging development might sound simple, but it is the most confusing route to take. I
mentioned SMS earlier, but we also have EMS, MMS, and other Instant Messaging protocols on
the scene. AOL Instant Messenger (AIM), MSN Messenger, and Yahoo Messenger all offer
bridges from their protocols to SMS. T‐Mobile even has AIM embedded into some of its phones.
An easy way to build a service would be to connect to one of these popular IM protocols to
avoid dealing with the SMS mess. When you build SMS applications directly, you have to worry
about possible hardware requirements, SMS gateway deployment, and carrier partnership.
Don't forget, you will require some kind of phone number for users to access your application
and, since you are connecting to a cellular network, a business account.
SMS Workarounds
Most wireless carriers allow anyone to send SMS messages to their customers via email.
Although users cannot reply to the sender of the email, this is useful for notification messages.
You have to be a "premier partner" to be able to send and receive, but the average
programmer can use an SMS Broker to have the same access without going broke. SMS Brokers
partner with a cellular service provider and then lease access to developers. Simplewire offers
one such development environment for creating and testing your wireless messaging
application. They have a free evaluation version and they also offer paid commercial
deployment packages. ActiveBuddy's BuddyScript SDK is a very complete solution for developing,
testing, and deploying your interactive agent. The BuddyScript SDK includes its own scripting
language, IDE, server for deploying agents, and much more. You can launch your interactive
agent to mobile users via SMS and WAP without setting up a relationship with a wireless
carrier. The BuddyScript Server included in the SDK can exchange data via SOAP over HTTP. The
only downside to the BuddyScript approach is that you are pretty much tied in to their
development platform.
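The email route mentioned at the start of this section can be sketched as follows. The Python code only builds the notification email; the gateway domain, phone number, and sender address are placeholders, since each carrier publishes its own email-to-SMS address format. Actually sending the message would require an SMTP server (for example through smtplib), which is left out here.

```python
from email.message import EmailMessage

SMS_MAX_CHARS = 160  # length of a single SMS in the default GSM alphabet

def build_sms_email(phone_number, gateway_domain, text, sender):
    """Build an email that a carrier's email-to-SMS gateway could relay as a text."""
    msg = EmailMessage()
    msg["From"] = sender
    # Hypothetical address format: <number>@<carrier gateway domain>.
    msg["To"] = f"{phone_number}@{gateway_domain}"
    # Truncate so the notification fits into one message.
    msg.set_content(text[:SMS_MAX_CHARS])
    return msg

msg = build_sms_email("5551234567", "sms.example.com",
                      "Server down", "alerts@example.org")
print(msg["To"])
```

Because the recipient cannot reply, this pattern suits one-way notifications only, as noted above.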
If you are tech savvy, you might be able to buy a data cable for your GSM phone and, with the
appropriate software, process the modem commands. Of course, this is not suitable for a
commercial application, but dedicated GSM modems do exist. So, theoretically, you can
get a personal account with a wireless GSM carrier, take out the SIM card, put it in a dedicated
GSM modem, and hack your way through to make it work. And, on the subject of hacking, I
must mention Kannel, an open source WAP and SMS gateway.
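For readers curious what "processing the modem commands" looks like, here is a hedged Python sketch that only assembles the standard GSM AT commands for sending one text-mode SMS (AT+CMGF to select text mode, AT+CMGS to start a message, Ctrl-Z to finish it). It does not talk to any hardware: writing these bytes to a serial port, waiting for the modem's ">" prompt, and checking its responses are left out, and the phone number is a placeholder.

```python
CTRL_Z = b"\x1a"  # terminates the message body in text mode

def sms_at_commands(number, text):
    """Return the byte sequences to write to a GSM modem to send one SMS."""
    return [
        b"AT+CMGF=1\r",                                   # switch to text mode
        b'AT+CMGS="' + number.encode("ascii") + b'"\r',   # start a message
        text.encode("ascii") + CTRL_Z,                    # body, then Ctrl-Z
    ]

commands = sms_at_commands("+15551234567", "Build finished")
for chunk in commands:
    print(chunk)
```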
VoiceXML
Accessing web services with VoiceXML has never been easier using BeVocal's JavaScript SOAP
API. The BeVocal Cafe is an excellent place to get started; it offers a phone number you can call
to access your application as well as documentation, samples, and tools.
Earlier I mentioned how Tellme offers access to voice driving directions using the MapPoint
web service. Tellme also offers a robust development environment for developing VoiceXML
applications. Unlike BeVocal, Tellme does not have built‐in support for SOAP. You would have
to roll your own JavaScript library or you could forward the input to an external application
with SOAP support (e.g. a CGI script or Servlet) and then have that application generate
VoiceXML back to the client. Have you ever picked up the phone only to hear an automated
message? The telemarketers are getting lazy, but now you too can bug people with Tellme's
Notifier service which allows your VoiceXML application to initiate the call to your clients. The
Notifier service can be useful for reminders of appointments and other types of asynchronous
messages.
WAP
Since you cannot view WML on regular browsers, you will need a WAP browser emulator for
testing your WAP applications. WAP also allows applications to initiate sessions through the use
of the WAP Push Access Protocol. By creating push‐enabled applications, you can
implement asynchronous messaging. Openwave provides a WAP Push Library as part of its
Openwave Mobile Developer Toolkit (OMDT). The OMDT is an excellent place to get started
with its inclusion of emulators, messaging APIs, and support for the latest technologies in the
mobile world. Nokia also has an excellent resource page that has tools and documents to help
you get started.
If you don't have access to a WAP gateway, don't forget about the open source Kannel project.
The open source crowd will also enjoy Enhydra's open source Java/XML application server.
Although you can serve up WML pages using any Web server, Enhydra provides an excellent
framework for separating presentation from code. Using Enhydra's XMLC, you can convert a
specially formatted WML document into objects that you can access from your Java code.
J2ME
The Wireless Messaging API (WMA) package gives you access to SMS functionality, but there are
third‐party packages that are more suitable for XML messaging. kSOAP, another open source
project from Enhydra, is a lightweight SOAP implementation suitable for J2ME. If you think
SOAP is too bloated and want to shave off some overhead, Enhydra also has an XML‐RPC
implementation (kXML‐RPC project). For the Java RMI and JMS fans out there, there is a J2ME
RMI optional package and a commercial wireless JMS solution available. To speed up your
development efforts, you might want to consider using one of several SDKs available. Sun and
Nokia's toolkits can run as standalone tools and also integrate into SunONE and JBuilder.
.NET Compact Framework
The .NET Compact Framework provides a robust environment for developing mobile
applications. In traditional Microsoft point‐and‐click fashion, you can have an application
running with minimal effort. If you are a .NET developer, you will enjoy its .NET Framework
heritage and its integration with Visual Studio .NET. The .NET Compact Framework will
definitely have its biggest impact in the PDA market where Windows enjoys some success. The
downside is that the .NET Compact Framework has no market share whatsoever in the mobile
phone space, where it will have to break into a tight market.
Conclusion
With the availability of packet‐switched, always‐on networks for mobile phones becoming more
widespread, mobile access to data will become easier than ever. Web services seem like the
natural solution for integration problems, but mobile phones do not have the privilege of
guaranteeing support for the core web services technologies. However, you can still effectively
deploy a web service for mobile clients by deploying a client interface using existing
technologies available. Technologies such as SMS, WAP, and VoiceXML can be utilized to make
this possible. As more mobile phones support J2ME, you can even choose to deploy a pure
SOAP client without the need for a middleman.
Strategy approach to Web Services standardization and interoperability in mobile devices
• Integrate with the Internet, i.e. whatever is winning out there
• Drive generic Internet standards in relevant organizations such as W3C, OASIS, Liberty
Alliance and WS‐I
• Try to avoid mobile specific standards at all costs (OMA, 3GPP)
• Work with IT‐vendors and other industry players to introduce optimizations that are
beneficial to all parties including mobile SW‐platform providers
• Drive use of well specified, interoperable specifications to minimize cost for all parties
in the value chain
Standardization challenges
• In general, it seems standardization, while doing an adequate job, is moving slower than the
marketplace (but this is quite normal)
• For any given problem there seem to be at least two competing solutions, or...
• There are multiple versions of the solution, some of which are standardized, some of which
are in the midst of standardization
• Mobile SW‐platforms are today mostly not easily upgradeable, thus you need to:
– Build an internal WS‐architecture which anticipates changes
– At any given time try to pick the “winning horses” in the marketplace to be included in the
platform
• Currently not too many people think about WS‐clients on devices (PC, TV, Digi TV box, mobile
phone, etc.); a lot of the work is server‐to‐server focused and not always consumer oriented
• Standardization is split between several orgs (W3C, OASIS, WS‐I) but fortunately it seems
they all have found their role
Interoperability challenges
• The Web Services architecture, and which specifications make it up, is not yet properly nailed
down
• The REST and WS‐* approaches are both gaining wide acceptance, but neither is properly
standardized yet
• Whilst basic web services building blocks are standardized and quite adequately profiled
(WS‐I Basic Profile 1.x), there is still a lot to do with more advanced Web service specifications
• Many of the advanced Web service specifications, mostly emerging from the WS‐* family, are
very generic by default and need profiling to ensure interoperability, but this is still in the works
• However, there is good adoption on the marketplace, and thus vendors do need to ship Web
Services solutions and keep on upgrading them as the specs mature
• For a Mobile SW‐platform this situation is quite new compared to older Telecom
standardization (standardize first, then implement) and also due to the fact that Over The Air
(OTA) update of Mobile SW is just emerging
• It is vital that emerging specifications contain a well defined, interoperable and tested core
set of features; generality is good but not at the expense of interoperability
• There is no Internet Service platform; Liberty is an effort in that direction, but today there is
just divergence between the players
General Performance worries around mobile devices and Web Services
• Processing requirements:
– Claim: CPUs in mobile devices can’t handle complex XML parsing and XML security
(signature, en/decryption) and in general can’t deal with the processing needs of WS
– Truth: Based on Nokia demo/pilot activities during 2002‐2005, we know that current
smartphone implementations have no problems handling WS messages/features, and the
situation will only improve
• Limited downlink/uplink bandwidth:
– Claim: WS and XML are verbose, thus the downlink/uplink capacity generally available for
mobile devices can’t provide acceptable response times for the applications
– Truth: WS applications typically send/receive information only when needed, thus good
design principles can reduce the overhead significantly and provide acceptable response times
to users even with basic GPRS data rates (<40 kbps)
– The emergence of widely used faster uplinks such as E‐GPRS (~100 kbps) removes this worry,
and rates will only improve (W‐CDMA +300 kbps, WLAN 11 Mbps)
– Introducing compression, such as GZIP (part of ZLIB), can help while waiting for W3C EXI
results
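The compression point can be illustrated with Python's standard gzip module. The XML message below is made up for the example (a list of fifty near-identical order elements), so the ratio it prints says nothing about real traffic; it merely shows that repetitive XML markup compresses well and that the round trip is lossless.

```python
import gzip

def compress_message(xml_text):
    """GZIP-compress an XML message before sending it over a slow link."""
    return gzip.compress(xml_text.encode("utf-8"))

def decompress_message(payload):
    """Restore the original XML text on the receiving side."""
    return gzip.decompress(payload).decode("utf-8")

# Hypothetical, deliberately repetitive message body.
message = "<orders>" + "".join(
    f"<order><id>{i}</id><status>shipped</status></order>" for i in range(50)
) + "</orders>"

packed = compress_message(message)
print(len(message.encode("utf-8")), "bytes before,", len(packed), "bytes after")
```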
SOA for S60 Platform: Service Middleware offering
Standard functionality:
• Connect to WS‐I Basic compliant services
• Send and receive messages over mobile networks and the Internet using Web Services
protocols
• Maintain control over Web Services sessions
• Apply communication security with OASIS WS‐Security
Identity Web Services offering:
• Connect to Liberty Identity Web Services Framework compliant services
• Manage services
Web Services framework for devices
We are working on:
Unit V
Semantic Web
Role Of Meta Data In Web Content
Resource Description Framework
RDF Schema
Architecture Of Semantic Web
Content Management Workflow
Xlang
Wsfl
Securing Web Services
Semantic web
Knowledge Representation in the Web Context
Knowledge Representation
• Objects/Instances/Individuals
– Elements of the domain of discourse
– Equivalent to constants in FOL
• Types/Classes/Concepts
– Sets of objects sharing certain characteristics
– Equivalent to unary predicates in FOL
• Relations/Properties/Roles
– Sets of pairs (tuples) of objects
– Equivalent to binary predicates in FOL
• Such languages are/can be:
– Well understood
– Formally specified
– (Relatively) easy to use
– Amenable to machine processing
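A toy Python model may make the three building blocks above concrete: objects as constants, classes as unary predicates (sets of objects), and relations as binary predicates (sets of pairs). All names in it (alice, Person, authorOf, and so on) are invented for the example and do not come from any real vocabulary.

```python
# Objects: constants naming individuals in the domain of discourse.
objects = {"alice", "bob", "xml_notes"}

# Classes: unary predicates, modelled as the set of objects they hold for.
classes = {
    "Person": {"alice", "bob"},
    "Document": {"xml_notes"},
}

# Relations: binary predicates, modelled as sets of (subject, object) pairs.
relations = {
    "authorOf": {("alice", "xml_notes")},
}

def is_a(obj, cls):
    """Unary predicate test, e.g. Person(alice)."""
    return obj in classes.get(cls, set())

def holds(rel, subj, obj):
    """Binary predicate test, e.g. authorOf(alice, xml_notes)."""
    return (subj, obj) in relations.get(rel, set())

print(is_a("alice", "Person"), holds("authorOf", "bob", "xml_notes"))
```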
Challenges of Web to KR
• Scale
• Distributed
• Dynamic
• Paradoxes
• Incomplete language
– Closed world vs. open world assumptions
Current Web (“Syntactic” Web)
• Untyped resources named by URLs
• Untyped relationships (href with anchor text)
• User oriented – document rendering
• Machines must infer information
The Information in a Web Page
• Markup connotes semantics (bold, colors, font…)
• Humans interpret semantics
• Rendering semantics is not clear or available to machines
Semantic Web
• Resources typed, types defined by URIs
• Relationships typed, types defined by URIs
• Types are structured and are first‐class
• Machines can perform inference
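To show what typed resources and machine inference can mean in practice, here is a small Python sketch that stores statements as (subject, predicate, object) triples identified by URIs and derives new type facts from subclass statements. The rdf:type and rdfs:subClassOf URIs are the standard ones; everything under http://example.org/ is a hypothetical vocabulary, and the loop is a naive fixed-point computation rather than a real reasoner.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"
EX = "http://example.org/"  # hypothetical namespace for this example

triples = {
    (EX + "notes.pdf", RDF_TYPE, EX + "LectureNotes"),
    (EX + "LectureNotes", SUBCLASS_OF, EX + "Document"),
    (EX + "Document", SUBCLASS_OF, EX + "Resource"),
}

def infer_types(statements):
    """Add the rdf:type facts implied by rdfs:subClassOf until none are new."""
    facts = set(statements)
    changed = True
    while changed:
        changed = False
        for s, p, o in list(facts):
            if p != RDF_TYPE:
                continue
            for cls, q, parent in list(facts):
                is_parent = q == SUBCLASS_OF and cls == o
                if is_parent and (s, RDF_TYPE, parent) not in facts:
                    facts.add((s, RDF_TYPE, parent))
                    changed = True
    return facts

for fact in sorted(infer_types(triples) - triples):
    print(fact)
```

Because both the resource types and the relationship types are named by URIs, the program can draw the conclusion without understanding what the words mean.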
Some comments on the reality of the semantic web
• Lots of the hype seems to imply that the “whole web” will become
a semantic web
• But too much implies that this will happen through “better
metadata”
– By whom!
• Keyword “whole web” search engines keep getting remarkably
better and will continue to dominate
• But…
– High recall, low precision
– Results sensitive to vocabulary
– Result granularity is single web page
But…
• In constrained domains (b2b, enterprise search, scholarship) better
information management and knowledge representation make sense
• Notions like ontologies are very useful and important
• There is lots of room for automated learning techniques to be
applied to the problem
• Some of the tools are very useful right now and are being used at large
scale:
– Network analysis
Role of Meta data in web content
Definitions of Metadata
Metadata is a recent coinage though not a recent concept. In today's jargon,
metadata is data about data: information that communicates the meaning of
other information. As nearly as I can tell, the term has come to prominence in our
context only with the Web, dating from the early 90's, where it surfaced in the
face of a newly recognized need: resource discovery on the Web. (See below in
the Notes section, METADATA, the trademark)
Metadata and Web Content Management
Role of Metadata
What is the role of metadata in web content management?
The answer varies from office to office, agency to agency.
The role of metadata varies from “none at all” to being an integral tool in
managing web content.
For most of us, metadata isn’t new, just reborn.
Many agencies have been using “description” and “keywords” for years
with little guidance and no consistency.
Agencies that do use metadata consistently build it into the content
creation process.
The rest of us are trying to find our way through the jungle of terms,
standards, and concepts.
Web Content Management
• Web Content Management also varies by agency
– Some content managers are also content owners
– Some are editors
– Others simply manage the content design/flow, but rely on content
creators/editors
– There are varying levels of understanding of the concept of metadata
• However, there are some commonalities
– Our websites are huge
HUD’s, for example, is something over 200,000 documents
– Our websites change frequently
Last Friday alone, HUD made over 700 changes.
– Our websites have become increasingly difficult to maintain
• In addition to having too much content, there are also too many websites:
more than 24,000 government websites at last count.
• The total number of web documents in the Federal space isn’t known:
easily millions.
• There is a proliferation of portals.
– FirstGov lists more than 90 portals.
– Instead of reducing the clutter, many portals end up duplicating
navigational structures to content that already exists.
• And, there are a host of regulations and standards to follow.
– The Web Content Managers Working Group found over 16 laws,
regulations, or policies directly affecting federal public websites.
Metadata Standards
• While no standard has been issued, consensus has centered on the use of
the Dublin Core metadata standard.
– Already existing international standard
– Relatively easy to implement
– Already used by Great Britain, Canada, Australia and several federal
agencies in the U.S.
• Based on the consensus to use Dublin Core, the Web Advisory Council is
now advising agencies to use the following elements:
– dc.title
– dc.description
– dc.creator
– dc.date.created
– dc.date.reviewed
– dc.language
• While not yet a “recommendation,” there is a growing discussion on the
need for standardized vocabularies for:
– dc.audience
– dc.subject
• Some limited work is underway to create a controlled vocabulary for
audience.
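One common way to carry the elements listed above is as HTML meta tags in a page's head section. The Python sketch below renders whichever of the six recommended elements are supplied; the element names come from the list above, while the function name and the sample values are made up for illustration.

```python
from html import escape

RECOMMENDED = [
    "dc.title", "dc.description", "dc.creator",
    "dc.date.created", "dc.date.reviewed", "dc.language",
]

def dublin_core_meta(values):
    """Render the recommended elements as <meta> tags, skipping missing ones."""
    lines = []
    for name in RECOMMENDED:
        if name in values:
            content = escape(values[name], quote=True)
            lines.append(f'<meta name="{name}" content="{content}">')
    return "\n".join(lines)

# Hypothetical page description.
tags = dublin_core_meta({
    "dc.title": 'Fair Housing "FAQ"',
    "dc.creator": "Office of Public Affairs",
    "dc.date.reviewed": "2004-06-30",
    "dc.language": "en",
})
print(tags)
```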
The Role of Metadata
• Web managers throughout the federal government struggle with some of
the same issues:
– It is often difficult to find out who “owns” content
– Ensuring content remains current and accurate is a monumental
undertaking
– Competition for space on the front page makes it difficult to get
people to the content they need.
• The role of metadata in web management comes back to those “pros”:
– improve search relevancy,
– provide an audit trail,
– support website maintenance and administration, and
– allow information to be tracked and assembled government‐wide.
Improve Search Relevancy
• While commercial searches like Google and Yahoo may not use metadata
YET:
– We could use metadata to improve the searches on our own sites
– FirstGov, while not currently set up to take advantage of metadata,
could be in the future
– “If you build it, they will come…”
Provide an Audit Trail
• Several of the elements were selected for their ability to create an audit
trail. For example:
– Creator identifies who/which office is responsible for the content
– Date Reviewed identifies old, possibly obsolete, documents
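A sketch of how such an audit trail might be used: given per-page metadata records, the Python function below lists pages whose review date is older than a chosen threshold, together with the office responsible. The record layout, the one-year threshold, and the sample pages are assumptions for the example; only the element names dc.creator and dc.date.reviewed come from the list of recommended elements.

```python
from datetime import date

def stale_documents(documents, today, max_age_days=365):
    """Return (url, creator) pairs for pages not reviewed within max_age_days."""
    stale = []
    for doc in documents:
        reviewed = date.fromisoformat(doc["dc.date.reviewed"])
        if (today - reviewed).days > max_age_days:
            stale.append((doc["url"], doc["dc.creator"]))
    return stale

# Hypothetical metadata harvested from two pages.
pages = [
    {"url": "/a.html", "dc.creator": "Office A", "dc.date.reviewed": "2003-01-15"},
    {"url": "/b.html", "dc.creator": "Office B", "dc.date.reviewed": "2004-11-01"},
]
print(stale_documents(pages, today=date(2005, 1, 1)))
```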
Support Website Maintenance
• Metadata gives us a method for:
– Tracking down who is responsible for content
– Ensuring content is current/accurate
– Archiving appropriate content from one administration to the next
– Identifying content by language
Aggregate Content
• With the use of Audience and Subject, we could:
– Discover redundant/duplicative content
– Find content that needs to be “brought up” to higher levels of the
website
– Aggregate content from within our organizations and across
organizations
• This could lead to reducing the need for portals.
What is being done now?
• Metadata has been the subject of a session at each of the last three web
content managers conferences.
– Interest is high, but there are still doubts
• The Metadata task group of the Web Advisory Council is:
– Creating guidance/tools for implementing metadata
– Beginning work on a controlled vocabulary for Audience
• Metadata has to be easily implemented
– Simple tools need to be developed
– Tools should be integrated into the content creation process
– Education needs to be developed on the benefits gained by the consistent use of
metadata
We find the first oblique reference to metadata in the "HyperText Markup
Language Specification Version 2.0," which discusses "meta‐information" in the
header section of an HTML document:
Meta‐information has two main functions:
• to provide a means to discover that the data set exists and how it might be
obtained or accessed; and
• to document the content, quality, and features of a data set, indicating its
fitness for use.
The first of these bullets targets resource discovery; the second targets resource
description. The first mention I can locate for the term "metadata" used in this
sense occurs in the Geospatial community and its efforts to define resource
description systems for geospatial data: "Content Standards for Digital Geospatial
Metadata Federal Geographic Data Committee," dated June 8, 1994.
At the risk of adding to the confusion surrounding this term, I would like to
expand the concept of metadata to include a second type: data labeling. Indeed,
this type of metadata can be viewed as primary, as more basic than resource
description. I would like to elaborate briefly on both forms of metadata.
Metadata as tags
The most common form of this type of metadata arises from the use of tags to
characterize the content of fields. This kind of metadata has a great variety of
uses. It is found in all information forms: survey instruments, purchase forms of
all sorts, and yes, tax forms. What all of these forms have in common is that they
contain labeled fields: a text definition followed by a blank space. The different
fields are meant to be filled in and later processed. Labeled fields of this sort are
also found in all commercial record keeping, most particularly in the world of
electronic data processing, where such standards as EDI have been promulgated
to allow information exchange among cooperating commercial firms.
Our focus is exclusively on fields defined by the tagging that occurs in markup
languages. SGML was the first of a series of standards that were initiated in the
late 80's and have recently culminated in XML. The tags in these systems occur in
pairs; each pair defines and delimits a field, with the contents of the field
occurring between the two tags. All markup languages (SGML, HTML, XML) make
use of this kind of metadata. A simple example:
<title> Any title </title>
<publisher> Amazon.com </publisher>
<price> $12.50 </price>
Each field (or element in the terminology of markup languages) has a start‐tag
(<...>) and an end‐tag (</...>). The character string within the brackets identifies
the field; the area between the start‐tag and the end‐tag contains a character
string that is the value of the field. In the above example, the pairs of bracketed
names: <title>, </title>; <publisher>, </publisher>; and <price>, </price> are the
metadata; these metadata convey information about the character strings within
each of the pairs. The data thus described are 'Any title', 'Amazon.com', '$12.50'.
This kind of metadata has the advantages of simplicity, machine and human
readability, and great expressive power, as HTML has demonstrated in the Web
environment. Until recently, HTML tagging has been used to "mark up" all Web
content, promiscuously conveying information about formatting, linkages and
descriptors.
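The split between tag names (metadata) and field contents (data) in the example above can be demonstrated with Python's standard XML parser. The three elements are copied from the example; because they have no common parent, the sketch wraps them in a root element named record, which is an addition made only so that the fragment parses as a well-formed document.

```python
import xml.etree.ElementTree as ET

fragment = """
<title> Any title </title>
<publisher> Amazon.com </publisher>
<price> $12.50 </price>
"""

# Wrap the fragment in a hypothetical <record> root so it is well-formed XML.
record = ET.fromstring("<record>" + fragment + "</record>")

# Tag names are the metadata; the text between the tags is the data.
fields = {child.tag: child.text.strip() for child in record}

for name, value in fields.items():
    print(f"{name!r} labels {value!r}")
```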
Metadata as descriptors
But here's the kicker: In our example above, the strings occurring between each
start‐tag and end‐tag are also data about data: they are also metadata. In the
example, they are about a publication and are therefore bibliographic in nature.
When discussed in a Web context, the term "metadata" can refer to either type:
the tagging system that defines a set of fields and its contents, or the contents of
certain fields that act as descriptors for other resources. This duality can create
confusion and it doesn't help that the same string of characters can act as
metadata on one level, and data on another, depending on the perspective being
used.
Metadata on the Web
In tackling the problem of providing descriptive surrogates for library‐related Web
resources, we have to be concerned about both kinds of metadata for the
following reason: the tagging systems for Web pages, and the conventions and
standards for processing them, create the context within which library practices
reside; the infrastructure of the Web is driven by them and creates the
opportunity for us to build within it a means to achieve our own ends. Since it is
the crucial underpinning for our own efforts, before we focus on resource
description, we need to discuss briefly the general use of metadata tagging in the
Web environment. Such tagging has had a wide variety of applications on the
Web independent of libraries. Each application has had its metadata standard
proposed, debated, implemented and sometimes abandoned. We will consider
some as preparation for our library applications.
General Metadata Systems
By general metadata system, I mean a methodology for fully characterizing all of
the data for an application. The two primary examples of such general systems
are:
• "The Meta Content Framework Using XML," a proposal submitted to the
World Wide Web Consortium (W3C) in June 1997, Netscape's major
contribution to the metadata initiative.
• The "Channel Definition Format," submitted in March 1997, is Microsoft's
major contribution to the metadata initiative. It "extends XML and Web
Collection work that the W3C" has worked on. CDF is the "industry's first"
channel framework for push technology on the Web.
It will not benefit us here to do more than mention general metadata systems
other than to state that their primary aim is to enable the precise mark up of data
streams for system interoperability.
Resource description
Problems of resource description have pervaded the Web since its beginnings.
Not surprisingly, however, metadata for resource description have not always
been provided explicitly in Web pages. The "Head" section of the HTML Standard
was introduced in version 2.0 (early 1994) when the Web was 2 years old. It
included the "Meta" element for the first time with such attributes as "title".
Metadata in this form proved very popular, with its use growing very rapidly. By
1998, 70% of public Web sites made use of them, with an average of 2.75 meta
fields for each site that used them. ("Web Characterization Project: An Analysis of
Metadata Usage on the Web," Edward T. O'Neill, et al)
This form of resource description, our primary topic here, engages virtually all
Web users, and ranges from search engines and directories of all types to the
identification and discovery of special interest communities.
PICS and other content controllers
The Platform for Internet Content Selection (PICS), an activity related to resource
description, both historically and practically, is based on the desire to filter or
restrict access to materials of certain types. The most obvious is pornography and
the filter or restriction is with respect to juvenile access; but there are many
cultures that wish to restrict access to other materials, mostly of a political
nature. How to do this within a Web context is the primary question, and the
answer is through characterizing the content of resources from this vantage. The
O'Neill study noted above does not find much use of PICS tagging. See
(www.w3.org/TR/REC‐DSig‐label/#DSig_1_0_Overview) or (www.w3.org/PICS/)
for further information on PICS.
Commerce ‐ BizTalk and SOAP
From a June 1999 Microsoft press release: "the BizTalk Framework is an open
specification for XML‐based data routing and exchange. The BizTalk Framework
makes it easy to exchange information between software applications and
conduct business with trading partners and customers over the Internet." SOAP,
the "Simple Object Access Protocol" developed by Microsoft, "is a lightweight
protocol for exchange of information in a decentralized, distributed environment.
It is an XML based protocol that consists of three parts: an envelope that defines a
framework for describing what is in a message and how to process it, a set of
encoding rules for expressing instances of application‐defined datatypes, and a
convention for representing remote procedure calls and responses." (Taken from
the document submitted to the W3C recommending the formation of a working
group for Web protocols (Simple Object Access Protocol (SOAP) 1.1, W3C Note 08
May 2000) See (www.microsoft.com/biztalk/) for details.)
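To make the envelope part of that description concrete, the following Python sketch builds a minimal SOAP 1.1 request with the standard library. The envelope and encoding namespace URIs are those defined by SOAP 1.1; the service namespace urn:example:quotes, the method name GetLastTradePrice, and the symbol parameter are placeholders chosen for illustration and do not refer to a real service.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
SOAP_ENC = "http://schemas.xmlsoap.org/soap/encoding/"
SERVICE_NS = "urn:example:quotes"  # hypothetical service namespace

def build_request(symbol):
    """Build a SOAP 1.1 envelope carrying one remote procedure call."""
    ET.register_namespace("soap", SOAP_ENV)
    # Part 1: the envelope frames the whole message.
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    # Part 2: encodingStyle names the encoding rules used for the data.
    envelope.set(f"{{{SOAP_ENV}}}encodingStyle", SOAP_ENC)
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    # Part 3: the RPC convention, a method element with parameter children.
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}GetLastTradePrice")
    ET.SubElement(call, "symbol").text = symbol
    return ET.tostring(envelope, encoding="unicode")

print(build_request("DIS"))
```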
Depending on whom you talk to, BizTalk and SOAP are either an alternative to the
Resource Description Framework (discussed in the next section) or a complement
to it. In either case, the existence of both, with neither giving any evidence they
are aware of the other, is indicative of the diffuse effort that reigns in the Web
arena over how to solve the need for interoperability and data exchange among
distributed applications that are the norm on the Web.
Rights Management
And one such distributed application is the management of intellectual property
rights on the Web. The need is to protect intellectual property rights on the Web
and enable commercial publishers to control effectively the electronic transfer of
such rights. The International DOI Foundation, in collaboration with commercial
publishers, is responsible for advancing the definition and uses of the Digital
Object Identifier (DOI®), and is among the leaders in the endeavor to manage
property rights. The DOI is "an identification system for intellectual property in
the digital environment." Its principal objective is "to develop automated means
of processing routine transactions such as document retrieval, clearinghouse
payments, and licensing." (http://www.doi.org/index.html) Metadata arises in
this context as a means to identify, describe, and allow the tracking of all manner
of intellectual property on the Web, to protect it from misuse, and to enable its
creators to be properly remunerated.
Although part of the objective of the DOI Foundation is to provide a basic resource
description to accompany the DOI identifier, much as the elements of the Dublin
Core do, it is noteworthy that no mention of the Dublin Core occurs on their
site.
RDF: the Resource Description Framework
Before concluding this section on general issues dealing with metadata on the
Web, and before turning to the metadata of resource description, I would like to
discuss briefly the relevance of the Resource Description Framework,
henceforward referred to as RDF. The best overview of what RDF is and what it is
to be used for remains Eric Miller's "An Introduction to the Resource Description
Framework" appearing in D‐Lib Magazine
(http://www.dlib.org/dlib/may98/miller/05miller.html). From the abstract, "The
Resource Description Framework (RDF) is an infrastructure that enables the
encoding, exchange and reuse of structured metadata."
From the W3C RDF FAQ:
RDF emphasizes facilities to enable automated processing of Web resources. RDF
metadata can be used in a variety of application areas; for example: in resource
discovery to provide better search engine capabilities; in cataloging for describing
the content and content relationships available at a particular Web site, page, or
digital library; by intelligent software agents to facilitate knowledge sharing and
exchange; in content rating; in describing collections of pages that represent a
single logical "document"; for describing intellectual property rights of Web
pages, and in many others. RDF with digital signatures will be key to building the
"Web of Trust" for electronic commerce, collaboration, and other applications.
(http://www.w3.org/RDF/FAQ)
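To make "structured metadata" a little more concrete, here is a minimal sketch, not drawn from the sources quoted above, of how one RDF description using Dublin Core elements might be generated. It assumes Python's standard xml.etree.ElementTree module. The helper name describe_resource is hypothetical, and the title, creator, and URL are simply those of the Miller article cited earlier.

```python
import xml.etree.ElementTree as ET

# Namespace URIs for RDF and for the Dublin Core element set.
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Ask ElementTree to use the conventional prefixes when serializing.
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("dc", DC_NS)


def describe_resource(uri, properties):
    """Build an rdf:RDF element holding one rdf:Description for `uri`.

    `properties` maps Dublin Core element names (e.g. "title") to values.
    Hypothetical helper, for illustration only.
    """
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    description = ET.SubElement(
        root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": uri}
    )
    for name, value in properties.items():
        child = ET.SubElement(description, f"{{{DC_NS}}}{name}")
        child.text = value
    return root


record = describe_resource(
    "http://www.dlib.org/dlib/may98/miller/05miller.html",
    {
        "title": "An Introduction to the Resource Description Framework",
        "creator": "Eric Miller",
    },
)
rdf_xml = ET.tostring(record, encoding="unicode")
print(rdf_xml)
```

The point of the sketch is only that the description is a set of property and value pairs attached to a resource identified by its URI, which software can read back without human help.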
It is not clear as yet what relevance RDF has to the library world; more broadly,
and perhaps as the cause of that uncertainty, it is not clear as yet what relevance RDF will have in the
Web. The attitude of Web practitioners toward RDF varies greatly. At one end of
this spectrum is the W3C community, which maintains that RDF will provide the
mechanisms to solve many of the interoperability problems in the Web. At the
other end is Microsoft, which, so far at least, has exhibited a deafening
indifference to RDF. The latter attitude is manifested by a total avoidance of its
use within Microsoft's product line, and is an almost reflexive corporate reaction
to any standard not created by Microsoft itself. If the Microsoft reaction is
indicative of the low rate of adoption generally, then RDF is in trouble.
What does the success or failure of RDF matter to the library community? From
the perspective of the library world acting within the boundaries of its own
community, successful resource description standards and methods are possible
without an RDF. Moreover, as with many other Web developments, RDF will
succeed or fail based on the practices of the larger world outside libraries. As is so
often the case with emerging standards, watchful waiting is probably the best
approach.
The Future of XML
The future of RDF is tied closely to the emergence of XML. What is the future of
XML? First and foremost, it appears clear as of this writing that HTML as the
markup language of choice for the Web will eventually give way to XML. XHTML, a
recent variant of HTML, was designed to provide a bridge between the two. I have
heard numerous optimistic predictions about the pace of this evolution, all of
them wrong so far: installed systems are always slower to give way than one
would wish. Two milestones will be worth watching for: first, when half of all new Web
pages being written are in XML; and second, when half of all the pages on the
Web are in XML. Neither will occur any time soon, certainly not in one year, very
probably not in two.
Below, I discuss the impact of this change on library issues. The primary issue,
however, remains that we are at the mercy of the general Web community in
these areas. Progress will occur at a pace dictated by the needs of large movers
on the Web, influenced to some degree by the general problem of resource
discovery experienced by all Web users, and also by all of those other applications
awaiting an effective solution. If this brief consideration of metadata uses on the
Web accomplishes anything, I hope it communicates the diversity of communities
engaged in providing standards and also the lack of cohesive efforts and results
that have been achieved thus far.
Metadata Standards for Resource Description
Now that we have gotten through the preliminaries, we can turn to our major
topic: metadata used for resource description on the Web. It may help clarify
Web efforts to touch first on standards that fall under the general topic of
resource discovery but were not designed specifically for Web resources. They
include such standards as those developed by the Consortium for the Computer
Interchange of Museum Information (CIMI); those standards whose development
is funded or directed by the Federal Geographic Data Committee, mentioned
above in relation to the term "metadata"; and the Government Information
Locator Service (GILS), now used to provide access to government documents.
These three standards were developed outside the library community. Examples
of metadata standards developed within the library community would include the
Text Encoding Initiative (TEI) and the Encoded Archival Description (EAD), which
were created using SGML and pre‐date the Web, but which have since been
converted to XML for use within the Web. Links to all of these are provided in the
"Resource Section" below. None of these can be said to have arisen because of
the Web, nor was their initial focus on Web resources. Rather, they use metadata
to provide finding tools for patrons in their respective applications. They are more
or less parallel to systems of MARC bibliographic records: they are systems
constructed to provide descriptions for various classes of objects in the areas of
application, ranging from the contents of museums to archived papers. As with
almost everything in today's world, the Web is increasingly important as a
mechanism for meeting the needs of users by connecting them to resources,
whether those resources are available for use on the Web, or only described
through the Web and require further action in the non‐web world. Items
purchasable through the Web fall into the latter category.
The major Web mechanism for connecting user to resource is the search or
directory service. Both make use of resource descriptions either to allow the user
to perform a search or allow browsing. Typical and relevant is an OPAC search to
locate a book, or a similar search on Amazon.com. In neither case is the book
itself available on the Web, at least not yet.
To the extent that the standards referred to above deal with objects not directly
usable through the Web, they fall outside my concern here because I would like
to focus exclusively on Web resources.
One final point: This distinction between Web resources and objects outside the
Web may appear somewhat arbitrary. When metadata systems are deployed, the
two often overlap. CIMI, for example, has been and is a very
active participant in the Dublin Core community, which is responsible for creating
the Dublin Core, the preeminent resource description standard in the Web
environment. CIMI participates in the Dublin Core at least in part because so
much of its resource description activity is manifested in some form on the Web.
Increasingly, it is possible to link to images of museum objects on the Web; these
images are Web resources par excellence, and thus very much a target of the
Dublin Core community. The same can be said for archival information covered by
the EAD community: one day all of these materials may be accessible on the Web.
The needs of these various communities for resource description capabilities
create a challenge for standards bodies seeking to create tools that can
accommodate them. In their complex combinations, they raise questions about
the nature of surrogate records. The Web is so universal, so all‐encompassing,
that we look toward a time when everything will require its Web surrogate to find
its user. This aim implies a need for surrogate languages with great expressivity.
The ambition of standards such as XML, RDF and the Dublin Core is to achieve this
level of expressivity.
We can now turn to the Dublin Core and assess its attempt to accomplish the
lofty aims set forth here. And we will encounter a regrettable limitation on the
human condition: when we try for too much, we often deliver too little.
The Dublin Core Metadata Standard
The standard central to our purposes is the Dublin Core, which arose within the
diverse standards creation activities of the mid‐1990s. From the outset the Dublin
Core had as its focus resource discovery on the Web. As stated in a 1998 IETF
document, "The Dublin Core Metadata Workshop Series began in 1995 with an
invitational workshop which brought together librarians, digital library
researchers, content experts, and text‐markup experts to promote better
discovery standards for electronic resources." ([RFC2413] Dublin Core Metadata
for Resource Discovery. Internet RFC 2413. (http://www.ietf.org/rfc/rfc2413.txt))
"Discovery standards for electronic resources" ‐ as noted earlier, I have used the
phrase "resource description" instead of "resource discovery" because description
is more general, and in my view more accurately characterizes what is required.
One may claim that an effort is restricted to resource description, but if one does
not deal with user needs effectively, no justification will satisfy. Resource
discovery is impossible without resource description; adequate resource
description assures effective discovery. The difference is as basic as the difference
between a keyword search and an adequate display of results. The former allows
discovery; the latter, based on resource description, allows effective selection
from an extended list. I will elaborate this more fully below when we discuss
alternatives to cataloging.
In library terms, the Dublin Core is a simple system for cataloging Web resources,
no more, no less. And it should be judged from that perspective.
Issues with the Dublin Core
Many issues surround the primary question of the effectiveness of the Dublin
Core, and I would like to list and discuss them briefly.
Degree of completeness
Unfinished: this is the most serious problem of the Dublin Core to date. The first
official version of Simple Dublin Core became available in 1997, after two years of
discussion and debate. The first published version of a qualified Dublin Core was
made available in July of this year. It is obviously incomplete, with no qualifiers
offered for the Creator, Contributor, or Publisher elements. As yet no one has
been able to provide documentation, extensibility rules, or implementation
guidelines for a qualified Dublin Core. In the intervening years this has had three
consequences: the development of various community versions of qualified
Dublin Core; endless debate, in every community attempting to apply the Dublin
Core to a collection, over what the various elements mean and how they are to be
used; and very slow adoption of the Dublin Core as a standard for resource
description for the Web. (Again, see O'Neill's report cited above for statistics.)
Institutional support
Lack of institutional support is not surprising given the degree of incompleteness
of the Dublin Core. CORC (Cooperative Online Resource Catalog), a new service
from OCLC introduced in July of this year, which incorporates the newly published
qualified Dublin Core, is a strong step in the right direction, but much more is
needed, including a standards body and procedures for evolving and changing the
Dublin Core.
Documentation
Documentation, of course, must follow on a published standard and can't precede
it. After the recent release of a qualified Dublin Core, it may be possible now to
provide at least some usable documentation.
Implementation guidelines
As yet there is no direction on how to implement the qualified Dublin Core in
HTML or XML, though this may change at any time.
Extensibility rules
There is as yet no precise direction on what counts as an allowable extension to
simple Dublin Core, or what syntax extensions must conform to. The absence of a
clear definition of the syntax of qualifiers continues to make implementation
guidelines difficult if not impossible to achieve. Sufficient for this purpose may be
the Dublin Core Metadata Initiative (DCMI) publication prepared by the DCMI
Usage Committee, which "describes the principles governing Dublin Core
qualifiers, the two categories of qualifiers, and lists instances of qualifiers
approved by the Dublin Core Usage Committee." ("Dublin Core Qualifiers," July
2000) (http://purl.org/dc/documents/rec/dcmes‐qualifiers‐20000711.htm)
In that document, two kinds of modifier for elements are recognized: Element
refinement and Encoding scheme. The first is characterized by such modifiers as
"created" for the Date element; the second by "LCSH" for the Subject element,
and "URI" for the Identifier element. For explanations and further examples,
please refer to the official publication cited above where all qualifiers defined for
the current version are presented in a table. I have gone into this level of detail
here concerning acceptable qualifiers for Dublin Core because I explore a problem
with respect to them in the next section.
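As a rough illustration of the two kinds of qualifier, the following sketch models a single Dublin Core statement as an element, a value, an optional element refinement, and an optional encoding scheme. It is an assumption made for this discussion, not a syntax defined by the DCMI document: the class name QualifiedElement, the dotted label convention, and the sample values are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class QualifiedElement:
    """One Dublin Core statement with its two optional kinds of qualifier."""

    element: str                      # e.g. "Date", "Subject", "Identifier"
    value: str
    refinement: Optional[str] = None  # element refinement, e.g. "created"
    scheme: Optional[str] = None      # encoding scheme, e.g. "LCSH", "URI"

    def label(self) -> str:
        """Return a dotted label such as "Date.created" (hypothetical convention)."""
        if self.refinement:
            return f"{self.element}.{self.refinement}"
        return self.element


# Sample values are invented for illustration.
statements = [
    QualifiedElement("Date", "2000-07-11", refinement="created"),
    QualifiedElement("Subject", "Metadata", scheme="LCSH"),
    QualifiedElement("Identifier", "http://example.org/resource", scheme="URI"),
]

for statement in statements:
    scheme = f" [{statement.scheme}]" if statement.scheme else ""
    print(f"{statement.label()}{scheme}: {statement.value}")
```

Note that the refinement narrows the meaning of the element, while the scheme says how the value is to be read; the two slots are independent, which is why the model keeps them as separate optional fields.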
Other Issues
The above list of Dublin Core issues may be transitory. Indeed, it is possible that
some of them will be removed or at least alleviated by the July 2000 DCMI
publication cited above. What if they were all fixed? Would our need for a
resource discovery standard for the Web be satisfied? There are two general
areas of concern that I can see. First, if we generously assume that the Dublin
Core in its current form is approximately finished, and that its major focus is on
"document‐like objects", how close is it to an acceptable standard? Will tweaking
over time and through experience in its use gradually provide us with a standard
we can live with? Or are there major fissures that must be bridged? Second, does
the architecture of the Web require a standard that goes beyond an object‐
attribute model for resource discovery? I would like to discuss each of these
briefly.
Difficulties with the current form of the Dublin Core
The current structure of the Dublin Core limits its usefulness in critical ways. As
outlined above, Qualified Dublin Core currently allows an element to have two
modifiers: the first is considered to be a refinement of the element; and the
second, the encoding scheme, is considered to modify the value of an element.
The distinction between these two types of modifiers, and others that might be
used, has been the source of much discourse within the Dublin Core community,
and one cause of the delay in completing a draft of a qualified Dublin Core. My
problem is fundamental and practical and can be expressed by citing, as
examples, what I consider to be serious weaknesses in two Dublin Core elements:
the Creator and the Relation elements.
Creator element (and Contributor and Publisher as well)
What is needed for a Creator element (or what I would like to see it have!) is a
structure that provides for the name of the creator as its value, a modifier that
states whether the name is corporate, personal, or geographic, and a further
modifier that is a URI pointing to an authority record for the name. (All modifiers,
like all elements in the Dublin Core, are optional.) The capability of attaching a
URI to a Creator element would not only obviate the need to include
supplemental Creator information such as an email address (which many have
recommended, and which I consider to be highly undesirable), but it would also
allow, and thus encourage, a far more effective means of authority control in the
Web environment. The fundamental Web mechanism is the link; a Creator field
should link directly to the authority record. What could be more natural,
desirable, powerful? My understanding is that a group is investigating how to
handle authority linkages with the Dublin Core; I hope this solution is still a
possibility.
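The Creator structure wished for above can be sketched as a small data type: a name as the value, an optional modifier for the kind of name, and an optional URI pointing to an authority record. This is only one possible reading of the proposal. The class and field names, and the example authority URI, are hypothetical and do not correspond to any published Dublin Core syntax.

```python
from dataclasses import dataclass
from typing import Optional

# The three kinds of name mentioned in the text.
NAME_TYPES = {"personal", "corporate", "geographic"}


@dataclass(frozen=True)
class Creator:
    """A Creator value with two optional modifiers (hypothetical model)."""

    name: str
    name_type: Optional[str] = None      # "personal", "corporate" or "geographic"
    authority_uri: Optional[str] = None  # link to an authority record

    def __post_init__(self):
        # Both modifiers are optional, but a name type must be a known one.
        if self.name_type is not None and self.name_type not in NAME_TYPES:
            raise ValueError(f"unknown name type: {self.name_type!r}")


# The authority URI below is a placeholder, not a real authority file.
creator = Creator(
    name="Miller, Eric",
    name_type="personal",
    authority_uri="http://example.org/authority/miller-eric",
)
print(creator)
```

With the link in place, supplemental details such as an email address belong in the authority record rather than in the Creator element itself.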
Relation element
The Relation element poses a similar problem arising from the same structural
cause: more modifiers are required to give the Relation element what it needs for
effective use. A Relation element contains information about a "related item".
Three pieces of information are required for this element to be a useful Web
construct: the name of the relation ("Is part of", "Is version of", etc.), the name of
the item (in the simplest case, a title), and, when available, a URI to get to the
item.
Under the current structure, we can provide either a name or a URI, but not both.
There is a solution to both of these problems and one in accord with the essence
of the Web: define as part of any Dublin Core element a pointer element for
"additional information."
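A minimal sketch of what such a pointer might look like for the Relation element follows. It serializes the three pieces of information (name of the relation, name of the item, and an optional URI) into one XML element using Python's standard xml.etree.ElementTree module. The element names relation, name, item, and pointer are assumptions made for this example, as are the sample values; nothing here is prescribed by the Dublin Core.

```python
import xml.etree.ElementTree as ET


def relation_element(relation, title, uri=None):
    """Return a <relation> element with the relation name, item title and URI.

    Element names are hypothetical; no official Dublin Core syntax is implied.
    """
    elem = ET.Element("relation")
    ET.SubElement(elem, "name").text = relation
    ET.SubElement(elem, "item").text = title
    if uri is not None:
        # The "additional information" pointer: present only when available.
        ET.SubElement(elem, "pointer").text = uri
    return elem


rel = relation_element("Is part of", "D-Lib Magazine", "http://www.dlib.org/")
relation_xml = ET.tostring(rel, encoding="unicode")
print(relation_xml)
```

Because the pointer is a separate, optional part, a record can carry both the human-readable name and the link, instead of having to choose between them.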
Difficulties with the Object‐attribute model
Web Resources: the medium is the message
Marshall McLuhan's famous dictum, "the medium is the message", recommends
caution in how we understand the workings of a new medium. Our new medium
is the Web and what McLuhan meant, I suppose, and what has application here, is
that the characteristics of the medium often have greater impact or influence
than the actual content. We are moving from a print culture to an online culture.
In the present context, the characteristics that are most at issue involve the
change from "collections" and "objects" to ... pages and pointers? Resources? To
what? And why do we care?
We care for a number of important reasons. It can be argued that AACR2
cataloging is, by its very nature, tied to physical objects, and when we move into a
world without physical objects, the target of the cataloging effort becomes fuzzy
or without boundaries. This lack of definition may create insurmountable
obstacles to the effective application of cataloging principles and practice. I
subscribe to this view without understanding it fully, and I will attempt in what
follows to explain why.
Objects vs. resources vs. whatever
Back in 1992, when we undertook to examine "access to Internet resources" (a
project reported in "Assessing Information on the Internet: Toward Providing
Library Services for Computer‐Mediated Communication," Internet Research,
3(1), Spring 1993, 54‐69), we played a simple trick on ourselves to sidestep the issue I
want to discuss here. The trick was tactical and was necessary at the time for us
to make progress: we restricted our investigation to "document‐like objects" on
the Internet. We chose this route to make progress because our first meetings
had become bogged down in discussions about what sorts of things were on the
Internet, how they differed from documents, and what the implications for
cataloging were. After a few rounds of profitless discussion and no progress, by
fiat we restricted our focus.
What is the essence of the problem? I believe it is in the notion of object‐hood
and how that notion does not translate very well to the Web. Consider first one of
the basic principles of Anglo‐American cataloging: the item in hand. Much
depends on this concept including a well‐defined boundary for the cataloger in
the cataloging process. Of course, even in our workaday world where the
cataloging target is a discrete physical package, there are severe problems that
must be overcome. Many of these arise because of the differences between the
class of objects related to what is referred to as the work and the classes of
objects in the work's various manifestations. Questions concerning differences
between one class of manifestations and another are legitimate and deserve the
attention they receive; how they are resolved determines, among other things,
when a new record is required for an item in hand, and when an existing record
will suffice. Though important, discussions of these issues have often been
unsatisfying. It may be that the problems they pose are fundamentally
intractable, that cataloging offers a means for creating round holes into which
through various compromises we force a collection of square pegs.
In the world of physical objects, part of the problem certainly is the
oversimplification encouraged by the illusion that the ground is solid beneath our
objects. One example, long a favorite with me, has to suffice. A trivial pursuit
question:
Category: cataloging. What is the smallest difference between two books that will
lead to the creation of two different bibliographic records?
In more general terms, how big does a difference have to be between two objects
to justify the creation of a second bibliographic record? We are touching on the
question for which the Dublin Core "1:1" rule offers the answer. And the answer
may be unwise, wrong‐headed or otherwise misguided, but it assumes object‐
hood: one object generates one record.
The problem I want to address is the following: is object‐hood an effective
metaphor for successful resource description in the Web? Please remember that
we are not dealing with absolutes, either all or none. In the print world, object‐
hood has its limitations: the concept of serial was invented to deal with one of
them and the discussion above exposed a more subtle definitional problem in
dealing with monographs. On a scale of 1 to 10, we could say that for
monographs, item‐in‐hand object‐hood is 9.8 successful. What degree of success
are we likely to achieve using object‐hood as the basis of cataloging on the Web?
The "1:1" rule assumes objects as a given. Its primary purpose is to deal with
problems arising when more than one manifestation of the same work exists.
Simple examples will suffice: differences in format, say PDF and RTF; or different
representations of some object, say image or HTML. This oversimplifies but does
no harm here, because the very notion of recognizable objects is undermined in
the Web.
From the perspective of managing those Web resources that are of interest to the
library community, the question becomes: how many conform comfortably to the
notion of an object; conversely, how often will an assumed object‐hood get us
into trouble? Is the use of an object as the underlying metaphor a useful fiction?
Or is it more apt to get us into a heap of trouble?
It is always useful to bring forward examples from the print world when they are
available to shed light on difficulties like the current one. Two occur to me. The
first is the practice of faculty creating a collection of readings gathered from
disparate sources as a quasi‐textbook for a course. I have never heard of anyone
advocating that libraries catalog such an object. But why not? Surely, surrogates
for such objects would be useful if the table of contents were included. Would not
others teaching similar courses benefit from having access to the description of
the book?
Perhaps a more apt example, certainly a more recent one, is the possibility of
anyone creating his or her own book by gathering pieces and parts from a large
database of books, whose contents are themselves stored and accessible in parts.
Not only chapters and sections could be extracted, but pictures and tables and
any other pieces at the whim of the purchaser. As depicted by Lisa Guernsey,
"Under this model, books have not only turned into streams of electronic bits that
are downloaded to hand‐held devices or printed on demand. They have also
turned into databases ‐‐ pools of digital information that people can extract and
combine on their own terms." (From "Books by the Chapter or Verse Arrive on the
Internet This Fall," NY Times, July 18, 2000)
Clearly, the results of this process are outside the scope of the cataloger.
I would argue that a Web resource is often much more like a fluid, multi‐
dimensional, multi‐layered, constantly changing complex of things and
relationships than it is like a simple object. Web resources do not have tidy
boundaries.
Web Resources
It is necessary to probe this issue further. Web resources are different from
monographic objects in ways that profoundly change the cataloging problem; this
difference is growing: more of the Web can be thus characterized and the
distance between such resources and the monographic object is growing.
Most simply, the problematic characteristic of the Web resource is one of extent:
it is difficult, if not impossible, to define the extent of a Web resource, to state
where it begins and where it leaves off. Try defining these terms: Web page or
Web site. They are used ambiguously on the Web and in the literature. Moreover,
what relation do they have to the terms: file, directory, or server? The vagueness
of the terminology in this area is symptomatic of the vagueness, in physical terms
as well as conceptual terms, of the underlying concepts.
Before we can catalog something, we have to know what we are talking about.
The Role of Libraries in Web Resource Description
We also have to know what we want to accomplish. Barbara Baruth, in a recent
article in American Libraries ("Is Your Catalog Big Enough to Handle the Web,"
August, 2000, pp. 56‐60) explores the question of the library's role in resource
discovery on the Web. She asks, "Will the impressive second‐generation search
engines out now or third‐generation engines now incubating make the idea of
quality‐based services such as CORC obsolete?" Future search engines, she
continues, may be able to do a fine job, "scouring the net and bringing back
tailored results." And finally she asks the sixty‐four dollar question, "Is it possible
that manual efforts to explore, evaluate, and catalog the vast reaches of the
Internet just can't compete [with these advanced search engines]?"
What is the library's responsibility with respect to providing access to Web
resources? What is its role, and how should it carry out this role? Until we provide
credible answers to these questions, it is not possible to chart the future course of
libraries, and secondarily, cataloging. Even if we agree with Barbara Baruth's
assessment that search technology will improve sufficiently to eliminate the need
for human resource description, how long will this take? I am always suspicious,
and I recommend this scepticism to all, when delivery is promised of technologies
that are not yet in beta test. Experience tells us that the promised date almost
invariably stretches into the future.
Let me state my own view: I see no hope that searching alone will replace the
need for human cataloging in the foreseeable future, that is, the next 5‐10 years.
Here are some reasons for my view:
Wrong, obscure or missing information
Searching is similar to automated cataloging in that neither can overcome the
absence of data inferable from a resource, and Web resources will not evolve
stable self‐describing mechanisms for a long time, if ever; such mechanisms are
not yet even being broadly discussed. Desired characteristics such as creation
date, revision date, and expiration date, just are not easily available from most
Web resources. Inappropriate titling, weak or absent content descriptors ‐ we can
go on and on. The absence of these descriptors, or their presence in corrupt or
unrecognizable form, within a Web resource corrupts the results of any searching;
and we can expect such problems to grow for a long time rather than abate.
Authority control
The problem of coordinating and differentiating names, a modest source of
difficulty within the controlled environments of the library catalog and the
commercial publishing world, becomes a nightmare on the Web. All of the usual
suspects are involved: personal names, corporate names, geographic names,
subject descriptors; all now compounded by language and character set confusion
on an immense scale.
Selection
Finally there is the issue of selection. The Web now has over a billion pages,
whatever that means. The task of culling from this huge morass the population of
stuff that we want to search is almost overwhelming. It can only be accomplished
by an equally huge, ongoing effort of thousands of people, effectively coordinated
by well‐designed online systems.
Conclusions and Recommendations
Let me take a final quote from Barbara Baruth's article cited above: "The future of
library systems architecture rests in the development of umbrella software that
digests search results from rapid, coordinated searches of a variety of disparate
databases." That is, the job of resource discovery will be accomplished primarily
through software directly acting on Web resources without benefit of human
intervention, particularly of the cataloging sort. I disagree with this position on a
number of grounds, not least that I believe that searching alone will reach a point
of diminishing returns (and may have already). A second, library‐centric reason is based
on the assertion that if the library role can be encapsulated by such search
engines, we can dispense with libraries forthwith: this functionality can be
provided by software firms and distributed directly to patrons either as clients or
by glitzy Web portals.
I would argue that it is the responsibility of the library to provide effective access
to knowledge resources on the Web. If the various commercial services can
adequately accomplish this library goal, let's get on with other worthwhile
knowledge management tasks required by our patrons. Barbara Baruth is
certainly not alone in the belief that such services are rapidly succeeding in this
goal. A parallel here is the dependence of libraries on abstracting and indexing
services, which provide tools for accessing the journal literature. Northern Light
and Google are Web versions of the same idea.
Let us assume that library intervention is required for successful access to Web
resources of interest to patrons. For those resources that are roughly equivalent
to documents in the physical world ‐ self‐contained, more or less static ‐ the
cataloging task emerges in much its historic form. This is no small task, because there
are a great many such objects. Let us continue to ignore that other class of
resources, those whose object‐hood is in question.
How should libraries provide access to document‐like knowledge resources on the
Web? If the library community decides that it is necessary to establish a form of
bibliographic control for such objects, three paths are open:
1. Use or adapt MARC/AACR2
2. Start fresh creating a library metadata system with the same aims as the
Dublin Core
3. Use or adapt the Dublin Core
I will discuss each of these briefly.
Use or Adapt MARC/AACR2
There may have been a time when this was a useful direction to take but it is long
past. The result of such an exercise would have many of the attractive attributes
of the Dublin Core, particularly its simplicity and flexibility.
Start Fresh
A fresh start, guided by the lessons learned from the long parturition of the
Dublin Core, is an intriguing idea. But is it realistic? Can the library profession
manage the rapid creation and deployment of such a standard? Nothing in our
history encourages optimism.
Use or Adapt the Dublin Core
We are left with this final option. It is more likely that we can make progress
either by using whatever version of the Dublin Core is current or, far better in my
view, by attacking the problem of creating a library‐specific variant of the Dublin Core
that suits the aims of the library. The criticisms of the Dublin Core offered above
provide at least a starting point for what such a variant might look like.
As a final point, I would strongly recommend that at least one action be
taken forthwith: that a MARC version of the Dublin Core be developed, with
appropriate instructions and examples. The work products of such a MARC
version would include at least the following:
• The list of fields and sub‐fields defining the MARC Dublin Core record,
including an indicator that the record is a Dublin Core record.
• Necessary documentation with appropriate examples.
• A definition for a MARC input screen to guide local system vendors and
utilities.
• A plan to urge cataloging utilities to incorporate this style of record into
their editors.
I am not suggesting a multi‐year project; my guess is that this work effort could be
accomplished satisfactorily in a matter of a very few months.
This MARC version and its accompanying documentation would be suitable for
use in library OPACs, if desired, and would be directly convertible to and from any
database of Dublin Core records. The advantages of doing this are obvious. It
would immediately communicate to thousands of catalogers the essential nature
of the Dublin Core and equip them to make use of existing systems and software
to create resource descriptions for Web resources. Would this be a solution to our
problems? No, but it would put us in the game as it is defined in today's Web
world. Consider where we would be today if a library‐defined version of the
Dublin Core had existed three years ago. If the MARC Dublin Core had been adopted and
vigorously applied by thousands of libraries, we would be far better positioned to
serve the Web needs of library patrons, and Web knowledge access would be far
different and far better.
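The claim that a MARC version would be "directly convertible to and from" Dublin Core records can be illustrated with a small round-trip sketch. The mapping table below is an assumption made purely for this example: the MARC tags and subfield codes are illustrative choices, not the field list that the recommended work would produce, and the function names are hypothetical.

```python
# Illustrative mapping from Dublin Core elements to (MARC tag, subfield).
# These tag choices are assumptions for this sketch, not an agreed standard.
DC_TO_MARC = {
    "title": ("245", "a"),
    "creator": ("720", "a"),
    "subject": ("653", "a"),
    "description": ("520", "a"),
    "identifier": ("856", "u"),
}
# Reverse lookup, valid because every (tag, subfield) pair above is unique.
MARC_TO_DC = {field: element for element, field in DC_TO_MARC.items()}


def dc_to_marc(record):
    """Convert {element: [values]} into a list of (tag, subfield, value)."""
    fields = []
    for element, values in record.items():
        tag, subfield = DC_TO_MARC[element]
        for value in values:
            fields.append((tag, subfield, value))
    return fields


def marc_to_dc(fields):
    """Convert a list of (tag, subfield, value) back into {element: [values]}."""
    record = {}
    for tag, subfield, value in fields:
        element = MARC_TO_DC[(tag, subfield)]
        record.setdefault(element, []).append(value)
    return record


dc_record = {
    "title": ["An Introduction to the Resource Description Framework"],
    "creator": ["Miller, Eric"],
    "identifier": ["http://www.dlib.org/dlib/may98/miller/05miller.html"],
}
marc_fields = dc_to_marc(dc_record)
for tag, subfield, value in marc_fields:
    print(f"{tag} ${subfield} {value}")
```

The conversion is lossless in both directions only as long as each element maps to exactly one field and subfield, which is one reason the field list would need to be fixed before anything else.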
Notes and Sources
METADATA, the trademark
Thanks to Rick Pearsall, FGDC Metadata Coordinator, I learned that the term
"Metadata" was trademarked in 1986 by The Metadata Company (The Metadata
Company, http://www.metadata.com). Its invention is credited to Jack E. Myers
who is said to have coined the term in early summer of 1969. The trademark
should be written with capital letters and should be distinguished from both
"meta data" and "meta‐data".
6.2 Metadata System Examples
6.2.1 Content Standard for Digital Geospatial Metadata (CSDGM)
An outstanding example of metadata definition is that developed for Geospatial
data and mandated by the Federal Government.
The standard was developed from the perspective of defining the information
required by a prospective user to determine the availability of a set of geospatial
data, to determine the fitness of the set of geospatial data for an intended use, to
determine the means of accessing the set of geospatial data, and to successfully
transfer the set of geospatial data. As such, the standard establishes the names of
data elements and compound elements to be used for these purposes, the
definitions of these data elements and compound elements, and information
about the values that are to be provided for the data elements.
As stated in the documentation for the standard, "The first impression of the
CSDGM is its apparent complexity; in printed form it is about 75 pages long. This
is necessary to convey the definitions of the 334 different metadata elements and
their production rules. Do not let the length dismay you."
(http://www.lic.wisc.edu/metadata/metaprim.htm, 'Metadata Primer ‐‐ A "How
To" Guide on Metadata Implementation') If you are dismayed by its length and
complexity, join the crowd!
6.2.2 U.S. Geological Survey. Government Information Locator Service.
URL: http://www.gils.net/
A useful source document is available through the U.S. National Archives and
Records Administration (NARA). Guidelines for the Preparation of GILS Core
Entries.
URL: http://www.ifla.org/documentslibraries/cataloging/metadata/naragils.txt
6.2.3 The Consortium for Interchange of Museum Information (CIMI)
From the introduction at the site: CIMI (the Consortium for the Computer
Interchange of Museum Information) is committed to bringing museum
information to the largest possible audience. We are a group of institutions and
organizations that encourages an open standards‐based approach to the
management and delivery of digital museum information.
A useful overview is provided in, "The use of XML as a transfer syntax for museum
records during the CIMI Dublin Core test bed : some practical experiences."
http://www.cimi.org/documents/XML_for_DC_testbed_rev.doc
6.3 Other Sources
6.3.1 INDECS: interoperability of data in e‐commerce systems
An international initiative of rights owners creating metadata standards for
e‐commerce: "putting metadata to rights". INDECS provided the metadata model
for the DOI. The site has links to background information on the INDECS project
and its results.
6.3.2 Digital Library: Metadata Resources
The single best source for all aspects of resource discovery metadata.
6.3.3 The Resource Description Framework
Dave Beckett's Resource Description Framework (RDF) Resource Guide
The official source document for RDF defines it as follows:
Resource Description Framework (RDF) is a foundation for processing metadata; it
provides interoperability between applications that exchange machine‐
understandable information on the Web. RDF emphasizes facilities to enable
automated processing of Web resources. RDF can be used in a variety of
application areas; for example: in resource discovery to provide better search
engine capabilities, in cataloging for describing the content and content
relationships available at a particular Web site, page, or digital library, by
intelligent software agents to facilitate knowledge sharing and exchange, in
content rating, in describing collections of pages that represent a single logical
"document", for describing intellectual property rights of Web pages, and for
expressing the privacy preferences of a user as well as the privacy policies of a
Web site. RDF with digital signatures will be key to building the "Web of Trust" for
electronic commerce, collaboration, and other applications.
Modeling & Encoding Knowledge: RDF
• RDF (Resource Description Framework)
• Provides enabling technology for richly structured information
– Support for and integration of multiple independent vocabularies
• Rich data model supporting notions of distinct entities and properties
– Formal model with basis in logic
• Expressible in a machine‐readable manner (e.g., XML)
RDF Components
• Formal data model
• Syntax for interchange of data
• Schema type system (schema model)
• Syntax for machine‐understandable schemas
• Query and profile protocols
• Ontologies layered on top via extensions to the base RDF language (OWL)
RDF Data Model
• Provides the underlying structural foundation for the expression of application
(instance) data models
– for consistent encoding, exchange and processing of information
– provides a basis for interoperability
• Individual communities can then define and express semantics on the basic
model
• Model is distinct from the syntax for expressing it
– XML
– triple notation
– relational databases (triple stores in tables)
RDF Data Model
• Binary relationships
– Triples
– <subject> <predicate> <object>
– Carl livesIn Ithaca
– Ithaca hasWeather Terrible
RDF Data Model
• URIs for subjects, predicates, and objects allow joins
• Joins produce directed labeled graphs
• Graphs allow deductive inferences
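The triple model and the idea of a join can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not an RDF library: triples are plain tuples, short strings stand in for the URIs that real RDF would use, and the helper names objects and weather_where_lives are hypothetical.

```python
# Minimal sketch of the RDF triple model: each statement is a
# (subject, predicate, object) tuple. Real RDF would use URIs for
# these names; short strings are used here for readability.
triples = {
    ("Carl", "livesIn", "Ithaca"),
    ("Ithaca", "hasWeather", "Terrible"),
}


def objects(graph, subject, predicate):
    """Return every o such that (subject, predicate, o) is in the graph."""
    return {o for (s, p, o) in graph if s == subject and p == predicate}


def weather_where_lives(graph, person):
    """Join two triples on a shared node:
    person -livesIn-> place -hasWeather-> weather."""
    result = set()
    for place in objects(graph, person, "livesIn"):
        result |= objects(graph, place, "hasWeather")
    return result


print(weather_where_lives(triples, "Carl"))
```

Because "Ithaca" is the object of one triple and the subject of another, the two statements connect into a small directed labeled graph, and following the edges yields a fact that neither triple states on its own.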
RDF Model Elements
– Resource
– Property
– Value
– Statement
– Containers
RDF Model Primitives
RDF Syntax
• The RDF model defines formal relationships among resources,
properties and values
• Syntax is required to...
– Store instances of the model in files
– Communicate files from one application to another
• XML is one well‐supported syntax; N3 is another
RDF Containers
• Permit the aggregation of several values for a property
• Express multiple aggregation semantics
– unordered
– sequential or priority order
– alternative
RDF Containers
• Bag
– unordered grouping
• Sequence
– ordered grouping
• Alternatives
– alternate values
– need to choose at least one value
– first value is default or preferred value
RDF ‐ Bag
• Unordered group
• “Carl Lagoze and Stuart Weibel are co‐authors”
<BIB:Author>
  <Bag>
    <li> Carl Lagoze </li>
    <li> Stuart Weibel </li>
  </Bag>
</BIB:Author>
RDF ‐ Sequence
• Ordered or priority group
• “Carl Lagoze is primary author and Stuart Weibel is
second author”
<BIB:Author>
  <Seq>
    <li> Carl Lagoze </li>
    <li> Stuart Weibel </li>
  </Seq>
</BIB:Author>
RDF ‐ Alt
• Client chooses one of several values
• First value is default
• “The distance is 15 kilometers or 9.3 miles”
<DC:Coverage>
  <Alt>
    <li> 15KM </li>
    <li> 9.3M </li>
  </Alt>
</DC:Coverage>
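The slide examples above use an abbreviated notation without namespace declarations. In RDF/XML the container elements rdf:Bag, rdf:Seq, rdf:Alt and the member element rdf:li belong to the RDF namespace. The following Python sketch builds such a container with the standard xml.etree.ElementTree module. The bib namespace URI and the helper name container_property are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
BIB_NS = "http://example.org/bib#"  # hypothetical vocabulary namespace

# Ask ElementTree to use readable prefixes when serializing.
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("bib", BIB_NS)


def container_property(prop_tag, kind, values):
    """Build <prop><rdf:KIND><rdf:li>value</rdf:li>...</rdf:KIND></prop>."""
    if kind not in ("Bag", "Seq", "Alt"):
        raise ValueError("kind must be Bag, Seq or Alt")
    prop = ET.Element(prop_tag)
    container = ET.SubElement(prop, f"{{{RDF_NS}}}{kind}")
    for value in values:
        item = ET.SubElement(container, f"{{{RDF_NS}}}li")
        item.text = value
    return prop


# Seq: order matters, so the primary author comes first.
authors = container_property(
    f"{{{BIB_NS}}}Author", "Seq", ["Carl Lagoze", "Stuart Weibel"]
)
print(ET.tostring(authors, encoding="unicode"))
```

Passing "Bag" or "Alt" instead of "Seq" changes only the container element; the meaning of the ordering (irrelevant, significant, or first-is-default) is carried by that element name.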
RDF meta‐model
• RDF basic types
– rdf:Resource – everything that can be identified (with a URI)
– rdf:Property – specialization of a resource expressing a binary
relation between two resources
– rdf:type – predefined property expressing that the subject of the
property is an instance of the category or class given by the
value of the property
– rdf:Statement – a triple with properties rdf:subject, rdf:predicate,
rdf:object
• An RDF statement is a triple consisting of a resource
(subject), a property and a second resource (object)
– (:s :p :o)
• Expressible also as binary relations
– P(S,O) – e.g., Title(R, “War & Peace”)
RDF triple model
RDF meta‐model basic elements
• All defined in the rdf namespace
– http://www.w3.org/1999/02/22-rdf-syntax-ns#
• Types (or classes)
– rdf:Resource – everything that can be identified (with a URI)
– rdf:Property – specialization of a resource expressing a binary relation
between two resources
– rdf:Statement – a triple with properties rdf:subject, rdf:predicate, rdf:object
• Properties
– rdf:type – subject is an instance of the category or class given by the value
– rdf:subject, rdf:predicate, rdf:object – relate the elements of a statement
tuple to a resource of type Statement
RDFs Namespace
• Class‐related
– rdfs:Class, rdfs:subClassOf
• Property‐related
– rdfs:subPropertyOf, rdfs:domain, rdfs:range
Problems with RDF/RDFs
Non‐standard, overly “liberal” semantics
• No distinction between classes and instances
– <Species, type, Class>
– <Lion, type, Species>
– <Leo, type, Lion>
• Properties themselves can have properties
– <hasDaughter, subPropertyOf, hasChild>
– <hasDaughter, type, Property>
• No distinction between language constructors and ontology vocabulary, so
constructors can be applied to themselves/each other
– <type, range, Class>
– <Property, type, Class>
– <type, subPropertyOf, subClassOf>
• No known reasoners for these non‐standard semantics
Problems with RDF/RDFs
Weaknesses in expressivity
• No localized domain and range constraints
– Can’t say the range of hasChild is person in the context of persons and
elephant in the context of elephants
• No existence/cardinality constraints
– Can’t say that all instances of person have a mother that is also a person
– Can’t say that persons have exactly two biological parents
• No transitive, inverse or symmetric properties
– Can’t say isPartOf is a transitive property
– Can’t say isPartOf is the inverse of hasPart
– Can’t say touches is symmetric
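To make the last limitation concrete: with RDF and RDFs alone there is no way to declare isPartOf transitive, so the third fact below would not follow from the first two. OWL adds such a declaration (owl:TransitiveProperty), and a reasoner can then compute the closure. The following Python sketch shows the kind of inference involved. It is a plain illustration, not an OWL reasoner, and the resource names (Piston, Engine, Car) are hypothetical.

```python
def transitive_closure(triples, predicate):
    """Add (a, predicate, c) whenever (a, predicate, b) and
    (b, predicate, c) are both present, until nothing new appears."""
    closed = set(triples)
    changed = True
    while changed:
        changed = False
        pairs = [(s, o) for (s, p, o) in closed if p == predicate]
        for a, b in pairs:
            for b2, c in pairs:
                if b == b2 and (a, predicate, c) not in closed:
                    closed.add((a, predicate, c))
                    changed = True
    return closed


facts = {
    ("Piston", "isPartOf", "Engine"),
    ("Engine", "isPartOf", "Car"),
}

# Only the newly derived statements.
inferred = transitive_closure(facts, "isPartOf") - facts
print(inferred)
```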
Relationship between OWL and RDF(s)
• OWL Full is an extension of RDF
• OWL Lite and OWL DL are extensions of a restricted view of RDF
• Every OWL document is an RDF document
• Every RDF document is an OWL Full document
• Only some RDF documents are OWL Lite or OWL DL
• Constraining an RDF document to be OWL Lite or DL
– Every individual must have class membership (at least owl:Thing)
– URIs for classes, properties, and individuals must be mutually disjoint
The “DL” in OWL DL
• Description Logics
• Goal: to be able to reason (infer information) about a
knowledge base
• Remember: a knowledge base consists of both meta (schema)
information and instance (individual) information
• Remember: we want to do this based on an open world
assumption
• OWL (Lite/DL) is then an RDF expression of DL
Semantic Web Service Architecture
Ontologies provide a large degree of flexibility and expressiveness, the ability to
express semi‐structured data and constraints, and support for types and inheritance.
The industry’s Web service (quasi‐)standards, however, provide better
manageability, scalability, and modularization. The benefits of both technologies
can be obtained by merging DAML‐S with current Web service standards.
Web Services and Web Service Access. For this purpose, we
distinguish two main scenarios. In the first one, a human user wants to access the
Web service. There are (mainly) two possibilities. The user may directly access the
Web service (cf. arrow 1 in Figure 2), viz. the technology‐based details of the
service description, the business‐related details of the service publication, and the
internal process flow details of the service in service interaction. This direct
invocation, however, is only possible if the user knows the service and its location,
which often is not the case. Alternatively, the user may interact with a registry
located at the service discovery layer, which helps her
find a service and retrieve the descriptions of how to invoke it (arrow 2 in
Figure 2).
In the second scenario, agents want to find and invoke a Web service in order to
fulfill a task given to them by a human user. This adds new possibilities to the first
scenario, because agents may find services more effectively and access them directly
(arrow 4 in Figure 2) by crawling decentralized machine‐processable metadata
on the WWW. Furthermore, they may also take advantage of registries, just like
a human user (arrow 3 in Figure 2).
If the human or machine agent queries for a composed service (e.g., one
may ask for arrangements to visit a conference), the registry will make use of a
service flow composition description. This description covers services and service
flows (e.g., temporal sequences). With its help, the services
contained in the composed service (each atomic or itself composed) may either be
located directly or retrieved via the very same registry used for atomic services.
Communication. As in the IBM quasi‐standard, the two bottom layers are
standard network protocols and SOAP, which allow the exchange of object
descriptions by standard means. In addition, the use of machine agents
requires interaction through a corresponding communication protocol such as
Knowledge Query and Manipulation Language (KQML) or Agent Communication
Language (ACL) (cf., e.g., the survey paper (Labrou 2001)). These protocols are
currently not “ontology‐capable”. Thus, there is a need for extension in order to
better reflect ontology requirements at these levels.
Further Layers. Finally, the
extended layers for Quality of Service, security, and management as depicted in
the IBM Web service architecture are still needed, but are not illustrated in the
figure for ease of presentation.
Web Service Binding. The Semantic Web service
architecture supports better service invocation by the use of agents because the
underlying ontologies are extensible. This advantage is inherited by DAML‐S,
which allows for extension of the predefined ontologies.
CONTENT MANAGEMENT WORKFLOW
The Internet has become the greatest source of all kinds of information, whether it is
business or literary, academic or hobby‐related. Just type any keyword into your
favorite search engine and millions of websites will pop up to answer
your queries. You may be aware that these websites were created by
content writers and website designers. However, you may be less aware of the
contributions of the content manager. The term "content management" may
sound new to you, but content managers play a big role in web content development.
Like any other managerial work, content management involves coordinating
a network of myriad actions, done essentially in chronological order.
Design and content development are two simultaneous and important aspects
of website creation. Content management involves the coordination between
these two parts of website development.
Content management involves the integration of two creative aspects, namely
designing and writing:
* It is the writers who put the thoughts and messages of the website owner
on paper. The designers, on the other hand, try to bring the
content to life by complementing it with suitable graphics and other forms of
visual design.
* Both aspects of the production are monitored by the content manager. From
time to time the content management team provides the designers with the necessary
templates and the writers with guidelines on the articles.
* Finally, both productions merge into one to create the product, that is, the
web page.
* Now is the time for the content management team to give final touches to the
product, and here the role of the editorial manager becomes crucial. This is the
most important stage in content management, as it is the final stage before the
product hits the market.
* It is the efficiency of the content management system that can make or
break the success of the web pages. It is the scrutinizing power of the content
management that decides the fate of the web page once it has been uploaded to
the server. However, thorough scrutiny or detailed editing alone does not
determine the popularity of the website. It is the foresight of the management in
choosing the topic that determines to some extent how popular the website is
going to be. Thus research work should be of high quality, and it is the
responsibility of the content manager to ensure that the best materials are used
to produce the output.
So the content management workflow involves:
* Knowing the trends and needs of the market before deciding to launch any
product
* Conducting refined research
* Thorough scrutinizing of the contents
* Smoothly coordinating between words and designs.
First, it's important to understand some concepts presented in this article.
Metadata:
Metadata is "defining" data that provides information about content data
managed within the CMS. In other words, metadata assigned to a piece of
content describes what the content is, what it relates to, and how it's
associated with other pieces of content. Metadata is commonly described
as "information about information."
For Web sites, content usually has categories, keywords, authors,
publishing dates, and template assignments that control how the content is
displayed and used. This metadata is then used for a variety of purposes,
including searching and indexing.
Template:
A CMS uses templates to control the display of your content ‐‐ that is, the
way your content will look on a page. The templates are created by your
Web designers and are managed separately from the content. At
publication time, the CMS puts the content into the template for final
presentation. Templates are like blank versions of your page types; until the
CMS puts, for example, the specific "Subject" content into the "Subject"
space, there's nothing there. When multiple pages on your site follow the
same formatting, you can use one template for all the pages that will look
the same, even when the actual content is different.
Separation of content and presentation:
When you create content, the CMS isolates the content data from the
content formatting and the content metadata. On a Web site, data usually
consists of the text and the images that will appear on your site. The
formatting is determined by the template assigned to that content, and the
metadata is defined above. The CMS stores these components separately
and maintains the relationship between them. This concept is important,
because it makes your content data portable. Since your content is not
attached to the format for which it was originally written, you're free to use
it in other ways.
At the center of any CMS is some sort of repository, or content database, where
the content is stored. Content contributors get content into the repository via the
authoring interface, and they categorize and organize it using the metadata
management tools. When the content is ready, the CMS helps to get the content
back out for publishing. The end of the publishing process is the beginning of the
presentation process, at which point your audience can view your content. All of
this is managed by the CMS workflow system.
Simple, right? It gets more complicated as more features are added. The basic
features of a CMS include:
Content Authoring:
This allows your content contributors to create content and store it in the
repository. There are many tools and styles.
Workflow Management:
This allows you to monitor, adjust, and maintain the process through which
the creation and publishing tasks are done in your organization. Systems
range from highly complex to quite simple, but all give you a set of tools to
manage the activities of authors and the progress of content.
Content Storage:
This feature keeps the content sensibly organized and accessible. Most CMS
use a relational database; the point is to store the content in one place and
in a consistent fashion.
Publication Management:
This allows you to organize your content with metadata and formatting.
CMS have different ways of approaching this, but the better ones allow you
to define and manage your metadata and your templates.
Publishing:
Publishing allows you to merge the content data and the content
formatting and move it from the repository to your publication. Different
methods exist, but they all allow you to push the content out to some
publicly accessible place without the help of your tech team.
Within these features, there can be hundreds of smaller features that help
accomplish the tasks of creating content publications. It's important to get the
system that meets your basic needs the best, then consider these other features
(which we'll discuss in the third article in the series).
Benefits of a CMS
Separation of Content Data and Presentation Data
Because content in a CMS isn't inextricably tied to a particular presentation
format, two powerful abilities are available:
Content portability:
Since the CMS stores content as data, that data can be inserted into any
appropriate output format or template. If you want your article to appear
with a blue background in your Members section, but with a yellow
background in your General Information section, you don't need to write
your article twice. Instead, you write it once and assign it to the blue
template and the yellow template.
Design flexibility:
Similarly, since the CMS stores the templates separate from the content
data, if you want to make a design change, however small (such as
changing the font color on a particular type of page) or sweeping (such as
changing the font color, type, and size throughout your site), you only need
to change the template; the CMS handles the rest.
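The blue/yellow example above can be sketched with Python's standard string.Template class. This is only an illustration of the principle, not how any particular CMS works: the article fields, the HTML fragments and the function name publish are all assumptions.

```python
from string import Template

# Content data and metadata are stored once, independent of any layout.
article = {
    "title": "Volunteer Day",
    "body": "Join us on Saturday.",
    "author": "J. Smith",  # metadata
}

# Two hypothetical templates, managed separately from the content.
blue_template = Template(
    '<div class="blue"><h1>$title</h1><p>$body</p></div>'
)
yellow_template = Template(
    '<div class="yellow"><h2>$title</h2><p>$body</p><i>$author</i></div>'
)


def publish(content, template):
    """Merge one piece of content into a template at publication time."""
    return template.safe_substitute(content)


members_page = publish(article, blue_template)    # Members section
general_page = publish(article, yellow_template)  # General Information section
```

The article is written once and rendered twice. Changing the look of every Members page means editing blue_template only; the stored content is untouched.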
The whole point of the CMS is to let your authors concentrate on creating
content, freeing them from the duties of ad hoc Web design, publishing worries,
having to manually repurpose their content for other formats, and so on. A CMS
can save you money and time by stripping away these extraneous tasks.
Single Storage in a Single Place
In a CMS, all the content data is stored in one place, in a consistent way, and
perhaps most importantly, only once.
If you've ever suffered because you have nine different versions of an article and
you can't figure out which one to use, you'll be happier with a CMS. The system
maintains one copy of the content, regardless of how you plan to use it. If, for
example, you have a press release that's displayed in your Press Release section,
your News Section, and your Archives section, and a mistake is discovered, the
process for fixing it will be easier. Without a CMS, you would probably have to fix
the mistake in three files; with a CMS, you would fix it in one file (because there's
only one data file anyway), and the change appears in all three locations.
Because your content is stored consistently in one system, it's much easier to
create relationships (usually hyperlinks) between content pieces and maintain
them. For example, if you have several pieces that link to each other, and you
move one, the CMS will make the necessary changes to keep the links working.
It's also simpler to create a new piece of content by aggregating other pieces. For
example, let's say you have a collection of Internet tips, each stored as a separate
piece of content, but all united by the same metadata. A CMS makes it easy to
present all those pieces together by creating a template that shows all content
that has the metadata, in this case, "type: tip" and "subject: internet". It's also
much easier to survey what you have.
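Aggregation by metadata amounts to a simple query over the repository. The sketch below assumes a hypothetical in-memory repository in which every item carries its metadata as dictionary keys. A real CMS would run an equivalent query against its content database.

```python
# Hypothetical repository: each item stores content plus metadata.
repository = [
    {"title": "Use strong passwords", "type": "tip", "subject": "internet"},
    {"title": "Annual report", "type": "report", "subject": "finance"},
    {"title": "Bookmark trusted sites", "type": "tip", "subject": "internet"},
]


def select(items, **wanted):
    """Return the items whose metadata matches every key/value in `wanted`."""
    return [
        item for item in items
        if all(item.get(key) == value for key, value in wanted.items())
    ]


internet_tips = select(repository, type="tip", subject="internet")
for tip in internet_tips:
    print("-", tip["title"])
```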
Finally, should you decide to take all your content and migrate it to some new
format, the process should be much easier.
All of this means more time and money saved: you don't duplicate work, you
don't lose content, and you spend less time managing content.
Workflow Management
Any good CMS will have some sort of workflow management scheme. This usually
involves defining certain roles ‐‐ such as author, editor, and publisher ‐‐ and giving
each of those roles some abilities and responsibilities.
Likewise, content can exist in a number of states, such as draft, final, published, or
archive, and each state has certain characteristics.
Combine the roles and the states, wrap some logic around it, and you have a
workflow system. The author is assigned to create the draft, the editor is notified
that the draft is ready to be edited, etc.
Workflow management facilitates better communication, progress tracking, and
more efficient content transitions. Even a basic system will notify the appropriate
role that a piece of content has reached a state where it needs attention. More
advanced systems allow all sorts of triggers and controls to be put into place.
None of these features are going to do the work of managing your processes;
rather, they give you better visibility into the process and better tools to do the
work.
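The combination of roles and states can be sketched as a small state machine. The roles and states are the ones named above; the transition table (which role may move content out of which state) and the notification rule are assumptions chosen for illustration, and the class name ContentItem is hypothetical.

```python
# state -> (role allowed to advance it, next state). Assumed policy.
WORKFLOW = {
    "draft": ("editor", "final"),
    "final": ("publisher", "published"),
    "published": ("publisher", "archive"),
}


class ContentItem:
    def __init__(self, title):
        self.title = title
        self.state = "draft"      # the author has created the draft
        self.notifications = []   # (role to notify, state reached)

    def advance(self, role):
        """Move the item to its next state if `role` is allowed to do so."""
        if self.state not in WORKFLOW:
            raise ValueError(f"no transition out of state {self.state!r}")
        required_role, next_state = WORKFLOW[self.state]
        if role != required_role:
            raise PermissionError(
                f"{role!r} may not advance a {self.state!r} item"
            )
        self.state = next_state
        # Notify whoever is responsible for the following step, if any.
        if next_state in WORKFLOW:
            self.notifications.append((WORKFLOW[next_state][0], next_state))
        return self.state


item = ContentItem("Press release")
item.advance("editor")      # draft -> final
item.advance("publisher")   # final -> published
```

The table is the "logic wrapped around" roles and states: it blocks an author from publishing directly and records who needs to act next, but it does not do the editing or publishing itself.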
The major gain here is control, which saves time and money by speeding
communication and preventing mistakes. The workflow system handles much of
the communication, tracking, and measuring so your authors, editors, and
publishers can concentrate on writing, reviewing, and publishing, instead of
walking around checking on things, looking for lost drafts, and trying to figure out
where all the time has gone.
Automated Publishing
When it comes to freeing technical resources from publishing tasks, almost any
CMS shines. The CMS allows non‐technical people to schedule, trigger, and
otherwise manage the process of moving the content to the production
environment.
If your valuable technical people are constantly distracted by pushing out small
text changes, regularly releasing new articles, or fixing layout issues, the CMS will
change their worlds. With a CMS in place, these tasks become things that
publishers and editors can do, usually with a powerful set of tools available within
the CMS. The technical people maintain the CMS, but at a much higher level,
and their time is freed to handle more technical issues throughout your
organization.
Usually, the actual time required to publish your content is reduced. More
importantly, the time it does take is spent by the most appropriate people
(authors, editors, publishers), and not by people who are probably supposed to
be working on a new Web site feature or tuning up the network.
Hopefully, you have a more specific idea of what a CMS does, and how a CMS
might save your organization time, effort, and therefore money. On top of that, a
CMS will enable you to better manage your content, therefore making it more
usable for you and your constituency.
Nonprofit‐Focused Content Management Systems
Within the nonprofit world, there are a number of CMS solutions available that
were developed with nonprofits in mind. Many of these have the same abilities as
commercial CMS products, but they offer specific features for nonprofits, such as:
• Membership Management
• Online Donation Facilitation
• E‐mail Outreach
• Event Management
• Online Advocacy
XLANG
XLANG is an extension of WSDL, the Web Services Description Language. It provides
both the model of an orchestration of services and collaboration contracts
between orchestrations. XLANG, like BPML, was designed with an explicit
π‐calculus theoretical foundation.
Message Flow
Actions are the basic constituents of an XLANG process definition. The four types
of WSDL operations (request/response, solicit/response, one‐way, and
notification) can be used as XLANG actions. XLANG adds two other kinds of action:
timeouts (deadline and duration) and exceptions. Timeouts cannot be properties
of specific actions, as they may apply to an arbitrary block of actions. Timeouts
should be viewed as the action of sending a timeout event to the BPMS.
A process definition is specified within a service definition. The XLANG process
definition specifies the behaviour of the service. A service with a behaviour
represents an interaction spanning many operations; the incoming and outgoing
operations of the XLANG service represent interactions with other services,
therefore sequencing the operations of a given service is equivalent to
orchestrating a series of services. The interaction has a well‐defined beginning
and end.
Since the interaction may be long running, a given service may initiate many
different "process instances" based on the request of different clients. An
instance can be started in two ways. A service may be explicitly instantiated by a
background process or some application functionality or it may be implicitly
instantiated with an operation. Each time the service receives a message
corresponding to this operation, it will create a new business process instance.
This operation is called an activation operation (in this case, the activation
attribute has a value of true). Such an action must be an input with respect to its
operation within the service definition. A service instance terminates when the
process that defines its behavior ends.
For instance, a purchase order service may have two operations: one initiated by
the buyer, which itself activates a process instance, and one initiated by the seller,
which once completed marks the end of the process.
XLANG specifies the notion of message correlation. BPMI is currently working on
that specific issue and it should be part of the final BPML specification. Let's detail
the message correlation concept.
A service instance typically holds one or more conversations with other service
instances representing other participants (users, enterprise systems, partners)
involved in the interaction. It is possible that an enterprise system or a partner is
not set up with a communication protocol that keeps track of the conversation.
For instance, the only way to identify a given process instance might be to look up
the purchase order or the invoice number. Sometimes, correlation patterns can
become even more complex. The scope of correlation is not, in general, the entire
interaction specified by a service, but may span a part of the service behaviour.
XLANG implements message correlation by providing a very general mechanism
to specify correlated groups of operations within a service instance. A correlation
set can be specified as a set of properties shared by all messages in the correlated
group. The corresponding set of operations (in a single service) is called a
correlation group. A correlation group would typically correspond to a BPSS
collaboration. Whenever possible, it is better to keep track of the "collaboration
id" at the protocol level (the ebXML protocol, that is) rather than at the document
level, even if XLANG allows you to work with data elements such as a purchase order
number, which are inherently dependent on the document format. If the
correlation is specified at the ebXML envelope level, the service which binds a
collaboration to the business process just needs to keep track of all the open
collaborations with their respective collaboration IDs. An XLANG correlation
group has the same lifecycle as an ebXML collaboration:
• Correlation groups are instantiated and terminated, within the scope of
their service instance.
• Correlation groups may go through several instantiations within the
lifetime of a single service instance.
• The instantiation of a correlation group is triggered by a specially marked
operation.
• The correlation group instance lifetime is determined by the lifetime of its
context or service.
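Message correlation can be illustrated outside XLANG with a small Python dispatcher. This is a sketch of the idea only and does not reproduce XLANG syntax: the correlation set is a list of property names, an activation message creates a new process instance, and later messages carrying the same property values are routed to that instance. The class names and the po_number property are hypothetical.

```python
class ProcessInstance:
    """One long-running process instance; records the messages routed to it."""

    def __init__(self, key):
        self.key = key
        self.messages = []


class Dispatcher:
    def __init__(self, correlation_set):
        # Names of the properties shared by all messages in the group.
        self.correlation_set = correlation_set
        self.instances = {}

    def _key(self, message):
        return tuple(message[name] for name in self.correlation_set)

    def receive(self, message, activation=False):
        """Route a message to its instance; only an activation
        operation may create a new instance."""
        key = self._key(message)
        if key not in self.instances:
            if not activation:
                raise LookupError(f"no open instance for {key}")
            self.instances[key] = ProcessInstance(key)
        instance = self.instances[key]
        instance.messages.append(message["type"])
        return instance


dispatcher = Dispatcher(["po_number"])
dispatcher.receive({"type": "PurchaseOrder", "po_number": "PO-17"}, activation=True)
dispatcher.receive({"type": "Invoice", "po_number": "PO-17"})
```

Both messages end up in the same instance because they share the purchase order number, even though the sender keeps no conversation identifier at the protocol level.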
Data Flow
Just like BPML, XLANG relies on an XML data flow, which is fed by the message
flow and supports the control flow decisions. XLANG assumes that XML document
types are specified with XML Schema (XLANG does not support DTDs). A property
is bound to an element of an XML document with an XPath expression.
Properties have globally unique qualified names (QNames). Properties may be
either simple or structured. Simple properties are used mainly for correlation,
while structured properties are used for passing port references and participant
bindings for constructing dynamic participant topologies.
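Binding a property to a document element through a path expression can be sketched with Python's xml.etree.ElementTree, which supports a limited subset of XPath. The purchase order document, its element names, and the property names poNumber and ackAddress are all assumptions made for illustration. They are not taken from the XLANG specification.

```python
import xml.etree.ElementTree as ET

# Hypothetical purchase order; the element names are assumptions.
PO_XML = """
<PurchaseOrder>
  <Header>
    <OrderNumber>PO-17</OrderNumber>
    <ReplyTo>http://buyer.example.org/ack</ReplyTo>
  </Header>
</PurchaseOrder>
"""

# Each property name is bound to a path into the document.
PROPERTY_BINDINGS = {
    "poNumber": "./Header/OrderNumber",  # simple property, used for correlation
    "ackAddress": "./Header/ReplyTo",    # address supplied dynamically
}


def read_properties(xml_text, bindings):
    """Evaluate every bound path against the document and return the values."""
    root = ET.fromstring(xml_text)
    return {name: root.findtext(path) for name, path in bindings.items()}


props = read_properties(PO_XML, PROPERTY_BINDINGS)
print(props)
```

The process logic then refers only to the property names; if the document format changes, only the bound paths need to be updated.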
Control Flow
The control flow of BPML is very similar to that of XLANG. Elements such as
<sequence>, <switch>, and <all> have a similar meaning. In addition, XLANG
provides support for looping with the <while> element, which specifies that a
given fragment of the process definition is executed until a specified condition is
no longer true. This is particularly useful to support ebXML collaboration patterns
such as review or modify which may have recurrent business transactions.
Like in BPML, XLANG provides semantics to specify exceptions and exception
handlers, with the <pick> construct.
XLANG has introduced the notion of a context for local declaration of correlation
sets and port references, exception handling, and transactional behaviour. A
context provides and limits the scope over which declarations, exceptions, and
transactions apply.
XLANG supports open transactions, but unlike BPML, it does not support
coordinated transactions. XLANG transactions follow the model of long‐running
transactions, which are associated with compensating actions in case the
transaction fails.
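The idea of compensating actions can be sketched in Python as follows. This is a generic illustration of the long‐running transaction model, not XLANG syntax; the function name and the step representation are assumptions made for the example.

```python
def run_long_running_transaction(steps):
    """Run (action, compensation) pairs; undo completed steps if one fails.

    Returns True if every action succeeded, False if compensation ran.
    """
    completed = []
    for action, compensation in steps:
        try:
            action()
        except Exception:
            # Compensate in reverse order: the most recent step is undone first.
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensation)
    return True
```

Unlike a rollback in a database transaction, each compensation is an ordinary business action (for example, a refund) that reverses the visible effect of a step that has already completed.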
There is often an issue when specifying the outgoing port addresses: it is rarely
possible to know in advance the address of the outgoing message. XLANG allows
us to specify that the address bound to these outgoing ports will be supplied
dynamically. As with the correlation set, we are confronted with the problem of
locating this information in the content of documents. XLANG enables us to bind
the address (and other parameters if necessary) to a property definition on a
document. Although this mechanism is more generic, it is also trickier, since it will
strongly depend on the document formats, which may not have been designed to
support the corresponding information. For instance, a purchase order will carry
the contact information of the buyer, but it may not carry the URL to which the
"acknowledge purchase order" should be sent. In general we recommend treating
the ebXML header as a document and assigning all correlation sets and binding
parameters to the ebXML header whenever possible. We also recommend bringing
the corresponding CPA into the business process instance context in order
to leverage its information at run time.
Business Process Contracts
This part of XLANG overlaps with ebXML BPSS. However, contrary to what their name
suggests, contracts do not support any business‐related semantics. A contract is merely
a mapping between two port types which interact together. There is no notion of
business transaction, non‐repudiation, or legally binding transactions. The
concept is actually fairly difficult to use in real life since the two port types need
to support unidirectional messages in order to establish a contract. Consequently,
if your business relationship requires a request followed by a response, they
cannot belong to the same contract. A contract can only map ports which are
"unidirectional": an input only port will map to an output only port and
conversely:
In the rare cases where this is applicable a contract definition would look like this:
<XLANG:contract>
  <XLANG:services refs="provider:Create RFQ user:Create RFQ provider:Accept RFQ user:Accept RFQ"/>
  <XLANG:portMap>
    <XLANG:connect port="provider:Create RFQ/port:GetRFQ"
                   port="user:Create RFQ/port:SendRFQ"/>
    <XLANG:connect port="provider:Accept RFQ/port:SendAcceptRFQ"
                   port="user:Accept RFQ/port:GetAcceptRFQ"/>
  </XLANG:portMap>
</XLANG:contract>
WSFL
Introduction
The Web Services Flow Language (WSFL) is an XML language for the description of
Web Services compositions. WSFL considers two types of Web Services compositions:
• The first type specifies the appropriate usage pattern of a collection of Web
Services, in such a way that the resulting composition describes how to achieve a
particular business goal; typically, the result is a description of a business process.
• The second type specifies the interaction pattern of a collection of Web
Services; in this case, the result is a description of the overall partner interactions.
Flow Models
In the first case, a composition is created by describing how to use the
functionality provided
by the collection of composed Web Services. This is also known as flow
composition,
orchestration, or choreography of Web Services. WSFL models these
compositions as
specifications of the execution sequence of the functionality provided by the
composed Web
Services. Execution orders are specified by defining the flow of control and data
between
Web Services. For this reason, in this document, we will also use the term flow
model to refer
to the first type of Web Services compositions. Flow models can especially be
used to model
business processes or workflows based on Web Services.
Global Models
In the second case, no specification of an execution sequence is provided. Instead,
the
composition provides a description of how the composed Web Services interact
with each
other. The interactions are modeled as links between endpoints of the Web
Services’
interfaces, each link corresponding to the interaction of one Web Service with an
operation
of another Web Service’s interface. Because of the decentralized or distributed
nature of
these interactions, we will use the term global model in this document to refer to
this type of
Web Services composition.
Recursive Composition
WSFL provides extensive support for the recursive composition of services: In
WSFL, every
Web Service composition (a flow model as well as a global model) can itself
become a new
Web Service, and can thus be used as a component of new compositions. The
ability to do
recursive composition of Web Services provides scalability to the language and
support for
top‐down progressive refinement design as well as for bottom‐up aggregation.
For these
reasons, recursive composition has been a central requirement in the design of
the WSFL
language.
Hierarchical and Peer‐to‐Peer Interaction
WSFL compositions support a broad spectrum of interaction patterns between
the partners
participating in a business process. In particular, both hierarchical interactions
and peer‐to‐peer
interactions between partners are supported. Hierarchical interactions are often
found
in more stable, long‐term relationships between partners, while peer‐to‐peer
interactions
reflect relationships that are often established dynamically on a per‐instance
basis.
Language Overview
Before getting into a more detailed description of WSFL, we will sketch two use
cases for the
application of Web Services composition.
2.1 Use Cases
In the first use case, an enterprise wants to implement a business process for
processing
purchase orders using a set of Web Services.
They would identify the:
• Business process (for example, check credit history of the customer, reject
order, process order, ship goods)
• Business rules for sequencing of these steps (for example, first check credit,
then depending on the outcome, either reject the order or process the order followed
by shipment of the goods)
• Flow of information between the process steps (for example, take purchase
order as input to the process, pass it on to check credit, and so on).
In this “bottom‐up” development scenario, they would find Web Services already
offered by
other vendors and companies that can be used to realize the various processing
steps (for
example, a credit‐checking service offered by a financial institution, a goods‐
production
service offered by their favorite supplier, and a shipping service). They would then
use WSFL
to formally define the new business process.
A WSFL flow model defines the structure of the business process: WSFL activities
(circles in the figure above) describe the processing steps, and WSFL data and control
links represent the sequencing rules and information flows (possibly performing any
necessary data mapping) between these activities. For each activity, they would
identify the WSFL service provider responsible for the execution of the process step
(for example, services offered by shipping company A or by goods‐supplier company B)
and define the association between activities in the flow model and operations offered
by the service provider using WSFL export and plug link elements. The resulting flow
model is shown in the center of the figure above with “swim lanes” representing the
association of activities with service provider roles.
The second use case is a variant of the previous example. Here, an enterprise
wants to offer
a Web Service that mediates between service requesters (customers) who want
to order
goods and service providers who produce and deliver goods.
As in the previous example, the enterprise would define the business process for
handling
purchase orders as a WSFL flow model. In this case, however, they would not bind
the
activities to particular service providers. Instead they would identify the kind of
service
provider (role) they want for each activity (for example, some goods supplier for
activity
process order, some shipping service for activity ship goods).
They would then define the WSDL Web Service interface of the flow model, that
is, the WSFL
Service Provider Type of the flow model. This interface has two facets: One facet
defines the
interface that a customer would use when requesting processing of a purchase
order, that
is, the operations that the Web Service provides for use by service requesters. For
example,
the new service would provide an operation that takes a purchase order as input
and passes
it on (through a WSFL flow source) to the activities in the flow model for
processing. The
other facet identifies the operations that the service requires from the other
service
providers.
For each activity, there is one (proxy) operation on the external interface of the
flow model
that the service would use to interact with a service provider implementing that
activity. The
resulting Web Service is depicted as the dark shape around the flow model in the
figure
above. This Web Service can now be advertised in a service repository where it
would
attract two kinds of parties: those who want to use services provided by the Web
Service (in
our case, customers who want to place orders) and those who want to play the
role of a
service provider (in our example, a shipping or a goods supplying service).
To make this model work, the activities in the flow model must be connected to
operations
that actually perform the process steps represented by each activity. This is done
by a WSFL
global model (the outermost box in the figure above), which describes the
interaction
between service providers and requesters. Our enterprise would use WSFL service
provider
locators to define criteria for selection of a particular service provider and WSFL
plug links to
associate operations on service provider elements with the service‐requesting
operations on
the interface of the flow model.
A Quick Tour of WSFL
The purpose of a WSFL document is to define the composition of Web Services as
a flow
model or a global model. Both models have a declared public interface and an
internal
compositional structure. The composition assumes that the Web Services being
composed
support certain public interfaces, which can be specified as a single port type or as
a
collection of port types. We call this collection a service provider type.
The following code is a simplified example of a WSFL service composition defining
a flow
model called totalSupplyFlow. The syntax of many elements has been abbreviated
in the
interest of conciseness. The example assumes a set of WSDL port type and
operation
definitions as public interface of the service provider types referred to: the
supplier and
shipper service provider types are somehow assumed by the flow model; the
totalSupply service provider type appears to be defined by the flow model, but it
has
been already defined somewhere else, which is perfectly valid. Note that the flow
model
imposes “sequencing constraints” for the execution of operations of the
totalSupply
service provider type.
<flowModel name="totalSupplyFlow"
           serviceProviderType="totalSupply">
  <serviceProvider name="mySupplier" type="supplier">
    <locator type="static" service="qualitySupply.com"/>
  </serviceProvider>
  <serviceProvider name="myShipper" type="shipper">
    <locator type="static" service="worldShipper.com"/>
  </serviceProvider>
  <activity name="processPO">
    <performedBy serviceProvider="mySupplier"/>
    <implement>
      <export>
        <target portType="totalSupplyPT"
                operation="sendProcOrder"/>
      </export>
    </implement>
  </activity>
  <activity name="acceptShipmentRequest">
    <performedBy serviceProvider="myShipper"/>
    <implement>
      <export>
        <target portType="totalSupplyPT"
                operation="sendSR"/>
      </export>
    </implement>
  </activity>
  <activity name="processPayment">
    <performedBy serviceProvider="mySupplier"/>
    <implement>
      <export>
        <target portType="totalSupplyPT"
                operation="sendPayment"/>
      </export>
    </implement>
  </activity>
  <controlLink source="processPO" target="acceptShipmentRequest"/>
  <dataLink source="processPO" target="acceptShipmentRequest">
    <map sourceMessage="anINVandSR" targetMessage="anSR"/>
  </dataLink>
</flowModel>
The totalSupplyFlow flow model specifies how to collaborate with two service
provider
types in order to offer to their joint customers a complete business process. Each
of the two
service providers used within the flow model is represented by a separate
<serviceProvider> element. One service provider is of type supplier and is
referred to
as mySupplier in the flow model. The other service provider is of type shipper and
is
called myShipper. Both service providers contain “binding” information as well.
This
information is provided by means of a <locator> element, which specifies the
actual
service that will be used when the model is instantiated. In this case, binding
information is
“static,” but more dynamic binding schemes are possible.
The business process represented by the totalSupplyFlow flow model consists of
three
business tasks, called activities, that have to be performed in order to successfully
complete
the business process: A purchase order has to be processed, a shipment request
must be
accepted, and money has to be received. Each of these activities is specified by a
separate
<activity> element.
In our code example, the activities cannot be performed in an arbitrary order; there
is a sequencing constraint between them: the processing of the purchase order by the
supplier must precede the acceptance of the shipping request by the shipper, whereas
the money can be received at any time. The precedence rule is specified by simply
connecting the
two
corresponding activities. Two kinds of connections are established, a control
connection
(through a <controlLink> element), and a data connection (through a <dataLink>
element).
While the first connects the completion of one activity to the execution of
another, the second
connection represents a data exchange between the two. Note the <map>
element nested
inside the data link: it specifies what information needs to be transferred
between the two
linked activities. Also note that the separation of control flow and data flow is
very helpful. For
example, a service might only be enabled after the completion of another service
without
explicitly passing data from the former to the latter.
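To make the separation of control links and data links concrete, here is a small Python sketch that reads a trimmed copy of the totalSupplyFlow example (the performedBy and implement elements are omitted for brevity) and lists its activities and links. The function names are assumptions made for this illustration and are not part of WSFL.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the flow model discussed in the text.
FLOW_MODEL = """
<flowModel name="totalSupplyFlow" serviceProviderType="totalSupply">
  <activity name="processPO"/>
  <activity name="acceptShipmentRequest"/>
  <activity name="processPayment"/>
  <controlLink source="processPO" target="acceptShipmentRequest"/>
  <dataLink source="processPO" target="acceptShipmentRequest">
    <map sourceMessage="anINVandSR" targetMessage="anSR"/>
  </dataLink>
</flowModel>
"""


def read_flow_model(xml_text):
    """Return (activity names, control links, data links) of a flow model."""
    root = ET.fromstring(xml_text.strip())
    activities = [a.get("name") for a in root.findall("activity")]
    control = [(l.get("source"), l.get("target")) for l in root.findall("controlLink")]
    data = [(l.get("source"), l.get("target")) for l in root.findall("dataLink")]
    return activities, control, data


def unconstrained_activities(activities, control_links):
    """Activities that appear in no control link and so may run at any time."""
    linked = {name for link in control_links for name in link}
    return [a for a in activities if a not in linked]
```

For this model, processPayment is the only activity without a sequencing constraint, which matches the statement that the money can be received at any time.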
Web Services interact in a peer‐to‐peer manner. This pattern is immediately
reflected by the
interacting operations. For example, if a flow sends out a message via a
notification
operation, this operation corresponds to a one‐way operation at a service
provider. Pairs of
corresponding operations in this sense are referred to as dual operations. In our
example,
the activity processPO has to send out a process order. For this purpose, the
totalSupply service provider type declared by the flow model is assumed to
include a
port type totalSupplyPT with a sendProcOrder operation, which implements the
activity.
An <implement> element establishes this relation between an activity and its
implementing
operation. The service provider who is supposed to interact with an activity’s
implementation
(for example, to process the message sent) is defined through a <performedBy>
element.
To define the public interface of the composition, the <flowModel> element
includes a
declaration of the supported service provider type as an attribute of the flow
model, and a
mapping of operations of the port types of this service provider type to activities
of the flow
model. As indicated in the following figure, this mapping is specified by an
<export>
element, which relates an activity of the flow model and an operation of its public
interface.
This mapping defines the effect of each operation by relating it to the execution
of the
internal composition. The public interface defines the interaction of a flow model
with the
“outside,” that is, it specifies which messages are sent and which are used.
Service Composition Metamodel
This section describes at the conceptual level how Web Services are wired
together into
flows that represent business processes (see Section 3.1 “Flow Metamodel”).
Section 3.2
“Lifecycle Interface” describes how instances of such a business process are
manipulated
as a whole. In Sections 3.3 “Business Process Lifecycle” and 3.4 “Activity
Lifecycle,” we
sketch a minimum set of states and the transitions between them that further
describe a
business process and each of its encompassed activities. Finally, Section 3.5
“Recursive
Composition Metamodel” gives an overview on how new Web Services are
composed out of
other Web Services.
3.1 Flow Metamodel
This section describes the main concepts of the metamodel underlying WSFL for
specifying
flows. This is done by describing its syntax as a special kind of directed graph
(Section 3.1.1 “Syntax”) and its semantics by showing how each of the syntax elements
is to be interpreted in concert with the other syntax elements (see Section 3.1.2
“Operational Semantics”).
3.1.1 Syntax
This section describes the various ingredients of the metamodel in detail and
explains their
operational semantics.
3.1.1.1 Activities
Operations of Web Services are used within business processes as
implementations of
activities. An activity represents a business task to be performed as a single step
within the
context of a business process contributing to the overall business goal to be
achieved. The
operation used may be perceived as the concrete implementation of the abstract
activity to
be performed. Refer to Section 3.5.4 “Which Operation Is the Activity
Implementation?” for
more details.
Activities correspond to nodes in a graph. Each activity has a signature that is
related to the
signature of the operation that is used as the implementation of the activity.
Thus, an activity
can have an input message, an output message, and multiple fault messages. Each
message can have multiple parts, and each part is further defined in some type
system.
The figure above depicts an activity A with input message M and output message
M’. Input
message M has three message parts called µ1, µ2, and µ3. Output message M’
has two
message parts, called µ4 and µ5. Message part µ3 is defined through an XML
schema the
root of which is a <sequence> that contains some other complex type, a decimal
simple
type and a simple type that may hold multiple string fields.
3.1.1.2 Control Links
Activities are wired together through control links. A control link is a directed
edge that
prescribes the order in which activities will have to be performed (that is, the
potential
“control flow” between the activities of the business process). The endpoints of
the set of all
control links that leave a given activity A represent the possible follow‐on
activities A1, …, An
of activity A.
3.1.1.3 Transition Conditions
Which of the activities A1,…, An actually have to be performed in the concrete
instance of the
business process (that is, the concrete business context or business situation) is
determined
by so‐called transition conditions. A transition condition is a Boolean expression
that is
associated with a control link. The formal parameters of this expression can refer
to
messages that have been produced by some of the activities that preceded the
source of
the control link in the flow.
When an activity A completes, exactly those control links originating at A whose
transition conditions evaluate to true are followed to their endpoints. This set of
activities is referred to as the “actual follow‐on activities” of A, in contrast to the
full set {A1,…, An} of “possible follow‐on activities.” It is said that “control flows
from A to the actual successors of A,” or that the “control flow visits the actual
successors of A,” or that “navigation proceeds from A to its actual successors,” or
something similar.
In the following figure, activity B might need to be performed after activity A
completes. The
transition condition of the corresponding control link is specified as an XPath
expression that
references the output message of A: Activity B will be performed (“control flows
to B” or
“navigation proceeds to B”) if, and only if, the integer value returned by A is
greater than 42.
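The selection of actual follow‐on activities can be sketched in Python as follows. WSFL expresses transition conditions as XPath expressions over messages; in this sketch plain Python predicates over a dictionary stand in for them, and the function name, the link C and the link D are assumptions added for illustration.

```python
def actual_follow_on(control_links, source, output):
    """Return the targets of control links leaving `source` whose condition holds.

    control_links: list of (source, target, condition), where condition is a
    predicate over the output message of the source activity.
    """
    return [target for src, target, condition in control_links
            if src == source and condition(output)]


# Hypothetical flow: B is visited only if A returns a value greater than 42.
LINKS = [
    ("A", "B", lambda out: out["value"] > 42),
    ("A", "C", lambda out: out["value"] <= 42),
    ("A", "D", lambda out: True),
]
```

With an output value of 50, control flows from A to B and D; with a value of 7, it flows to C and D instead.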
3.1.1.4 The Origin of Flow Dynamics
Note especially that this mechanism is the origin of the whole dynamics within
the control
flow of business processes: Activities produce actual data values for their output
messages,
and these values will be substituted as actual parameters of the formal
parameters of
transition conditions. Exactly those control links will be followed whose transition
conditions
evaluate to true in their actual parameters. And exactly the endpoints of those
control links
are the activities that have to be performed next “in the current business
context.” Thus,
whenever an activity completes, that is, the operation of the Web Service that
implements the
activity returns data, this actual data can be made the basis for deciding which
activities
have to be performed next. And these activities are typically highly dependent on
the data
returned.
3.1.1.5 Control Links As Edges
Control links are the first kind of edges in the graph structure that we use to
represent
models of business processes, or simply, flows. First of all, such an edge is
directed,
pointing from its source activity to its target activity, that is, from an activity to its
(or one of
its) potential successor activities. Next, such an edge is “weighted” by a transition
condition,
determining the actual flow of control. We do allow at most one control link
between two
different activities. Finally, the resulting directed graph must be acyclic, that is, we
do not
allow loops within the control structure of a flow (however, see Section 3.1.1.11
“Loops,” on
how loops are supported in a controlled manner).
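A tool that accepts flow models could verify these two structural rules before execution. The following Python sketch checks acyclicity with Kahn's topological sorting algorithm and checks for duplicate control links; the function names and the representation of links as (source, target) pairs are assumptions of this example.

```python
from collections import deque


def is_acyclic(activities, control_links):
    """Kahn's algorithm: the graph is acyclic if and only if every node can be
    removed in topological order."""
    indegree = {a: 0 for a in activities}
    successors = {a: [] for a in activities}
    for source, target in control_links:
        successors[source].append(target)
        indegree[target] += 1
    ready = deque(a for a in activities if indegree[a] == 0)
    visited = 0
    while ready:
        node = ready.popleft()
        visited += 1
        for nxt in successors[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    return visited == len(activities)


def has_duplicate_links(control_links):
    """At most one control link is allowed between two different activities."""
    return len(set(control_links)) != len(control_links)
```

A graph in which D links back to B, as in the cyclic example discussed under "Loops", would be rejected by is_acyclic.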
Note that tools supporting the graphical construction of WSFL‐compliant flow
models can
choose to support drawing loops. But the loops supported by the tool must be
able to be
transformed into the restricted variant of loops supported by WSFL. This
restricted variant
basically corresponds to “do until” loops.
3.1.1.6 Forks And Parallelism
An activity (like activity A in the following figure) is called a fork activity if it has
more than one
outgoing control link. When activity A completes, all control links leaving A will be
determined and all associated transition conditions (pAB and pAC in the figure)
will be
evaluated in their actual parameters. The target activities of all control links
whose transition
conditions evaluated to true are exactly the activities that are to be performed
next within the
flow. For example, if pAB evaluated to true but pAC evaluated to false, exactly
activity B will be
scheduled to be performed; if pAB evaluated to false and pAC evaluated to true,
exactly C is to
be performed next.
In case both pAB and pAC get the truth‐value of true assigned based on the actual
parameters, both activities B and C will have to be performed next. (We will explain
later what happens along paths that are determined by a control link whose transition
condition evaluated to false. See 3.1.2.1 “Dead‐Path Elimination”.) In particular, it
is very easy to achieve parallelism in the execution of flows: Simply introduce a fork
activity and the “subgraphs” that are spawned off by the control links with a true
transition condition will be performed in parallel.
3.1.1.7 Joins and Synchronization
Typically, parallel work has to be synchronized at a later time. Synchronization is
done through join activities. An activity is called a join activity (like activity F in the
figure above) if it has more than one incoming control link. By default, the
decision whether a join activity is to be performed or not is deferred until all
parallel work that can finally reach the join activity has actually reached it (see
3.1.1.8 “Join Conditions” for potential deviations from this default behavior). In
the figure above, when pAB and pAC have both evaluated to true, B and D can be
performed in parallel with C, and F cannot be performed until control has passed from
C to F and from D to F. At that time, the truth‐values of the transition conditions
pDF and pCF are known; based on these truth‐values it can be specified whether F
should be performed if, and only if, both parallel executions successfully reached F
(“pDF AND pCF”), or whether it suffices that at least one of the parallel executions
reached F successfully (“pDF OR pCF”), and so on.
3.1.1.8 Join Conditions
Thus, the truth‐values of transition conditions of control links that enter a join
activity allow for a more fine‐grained mechanism of synchronizing parallel work at
join activities. This mechanism is introduced through join conditions: A join
condition is a Boolean expression associated with a join activity, and the formal
parameters of this expression refer to the transition conditions of the incoming
control links of the join activity.
Work along parallel paths reaches a join activity at different points in time. For
example, activity C in the preceding figure might have completed quickly, so that the
transition condition pCF is evaluated while B is still running; that is, the transition
condition pDF gets evaluated at a later point in time. By default, the decision
whether F is to be performed or not is deferred until pDF has also been evaluated,
even if the join condition is “pDF OR pCF,” for example, and is known to be true
long before the truth‐value of pDF is known.
Thus, join conditions are really a means to synchronize parallel work, that is, to
wait until parallel work comes to an end before deciding how to proceed. Sometimes,
a weaker semantics of synchronization is appropriate and
supported by the metamodel of WSFL: As soon as the truth‐value of a join
condition is known, the associated join activity is dealt with accordingly (that is,
either performed or skipped). Control flow that reaches the corresponding join
activity at a later time is simply ignored.
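The difference between the default (deferred) and the weaker (eager) synchronization semantics can be sketched in Python. Here None represents a transition condition that has not been evaluated yet; the helper names tri_or, tri_and and join_decision, and the eager flag, are assumptions of this sketch rather than WSFL constructs.

```python
def tri_or(a, b):
    """OR over True/False/None, where None means 'not yet evaluated'."""
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False


def tri_and(a, b):
    """AND over True/False/None, where None means 'not yet evaluated'."""
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True


def join_decision(join_condition, incoming, eager=False):
    """Return True (perform), False (skip), or None (keep waiting).

    incoming: truth-values of the incoming transition conditions, with None
    for links that have not been evaluated. With eager=False the decision is
    deferred until every incoming link is known.
    """
    if not eager and any(v is None for v in incoming.values()):
        return None
    return join_condition(incoming)
```

With the join condition “pDF OR pCF” and only pCF known to be true, the default semantics keeps waiting, while the eager variant already decides that F is to be performed.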
3.1.1.9 Start and End Activities
But what about activities that have no incoming control link (like A, B, and
X in the following figure), or no outgoing control link at all (like H, I, J, and X)?
These kinds of activities are called start activities or end activities, respectively. In
the following figure, activities A, B, and X are start activities, and activities H, I, J,
and X are end activities.
Conceptually, each activity has a join condition associated: A node with a single
incoming control link can be perceived as having a join condition that consists of
the transition condition of the incoming control link. A start activity can be
perceived as having a trivial join condition that consists of the constant “true”
predicate. With this convention in mind, an activity can be started whenever its
join condition is fulfilled. In particular, the join condition of an activity with no
incoming control link is fulfilled when the flow model is “started,” thus, the
corresponding activities are “start activities” also from that perspective.
When a flow model is instantiated, all of its start activities are determined and
scheduled to be performed. Based on the start activities of a flow, the “regular”
navigation through the graph representing the flow model continues. That means,
when a start activity completes, its actual successors are determined based on the
control links originating at the completed start activity.
When an end activity completes, navigation stops at this point because there is no
possible follow‐on activity and thus, no actual successor to determine. But
navigation might continue in other parts of the graph, thus, a lot of activities of
the overall flow might still be awaiting their execution. But if all end activities
within the graph have been reached, the overall flow is done. When the last end
activity completes, the output of the overall flow is determined and returned to
its invoker; and then, the flow ceases to exist.
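Determining the start and end activities of a flow model is a simple graph computation, sketched below in Python. The function name and the small example graph used to exercise it are assumptions; they do not reproduce the figure referred to in the text.

```python
def start_and_end_activities(activities, control_links):
    """Start activities have no incoming control link; end activities have
    no outgoing one. An isolated activity is both."""
    sources = {s for s, _ in control_links}
    targets = {t for _, t in control_links}
    starts = [a for a in activities if a not in targets]
    ends = [a for a in activities if a not in sources]
    return starts, ends
```

For activities A, B, C, H, X with control links A to C, B to C, and C to H, the start activities are A, B, and X, and the end activities are H and X. The isolated activity X appears in both lists, as in the text.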
3.1.1.10 Exit Conditions
The following figure summarizes the flow‐relevant fine structure of an activity
introduced so far. An activity is linked to the operation of a port type as its
implementation, and if the activity is a join activity, it has an associated join
condition. What is also shown is the exit condition associated with an activity: An
exit condition is a Boolean expression, the purpose of which is to determine
whether or not the execution of the implementation of the activity completed the
business task represented by the activity. The expression can refer to the output
message of its associated activity or even to output of any activity that ran before
on the control path of the subject activity; the expression of an exit condition is
provided in XPath syntax, just as the expression of a transition condition is. The exit
condition is evaluated once the operation of the implementing port type terminates.
If the exit condition evaluates to true, the activity is treated as “completed.” If the
activity is completed, navigation continues and the next activities to be performed
are determined based on the just‐completed activity; otherwise, the activity is
executed again.
For example, the exit condition can check particular reason codes or return codes
of the activity implementation. In doing so, the activity can be retried if a code
indicates an implementation problem (for example, “automatic rollback due to
detected deadlock”). Or the application already aggregates lower‐level reason
codes and provides a return code that basically says whether the implementation
executed correctly or not. Or the exit condition checks a field that is implicitly set
by a user (“The customer did not answer the phone call; I’ll try at a later time”).
As all of these examples show, the exit condition makes it possible to distinguish two
events: the event that signals that the activity implementation returned, and the
event that signals that the associated piece of work (the business task) completed
successfully. Navigation typically should continue only if the business task
completed, and not if the implementation has been interrupted for whatever
reason.
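The retry behavior driven by an exit condition can be sketched in Python as a do‐until loop. The function name run_activity, the dictionary‐shaped output, and the max_attempts bound are assumptions of this sketch; the text above describes no attempt limit.

```python
def run_activity(implementation, exit_condition, max_attempts=10):
    """Invoke the implementation until the exit condition holds on its output.

    Returns the final output and the number of attempts that were needed.
    max_attempts is a safety bound added for this sketch only.
    """
    for attempt in range(1, max_attempts + 1):
        output = implementation()
        if exit_condition(output):
            return output, attempt
    raise RuntimeError("exit condition not met within max_attempts")
```

An exit condition such as "the return code equals OK" would cause an implementation that twice reports a deadlock rollback to be invoked a third time before navigation continues.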
3.1.1.11 Loops
But there is another important use of exit conditions, namely for looping: An
activity is iterated until its exit condition is met. Often, this mechanism for
realizing do‐until loops is used when an activity is implemented by another flow,
that is, by means of the call lifecycle operation (see Section 3.3 “Business Process
Lifecycle” and 4.6.5.2 “Lifecycle Operation call”). Because the metamodel does
not support cyclic graphs, cycles must be realized by separate flows that are
iterated based on exit conditions. This enforces a block‐oriented specification of
loops well known from structured programming.
Supporting arbitrary loops would allow specifying situations that are ambiguous,
difficult to model unambiguously, and much more difficult to comprehend. The
following figure shows a cyclic graph. Assume that control flows from A to B to C,
and D and E are actually executed. We further assume, that when D completes,
navigation can proceed to B again. When B completes the second time, control
flows to C, and may continue to E and D again. Many problems and questions
come up, for example:
• B is a join node. When control flows from A to B (the first time), the truth‐value
of the transition condition of the control link from D to B is unknown. The join
condition of B must therefore be an expression in ternary logic to specify the
appropriate behavior.
• When C completes the second time, should control really flow to E again? Or does
the intended loop just consist of B, C, and D?
• If the control flow should proceed to E, it might happen that E is still running
because of its first invocation. What should happen in this situation? Should E be
immediately interrupted and started again, or should the completion of E be awaited
before its next invocation?
• When D completes and control flows back to B, and could also flow to F, should F
really be started? Or should only the “backward control link” be honored? If F
should be started, the same questions arise as for E before.
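To illustrate what a join condition in ternary logic would involve, the following sketch uses Kleene's three‐valued logic, with Python's None standing for "unknown". The choice of Kleene logic and the function names are assumptions made for illustration; the text above does not say which ternary logic would be used.

```python
from typing import Optional

Ternary = Optional[bool]  # True, False, or None for "unknown"


def ternary_and(a: Ternary, b: Ternary) -> Ternary:
    """Kleene AND: False dominates, otherwise unknown propagates."""
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True


def ternary_or(a: Ternary, b: Ternary) -> Ternary:
    """Kleene OR: True dominates, otherwise unknown propagates."""
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False


# First arrival at join node B: the link from A is true, the link from D is unknown.
link_a_to_b: Ternary = True
link_d_to_b: Ternary = None

or_join = ternary_or(link_a_to_b, link_d_to_b)
and_join = ternary_and(link_a_to_b, link_d_to_b)
```

With an OR join, B could start on the first arrival, while an AND join stays undecided. The modeler would have to reason about such cases for every join node inside a cycle.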
Data Links
There is a second kind of directed edge in the graphs of the metamodel, the
so‐called data link. A data link specifies that its source activity passes data to the
flow engine, which in turn has to pass (some of) this data to the target activity of
the data link. For example, the next figure depicts that activity A expects input
data from activity B, which is indicated by a dashed directed edge (while we use
solid edges to draw control links). To make this meaningful, a data link can be
specified only if the target of the data link is reachable from the source of the
data link through a path of (directed) control links. Thus, data always flows along
control links. This is a simple way to rule out a number of error‐prone situations,
ranging from attempts to consume data that has not been produced yet to deadlock
situations in which one activity requires data from another activity as input but
the latter activity needs the output of the former as its own input.
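The reachability rule for data links can be checked with an ordinary graph search. The sketch below is illustrative only: the representation of control links as (source, target) pairs and the function names are assumptions, and the intermediate activity X is hypothetical.

```python
from collections import deque
from typing import Dict, Iterable, List, Tuple

Link = Tuple[str, str]  # (source activity, target activity)


def reachable(control_links: Iterable[Link], source: str, target: str) -> bool:
    """True if target can be reached from source via a path of control links."""
    successors: Dict[str, List[str]] = {}
    for src, dst in control_links:
        successors.setdefault(src, []).append(dst)
    seen = set()
    queue = deque(successors.get(source, []))
    while queue:
        node = queue.popleft()
        if node == target:
            return True
        if node not in seen:
            seen.add(node)
            queue.extend(successors.get(node, []))
    return False


def validate_data_link(control_links: Iterable[Link], data_link: Link) -> None:
    """Reject a data link whose target is not reachable from its source."""
    source, target = data_link
    if not reachable(list(control_links), source, target):
        raise ValueError(f"data link {source} -> {target} has no control path")


# Activity A expects input from activity B; X is a hypothetical activity in between.
control_links = [("B", "X"), ("X", "A")]
validate_data_link(control_links, ("B", "A"))  # accepted: B reaches A via X
```

A data link from A back to B would be rejected here, which is how the rule excludes the mutual‐dependency deadlock described above.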
It is not required that data be always passed to an immediate successor of its
producer. Many different activities might be visited along the path made from
control links from the source of a data link to the target of the data link. An
activity might be the target of multiple data links. For example, this allows
aggregating input from multiple sources, or it allows specifying alternative input
from activities from alternative parallel paths. To facilitate this, data links are
weighted by so‐called map specifications. A map prescribes how a field in a
message part of the target’s input message of a data link is constructed from a
field in the output message’s message part of the source of the data link. Multiple
maps may even be defined for the same target message part. This is
needed, for example, when alternative paths in the control flow are specified and data
needed further on can be produced along each of the paths.
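How maps assemble a target's input from several data links can be sketched as follows. The tuple layout of a map, the activity and field names, and the rule that a later map overrides an earlier one for the same target field are all assumptions made for this example; the text does not specify a conflict rule.

```python
from typing import Any, Dict, List, Tuple

Message = Dict[str, Any]
# Hypothetical map specification: (source activity, source field, target field)
FieldMap = Tuple[str, str, str]


def build_input(maps: List[FieldMap], outputs: Dict[str, Message]) -> Message:
    """Construct the target's input from the outputs of activities that actually ran."""
    target_input: Message = {}
    for source, source_field, target_field in maps:
        message = outputs.get(source)  # None if the source lay on a path not taken
        if message is not None and source_field in message:
            # Assumed rule: a later map overrides an earlier one for the same field.
            target_input[target_field] = message[source_field]
    return target_input


maps = [
    ("CheckCredit", "limit", "amount"),               # alternative path 1
    ("ManualApproval", "approved_amount", "amount"),  # alternative path 2
    ("GetCustomer", "id", "customer_id"),
]

# Only the ManualApproval path was taken in this run.
outputs = {"ManualApproval": {"approved_amount": 500}, "GetCustomer": {"id": 42}}
order_input = build_input(maps, outputs)
```

Two maps write to the same target field "amount", one per alternative path, so the target receives its input whichever path was actually navigated.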