
XML

What is XML?

XML stands for Extensible Markup Language

XML is a markup language much like HTML

XML was designed to carry data, not to display data

XML tags are not predefined. You must define your own tags

XML is designed to be self-descriptive

XML is a W3C Recommendation

XML Tutorial, 2
Difference Between XML and HTML

XML is not a replacement for HTML.


XML and HTML were designed with different goals:

XML was designed to transport and store data, with a focus on what the data is.
HTML was designed to display data, with a focus on how the data looks.

HTML is about displaying information, while XML is about carrying information.

HTML vs. XML

HTML tags: presentation, generic document structure

<h1> Bibliography </h1>
<p> <i> Foundations of DBs</i>, Abiteboul, Hull, Vianu
<br> Addison-Wesley, 1995
<p> <i> Logics for DBs and ISs </i>, Chomicki, Saake, eds.
<br> Kluwer, 1998

XML tags: content, "semantic", (DTD-) specific

<bibliography>
<book> <title> Foundations of DBs </title>
<author> Abiteboul </author>
<author> Hull </author>
<author> Vianu </author>
<publisher> Addison-Wesley </publisher>
...
</book>
<book> ... <editor> Chomicki </editor>... </book> ...
</bibliography>

XML Does not DO Anything

XML was created to structure, store, and transport information.

The following example is a note to Tove from Jani, stored as XML:

<?xml version="1.0" encoding="ISO-8859-1"?>


<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
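
The note above carries data but does nothing by itself; some program has to read it. As a minimal sketch of that step, the following Python fragment uses the standard library's xml.etree.ElementTree to pull the fields out of the same note. The names NOTE_XML and note_fields are illustrative assumptions, not part of the original slides.

```python
import xml.etree.ElementTree as ET

# The note from the slide, as bytes so the declared encoding is honoured.
NOTE_XML = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>"""


def note_fields(xml_bytes):
    """Return the child elements of the root as a {tag: text} dictionary."""
    root = ET.fromstring(xml_bytes)
    return {child.tag: child.text for child in root}


fields = note_fields(NOTE_XML)
print(fields["from"], "->", fields["to"], ":", fields["body"])
```

The XML only describes the data; deciding what to do with the four fields is left to the application.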

XML is

XML is Just Plain Text

With XML You Invent Your Own Tags

XML is Not a Replacement for HTML

XML is a software- and hardware-independent tool for carrying information

XML is a W3C Recommendation

XML is Everywhere

How Can XML be used?

XML Separates Data from HTML

XML Simplifies Data Sharing

XML Simplifies Data Transport

XML Simplifies Platform Changes

XML Makes Your Data More Available

XML is Used to Create New Internet Languages

XML Tree

XML documents form a tree structure that starts at "the root" and branches to "the leaves".
An Example XML Document
XML documents use a self-describing and simple syntax:

<?xml version="1.0" encoding="ISO-8859-1"?>
<note>                                          <!-- root element -->
<to>Tove</to>                                   <!-- child elements of <note> -->
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>

XML Documents Form a Tree Structure
XML documents must contain a root element. This
element is "the parent" of all other elements.

The elements in an XML document form a document tree. The tree starts at the root and branches to the lowest level of the tree.

All elements can have sub elements (child elements):

<root>
<child>
<subchild>.....</subchild>
</child>
</root>
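
To make the parent/child nesting concrete, here is a small sketch that walks the tree above and records each element with its depth. It assumes Python's standard xml.etree.ElementTree; the helper name outline is a hypothetical choice for this illustration.

```python
import xml.etree.ElementTree as ET


def outline(element, depth=0):
    """Return (depth, tag) pairs for an element and all of its descendants."""
    rows = [(depth, element.tag)]
    for child in element:  # iterating an element yields its direct children
        rows.extend(outline(child, depth + 1))
    return rows


root = ET.fromstring("<root><child><subchild>.....</subchild></child></root>")
for depth, tag in outline(root):
    print("  " * depth + tag)
```

The recursion mirrors the document tree: every element is visited once, after its parent and before its own children.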

The Tree

<?xml version="1.0"?>
<book>                                          <!-- root element -->
<author>                                        <!-- parent of <lastname> -->
<lastname>Tennant</lastname>
<firstname>Roy</firstname>                      <!-- child of <author> -->
</author>
<title>The Great American Novel</title>
<chapter number="1">
<chaptitle>It Was Dark and Stormy</chaptitle>
<p>It was a dark and stormy night.</p>          <!-- the two <p> elements are siblings -->
<p>An owl hooted.</p>
</chapter>
</book>
The image above represents one book in the XML below:

<bookstore>
<book category="COOKING">
<title lang="en">Everyday Italian</title>
<author>Giada De Laurentiis</author>
<year>2005</year> <price>30.00</price>
</book>

<book category="CHILDREN">
<title lang="en">Harry Potter</title>
<author>J K. Rowling</author>
<year>2005</year> <price>29.99</price>
</book>

<book category="WEB">
<title lang="en">Learning XML</title>
<author>Erik T. Ray</author>
<year>2003</year> <price>39.95</price>
</book>
</bookstore>
XML Syntax Rules

All XML Elements Must Have a Closing Tag

XML Tags are Case Sensitive

XML Elements Must be Properly Nested

XML Documents Must Have a Root Element

XML Attribute Values Must be Quoted
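
Each of the five rules can be checked mechanically, because a conforming parser must reject a document that breaks any of them. The sketch below assumes Python's standard xml.etree.ElementTree as the parser; is_well_formed is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


checks = {
    "correct": "<note><to>Tove</to></note>",
    "missing closing tag": "<note><to>Tove</note>",
    "case mismatch": "<Note>Tove</note>",
    "improper nesting": "<b><i>text</b></i>",
    "two root elements": "<to>Tove</to><from>Jani</from>",
    "unquoted attribute": "<note date=2008></note>",
}
for label, text in checks.items():
    print(label, "->", is_well_formed(text))
```

Only the first entry passes; each of the others violates exactly one rule from the list.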

XML Elements

An XML document contains XML Elements.


An XML element is everything from (including) the
element's start tag to (including) the element's end
tag.
An element can contain other elements, simple text
or a mixture of both.
Elements can also have attributes.

XML Element Declarations

<!-- sequence of 0 or more papers -->
<!ELEMENT bibliography (paper*)>
<!-- authors, followed by optional fullPaper, followed by title, followed by booktitle -->
<!ELEMENT paper (authors, fullPaper?, title, booktitle)>
<!-- sequence of 1 or more authors -->
<!ELEMENT authors (author+)>
<!-- character content -->
<!ELEMENT author (#PCDATA)>
<!ATTLIST author age CDATA #IMPLIED>
<!ELEMENT fullPaper EMPTY>
<!ELEMENT title (#PCDATA)>
<!ELEMENT booktitle (#PCDATA)>
<!ATTLIST fullPaper source ENTITY #REQUIRED>
<!ATTLIST paper eid ID #IMPLIED>
Ex:
<bookstore>
<book category="CHILDREN">
<title lang="en">Harry Potter</title>
<author>J K. Rowling</author>
<year>2005</year> <price>29.99</price>
</book>

<book category="WEB">
<title lang="en">Learning XML</title>
<author>Erik T. Ray</author>
<year>2003</year> <price>39.95</price>
</book>
</bookstore>

In the example above, <bookstore> and <book> have element content, because they contain other elements.
<author> has text content because it contains text.
In the example above, <book> has an attribute (category="CHILDREN" or category="WEB") and <title> has an attribute (lang="en").
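
The distinction between element content, text content, and attributes can be read straight off a parsed tree. The following is a minimal sketch using Python's standard xml.etree.ElementTree on one book from the example; describe and the label strings are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

BOOKSTORE = """<bookstore>
<book category="CHILDREN">
<title lang="en">Harry Potter</title>
<author>J K. Rowling</author>
<year>2005</year> <price>29.99</price>
</book>
</bookstore>"""


def describe(element):
    """Classify an element's content and return it with its attributes."""
    has_children = len(element) > 0
    # Text may sit before the first child (.text) or after any child (.tail).
    texts = [element.text] + [child.tail for child in element]
    has_text = any(t and t.strip() for t in texts)
    if has_children and has_text:
        kind = "mixed content"
    elif has_children:
        kind = "element content"
    elif has_text:
        kind = "text content"
    else:
        kind = "empty"
    return kind, dict(element.attrib)


root = ET.fromstring(BOOKSTORE)
summary = {element.tag: describe(element) for element in root.iter()}
for tag, (kind, attributes) in summary.items():
    print(tag, kind, attributes)
```

Whitespace between child elements is ignored here, which is why <book> counts as element content rather than mixed content.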

Elements and their Content

<bibliography>
<paper ID="object-fusion">                        <!-- "paper" is the element type; <paper>...</paper> is an element -->
<authors>                                         <!-- element content: child elements -->
<author>Y.Papakonstantinou</author>
<author>S. Abiteboul</author>
<author>H. Garcia-Molina</author>
</authors>
<fullPaper source="fusion"/>                      <!-- empty element -->
<title>Object Fusion in Mediator Systems</title>  <!-- character content -->
<booktitle>VLDB 96</booktitle>
</paper>

</bibliography>
XML Attributes

From HTML you will remember this: <img src="computer.gif">.
The "src" attribute provides additional information about the <img> element.
In HTML (and in XML) attributes provide additional information about elements.
XML attribute values must be quoted.

XML Attribute Declarations

<!ELEMENT bibliography (paper*)>
<!ELEMENT paper (authors, fullPaper?, title, booktitle)>
<!ELEMENT authors (author+)>
<!ELEMENT author (#PCDATA)>
<!ELEMENT fullPaper EMPTY>
<!ELEMENT title (#PCDATA)>
<!ELEMENT booktitle (#PCDATA)>
<!ATTLIST fullPaper source ENTITY #REQUIRED>
<!-- target (ID) and source (IDREF) declarations for intradocument pointers -->
<!ATTLIST person pid ID #IMPLIED>
<!ATTLIST author authorRef IDREF #IMPLIED>

Element Attributes

<bibliography>

<paper pid="object-fusion">                       <!-- attribute name: pid; attribute value: "object-fusion" -->
<authors>
<author>Y.Papakonstantinou</author>
<author>S. Abiteboul</author>
<author>H. Garcia-Molina</author>
</authors>
<fullPaper source="fusion"/>
<title>Object Fusion in Mediator Systems</title>
<booktitle>VLDB 96</booktitle>
</paper>

</bibliography>
XML Attribute Use
<person pid="j23"> </person>                      <!-- ID attribute -->

<bibliography>
<paper pubid="wsa" role="publication">            <!-- CDATA (character data) attribute -->
<authors>
<author authorRef="j23">J. L. R. Colina </author> <!-- IDREF attribute: intradocument reference -->
</authors>
<fullPaper source="http://...confusion"/>         <!-- reference to external ENTITY -->
<title>Object Confusion in a Deviator System </title>
<related papers="deviation101 x_deviators"/>
</paper>
</bibliography>
Attribute Types (DTD)
Type              Meaning
ID                Token unique within the document
IDREF             Reference to an ID token
IDREFS            Reference to multiple ID tokens
ENTITY            External entity (image, video, ...)
ENTITIES          External entities
CDATA             Character data
NMTOKEN           Name token
NMTOKENS          Name tokens
NOTATION          Data other than XML
Enumeration       Choices
Conditional Sec   INCLUDE & IGNORE declarations

Attributes may be: REQUIRED, IMPLIED (optional)
Attributes can have: default values, which may be FIXED
The default-value can be one of the following:

Value            Explanation
value            The default value of the attribute
#REQUIRED        The attribute is required
                 Ex: <!ATTLIST element-name attribute-name attribute-type #REQUIRED>
#IMPLIED         The attribute is not required
                 Ex: <!ATTLIST element-name attribute-name attribute-type #IMPLIED>
#FIXED value     The attribute value is fixed
                 Ex: <!ATTLIST element-name attribute-name attribute-type #FIXED "value">

DTD - Entities

Entities are variables used to define shortcuts to standard text or special characters.
Entity references are references to entities.
Entities can be declared internal or external.

Syntax (internal):
<!ENTITY entity-name "entity-value">
DTD Example:
<!ENTITY writer "Donald Duck.">
<!ENTITY copyright "Copyright W3Schools.">

XML example:
<author> &writer;&copyright;</author>
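
What the parser hands to the application is the text with the entity references already replaced. The sketch below shows this with Python's standard xml.etree.ElementTree, which expands general entities declared in the internal DTD subset; wrapping the two declarations in a DOCTYPE for the author element is an assumption made so the fragment is a complete document.

```python
import xml.etree.ElementTree as ET

# Internal entity declarations placed in the internal DTD subset.
DOC = """<?xml version="1.0"?>
<!DOCTYPE author [
<!ENTITY writer "Donald Duck.">
<!ENTITY copyright "Copyright W3Schools.">
]>
<author>&writer;&copyright;</author>"""

author = ET.fromstring(DOC)
expanded = author.text
print(expanded)
```

The element's text no longer contains any reference; both entities have been replaced by their declared values.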

External Text Entities
DTD: external text entity declaration (the quoted string is a URL)

<!ENTITY chap1 SYSTEM "http://...chap1.xml">

XML: entity reference

<mylife> &chap1; &chap2;</mylife>

Logically equivalent to inlining the file contents:

<mylife> <teen>yada yada</teen>
<adult> blah blah</adult>
</mylife>
Types of Entities

Internal (to a doc) vs. External (use URI)
General (in XML doc) vs. Parameter (in DTD)
Parsed (XML) vs. Unparsed (non-XML)

XML Validation

XML with correct syntax is "Well Formed" XML.


XML validated against a DTD is "Valid" XML.

Processing XML
Non-validating parser:
checks that XML doc is syntactically well-formed

Validating parser:
checks that XML doc is also valid w.r.t. a given DTD or Schema

Well Formed XML Documents
Follows general tagging rules:
All tags begin and end
But can be minimized if empty: <br/> instead of <br></br>
All tags are case sensitive
All tags must be properly nested:
<author> <firstname>Mark</firstname>
<lastname>Twain</lastname> </author>
All attribute values are quoted:
<subject scheme="LCSH">Music</subject>
Has identification & declaration tags
Software can make sure a document follows these rules

<?xml version="1.0" encoding="ISO-8859-1"?>


<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
Valid XML
Uses only specific tags and rules as codified by one of:
A document type definition (DTD)
A schema definition
Only the tags listed by the schema or DTD can be used
Software can take a DTD or schema and verify that a document adheres to the rules
Editing software can prevent an author from using anything except allowed tags

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE note SYSTEM "Note.dtd">
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>

The DOCTYPE declaration in the example above is a reference to an external DTD file.
The content of the file is shown in the paragraph below.

XML DTD
The purpose of a DTD is to define the structure of an XML document.
It defines the structure with a list of legal elements:

<!DOCTYPE note [
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>]>
XML Structure

Each XML document has both a logical and a physical structure.

Physically, the document is composed of units called entities.

An entity may refer to other entities to cause their inclusion in the document. A document begins in a "root" or document entity.

Logically, the document is composed of declarations, elements, comments, character references, and processing instructions, all of which are indicated in the document by explicit markup.

A software module called an XML processor is used to read XML documents and provide access to their content and structure.

It is assumed that an XML processor is doing its work on behalf of another module, called the application.

Tags

If XML markup is a structural skeleton for a document, then tags are the bones.

They mark the boundaries of elements.

They also allow insertion of comments and special instructions, and declare settings for the parsing environment.

Types of tags in XML
Object | Purpose | Example
empty element | Represent information at a specific point in the document. | <xref linkend="abc"/>
container element | Group together elements and character data. | <p>This is a paragraph.</p>
declaration | Add a new parameter, entity, or grammar definition to the parsing environment. | <!ENTITY author "Erik Ray">
processing instruction | Feed a special instruction to a particular type of software. | <?print-formatter force-linebreak?>
comment | Insert an annotation that will be ignored by the XML processor. | <!-- here's where I left off -->
CDATA section | Create a section of character data that should not be parsed, preserving any special characters inside it. | <![CDATA[Ampersands galore! &&&&&&]]>
entity reference | Command the parser to insert some text stored elsewhere. | &company-name;

Documents
An XML document is a special construct designed to archive data in a way that is most convenient for parsers.
It has nothing to do with our traditional concept of documents.

An XML document has two parts. The first is the document prolog, a special section containing metadata.
The second is an element called the document element, also called the root element. The root element contains all the other elements and content in the document.
The prolog is optional. If you leave it out, the parser will fall back on its default settings.
For example, it automatically selects the character encoding UTF-8 (or UTF-16, if detected) unless something else is specified.
The Document Prolog
Being a flexible markup language toolkit, XML lets you use different character encodings, define your own grammars, and store parts of the document in many places.

An XML parser needs to know about these particulars before it can start its work.

The prolog has two parts (both optional):
XML declaration: sets parameters for basic XML parsing.
Document type declaration (DOCTYPE): is for more advanced settings.

<?xml version="1.0" standalone="no"?>           The XML declaration
<!DOCTYPE                                       Beginning of the DOCTYPE declaration
reminder                                        Root element name
SYSTEM "/home/eray/reminder.dtd"                DTD identifier
[                                               Internal subset start delimiter
<!ENTITY smile "<graphic file='smile.eps'/>">   Entity declaration
]>                                              Internal subset end delimiter
<reminder>                                      Start of document element
&smile;                                         Reference to the entity declared above
<msg>Smile! It can always get worse.</msg>
</reminder>                                     End of document element
The XML Declaration
The XML declaration is a small collection of details that prepare an XML processor for working with a document.

It is optional, but when used it must be the very first thing in the document, and it must include the version.
version
Declares the version of XML used. Version 1.0 is by far the most widely used; version 1.1 has also been a W3C Recommendation since 2004 but is rarely needed.
encoding
Defines the character encoding used in the document.
If undefined, the default encoding UTF-8 (or UTF-16, if the document begins with the U+FEFF byte order mark) will be used, which works fine for most documents used in English-speaking countries.
standalone
Informs the parser whether there are any declarations outside of the document.
The default value is "no"; setting it to "yes" tells the processor there are no external declarations required for parsing the document.

<?xml version="1.0"?>
<?xml version='1.0' encoding='US-ASCII' standalone='yes'?>
<?xml version = '1.0' encoding= 'iso-8859-1' standalone ="no"?>
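
The effect of the encoding parameter is easiest to see with a non-ASCII character. This sketch, assuming Python's standard xml.etree.ElementTree, parses the same ISO-8859-1 bytes once with and once without a declaration; the city name is an arbitrary illustrative value.

```python
import xml.etree.ElementTree as ET

# One document declares its encoding, the other relies on the UTF-8 default.
declared = ("<?xml version='1.0' encoding='iso-8859-1' standalone='yes'?>"
            "<city>Malm\u00f6</city>").encode("iso-8859-1")
undeclared = "<city>Malm\u00f6</city>".encode("iso-8859-1")

city = ET.fromstring(declared)
print(city.text)

try:
    ET.fromstring(undeclared)
    undeclared_accepted = True
except ET.ParseError:
    # Without a declaration the bytes are read as UTF-8, where they are invalid.
    undeclared_accepted = False
print("undeclared accepted:", undeclared_accepted)
```

The declaration is what lets the parser decode the byte for the letter o with diaeresis; without it the parser falls back to UTF-8 and rejects the document.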
DTD - Document Type Definition
A Document Type Definition (DTD) defines the legal building blocks of an XML document.

It defines the document structure with a list of legal elements and attributes.

A DTD can be declared inline inside an XML document, or as an external reference:
Internal DTD Declaration
External DTD Declaration

Document Type Definitions (DTDs)
Define and Constrain
Element Names & Structure

Element type declarations:
<!ELEMENT bibliography (paper*)>
<!ELEMENT paper (authors, fullPaper?, title, booktitle)>
<!ELEMENT authors (author+)>
<!ELEMENT author (#PCDATA)>
<!ELEMENT fullPaper EMPTY>
<!ELEMENT title (#PCDATA)>
<!ELEMENT booktitle (#PCDATA)>

Attribute list declarations:
<!ATTLIST fullPaper source ENTITY #REQUIRED>
<!ATTLIST paper ID ID #IMPLIED>

DTD identifiers
System and public identifiers

System identifier: system-specific.
The keyword SYSTEM followed by a physical address, such as a file system path or URI, in quotes.
This example points to a file called simple.dtd in the local file system:

<!DOCTYPE doc SYSTEM "/usr/local/xml/dtds/simple.dtd">

Public identifier: a public identifier is never supposed to change.
The keyword PUBLIC followed by the public identifier and then a system identifier, each in quotes:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 3.2//EN"
"http://www.w3.org/TR/HTML/html.dtd">

Internal DTD Declaration
If the DTD is declared inside the XML file, it should be wrapped in a
DOCTYPE definition with the following syntax:
<!DOCTYPE root-element [element-declarations]>

<?xml version="1.0"?>
<!DOCTYPE note [ <!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>]>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend</body>
</note>
The DTD above is interpreted like this:

!DOCTYPE note defines that the root element of this document is note.
!ELEMENT note defines that the note element contains four elements:
"to,from,heading,body".
!ELEMENT to defines the to element to be of the type "#PCDATA".
!ELEMENT from defines the from element to be of the type "#PCDATA".
!ELEMENT heading defines the heading element to be of the type "#PCDATA".
!ELEMENT body defines the body element to be of the type "#PCDATA".
External DTD Declaration
If the DTD is declared in an external file, it should be wrapped in a
DOCTYPE definition with the following syntax:

<!DOCTYPE root-element SYSTEM "filename">

This is the same XML document as above, but with an external DTD :
<?xml version="1.0"?>
<!DOCTYPE note SYSTEM "note.dtd">
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>

And this is the file "note.dtd" which contains the DTD:

<!ELEMENT note (to,from,heading,body)>


<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
XML Namespaces

XML Namespaces provide a method to avoid element name conflicts.
Namespaces enable you to make sure that one set of tags cannot conflict with another.
They are a method to keep metadata elements from different schemas from colliding.

This XML carries HTML table information:

<table>
<tr>
<td>Apples</td>
<td>Bananas</td>
</tr>
</table>

This XML carries information about a table (a piece of furniture):

<table>
<name>AfricanCoffeeTable</name>
<width>80</width>
<length>120</length>
</table>

If these XML fragments were added together, there would be a name conflict. Both contain a <table> element, but the elements have different content and meaning.
An XML parser will not know how to handle these differences.

Solving the Name Conflict Using a Prefix

Name conflicts in XML can easily be avoided using a name prefix.

This XML carries information about an HTML table, and a piece of furniture.

In the example, there will be no conflict because the two <table> elements have different names.

<h:table>
<h:tr>
<h:td>Apples</h:td>
<h:td>Bananas</h:td>
</h:tr>
</h:table>

<f:table>
<f:name>African Coffee Table</f:name>
<f:width>80</f:width>
<f:length>120</f:length>
</f:table>
XML Namespaces - The xmlns Attribute
When using prefixes in XML, a so-called namespace for the prefix must be defined.
The namespace is defined by the xmlns attribute in the start tag of an element.
The namespace declaration has the following syntax:
xmlns:prefix="URI"

The namespace can be declared on the elements that use it:

<root>
<h:table xmlns:h="http://www.w3.org/TR/html4/">
<h:tr>
<h:td>Apples</h:td>
<h:td>Bananas</h:td>
</h:tr></h:table>
<f:table xmlns:f="http://www.w3schools.com/furniture">
<f:name>African Coffee Table</f:name>
<f:width>80</f:width>
<f:length>120</f:length>
</f:table></root>

Or all namespaces can be declared once on the root element:

<root xmlns:h="http://www.w3.org/TR/html4/"
xmlns:f="http://www.w3schools.com/furniture">
<h:table>
<h:tr>
<h:td>Apples</h:td>
<h:td>Bananas</h:td>
</h:tr></h:table>
<f:table>
<f:name>African Coffee Table</f:name>
<f:width>80</f:width>
<f:length>120</f:length>
</f:table>
</root>
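
A namespace-aware parser identifies each element by its namespace URI plus local name, not by the prefix. The following sketch, assuming Python's standard xml.etree.ElementTree, shows the two table elements being kept apart; the NS mapping and variable names are illustrative choices.

```python
import xml.etree.ElementTree as ET

DOC = """<root xmlns:h="http://www.w3.org/TR/html4/"
      xmlns:f="http://www.w3schools.com/furniture">
<h:table><h:tr><h:td>Apples</h:td><h:td>Bananas</h:td></h:tr></h:table>
<f:table><f:name>African Coffee Table</f:name>
<f:width>80</f:width><f:length>120</f:length></f:table>
</root>"""

# Prefixes used in the search paths; they need not match the document's own.
NS = {
    "h": "http://www.w3.org/TR/html4/",
    "f": "http://www.w3schools.com/furniture",
}

root = ET.fromstring(DOC)
html_table = root.find("h:table", NS)
furniture = root.find("f:table", NS)

cells = [td.text for td in html_table.findall("h:tr/h:td", NS)]
name = furniture.find("f:name", NS).text
print(html_table.tag, cells)
print(furniture.tag, name)
```

After parsing, the prefix is gone and each tag carries its URI in braces, so the two table elements have different names even though both were written as "table".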
XML Vocabularies

XML allows authors to create their own tags to describe data.
People and organizations in various fields of study have
created many different kinds of XML for structuring data.
Some of these markup languages are
MathML (Mathematical Markup language)
WML (Wireless Markup language)
PDML (Product Data Markup language)
XBRL (Extensible Business Reporting language )
XUL ( Extensible User Interface language)

MathML (Mathematical Markup language)
- Simple expression
Some specialized software packages, such as TeX and LaTeX, exist for displaying complex mathematical expressions.
MathML is used for describing mathematical notation and expressions.
One application that can parse and render MathML is the W3C's Amaya browser/editor.
MathML is divided into two types of markup:
Content markup
It allows programmers to write mathematical notation specific to different areas of mathematics.
For instance, the multiplication symbol has one meaning in set theory and another meaning in linear algebra.
Presentation markup
It is directed towards formatting and displaying mathematical notation.
The following is an example of presentation MathML.

Simple MathML :mathml1.html
<html xmlns="http://www.w3.org/1999/xhtml">

<head><title>Simple MathML Example</title></head>

<body>

<math xmlns = "http://www.w3.org/1998/Math/MathML">

<mrow>
<mn>2</mn>
<mo>+</mo>
<mn>3</mn>
<mo>=</mo>
<mn>5</mn>
</mrow>

</math>

</body>
</html>
Example explained (mathml1.html)
We embed the MathML content into an XHTML file by using the math element with the default namespace http://www.w3.org/1998/Math/MathML.
The mrow element is a container element for expressions that contain more than one element.
In this case mrow contains five child elements.
The mn element marks up a number.
The mo element marks up an operator (e.g. +, *).
Using this markup, we define the expression 2 + 3 = 5, which a MathML browser can display.

Simple MathML : algebraic equation
The next example (mathml2.html) uses MathML to mark up an algebraic equation that uses exponents and arithmetic operators.

Simple MathML : mathml2.html
<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>Simple MathML Example</title></head>
<body>
<math xmlns = "http://www.w3.org/1998/Math/MathML">
<mrow>

<mrow>
<mn>3</mn>
<mo>&InvisibleTimes;</mo>
<msup>
<mi>x</mi>
<mn>2</mn>
</msup>
</mrow>
<mo>+</mo>
<mi>x</mi>
<mo>-</mo>
<mfrac>
<mn>2</mn>
<mi>x</mi>
</mfrac>

<mo>=</mo>
<mn>0</mn>
</mrow>
</math> </body> </html>
Example explained

The mrow element allows the document author to group related elements properly.
The entity reference &InvisibleTimes; indicates a multiplication operation without an explicit symbolic representation (i.e., the multiplication symbol does not appear between 3 and x).
For exponentiation, use the msup element, which represents a superscript. It has two children: the expression to be superscripted (the base) and the superscript (the exponent).
To display a variable such as x, use the identifier element mi.
To display a fraction, use the mfrac element, which specifies the numerator and the denominator of the fraction.

Simple MathML : Integral equation
mathml3.html
<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>Calculus MathML Example</title></head>
<body>
<math xmlns = "http://www.w3.org/1998/Math/MathML">
<mrow>
<msubsup>

<mo>&Integral;</mo>
<mn>0</mn>

<mrow>
<mn>1</mn>
<mo>-</mo>
<mi>y</mi>
</mrow>

</msubsup>

<msqrt>
<mrow>

<mn>4</mn>
<mo>&InvisibleTimes;</mo>

<msup>
<mi>x</mi>
<mn>2</mn>
</msup>

<mo>+</mo>
<mi>y</mi>

</mrow>
</msqrt>

<mo>&delta;</mo>
<mi>x</mi>
</mrow>
</math> </body> </html>

Example Explained

The entity &Integral; represents the integral symbol, while the msubsup element specifies the subscript and superscript.
Element mo marks up the integral operator.
Element msubsup requires three child elements:
An operator
The subscript expression
The superscript expression
Element msqrt represents a square root expression.
The entity &delta; represents a lowercase delta symbol. Delta is an operator.

XML DOM

XML DOM Introduction
The DOM is a W3C (World Wide Web Consortium) standard.
"The W3C Document Object Model (DOM) is a platform and language-
neutral interface that allows programs and scripts to dynamically access
and update the content, structure, and style of a document."

The XML DOM defines a standard way for accessing and manipulating XML
documents.

The DOM presents an XML document as a tree structure, with elements, attributes, and text as nodes.

What is the XML DOM?

The XML DOM is:


A standard object model for XML
A standard programming interface for XML
Platform- and language-independent
The XML DOM is a standard for how to get, change, add, or
delete XML elements.
A W3C standard

The XML DOM defines the objects and properties of all XML elements, and the methods (interface) to access them.

It is a cross-language API for representing XML documents as trees.

The XML DOM Node Tree
The XML DOM views an XML document as a tree structure called a node tree.
All the nodes in the tree have a relationship to each other.
All nodes can be accessed through the tree. Their contents can be modified or deleted, and new elements can be created.
The node tree shows
the set of nodes,
the connections between them.
The tree starts at the root node and branches out to the text nodes at the lowest level of the tree.

DOM Nodes

According to the DOM, everything in an XML document is a node.
The DOM says:
The entire document is a document node
Every XML element is an element node
The text in an XML element is a text node
Every attribute is an attribute node
Comments are comment nodes

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE bookstore [
<!-- entity node -->
<!ENTITY smile "<graphic file='smile.eps'/>">
]>
<!-- root node: the document itself; <bookstore> is the root element node -->
<bookstore>
<book category="COOKING">                       <!-- element node; category is an attribute node -->
<title lang="en">Everyday Italian</title>       <!-- "Everyday Italian" is a character (text) node -->
<author>Giada De Laurentiis</author>
<year>2005</year>
<price>30.00</price>
</book>
<book category="CHILDREN">
<title lang="en">Harry Potter</title>
<author>J K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>
</bookstore>

Parsing the XML DOM

All modern browsers have a built-in XML parser that can be used to read and manipulate XML.

The parser converts XML into a JavaScript-accessible object.

The parser reads XML into memory and converts it into an XML DOM object that can be accessed with JavaScript.

The Microsoft parser supports loading of both XML files and XML strings (text), while other browsers use separate parsers. However, all parsers contain functions to traverse XML trees and to access, insert, and delete nodes.

Loading XML with Microsoft's XML Parser
Microsoft's XML parser is built into Internet Explorer 5
and higher.

var xmlDoc=new ActiveXObject("Microsoft.XMLDOM");
xmlDoc.async=false;
xmlDoc.load("note.xml");

Example explained:
The first line of the script above creates an empty Microsoft XML document object.
The second line turns off asynchronous loading, to make sure that the parser will not continue execution of the script before the document is fully loaded.
The third line tells the parser to load an XML document called "note.xml".

The following JavaScript fragment loads a string called txt into the parser:

var xmlDoc=new ActiveXObject("Microsoft.XMLDOM");
xmlDoc.async=false;
xmlDoc.loadXML(txt);

The loadXML() method is used for loading strings (text); load() is used for loading files.

XML Parser in Firefox and Other Browsers

The following JavaScript fragment loads an XML document ("note.xml") into the parser:

var xmlDoc=document.implementation.createDocument("","",null);
xmlDoc.async=false;
xmlDoc.load("note.xml");

The following JavaScript fragment loads a string called txt into the parser:

var parser=new DOMParser();
var doc=parser.parseFromString(txt,"text/xml");

Example explained:
The first line of the first fragment creates an empty XML document object.
In the second fragment, the second line tells the parser to load a string called txt.
Internet Explorer uses the loadXML() method to parse an XML string, while other browsers use the DOMParser object.
Access Across Domains
For security reasons, modern browsers do not allow access across domains.
This means that both the web page and the XML file it tries to load must be located on the same server.
If you want to use the example above on one of your web pages, the XML files you load must be located on your own server.
Otherwise the xmlDoc.load() method will generate the error "Access is denied".

Examples
In the examples below we use the following DOM code to get the
text from the <to> element:
xmlDoc.getElementsByTagName("to")[0].childNodes[0].nodeValue

xmlDoc - the XML document created by the parser.


getElementsByTagName("to")[0] - the first <to> element
childNodes[0] - the first child of the <to> element (the text node)
nodeValue - the value of the node (the text itself)
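
The same chain of DOM calls works outside the browser, because the DOM is language-neutral. As a sketch under that assumption, Python's standard xml.dom.minidom exposes identically named methods and properties; xml_doc and NOTE are illustrative names.

```python
from xml.dom.minidom import parseString

NOTE = ("<note><to>Tove</to><from>Jani</from>"
        "<heading>Reminder</heading>"
        "<body>Don't forget me this weekend!</body></note>")

xml_doc = parseString(NOTE)                      # the XML document created by the parser
first_to = xml_doc.getElementsByTagName("to")[0]  # the first <to> element
text_node = first_to.childNodes[0]                # its first child, the text node
to_text = text_node.nodeValue                     # the text itself
print(to_text)
```

Note that the text is not a property of the <to> element itself; it lives in a separate text node one level below it, which is why the childNodes[0] step is needed.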

Parsing an XML File - A Cross browser Example
<html>
<body>
<script type="text/javascript">
var xmlDoc;
try { // Internet Explorer
  xmlDoc = new ActiveXObject("Microsoft.XMLDOM");
} catch (e) {
  try { // Firefox, Mozilla, Opera, etc.
    xmlDoc = document.implementation.createDocument("", "", null);
  } catch (e) {
    alert(e.message);
  }
}
try {
  xmlDoc.async = false; // wait until the document is fully loaded
  xmlDoc.load("note.xml");
  document.write(xmlDoc.getElementsByTagName("to")[0].childNodes[0].nodeValue);
  document.write("<br/>");
  document.write(xmlDoc.getElementsByTagName("from")[0].childNodes[0].nodeValue);
  document.write("<br/>");
  document.write(xmlDoc.getElementsByTagName("body")[0].childNodes[0].nodeValue);
} catch (e) {
  alert(e.message);
}
</script>
</body>
</html>

Output:
Tove
Jani
Don't forget me this weekend!
XML DOM Traverse Node Tree
Often you will need to loop through elements in an XML document or string.
<html>
<body>
<script type="text/javascript">
var text = "<note>";
text = text + "<to>Tove</to>";
text = text + "<from>Jani</from>";
text = text + "<heading>Reminder</heading>";
text = text + "<body>Don't forget me this weekend!</body>";
text = text + "</note>";
var doc;
if (window.ActiveXObject) { // code for IE
  doc = new ActiveXObject("Microsoft.XMLDOM");
  doc.async = false;
  doc.loadXML(text);
} else { // code for Mozilla, Firefox, Opera, etc.
  var parser = new DOMParser();
  doc = parser.parseFromString(text, "text/xml");
}
// documentElement always represents the root node
var x = doc.documentElement;
for (var i = 0; i < x.childNodes.length; i++) {
  document.write(x.childNodes[i].nodeName);
  document.write("=");
  document.write(x.childNodes[i].childNodes[0].nodeValue);
  document.write("<br />");
}
</script>
</body>
</html>

Output:
to=Tove
from=Jani
heading=Reminder
body=Don't forget me this weekend!
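
The traversal loop relies on every child of the root being an element, which holds only because the string contains no whitespace between tags. The sketch below, assuming Python's standard xml.dom.minidom, repeats the loop on an indented document and skips the whitespace text nodes; child_pairs is a hypothetical helper name.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

NOTE = """<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>"""


def child_pairs(xml_text):
    """Return (name, text) for each child element of the root node."""
    root = parseString(xml_text).documentElement
    pairs = []
    for node in root.childNodes:
        if node.nodeType != Node.ELEMENT_NODE:
            continue  # skip whitespace-only text nodes between elements
        pairs.append((node.nodeName, node.childNodes[0].nodeValue))
    return pairs


for name, value in child_pairs(NOTE):
    print(name + "=" + value)
```

Filtering on nodeType keeps the loop robust when the XML is pretty-printed, since line breaks and indentation show up as extra text nodes in childNodes.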
XSL(T) Overview

XSL stylesheets are denoted in XML syntax


XSL components:
1. a language for transforming XML documents
(XSLT: integral part of the XSL specification)
2. an XML formatting vocabulary
(Formatting Objects: >90% of the formatting
properties inherited from CSS)

XSLT Processing Model

XML source tree -> transformation (driven by the XSLT stylesheet) -> result tree (XML, HTML, csv, text)

XSLT Elements
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

root element of an XSLT stylesheet "program"

<xsl:template match=pattern name=qname priority=number mode=qname>
...template...
</xsl:template>

declares a rule: (pattern => template)

<xsl:apply-templates select=node-set-expression mode=qname>

apply templates to selected children (default=all)
optional mode attribute

<xsl:call-template name=qname>

invoke a template by name

XSLT Processing Model
XSL stylesheet: collection of template rules
template rule: (pattern => template)
main steps:
match pattern against source tree
instantiate template (replace current node "." by the template in the result tree)
select further nodes for processing
control can be a mix of
recursive processing ("push": <xsl:apply-templates> ...)
program-driven ("pull": <xsl:for-each> ...)

Template Rule: Example
pattern: the value of the match attribute ("product")
template: the content of the <xsl:template> element

<xsl:template match="product">
<table>
<xsl:apply-templates select="sales/domestic"/>
</table>
<table>
<xsl:apply-templates select="sales/foreign"/>
</table>
</xsl:template>

(i) match pattern: process <product> elements
(ii) instantiate template: replace each product element with two HTML tables
(iii) select the <product> grandchildren (sales/domestic, sales/foreign) for further processing

Match/Select Patterns

Match patterns are a subset of select patterns; select patterns are XPath expressions, defined in http://w3.org/TR/xpath
Examples:
/mybook/chapter[2]/section/*
chapter|appendix
chapter//para
div[@class="appendix" and position() mod 2 = 1]//para
../@lang
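
Some of these patterns can be tried without an XSLT processor. The sketch below uses Python's standard xml.etree.ElementTree, which supports only a limited subset of XPath (no union operator, no functions such as position(), no attribute selection), so it is an approximation rather than a full XPath evaluation. The sample document and variable names are made up for illustration.

```python
import xml.etree.ElementTree as ET

BOOK = """<mybook>
<chapter><section><para>c1</para></section></chapter>
<chapter><section><title>T</title><para>c2</para></section></chapter>
<appendix><para>a1</para></appendix>
</mybook>"""

mybook = ET.fromstring(BOOK)

# /mybook/chapter[2]/section/*  (paths are relative to the root element here)
second_chapter_items = [e.tag for e in mybook.findall("chapter[2]/section/*")]

# chapter//para: all para descendants of chapter children
chapter_paras = [e.text for e in mybook.findall("chapter//para")]

# chapter|appendix: the union is not supported, so two searches are combined
chapters_and_appendices = mybook.findall("chapter") + mybook.findall("appendix")

print(second_chapter_items, chapter_paras, len(chapters_and_appendices))
```

A full XPath 1.0 engine is needed for the remaining examples, which use position(), boolean operators, and attribute steps.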

Recursive Descent Processing with XSLT

take some XML file on books: books.xml


now prepare it with style: books.xsl
and enjoy the result: books.html
the recipe for cooking this was:

java com.icl.saxon.StyleSheet books.xml books.xsl > books.html

and now some different flavors: books2.xsl books3.xsl

Source: XSLT Programmer's Reference, Michael Kay, WROX

XSLT Example

Creating the Result Tree...
Literal result elements: non-XSL elements (e.g., HTML)
appear literally in the result tree
Constructing elements:
<xsl:element name = "">
attribute & children definition
</xsl:element>

(similar for xsl:attribute, xsl:text, xsl:comment, ...)

Generating text:
<xsl:template match="person">
<p>
<xsl:value-of select="@first-name"/>
<xsl:text> </xsl:text>
<xsl:value-of select="@surname"/>
</p>
</xsl:template>

Advantages of XML

Moving Beyond Format


Data Reusability Single Source - Multiple output
Flexibility
Accessibility
Portability

Applications of XML

Configuration files
Used extensively in J2EE architecture
B2B transaction on the web
Electronic business order(ebXML)
Financial exchange
Messaging exchange (SOAP)
Media for data interchange
A better alternative to proprietary data formats
XML database
An XML database is a data persistence software system that
allows data to be imported, accessed and exported in the XML
format.

XML vs. Databases
(a simplistic formula)

If your information is
Tightly structured
Fixed field length
Massive numbers of individual items
You need a database
If your information is
Loosely structured
Variable field length
Massive record size
You need XML
