Unit -3
Introduction to XML
Introduction
 XML stands for eXtensible Markup Language.
 XML is a markup language, like HTML.
 XML has no predefined tags; you define your own tags for storing data.
 XML only stores (describes) data, either temporarily or permanently. It does nothing else with the data.

 XML was designed to describe well-formed data. XML has a few strict rules,
and every XML document must follow them.

 Data from almost any database can easily be converted to XML. XML is a reasonable storage
format for certain types of data, and it can easily be transformed on the server side with XSL and similar tools.

 XML files can also be used to insert or update data in the database tables that correspond
to the objects they describe.

 UTF-8 is the default encoding of an XML file; any other encoding should be named in the XML declaration.

The following are some key differences between XML and HTML.

Key Point: Stands for
XML: eXtensible Markup Language.
HTML: Hyper Text Markup Language.

Key Point: Derived from
XML: Derived from SGML (Standard Generalized Markup Language).
HTML: Also derived from SGML.

Key Point: Purpose
XML: Designed to hold data. It is used to transport data between applications and databases.
HTML: Designed to specify how data should be displayed on a web page.

Key Point: Rules
XML: Follows strict rules. Processing stops as soon as a rule is broken.
HTML: Does not enforce strict rules. Every browser tries to display the data as well as it can.

Key Point: Case sensitivity
XML: Case sensitive.
HTML: Not case sensitive.

Key Point: Tags
XML: You define your own custom tags.
HTML: Tags are predefined.

Key Point: How to write
XML: Every tag must have a closing tag. Example: <note>Travel experience</note>
HTML: Tags are of two kinds, with a closing tag or self-closing. Example: self-closing tag <br />, closing tag <p>Travel experience</p>

Key Point: Behavior
XML: Dynamic; it holds data.
HTML: Static; it displays data.

Key Features of XML


Self-describing data: An XML document describes its own data, so it is easy to read the data as a
table-like structure.

Creating custom tags: XML makes it easy to define your own tags to describe data.

Exchanging data: XML data can be shared easily between different applications as well as
databases.

Describing Structures in XML

An XML structure starts from a single parent (root) element at the top. Every XML document has
exactly one root element.

XML describes data as a tree structure. The tree has one root, child elements,
branches, attributes, and values. The following is a simple XML structure.

Figure: Describing Structures in XML (tree diagram)

Based on the tree structure above, the following XML
document contains all of the described information.
<employee>
  <emp_info id="1">
    <name>
      <first_name>Opal</first_name>
      <middle_name>Venue</middle_name>
      <last_name>Kole</last_name>
    </name>
    <contact_info>
      <company_info>
        <comp_name>Odoo (formerly OpenERP)</comp_name>
        <comp_location>
          <street>Tower-1, Infocity</street>
          <city>GH</city>
          <phone>000-478-1414</phone>
        </comp_location>
        <designation>Junior Engineer</designation>
      </company_info>
      <phone>000-987-4745</phone>
      <email>email@myemail.com</email>
    </contact_info>
  </emp_info>
</employee>
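
One way to see the tree is to load the document and walk it from the root. The sketch below is only an illustration: it assumes Python's standard xml.etree.ElementTree module as the parser (this unit does not prescribe one) and uses a shortened copy of the employee document. The helper name walk is made up for this example.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the employee document (illustrative subset only).
EMPLOYEE_XML = """
<employee>
  <emp_info id="1">
    <name>
      <first_name>Opal</first_name>
      <last_name>Kole</last_name>
    </name>
    <contact_info>
      <phone>000-987-4745</phone>
      <email>email@myemail.com</email>
    </contact_info>
  </emp_info>
</employee>
"""


def walk(element, depth=0):
    """Yield (depth, tag, text) for an element and all of its descendants."""
    text = (element.text or "").strip()
    yield depth, element.tag, text
    for child in element:
        yield from walk(child, depth + 1)


root = ET.fromstring(EMPLOYEE_XML)
for depth, tag, text in walk(root):
    # Indent each tag by its depth in the tree.
    print("  " * depth + tag + (": " + text if text else ""))
```

The root element is visited at depth 0, and each level of nesting adds one level of indentation in the printed outline.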

The Syntax of XML


XML syntax rules are strict but logically very simple. They specify
how to write a well-formed XML document.

1. An XML document must have exactly one root element.
2. Every element must have a closing tag. Ex: <name>Ramesh</name>
3. Elements must be properly nested.
4. Comments in XML are written as in HTML. Ex: <!-- This is a comment -->
5. XML tags are case sensitive.
6. XML attribute values must be quoted. Ex:
<emp_info id="1">
<name>Raju</name>
</emp_info>
7. White space inside element content is preserved in XML (unlike HTML, which collapses multiple
white-space characters into a single white space).
8. Entity support (described in the next section).
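
Several of these rules can be checked mechanically, because a parser rejects a document that breaks them. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree parser; the helper is_well_formed and the sample strings are made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


samples = {
    "one root, closed, nested": '<emp_info id="1"><name>Raju</name></emp_info>',
    "two root elements": "<name>Raju</name><name>Ramesh</name>",
    "missing closing tag": "<name>Ramesh",
    "badly nested": "<a><b></a></b>",
    "case mismatch in tags": "<Name>Ramesh</name>",
    "unquoted attribute value": "<emp_info id=1></emp_info>",
}

for label, text in samples.items():
    print(label, "->", is_well_formed(text))
```

Only the first sample passes; each of the others breaks one of the rules listed above.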

Entity Support
In XML, as in HTML, some characters have a special meaning.
If you use a < sign (or an & sign) inside an XML element, it generates an error because the
parser assumes that a new element (or an entity reference) starts there. A > sign is usually escaped as well.

<emp_info>employee number < 15 </emp_info> <!-- Invalid -->

To avoid this error, use an entity reference instead of the special character (<, >).

<emp_info>employee number &lt; 15 </emp_info> <!-- Valid -->

The XML specification defines five predefined entities that represent special characters. The following
table lists them.

Character   Entity    Description                     Standard
"           &quot;    Double quotation mark           XML 1.0
'           &apos;    Apostrophe (single quotation)   XML 1.0
&           &amp;     Ampersand                       XML 1.0
<           &lt;      Less than                       XML 1.0
>           &gt;      Greater than                    XML 1.0
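
In practice the replacement is rarely done by hand. As a hedged sketch, Python's standard xml.sax.saxutils module offers escape and unescape: escape handles &, < and > by default, and the two quote characters are passed in explicitly here. The sample text is made up for illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, unescape

raw = 'employee number < 15 & name = "Raju"'

# escape() replaces &, < and > by default; quotes are added via the extra map.
escaped = escape(raw, {'"': "&quot;", "'": "&apos;"})
print(escaped)

# unescape() reverses the replacement with the inverse map.
restored = unescape(escaped, {"&quot;": '"', "&apos;": "'"})
print(restored == raw)

# The escaped text is now safe to place inside an element.
element = ET.fromstring("<emp_info>" + escape("employee number < 15") + "</emp_info>")
print(element.text)
```

After parsing, the element text contains the original < character again, because the parser resolves the entity reference.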

XML elements vs attributes


XML Elements
 Every document must have exactly one top-level element, called the root element.
 Element content is part of the basic document content; it stores the
information (data).
 XML elements are represented by tags.
XML element names must follow these rules:
 Element names may contain letters, digits, hyphens, underscores, and periods.
 Element names cannot contain white space.
 Element names must start with a letter or an underscore. They cannot start with a digit or a punctuation mark, and they should not start with the letters "xml", which are reserved.

If an element has no content (it is empty), you can write it in either of the following two
valid ways.
<element />
<element></element>
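
As a quick check that the two forms mean the same thing, the sketch below parses both and compares what the parser reports. It assumes Python's standard xml.etree.ElementTree module; the helper describe is made up for this example.

```python
import xml.etree.ElementTree as ET

short_form = ET.fromstring("<element />")
long_form = ET.fromstring("<element></element>")


def describe(element):
    """Summarize an element as (tag, text, number of children, attributes)."""
    return (element.tag, element.text, len(element), dict(element.attrib))


print(describe(short_form))
print(describe(long_form))
print(describe(short_form) == describe(long_form))
```

Both forms produce an element with no text, no children, and no attributes.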

XML Attributes
An XML element can have attributes that identify or describe it.
<emp_info id="1"> <!-- id is an attribute -->
  <name>
    <first_name>Opal</first_name>
    <last_name>Kole</last_name>
  </name>
</emp_info>

The XML standard allows an element to have multiple attributes, provided that each
attribute name is unique within the element.

<emp_info id="1" name="Raju"> <!-- two attributes -->
...
</emp_info>
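
A program reads attributes separately from element content. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module; the designation child and the email lookup are only there for illustration.

```python
import xml.etree.ElementTree as ET

doc = '<emp_info id="1" name="Raju"><designation>Junior Engineer</designation></emp_info>'
emp = ET.fromstring(doc)

print(emp.attrib)                # all attributes of the element as a dictionary
print(emp.get("id"))             # a single attribute value
print(emp.get("email", "n/a"))   # fallback value for a missing attribute

# Repeating an attribute name on one element is not well-formed XML.
try:
    ET.fromstring('<emp_info id="1" id="2"></emp_info>')
    duplicate_accepted = True
except ET.ParseError:
    duplicate_accepted = False
print(duplicate_accepted)
```

Attribute values always arrive as strings, so the id above is "1", not the number 1.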

Namespaces in XML
 The primary purpose of namespaces in XML is to distinguish between duplicate
element and attribute names.

 XML data often has to be exchanged between several applications.

 The same tag name may have a different meaning in different applications, which
creates confusion when documents are exchanged.

 Adding a prefix to element or attribute names avoids this confusion.

<prefix_name:element_name>
 Every prefix is associated with one URI.
 Prefixes associated with the same URI are in the same namespace.
 The full name, consisting of the prefix, a colon, and the local name, is called the XML qualified name.

A prefix is bound to a namespace URI with an xmlns:prefix attribute on the prefixed element.

<r:student xmlns:r="http://www.w3c.org/xml/">

</r:student>

Name Conflicts Example

The following example XML stores a student's marks.

Example:
<student>
  <result>
    <name>Raju</name>
    <cgpa>8.4</cgpa>
  </result>
  <cv>
    <name>Raju</name>
    <cgpa>8.4</cgpa>
  </cv>
</student>

In the XML document above, both <result> and <cv> contain <name> and <cgpa> elements, so an application
reading the document cannot tell from the tag name alone which meaning is intended.

That is why XML namespaces are used to map an element prefix to a URI.

A namespace URI does not have to point to information about the namespace; it only serves to identify
elements uniquely.

Convert the Name Conflict to XML Namespaces

We specify a different prefix for each group of elements and bind each prefix to a namespace URI with an
xmlns attribute, as follows.
<r:student xmlns:r="http://www.w3c.org/some_url1"
           xmlns:res="http://www.w3c.org/some_url2">
  <r:result>
    <r:name>Raju</r:name>
    <r:cgpa>8.4</r:cgpa>
  </r:result>
  <res:cv>
    <res:name>Raju</res:name>
    <res:cgpa>8.4</res:cgpa>
  </res:cv>
</r:student>
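
To a parser, a prefixed name is the namespace URI plus the local name; the prefix itself is only shorthand. The sketch below assumes Python's standard xml.etree.ElementTree module and a copy of the student document in which both prefixes, r and res, are declared on the root element.

```python
import xml.etree.ElementTree as ET

STUDENT_XML = """
<r:student xmlns:r="http://www.w3c.org/some_url1"
           xmlns:res="http://www.w3c.org/some_url2">
  <r:result>
    <r:name>Raju</r:name>
    <r:cgpa>8.4</r:cgpa>
  </r:result>
  <res:cv>
    <res:name>Raju</res:name>
    <res:cgpa>8.4</res:cgpa>
  </res:cv>
</r:student>
"""

# Prefix-to-URI map used only for the searches below.
NS = {
    "r": "http://www.w3c.org/some_url1",
    "res": "http://www.w3c.org/some_url2",
}

root = ET.fromstring(STUDENT_XML)
result_cgpa = root.find("r:result/r:cgpa", NS)
cv_cgpa = root.find("res:cv/res:cgpa", NS)

# ElementTree stores each tag as {namespace URI}local_name.
print(result_cgpa.tag)
print(cv_cgpa.tag)
print(root.find("cgpa"))  # an unqualified search finds nothing
```

The two cgpa elements now have different full names, so they can no longer be confused with each other.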

DTD Introduction
 A DTD (Document Type Definition) is a type of document schema; it defines the
structure of XML documents.
 A DTD provides a framework for validating XML documents. You can create a DTD file
that is shared between different applications.
 In XML you can define tags without declaring which tags are legal. If you specify DTD rules, however,
the document structure must conform to them.
 A DTD by itself does not identify the root element. You name the root
element yourself in the <!DOCTYPE ...> declaration.
 In short, a DTD contains a number of rules that an XML document must follow.

A DTD defines the following three things:

 The tags and attributes that can be used to create an XML document.

 How tags can be combined and reused.

 The entities that represent special characters.


 Well-formed XML:

A well-formed file follows the general XML rules: every opening tag must be closed,
tags must be properly nested, an empty tag must end with '/>', attribute values must be
enclosed in either single or double quotes, and so on.

 Valid XML:

A valid XML file is well-formed and also conforms to a specific structure: it has a DTD that
specifies which tags may be used and which attributes and content each tag may contain.

DTD declaration
In the DTD declaration section we can define rules for elements, attributes, entities, and notations.
A valid XML document (one that includes a DTD) follows the rules specified in the DTD.

1. Element Declaration in DTD

An element declaration specifies the name of a tag that can be used to build the XML document.
In general, every XML element is declared in the following way:
<!ELEMENT element_name (inside_element)>

element_name specifies the general identifier, and inside_element specifies the content
allowed inside the element.

Elements with any Contents

An element declared with the ANY keyword can contain any combination of parsable
data.

<!ELEMENT element_name ANY> <!-- Syntax-->

<!ELEMENT div ANY> <!-- Example-->

Empty Element
<!ELEMENT element_name EMPTY> <!-- Syntax-->

<!ELEMENT br EMPTY> <!-- Example-->

The EMPTY keyword specifies an empty tag: the element has no content.

Only Text Content Element

If an element contains only text data, you can declare it as follows.

<!ELEMENT element_name (#PCDATA)>

The #PCDATA (parsed character data) keyword specifies that the content is character data only, which the parser will parse.

Multiple Child Element

One or more child elements are listed, separated by commas (,). The children must appear in the listed order.

<!ELEMENT div (p,a,span,h3)> <!-- Example-->

2. DTD Attribute Declaration

If an element has attributes, you have to declare the names of the
attributes in the DTD.
<!ATTLIST element_name attr_name attr_token_type attr_declaration>

Description
 element_name specifies the element name.
 attr_name specifies the attribute name.
 attr_token_type specifies the type of the attribute value (for example, CDATA for a character string).
 attr_declaration specifies the default behavior of the attribute. It takes one of the following forms.
 #REQUIRED: the attribute must have a value.
Syntax:
<!ATTLIST element_name attribute_name CDATA #REQUIRED>
Ex:
<!ATTLIST employee id CDATA #REQUIRED>
 #IMPLIED: the attribute value is optional. It is not compulsory to give a value.
 default_value: specifies the value that is used when the attribute is omitted.
<!ATTLIST email domain CDATA "personal">
 #FIXED default_value: the attribute always has the specified value. If it is written in the element, it must have
exactly that value.
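
To see how a parser reads these declarations, the sketch below registers an attribute-list handler on Python's standard xml.parsers.expat parser and records what it reports. This is an illustration only: the small document and the nickname attribute are made up for the example, and expat reports the declarations without validating the document against them.

```python
import xml.parsers.expat

# Illustrative document; the nickname attribute is hypothetical.
DOC = """<!DOCTYPE employee [
<!ELEMENT employee (email)>
<!ATTLIST employee id CDATA #REQUIRED>
<!ATTLIST employee nickname CDATA #IMPLIED>
<!ELEMENT email (#PCDATA)>
<!ATTLIST email domain CDATA "personal">
]>
<employee id="1"><email>email@myemail.com</email></employee>
"""

declared = []


def on_attlist(elname, attname, attr_type, default, required):
    """Record each attribute declaration reported by the parser."""
    declared.append((elname, attname, attr_type, default, bool(required)))


parser = xml.parsers.expat.ParserCreate()
parser.AttlistDeclHandler = on_attlist
parser.Parse(DOC, True)

for entry in declared:
    print(entry)
```

Each recorded tuple holds the element name, the attribute name, the attribute type, the default value (if any), and whether the attribute is required.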

3. ENTITY Declaration
XML lets you define your own entities. You declare an entity in the
DTD (Document Type Definition) section of the document. Once an entity is declared, you can
refer to it in your XML document as &name; (an ampersand, the entity name, and a semicolon).

<!ENTITY name definition>


Example:

<!ENTITY ph "000-478-1414">

4. Notation declaration

The general syntax of a notation declaration is:

<!NOTATION notation_name PUBLIC "public_identifier">
<!NOTATION notation_name SYSTEM "url">

A Sample DTD with xml


<?xml version="1.0" standalone="yes"?>
<!DOCTYPE employee [
  <!ELEMENT employee (name, designation, email, phone)>
  <!ATTLIST employee id CDATA #REQUIRED>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT designation (#PCDATA)>
  <!ATTLIST designation discipline CDATA #IMPLIED>
  <!ELEMENT email (#PCDATA)>
  <!ELEMENT phone (#PCDATA)>
  <!ENTITY ph "000-478-1414">
]>
<!-- now the XML code -->
<employee id="1">
  <name>Raju</name>
  <designation discipline="Web developer">Senior Engineer</designation>
  <email>email@myemail.com</email>
  <phone>&ph;</phone>
</employee>
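
The effect of the entity declaration can be observed by parsing a document of this kind. The sketch below assumes Python's standard xml.etree.ElementTree module and a shortened copy of the sample. Note that this parser expands entities declared in the internal DTD but does not validate the document against the DTD.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the sample; the entity reference is written as &ph; with a semicolon.
SAMPLE = """<?xml version="1.0" standalone="yes"?>
<!DOCTYPE employee [
<!ELEMENT employee (name, phone)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT phone (#PCDATA)>
<!ENTITY ph "000-478-1414">
]>
<employee>
  <name>Raju</name>
  <phone>&ph;</phone>
</employee>
"""

root = ET.fromstring(SAMPLE)
phone = root.findtext("phone")
print(phone)

# Without the closing semicolon the reference is not well-formed.
try:
    ET.fromstring(SAMPLE.replace("&ph;", "&ph"))
    missing_semicolon_accepted = True
except ET.ParseError:
    missing_semicolon_accepted = False
print(missing_semicolon_accepted)
```

The parser replaces the reference with the declared text, so the phone element ends up holding the number itself.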

Internal/External DTD Declaration


Internal DTD
 An internal DTD is declared inside the XML file itself, in the <!DOCTYPE ... >
declaration at the top of the file.
Example: see "A Sample DTD with xml" above.

External DTD
An external DTD can be shared between multiple XML documents. Any change made in the
DTD file takes effect in all the XML documents that use it. An XML file refers to an external DTD
with a declaration of the form <!DOCTYPE root_element SYSTEM "file_name.dtd">.

Example:
Save the first file as: external_dtd.dtd
…………………….
Sample DTD first part (the declarations)
………………………….

Save the second file as: main.xml

<?xml version="1.0" standalone="no"?>
…………………….
Sample DTD second part (the XML part)
………………………….
XML Schema
 XML Schema is an XML-based language used to create XML-based languages and
data models.
 An XML schema defines element and attribute names for a class of XML documents.
 The schema also specifies the structure that those documents must adhere to and the
type of content that each element can hold.
 XML documents that are written against a schema are said to be instances of that schema.
 If they correctly adhere to the schema, then they are valid instances.

The Power of XML Schema


What is the Difference Between DTD and XSD?
1. DTD vs XSD
DTD: A DTD is a set of markup declarations that define a document type for an SGML-family markup language.
XSD: An XSD specifies how to formally describe the elements in an Extensible Markup Language (XML) document.

2. Stands For
DTD: DTD stands for Document Type Definition.
XSD: XSD stands for XML Schema Definition.

3. Control on XML Structure
DTD: DTD provides less control over the XML structure.
XSD: XSD provides more control over the XML structure.

4. Support for Data Types
DTD: DTD does not support data types.
XSD: XSD supports data types.

5. Simplicity
DTD: DTD is harder than XSD.
XSD: XSD is simpler than DTD.

Schemas Fundamentals
 An XML schema describes the structure of an XML instance document by defining
what each element must or may contain.
 Schema authors can define their own types or use the built-in types.
The following is a high-level overview of Schema types.
1. Elements can be of simple type or complex type.
2. Simple type elements can only contain text. They cannot have child elements or
attributes.
3. All the built-in types are simple types (e.g., xs:string).
4. Complex-type elements can contain child elements and attributes as well as text.
5. By default, complex-type elements have complex content, meaning that they have
child elements.
6. Complex types may have mixed content - a combination of text and child elements.
The diagram below gives a first look at the types of XML Schema elements

Defining a Simple XML Schema


Author.xsd
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Author">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="FirstName" type="xs:string" />
        <xs:element name="LastName" type="xs:string" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>

The document element of XML schemas is xs:schema. It takes the attribute xmlns:xs with
the value of http://www.w3.org/2001/XMLSchema , indicating that the document should
follow the rules of XML Schema.
Validating an XML Instance Document
The code sample below shows a valid XML instance of above XML schema.
/xmlinstance.xml

<?xml version="1.0"?>
<Author xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:noNamespaceSchemaLocation="Author.xsd">
  <FirstName>Mark</FirstName>
  <LastName>Twain</LastName>
</Author>

The xmlns:xsi attribute of the document element indicates that this XML document is an
instance of an XML schema. The document is tied to a specific XML schema with
the xsi:noNamespaceSchemaLocation attribute.
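
Python's standard library has no XML Schema validator, so the sketch below is not real validation. It only reads the xs:sequence from a copy of Author.xsd with xml.etree.ElementTree and compares the listed names with the children of the instance document. The helper expected_children is made up for this example; full validation needs a schema-aware tool.

```python
import xml.etree.ElementTree as ET

NS = {"xs": "http://www.w3.org/2001/XMLSchema"}
XSI = "{http://www.w3.org/2001/XMLSchema-instance}"

AUTHOR_XSD = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Author">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="FirstName" type="xs:string" />
        <xs:element name="LastName" type="xs:string" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""

AUTHOR_XML = """
<Author xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:noNamespaceSchemaLocation="Author.xsd">
  <FirstName>Mark</FirstName>
  <LastName>Twain</LastName>
</Author>
"""


def expected_children(schema_root, element_name):
    """Return the child names listed in the xs:sequence of a top-level element."""
    for decl in schema_root.findall("xs:element", NS):
        if decl.get("name") == element_name:
            sequence = decl.find("xs:complexType/xs:sequence", NS)
            return [child.get("name") for child in sequence.findall("xs:element", NS)]
    return None


schema = ET.fromstring(AUTHOR_XSD)
instance = ET.fromstring(AUTHOR_XML)

expected = expected_children(schema, instance.tag)
actual = [child.tag for child in instance]
print(expected, actual, expected == actual)
print(instance.get(XSI + "noNamespaceSchemaLocation"))
```

The comparison covers only the names and order of the children; it ignores data types, occurrence counts, and everything else a schema can express.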

XML Schema Data types


XML Schema specifies 44 built-in types, 19 of which are primitive. There are two types of
data types in XML schema.
1. simpleType
2. complexType

simpleType
The simpleType is for text-based elements. A simple-type element cannot have attributes or child
elements.
complexType
The complexType can hold multiple attributes and elements. It can contain
additional sub-elements and can be left empty.
Password.xsd

<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="Password">
    <xs:restriction base="xs:string">
      <xs:minLength value="6"/>
      <xs:maxLength value="12"/>
    </xs:restriction>
  </xs:simpleType>
  <xs:element name="User">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="PW" type="Password"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>

Password.xml

<?xml version="1.0"?>
<User xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="Password.xsd">
<PW>MyPass</PW>
</User>
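
The restriction in the Password type is a length check on the text of the PW element. The sketch below imitates only that check: it reads the two facets from a shortened copy of the schema with Python's standard xml.etree.ElementTree module and applies them to some candidate passwords. It is not an XSD validator, and the helpers length_bounds and fits are made up for this example.

```python
import xml.etree.ElementTree as ET

NS = {"xs": "http://www.w3.org/2001/XMLSchema"}

# Shortened copy of the schema: only the Password type.
PASSWORD_XSD = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="Password">
    <xs:restriction base="xs:string">
      <xs:minLength value="6"/>
      <xs:maxLength value="12"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>
"""


def length_bounds(schema_root, type_name):
    """Read the minLength and maxLength facets of a named simple type."""
    for simple in schema_root.findall("xs:simpleType", NS):
        if simple.get("name") == type_name:
            restriction = simple.find("xs:restriction", NS)
            low = int(restriction.find("xs:minLength", NS).get("value"))
            high = int(restriction.find("xs:maxLength", NS).get("value"))
            return low, high
    raise KeyError(type_name)


def fits(password, bounds):
    """Return True if the password length lies within the bounds (inclusive)."""
    low, high = bounds
    return low <= len(password) <= high


bounds = length_bounds(ET.fromstring(PASSWORD_XSD), "Password")
for candidate in ["MyPass", "abc", "ThisPasswordIsTooLong"]:
    print(candidate, fits(candidate, bounds))
```

MyPass has exactly six characters, so it sits on the lower bound and is accepted; the other two candidates fall outside the range.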

Displaying Raw XML Documents


Raw XML files can be viewed in all major browsers.
Don't expect XML files to be displayed as HTML pages.
<?xml version="1.0" encoding="UTF-8"?>
- <note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>

 Look at the XML file above in your browser: note.xml


 Most browsers will display an XML document with color-coded elements.
 Often a plus (+) or minus sign (-) to the left of the elements can be clicked to expand
or collapse the element structure.

Displaying XML Documents with CSS


college.xml

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<?xml-stylesheet type="text/css" href="college.css"?>
<colleges>
  <college>
    <name>SJMP</name>
    <url>http://www.sjmpbirur.in</url>
  </college>
</colleges>

college.css
colleges {
margin:10px;
background-color:#ccff00;
font-family:verdana,helvetica,sans-serif;
}
name {
display:block;
font-weight:bold;
}
url {
display:block;
color:#636363;
font-size:small;
font-style:italic;
}
