Web Technologies
Uttam K. Roy
Department of Information Technology
Jadavpur University
Kolkata
Oxford University Press 2013
Chapter 6
eXtensible Markup Language
(XML)
An HTML system
[Figure: HTML document, Web Server, Internet, Web Client (parser, formatter, interface)]
Role of HTML
HTML
Designed to display data
Focuses on appearance
Has a fixed set of predefined tags
Ambiguity
Role of XML
eXtensible Markup Language
W3C recommendation, 1998
Designed to structure, transport and store data
Transformation and dynamic data customization
Interoperable way to represent and process documents (not necessarily on the web)
Self-descriptive
Example
<note>
<to>John</to>
<from>Ani</from>
<heading>Reminder</heading>
<body>Return my book on Monday</body>
</note>
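A minimal sketch of how an application might read the note above, assuming Python and its standard xml.etree.ElementTree module. The names NOTE_XML and fields are illustrative only, and the body text is written on one line for simplicity.

```python
import xml.etree.ElementTree as ET

NOTE_XML = """<note>
  <to>John</to>
  <from>Ani</from>
  <heading>Reminder</heading>
  <body>Return my book on Monday</body>
</note>"""

# Parse the text into a tree; `note` is the root element.
note = ET.fromstring(NOTE_XML)

# The tags carry no built-in meaning; the application decides how to use them.
fields = {child.tag: child.text for child in note}

print(note.tag)
print(fields["to"], fields["from"], fields["heading"])
```

The parser only recovers the structure; what "to" or "heading" mean is left entirely to the program that reads them.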
Another Example
<song>
<title>Requiem</title>
<composer>Mozart</composer>
</song>
Equivalent HTML code:
<p>Requiem is a song composed by Mozart</p>
Role of XML
Not a replacement of HTML
XML focuses on what data are
HTML focuses on how data look
Tags are custom defined (not predefined)
Functional meaning depends on application
Everything must be marked up correctly
XML and Databases
XML brings benefits of databases to documents
Schema to model information directly
Formal validation, locking, versioning, rollback...
But
Not all traditional database concepts map cleanly, because documents are fundamentally different in some ways
XML Building blocks
Element
Delimited by angle brackets
Identifies the nature of the content it surrounds
General format: <element> </element>
Empty element: <empty-element/>
Attribute
Name-value pairs that occur inside start-tags after the element name, like:
<element attribute="value">
XML Building blocks: Prolog
The part of an XML document that precedes the XML data
Includes
A declaration: version [, encoding, standalone]
<?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>
An optional document type declaration, which references a DTD (Document Type Definition)
<!DOCTYPE greeting SYSTEM "hello.dtd">
Processing instructions (optional)
<?xml-stylesheet href="simple.xsl" type="text/xsl"?>
XML Elements
XML Elements are Extensible
More and more elements may be added to carry more
information
XML Elements have Relationships
Elements are related as parents and children
Elements have Content
Elements can have different types of content:
empty content
simple content
element content
mixed content
attributes
XML elements must follow the naming rules
XML Elements naming rules
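Names can contain letters, digits and a few other characters (hyphens, underscores, periods and colons).
Names cannot start with a digit or with a punctuation mark other than an underscore or a colon.
Names must not start with the letters xml in any case (xml, XML, Xml, ...); such names are reserved.
Names cannot contain white space.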
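The naming rules can be approximated with a short check. The sketch below assumes Python and is deliberately simplified: it accepts ASCII names only and leaves out colons (normally kept for namespaces), whereas the XML grammar also allows many non-ASCII characters. Names beginning with "xml" are rejected because the XML specification reserves them. The function name is_acceptable_name is hypothetical.

```python
import re

# Simplified ASCII-only pattern: the first character is a letter or an
# underscore; later characters may also be digits, hyphens or periods.
NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*")


def is_acceptable_name(name: str) -> bool:
    """Return True if `name` passes the simplified naming rules."""
    if not NAME_PATTERN.fullmatch(name):
        return False
    # Names beginning with "xml" (in any letter case) are reserved.
    return not name.lower().startswith("xml")


for candidate in ["book", "first-name", "_id", "2ndEdition", "first name", "XmlData"]:
    print(candidate, is_acceptable_name(candidate))
```

A real parser applies the full grammar, so this check is only useful as a quick sanity test before generating element names.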
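Anatomy of an element
<p type="rule">Use a hyphen: ­.</p>
Element: the whole construct, from start-tag to end-tag
Start-tag: <p type="rule">
Element type: p (appears in both the start-tag and the end-tag)
Attribute: type="rule" (attribute name: type, attribute value: rule)
Content: the text between the tags, including a (character) entity reference for the hyphen
End-tag: </p>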
The Basic Rules
XML is case sensitive
<Message>This is incorrect</message>
<message>This is correct</message>
The Basic Rules
All start tags must have end tags
Incorrect: <composer>Mozart
Correct: <composer>Mozart</composer>
Empty Element
<BR></BR>
<BR/>
<img align="center" src="logo.gif"/>
<composer name="Mozart"></composer>
<composer name="Mozart"/>
The Basic Rules
Elements must be properly nested
<b><i>This is incorrect nesting</b></i>
<b><i>This is correct nesting</i></b>
The Basic Rules
The XML declaration, if present, must be the first statement
<?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>
The Basic Rules
Every document must contain a root element
<root>
<child>
<subchild>.....</subchild>
</child>
</root>
The Basic Rules
Attribute values must be enclosed in quotation marks (single or double)
<note date="12/11/2007">
<to>Ani</to>
<from>John</from>
</note>
The Basic Rules
Certain characters are reserved for parsing
Incorrect: <message>if salary < 1000 then</message>
Correct: <message>if salary &lt; 1000 then</message>
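The basic rules so far can be checked mechanically by handing a document to a parser. The sketch below is one possible approach, assuming Python and its standard xml.etree.ElementTree module; the helper name is_well_formed is hypothetical, and the sample strings are taken or adapted from the slides.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if `text` parses as a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


samples = [
    "<Message>This is incorrect</message>",          # case mismatch
    "<message>This is correct</message>",
    "<composer>Mozart",                               # missing end tag
    "<b><i>This is incorrect nesting</b></i>",       # overlapping elements
    "<b><i>This is correct nesting</i></b>",
    "<a/><b/>",                                       # more than one root
    "<note date=12/11/2007></note>",                  # unquoted attribute value
    "<message>if salary < 1000 then</message>",       # unescaped reserved character
    "<message>if salary &lt; 1000 then</message>",
]

results = [is_well_formed(sample) for sample in samples]
for ok, sample in zip(results, samples):
    print(ok, sample)
```

Only the three samples that follow every rule are accepted; each of the others breaks exactly one rule and is rejected by the parser.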
Predefined entities
&lt;     <    less than
&gt;     >    greater than
&amp;    &    ampersand
&apos;   '    apostrophe
&quot;   "    quotation mark
Oxford University Press 2013
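When XML is generated by a program, the predefined entities are usually inserted by a library rather than by hand. A minimal sketch, assuming Python's standard xml.sax.saxutils module; the variable names are illustrative only.

```python
from xml.sax.saxutils import escape, unescape

raw = "if salary < 1000 then"

# escape() replaces &, < and > with their predefined entities.
escaped = escape(raw)
print(escaped)

# unescape() reverses the substitution.
restored = unescape(escaped)
print(restored)

# Quotes are not escaped by default; pass an extra mapping when needed,
# for example when the text will be placed inside an attribute value.
quote_entities = {'"': "&quot;", "'": "&apos;"}
quoted = escape('He said "hi" & left', quote_entities)
print(quoted)
```

Escaping at the point where text is written into the document keeps reserved characters in the data from being mistaken for markup.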
The Basic Rules
With XML, white space is preserved
With XML, a new line is always stored as LF
Comments in XML: <!-- This is a comment -->
Can go almost anywhere (not inside tags)
Schemas can contain comments, too
Common Errors for Element Naming
Do not use white space when creating names for elements
Element names cannot begin with a digit, although names can contain digits
Only certain punctuation is allowed: periods, colons, hyphens and underscores
XML Attributes
Located in the start tag of elements
Provide additional information about
elements
Often provide information that is not a part
of data
Must be enclosed in quotes
Should I use an element or an attribute?
Metadata (data about data) should be stored as attributes, and the data itself should be stored as elements
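Either choice is readable by a program; what changes is how the value is accessed. A minimal sketch, assuming Python's standard xml.etree.ElementTree module, with the same date stored once as an attribute and once as a child element (the second document is a hypothetical variant of the note example).

```python
import xml.etree.ElementTree as ET

# Date stored as an attribute (treated as metadata about the note).
with_attribute = ET.fromstring(
    '<note date="12/11/2007"><to>Ani</to><from>John</from></note>'
)

# Same date stored as a child element (treated as part of the data).
with_element = ET.fromstring(
    "<note><date>12/11/2007</date><to>Ani</to><from>John</from></note>"
)

date_from_attribute = with_attribute.get("date")     # attribute lookup
date_from_element = with_element.findtext("date")    # text of a child element

print(date_from_attribute)
print(date_from_element)
```

An attribute holds a single string, while a child element can later grow its own structure (for example separate day, month and year elements), which is one reason data tends to be placed in elements.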
Types of XML Documents
XML document
Well-formed XML
Syntax is correct
Valid XML
Well-formed
Validated against a DTD/Schema
Valid XML
Properties
Well-formed
Complies with the rules defined in a DTD/Schema
Advantages
Clear understanding
Data verification
Interoperability
Better document processing
XML Validation
[Figure: an XML document and an XML schema are input to an XML parser, which outputs an optimized XML document or error messages]
xmllint --valid sample.xml
Displaying XML
<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml-stylesheet type="text/xsl"
href="books.xsl"?>
<bookstore>
<book category="literature">
<title lang="beng">Sanchoita</title>
<author>Rabindranath Tagore</author>
<year>2009</year>
<price>200.00</price>
</book>
</bookstore>
Displaying XML
XHTML
XML namespace
XML DOM
XPath
XSL (XSLT+XPath)
Client side
By browser
Explicitly by author (using JavaScript)
Server side
Schema
Displaying XML
XML documents do not carry information
about how to display the data
We can add display information to XML with
CSS (Cascading Style Sheets)
XSL (eXtensible Stylesheet Language), preferred
XML into HTML
XSLT can transform XML into one of the following (called the "output method"):
XML
HTML
text
Server-side XSLT engine
content in XML
served as HTML
browser never knows
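The server-side idea (content kept as XML, served as HTML) can be imitated without an XSLT engine. The sketch below is not XSLT: Python's standard library has no XSLT processor, so it uses plain xml.etree.ElementTree code as a stand-in for the transformation step. The function name books_to_html and the table layout are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

BOOKS_XML = """<bookstore>
  <book category="literature">
    <title lang="beng">Sanchoita</title>
    <author>Rabindranath Tagore</author>
    <year>2009</year>
    <price>200.00</price>
  </book>
</bookstore>"""


def books_to_html(xml_text: str) -> str:
    """Turn the bookstore XML into an HTML table of titles and authors."""
    root = ET.fromstring(xml_text)
    rows = []
    for book in root.findall("book"):
        # Escape the text so reserved characters stay safe in the HTML output.
        title = escape(book.findtext("title", default=""))
        author = escape(book.findtext("author", default=""))
        rows.append(f"<tr><td>{title}</td><td>{author}</td></tr>")
    return "<table>" + "".join(rows) + "</table>"


html_page = books_to_html(BOOKS_XML)
print(html_page)
```

The client receives only the generated HTML, which matches the "browser never knows" point: the XML source and the transformation both stay on the server.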
Client-side XSL
[Figure: XML, XSLT, FO]
Server-side XSL
[Figure: XML and XSLT are input to an XSLT engine, which outputs HTML]
XML DOM
<bookstore>
<book category="COOKING">
<title lang="en">Everyday Italian</title>
<author>Giada De Laurentiis</author>
<year>2005</year>
<price>30.00</price>
</book>
<book category="CHILDREN">
<title lang="en">Harry Potter</title>
<author>J. K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>
<book category="WEB">
<title lang="en">Learning XML</title>
<author>Erik T. Ray</author>
<year>2003</year>
<price>39.95</price>
</book>
</bookstore>
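A document like the one above can be loaded into a DOM tree and walked node by node. A minimal sketch, assuming Python's standard xml.dom.minidom module and a shortened copy of the bookstore document (two books, fewer child elements); the variable names are illustrative only.

```python
from xml.dom.minidom import parseString

BOOKSTORE_XML = """<bookstore>
  <book category="COOKING">
    <title lang="en">Everyday Italian</title>
    <price>30.00</price>
  </book>
  <book category="WEB">
    <title lang="en">Learning XML</title>
    <price>39.95</price>
  </book>
</bookstore>"""

doc = parseString(BOOKSTORE_XML)
root = doc.documentElement  # the <bookstore> element node

titles = []
for book in root.getElementsByTagName("book"):
    title = book.getElementsByTagName("title")[0]
    # In the DOM, the text of an element is itself a child (text) node.
    titles.append((book.getAttribute("category"), title.firstChild.data))

print(root.tagName)
print(titles)
```

Elements, attributes and text are all reached through the same tree, with parent and child relationships mirroring the nesting in the source document.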
XML DOM Tree