Introduction to XML
Extensible Markup Language
• Introduction
• SGML is a meta-markup language, that is, a language for defining markup languages. It
can describe a wide variety of document types.
• SGML was developed in the early 1980s and approved as an ISO standard in 1986.
• HTML was developed using SGML in the early 1990s, specifically for Web
documents.
• Three problems with HTML:
• 1. HTML is defined to describe the general form and layout of information without
considering its meaning.
• 2. It has a fixed set of tags and attributes. The given tags must fit every kind of document, so there is no
way to find particular information.
• 3. There are no restrictions on the arrangement or order in which tags appear in a
document.
What is XML
• XML stands for eXtensible Markup Language.
• A markup language is used to provide information about a
document.
• Tags are added to the document to provide the extra
information.
• XML was designed to describe data, not to display it.
• XML tags are not predefined. You must define your own tags.
• HTML tags tell a browser how to display the document.
• XML tags give a reader some idea what some of the data
means.
What is XML Used For?
• XML documents are used to transfer data from one
place to another, often over the Internet.
• XML-based markup languages are designed for particular applications.
• A number of fields have their own XML-based languages. These
include chemistry, mathematics, and book publishing.
• Several of these languages are published as open standards, some by the
World Wide Web Consortium (W3C), and are available for anyone’s use.
How Can XML be Used?
• If you need to display dynamic data in your HTML document, it
will take a lot of work to edit the HTML each time the data
changes.
• With XML, data can be stored in separate XML files. This way
you can concentrate on using HTML/CSS for display and
layout, and be sure that changes in the underlying data will not
require any changes to the HTML.
• With a few lines of JavaScript code, you can read an external
XML file and update the data content of your web page.
Advantages of XML
• XML is text (Unicode) based.
– Text files can be read on any platform and with any editor.
– Text compresses well, so it can be transmitted efficiently.
• XML documents can be modularized. Parts can
be reused.
Example of an HTML Document
<html>
<head><title>Example</title></head>
<body>
<h1>This is an example of a page.</h1>
<h2>Some information goes here.</h2>
</body>
</html>
Example of an XML Document
<?xml version="1.0"?>
<address>
<name>Alice Lee</name>
<email>alee@aol.com</email>
<phone>212-346-1234</phone>
<birthday>1985-03-22</birthday>
</address>
Difference Between HTML and XML
• HTML tags have a fixed meaning and
browsers know what it is.
• XML tags are different for different
applications, and users know what they
mean.
• HTML tags are used for display.
• XML tags are used to describe documents
and data.
XML Rules
• Tags are enclosed in angle brackets.
• Tags come in pairs with start-tags and
end-tags.
• Tags must be properly nested.
– <name><email>…</name></email> is not allowed.
– <name><email>…</email></name> is.
• Tags that do not have end-tags must be
terminated by a ‘/’.
– <br /> is an XHTML example.
More XML Rules
• Tags are case sensitive.
– <address> is not the same as <Address>
• Names that begin with "xml", in any combination of upper and lower case, are reserved and
may not be used as tag names.
• Text between tags may not contain a literal '<' or '&'; write &lt; and &amp; instead.
• Tag names are formed much like Java identifiers, except
that colons, hyphens, and periods are also
allowed. They must begin with a letter or underscore and may
not contain white space.
• Documents must have a single root element that
contains all the other elements.
Encoding
• XML (like Java) uses Unicode to represent characters.
• Unicode has several encodings. The most common one
used in the West is UTF-8.
• UTF-8 is a variable-length code. Characters are
encoded in 1 to 4 bytes.
• The first 128 characters in Unicode are ASCII, and UTF-8 encodes each of them in 1 byte.
• The Unicode code points between 128 and 255 cover
some of the more common characters used in western
Europe, such as ã, á, å, or ç. UTF-8 encodes each of these in 2 bytes; ISO-8859-1 (Latin-1) uses 1 byte.
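The variable-length nature of UTF-8 can be observed directly. The following is a minimal Python sketch; the sample characters and the helper name utf8_length are chosen here for illustration and are not part of the slides.

```python
# Byte lengths of individual characters when encoded as UTF-8.
samples = ["A", "ç", "€", "😀"]


def utf8_length(ch: str) -> int:
    """Return how many bytes UTF-8 needs for one character."""
    return len(ch.encode("utf-8"))


for ch in samples:
    print(f"U+{ord(ch):04X} {ch!r}: {utf8_length(ch)} byte(s)")

# The same character occupies a single byte in ISO-8859-1 (Latin-1).
latin1_bytes = "ç".encode("iso-8859-1")
```

An ASCII letter needs one byte, while characters further along in Unicode need two, three, or four.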
Well-Formed Documents
• An XML document is said to be well-formed if it
follows all the rules.
• An XML parser is used to check that all the rules
have been obeyed.
• Web browsers have included XML parsers since Internet Explorer 5
and Netscape 7.
• Parsers are also available for free download
over the Internet.
• Java 1.4 and later also include an open-source parser.
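As a rough illustration of a parser checking the rules, the sketch below uses the parser in Python's standard library (xml.etree.ElementTree). The helper name is_well_formed and the sample strings are assumptions made for this example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


good = "<address><name>Alice Lee</name><email>alee@aol.com</email></address>"
bad_nesting = "<name><email>alee@aol.com</name></email>"       # tags overlap
two_roots = "<name>Alice</name><email>alee@aol.com</email>"    # no single root
case_mismatch = "<address>Alice</Address>"                     # names differ in case

for sample in (good, bad_nesting, two_roots, case_mismatch):
    print(is_well_formed(sample), sample)
```

This parser only checks well-formedness; it does not validate against a DTD or schema.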
XML Example Revisited
<?xml version="1.0"?>
<address>
<name>Alice Lee</name>
<email>alee@aol.com</email>
<phone>212-346-1234</phone>
<birthday>1985-03-22</birthday>
</address>
• Markup for the data helps understanding of its purpose.
• A flat text file is not nearly so clear.
Alice Lee
alee@aol.com
212-346-1234
1985-03-22
• The last line looks like a date, but what is it for?
Expanded Example
<?xml version="1.0"?>
<address>
  <name>
    <first>Alice</first>
    <last>Lee</last>
  </name>
  <email>alee@aol.com</email>
  <phone>123-45-6789</phone>
  <birthday>
    <year>1983</year>
    <month>07</month>
    <day>15</day>
  </birthday>
</address>
XML Files are Trees
address
  name
    first
    last
  email
  phone
  birthday
    year
    month
    day
XML Trees
• An XML document has a single root node.
• The tree is a general ordered tree.
– A parent node may have any number of
children.
– Child nodes are ordered, and may have
siblings.
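The tree view can be reproduced with a short traversal. This is a sketch in Python using xml.etree.ElementTree; the function name outline is made up for the example, and the document is the expanded address example.

```python
import xml.etree.ElementTree as ET

DOC = """<address>
  <name><first>Alice</first><last>Lee</last></name>
  <email>alee@aol.com</email>
  <phone>123-45-6789</phone>
  <birthday><year>1983</year><month>07</month><day>15</day></birthday>
</address>"""


def outline(node, depth=0):
    """Return the element names in document order, indented by depth."""
    lines = ["  " * depth + node.tag]
    for child in node:  # children are visited in document order
        lines.extend(outline(child, depth + 1))
    return lines


root = ET.fromstring(DOC)
print("\n".join(outline(root)))
```

The single root is address, and the order of its children matches their order in the file.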
Validity
• A well-formed document has a tree structure and
obeys all the XML rules.
• A particular application may add more rules in
either a DTD (document type definition) or in a
schema.
• Many specialized DTDs and schemas have
been created to describe particular areas.
• These range from disseminating news bulletins
to chemical formulas.
• DTDs were developed first and are less
expressive than schemas.
Document Type Definitions
• A DTD describes the tree structure of a
document and something about its data.
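• There are two basic data types, PCDATA and
CDATA.
– PCDATA is parsed character data: text in which the parser still recognizes markup.
– CDATA is character data; in a DTD it is the type of an attribute whose value is an arbitrary string.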
• A DTD determines how many times a
node may appear, and how child nodes
are ordered.
Parsing
• Parsing means breaking a block of data into smaller chunks by following a set
of rules, so that it can be more easily interpreted, managed, or
transmitted by a computer. Spreadsheet programs, for
example, parse data to fit it into cells of a certain size.
Document Type Definitions
• The form of an element declaration for
elements that contain elements
• <!ELEMENT element_name (list of names of child elements)>
• The form of an attribute declaration
• <!ATTLIST element_name attribute_name
attribute_type [default_value]>
• Ex. <!ATTLIST airplane places CDATA "4">
DTD for address Example
<!ELEMENT address (name, email, phone, birthday)>
<!ELEMENT name (first, last)>
<!ELEMENT first (#PCDATA)>
<!ELEMENT last (#PCDATA)>
<!ELEMENT email (#PCDATA)>
<!ELEMENT phone (#PCDATA)>
<!ELEMENT birthday (year, month, day)>
<!ELEMENT year (#PCDATA)>
<!ELEMENT month (#PCDATA)>
<!ELEMENT day (#PCDATA)>
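To make the meaning of these declarations concrete, here is a hand-written Python sketch that checks a document against the same content models. It is not a DTD validator (the parsers in Python's standard library do not validate), and the names MODEL and conforms are invented for this example. It only handles fixed sequences and text-only elements, and it ignores stray text inside elements that should hold only child elements.

```python
import xml.etree.ElementTree as ET

# Content models copied from the DTD: a tuple lists the required children
# in order, and None stands for (#PCDATA).
MODEL = {
    "address": ("name", "email", "phone", "birthday"),
    "name": ("first", "last"),
    "birthday": ("year", "month", "day"),
    "first": None, "last": None, "email": None, "phone": None,
    "year": None, "month": None, "day": None,
}


def conforms(node) -> bool:
    """Check one element, and recursively its children, against MODEL."""
    if node.tag not in MODEL:
        return False
    expected = MODEL[node.tag]
    children = list(node)
    if expected is None:  # text only, no child elements allowed
        return not children
    if tuple(child.tag for child in children) != expected:
        return False
    return all(conforms(child) for child in children)


valid = ET.fromstring(
    "<address><name><first>Alice</first><last>Lee</last></name>"
    "<email>alee@aol.com</email><phone>123-45-6789</phone>"
    "<birthday><year>1983</year><month>07</month><day>15</day></birthday>"
    "</address>"
)
swapped = ET.fromstring("<name><last>Lee</last><first>Alice</first></name>")
print(conforms(valid), conforms(swapped))
```

The second document is well-formed but breaks the declared order (first, last), which is the kind of error a validating parser reports.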
INTERNAL AND EXTERNAL DTDs
• Internal DTD Example:
• External DTD Example: [assuming that the DTD
is stored in the file named planes.dtd]
<!DOCTYPE planes_for_sale SYSTEM "planes.dtd">
NAMESPACES
• It is often convenient to construct XML documents that include
tag sets that are defined for and used by other documents.
• When a tag set is available and appropriate for a particular XML
document, it is better to use it than to invent a new
collection of element types.
• A problem with using different markup vocabularies in the same
document is that collisions between names that are defined in
two or more of those tag sets could result.
• An example of this situation is having a <table> tag for a
category of furniture and a <table> tag from XHTML for
information tables.
NAMESPACES
• An XML namespace is a collection of element and attribute
names used in XML documents. The name of a namespace
usually has the form of a uniform resource identifier (URI).
• The form of a namespace declaration for an element is
• <element_name xmlns[:prefix] = "URI">
• The square brackets indicate that what is within them is
optional. The prefix, if included, is the name that must be
attached to the names in the declared namespace.
• <html xmlns = "http://www.w3.org/1999/xhtml">
NAMESPACES
• The next example declares two namespaces. The first is
declared to be the default namespace; the second defines the
prefix, cap:
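A declaration of this shape might look like the document embedded in the Python sketch below. The element names and both namespace URIs are hypothetical, chosen only to show a default namespace next to the prefix cap. The code parses the document with xml.etree.ElementTree and lists the fully qualified names.

```python
import xml.etree.ElementTree as ET

# Hypothetical element names and namespace URIs, used only for illustration.
DOC = """<states xmlns="http://example.org/states"
        xmlns:cap="http://example.org/state-capitals">
  <state>
    <name>South Dakota</name>
    <cap:name>Pierre</cap:name>
  </state>
</states>"""

root = ET.fromstring(DOC)
state = root[0]
# ElementTree expands every name to the form {namespace-URI}local-name,
# so the two <name> elements no longer collide.
tags = [child.tag for child in state]
print(root.tag)
print(tags)
```

Unprefixed names fall into the default namespace, while cap:name belongs to the second one, so two elements called name can coexist.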
XML SCHEMAS
• XML schemas are similar to DTDs, i.e., schemas are used to
define the structure of a document.
• DTDs have several disadvantages:
• The syntax of a DTD is unrelated to XML, so DTDs
cannot be analyzed with an XML processor.
• It is difficult for programmers to deal with two
different kinds of syntax.
• DTDs do not support data types for the content of a tag. All
content is specified as text.
Schemas
• Schemas are themselves XML documents.
• They were standardized after DTDs and provide
more information about the document.
• They have a number of data types including
string, decimal, integer, boolean, date, and time.
• They divide elements into simple and complex
types.
• They also determine the tree structure and how
many children a node may have.
DEFINING A SCHEMA
• Schemas themselves are written with the use of a collection of
tags, from a namespace that is, in effect, a schema of
schemas.
• The name of this namespace is
http://www.w3.org/2001/XMLSchema.
• Every schema has schema as its root element. The
namespace specification appears as follows:
• xmlns:xsd = "http://www.w3.org/2001/XMLSchema"
• The name of the namespace defined by a schema must be
specified with the targetNamespace attribute of the schema
element.
– targetNamespace = "http://cs.uccs.edu/planeSchema"
DEFINING A SCHEMA INSTANCE
• An instance document normally defines its default namespace to be the one
defined in its schema.
• For example, if the root element is planes, we could have
<planes xmlns = "http://cs.uccs.edu/planeSchema" ... >
• The second attribute specification in the root element of an instance
document declares the standard namespace for instances, whose name ends in
XMLSchema-instance, and binds it to the prefix xsi. The schemaLocation
attribute belongs to this namespace.
xmlns:xsi = "http://www.w3.org/2001/XMLSchema-instance"
• Third, the instance document must specify the filename of the schema in
which the default namespace is defined. This is accomplished with the
xsi:schemaLocation attribute, which takes two values separated by white space: the namespace of the
schema and the filename of the schema.
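The three attribute specifications can be seen together in the Python sketch below. It assumes the namespace URI http://cs.uccs.edu/planeSchema, and the file name planes.xsd is hypothetical. The code only reads the attributes with xml.etree.ElementTree; it does not validate anything.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"

# planes.xsd is a hypothetical file name; the document is never validated here.
DOC = """<planes xmlns="http://cs.uccs.edu/planeSchema"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://cs.uccs.edu/planeSchema planes.xsd">
</planes>"""

root = ET.fromstring(DOC)
# schemaLocation holds whitespace-separated pairs: a namespace, then the
# location of the schema that defines it.
values = root.get(f"{{{XSI}}}schemaLocation").split()
pairs = list(zip(values[0::2], values[1::2]))
print(root.tag)
print(pairs)
```

The root element ends up in the default namespace, and schemaLocation pairs that namespace with the schema file.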
Schema for First address Example
<?xml version="1.0" encoding="ISO-8859-1" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="address">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="name" type="xs:string"/>
        <xs:element name="email" type="xs:string"/>
        <xs:element name="phone" type="xs:string"/>
        <xs:element name="birthday" type="xs:date"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
Explanation of Example Schema
<?xml version="1.0" encoding="ISO-8859-1" ?>
• ISO-8859-1 (Latin-1) is the same as UTF-8 for the first 128 characters.
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
• http://www.w3.org/2001/XMLSchema is the namespace of the XML Schema language; the xs prefix marks names taken from it.
<xs:element name="address">
<xs:complexType>
• This states that address is a complex type element.
<xs:sequence>
• This states that the following elements form a sequence and must
come in the order shown.
<xs:element name="name" type="xs:string"/>
• This says that the element, name, must be a string.
<xs:element name="birthday" type="xs:date"/>
• This states that the element, birthday, is a date. Dates are always of
the form yyyy-mm-dd.
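The effect of typing birthday as a date can be approximated with a small check. The Python sketch below is a simplification: it tests only the plain yyyy-mm-dd form and does not cover everything the xs:date type permits. The function name looks_like_xs_date is invented for this example.

```python
import re
from datetime import date

DATE_FORM = re.compile(r"(\d{4})-(\d{2})-(\d{2})")


def looks_like_xs_date(text: str) -> bool:
    """Simplified check: exactly yyyy-mm-dd and a real calendar date."""
    match = DATE_FORM.fullmatch(text)
    if match is None:
        return False
    year, month, day = (int(part) for part in match.groups())
    try:
        date(year, month, day)  # raises ValueError for dates like 02-30
        return True
    except ValueError:
        return False


for candidate in ("1985-03-22", "1983-7-15", "1985-02-30"):
    print(candidate, looks_like_xs_date(candidate))
```

A value typed as xs:string would accept all three candidates; typing it as a date rejects the last two.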
XSLT
• XSL = Style Sheets for XML
• XML does not use predefined tags (we can use any
tag names we like), so a browser cannot know what
each tag means.
• A <table> tag could mean an HTML table, a piece of
furniture, or something else - and a browser does
not know how to display it.
• XSL describes how the XML document should be
displayed!
XSLT
• The eXtensible Stylesheet Language (XSL) is a family of
recommendations for defining the presentation and
transformations of XML documents.
• It consists of three related standards:
– XSL Transformations (XSLT),
– XML Path Language (XPath), and
– XSL Formatting Objects (XSL-FO).
• XSLT is used to transform one XML document into another,
often an HTML document.
• A program is used that takes as input one XML document and
produces as output another.
• If the resulting document is in HTML, it can be viewed by a web
browser.
• This is a good way to display XML data.
XSLT
• XPath is a language for expressions, which are often used to
identify parts of XML documents,
such as specific elements that are in specific positions in the
document or elements that have particular attribute values.
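A few path expressions of this kind can be tried in Python, whose xml.etree.ElementTree module supports a limited subset of XPath. The entry elements, the kind attribute, and the second person in the sample data are made up for this sketch.

```python
import xml.etree.ElementTree as ET

# The <entry> elements and the kind attribute are made up for this sketch.
DOC = """<addresses>
  <entry kind="friend"><name>Alice Lee</name><phone>212-346-1234</phone></entry>
  <entry kind="work"><name>Bob Ray</name><phone>212-555-0100</phone></entry>
</addresses>"""

root = ET.fromstring(DOC)

# Select by path: every name inside an entry.
all_names = [e.text for e in root.findall("./entry/name")]
# Select by position: the phone of the second entry.
second_phone = root.find("./entry[2]/phone").text
# Select by attribute value: names of entries marked kind="work".
work_names = [e.text for e in root.findall(".//entry[@kind='work']/name")]

print(all_names, second_phone, work_names)
```

XSLT processors implement XPath more completely; this subset is enough to show selection by path, position, and attribute value.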
OVERVIEW OF XSLT
• XSLT processors take both an XML document and an
XSLT document as input.
• The XSLT document is the program to be executed; the
XML document is the input data to the program.
• An XSLT document consists primarily
of one or more templates.
• One XSLT model of processing XML
data is called the template-driven model
A Style Sheet to Transform address.xml
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="address">
    <html><head><title>Address Book</title></head>
      <body>
        <xsl:value-of select="name"/>
        <br/><xsl:value-of select="email"/>
        <br/><xsl:value-of select="phone"/>
        <br/><xsl:value-of select="birthday"/>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
The Result of the Transformation
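• Applied to the expanded address example and viewed in a browser, the output reads:
Alice Lee
alee@aol.com
123-45-6789
1983 07 15
• value-of returns the text of all descendants of the selected element, so the year, month, and day are separated by the white space between their elements, not by hyphens.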
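Python's standard library has no XSLT processor, so the sketch below is not XSLT. It imitates by hand what the single template in the style sheet does, using xml.etree.ElementTree on the expanded address example. The function names value_of and transform are invented for the example.

```python
import xml.etree.ElementTree as ET
from html import escape

DOC = """<address>
<name>
<first>Alice</first>
<last>Lee</last>
</name>
<email>alee@aol.com</email>
<phone>123-45-6789</phone>
<birthday>
<year>1983</year>
<month>07</month>
<day>15</day>
</birthday>
</address>"""


def value_of(node) -> str:
    """Like xsl:value-of: concatenate the text of the node and its descendants."""
    return "".join(node.itertext())


def transform(address) -> str:
    """Build the HTML page that the style sheet's template describes."""
    fields = [value_of(address.find(tag))
              for tag in ("name", "email", "phone", "birthday")]
    # Browsers collapse runs of white space; normalise it here for readability.
    body = "<br/>".join(escape(" ".join(field.split())) for field in fields)
    return f"<html><head><title>Address Book</title></head><body>{body}</body></html>"


page = transform(ET.fromstring(DOC))
print(page)
```

For this input the name comes out as Alice Lee and the birthday as 1983 07 15, because the text of the child elements is joined with the white space that separates them.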
Parsers
• There are two principal models for
parsers.
• SAX – Simple API for XML
– Uses callback methods
– Similar to Java event listeners
• DOM – Document Object Model
– Creates a parse tree
– Requires a tree traversal
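The two models can be compared side by side. The sketch below uses Python's standard xml.sax and xml.dom.minidom modules rather than the Java APIs, and the class name TagCollector is made up for the example.

```python
import xml.sax
from xml.dom import minidom

DOC = "<address><name>Alice Lee</name><email>alee@aol.com</email></address>"


class TagCollector(xml.sax.ContentHandler):
    """SAX: the parser calls back into this handler at each start-tag."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def startElement(self, name, attrs):
        self.tags.append(name)


handler = TagCollector()
xml.sax.parseString(DOC.encode("utf-8"), handler)
sax_tags = handler.tags

# DOM: the whole document becomes a tree first, which is then traversed.
dom = minidom.parseString(DOC)
dom_tags = [dom.documentElement.tagName] + [
    node.tagName
    for node in dom.documentElement.childNodes
    if node.nodeType == node.ELEMENT_NODE
]
print(sax_tags, dom_tags)
```

SAX never holds the whole document in memory, which suits large inputs. DOM keeps the full tree, which suits repeated or random access.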
