Chapter 17

The document discusses XML (eXtensible Markup Language), including its definition, uses, syntax rules, elements, attributes, and validation. XML is a text-based markup language that can be used to mark up any type of data and is commonly used to transfer data between systems.

Uploaded by

Bhumika Gowda

XML: Extensible Markup Language

Chapter 17 - 1

Randy Connolly and Ricardo Hoar Fundamentals of Web Development


Textbook to be published by Pearson Ed in early 2014
© 2015 Pearson
http://www.funwebdev.com
XML Overview
Introduction

• XML is a text-based markup language, but unlike HTML, XML
can be used to mark up any type of data.
• It is derived from the Standard Generalized Markup Language (SGML).
• One of the key benefits of XML is that, as plain text, it is
human-readable and can be transferred between applications
and different operating systems.
• XML is used not only on the web server and to communicate
asynchronously with the browser, but also as a data
interchange format for moving information between systems.

XML Overview
XML in the web context - Used in many systems

Well Formed XML
Sample Document

The XML declaration is analogous to the HTML DOCTYPE.

Well Formed XML
Syntax Rules

For a document to be well-formed XML, it must follow the XML
syntax rules:
• Element names may contain only valid name characters
(spaces and most punctuation symbols are not allowed).
• Element names can’t start with a digit.
• There must be a single root element. A root element is one that
contains all the other elements; for instance, in an HTML
document, the root element is <html>.
• All elements must have a closing tag (or be self-closing).
• Elements must be properly nested.
• Elements can contain attributes.
• Attribute values must always be within quotes.
• Element and attribute names are case sensitive.
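
The rules above can be tried out with any XML parser. The following is a minimal sketch in Python, assuming the standard library's xml.etree.ElementTree module; the helper name is_well_formed and the sample documents are invented for illustration. The parser only checks well-formedness, so each sample that breaks one of the rules is rejected.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the text parses as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical samples: the first follows every rule, the rest break one each.
samples = {
    "ok": '<note lang="en"><to>Ann</to><br/></note>',
    "two roots": "<a/><b/>",
    "not closed": "<note><to>Ann</note>",
    "bad nesting": "<a><b></a></b>",
    "unquoted attribute": "<note lang=en/>",
    "case mismatch": "<Note></note>",
    "name starts with digit": "<1note/>",
}

for label, text in samples.items():
    print(f"{label}: {is_well_formed(text)}")
```

Only the sample labelled "ok" should be reported as well formed.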
Valid XML
Requires a DTD

• Validation is the process of checking an XML document
against a set of declared rules. An XML document is said to
be valid if its content matches the elements, attributes, and
other pieces of an associated document type declaration and
if the document complies with the constraints expressed in it.
• A valid XML document is one that is well formed and
whose elements and content conform to the rules of either
its Document Type Definition (DTD) or its schema.
• A DTD tells the XML parser which elements and attributes
to expect in the document as well as the order and nesting
of those elements.
• A DTD can be defined within an XML document or within an
external file.
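
As an illustration of a DTD defined within the document itself, here is a small sketch in Python. The note document and its element names are hypothetical. Python's standard library parser is non-validating, so the code only confirms that the document (including its internal DTD) is well formed; the order check at the end is a hand-written stand-in for what a validating parser would do with the first ELEMENT declaration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with an internal DTD (the part inside [ ... ]).
DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE note [
  <!ELEMENT note (to, from, body)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT from (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
]>
<note>
  <to>Ann</to>
  <from>Bob</from>
  <body>Lunch at noon?</body>
</note>"""

# ElementTree checks well-formedness only; it does not validate against the DTD.
root = ET.fromstring(DOCUMENT)

# Manual stand-in for the DTD rule <!ELEMENT note (to, from, body)>.
child_order = [child.tag for child in root]
matches_dtd_order = child_order == ["to", "from", "body"]
print(child_order, matches_dtd_order)
```

Real validation needs a validating parser, which is outside the scope of this sketch.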

XSLT
Extensible Stylesheet Language Transformations

XSLT is an XML-based programming language that is used for
transforming XML into other document formats.

XSLT
Another usage

XSLT is also used on the server side and within JavaScript

XSLT
Example XSLT document that converts the XML from Listing 17.1 into an HTML list

XSLT
An XML parser is still needed to perform the actual transformation

XPath
Another XML Technology

• XPath is a standardized syntax for searching an XML
document and for navigating to elements within it.
• XPath is typically used as part of the programmatic
manipulation of an XML document in PHP and other
languages.
• XPath uses a syntax that is similar to the one used in
most operating systems to access directories.
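
The directory-like syntax can be sketched in Python (used here instead of PHP). This assumes the standard library's xml.etree.ElementTree, which supports only a limited subset of XPath; the catalog document and its element names are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog used only for this example.
CATALOG = """<catalog>
  <book id="b1" lang="en"><title>XML Basics</title><price>20</price></book>
  <book id="b2" lang="fr"><title>XPath en bref</title><price>15</price></book>
</catalog>"""

root = ET.fromstring(CATALOG)

# Path steps are separated by slashes, much like directories.
all_titles = [t.text for t in root.findall("./book/title")]

# A predicate in square brackets filters on an attribute value.
english_titles = [t.text for t in root.findall("./book[@lang='en']/title")]

# A double slash matches descendants at any depth.
all_prices = [p.text for p in root.findall(".//price")]

# A numeric predicate selects by position (counting from 1).
second_book_id = root.find("./book[2]").get("id")

print(all_titles, english_titles, all_prices, second_book_id)
```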

XPath
Learn through example

XML Basics
• XML tags identify the data and are used to store and
organize it; unlike HTML tags, they do not specify how to
display it.
• XML characteristics:
• XML is extensible: XML allows you to create your own
self-descriptive tags, or language, that suits your
application.
• XML carries the data, does not present it: XML allows
you to store the data irrespective of how it will be
presented.
• XML is a public standard: XML was developed by the World
Wide Web Consortium (W3C) and is available as an
open standard.

XML Usage
List of XML uses

XML can:
• Work behind the scenes to simplify the creation of HTML
for large web sites.
• Be used to exchange information between
organizations and systems.
• Be used for offloading and reloading databases.
• Be used to store and arrange data.
• Be easily merged with style sheets to create almost
any desired output.

What is a Markup?

• Strictly speaking, XML is less a single markup language
than a set of rules for building markup languages.
• XML is not a programming language, but it:
• does have a defined syntax
• can be processed by special programs (parsers)

XML Syntax
<?xml version="1.0"?>                      <!-- Declaration -->
<contact-info>                             <!-- Root element -->
  <person>                                 <!-- Elements -->
    <name lang="en">Tanmay Patil</name>
    <company>TutorialsPoint</company>
    <phone>(011) 123-4567</phone>
  </person>
  <person>
    <name lang="ar">Ahmed Ali</name>
    <company>Batelco</company>
    <phone>(00973)17448888</phone>
  </person>
  <!-- This is a comment -->
</contact-info>
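
To show how an application might read such a document, here is a short Python sketch assuming the standard library's xml.etree.ElementTree. It embeds its own copy of the contact list, with the comment written in the standard form, and pulls out the names and their lang attributes. The variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

CONTACTS = """<?xml version="1.0"?>
<contact-info>
  <person>
    <name lang="en">Tanmay Patil</name>
    <company>TutorialsPoint</company>
    <phone>(011) 123-4567</phone>
  </person>
  <person>
    <name lang="ar">Ahmed Ali</name>
    <company>Batelco</company>
    <phone>(00973)17448888</phone>
  </person>
  <!-- This is a comment -->
</contact-info>"""

root = ET.fromstring(CONTACTS)       # the root element, <contact-info>
people = root.findall("person")      # its <person> children

# Element text and attribute values are available as plain strings.
names = [person.findtext("name") for person in people]
languages = [person.find("name").get("lang") for person in people]

for name, lang in zip(names, languages):
    print(f"{name} ({lang})")
```

By default this parser drops comments, so only the two person elements appear as children of the root.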

XML Syntax Rules

• XML Declaration
• Tags and Elements
• Attributes
• References
• Text

Declaration

<?xml version="1.0"?>
• Optional, but must be the first statement in the document if used.
• Case sensitive: it must begin with lowercase <?xml
• Attributes include:
• version: required whenever the declaration is present; in practice always 1.0
• encoding: optional; default is UTF-8
• standalone: optional; yes or no (default is no)
It informs the parser whether the document relies
on information from an external source, such as an
external document type definition, for its content.
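
The declaration's attributes can be inspected programmatically. The sketch below is in Python and assumes the standard library's expat bindings, whose XmlDeclHandler callback receives the version, encoding, and standalone values (standalone arrives as 1 for yes, 0 for no, and -1 when omitted). The helper names are invented for this example.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat


def read_declaration(document: bytes) -> dict:
    """Collect the attributes of the XML declaration, if there is one."""
    found = {}

    def on_declaration(version, encoding, standalone):
        # expat reports standalone as 1 (yes), 0 (no) or -1 (omitted).
        found.update(version=version, encoding=encoding, standalone=standalone)

    parser = expat.ParserCreate()
    parser.XmlDeclHandler = on_declaration
    parser.Parse(document, True)
    return found


def declaration_must_come_first() -> bool:
    """Return True if a declaration preceded by a space is rejected."""
    try:
        ET.fromstring(' <?xml version="1.0"?><a/>')
        return False
    except ET.ParseError:
        return True


full = read_declaration(b'<?xml version="1.0" encoding="UTF-8" standalone="yes"?><a/>')
minimal = read_declaration(b'<?xml version="1.0"?><a/>')
absent = read_declaration(b"<a/>")

print(full)
print(minimal["version"], minimal["standalone"])
print(absent, declaration_must_come_first())
```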

Tags
Syntax Rules

• Tag names are enclosed in angle brackets:
<element> .. </element> or, for empty elements, the
self-closing tag <element />
• Nesting of elements: an element can contain multiple XML
elements as its children, but the child elements must
not overlap. XML tags must be closed in order, i.e., an XML
tag opened inside another element must be closed
before the outer element is closed.
• Only one root element.
• Case sensitive: <contact-info> ≠ <Contact-Info>

Attributes

• An attribute gives more information about an XML
element or, more precisely, defines a property of the
element.
• An XML attribute is always a name-value pair.
• The syntax for attributes is name="value"
• You can have multiple attributes for each element.
• There are three types of attributes:
• StringType
• TokenizedType
• EnumeratedType
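
As a brief illustration of attributes as name-value pairs, the following Python sketch (assuming the standard library's xml.etree.ElementTree) reads and sets attributes on a single element. The preferred, nickname, and verified attribute names are hypothetical.

```python
import xml.etree.ElementTree as ET

# One element carrying two attributes; "preferred" is a made-up attribute.
element = ET.fromstring('<name lang="en" preferred="true">Tanmay Patil</name>')

# All attributes are exposed as a mapping of names to string values.
attributes = dict(element.attrib)

# Individual values can be read, with an optional default for missing names.
lang = element.get("lang")
nickname = element.get("nickname", "n/a")

# Attributes can also be added before the element is written back out.
element.set("verified", "yes")
serialized = ET.tostring(element, encoding="unicode")

print(attributes, lang, nickname)
print(serialized)
```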

Attributes Rules

• An attribute name must not appear more than
once in the same start-tag or empty-element tag.
• The attribute must have been declared; the value
must be of the type declared for it. (This is a validity
constraint, so it applies only when validating against a DTD.)
• Attribute values must not contain direct or indirect
entity references to external entities.
• The replacement text of any entity referred to
directly or indirectly in an attribute value must not
contain a less-than sign (<).
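
Some of these rules can be observed with a non-validating parser. This is a small Python sketch assuming the standard library's xml.etree.ElementTree; the helper name parses and the sample tags are invented. It shows a repeated attribute name and a literal less-than sign in a value being rejected, while the escaped form &lt; is accepted. Note that this parser accepts a literal greater-than sign in an attribute value.

```python
import xml.etree.ElementTree as ET


def parses(text: str) -> bool:
    """Return True if the text is accepted by the parser."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# The same attribute name twice in one tag is an error.
duplicate_ok = parses('<p id="1" id="2"/>')

# A literal less-than sign inside an attribute value is an error.
raw_less_than_ok = parses('<p note="a < b"/>')

# The escaped form is fine and is turned back into the character on reading.
escaped_value = ET.fromstring('<p note="a &lt; b"/>').get("note")

# A literal greater-than sign is accepted by this parser.
raw_greater_than_ok = parses('<p note="a > b"/>')

print(duplicate_ok, raw_less_than_ok, escaped_value, raw_greater_than_ok)
```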

Character Entities

• Entities are placeholders in XML.
• There are three types of character entities:
1. Predefined character entities: used to avoid ambiguity when
writing symbols that have a special meaning in markup
(&lt; &gt; &amp; &quot; &apos;).
2. Numbered character entities: a character can be referred to
by a numeric reference, written in either decimal (&#193;) or
hexadecimal (&#xC1;) form.
3. Named character entities: used for special characters, such
as A-acute (Á). Unlike in HTML, such names must be declared in
a DTD before they can be used in XML.
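
The three kinds of reference can be compared side by side. The sketch below is in Python and assumes the standard library's xml.etree.ElementTree; the element name t and the entity name Aacute (declared here in a small internal DTD) are chosen only for the example.

```python
import xml.etree.ElementTree as ET

# 1. Predefined entities stand in for characters that would look like markup.
predefined = ET.fromstring("<t>5 &lt; 7 &amp; 7 &gt; 5</t>").text

# 2. Numeric references name a character by its code point.
decimal = ET.fromstring("<t>&#193;</t>").text
hexadecimal = ET.fromstring("<t>&#xC1;</t>").text

# 3. A named entity works once it has been declared in a DTD.
declared = ET.fromstring(
    '<!DOCTYPE t [<!ENTITY Aacute "&#193;">]><t>&Aacute;</t>'
).text

# Without a declaration, the same name is rejected.
try:
    ET.fromstring("<t>&Aacute;</t>")
    undeclared_rejected = False
except ET.ParseError:
    undeclared_rejected = True

print(predefined, decimal, hexadecimal, declared, undeclared_rejected)
```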

Character Data (CDATA) & White Space
• A CDATA section escapes a block of text so that the parser
does not parse it; without the section, the text would be
recognized as markup.
<![CDATA[                                         CDATA Start
<message>Welcome to TutorialsPoint</message>
]]>                                               CDATA End

• Whitespace in XML is either significant or insignificant:
• Significant whitespace occurs within elements that contain
text and markup mixed together.
• Insignificant whitespace occurs where only element
content is allowed.
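
Both ideas can be observed from a parser's point of view. This is a minimal Python sketch assuming the standard library's xml.etree.ElementTree; the wrapper element names (script, p, list, item) are invented for the example.

```python
import xml.etree.ElementTree as ET

# Text inside a CDATA section is delivered as plain character data.
wrapper = ET.fromstring(
    "<script><![CDATA[<message>Welcome to TutorialsPoint</message>]]></script>"
)
cdata_text = wrapper.text        # the tags arrive as ordinary text
child_count = len(wrapper)       # no child elements were created

# Mixed content: the spaces around <b> are part of the text.
mixed = ET.fromstring("<p>Hello <b>big</b> world</p>")
before_child = mixed.text
after_child = mixed[0].tail

# Element-only content: the parser still reports the whitespace,
# and it is up to the application to ignore it.
container = ET.fromstring("<list>\n  <item/>\n</list>")
indentation = container.text

print(repr(cdata_text), child_count)
print(repr(before_child), repr(after_child), repr(indentation))
```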
Processing
Processing Instructions (PI)

• Processing instructions (PIs) pass information to
applications; their content is not subject to most
XML rules.
• PIs can appear anywhere in the document outside
of other markup.
• PIs are rarely used. Their best-known use is
linking an XML document to a stylesheet.
<?xml-stylesheet
href="tutorialspointstyle.css"
type="text/css"?>
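
An application can read a processing instruction as a target plus a data string. The following Python sketch assumes the standard library's xml.dom.minidom; the empty doc root element is added only to make the example a complete document.

```python
from xml.dom import minidom

DOCUMENT = (
    '<?xml version="1.0"?>\n'
    '<?xml-stylesheet href="tutorialspointstyle.css" type="text/css"?>\n'
    "<doc/>"
)

dom = minidom.parseString(DOCUMENT)

# Processing instructions appear as their own node type in the DOM.
instructions = [
    (node.target, node.data)
    for node in dom.childNodes
    if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
]

for target, data in instructions:
    print(target, "->", data)
```

The data part is an uninterpreted string; it is up to the receiving application to make sense of the href and type values.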

Encoding

• Encoding is the process of converting characters into
their equivalent binary representation.
• When an XML processor reads an XML document, it decodes
the bytes according to the document's encoding. Hence we
need to specify the encoding in the XML declaration.
• There are mainly two encodings in use: UTF-8
and UTF-16. UTF stands for UCS Transformation Format,
and UCS itself means Universal Character Set.
• UTF-8 is assumed as the default encoding when the declaration or
the encoding attribute is missing.
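
To make the role of the encoding attribute concrete, here is a short Python sketch assuming the standard library's xml.etree.ElementTree. It encodes the single character Á in two ways, then parses byte strings with and without a matching declaration. The element name t is arbitrary.

```python
import xml.etree.ElementTree as ET

# The same character has different byte representations per encoding.
utf8_bytes = "Á".encode("utf-8")         # two bytes
utf16_bytes = "Á".encode("utf-16-be")    # two bytes, different values

# Bytes in ISO-8859-1 are read correctly when the declaration says so.
declared_doc = '<?xml version="1.0" encoding="ISO-8859-1"?><t>Á</t>'.encode("iso-8859-1")
declared_text = ET.fromstring(declared_doc).text

# With no declaration, UTF-8 is assumed, so UTF-8 bytes are read correctly.
default_text = ET.fromstring("<t>Á</t>".encode("utf-8")).text

# ISO-8859-1 bytes without a declaration are not valid UTF-8 and are rejected.
try:
    ET.fromstring("<t>Á</t>".encode("iso-8859-1"))
    undeclared_rejected = False
except ET.ParseError:
    undeclared_rejected = True

print(utf8_bytes, utf16_bytes, declared_text, default_text, undeclared_rejected)
```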
Validation

• Validation is the process of checking an XML document
against the rules declared for it.
• An XML document is said to be valid if its content
matches the elements, attributes, and other
pieces of an associated document type declaration and
if the document complies with the constraints
expressed in it.
• An XML parser checks documents at two levels:
• Well-formed XML document
• Valid XML document

Valid XML document

• If an XML document is well-formed and conforms to its
associated Document Type Definition (DTD), then it is said
to be a valid XML document.
• The main drawback of DTDs is that they mainly
validate the existence and ordering of elements. They
provide almost no way to validate the values of attributes
or the textual content of elements.
• For this type of validation, one must instead use XML schemas,
which have the added advantage of using XML syntax.
• Unfortunately, schemas have the corresponding disadvantage of
being long-winded and harder for humans to read and
comprehend; for this reason, they are typically created with
tools.

Document Type Definition
Example

XML Schema
Just one example

XML Viewers and Editors

• An XML document can be viewed using a simple text
editor or any browser.
• Most major browsers support XML.
• XML files are saved with a ".xml" extension.

• An XML editor is a markup language editor. XML documents
can also be created or edited using existing editors such as
Notepad, WordPad, or any simple text editor.
• Most IDEs support XML editing and validation.

XML Parser

• An XML parser is a software library or package that
provides methods for client applications to work
with XML documents. It checks for proper format of
the XML document and may also validate it.
• An XML parser:
• Verifies that an XML document is well formed.
• Checks the XML document for syntax errors.
• Converts the XML document into some type of internal
memory structure.
• All contemporary browsers have built-in parsers, as do
most web development environments such as PHP and
ASP.NET.
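
The syntax-checking role of a parser can be sketched in Python, assuming the standard library's xml.etree.ElementTree, whose ParseError carries the position of the problem as a (line, column) pair. The helper name check and the two sample documents are invented for illustration.

```python
import xml.etree.ElementTree as ET


def check(text: str):
    """Return (root, None) on success, or (None, (line, column)) on a syntax error."""
    try:
        return ET.fromstring(text), None
    except ET.ParseError as error:
        return None, error.position


# A well-formed document yields an in-memory tree.
good_root, good_error = check("<a>\n<b/>\n</a>")

# Here <b> is never closed, so the closing </a> on line 3 is reported.
bad_root, bad_error = check("<a>\n<b>\n</a>")

print(good_root.tag, good_error)
print(bad_root, "error on line", bad_error[0])
```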

XML Processor
• When a software program reads an XML document and
does something with it, this is called processing the
XML.
• Therefore, any program that can read and process
XML documents is known as an XML processor.
• An XML processor reads an XML file and turns it into
in-memory structures that the rest of the program can use
as it sees fit.
• The most fundamental XML processor reads XML
documents and converts them into an internal
representation for other programs or subroutines to use.
• This is called a parser, and it is an important component of
every XML processing program.
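
As a rough sketch of the parse-then-process idea, the Python example below (assuming the standard library's xml.etree.ElementTree and json modules) lets the parser build the in-memory tree and then has a small, hypothetical processor convert that tree into plain dictionaries and JSON. The order document and the element_to_dict helper are made up for the example.

```python
import json
import xml.etree.ElementTree as ET


def element_to_dict(element):
    """Convert an element into a dict of its tag, attributes, text and children."""
    return {
        "tag": element.tag,
        "attributes": dict(element.attrib),
        "text": (element.text or "").strip(),
        "children": [element_to_dict(child) for child in element],
    }


# Hypothetical input document.
ORDER = '<order id="42"><item sku="A1">Pen</item><item sku="B2">Ink</item></order>'

# Step 1: the parser turns the text into an in-memory tree.
tree_root = ET.fromstring(ORDER)

# Step 2: the rest of the program does what it likes with that structure.
structure = element_to_dict(tree_root)
as_json = json.dumps(structure)

print(as_json)
```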
