XML
Extensible Markup Language
XML (Extensible Markup Language) is a markup language used to store and transport data in a
structured, human-readable, and machine-readable format.
Key Features of XML
1. Extensible – You can create your own custom tags.
2. Self-descriptive – Data is described by meaningful tags.
3. Platform-independent – Works across all systems and programming languages.
4. Data-focused – Unlike HTML (which is for displaying data), XML is for storing and transferring
data.
5. Hierarchical structure – Data is stored in a tree-like structure.
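The features above can be illustrated with a minimal sketch, assuming Python's standard xml.etree.ElementTree module. The tag names (Record, id, name, age, address) are simply the ones used later in these notes; any other names would work equally well, which is the point of extensibility.

```python
import xml.etree.ElementTree as ET

# Custom tags: the names are chosen by the author, not taken from a fixed set.
record = ET.Element("Record")

# Hierarchical structure: each field becomes a child node of <Record>.
for tag, value in [("id", "101"), ("name", "ABC"), ("age", "22"), ("address", "kanpur")]:
    child = ET.SubElement(record, tag)
    child.text = value

# Serialize the tree to text so it can be stored or sent to another system.
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```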
Difference Between XML and HTML
Aspect           | XML (Extensible Markup Language)                 | HTML (HyperText Markup Language)
Purpose          | Used to store and transport data                 | Used to display data in web pages
Tags             | User-defined tags (custom)                       | Predefined tags (like <p>, <h1>, <table>)
Structure        | Strict – must be well-formed and properly nested | Flexible – browsers can adjust even if code is not well-formed
Case Sensitivity | Tags are case-sensitive (<Name> ≠ <name>)        | Tags are not case-sensitive (<P> = <p>)
Data vs Display  | Focuses on data storage & transfer               | Focuses on data presentation
End Tags         | All tags must be closed                          | Some tags can be left open (e.g., <br>)
Error Handling   | Errors stop parsing                              | Errors are often ignored by browsers
Extensibility    | Highly extensible (create your own tags)         | Not extensible (fixed set of tags)
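The strictness of XML can be checked with a small sketch, assuming Python's standard xml.etree.ElementTree parser. The helper name is_well_formed is hypothetical; it only reports whether the parser accepts the text.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the XML parser accepts the text, False on a parse error."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

print(is_well_formed("<Name>ABC</Name>"))   # True: tags match exactly
print(is_well_formed("<Name>ABC</name>"))   # False: tag names are case-sensitive
print(is_well_formed("<p>line<br></p>"))    # False: <br> is never closed
print(is_well_formed("<p>line<br/></p>"))   # True: <br/> closes itself
```

An HTML browser would render the second and third inputs anyway; an XML parser stops at the first error.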
XML validation- Validation checks the structure of an XML document, that is, whether the data in the
document matches its declared structure. It is implemented through two mechanisms:
1. XML DTD
2. XML Schema
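For comparison, a relational table definition plays a similar role: it fixes the fields and their types,
and every inserted row must supply values in the declared order.
CREATE TABLE record (id NUMBER(3), name VARCHAR(10), age NUMBER(3), address VARCHAR(50));
INSERT INTO record VALUES (101, 'ABC', 23, 'kanpur');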
XML DTD (Document Type Definition) defines the structure and rules for an XML document. It ensures
that the XML data is valid and follows a predefined format.
Purpose of DTD
● Specifies which elements and attributes can appear in a document.
● Defines the order and nesting of elements.
● Validates XML to ensure data consistency across systems.
Types
Internal DTD
External DTD
Internal DTD- Here we define the structure of the XML document inside the XML file itself, within the DOCTYPE declaration.
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Record [
<!ELEMENT Record (id, name, age, address)>
<!ELEMENT id (#PCDATA)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT age (#PCDATA)>
<!ELEMENT address (#PCDATA)>
]>
<Record>
<id>101</id>
<name>ABC</name>
<age>22</age>
<address>kanpur</address>
</Record>
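What the content model `Record (id, name, age, address)` enforces can be approximated by hand. This is only a sketch, not a DTD validator: Python's xml.etree.ElementTree parses the document but does not validate it against the DTD, so the names EXPECTED_CHILDREN and matches_record_model are hypothetical stand-ins for the rule.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for <!ELEMENT Record (id, name, age, address)>.
EXPECTED_CHILDREN = ["id", "name", "age", "address"]

def matches_record_model(xml_text):
    """Check root name, child order, and that each child holds only text (#PCDATA)."""
    root = ET.fromstring(xml_text)
    right_order = [child.tag for child in root] == EXPECTED_CHILDREN
    text_only = all(len(child) == 0 for child in root)
    return root.tag == "Record" and right_order and text_only

valid = "<Record><id>101</id><name>ABC</name><age>22</age><address>kanpur</address></Record>"
wrong_order = "<Record><name>ABC</name><id>101</id><age>22</age><address>kanpur</address></Record>"

print(matches_record_model(valid))        # True
print(matches_record_model(wrong_order))  # False: id must come before name
```

Both inputs are well-formed XML; only the first would be valid against the DTD above.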
External DTD- Here we define the DTD in a separate file, conventionally with the extension .dtd, and reference it from the XML file with a SYSTEM identifier.
Record.dtd
<?xml version="1.0" encoding="UTF-8"?>
<!ELEMENT Record (id, name, age, address)>
<!ELEMENT id (#PCDATA)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT age (#PCDATA)>
<!ELEMENT address (#PCDATA)>
Record.xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Record SYSTEM "Record.dtd">
<Record>
<id>101</id>
<name>ABC</name>
<age>22</age>
<address>kanpur</address>
</Record>
XML Schema:
An XML Schema defines the structure, content, and data types of XML documents. It is written in
XML syntax itself, making it more powerful and flexible than DTD.
Record.xsd
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Record">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="id" type="xs:integer"/>
        <xs:element name="name" type="xs:string"/>
        <xs:element name="age" type="xs:integer"/>
        <xs:element name="address" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
XML Document
<?xml version="1.0" encoding="UTF-8"?>
<Record xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:noNamespaceSchemaLocation="Record.xsd">
  <id>101</id>
  <name>ABC</name>
  <age>22</age>
  <address>kanpur</address>
</Record>
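The main thing a schema adds over a DTD is data types. The sketch below mimics that idea by hand, assuming Python's standard library, which has no XSD validator. FIELD_TYPES, typed_record, and conforms are hypothetical names, and int() is only a rough approximation of xs:integer.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping that mirrors the types declared for each element in the schema.
FIELD_TYPES = {"id": int, "name": str, "age": int, "address": str}

def typed_record(xml_text):
    """Convert each child of <Record> to its declared type."""
    root = ET.fromstring(xml_text)
    result = {}
    for child in root:
        convert = FIELD_TYPES[child.tag]            # KeyError for an undeclared element
        result[child.tag] = convert(child.text or "")  # int() raises ValueError on bad input
    return result

def conforms(xml_text):
    """Return True if every value converts to its declared type."""
    try:
        typed_record(xml_text)
        return True
    except (ValueError, KeyError):
        return False

valid = "<Record><id>101</id><name>ABC</name><age>22</age><address>kanpur</address></Record>"
bad_age = "<Record><id>101</id><name>ABC</name><age>twenty</age><address>kanpur</address></Record>"

print(typed_record(valid))
print(conforms(bad_age))  # False: "twenty" is not an integer
```

A DTD would accept both documents, because #PCDATA allows any text; a typed check rejects the second one.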
XML PARSERS : XML Parsing means reading XML data and converting it into a usable form (so that
programs can process it).
● An XML parser is a software library or package that provides interfaces for client applications to
work with an XML document.
● The XML Parser is designed to read the XML and create a way for programs to use XML.
● An XML parser checks that the document is well-formed; a validating parser additionally checks it against a DTD or schema.
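As a minimal sketch of "converting XML into a usable form", the following assumes Python's xml.etree.ElementTree as the parser. The wrapping <Records> element and the second record are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: two records wrapped in a <Records> root element.
xml_text = """
<Records>
  <Record><id>101</id><name>ABC</name></Record>
  <Record><id>102</id><name>XYZ</name></Record>
</Records>
"""

# The parser turns the text into objects the program can query.
root = ET.fromstring(xml_text)
names = [rec.findtext("name") for rec in root.findall("Record")]
print(names)  # ['ABC', 'XYZ']
```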
Types of XML Parsers
These are the two main types of XML Parsers:
1. DOM
2. SAX
DOM (DOCUMENT OBJECT MODEL)
● A DOM document is an object which contains all the information of an XML document.
● The DOM Parser implements a DOM API. This API is very simple to use.
● A DOM Parser creates an internal structure in memory which is a DOM document object, and the client
applications get information of the original XML document by invoking methods on this document object.
● DOM Parser has a tree-based structure.
XML DOM – Nodes
● According to the XML DOM, everything in an XML document is a node:
○ The entire document is a document node
○ Every XML element is an element node
○ The text inside an XML element is a text node
○ Every attribute is an attribute node
○ Comments are comment nodes
Accessing Nodes
By using the getElementsByTagName() method
ex: x.getElementsByTagName("title");
Get the Value of an Element
xmlDoc.getElementsByTagName("title")[0].childNodes[0].nodeValue;
Add a Node – appendChild()
newEle = xmlDoc.createElement("edition");
xmlDoc.getElementsByTagName("book")[0].appendChild(newEle);
Remove an Element Node – removeChild()
y = xmlDoc.getElementsByTagName("book")[0];
xmlDoc.documentElement.removeChild(y);
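The same four DOM operations can be tried outside a browser. This is a sketch assuming Python's standard xml.dom.minidom module, which exposes the same method names; the small <bookstore> document is made up for illustration.

```python
from xml.dom import minidom

# Hypothetical input document with one <book> under the root <bookstore>.
xml_doc = minidom.parseString(
    "<bookstore><book><title>XML Basics</title></book></bookstore>"
)

# Access a node by tag name (the result is a list of matching elements).
title = xml_doc.getElementsByTagName("title")[0]

# Get the value: the text lives in a child text node, and nodeValue holds the string.
title_text = title.childNodes[0].nodeValue

# Add a node: create an empty <edition> element and append it to the first <book>.
book = xml_doc.getElementsByTagName("book")[0]
new_ele = xml_doc.createElement("edition")
book.appendChild(new_ele)
after_append = book.toxml()

# Remove a node: detach <book> from the root element.
xml_doc.documentElement.removeChild(book)
books_left = len(xml_doc.getElementsByTagName("book"))

print(title_text)
print(after_append)
print(books_left)
```

Note that removeChild must be called on the parent of the node being removed; here <book> is a direct child of the root element.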
SAX (Simple API for XML)
● SAX (Simple API for XML) is an event-based parser for XML documents.
● Unlike a DOM parser, a SAX parser does not create a parse tree.
● A parser that implements SAX (i.e., a SAX Parser) functions as a stream parser, with an event-driven
API.
● The user defines a number of callback methods that will be called when events occur during parsing.
SAX Events include (among others):
● XML Text nodes
● XML Element Starts and Ends
● XML Processing Instructions
● XML Comments
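The callback style can be sketched with Python's standard xml.sax module. The handler class name RecordHandler and the events list are assumptions made for illustration; the three overridden methods are part of xml.sax.ContentHandler.

```python
import xml.sax

class RecordHandler(xml.sax.ContentHandler):
    """Collects the events the parser reports, in the order they occur."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        # Called when an opening tag is read.
        self.events.append(("start", name))

    def characters(self, content):
        # Called for text; a parser may deliver one text node in several chunks.
        if content.strip():
            self.events.append(("text", content.strip()))

    def endElement(self, name):
        # Called when a closing tag is read.
        self.events.append(("end", name))

handler = RecordHandler()
xml.sax.parseString(b"<Record><id>101</id><name>ABC</name></Record>", handler)

for event in handler.events:
    print(event)
```

No tree is built: once an event has been handled, the parser keeps nothing about it, so the application must store whatever it needs.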
Difference Between DOM and SAX
Sr. No | DOM (Document Object Model)                       | SAX (Simple API for XML)
1      | DOM = Document Object Model                       | SAX = Simple API for XML
2      | DOM parser is a tree-based parser                 | SAX is an event-based XML parser
3      | DOM parser loads the whole XML document in memory at once | SAX holds only a small part of the XML file in memory at a time
4      | Faster for repeated or random access after parsing, because the whole XML document is already in memory | Typically faster for a single sequential pass, but offers no random access because it does not store the XML document in memory
5      | Requires more memory                              | Does not require much memory