What is HTML?
HTML is a language for describing web pages.
HTML stands for Hyper Text Markup Language
HTML is not a programming language; it is a markup language
A markup language is a set of markup tags
HTML uses markup tags to describe web pages
HTML Tags
HTML markup tags are usually called HTML tags
HTML tags are keywords surrounded by angle brackets like <html>
HTML tags normally come in pairs like <b> and </b>
The first tag in a pair is the start tag, the second tag is the end tag
Start and end tags are also called opening tags and closing tags
HTML Documents = Web Pages
HTML documents describe web pages
HTML documents contain HTML tags and plain text
HTML documents are also called web pages
The purpose of a web browser (like Internet Explorer or Firefox) is to read HTML documents and display them as web
pages. The browser does not display the HTML tags, but uses the tags to interpret the content of the page:
<html>
<body>
<h1>My First Heading</h1>
<p>My first paragraph.</p>
</body>
</html>
Example Explained
The text between <html> and </html> describes the web page
The text between <body> and </body> is the visible page content
The text between <h1> and </h1> is displayed as a heading
The text between <p> and </p> is displayed as a paragraph
What You Need
You don't need any tools to learn HTML at W3Schools.
You don't need an HTML editor
You don't need a web server
You don't need a web site
Editing HTML
HTML can be written and edited using many different editors like Dreamweaver and Visual Studio.
However, in this tutorial we use a plain text editor (like Notepad) to edit HTML. We believe using a plain text editor is
the best way to learn HTML.
HTML Headings
HTML headings are defined with the <h1> to <h6> tags.
Example
<h1>This is a heading</h1>
<h2>This is a heading</h2>
<h3>This is a heading</h3>
HTML Paragraphs
HTML paragraphs are defined with the <p> tag.
Example
<p>This is a paragraph.</p>
<p>This is another paragraph.</p>
HTML Links
HTML links are defined with the <a> tag.
Example
<a href="http://www.w3schools.com">This is a link</a>
HTML Images
HTML images are defined with the <img> tag.
Example
<img src="w3schools.jpg" width="104" height="142" />
HTML Elements
An HTML element is everything from the start tag to the end tag:
Start tag *               Element content        End tag *
<p>                       This is a paragraph    </p>
<a href="default.htm">    This is a link         </a>
<br />
* The start tag is often called the opening tag. The end tag is often called the closing tag.
HTML Element Syntax
An HTML element starts with a start tag / opening tag
An HTML element ends with an end tag / closing tag
The element content is everything between the start and the end tag
Some HTML elements have empty content
Empty elements are closed in the start tag
Most HTML elements can have attributes
Tip: You will learn about attributes in the next chapter of this tutorial.
Nested HTML Elements
Most HTML elements can be nested (can contain other HTML elements).
HTML documents consist of nested HTML elements.
HTML Document Example
<html>
<body>
<p>This is my first paragraph.</p>
</body>
</html>
The example above contains 3 HTML elements.
HTML Example Explained
The <p> element:
<p>This is my first paragraph.</p>
The <p> element defines a paragraph in the HTML document.
The element has a start tag <p> and an end tag </p>.
The element content is: This is my first paragraph.
The <body> element:
<body>
<p>This is my first paragraph.</p>
</body>
The <body> element defines the body of the HTML document.
The element has a start tag <body> and an end tag </body>.
The element content is another HTML element (a p element).
The <html> element:
<html>
<body>
<p>This is my first paragraph.</p>
</body>
</html>
The <html> element defines the whole HTML document.
The element has a start tag <html> and an end tag </html>.
The element content is another HTML element (the body element).
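The nesting described above can also be inspected with a program. The following is a minimal sketch using the HTMLParser class from Python's standard library; the class name NestingPrinter and the helper outline_of are made up for this illustration, and the sketch assumes every start tag has a matching end tag (it would miscount empty elements such as <br>).

```python
from html.parser import HTMLParser


class NestingPrinter(HTMLParser):
    """Collects each start tag together with its nesting depth."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.outline = []  # list of (depth, tag) pairs

    def handle_starttag(self, tag, attrs):
        # Record the tag at the current depth, then go one level deeper.
        self.outline.append((self.depth, tag))
        self.depth += 1

    def handle_endtag(self, tag):
        # An end tag closes the innermost open element.
        self.depth -= 1


def outline_of(html_text):
    parser = NestingPrinter()
    parser.feed(html_text)
    parser.close()
    return parser.outline


document = """<html>
<body>
<p>This is my first paragraph.</p>
</body>
</html>"""

# Print one tag per line, indented by its nesting depth.
for depth, tag in outline_of(document):
    print("  " * depth + tag)
```

The indentation of the printed tags mirrors the structure explained above: p sits inside body, and body sits inside html.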
Don't Forget the End Tag
Some HTML elements might display correctly even if you forget the end tag:
<p>This is a paragraph
<p>This is a paragraph
The example above works in most browsers, because the closing tag is considered optional.
Never rely on this. Many HTML elements will produce unexpected results and/or errors if you forget the end tag.
Empty HTML Elements
HTML elements with no content are called empty elements.
<br> is an empty element without a closing tag (the <br> tag defines a line break).
Tip: In XHTML, all elements must be closed. Adding a slash inside the start tag, like <br />, is the proper way of
closing empty elements in XHTML (and XML).
HTML Attributes
HTML elements can have attributes
Attributes provide additional information about an element
Attributes are always specified in the start tag
Attributes come in name/value pairs like: name="value"
Attribute Example
HTML links are defined with the <a> tag. The link address is specified in the href attribute:
Example
<a href="http://www.w3schools.com">This is a link</a>
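To see the name/value pairs as a program sees them, here is a small sketch based on Python's standard HTMLParser class. The names AttributeCollector and collect_attributes are invented for this example. The parser reports tag names and attribute names in lowercase and strips the quotes from the value.

```python
from html.parser import HTMLParser


class AttributeCollector(HTMLParser):
    """Remembers every start tag and the attributes written inside it."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs taken from the start tag.
        self.found.append((tag, dict(attrs)))


def collect_attributes(html_text):
    collector = AttributeCollector()
    collector.feed(html_text)
    collector.close()
    return collector.found


link = '<a href="http://www.w3schools.com">This is a link</a>'
print(collect_attributes(link))
```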
Always Quote Attribute Values
Attribute values should always be enclosed in quotes.
Double quotes are the most common, but single quotes are also allowed.
HTML Tip: Use Lowercase Attributes
Attribute names are case-insensitive in HTML. (Some attribute values, such as class and id values, are case-sensitive.)
However, the World Wide Web Consortium (W3C) recommends lowercase attributes/attribute values in their HTML 4
recommendation.
XHTML demands lowercase attribute names.
HTML Attributes Reference
A complete list of legal attributes for each HTML element is given in the W3Schools Complete HTML Reference.
Below is a list of some attributes that are standard for most HTML elements:
Attribute    Value               Description
class        classname           Specifies a classname for an element
id           id                  Specifies a unique id for an element
style        style_definition    Specifies an inline style for an element
title        tooltip_text        Specifies extra information about an element (displayed as a tool tip)
HTML Tag Reference
W3Schools' tag reference contains additional information about these tags and their attributes.
You will learn more about HTML tags and attributes in the next chapters of this tutorial.
Tag             Description
<html>          Defines an HTML document
<body>          Defines the document's body
<h1> to <h6>    Defines HTML headings
<hr />          Defines a horizontal line
<!--...-->      Defines a comment
HTML Line Breaks
Use the <br /> tag if you want a line break (a new line) without starting a new paragraph:
Example
<p>This is<br />a para<br />graph with line breaks</p>
The <br /> element is an empty HTML element. It has no end tag.
<br> or <br />
In XHTML and XML, elements with no end tag (closing tag) are not allowed.
Even if <br> works in all browsers, writing <br /> instead works better in XHTML and XML applications.
HTML Output - Useful Tips
You cannot be sure how HTML will be displayed. Large or small screens, and resized windows will create different
results.
With HTML, you cannot change the output by adding extra spaces or extra lines in your HTML code.
The browser will remove extra spaces and extra lines when the page is displayed. Any number of lines count as one
line, and any number of spaces count as one space.
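The white-space rule above can be approximated in a few lines of Python. This is only a rough sketch of the behavior, not how a browser is implemented: the function name collapse_whitespace is made up, and the sketch ignores exceptions such as the <pre> element.

```python
def collapse_whitespace(text):
    """Approximate how a browser renders runs of spaces and line breaks."""
    # str.split() with no arguments splits on any run of white-space,
    # so joining the pieces with one space collapses every run.
    return " ".join(text.split())


source = """Hello,
   how     are


you?"""
print(collapse_whitespace(source))
```

However many spaces or blank lines the source contains, the words end up separated by a single space.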
HTML Text Formatting
This text is bold
This text is big
This text is italic
This is computer output
This is subscript and superscript
HTML Text Formatting Tags
Tag         Description
<b>         Defines bold text
<big>       Defines big text
<em>        Defines emphasized text
<i>         Defines italic text
<small>     Defines small text
<strong>    Defines strong text
<sub>       Defines subscripted text
<sup>       Defines superscripted text
<ins>       Defines inserted text
<del>       Defines deleted text
HTML "Computer Output" Tags
Tag       Description
<code>    Defines computer code text
<kbd>     Defines keyboard text
<samp>    Defines sample computer code
<tt>      Defines teletype text
<var>     Defines a variable
<pre>     Defines preformatted text
HTML Citations, Quotations, and Definition Tags
Tag             Description
<abbr>          Defines an abbreviation
<acronym>       Defines an acronym
<address>       Defines contact information for the author/owner of a document
<bdo>           Defines the text direction
<blockquote>    Defines a long quotation
<q>             Defines a short quotation
<cite>          Defines a citation
<dfn>           Defines a definition term
The HTML <font> Tag Should NOT be Used
The <font> tag is deprecated in HTML 4, and removed from HTML5.
The World Wide Web Consortium (W3C) has removed the <font> tag from its recommendations.
In HTML 4, style sheets (CSS) should be used to define the layout and display properties for many HTML elements.
The example below shows how the HTML could look by using the <font> tag:
Example
<p>
<font size="5" face="arial" color="red">
This paragraph is in Arial, size 5, and in red text color.
</font>
</p>
<p>
<font size="3" face="verdana" color="blue">
This paragraph is in Verdana, size 3, and in blue text color.
</font>
</p>
Styling HTML with CSS
CSS was introduced in 1996, shortly before HTML 4, to provide a better way to style HTML elements.
CSS can be added to HTML in the following ways:
in separate style sheet files (CSS files)
in the style element in the HTML head section
in the style attribute in single HTML elements
Using the HTML Style Attribute
It is time-consuming and not very practical to style HTML elements using the style attribute.
The preferred way to add CSS to HTML is to put CSS syntax in separate CSS files.
However, in this HTML tutorial we will introduce you to CSS using the style attribute. This is done to simplify the
examples. It also makes it easier for you to edit the code and try it yourself.
HTML Style Example - Background Color
The background-color property defines the background color for an element:
Example
<html>
<body style="background-color:yellow;">
<h2 style="background-color:red;">This is a heading</h2>
<p style="background-color:green;">This is a paragraph.</p>
</body>
</html>
HTML Style Example - Font, Color and Size
The font-family, color, and font-size properties define the font, color, and size of the text in an element:
Example
<html>
<body>
<h1 style="font-family:verdana;">A heading</h1>
<p style="font-family:arial;color:red;font-size:20px;">A paragraph.</p>
</body>
</html>
HTML Hyperlinks (Links)
A hyperlink (or link) is a word, group of words, or image that you can click on to jump to a new document or a new
section within the current document.
When you move the cursor over a link in a Web page, the arrow will turn into a little hand.
Links are specified in HTML using the <a> tag.
The <a> tag can be used in two ways:
1. To create a link to another document, by using the href attribute
2. To create a bookmark inside a document, by using the name attribute
HTML Link Syntax
The HTML code for a link is simple. It looks like this:
<a href="url">Link text</a>
The href attribute specifies the destination of a link.
Example
<a href="http://www.w3schools.com/">Visit W3Schools</a>
which will display like this: Visit W3Schools
Clicking on this hyperlink will send the user to W3Schools' homepage.
Tip: The "Link text" doesn't have to be text. It can be an image or any other HTML element.
HTML Links - The target Attribute
The target attribute specifies where to open the linked document.
The example below will open the linked document in a new browser window or a new tab:
Example
<a href="http://www.w3schools.com/" target="_blank">Visit W3Schools!</a>
HTML Links - The name Attribute
The name attribute specifies the name of an anchor.
The name attribute is used to create a bookmark inside an HTML document.
Note: The upcoming HTML5 standard suggests using the id attribute instead of the name attribute for specifying the
name of an anchor. Using the id attribute also works for HTML 4 in all modern browsers.
Bookmarks are not displayed in any special way. They are invisible to the reader.
Example
A named anchor inside an HTML document:
<a name="tips">Useful Tips Section</a>
Create a link to the "Useful Tips Section" inside the same document:
<a href="#tips">Visit the Useful Tips Section</a>
Or, create a link to the "Useful Tips Section" from another page:
<a href="http://www.w3schools.com/html_links.htm#tips">
Visit the Useful Tips Section</a>
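The part of a link address after the "#" sign names the bookmark, and the part before it names the document. As a small illustration, Python's standard urlsplit function separates the two; the variable names are arbitrary.

```python
from urllib.parse import urlsplit

# A link to a bookmark on another page.
parts = urlsplit("http://www.w3schools.com/html_links.htm#tips")
print(parts.path)      # the document to load
print(parts.fragment)  # the bookmark inside that document

# A link to a bookmark in the current document has no path at all.
local = urlsplit("#tips")
print(repr(local.path), local.fragment)
```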
HTML Images - The <img> Tag and the Src Attribute
In HTML, images are defined with the <img> tag.
The <img> tag is empty, which means that it contains attributes only, and has no closing tag.
To display an image on a page, you need to use the src attribute. Src stands for "source". The value of the src
attribute is the URL of the image you want to display.
Syntax for defining an image:
<img src="url" alt="some_text"/>
The URL points to the location where the image is stored. An image named "boat.gif", located in the "images"
directory on "www.w3schools.com" has the URL: http://www.w3schools.com/images/boat.gif.
The browser displays the image where the <img> tag occurs in the document. If you put an image tag between two
paragraphs, the browser shows the first paragraph, then the image, and then the second paragraph.
HTML Images - The Alt Attribute
The required alt attribute specifies an alternate text for an image, if the image cannot be displayed.
The value of the alt attribute is an author-defined text:
<img src="boat.gif" alt="Big Boat" />
The alt attribute provides alternative information for an image if a user for some reason cannot view it (because of
slow connection, an error in the src attribute, or if the user uses a screen reader).
HTML Images - Set Height and Width of an Image
The height and width attributes are used to specify the height and width of an image.
The attribute values are specified in pixels by default:
<img src="pulpit.jpg" alt="Pulpit rock" width="304" height="228" />
Tip: It is a good practice to specify both the height and width attributes for an image. If these attributes are set, the
space required for the image is reserved when the page is loaded. However, without these attributes, the browser
does not know the size of the image. The effect will be that the page layout will change during loading (while the
images load).
HTML Image Tags
Tag         Description
<img />     Defines an image
<map>       Defines an image-map
<area />    Defines a clickable area inside an image-map
HTML Tables
Tables are defined with the <table> tag.
A table is divided into rows (with the <tr> tag), and each row is divided into data cells (with the <td> tag). td stands
for "table data," and holds the content of a data cell. A <td> tag can contain text, links, images, lists, forms, other
tables, etc.
Table Example
<table border="1">
<tr>
<td>row 1, cell 1</td>
<td>row 1, cell 2</td>
</tr>
<tr>
<td>row 2, cell 1</td>
<td>row 2, cell 2</td>
</tr>
</table>
How the HTML code above looks in a browser:
row 1, cell 1    row 1, cell 2
row 2, cell 1    row 2, cell 2
HTML Tables and the Border Attribute
If you do not specify a border attribute, the table will be displayed without borders. Sometimes this can be useful, but
most of the time, we want the borders to show.
To display a table with borders, specify the border attribute:
<table border="1">
<tr>
<td>Row 1, cell 1</td>
<td>Row 1, cell 2</td>
</tr>
</table>
HTML Table Headers
Header information in a table is defined with the <th> tag.
All major browsers will display the text in the <th> element as bold and centered.
<table border="1">
<tr>
<th>Header 1</th>
<th>Header 2</th>
</tr>
<tr>
<td>row 1, cell 1</td>
<td>row 1, cell 2</td>
</tr>
<tr>
<td>row 2, cell 1</td>
<td>row 2, cell 2</td>
</tr>
</table>
How the HTML code above looks in your browser:
Header 1         Header 2
row 1, cell 1    row 1, cell 2
row 2, cell 1    row 2, cell 2
HTML Table Tags
Tag           Description
<table>       Defines a table
<th>          Defines a table header
<tr>          Defines a table row
<td>          Defines a table cell
<caption>     Defines a table caption
<colgroup>    Defines a group of columns in a table, for formatting
<col />       Defines attribute values for one or more columns in a table
<thead>       Groups the header content in a table
<tbody>       Groups the body content in a table
<tfoot>       Groups the footer content in a table
HTML Unordered Lists
An unordered list starts with the <ul> tag. Each list item starts with the <li> tag.
The list items are marked with bullets (typically small black circles).
<ul>
<li>Coffee</li>
<li>Milk</li>
</ul>
How the HTML code above looks in a browser:
Coffee
Milk
HTML Ordered Lists
An ordered list starts with the <ol> tag. Each list item starts with the <li> tag.
The list items are marked with numbers.
<ol>
<li>Coffee</li>
<li>Milk</li>
</ol>
How the HTML code above looks in a browser:
1. Coffee
2. Milk
HTML Definition Lists
A definition list is a list of items, with a description of each item.
The <dl> tag defines a definition list.
The <dl> tag is used in conjunction with <dt> (defines the item in the list) and <dd> (describes the item in the list):
<dl>
<dt>Coffee</dt>
<dd>- black hot drink</dd>
<dt>Milk</dt>
<dd>- white cold drink</dd>
</dl>
How the HTML code above looks in a browser:
Coffee
    - black hot drink
Milk
    - white cold drink
HTML List Tags
Tag     Description
<ol>    Defines an ordered list
<ul>    Defines an unordered list
<li>    Defines a list item
<dl>    Defines a definition list
<dt>    Defines an item in a definition list
<dd>    Defines a description of an item in a definition list
HTML Forms
HTML forms are used to pass data to a server.
A form can contain input elements like text fields, checkboxes, radio-buttons, submit buttons and more. A form can
also contain select lists, textarea, fieldset, legend, and label elements.
The <form> tag is used to create an HTML form:
<form>
.
input elements
.
</form>
HTML Forms - The Input Element
The most important form element is the input element.
The input element is used to collect user information.
An input element can vary in many ways, depending on the type attribute. An input element can be of type text field,
checkbox, password, radio button, submit button, and more.
The most used input types are described below.
Text Fields
<input type="text" /> defines a one-line input field that a user can enter text into:
<form>
First name: <input type="text" name="firstname" /><br />
Last name: <input type="text" name="lastname" />
</form>
How the HTML code above looks in a browser:
First name:
Last name:
Note: The form itself is not visible. Also note that the default width of a text field is 20 characters.
Password Field
<input type="password" /> defines a password field:
<form>
Password: <input type="password" name="pwd" />
</form>
How the HTML code above looks in a browser:
Password:
Note: The characters in a password field are masked (shown as asterisks or circles).
Radio Buttons
<input type="radio" /> defines a radio button. Radio buttons let a user select ONLY ONE of a limited number of
choices:
<form>
<input type="radio" name="sex" value="male" /> Male<br />
<input type="radio" name="sex" value="female" /> Female
</form>
How the HTML code above looks in a browser:
Male
Female
Checkboxes
<input type="checkbox" /> defines a checkbox. Checkboxes let a user select ONE or MORE options of a limited
number of choices.
<form>
<input type="checkbox" name="vehicle" value="Bike" /> I have a bike<br />
<input type="checkbox" name="vehicle" value="Car" /> I have a car
</form>
How the HTML code above looks in a browser:
I have a bike
I have a car
Submit Button
<input type="submit" /> defines a submit button.
A submit button is used to send form data to a server. The data is sent to the page specified in the form's action
attribute. The file defined in the action attribute usually does something with the received input:
<form name="input" action="html_form_action.asp" method="get">
Username: <input type="text" name="user" />
<input type="submit" value="Submit" />
</form>
How the HTML code above looks in a browser:
Username:
Submit
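With method="get", the browser appends the form data to the address given in the action attribute as name=value pairs. The sketch below imitates that step with Python's standard urlencode function. The helper name build_get_url and the typed-in value "Ada Lovelace" are assumptions made for the example.

```python
from urllib.parse import urlencode


def build_get_url(action, fields):
    """Return the address a browser would request for a method="get" form."""
    # urlencode turns {"user": "Ada Lovelace"} into "user=Ada+Lovelace".
    return action + "?" + urlencode(fields)


# "Ada Lovelace" is a made-up value typed into the "user" text field.
url = build_get_url("html_form_action.asp", {"user": "Ada Lovelace"})
print(url)
```

The name in each pair comes from the name attribute of the input element, which is why input elements in a form need a name.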
HTML Form Tags
Tag           Description
<form>        Defines an HTML form for user input
<input />     Defines an input control
<textarea>    Defines a multi-line text input control
<label>       Defines a label for an input element
<fieldset>    Defines a border around elements in a form
<legend>      Defines a caption for a fieldset element
<select>      Defines a select list (drop-down list)
<optgroup>    Defines a group of related options in a select list
<option>      Defines an option in a select list
<button>      Defines a push button
HTML Frames
With frames, you can display more than one HTML document in the same browser window. Each HTML document is
called a frame, and each frame is independent of the others.
The disadvantages of using frames are:
Frames are not expected to be supported in future versions of HTML
Frames are difficult to use (for example, printing the entire page is difficult)
The web developer must keep track of more HTML documents
The HTML frameset Element
The frameset element holds one or more frame elements. Each frame element can hold a separate document.
The frameset element states HOW MANY columns or rows there will be in the frameset, and HOW MUCH space (in
percent or pixels) each of them will occupy.
The HTML frame Element
The <frame> tag defines one particular window (frame) within a frameset.
In the example below we have a frameset with two columns.
The first column is set to 25% of the width of the browser window. The second column is set to 75% of the width of
the browser window. The document "frame_a.htm" is put into the first column, and the document "frame_b.htm" is
put into the second column:
<frameset cols="25%,75%">
<frame src="frame_a.htm" />
<frame src="frame_b.htm" />
</frameset>
Note: The frameset column size can also be set in pixels (cols="200,500"), and one of the columns can be set to use
the remaining space, with an asterisk (cols="25%,*").
HTML Frame Tags
Tag           Description
<frameset>    Defines a set of frames
<frame />     Defines a sub window (a frame)
<noframes>    Defines a noframe section for browsers that do not handle frames
Syntax for adding an iframe:
<iframe src="URL"></iframe>
The URL points to the location of the separate page.
Iframe - Set Height and Width
The height and width attributes are used to specify the height and width of the iframe.
The attribute values are specified in pixels by default, but they can also be in percent (like "80%").
Example
<iframe src="demo_iframe.htm" width="200" height="200"></iframe>
Iframe - Remove the Border
The frameborder attribute specifies whether or not to display a border around the iframe.
Set the attribute value to "0" to remove the border:
Example
<iframe src="demo_iframe.htm" frameborder="0"></iframe>
Use iframe as a Target for a Link
An iframe can be used as the target frame for a link.
The target attribute of a link must refer to the name attribute of the iframe:
Example
<iframe src="demo_iframe.htm" name="iframe_a"></iframe>
<p><a href="http://www.w3schools.com" target="iframe_a">W3Schools.com</a></p>
HTML iframe Tag
Tag         Description
<iframe>    Defines an inline sub window (frame)
Color Values
HTML colors are defined using a hexadecimal notation (HEX) for the combination of Red, Green, and Blue color values
(RGB).
The lowest value that can be given to one of the light sources is 0 (in HEX: 00). The highest value is 255 (in HEX: FF).
HEX values are specified as 3 pairs of two-digit numbers, starting with a # sign.
Color Values
Source: http://www.w3schools.com
Color HEX    Color RGB
#000000      rgb(0,0,0)
#FFFFFF      rgb(255,255,255)
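The relationship between the HEX and RGB notations is plain base-16 arithmetic: each two-digit pair is one value from 0 to 255. A short sketch of the conversion in both directions follows; the function names hex_to_rgb and rgb_to_hex are chosen for this example.

```python
def hex_to_rgb(hex_color):
    """Convert a color like "#FFFFFF" to a (red, green, blue) tuple."""
    digits = hex_color.lstrip("#")
    if len(digits) != 6:
        raise ValueError("expected three two-digit HEX pairs")
    # Read the pairs at positions 0-1, 2-3 and 4-5 as base-16 numbers.
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))


def rgb_to_hex(red, green, blue):
    """Convert three 0-255 values back to "#RRGGBB" notation."""
    return "#{:02X}{:02X}{:02X}".format(red, green, blue)


print(hex_to_rgb("#000000"))
print(hex_to_rgb("#FFFFFF"))
print(rgb_to_hex(255, 0, 0))
```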
What is XML?
XML stands for EXtensible Markup Language
XML is a markup language much like HTML
XML was designed to carry data, not to display data
XML tags are not predefined. You must define your own tags
XML is designed to be self-descriptive
XML is a W3C Recommendation
The Difference Between XML and HTML
XML is not a replacement for HTML.
XML and HTML were designed with different goals:
XML was designed to transport and store data, with focus on what data is
HTML was designed to display data, with focus on how data looks
HTML is about displaying information, while XML is about carrying information.
XML Does Not DO Anything
Maybe it is a little hard to understand, but XML does not DO anything. XML was created to structure, store, and
transport information.
The following example is a note to Tove, from Jani, stored as XML:
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
The note above is quite self-descriptive. It has sender and receiver information; it also has a heading and a message
body.
But still, this XML document does not DO anything. It is just information wrapped in tags. Someone must write a piece
of software to send, receive or display it.
With XML You Invent Your Own Tags
The tags in the example above (like <to> and <from>) are not defined in any XML standard. These tags are
"invented" by the author of the XML document.
That is because the XML language has no predefined tags.
The tags used in HTML are predefined. HTML documents can only use tags defined in the HTML standard (like <p>,
<h1>, etc.).
XML allows the author to define his/her own tags and his/her own document structure.
XML is Not a Replacement for HTML
XML is a complement to HTML.
It is important to understand that XML is not a replacement for HTML. In most web applications, XML is used to
transport data, while HTML is used to format and display the data.
My best description of XML is this:
XML is a software- and hardware-independent tool for carrying information.
XML Separates Data from HTML
If you need to display dynamic data in your HTML document, it will take a lot of work to edit the HTML each time the
data changes.
With XML, data can be stored in separate XML files. This way you can concentrate on using HTML for layout and
display, and be sure that changes in the underlying data will not require any changes to the HTML.
With a few lines of JavaScript code, you can read an external XML file and update the data content of your web page.
XML Simplifies Data Sharing
In the real world, computer systems and databases contain data in incompatible formats.
XML data is stored in plain text format. This provides a software- and hardware-independent way of storing data.
This makes it much easier to create data that can be shared by different applications.
XML Simplifies Data Transport
One of the most time-consuming challenges for developers is to exchange data between incompatible systems over
the Internet.
Exchanging data as XML greatly reduces this complexity, since the data can be read by different incompatible
applications.
XML Simplifies Platform Changes
Upgrading to new systems (hardware or software platforms) is always time-consuming. Large amounts of data must
be converted and incompatible data is often lost.
XML data is stored in text format. This makes it easier to expand or upgrade to new operating systems, new
applications, or new browsers, without losing data.
XML Makes Your Data More Available
Different applications can access your data, not only in HTML pages, but also from XML data sources.
With XML, your data can be available to all kinds of "reading machines" (handheld computers, voice machines, news
feeds, etc.), which makes it more accessible to blind people and people with other disabilities.
XML is Used to Create New Internet Languages
A lot of new Internet languages are created with XML.
Here are some examples:
XHTML
WSDL for describing available web services
WML (used with WAP) as a markup language for handheld devices
RSS languages for news feeds
RDF and OWL for describing resources and ontology
SMIL for describing multimedia for the web
An Example XML Document
XML documents use a self-describing and simple syntax:
<?xml version="1.0" encoding="ISO-8859-1"?>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
The first line is the XML declaration. It defines the XML version (1.0) and the encoding used (ISO-8859-1 = Latin1/West European character set).
The next line describes the root element of the document (like saying: "this document is a note"):
<note>
The next 4 lines describe 4 child elements of the root (to, from, heading, and body):
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
And finally the last line defines the end of the root element:
</note>
You can assume, from this example, that the XML document contains a note to Tove from Jani.
Don't you agree that XML is pretty self-descriptive?
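Because the structure is explicit, a program can read the note with very little code. The following sketch uses Python's standard xml.etree.ElementTree module; it is one possible way to read the document, and the variable names are arbitrary.

```python
import xml.etree.ElementTree as ET

# The example document, as bytes so the encoding declaration is honored.
NOTE = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>"""

root = ET.fromstring(NOTE)
print(root.tag)  # the root element

# Each child element of the root carries one piece of the note.
for child in root:
    print(child.tag, "=", child.text)
```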
XML Documents Form a Tree Structure
XML documents must contain a root element. This element is "the parent" of all other elements.
The elements in an XML document form a document tree. The tree starts at the root and branches to the lowest level
of the tree.
All elements can have sub elements (child elements):
<root>
<child>
<subchild>.....</subchild>
</child>
</root>
The terms parent, child, and sibling are used to describe the relationships between elements. Parent elements have
children. Children on the same level are called siblings (brothers or sisters).
All elements can have text content and attributes (just like in HTML).
Example:
[Figure: tree diagram representing one book in the XML below]
<bookstore>
<book category="COOKING">
<title lang="en">Everyday Italian</title>
<author>Giada De Laurentiis</author>
<year>2005</year>
<price>30.00</price>
</book>
<book category="CHILDREN">
<title lang="en">Harry Potter</title>
<author>J. K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>
<book category="WEB">
<title lang="en">Learning XML</title>
<author>Erik T. Ray</author>
<year>2003</year>
<price>39.95</price>
</book>
</bookstore>
The root element in the example is <bookstore>. All <book> elements in the document are contained within
<bookstore>.
The <book> element has 4 children: <title>, <author>, <year>, <price>.
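The parent, child, and attribute relationships can be walked programmatically. Below is a minimal sketch with Python's standard xml.etree.ElementTree module, using a shortened copy of the bookstore (two of the three books) to keep the example small.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the bookstore document: two of the three books.
BOOKSTORE = """<bookstore>
<book category="COOKING">
<title lang="en">Everyday Italian</title>
<author>Giada De Laurentiis</author>
<year>2005</year>
<price>30.00</price>
</book>
<book category="WEB">
<title lang="en">Learning XML</title>
<author>Erik T. Ray</author>
<year>2003</year>
<price>39.95</price>
</book>
</bookstore>"""

root = ET.fromstring(BOOKSTORE)

# Every <book> is a child of the root; its own children are siblings.
for book in root.findall("book"):
    child_tags = [child.tag for child in book]
    print(book.get("category"), child_tags, book.find("title").text)
```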
All XML Elements Must Have a Closing Tag
In HTML, some elements do not have to have a closing tag:
<p>This is a paragraph
<p>This is another paragraph
In XML, it is illegal to omit the closing tag. All elements must have a closing tag:
<p>This is a paragraph</p>
<p>This is another paragraph</p>
Note: You might have noticed from the previous example that the XML declaration did not have a closing tag. This is
not an error. The declaration is not an element, so it has no closing tag.
XML Tags are Case Sensitive
XML tags are case sensitive. The tag <Letter> is different from the tag <letter>.
Opening and closing tags must be written with the same case:
<Message>This is incorrect</message>
<message>This is correct</message>
Note: "Opening and closing tags" are often referred to as "Start and end tags". Use whatever you prefer. It is exactly
the same thing.
XML Elements Must be Properly Nested
In HTML, you might see improperly nested elements:
<b><i>This text is bold and italic</b></i>
In XML, all elements must be properly nested within each other:
<b><i>This text is bold and italic</i></b>
In the example above, "Properly nested" simply means that since the <i> element is opened inside the <b> element,
it must be closed inside the <b> element.
XML Documents Must Have a Root Element
XML documents must contain one element that is the parent of all other elements. This element is called
the root element.
<root>
<child>
<subchild>.....</subchild>
</child>
</root>
XML Attribute Values Must be Quoted
XML elements can have attributes in name/value pairs just like in HTML.
In XML, the attribute values must always be quoted.
Study the two XML documents below. The first one is incorrect, the second is correct:
<note date=12/11/2007>
<to>Tove</to>
<from>Jani</from>
</note>
<note date="12/11/2007">
<to>Tove</to>
<from>Jani</from>
</note>
The error in the first document is that the date attribute in the note element is not quoted.
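An XML parser enforces all of these rules and refuses a document that breaks any of them. The sketch below feeds the incorrect examples from this chapter to Python's standard xml.etree.ElementTree parser. The helper name is_well_formed and the labels in the dictionary are made up for the illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


samples = {
    "missing closing tag": "<p>This is a paragraph",
    "mismatched case": "<Message>This is incorrect</message>",
    "improper nesting": "<b><i>This text is bold and italic</b></i>",
    "unquoted attribute": "<note date=12/11/2007><to>Tove</to></note>",
    "correct": '<note date="12/11/2007"><to>Tove</to></note>',
}

for label, text in samples.items():
    print(label, "->", is_well_formed(text))
```

Only the last sample is accepted; each of the others violates one of the syntax rules.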
Entity References
Some characters have a special meaning in XML.
If you place a character like "<" inside an XML element, it will generate an error because the parser interprets it as
the start of a new element.
This will generate an XML error:
<message>if salary < 1000 then</message>
To avoid this error, replace the "<" character with an entity reference:
<message>if salary &lt; 1000 then</message>
There are 5 predefined entity references in XML:
&lt;      <    less than
&gt;      >    greater than
&amp;     &    ampersand
&apos;    '    apostrophe
&quot;    "    quotation mark
Note: Only the characters "<" and "&" are strictly illegal in XML. The greater than character is legal, but it is a good
habit to replace it.
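As a minimal sketch of how this replacement could be automated, Python's standard library offers xml.sax.saxutils.escape, which replaces "&", "<", and ">" with entity references. The choice of Python and the variable names are assumptions made for illustration; they are not part of this tutorial.

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET

raw_text = "if salary < 1000 then"

# escape() replaces &, < and > with &amp;, &lt; and &gt;
escaped_text = escape(raw_text)
document = "<message>" + escaped_text + "</message>"

# The parser turns the entity reference back into the original character
parsed_text = ET.fromstring(document).text

print(escaped_text)
print(parsed_text)
```

The application never sees the entity reference: after parsing, the element text contains the plain "<" character again.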
Comments in XML
The syntax for writing comments in XML is similar to that of HTML.
<!-- This is a comment -->
White-space is Preserved in XML
HTML truncates multiple white-space characters to one single white-space:
HTML:    Hello           Tove
Output:  Hello Tove
With XML, the white-space in a document is not truncated.
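A small sketch can make this visible. It assumes Python's standard xml.etree.ElementTree parser and a made-up <greeting> element; neither comes from this tutorial.

```python
import xml.etree.ElementTree as ET

# Eleven spaces between the two words, built explicitly so the count is clear
original_text = "Hello" + " " * 11 + "Tove"
document = "<greeting>" + original_text + "</greeting>"

# The parser hands the text back with every space intact
text = ET.fromstring(document).text

print(repr(text))
print(text.count(" "))
```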
XML Stores New Line as LF
In Windows applications, a new line is normally stored as a pair of characters: carriage return (CR) and line feed (LF).
In Unix applications, a new line is normally stored as an LF character. Mac OS X applications also use an LF to store a
new line; older Macintosh systems used a single CR.
XML stores a new line as LF.
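In practice this means a parser hands line ends to the application as a single LF, whatever the file used. The sketch below assumes Python's standard xml.etree.ElementTree parser and a made-up <body> element with Windows-style CR+LF line ends.

```python
import xml.etree.ElementTree as ET

# A document fragment saved with Windows line ends (CR followed by LF)
windows_style = b"<body>line one\r\nline two</body>"

# After parsing, the CR+LF pair arrives as a single LF
text = ET.fromstring(windows_style).text

print(repr(text))
```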
What is an XML Element?
An XML element is everything from (including) the element's start tag to (including) the element's end tag.
An element can contain:
other elements
text
attributes
or a mix of all of the above...
<bookstore>
<book category="CHILDREN">
<title>Harry Potter</title>
<author>J K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>
<book category="WEB">
<title>Learning XML</title>
<author>Erik T. Ray</author>
<year>2003</year>
<price>39.95</price>
</book>
</bookstore>
In the example above, <bookstore> and <book> have element contents, because they contain other elements.
<book> also has an attribute (category="CHILDREN"). <title>, <author>, <year>, and <price> have text
content because they contain text.
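The difference between element content, text content, and attributes shows up when a program reads the document. The following is only a sketch, assuming Python's standard xml.etree.ElementTree module; the books list and its keys are names chosen for illustration.

```python
import xml.etree.ElementTree as ET

BOOKSTORE = """<bookstore>
  <book category="CHILDREN">
    <title>Harry Potter</title>
    <author>J K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
  <book category="WEB">
    <title>Learning XML</title>
    <author>Erik T. Ray</author>
    <year>2003</year>
    <price>39.95</price>
  </book>
</bookstore>"""

root = ET.fromstring(BOOKSTORE)

books = []
for book in root.findall("book"):          # element content of <bookstore>
    books.append({
        "category": book.get("category"),  # attribute of <book>
        "title": book.findtext("title"),   # text content of <title>
        "price": float(book.findtext("price")),
    })

for entry in books:
    print(entry)
```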
XML Naming Rules
XML elements must follow these naming rules:
Names can contain letters, numbers, and other characters
Names cannot start with a number or punctuation character
Names cannot start with the letters xml (or XML, or Xml, etc)
Names cannot contain spaces
Any name can be used, no words are reserved.
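The rules above can be approximated with a short check. This is a deliberately simplified sketch: it accepts only ASCII letters, digits, "_", "-", and ".", so it rejects non-English letters and colons that real XML allows. The function name is_acceptable_name is hypothetical.

```python
import re

# Simplified pattern: must start with a letter or underscore, then letters,
# digits, underscores, hyphens or periods. Real XML names allow more characters.
NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*")

def is_acceptable_name(name):
    """Return True if name passes the simplified naming rules."""
    if name.lower().startswith("xml"):
        return False  # names starting with xml, XML, Xml, ... are not allowed
    return NAME_PATTERN.fullmatch(name) is not None

for candidate in ["first_name", "1st", "xmlData", "first name", "-x"]:
    print(candidate, is_acceptable_name(candidate))
```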
Best Naming Practices
Make names descriptive. Names with an underscore separator are nice: <first_name>, <last_name>.
Names should be short and simple, like this: <book_title> not like this: <the_title_of_the_book>.
Avoid "-" characters. If you name something "first-name," some software may think you want to subtract name from
first.
Avoid "." characters. If you name something "first.name," some software may think that "name" is a property of the
object "first."
Avoid ":" characters. Colons are reserved to be used for something called namespaces (more later).
XML documents often have a corresponding database. A good practice is to use the naming rules of your database for
the elements in the XML documents.
Non-English letters like é, ò, and á are perfectly legal in XML, but watch out for problems if your software vendor doesn't
support them.
XML Elements are Extensible
XML elements can be extended to carry more information.
Look at the following XML example:
<note>
<to>Tove</to>
<from>Jani</from>
<body>Don't forget me this weekend!</body>
</note>
Let's imagine that we created an application that extracted the <to>, <from>, and <body> elements from the XML
document to produce this output:
MESSAGE
To: Tove
From: Jani
Don't forget me this weekend!
Imagine that the author of the XML document added some extra information to it:
<note>
<date>2008-01-10</date>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
Should the application break or crash?
No. The application should still be able to find the <to>, <from>, and <body> elements in the XML document and
produce the same output.
One of the beauties of XML is that it can be extended without breaking applications.
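A sketch of such an application, assuming Python's standard xml.etree.ElementTree module, could look like this. The function render_message is hypothetical; it looks up only the elements it needs by name, so extra elements are simply ignored.

```python
import xml.etree.ElementTree as ET

ORIGINAL = """<note>
<to>Tove</to>
<from>Jani</from>
<body>Don't forget me this weekend!</body>
</note>"""

EXTENDED = """<note>
<date>2008-01-10</date>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>"""

def render_message(xml_text):
    """Build the MESSAGE output from the <to>, <from> and <body> elements."""
    note = ET.fromstring(xml_text)
    return "MESSAGE\nTo: {}\nFrom: {}\n{}".format(
        note.findtext("to"), note.findtext("from"), note.findtext("body")
    )

print(render_message(ORIGINAL))
print(render_message(EXTENDED))
```

Both calls print the same four lines, because the added <date> and <heading> elements are never requested.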
XML Attributes
In HTML, attributes provide additional information about elements:
<img src="computer.gif">
<a href="demo.asp">
Attributes often provide information that is not a part of the data. In the example below, the file type is irrelevant to
the data, but can be important to the software that wants to manipulate the element:
<file type="gif">computer.gif</file>
XML Attributes Must be Quoted
Attribute values must always be quoted. Either single or double quotes can be used. For a person's sex, the person
element can be written like this:
<person sex="female">
or like this:
<person sex='female'>
If the attribute value itself contains double quotes you can use single quotes, like in this example:
<gangster name='George "Shotgun" Ziegler'>
or you can use character entities:
<gangster name="George &quot;Shotgun&quot; Ziegler">
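When attribute values are generated by a program, the quoting can be left to a helper. The sketch below assumes Python's standard xml.sax.saxutils.quoteattr function, which picks a quote style and escapes where needed; the second name is made up to show a value containing both kinds of quotes.

```python
from xml.sax.saxutils import quoteattr
import xml.etree.ElementTree as ET

name = 'George "Shotgun" Ziegler'

# The value contains double quotes only, so quoteattr wraps it in single quotes
quoted = quoteattr(name)
document = "<gangster name=" + quoted + "/>"
parsed_name = ET.fromstring(document).get("name")

# With both quote types present, the double quotes become &quot;
both = quoteattr("O'Neil \"Shotgun\" Ziegler")

print(quoted)
print(both)
```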
XML Elements vs. Attributes
Take a look at these examples:
<person sex="female">
<firstname>Anna</firstname>
<lastname>Smith</lastname>
</person>
<person>
<sex>female</sex>
<firstname>Anna</firstname>
<lastname>Smith</lastname>
</person>
In the first example sex is an attribute. In the last, sex is an element. Both examples provide the same information.
There are no rules about when to use attributes or when to use elements. Attributes are handy in HTML. In XML my
advice is to avoid them. Use elements instead.
My Favorite Way
The following three XML documents contain exactly the same information:
A date attribute is used in the first example:
<note date="10/01/2008">
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
A date element is used in the second example:
<note>
<date>10/01/2008</date>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
An expanded date element is used in the third (this is my favorite):
<note>
<date>
<day>10</day>
<month>01</month>
<year>2008</year>
</date>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
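One practical reason to prefer the expanded form is that a program can read each part of the date without parsing a string like "10/01/2008". A minimal sketch, assuming Python's standard xml.etree.ElementTree and datetime modules and a shortened copy of the third example:

```python
import xml.etree.ElementTree as ET
from datetime import date

NOTE = """<note>
<date>
<day>10</day>
<month>01</month>
<year>2008</year>
</date>
<to>Tove</to>
<from>Jani</from>
</note>"""

date_element = ET.fromstring(NOTE).find("date")

# Each part is its own element, so no guessing about day/month order is needed
note_date = date(
    int(date_element.findtext("year")),
    int(date_element.findtext("month")),
    int(date_element.findtext("day")),
)

print(note_date.isoformat())
```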
Avoid XML Attributes?
Some of the problems with using attributes are:
attributes cannot contain multiple values (elements can)
attributes cannot contain tree structures (elements can)
attributes are not easily expandable (for future changes)
Attributes are difficult to read and maintain. Use elements for data. Use attributes for information that is not relevant
to the data.
Don't end up like this:
<note day="10" month="01" year="2008"
to="Tove" from="Jani" heading="Reminder"
body="Don't forget me this weekend!">
</note>
XML Attributes for Metadata
Sometimes ID references are assigned to elements. These IDs can be used to identify XML elements in much the
same way as the id attribute in HTML. This example demonstrates this:
<messages>
<note id="501">
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
<note id="502">
<to>Jani</to>
<from>Tove</from>
<heading>Re: Reminder</heading>
<body>I will not</body>
</note>
</messages>
The id attributes above are for identifying the different notes. They are not a part of the notes themselves.
What I'm trying to say here is that metadata (data about data) should be stored as attributes, and the data itself
should be stored as elements.
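An id attribute makes it easy for a program to pick out one note. The sketch below assumes Python's standard xml.etree.ElementTree module, which supports simple attribute predicates in its path expressions; find_note is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

MESSAGES = """<messages>
<note id="501">
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
<note id="502">
<to>Jani</to>
<from>Tove</from>
<heading>Re: Reminder</heading>
<body>I will not</body>
</note>
</messages>"""

root = ET.fromstring(MESSAGES)

def find_note(messages, note_id):
    """Return the <note> child with the given id, or None if there is none."""
    return messages.find("note[@id='{}']".format(note_id))

reply = find_note(root, "502")
print(reply.findtext("heading"))
```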
Well Formed XML Documents
A "Well Formed" XML document has correct XML syntax.
The syntax rules were described in the previous chapters:
XML documents must have a root element
XML elements must have a closing tag
XML tags are case sensitive
XML elements must be properly nested
XML attribute values must be quoted
<?xml version="1.0" encoding="ISO-8859-1"?>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
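The rules can be checked by handing a document to any XML parser. The following sketch assumes Python's standard xml.etree.ElementTree module; is_well_formed is a hypothetical helper, and the short test documents are adapted from the earlier examples.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text, False on a syntax error."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True

examples = {
    "two root elements": "<to>Tove</to><from>Jani</from>",
    "missing closing tag": "<note><to>Tove</note>",
    "mismatched case": "<Message>This is incorrect</message>",
    "improper nesting": "<b><i>This text is bold and italic</b></i>",
    "unquoted attribute": "<note date=12/11/2007><to>Tove</to></note>",
    "correct": "<note><to>Tove</to><from>Jani</from></note>",
}

results = {label: is_well_formed(text) for label, text in examples.items()}

for label, ok in results.items():
    print(label, "->", "well formed" if ok else "not well formed")
```

This only checks syntax. It does not validate the document against a DTD or schema.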
Valid XML Documents
A "Valid" XML document is a "Well Formed" XML document, which also conforms to the rules of a Document Type
Definition (DTD):
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE note SYSTEM "Note.dtd">
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
The DOCTYPE declaration in the example above is a reference to an external DTD file. The content of the file is shown
in the paragraph below.
XML DTD
The purpose of a DTD is to define the structure of an XML document. It defines the structure with a list of legal
elements:
<!DOCTYPE note
[
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
]>
If you want to study DTD, you will find our DTD tutorial on our homepage.
XML Schema
W3C supports an XML-based alternative to DTD, called XML Schema:
<xs:element name="note">
<xs:complexType>
<xs:sequence>
<xs:element name="to" type="xs:string"/>
<xs:element name="from" type="xs:string"/>
<xs:element name="heading" type="xs:string"/>
<xs:element name="body" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
XML Errors Will Stop You
Errors in XML documents will stop your XML applications.
The W3C XML specification states that a program should stop processing an XML document if it finds an error. The
reason is that XML software should be small, fast, and compatible.
HTML browsers will display documents with errors (like missing end tags). HTML browsers are big and incompatible
because they have a lot of unnecessary code to deal with (and display) HTML errors.
With XML, errors are not allowed.
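A parser that stops on an error usually reports where it stopped. As a sketch, assuming Python's standard xml.etree.ElementTree module and a made-up broken document (the misspelled end tag is invented for this illustration):

```python
import xml.etree.ElementTree as ET

# The end tag on the third line does not match its start tag
BROKEN = "<note>\n<to>Tove</to>\n<from>Jani</Ffrom>\n</note>"

error_line = None
try:
    ET.fromstring(BROKEN)
except ET.ParseError as error:
    # position is a (line, column) pair describing where parsing stopped
    error_line, error_column = error.position
    print("Parsing stopped at line", error_line, "column", error_column)
    print("Reason:", error)
```

Nothing after the error is processed; the application has to fix the document before it can read any of it.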
Syntax-Check Your XML
To help you syntax-check your XML, we have created an XML validator.
Paste your XML into the text area below, and syntax-check it by clicking the "Validate" button.
Note: This only checks if your XML is "Well formed". If you want to validate your XML against a DTD, see the last
paragraph on this page.
Syntax-Check an XML File
You can syntax-check an XML file by typing the URL of the file into the input field below, and then click the "Validate"
button:
Filename: http://www.w3schools.com/xml/note_error.xml
Note: If you get an "Access denied" error, it's because your browser security does not allow file access across
domains.
The file "note_error.xml" demonstrates your browser's error handling. If you want to see an error-free message,
substitute "cd_catalog.xml" for "note_error.xml".
Viewing XML Files
<?xml version="1.0" encoding="ISO-8859-1"?>
- <note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
Look at this XML file: note.xml
The XML document will be displayed with color-coded root and child elements. A plus (+) or minus sign (-) to the left
of the elements can be clicked to expand or collapse the element structure. To view the raw XML source (without the
+ and - signs), select "View Page Source" or "View Source" from the browser menu.
Note: In Safari, only the element text will be displayed. To view the raw XML, you must right click the page and select
"View Source"
Displaying your XML Files with CSS?
It is possible to use CSS to format an XML document.
Below is an example of how to use a CSS style sheet to format an XML document:
Take a look at this XML file: The CD catalog
Then look at this style sheet: The CSS file
Finally, view: The CD catalog formatted with the CSS file
Below is a fragment of the XML file. The second line links the XML file to the CSS file:
<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml-stylesheet type="text/css" href="cd_catalog.css"?>
<CATALOG>
<CD>
<TITLE>Empire Burlesque</TITLE>
<ARTIST>Bob Dylan</ARTIST>
<COUNTRY>USA</COUNTRY>
<COMPANY>Columbia</COMPANY>
<PRICE>10.90</PRICE>
<YEAR>1985</YEAR>
</CD>
<CD>
<TITLE>Hide your heart</TITLE>
<ARTIST>Bonnie Tyler</ARTIST>
<COUNTRY>UK</COUNTRY>
<COMPANY>CBS Records</COMPANY>
<PRICE>9.90</PRICE>
<YEAR>1988</YEAR>
</CD>
.
.
.
</CATALOG>