HTML Emojis - XHTML Versus HTML

The document explains the use of emojis in HTML, emphasizing that they are characters from the UTF-8 character set and can be displayed using entity numbers. It also covers the importance of specifying the character set in HTML using the <meta> tag, and discusses the differences between HTML and XHTML, including stricter rules for element nesting and attribute formatting in XHTML. Additionally, it provides examples of how to properly encode URLs and the significance of UTF-8 in web development.

Uploaded by

kiratrent827

Read Carefully and Understand 🤝.

Using Emojis in HTML
Emojis are characters from the Unicode character set, which UTF-8 encodes:

😄 😍 💗

What are Emojis?


Emojis look like images, or icons, but they are not.

● They are letters (characters) from the UTF-8 (Unicode) character set.

● UTF-8 covers almost all of the characters and symbols in the world.

The HTML charset Attribute


To display an HTML page correctly, a web browser must know the character set used in
the page.

This is specified in the <meta> tag:

<meta charset="UTF-8">

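The HTML5 standard expects pages to be encoded in UTF-8. If the character set is not
specified, the browser has to guess the encoding and may pick a legacy one, so always
declare it.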
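
To illustrate why the declared character set matters, here is a minimal Python sketch (Python is used only for illustration, and the sample text is an assumption). It decodes the same UTF-8 bytes twice: once as UTF-8, and once as Windows-1252, which is roughly what happens when a browser assumes the wrong character set.

```python
# The same bytes, interpreted with two different character sets.
text = "café 😄"
utf8_bytes = text.encode("utf-8")       # the bytes a UTF-8 file would contain

correct = utf8_bytes.decode("utf-8")    # decoder uses the right character set
garbled = utf8_bytes.decode("cp1252")   # decoder wrongly assumes Windows-1252

print(correct)
print(garbled)
```

With the wrong character set, the "é" shows up as "Ã©" and the emoji turns into several unrelated symbols.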

UTF-8 Characters
Many UTF-8 characters cannot be typed on a keyboard, but they can always be displayed
using numbers (called entity numbers):

A is 65
B is 66
C is 67

Example
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
</head>
<body>

<p>I will display A B C</p>


<p>I will display &#65; &#66; &#67;</p>

</body>
</html>
Example Explained:
● The <meta charset="UTF-8"> element defines the character set.

● The characters A, B, and C, are displayed by the numbers 65, 66, and 67.

To let the browser understand that you are displaying a character, you must start the
entity number with &# and end it with ; (semicolon).
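
An entity number is the character's Unicode code point written in decimal. The following Python sketch shows the conversion in both directions; the helper names `to_entity` and `from_entity` are made up for this illustration.

```python
import html


def to_entity(char: str) -> str:
    """Return the decimal HTML entity for a single character."""
    return f"&#{ord(char)};"


def from_entity(entity: str) -> str:
    """Turn an HTML entity back into the character it stands for."""
    return html.unescape(entity)


print(to_entity("A"))        # &#65;
print(from_entity("&#66;"))  # B
```

The same helpers work for emojis, because an emoji is a single character with a larger code point.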

Emoji Characters
Emojis are also characters from the UTF-8 alphabet:

😄 is 128516
😍 is 128525
💗 is 128151
Example
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
</head>
<body>

<h1>My First Emoji</h1>

<p>&#128512;</p>

</body>
</html>

Since Emojis are characters, they can be copied, displayed, and sized just like any other
character in HTML.

Example
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
</head>

<body>
<h1>Sized Emojis</h1>
<p style="font-size:48px">
&#128512; &#128516; &#128525; &#128151;
</p>
</body>
</html>

Some Emoji Symbols in UTF-8


Emoji   Value

🗻      &#128507;
🗼      &#128508;
🗽      &#128509;
🗾      &#128510;
🗿      &#128511;
😀      &#128512;
😁      &#128513;
😂      &#128514;
😃      &#128515;
😄      &#128516;
😅      &#128517;
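
Because each value is just a code point, a list like this can be regenerated rather than typed by hand. A small Python sketch, where the function name `emoji_table` is an assumption for illustration:

```python
def emoji_table(first: int, last: int) -> list[tuple[str, str]]:
    """Pair each character in a code point range with its entity number."""
    return [(chr(cp), f"&#{cp};") for cp in range(first, last + 1)]


# Print one "emoji value" row per code point from 128507 to 128517.
for emoji, value in emoji_table(128507, 128517):
    print(emoji, value)
```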

HTML Encoding (Character Sets)


To display an HTML page correctly, a web browser must know which character set to use.

From ASCII to UTF-8


ASCII was one of the earliest character encoding standards. ASCII defined 128 different
characters that could be used on the internet: numbers (0-9), English letters (A-Z), and
some special characters like ! $ + - ( ) @ < > .

ISO-8859-1 was the default character set for HTML 4. This character set supported 256
different character codes. HTML 4 also supported UTF-8.

ANSI (Windows-1252) was the original Windows character set. ANSI is identical to
ISO-8859-1, except that ANSI places extra characters in the range of 32 values
(128 to 159) that ISO-8859-1 leaves unused.

The HTML5 specification encourages web developers to use the UTF-8 character set,
which covers almost all of the characters and symbols in the world!

The ASCII Character Set
● ASCII uses the values from 0 to 31 (and 127) for control characters.

● ASCII uses the values from 32 to 126 for letters, digits, and symbols.

● ASCII does not use the values from 128 to 255.

The ANSI Character Set (Windows-1252)


● ANSI is identical to ASCII for the values from 0 to 127.

● ANSI has a proprietary set of characters for the values from 128 to 159.

● ANSI matches the Unicode code points used by UTF-8 for the values from 160 to 255.

The ISO-8859-1 Character Set


● ISO-8859-1 is identical to ASCII for the values from 0 to 127.

● ISO-8859-1 does not use the values from 128 to 159.

● ISO-8859-1 matches the Unicode code points used by UTF-8 for the values from 160 to 255.

The UTF-8 Character Set


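● UTF-8 is identical to ASCII for the values from 0 to 127.

● Unicode, the character set that UTF-8 encodes, has no printable characters at the
values from 128 to 159 (they are reserved for control codes).

● Unicode is identical to both ANSI and 8859-1 for the values from 160 to 255, although
UTF-8 stores each of these characters as two bytes.

● Unicode continues from the value 256 with well over 100 000 different characters.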
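
The relationships between these character sets can be checked with Python's built-in codecs. This is a rough sketch: `cp1252` and `iso-8859-1` are Python's codec names for ANSI (Windows-1252) and ISO-8859-1, and the sample values are chosen only for illustration.

```python
# Values 0-127: the same single byte in ASCII and in UTF-8.
ascii_a = "A".encode("ascii")
utf8_a = "A".encode("utf-8")

# Value 128: ANSI (Windows-1252) has a printable character here.
euro_ansi = bytes([128]).decode("cp1252")

# Value 233: ANSI and ISO-8859-1 agree, and both match the Unicode code point.
e_acute_ansi = bytes([233]).decode("cp1252")
e_acute_iso = bytes([233]).decode("iso-8859-1")

# UTF-8 keeps the same code point (233) but stores it as two bytes.
e_acute_utf8 = "é".encode("utf-8")

print(ascii_a, utf8_a)
print(euro_ansi, e_acute_ansi, e_acute_iso, ord("é"))
print(e_acute_utf8)
```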

HTML Uniform Resource Locators

● A URL is another word for a web address.

● A URL can be composed of words (e.g. w3schools.com), or an Internet Protocol
(IP) address (e.g. 192.68.20.50).

Most people enter the name when surfing, because names are easier to remember than
numbers.

● Web browsers request pages from web servers by using a URL.


● A Uniform Resource Locator (URL) is used to address a document (or other data)
on the web.

A web address like https://www.w3schools.com/html/default.asp follows these syntax
rules:

scheme://prefix.domain:port/path/filename
Explanation:

● scheme - defines the type of Internet service (most common is http or https).

● prefix - defines a domain prefix (default for http is www).

● domain - defines the Internet domain name (like w3schools.com).

● port - defines the port number at the host (default for http is 80).

● path - defines a path at the server (If omitted: the root directory of the site).

● filename - defines the name of a document or resource
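
The parts listed above can be pulled out of a URL with Python's standard `urllib.parse` module. This is a minimal sketch; the second URL with an explicit port is an assumption added to show the port field.

```python
from urllib.parse import urlsplit

parts = urlsplit("https://www.w3schools.com/html/default.asp")
print(parts.scheme)    # https
print(parts.hostname)  # www.w3schools.com (prefix and domain together)
print(parts.port)      # None: no port was written, so the scheme default applies
print(parts.path)      # /html/default.asp (path and filename together)

# The port is only reported when the URL spells it out.
with_port = urlsplit("http://www.w3schools.com:80/html/default.asp")
print(with_port.port)  # 80
```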

URL Encoding
URLs can only be sent over the Internet using the ASCII character-set. If a URL contains
characters outside the ASCII set, the URL has to be converted.

● URL encoding converts non-ASCII characters into a format that can be transmitted
over the Internet.

● URL encoding replaces non-ASCII characters with a "%" followed by hexadecimal
digits.

● URLs cannot contain spaces. URL encoding normally replaces a space with a plus
(+) sign, or %20.
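
As a rough illustration of these rules, Python's `urllib.parse` module can encode and decode a string. The sample text is an assumption; note that `quote` encodes non-ASCII characters as their UTF-8 bytes by default.

```python
from urllib.parse import quote, quote_plus, unquote

text = "Hello Günter"

encoded = quote(text)            # space becomes %20
encoded_plus = quote_plus(text)  # space becomes +
decoded = unquote(encoded)       # back to the original text

print(encoded)       # Hello%20G%C3%BCnter
print(encoded_plus)  # Hello+G%C3%BCnter
print(decoded)       # Hello Günter
```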

HTML Versus XHTML


XHTML is a stricter, more XML-based version of HTML.

What is XHTML?
● XHTML stands for EXtensible HyperText Markup Language.

● XHTML is a stricter, more XML-based version of HTML.

● XHTML is HTML defined as an XML application.


● XHTML is supported by all major browsers.

Why XHTML?
● XML is a markup language where all documents must be marked up correctly (be
"well-formed").

● XHTML was developed to make HTML more extensible and more flexible when working
with other data formats (such as XML). In addition, browsers ignore errors in HTML
pages and try to display the website even if the markup contains mistakes, so
XHTML comes with much stricter error handling.

The Most Important Differences from HTML

● <!DOCTYPE> is mandatory.
● The xmlns attribute in <html> is mandatory.
● <html>, <head>, <title>, and <body> are mandatory.
● Elements must always be properly nested.
● Elements must always be closed.
● Element names must always be in lowercase.
● Attribute names must always be in lowercase.
● Attribute values must always be quoted.
● Attribute minimization is forbidden.

XHTML - <!DOCTYPE ....> Is Mandatory
An XHTML document must have an XHTML <!DOCTYPE> declaration.

The <html>, <head>, <title>, and <body> elements must also be present, and the
xmlns attribute in <html> must specify the xml namespace for the document.

Example
Here is an XHTML document with a minimum of required tags:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Title of document</title>
</head>
<body>

some content here...

</body>
</html>
XHTML Elements Must be Properly Nested
In XHTML, elements must always be properly nested within each other, like this:

Correct:
<b><i>Some text</i></b>

Wrong:
<b><i>Some text</b></i>

XHTML Elements Must Always be Closed


In XHTML, elements must always be closed, like this:

Correct:
<p>This is a paragraph</p>
<p>This is another paragraph</p>
Wrong:
<p>This is a paragraph
<p>This is another paragraph
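
Since XHTML is XML, the nesting and closing rules can be checked with any XML parser. Below is a minimal sketch using Python's standard `xml.etree.ElementTree`. The helper name `is_well_formed` and the surrounding `<body>` wrapper are assumptions for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if the markup parses as well-formed XML."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Properly nested versus wrongly nested elements.
print(is_well_formed("<b><i>Some text</i></b>"))
print(is_well_formed("<b><i>Some text</b></i>"))

# Closed versus unclosed paragraphs.
print(is_well_formed("<body><p>One</p><p>Two</p></body>"))
print(is_well_formed("<body><p>One<p>Two</body>"))
```

A check like this only covers well-formedness. It does not verify XHTML-specific rules such as the required elements.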

XHTML Empty Elements Must Always be Closed


In XHTML, empty elements must always be closed, like this:

Correct:
A break: <br />
A horizontal rule: <hr />
An image: <img src="happy.gif" alt="Happy face" />

Wrong:
A break: <br>
A horizontal rule: <hr>
An image: <img src="happy.gif" alt="Happy face">

XHTML Elements Must be in Lowercase


In XHTML, element names must always be in lowercase, like this:

Correct:
<body>
<p>This is a paragraph</p>
</body>

Wrong:
<BODY>
<P>This is a paragraph</P>
</BODY>
XHTML Attribute Names Must be in Lowercase
In XHTML, attribute names must always be in lowercase, like this:

Correct:
<a href="https://www.w3schools.com/html/">Visit our HTML tutorial</a>

Wrong:
<a HREF="https://www.w3schools.com/html/">Visit our HTML tutorial</a>

XHTML Attribute Values Must be Quoted


In XHTML, attribute values must always be quoted, like this:

Correct:
<a href="https://www.w3schools.com/html/">Visit our HTML tutorial</a>

Wrong:
<a href=https://www.w3schools.com/html/>Visit our HTML tutorial</a>

XHTML Attribute Minimization is Forbidden


In XHTML, attribute minimization is forbidden:

Correct:
<input type="checkbox" name="vehicle" value="car" checked="checked" />
<input type="text" name="lastname" disabled="disabled" />

Wrong:
<input type="checkbox" name="vehicle" value="car" checked />
<input type="text" name="lastname" disabled />
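
The rules for empty elements, quoted attribute values, and attribute minimization also follow from XML syntax, so an XML parser rejects the "Wrong" forms. The sketch below uses Python's standard `xml.etree.ElementTree`; the helper name `xml_error` is an assumption, and the exact wording of the error message depends on the parser.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def xml_error(markup: str) -> Optional[str]:
    """Return the parser's error message, or None if the markup is well-formed."""
    try:
        ET.fromstring(markup)
        return None
    except ET.ParseError as err:
        return str(err)


samples = [
    '<p>A break: <br /></p>',                      # empty element closed
    '<p>A break: <br></p>',                        # empty element not closed
    '<a href="https://www.w3schools.com/html/">Visit</a>',  # quoted value
    '<a href=https://www.w3schools.com/html/>Visit</a>',    # unquoted value
    '<input type="checkbox" checked="checked" />',  # full attribute
    '<input type="checkbox" checked />',            # minimized attribute
]

for markup in samples:
    print(markup, "->", xml_error(markup) or "OK")
```

The lowercase rules are different: an XML parser accepts `<BODY>` as well-formed, so uppercase names have to be caught by validating against the XHTML DTD instead.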
