
HTTP QuickTutorials

Uploaded by

sathish

Tutorials: HTTP - Quick Guide

Sunnie Chung

The Hypertext Transfer Protocol (HTTP) is an application-level protocol for distributed,
collaborative, hypermedia information systems. It has been the foundation of data communication
for the World Wide Web since 1990. HTTP is a generic and stateless protocol which can be used
for other purposes as well, through extensions of its request methods, error codes, and
headers.

Basically, HTTP is a TCP/IP-based communication protocol that is used to deliver data (HTML
files, image files, query results, etc.) on the World Wide Web. The default port is TCP 80, but other
ports can be used. It provides a standardized way for computers to communicate with each
other. The HTTP specification defines how clients' request data is constructed and sent to the
server, and how servers respond to these requests.

Basic Features
The following three basic features make HTTP a simple but powerful protocol:

• HTTP is connectionless: The HTTP client (for example, a browser) opens a connection,
sends a request, and waits for the response on that same connection. The server processes
the request and sends the response back; in HTTP/1.0 the connection is then closed, so
no connection is kept between one exchange and the next.

• HTTP is media independent: Any type of data can be sent by HTTP as long as both the
client and the server know how to handle the data content. Both the client and the server
are required to specify the content type using an appropriate MIME type.

• HTTP is stateless: Because HTTP is connectionless, as described above, it is also a
stateless protocol. The server and client are aware of each other only during the current
request. Afterwards, both of them forget about each other. Due to this nature of the
protocol, neither the client nor the server can retain information between different
requests across web pages.

HTTP/1.0 uses a new connection for each request/response exchange, whereas an
HTTP/1.1 connection may be used for one or more request/response exchanges.

Basic Architecture
The following diagram shows a very basic architecture of a web application and depicts where
HTTP sits:

HTTP is a request/response protocol based on a client/server architecture, where web browsers,
robots, search engines, etc. act as HTTP clients and the web server acts as the server.

Client
The HTTP client sends a request to the server in the form of a request method, URI, and protocol
version, followed by a MIME-like message containing request modifiers, client information, and
possible body content over a TCP/IP connection.

Server
The HTTP server responds with a status line, including the message's protocol version and a
success or error code, followed by a MIME-like message containing server information, entity
metainformation, and possible entity-body content.

HTTP - Parameters
This chapter lists a few of the important HTTP protocol parameters and their syntax as they are
used in communication, for example the format for dates and the format of URLs. This will help
you construct your request and response messages when writing HTTP client or server programs.
You will see the complete usage of these parameters in subsequent chapters, which explain the
message structure for HTTP requests and responses.

HTTP Version
HTTP uses a <major>.<minor> numbering scheme to indicate versions of the protocol. The
version of an HTTP message is indicated by an HTTP-Version field in the first line. Here is the
general syntax of specifying HTTP version number:

HTTP-Version = "HTTP" "/" 1*DIGIT "." 1*DIGIT

Example
HTTP/1.0

or

HTTP/1.1
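
The grammar above can be checked mechanically. The following is a minimal Python sketch, assuming a hypothetical helper named parse_http_version (it is not part of any library), that validates a version string and extracts its two numbers:

```python
import re

# "HTTP" "/" 1*DIGIT "." 1*DIGIT: one or more digits on each side of the dot.
HTTP_VERSION_RE = re.compile(r"HTTP/([0-9]+)\.([0-9]+)")

def parse_http_version(text):
    """Return (major, minor) as integers, or raise ValueError."""
    match = HTTP_VERSION_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not an HTTP-Version: {text!r}")
    return int(match.group(1)), int(match.group(2))

print(parse_http_version("HTTP/1.0"))
print(parse_http_version("HTTP/1.1"))
```

Returning the numbers as a tuple of integers lets versions be compared numerically rather than as strings.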

Uniform Resource Identifiers (URI)


A Uniform Resource Identifier (URI) is a simply formatted string containing a name, location,
etc. that identifies a resource, for example a website or a web service. The scheme and host
name are case-insensitive. The general syntax of a URI used for HTTP is as follows:

URI = "http:" "//" host [ ":" port ] [ abs_path [ "?" query ]]

Here, if the port is empty or not given, port 80 is assumed for HTTP, and an empty abs_path is
equivalent to an abs_path of "/". Characters other than those in the reserved and unsafe
sets are equivalent to their "%" HEX HEX encoding.

Example
The following three URIs are equivalent:

[Link]
[Link]
[Link]
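
The equivalence rules above can be sketched in Python. This is only an illustration under stated assumptions: normalize_http_uri is a hypothetical helper, the example.com URIs are made up for the demonstration, and the set of characters that may be safely decoded is taken to be the RFC 3986 unreserved set.

```python
import re
import string
from urllib.parse import urlsplit

# Assumption of this sketch: only these characters may be decoded safely.
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")

def _normalize_escape(match):
    """Decode %HH if it encodes an unreserved character, else uppercase it."""
    char = chr(int(match.group(1), 16))
    return char if char in UNRESERVED else match.group(0).upper()

def normalize_http_uri(uri):
    """Return a tuple that compares equal for equivalent http URIs."""
    parts = urlsplit(uri)
    host = (parts.hostname or "").lower()   # host names are case-insensitive
    port = parts.port or 80                 # a missing port means port 80
    path = parts.path or "/"                # an empty abs_path means "/"
    path = re.sub(r"%([0-9A-Fa-f]{2})", _normalize_escape, path)
    return (parts.scheme.lower(), host, port, path, parts.query)

# Hypothetical URIs that differ only in ways the rules treat as equivalent.
a = normalize_http_uri("http://example.com:80/~smith/home.html")
b = normalize_http_uri("http://EXAMPLE.com/%7Esmith/home.html")
c = normalize_http_uri("http://EXAMPLE.com/%7esmith/home.html")
print(a == b == c)
```

Escapes of reserved characters such as %2F are left encoded, because decoding them would change the meaning of the path.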

Date/Time Formats
All HTTP date/time stamps MUST be represented in Greenwich Mean Time (GMT), without
exception. HTTP applications are allowed to use any of the following three representations of
date/time stamps:

Sun, 06 Nov 1994 [Link] GMT ; RFC 822, updated by RFC 1123
Sunday, 06-Nov-94 [Link] GMT ; RFC 850, obsoleted by RFC 1036
Sun Nov 6 [Link] 1994 ; ANSI C's asctime() format
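
As a small sketch of producing and reading the first (preferred) format, Python's standard email.utils module can be used. The timestamp below is an assumed sample value chosen for illustration only.

```python
from email.utils import formatdate, parsedate_to_datetime

# Assumed sample instant, in seconds since the Unix epoch (UTC).
timestamp = 784111777

# usegmt=True emits the literal "GMT" zone that HTTP requires.
http_date = formatdate(timestamp, usegmt=True)
print(http_date)

# Parsing the string back gives a timezone-aware datetime in UTC.
parsed = parsedate_to_datetime(http_date)
print(parsed.isoformat())
```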

Character Sets
Character set names identify the character encodings that the client prefers. Multiple character
sets can be listed, separated by commas. If a value is not specified, the default is US-ASCII.

Example
Following are valid character sets:

US-ASCII

or

ISO-8859-1

or

ISO-8859-7

Content Encodings
Content-coding values indicate that an encoding algorithm has been used to encode the content
before passing it over the network. Content codings are primarily used to allow a document to
be compressed or otherwise usefully transformed without losing the identity of its underlying
media type.

All content-coding values are case-insensitive. HTTP/1.1 uses content-coding values in the
Accept-Encoding and Content-Encoding header fields, which we will see in subsequent chapters.

Example
Following are valid encoding schemes:

Accept-encoding: gzip

or

Accept-encoding: compress

or

Accept-encoding: deflate
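
To illustrate what a content coding does to a body, here is a minimal Python sketch using the standard gzip and zlib modules. The sample body is made up, and the mapping of the "deflate" coding to zlib-wrapped data is stated as this sketch's assumption.

```python
import gzip
import zlib

# A made-up, repetitive body so that compression has something to gain.
body = b"<html><body><h1>Hello, World!</h1></body></html>" * 20

gzipped = gzip.compress(body)    # what "gzip" content coding carries
deflated = zlib.compress(body)   # "deflate", assumed here to be zlib-wrapped

# The receiver reverses the coding and recovers the identical bytes.
print(gzip.decompress(gzipped) == body)
print(zlib.decompress(deflated) == body)
print(len(body), len(gzipped), len(deflated))
```

The coding changes only the bytes on the wire; the media type of the decoded content stays the same.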

Media Types
HTTP uses Internet Media Types in the Content-Type and Accept header fields in order to
provide open and extensible data typing and type negotiation. All media-type values are
registered with the Internet Assigned Numbers Authority (IANA). The general syntax for
specifying a media type is as follows:

media-type = type "/" subtype *( ";" parameter )

The type, subtype, and parameter attribute names are case-insensitive.

Example
Accept: image/gif
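
The media-type syntax can be taken apart with a few string operations. The following Python sketch assumes a hypothetical helper named parse_media_type and does not handle quoted parameter values that contain semicolons:

```python
def parse_media_type(value):
    """Split 'type/subtype; name=value' into (type, subtype, params)."""
    pieces = [piece.strip() for piece in value.split(";")]
    type_, _, subtype = pieces[0].partition("/")
    params = {}
    for piece in pieces[1:]:
        if piece:
            name, _, param_value = piece.partition("=")
            # Attribute names are case-insensitive, so store them lowercased.
            params[name.strip().lower()] = param_value.strip().strip('"')
    # Type and subtype are case-insensitive as well.
    return type_.lower(), subtype.lower(), params

print(parse_media_type("image/gif"))
print(parse_media_type("Text/HTML; Charset=utf-8"))
```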

Language Tags
HTTP uses language tags within the Accept-Language and Content-Language fields. A
language tag is composed of one or more parts: a primary language tag and a possibly empty
series of subtags:

language-tag = primary-tag *( "-" subtag )

White space is not allowed within the tag and all tags are case-insensitive.

Example
Example tags include:

en, en-US, en-cockney, i-cherokee, x-pig-latin

Where any two-letter primary-tag is an ISO-639 language abbreviation and any two-letter initial
subtag is an ISO-3166 country code.

HTTP - Messages
HTTP is based on the client-server architecture model. It is a stateless request/response
protocol that operates by exchanging messages across a reliable TCP/IP connection.

An HTTP "client" is a program (a web browser or any other client) that establishes a connection to
a server for the purpose of sending one or more HTTP request messages. An HTTP "server" is a
program (generally a web server such as Apache Web Server or Internet Information Services, IIS)
that accepts connections in order to serve HTTP requests by sending HTTP response
messages.

HTTP makes use of the Uniform Resource Identifier (URI) to identify a given resource and to
establish a connection. Once the connection is established, HTTP messages are passed in a format
similar to that used by Internet mail [RFC5322] and the Multipurpose Internet Mail Extensions
(MIME) [RFC2045]. These messages consist of requests from client to server and
responses from server to client, which have the following format:

HTTP-message = <Request> | <Response> ; HTTP/1.1 messages

HTTP request and HTTP response use a generic message format of RFC 822 for transferring the
required data. This generic message format consists of following four items.

• A Start-line

• Zero or more header fields followed by CRLF


• An empty line (i.e., a line with nothing preceding the CRLF) indicating the end of
the header fields

• Optionally a message-body

The following sections explain each of the entities used in an HTTP message.

Message Start-Line
A start-line will have following generic syntax:

start-line = Request-Line | Status-Line

We will discuss Request-Line and Status-Line while discussing HTTP Request and HTTP Response
messages respectively. For now let's see the examples of start line in case of request and
response:

GET /[Link] HTTP/1.1 (This is Request-Line sent by the client)

HTTP/1.1 200 OK (This is Status-Line sent by the server)

Header Fields
HTTP header fields provide required information about the request or response, or about the
object sent in the message body. There are four types of HTTP message headers:

• General-header: These header fields have general applicability for both request and
response messages.

• Request-header: These header fields are applicable only to request messages.

• Response-header: These header fields are applicable only to response messages.

• Entity-header: These header fields define metainformation about the entity-body or, if
no body is present, about the resource identified by the request.

All of the above headers follow the same generic format: each header field
consists of a name followed by a colon (:) and the field value, as follows:

message-header = field-name ":" [ field-value ]

Following are the examples of various header fields:

User-Agent: curl/7.16.3 libcurl/7.16.3 OpenSSL/0.9.7l zlib/1.2.3
Host: [Link]
Accept-Language: en, mi
Date: Mon, 27 Jul 2009 [Link] GMT
Server: Apache
Last-Modified: Wed, 22 Jul 2009 [Link] GMT
ETag: "34aa387-d-1568eb00"
Accept-Ranges: bytes
Content-Length: 51
Vary: Accept-Encoding
Content-Type: text/plain
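
Header lines in this format are easy to turn into a lookup table. The Python sketch below assumes a hypothetical helper named parse_header_lines; the host name and timestamp in the sample are made up, and repeated field names simply overwrite each other, which a real parser would need to handle.

```python
def parse_header_lines(lines):
    """Map lowercased field names to their stripped field values."""
    headers = {}
    for line in lines:
        # Split at the first colon only; values such as dates contain colons.
        name, colon, value = line.partition(":")
        if not colon:
            raise ValueError(f"not a header field: {line!r}")
        # Field names are case-insensitive, so normalise them to lowercase.
        headers[name.strip().lower()] = value.strip()
    return headers

sample = [
    "Host: www.example.com",
    "Accept-Language: en, mi",
    "Date: Mon, 27 Jul 2009 12:28:53 GMT",
    "Content-Length: 51",
]
headers = parse_header_lines(sample)
print(headers["date"])
print(int(headers["content-length"]))
```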

Message Body
The message body is optional for an HTTP message, but if it is present it carries the
entity-body associated with the request or response. If an entity-body is associated, then
the Content-Type and Content-Length header lines usually specify the nature of the body.

The message body carries the actual HTTP request data (including form data, uploaded files,
etc.) and the HTTP response data from the server (including files, images, etc.). The following
is the simple content of a message body:

<html>
<body>
<h1>Hello, World!</h1>
</body>
</html>
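
The four parts of the generic message format can be separated with a short routine. This is a minimal Python sketch, assuming a hypothetical helper named split_message and a complete message already held in memory:

```python
def split_message(raw):
    """Split raw message bytes into (start_line, header_lines, body)."""
    # The first empty line (CRLF CRLF) ends the header section.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    return lines[0], lines[1:], body

raw = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/plain\r\n"
    b"Content-Length: 5\r\n"
    b"\r\n"
    b"hello"
)
start_line, header_lines, body = split_message(raw)
print(start_line)
print(header_lines)
print(body)
```

The body is kept as bytes because its interpretation depends on the Content-Type header, not on the message framing.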

HTTP - Requests
An HTTP client sends an HTTP request to a server in the form of a request message, which has
the following format:

• A Request-line

• Zero or more header (General|Request|Entity) fields followed by CRLF

• An empty line (i.e., a line with nothing preceding the CRLF) indicating the end of
the header fields

• Optionally a message-body

The following sections explain each of the entities used in an HTTP request message.

Message Request-Line
The Request-Line begins with a method token, followed by the Request-URI and the protocol
version, and ending with CRLF. The elements are separated by space SP characters.

Request-Line = Method SP Request-URI SP HTTP-Version CRLF
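
The Request-Line grammar maps directly onto a split on single spaces. The Python sketch below assumes a hypothetical helper named parse_request_line and a made-up path; it checks only the overall shape, not the validity of each element.

```python
def parse_request_line(line):
    """Return (method, request_uri, http_version) from a Request-Line."""
    if not line.endswith("\r\n"):
        raise ValueError("Request-Line must end with CRLF")
    parts = line[:-2].split(" ")
    # Exactly three elements separated by single SP characters.
    if len(parts) != 3 or "" in parts:
        raise ValueError(f"malformed Request-Line: {line!r}")
    method, request_uri, http_version = parts
    return method, request_uri, http_version

print(parse_request_line("GET /index.html HTTP/1.1\r\n"))
print(parse_request_line("OPTIONS * HTTP/1.1\r\n"))
```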

Let's discuss each part of the Request-Line.

Request Method
The request method indicates the method to be performed on the resource identified by the
given Request-URI. The method is case-sensitive and should always be written in uppercase.
The following methods are supported in HTTP/1.1:

S.N.  Method and Description

1     GET
      The GET method is used to retrieve information from the given server using a given URI.
      Requests using GET should only retrieve data and should have no other effect on the data.

2     HEAD
      Same as GET, but transfers only the status line and header section.

3     POST
      A POST request is used to send data to the server, for example customer information or a
      file upload, using HTML forms.

4     PUT
      Replaces all current representations of the target resource with the uploaded content.

5     DELETE
      Removes all current representations of the target resource given by a URI.

6     CONNECT
      Establishes a tunnel to the server identified by a given URI.

7     OPTIONS
      Describes the communication options for the target resource.

8     TRACE
      Performs a message loop-back test along the path to the target resource.

Request-URI
The Request-URI is a Uniform Resource Identifier and identifies the resource upon which to apply
the request. The following are the most commonly used forms to specify a URI:

Request-URI = "*" | absoluteURI | abs_path | authority

S.N.  Form and Description

1     The asterisk * is used when an HTTP request does not apply to a particular resource, but to
      the server itself, and is only allowed when the method used does not necessarily apply to a
      resource. For example:

      OPTIONS * HTTP/1.1

2     The absoluteURI is used when an HTTP request is being made to a proxy. The proxy is
      requested to forward the request or service it from a valid cache, and return the response.
      For example:

      GET [Link] HTTP/1.1

3     The most common form of Request-URI is that used to identify a resource on an origin server
      or gateway. For example, a client wishing to retrieve the resource above directly from the
      origin server would create a TCP connection to port 80 of the host "[Link]" and send the
      lines:

      GET /pub/WWW/[Link] HTTP/1.1
      Host: [Link]

      Note that the absolute path cannot be empty; if none is present in the original URI, it
      MUST be given as "/" (the server root).

Request Header Fields


We will study General-header and Entity-header fields in a separate chapter on HTTP
header fields. For now, let's look at the Request-header fields.

The request-header fields allow the client to pass additional information about the request, and
about the client itself, to the server. These fields act as request modifiers. The following
important request-header fields can be used as required:

• Accept-Charset

• Accept-Encoding

• Accept-Language

• Authorization

• Expect

• From

• Host

• If-Match

• If-Modified-Since

• If-None-Match

• If-Range

• If-Unmodified-Since

• Max-Forwards

• Proxy-Authorization

• Range

• Referer

• TE

• User-Agent

You can introduce your custom fields in case you are going to write your own custom Client and
Web Server.

Request Message Examples


Now let's put it all together to form an HTTP request to fetch [Link] page from the web
server running on [Link]

GET /[Link] HTTP/1.1
User-Agent: Mozilla/4.0 (compatible; MSIE5.01; Windows NT)
Host: [Link]
Accept-Language: en-us
Accept-Encoding: gzip, deflate
Connection: Keep-Alive

Here we are not sending any request data to the server because we are fetching a plain HTML
page from the server. Connection is a general-header and the rest of the headers are
request headers. The following is one more example, where we send form data to the server using
the request message body:

POST /cgi-bin/[Link] HTTP/1.1
User-Agent: Mozilla/4.0 (compatible; MSIE5.01; Windows NT)
Host: [Link]
Content-Type: application/x-www-form-urlencoded
Content-Length: length
Accept-Language: en-us
Accept-Encoding: gzip, deflate
Connection: Keep-Alive

licenseID=string&content=string&/paramsXML=string

Here the given URL /cgi-bin/[Link] will be used to process the passed data, and a
response will be returned accordingly. The Content-Type tells the server that the passed data
is simple web form data, and length will be the actual length of the data put in the message
body. The following example shows how you can pass plain XML to your web server:

POST /cgi-bin/[Link] HTTP/1.1
User-Agent: Mozilla/4.0 (compatible; MSIE5.01; Windows NT)
Host: [Link]
Content-Type: text/xml; charset=utf-8
Content-Length: length
Accept-Language: en-us
Accept-Encoding: gzip, deflate
Connection: Keep-Alive

<?xml version="1.0" encoding="utf-8"?>


<string xmlns="[Link]
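
In the POST examples above, "length" stands for the actual number of bytes in the message body. As a hedged sketch of how a client might fill it in, the following Python code builds a form POST; build_form_post, the host name, the script path, and the field values are all hypothetical and chosen only for illustration.

```python
from urllib.parse import urlencode

def build_form_post(host, path, fields):
    """Return the raw bytes of a form-encoded POST request."""
    # Encode the fields first, so the real body length is known.
    body = urlencode(fields).encode("ascii")
    head = (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Content-Type: application/x-www-form-urlencoded\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    ).encode("ascii")
    return head + body

request = build_form_post(
    "www.example.com",
    "/cgi-bin/example.cgi",
    {"licenseID": "abc 123", "content": "a&b"},
)
print(request.decode("ascii"))
```

Content-Length counts bytes after encoding, which is why the body is encoded before the headers are written.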

HTTP - Responses
After receiving and interpreting a request message, a server responds with an HTTP response
message, which has the following format:

• A Status-line

• Zero or more header (General|Response|Entity) fields followed by CRLF

• An empty line (i.e., a line with nothing preceding the CRLF) indicating the end of
the header fields

• Optionally a message-body

The following sections explain each of the entities used in an HTTP response message.

Message Status-Line
The Status-Line consists of the protocol version followed by a numeric status code and its
associated textual phrase. The elements are separated by space SP characters.

Status-Line = HTTP-Version SP Status-Code SP Reason-Phrase CRLF
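
Unlike the Request-Line, the last element of a Status-Line may itself contain spaces, so a parser has to limit the number of splits. A minimal Python sketch, assuming a hypothetical helper named parse_status_line:

```python
def parse_status_line(line):
    """Return (http_version, status_code, reason_phrase)."""
    # Split at most twice: the Reason-Phrase may contain spaces.
    parts = line.rstrip("\r\n").split(" ", 2)
    if len(parts) != 3:
        raise ValueError(f"malformed Status-Line: {line!r}")
    version, code, reason = parts
    if len(code) != 3 or not (code.isascii() and code.isdigit()):
        raise ValueError(f"Status-Code must be 3 digits: {code!r}")
    return version, int(code), reason

print(parse_status_line("HTTP/1.1 200 OK\r\n"))
print(parse_status_line("HTTP/1.1 404 Not Found\r\n"))
```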

Let's discuss each part of the Status-Line.

HTTP Version
A server supporting HTTP version 1.1 will return following version information:

HTTP-Version = HTTP/1.1

Status Code
The Status-Code element is a 3-digit integer where the first digit defines the class of
response and the last two digits do not have any categorization role. There are 5 values for
the first digit:

S.N.  Code and Description

1     1xx: Informational
      The request was received and processing is continuing.

2     2xx: Success
      The action was successfully received, understood, and accepted.

3     3xx: Redirection
      Further action must be taken in order to complete the request.

4     4xx: Client Error
      The request contains bad syntax or cannot be fulfilled.

5     5xx: Server Error
      The server failed to fulfill an apparently valid request.

HTTP status codes are extensible and HTTP applications are not required to understand the
meaning of all registered status codes. A list of all the status codes is given in a separate
chapter for your reference.
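
Because only the first digit carries the class, an application can handle an unfamiliar status code by its class alone. A short Python sketch, assuming a hypothetical helper named status_class:

```python
STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}

def status_class(code):
    """Return the class name for a 3-digit status code."""
    if not 100 <= code <= 599:
        raise ValueError(f"status code out of range: {code}")
    # Integer division by 100 keeps only the first digit.
    return STATUS_CLASSES[code // 100]

print(status_class(200))
print(status_class(404))
print(status_class(599))  # not a registered code, but still a server error
```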

Response Header Fields


We will study General-header and Entity-header fields in a separate chapter on HTTP
header fields. For now, let's look at the Response-header fields.

The response-header fields allow the server to pass additional information about the response
which cannot be placed in the Status-Line. These header fields give information about the server
and about further access to the resource identified by the Request-URI.

• Accept-Ranges

• Age

• ETag

• Location

• Proxy-Authenticate

• Retry-After

• Server

• Vary

• WWW-Authenticate

You can introduce your custom fields in case you are going to write your own custom Web Client
and Server.

Response Message Examples


Now let's put it all together to form an HTTP response for a request to fetch [Link] page
from the web server running on [Link]

HTTP/1.1 200 OK
Date: Mon, 27 Jul 2009 [Link] GMT
Server: Apache/2.2.14 (Win32)
Last-Modified: Wed, 22 Jul 2009 [Link] GMT
Content-Length: 88
Content-Type: text/html
Connection: close

<html>
<body>
<h1>Hello, World!</h1>
</body>
</html>

Following is an example of HTTP response message showing error condition when web server
could not find requested page:

HTTP/1.1 404 Not Found
Date: Thu, 18 Oct 2012 [Link] GMT
Server: Apache/2.2.14 (Win32)
Content-Length: 230
Connection: close
Content-Type: text/html; charset=iso-8859-1

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html>
<head>
<title>404 Not Found</title>
</head>
<body>
<h1>Not Found</h1>
<p>The requested URL /[Link] was not found on this server.</p>
</body>
</html>

Following is an example of HTTP response message showing error condition when web server
encountered a wrong HTTP version in given HTTP request:

HTTP/1.1 400 Bad Request
Date: Thu, 18 Oct 2012 [Link] GMT
Server: Apache/2.2.14 (Win32)
Content-Length: 230
Content-Type: text/html; charset=iso-8859-1
Connection: close

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html>
<head>
<title>400 Bad Request</title>
</head>
<body>
<h1>Bad Request</h1>
<p>Your browser sent a request that this server could not understand.</p>
<p>The request line contained invalid characters following the protocol string.</p>
</body>
</html>

HTTP - Methods
The set of common methods for HTTP/1.1 is defined below and this set can be expanded based
on requirement. These method names are case sensitive and they must be used in uppercase.

S.N.  Method and Description

1     GET
      The GET method is used to retrieve information from the given server using a given URI.
      Requests using GET should only retrieve data and should have no other effect on the data.

2     HEAD
      Same as GET, but transfers only the status line and header section.

3     POST
      A POST request is used to send data to the server, for example customer information or a
      file upload, using HTML forms.

4     PUT
      Replaces all current representations of the target resource with the uploaded content.

5     DELETE
      Removes all current representations of the target resource given by a URI.

6     CONNECT
      Establishes a tunnel to the server identified by a given URI.

7     OPTIONS
      Describes the communication options for the target resource.

8     TRACE
      Performs a message loop-back test along the path to the target resource.

GET Method

A GET request retrieves data from a web server by specifying parameters in the URL portion of
the request. This is the main method used for document retrieval. The following is a simple
example which makes use of the GET method to fetch [Link]:

GET /[Link] HTTP/1.1
User-Agent: Mozilla/4.0 (compatible; MSIE5.01; Windows NT)
Host: [Link]
Accept-Language: en-us
Accept-Encoding: gzip, deflate
Connection: Keep-Alive

The following is the server's response to the above GET request:

HTTP/1.1 200 OK
Date: Mon, 27 Jul 2009 [Link] GMT
Server: Apache/2.2.14 (Win32)
Last-Modified: Wed, 22 Jul 2009 [Link] GMT
ETag: "34aa387-d-1568eb00"
Vary: Authorization,Accept
Accept-Ranges: bytes
Content-Length: 88
Content-Type: text/html
Connection: close

<html>
<body>
<h1>Hello, World!</h1>
</body>
</html>

HEAD Method
The HEAD method is functionally like GET, except that the server replies with a status line and
headers, but no entity-body. The following is a simple example which makes use of the HEAD
method to fetch header information about [Link]:

HEAD /[Link] HTTP/1.1
User-Agent: Mozilla/4.0 (compatible; MSIE5.01; Windows NT)
Host: [Link]
Accept-Language: en-us
Accept-Encoding: gzip, deflate
Connection: Keep-Alive

The following is the server's response to the above HEAD request:

HTTP/1.1 200 OK
Date: Mon, 27 Jul 2009 [Link] GMT
Server: Apache/2.2.14 (Win32)
Last-Modified: Wed, 22 Jul 2009 [Link] GMT
ETag: "34aa387-d-1568eb00"
Vary: Authorization,Accept
Accept-Ranges: bytes
Content-Length: 88
Content-Type: text/html
Connection: close

Notice that here the server does not send any data after the headers.
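
The difference between the GET and HEAD responses can be sketched from the server's side. In the Python code below, build_response is a hypothetical helper and the body is a made-up page; the point is that the headers are identical and only the body is withheld for HEAD.

```python
def build_response(method, body, content_type="text/html"):
    """Return raw response bytes; HEAD gets the headers but no body."""
    head = (
        "HTTP/1.1 200 OK\r\n"
        f"Content-Type: {content_type}\r\n"
        # Content-Length describes the body a GET would have returned.
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    ).encode("ascii")
    return head if method == "HEAD" else head + body

page = b"<h1>Hello</h1>"
get_response = build_response("GET", page)
head_response = build_response("HEAD", page)
print(head_response.decode("ascii"))
print(get_response == head_response + page)
```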

POST Method
The POST method is used when you want to send some data to the server, for example a file
update or form data. The following is a simple example which makes use of the POST method to
send data to the server, where it will be processed by [Link], and finally a response will be
returned:

POST /cgi-bin/[Link] HTTP/1.1
User-Agent: Mozilla/4.0 (compatible; MSIE5.01; Windows NT)
Host: [Link]
Content-Type: text/xml; charset=utf-8
Content-Length: 88
Accept-Language: en-us
Accept-Encoding: gzip, deflate
Connection: Keep-Alive

<?xml version="1.0" encoding="utf-8"?>


<string xmlns="[Link]

The server-side script [Link] processes the passed data and sends the following response:

HTTP/1.1 200 OK
Date: Mon, 27 Jul 2009 [Link] GMT
Server: Apache/2.2.14 (Win32)
Last-Modified: Wed, 22 Jul 2009 [Link] GMT
ETag: "34aa387-d-1568eb00"
Vary: Authorization,Accept
Accept-Ranges: bytes
Content-Length: 88
Content-Type: text/html
Connection: close

<html>
<body>
<h1>Request Processed Successfully</h1>
</body>
</html>

PUT Method
The PUT method is used to request the server to store the included entity-body at a location
specified by the given URL. The following example requests the server to save the given
entity-body in [Link] at the root of the server:

PUT /[Link] HTTP/1.1
User-Agent: Mozilla/4.0 (compatible; MSIE5.01; Windows NT)
Host: [Link]
Accept-Language: en-us
Connection: Keep-Alive
Content-type: text/html
Content-Length: 182

<html>
<body>
<h1>Hello, World!</h1>
</body>
</html>

The server will store the given entity-body in the [Link] file and will send the following
response back to the client:

HTTP/1.1 201 Created
Date: Mon, 27 Jul 2009 [Link] GMT
Server: Apache/2.2.14 (Win32)
Content-type: text/html
Content-length: 30
Connection: close

<html>
<body>
<h1>The file was created.</h1>
</body>
</html>

DELETE Method
The DELETE method is used to request the server to delete a file at a location specified by the
given URL. The following example requests the server to delete the given file [Link] at the
root of the server:

DELETE /[Link] HTTP/1.1
User-Agent: Mozilla/4.0 (compatible; MSIE5.01; Windows NT)
Host: [Link]
Accept-Language: en-us
Connection: Keep-Alive

The server will delete the mentioned file [Link] and will send the following response back to
the client:

HTTP/1.1 200 OK
Date: Mon, 27 Jul 2009 [Link] GMT
Server: Apache/2.2.14 (Win32)
Content-type: text/html
Content-length: 30
Connection: close

<html>
<body>
<h1>URL deleted.</h1>
</body>
</html>

CONNECT Method
The CONNECT method is used by the client to establish a network connection to a web server
over HTTP. The following example requests a connection with a web server running on host
[Link]:

CONNECT [Link] HTTP/1.1
User-Agent: Mozilla/4.0 (compatible; MSIE5.01; Windows NT)

The connection is established with the server and the following response is sent back to the
client:

HTTP/1.1 200 Connection established
Date: Mon, 27 Jul 2009 [Link] GMT
Server: Apache/2.2.14 (Win32)

OPTIONS Method
The OPTIONS method is used by the client to find out which HTTP methods and other
options are supported by a web server. The client can specify a URL for the OPTIONS method, or
an asterisk (*) to refer to the entire server. The following example requests a list of methods
supported by a web server running on [Link]:

OPTIONS * HTTP/1.1
User-Agent: Mozilla/4.0 (compatible; MSIE5.01; Windows NT)

The server will send information based on the current configuration of the server, for example:

HTTP/1.1 200 OK
Date: Mon, 27 Jul 2009 [Link] GMT
Server: Apache/2.2.14 (Win32)
Allow: GET,HEAD,POST,OPTIONS,TRACE
Content-Type: httpd/unix-directory

TRACE Method
The TRACE method is used to echo the contents of an HTTP request back to the requester, which
can be used for debugging purposes at the time of development. The following example shows
the usage of the TRACE method:

TRACE / HTTP/1.1
Host: [Link]
User-Agent: Mozilla/4.0 (compatible; MSIE5.01; Windows NT)

The server will send the following message in response to the above request:

HTTP/1.1 200 OK
Date: Mon, 27 Jul 2009 [Link] GMT
Server: Apache/2.2.14 (Win32)
Content-Type: message/http
Content-Length: 39
Connection: close

TRACE / HTTP/1.1
Host: [Link]
User-Agent: Mozilla/4.0 (compatible; MSIE5.01; Windows NT)
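
The echo behaviour shown above can be sketched in a few lines of Python. Here build_trace_response is a hypothetical helper and the request is a made-up example; the response body is simply the received request, and Content-Length is its size in bytes.

```python
def build_trace_response(request):
    """Echo the received request bytes back as a message/http body."""
    head = (
        "HTTP/1.1 200 OK\r\n"
        "Content-Type: message/http\r\n"
        f"Content-Length: {len(request)}\r\n"
        "\r\n"
    ).encode("ascii")
    return head + request

request = b"TRACE / HTTP/1.1\r\nHost: www.example.com\r\n\r\n"
response = build_trace_response(request)
print(response.decode("ascii"))
```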
