Fundamentals

Uploaded by

SAKSHAM PRASAD
Copyright
© All Rights Reserved

 Impact of the World Wide Web

o Transformation of lives in industrialized and unindustrialized countries


o Daily use for communication, shopping, and information gathering
o Role in social and political demonstrations and revolutions
 Downsides of the Web
o Easier access to harmful content (e.g., pornography and gambling)
o Ease of spreading destructive ideas
 Upsides of the Web
o Communication with friends, relatives, and business associates
o Online shopping for a wide variety of products
o Access to limitless information on various topics

1. A Brief Introduction to the Internet

1.1 Origins

 The U.S. Department of Defense (DoD) developed a large-scale computer network in the
1960s for communications, program sharing, and remote computer access for defense-
related research.

 The network was required to be robust enough to continue functioning even if some
network nodes were lost due to sabotage, war, or other causes.

 ARPA (later renamed DARPA) funded the construction of the first network, ARPAnet,
connecting about a dozen research laboratories and universities, with the first node
established at UCLA in 1969.

 ARPAnet was primarily used for text-based communications through electronic mail and was
only available to ARPA-funded research institutions.

 Other networks, such as BITNET and CSNET, were developed in the late 1970s and early
1980s due to limited ARPAnet access.

 NSFnet, sponsored by the National Science Foundation (NSF), was created in 1986 and
eventually replaced ARPAnet for most nonmilitary uses. By 1992, NSFnet connected over 1
million computers worldwide.

 In 1995, a small part of NSFnet returned to being a research network, with the rest becoming
known as the Internet.

1.2 What Is the Internet?

 The Internet is a vast collection of interconnected computers and devices of various sizes,
configurations, and manufacturers.

 The Transmission Control Protocol/Internet Protocol (TCP/IP) is the single, low-level protocol
that allows diverse devices to communicate with each other. TCP/IP became the standard for
computer network connections in 1982.

 The Internet is a network of networks, with individual computers within an organization
connected to each other in a local network, which is then connected to the Internet.

 All devices connected to the Internet must be uniquely identifiable.


 TCP/IP is not the only communication protocol used by the Internet; User Datagram
Protocol/Internet Protocol (UDP/IP) is an alternative used in some situations.

1.3 Internet Protocol Addresses

 Internet nodes are identified in two ways: by names, which people use, and by numeric
addresses, which computers use

 Internet Protocol (IP) address is a unique 32-bit number for machines connected to the
Internet

 Written as four 8-bit numbers separated by periods

 Organizations assigned blocks of IP addresses for their machines that need Internet access

 IPv6 standard approved in late 1998, expanding address size from 32 bits to 128 bits
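The relationship between the dotted notation and the underlying 32-bit number can be sketched in a few lines of Python. This is only an illustration for IPv4; the function names are made up for this example, and 192.0.2.1 is a documentation address, not a real host.

```python
def ipv4_to_int(dotted: str) -> int:
    """Pack a dotted IPv4 address (four 8-bit fields) into one 32-bit integer."""
    parts = dotted.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four fields, got {len(parts)}")
    value = 0
    for part in parts:
        octet = int(part)
        if not 0 <= octet <= 255:
            raise ValueError(f"field out of 8-bit range: {part}")
        # Shift what we have so far left by 8 bits and append the new field.
        value = (value << 8) | octet
    return value


def int_to_ipv4(value: int) -> str:
    """Unpack a 32-bit integer back into dotted notation."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


if __name__ == "__main__":
    number = ipv4_to_int("192.0.2.1")
    print(number, int_to_ipv4(number))
```

Each of the four fields occupies 8 bits, which is why no field can exceed 255 and why the whole address fits in 32 bits.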

1.4 Domain Names

 Machines on the Internet have textual names due to people's difficulty dealing with numbers

 Domain names start with the host machine name, followed by progressively larger enclosing
collections of machines called domains

 Last domain name identifies the type of organization in which the host resides

 Fully qualified domain name (FQDN) is the combination of the hostname and all domain
names

 FQDN must be converted to an IP address before message transmission over the Internet

 Domain Name System (DNS) and name servers handle FQDN to IP address conversion

 FQDNs and IP addresses must be unique

Figure 1: Domain name conversion process

 telnet can be used to determine the IP address of a Web site

 Variety of protocols running on top of TCP/IP developed by the mid-1980s to support
different Internet uses

 World Wide Web emerged as a better approach to access the Internet's advantages
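The structure of a fully qualified domain name described in Section 1.4 can be illustrated with a short Python sketch. The name www.cs.example.edu and the function names are hypothetical; the actual conversion of an FQDN to an IP address is done by DNS name servers and is not shown here.

```python
from typing import List, Tuple


def split_fqdn(fqdn: str) -> Tuple[str, List[str]]:
    """Split an FQDN into its host name and the list of domain names."""
    labels = fqdn.rstrip(".").split(".")
    return labels[0], labels[1:]


def enclosing_domains(fqdn: str) -> List[str]:
    """List the enclosing domains, from the smallest to the largest."""
    _, domains = split_fqdn(fqdn)
    return [".".join(domains[i:]) for i in range(len(domains))]


if __name__ == "__main__":
    host, domains = split_fqdn("www.cs.example.edu")
    print(host)                                   # host machine name
    print(domains[-1])                            # last domain: organization type
    print(enclosing_domains("www.cs.example.edu"))
```

The first label is the host machine, and each later label names a larger collection of machines that encloses the one before it.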

2. The World Wide Web

2.1 Origins
 In 1989, Tim Berners-Lee and a small group at CERN proposed a new protocol and document
access system for the Internet called the World Wide Web

 The Web aimed to allow scientists worldwide to exchange documents describing their work

 The system enabled users to search for and retrieve documents from databases on any
document-serving computer connected to the Internet

 The Web used hypertext, which is text with embedded links to other documents, allowing
nonsequential browsing

 The first implementation of the Web was on a NeXT computer at CERN in late 1990, and it
was released to the world in 1991

2.2 Web or Internet?

 The Internet and the Web are not the same thing

 The Internet is a collection of computers and devices connected by communication
equipment

 The Web is a collection of software and protocols installed on most, if not all, computers on
the Internet

 Web servers provide documents, while Web clients (browsers) request and display
documents to users

 The Internet was useful before the Web, and it remains useful without it, but most users now
access the Internet through the Web

3. Web Browsers

 Web operates in a client-server configuration, with browsers acting as clients and servers
providing documents
 Early browsers were text-based, limiting the growth of the Web
 Mosaic, released in early 1993, was the first browser with a graphical user interface,
developed at NCSA, University of Illinois
 Mosaic's interface provided convenient access to the Web for non-scientist and non-
developer users
 Versions of Mosaic for Apple Macintosh and Microsoft Windows systems were released in
late 1993, leading to explosive growth in Web usage
 Browsers initiate communication with servers, which respond to requests
 Servers may provide static documents or request user input through the browser
 The Hypertext Transfer Protocol (HTTP) is the most common protocol used by the Web for
communication between browsers and servers
 Most commonly used browsers are Microsoft Internet Explorer (IE), Firefox, and Chrome,
with a focus on these in this text

4. Web Servers

4.1 Web Server Operation

 Web servers provide documents to requesting browsers

 Servers act when requests are made by browsers running on other computers
 Most commonly used web servers are Apache and Microsoft's Internet Information Server
(IIS)

 Web browsers initiate network communications with servers by sending URLs

 URLs can specify a data file or a program stored on the server

 All communications between a web client and a web server use the standard web protocol,
Hypertext Transfer Protocol (HTTP)

 Web servers monitor a communications port, accept HTTP commands, and perform specified
operations

4.2 General Server Characteristics

 File structure of a web server has two separate directories: document root and server root

 Document root stores web documents to which the server has direct access and normally
serves to clients

 Server root stores the server and its support software

 Files stored directly in the document root are available to clients through top-level URLs

 Virtual document trees allow part of the servable document collection to be stored outside
the directory at the document root

 Contemporary servers provide a wide variety of client services, including support for virtual
hosts and proxy servers

 Many servers can interact with database systems through server-side scripts

4.3 Apache

 Began as the NCSA server, httpd, with added features

 Most widely used web server due to its speed, reliability, and open-source nature

 Offers a long list of services beyond serving documents to clients

 Configuration information is read from a file when Apache begins execution

 Three configuration files: httpd.conf, srm.conf, and access.conf

 httpd.conf stores the directives that control Apache server behavior

4.4 IIS

 Most popular server on Windows platforms

 Supplied as part of Windows and considered a reasonably good server

 Apache and IIS provide similar services

 IIS is controlled by a window-based management program, IIS snap-in

 IIS snap-in controls both IIS and FTP, allowing site managers to set server parameters
 Accessed through Control Panel, Administrative Tools, and IIS Manager on Windows XP and
Vista

5. Uniform Resource Locators (URLs)

5.1 URL Formats

 All URLs have the same general format: scheme:object-address

 Common schemes include http, ftp, gopher, telnet, file, mailto, and news

 HTTP protocol is used to request and send HTML documents

 For the http scheme, the object address has the form:
//fully-qualified-domain-name/path-to-document

 File protocol is used for documents residing on the machine running the browser; its
object address has the form: file://path-to-document

 Host name is the name of the server computer that stores the document or provides access
to it

 Default port number of Web server processes is 80; if a server uses a different port number,
it must be attached to the host name in the URL

 URLs cannot contain embedded spaces or certain other special characters; each such
character must be coded as a percent sign (%) followed by its two-digit hexadecimal ASCII code

 E.g., if the name 'RV CE' has to be specified in a URL, it is written as 'RV%20CE' (20 is
the hexadecimal ASCII code for a space)
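The URL rules above (scheme, host name, optional port, path, and percent coding) can be tried out with Python's standard urllib.parse module. This is a minimal sketch; the host www.example.com, the port 8080, and the path are made up for illustration.

```python
from urllib.parse import quote, unquote, urlsplit

DEFAULT_HTTP_PORT = 80  # used when the URL does not attach a port to the host


def describe(url: str) -> dict:
    """Break an http URL into the parts discussed in Section 5.1."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port if parts.port is not None else DEFAULT_HTTP_PORT,
        "path": parts.path,                   # still percent-coded
        "decoded_path": unquote(parts.path),  # with the space restored
    }


if __name__ == "__main__":
    # quote() replaces the space with %20, as in the 'RV CE' example.
    coded_name = quote("RV CE")
    print(coded_name)
    print(describe("http://www.example.com:8080/docs/" + coded_name + "/index.html"))
    print(describe("http://www.example.com/index.html")["port"])
```

Note that the port only appears in the URL when the server does not use the default.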

5.2 URL Paths

 Path to the document for the HTTP protocol is similar to a path to a file or directory in an
operating system's file system

 Path is given by a sequence of directory names and a file name, separated by forward
slashes in the URL, even when the server's own file system uses a different separator (such as
backward slashes on Windows)

 Path in a URL can be complete (includes all directories along the way) or partial (relative to
some base path specified in the server's configuration files)

 If the specified document is a directory, its name is followed immediately by a slash

 If a directory does not have a file that the server recognizes as a home page, a directory
listing is constructed and returned to the browser
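How a server might map a URL path onto its document root, and fall back to a home page or a directory listing, can be sketched as follows. This is a simplified model, not how Apache or IIS is implemented: the document root /var/www/html, the home page name index.html, and the in-memory sets of files and directories are all assumptions made for the example.

```python
import posixpath

DOCUMENT_ROOT = "/var/www/html"   # hypothetical document root
HOME_PAGES = ("index.html",)      # hypothetical home page file names


def resolve(url_path, files, dirs):
    """Return ('file', path), ('listing', dir) or ('missing', path)."""
    full = posixpath.normpath(posixpath.join(DOCUMENT_ROOT, url_path.lstrip("/")))
    # Refuse anything that escapes the document root (e.g. via "..").
    if full != DOCUMENT_ROOT and not full.startswith(DOCUMENT_ROOT + "/"):
        return ("missing", full)
    if full in files:
        return ("file", full)
    if full in dirs:
        for name in HOME_PAGES:
            candidate = posixpath.join(full, name)
            if candidate in files:
                return ("file", candidate)      # directory with a home page
        return ("listing", full)                # no home page: build a listing
    return ("missing", full)


if __name__ == "__main__":
    files = {"/var/www/html/index.html", "/var/www/html/docs/guide.html"}
    dirs = {"/var/www/html", "/var/www/html/docs"}
    for path in ("/", "/docs/", "/docs/guide.html", "/nothing.html"):
        print(path, resolve(path, files, dirs))
```

A real server would consult the file system and its configuration files rather than two Python sets, but the decision order is the same as in the bullets above.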

6. Multipurpose Internet Mail Extensions (MIME)

6.1 Type Specifications

 MIME was developed to specify the format of documents sent via Internet mail

 Adopted by the Web to specify document types transmitted over the Web

 MIME format specification attached to the beginning of a document by the Web server

 MIME specifications have the form: type/subtype


 Common MIME types: text, image, video

 Common text subtypes: plain, html

 Common image subtypes: gif, jpeg

 Common video subtypes: mpeg, quicktime

 Servers determine the type of a document by using the file name's extension as the key into
a table of types
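The table lookup by file name extension can be sketched as a small Python dictionary. The table below is a made-up sample covering only the types named in this section, and the fallback type application/octet-stream is an assumption of this example; real servers ship with much larger tables.

```python
import posixpath

# Sample table: file name extension -> MIME specification (type/subtype).
MIME_TYPES = {
    ".txt": "text/plain",
    ".html": "text/html",
    ".gif": "image/gif",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".mpeg": "video/mpeg",
    ".mov": "video/quicktime",
}


def mime_type_for(filename, default="application/octet-stream"):
    """Look up the MIME type using the file name's extension as the key."""
    _, extension = posixpath.splitext(filename)
    return MIME_TYPES.get(extension.lower(), default)


if __name__ == "__main__":
    for name in ("index.html", "logo.GIF", "notes.txt", "archive.xyz"):
        print(name, "->", mime_type_for(name))
```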

6.2 Experimental Document Types

 Experimental subtypes begin with x-, e.g., video/x-msvideo

 Web providers can add an experimental subtype to the list of MIME specifications stored in
their server

 Browser must supply a program (helper application or plug-in) to display the contents of
experimental document types

 Browsers have a set of MIME specifications they can handle, and an error message is
displayed if they cannot render a document

 Browsers can indicate to the server their preferred document types to receive (discussed in
Section 7)

7. The Hypertext Transfer Protocol (HTTP)

 All Web communications use HTTP

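 These notes describe HTTP 1.1, originally specified in RFC 2616 (since superseded by newer
RFCs); HTTP/2 and HTTP/3 are later versions of the protocol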

 HTTP consists of two phases: request and response

 Each HTTP communication has two parts: header and body

7.1 The Request Phase

 General form: request line (HTTP method, path part of the URL, HTTP version), header fields,
blank line, message body

 Commonly used HTTP request methods: GET, HEAD, POST, PUT, DELETE

 GET: returns the contents of the specified document

 HEAD: returns the header information for the specified document

 POST: executes the specified document using the enclosed data

 PUT: replaces the specified document with the enclosed data

 DELETE: deletes the specified document

 Most common request methods: GET and POST

 Header fields provide additional information, such as Accept (preferred MIME type) and Host
(host name)
 If-Modified-Since: date request field specifies that the requested file should be sent only if it
has been modified since the given date

 Content-length field gives the length of the response body in bytes

 Header of a request must be followed by a blank line

 A browser is not necessary to communicate with a Web server; telnet can be used instead
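The general form of a request can be made concrete by assembling one as text. The sketch below only builds the message and does not send it anywhere; the function name, the host www.example.com, and the path /storefront.html are made up for illustration.

```python
def build_request(method, path, host, headers=None, body=""):
    """Assemble an HTTP 1.1 request: request line, header fields, blank line, body."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    if body:
        # The length is counted in bytes, not characters.
        lines.append(f"Content-Length: {len(body.encode('utf-8'))}")
    # Lines end with CR LF; the empty line marks the end of the header.
    return "\r\n".join(lines) + "\r\n\r\n" + body


if __name__ == "__main__":
    print(build_request("GET", "/storefront.html", "www.example.com",
                        {"Accept": "text/html"}))
```

This is the same text one would type by hand in a telnet session connected to port 80 of the server.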

7.2 The Response Phase

 General form of HTTP response consists of status line, response header fields, blank line, and
response body
 Status line includes HTTP version, three-digit status code, and short textual explanation of
status code
 Status codes categorized into five groups:
o Informational (1xx)
o Success (2xx)
o Redirection (3xx)
o Client error (4xx)
o Server error (5xx)
 Common status codes:
o 200 OK: request handled without error
o 404 Not Found: requested file not found
o 500 Internal Server Error: server encountered a problem
 Response header contains several lines of information about the response, with
Content-type being the essential field
 Response header followed by a blank line, then the response body (e.g., HTML file)
 HTTP 1.1 default operation keeps the connection open for a time, allowing multiple requests
without reestablishing the connection, increasing web efficiency
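The parts of a response listed above can be picked apart with a few string operations. This is a minimal sketch that works on a response already held as text; the sample response and the function name are made up for the example, and real responses need more careful handling.

```python
STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client error",
    5: "Server error",
}


def parse_response(raw):
    """Split a response into status line, header fields, and body."""
    head, _, body = raw.partition("\r\n\r\n")      # the blank line ends the header
    status_line, *header_lines = head.split("\r\n")
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return {
        "version": version,
        "code": int(code),
        "reason": reason,
        "class": STATUS_CLASSES.get(int(code) // 100, "Unknown"),
        "headers": headers,
        "body": body,
    }


if __name__ == "__main__":
    sample = "HTTP/1.1 404 Not Found\r\nContent-type: text/html\r\n\r\n<html></html>"
    print(parse_response(sample))
```

The first digit of the status code selects one of the five groups, which is why dividing the code by 100 is enough to classify it.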

8. Security

 Internet and Web are prone to security problems


 Web server side: anyone can request software execution or access data on the server
 Browser end: any server can download software to be executed on the browser host
machine
 Security issues for transactions (e.g., credit card purchase): privacy, integrity, authentication,
nonrepudiation
 Each of these issues means the following:
o Privacy: it shouldn’t be possible to steal the data while it is being transmitted
o Integrity: it shouldn’t be possible to modify the data while it is being transmitted
o Authentication: it should be possible for both ends to be certain of each other’s
identity
o Nonrepudiation: it should be possible to legally prove that the message was actually
sent and received
 Encryption is the basic tool to support privacy and integrity
 Public-key encryption uses a pair of keys: a public key, which anyone can use to encrypt
messages, and a matching private key, which only the receiver holds and uses to decrypt them
 RSA is the most widely used public-key algorithm
 Intentional and malicious destruction of data on Internet-connected computers is another
security problem
 Denial-of-service (DoS) attacks, viruses, and worms cause billions of dollars in damage
 DoS attacks flood a Web server with requests, overwhelming its ability to operate effectively
 Viruses replicate and overwrite memory, destroying programs and data
 Worms spread on their own and damage memory
 Protection against viruses and worms is provided by antivirus software, which must be
updated frequently
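The public-key idea mentioned above can be illustrated with a toy version of RSA. The numbers here are deliberately tiny so the arithmetic is easy to follow; this is only a sketch of the principle and offers no security. Real RSA uses primes hundreds of digits long plus padding schemes, and the values 61, 53, and 17 are chosen purely for illustration. The three-argument pow with exponent -1 needs Python 3.8 or later.

```python
# Toy RSA key generation with tiny primes (illustration only).
P, Q = 61, 53
N = P * Q                    # modulus, part of both keys
PHI = (P - 1) * (Q - 1)      # used only while generating the keys
E = 17                       # public exponent: (N, E) is the public key
D = pow(E, -1, PHI)          # private exponent: (N, D) is the private key


def encrypt(message: int) -> int:
    """Anyone who knows the public key can encrypt."""
    return pow(message, E, N)


def decrypt(ciphertext: int) -> int:
    """Only the holder of the private key can decrypt."""
    return pow(ciphertext, D, N)


if __name__ == "__main__":
    secret = 65              # a message must be a number smaller than N
    ciphertext = encrypt(secret)
    print(ciphertext, decrypt(ciphertext))
```

Encrypting with the public key and decrypting with the private key returns the original number, which is the property that supports privacy in transit.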

9. The Web Programmer's Toolbox

 HTML: A markup language used to describe the form and layout of documents for display in
a browser

 XML: A meta-markup language used to define custom markup languages

 JavaScript: A client-side scripting language used for creating dynamic web content

 PHP: A server-side scripting language used for creating dynamic web content

 Ruby: A server-side scripting language used for creating dynamic web content

 JSF (JavaServer Faces): A Java-based framework for building web applications

 ASP.NET: A Microsoft framework for building web applications

 Rails (Ruby on Rails): A Ruby-based framework for building web applications

 Flash: A technology for creating and displaying graphics and animation in HTML documents

 Ajax (Asynchronous JavaScript and XML): A web technology used for creating dynamic and
interactive web applications

9.1 Overview of HTML

 HTML is not a programming language; it cannot describe computations

 HTML documents consist of content and controls specified by tags

 Tags delimit particular kinds of content and form elements

 Some tags include attribute specifications for additional information
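To see tags and attributes as a program sees them, the sketch below feeds a one-line HTML fragment to Python's standard html.parser module and lists each opening tag with its attributes. The fragment and the class name TagLister are made up for this example.

```python
from html.parser import HTMLParser


class TagLister(HTMLParser):
    """Collect every opening tag together with its attribute specifications."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs.
        self.tags.append((tag, dict(attrs)))


def list_tags(markup):
    parser = TagLister()
    parser.feed(markup)
    parser.close()
    return parser.tags


if __name__ == "__main__":
    sample = '<p>See the <a href="notes.html">notes</a>.</p>'
    print(list_tags(sample))
```

Here the p tag delimits a paragraph and carries no attributes, while the a tag carries an href attribute that supplies the additional information (the link target).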

9.2 Tools for Creating HTML Documents

 HTML documents can be created with a general-purpose text editor

 HTML editors provide shortcuts for producing repetitious tags and may include spell-
checkers, syntax-checkers, and color-coding

 WYSIWYG (What You See Is What You Get) HTML editors allow users to see the formatted
document while writing the HTML code

 Examples of WYSIWYG HTML editors: Microsoft FrontPage and Adobe Dreamweaver

9.3 Plug-ins and Filters


 Plug-ins: Programs that can be integrated with a word processor to add new capabilities,
such as creating HTML documents with WYSIWYG features

 Filters: Converters that transform an existing document into HTML format; they are not part
of the editor or word processor that created the document

 Neither plug-ins nor filters produce HTML documents with identical appearance to the
original word processor document

 Using plug-ins or filters allows easy conversion of existing documents to HTML and the use of
familiar word processors for creating HTML documents

 HTML output produced by converters often needs modification, leading to version problems
during maintenance

9.4 Overview of XML

 XML (eXtensible Markup Language) is a simplified version of SGML (Standard Generalized
Markup Language) for creating custom markup languages

 XML-based markup languages describe data and its meaning through individualized tags and
attributes, while HTML describes overall layout and presentation

 XML allows application programs to process specific kinds of data based on tag meanings
and validate documents before processing
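A small example of an application reading data by tag meaning, using Python's standard xml.etree.ElementTree module. The course markup, its tag names, and the function names are invented for this sketch. The check shown is only for well-formedness; validating against a DTD or schema needs additional tools that are not covered here.

```python
import xml.etree.ElementTree as ET

SAMPLE = (
    "<course code='CS101'>"
    "<title>Web Fundamentals</title>"
    "<credits>4</credits>"
    "</course>"
)


def read_course(xml_text):
    """Pull out data by the meaning of each custom tag and attribute."""
    root = ET.fromstring(xml_text)
    return {
        "code": root.get("code"),
        "title": root.findtext("title"),
        "credits": int(root.findtext("credits")),
    }


def is_well_formed(xml_text):
    """Return True if the text parses as XML (tags properly nested and closed)."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    print(read_course(SAMPLE))
    print(is_well_formed("<a><b></a>"))
```

Unlike HTML, the tags here say nothing about layout; they only label what each piece of data means.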

9.5 Overview of JavaScript

 JavaScript is a client-side scripting language used for validating form data, building
Ajax-enabled HTML documents, and creating dynamic HTML documents

 JavaScript is dynamically typed, unlike statically typed languages such as C++ and Java

 JavaScript code is embedded in HTML documents and interpreted by the browser on the
client-side

 JavaScript defines an object hierarchy for accessing and modifying elements of an HTML
document, enabling dynamic document creation and manipulation

SUMMARY

Internet and Web Fundamentals

 The Internet began as ARPAnet in the late 1960s, which was eventually replaced by NSFnet
for nonmilitary users

 The Internet connects millions of computers worldwide through the TCP/IP protocol, making
them appear the same at the lowest level

 Two kinds of addresses are used on the Internet: IP addresses (four-part numbers for
computers) and fully qualified domain names (words separated by periods for people)

 Fully qualified domain names are translated to IP addresses by name servers running DNS

 A number of information interchange protocols have been created, including telnet, ftp, and
mailto
Web Basics

 The Web started in the late 1980s at CERN as a means for physicists to share results
efficiently with colleagues at other locations

 The fundamental idea of the Web is to transfer hypertext documents among computers
using the HTTP protocol on the Internet

 Browsers request HTML documents from Web servers and display them for users

 URLs are used to address all documents on the Internet, including the specific protocol, fully
qualified domain name, and file path to the specific document on the server

 Web servers find and send requested documents to browsers

 The type of a document delivered by a Web server is given as a MIME specification in the
Content-type field of the response header, ahead of the document itself

Web Programming Languages and Tools

 Web programmers use several languages to create documents that servers can provide to
browsers

 HTML is the standard markup language for describing how Web documents should be
presented by browsers

 Tools like plug-ins and filters can be used without specific knowledge of HTML to create
HTML documents

 XML is a meta-markup language that provides a standard way to define new markup
languages

 JavaScript is a client-side scripting language that can be embedded in an HTML document to
describe simple computations and change elements dynamically

 Flash is a framework for building animation into HTML documents

 Ajax is an approach to building Web applications in which partial document requests are
handled asynchronously

 PHP is a server-side scripting language used primarily for form processing and database
access from browsers

 Servlets and JSP are server-side Java programs used for form processing, database access, or
building dynamic documents

 ASP.NET is a Web development framework using any .NET programming language

 Ruby, with the Rails framework, is used for building Web applications that access databases,
simplifying the development process
