14-Feb-12
Internet & the Web
The World Wide Web
The Web and the Internet are not the same thing
Web servers: computers programmed to send files to browsers running on other computers connected to the Internet
Web servers and their files make up the World Wide Web
The Web is made from a subset of all the computers on the Internet
The Internet is the road; the Web is just one form of traffic on the road
Requesting a Web Page
A Web request creates a client/server interaction
A Uniform Resource Locator (URL) has three main parts
http://www.widgets.com/hardware/support/faq.html
1. Protocol:
http:// indicates Hypertext Transfer Protocol (HTTP)
Tells the computers how to handle the file
2. Server computer's name:
www.widgets.com names the server; the server's IP address is found through the domain hierarchy
3. Page's pathname:
Tells the server which file (page) is requested and where to find it.
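The three parts of the example URL can be pulled apart with Python's standard urllib.parse module. This is a minimal sketch; the helper name url_parts is an assumption made for illustration.

```python
from urllib.parse import urlsplit

def url_parts(url):
    """Return the three main parts of a URL: protocol, server name, pathname."""
    parts = urlsplit(url)
    return parts.scheme, parts.hostname, parts.path

protocol, server, pathname = url_parts(
    "http://www.widgets.com/hardware/support/faq.html"
)
print(protocol)   # http
print(server)     # www.widgets.com
print(pathname)   # /hardware/support/faq.html
```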
Describing a Web Page
Pages are stored as descriptions of how they should appear on screen (called page markup)
A web browser creates the viewable image from the description file (the source)
Working from the source, the browser can adapt the page image more easily (scale it to your screen, scroll it, etc.)
You can see the page description by selecting "view source" in the browser
Hypertext
Hypertext Markup Language (HTML)
Markup languages describe the layout, formatting, and look applied to a document's abstract structure
Margin width, indentations
Font, text style, size, color
Image placement, etc.
Hyperlinks allow jumping from point to point in documents (non-linear); links show as highlighted words and images
HTML realization of hypertext, and the Web, from Tim Berners-Lee (1990s)
The term hypertext is from Ted Nelson (1960s)
The concept comes from Vannevar Bush (1940s)
Web Pages and File Structure
Web sites are organized collections of HTML files
URL points into this organization to select a file
A directory, or folder, is a named collection of files, other directories, or both
Directory hierarchy: directories can contain other directories, which can contain other directories, etc.
Down, or lower in the hierarchy, means moving into subdirectories
Up, or higher in the hierarchy, means moving into enclosing (parent) directories
File Structure (cont'd)
Part of the directory hierarchy is shown in the pathnames of URLs.
http://www.nasm.si.edu/exhibitions/ga1100/pioneer.html
Page is given by pathname:
/exhibitions/ga1100/pioneer.html
Each time we pass a slash (/), we move into a subdirectory or into the file (lower in the hierarchy)
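The walk down the hierarchy can be made explicit by splitting the pathname at each slash. The sketch below uses the standard pathlib module; the function name descend is an illustrative assumption.

```python
from pathlib import PurePosixPath

def descend(pathname):
    """Return the directories passed through, then the file, for a URL pathname."""
    parts = PurePosixPath(pathname).parts      # the leading "/" is the top of the hierarchy
    names = [p for p in parts if p != "/"]
    return names[:-1], names[-1]

directories, filename = descend("/exhibitions/ga1100/pioneer.html")
print(directories)  # ['exhibitions', 'ga1100']
print(filename)     # pioneer.html
```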
Organizing the Directory
When a URL ends in a slash, the Web server looks for a default file, usually called index.html, in that directory
http://www.widget.com/ and http://www.widget.com/index.html normally return the same page
If the server does not find an index.html file, it may return a directory listing (index) of the files there, if it is configured to do so
Why are hierarchies important?
People use them to organize their thinking and work
Directories are free; there is no reason not to use them
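The default-file rule for URLs ending in a slash can be sketched as a small function. This is an illustration only: the set of files and the function name resolve are hypothetical, and real servers make the default file name and the listing behavior configurable.

```python
def resolve(path, files, default="index.html"):
    """Mimic how a server might choose what to return for a requested pathname.

    files is the set of pathnames that exist on the (hypothetical) server.
    """
    if not path.endswith("/"):
        return path if path in files else None
    candidate = path + default
    if candidate in files:
        return candidate
    # No default file: fall back to a listing of everything under the directory.
    return sorted(f for f in files if f.startswith(path))

site = {"/index.html", "/docs/a.html", "/docs/b.html"}
print(resolve("/", site))        # /index.html
print(resolve("/docs/", site))   # ['/docs/a.html', '/docs/b.html']
```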
Communication Types
General Communication
Synchronous: sender and receiver are active at the same time
e.g., telephone call, instant messaging (IM)
Asynchronous: sending and receiving occur at different times
e.g., e-mail
Broadcast communication: single sender and many receivers (multicast is the variant aimed at a selected group of receivers)
Point-to-point communication: single sender and single receiver
Universal Communication Medium
The Internet provides a general communication "fabric" linking all computers connected to it
Can be applied in many ways:
Point-to-point asynchronous
E-mail is alternative to standard mail
Point-to-point synchronous
IM is alternative to telephone
Multicasting
Chat rooms are alternatives to magazines
Broadcasting
Web pages are alternatives to radio and television
Client/Server Interaction
Server is the computer that stores information
Web server, file server, mail server
Client is the computer that wants the information
When you click a Web link, your computer (the client) enters into a client/server relationship with a web server
Once the page is sent to you, the client/server relationship ends
Client/Server Interaction (cont'd)
These relationships are brief, so a server can serve many clients at the same time
Ask, receive, done
One server can provide information to many clients
Yahoo, Google, eBay: a web site can be used by many different people at once, and they all get service
One client computer can ask for services from many servers
A web page may have many links, each to a different web server
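The "ask, receive, done" pattern can be tried on a single machine with Python's standard http.server and http.client modules. This is a minimal sketch, assuming a toy server on the local address 127.0.0.1; the names PageHandler and fetch_pages are made up for the example. Each request opens a connection, gets its reply, and closes, which is why one server can keep up with many clients.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class PageHandler(BaseHTTPRequestHandler):
    """Toy web server: replies to every GET with a line naming the requested path."""

    def do_GET(self):
        body = f"You asked for {self.path}".encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet

def fetch_pages(paths):
    """Start the toy server, make one brief client/server exchange per path."""
    server = HTTPServer(("127.0.0.1", 0), PageHandler)  # port 0: pick any free port
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        replies = []
        for path in paths:
            conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
            conn.request("GET", path)                          # ask
            replies.append(conn.getresponse().read().decode())  # receive
            conn.close()                                        # done
        return replies
    finally:
        server.shutdown()
        server.server_close()

print(fetch_pages(["/faq.html", "/support/"]))
```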
Name Game: Computer Addresses
IP addresses: Each computer on the Internet (a host) is given a unique numerical address; in IPv4 it has four parts, each a number from 0 to 255
For example: 128.208.2.44
Hostnames: Human-readable symbolic names, based on a domain hierarchy
Easier to read and remember
For example: spiff.cs.washington.edu
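The four-part form is a readable way of writing one 32-bit number. As a short sketch, the standard ipaddress module can check the example address and show both views; the helper name four_parts is an assumption for illustration.

```python
import ipaddress

def four_parts(text):
    """Validate an IPv4 address and return its four numeric parts."""
    addr = ipaddress.IPv4Address(text)   # raises ValueError if the text is not valid
    return [int(p) for p in str(addr).split(".")]

print(four_parts("128.208.2.44"))                  # [128, 208, 2, 44]
print(int(ipaddress.IPv4Address("128.208.2.44")))  # the same address as a single integer
```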
Domains and Domain Hierarchy
A domain is a related group of networked computers
Domain names are organized hierarchically
Top-level domains appear in the last part of the domain name:
.edu educational institutions
.org organizations
.net networks
.mil military
.gov government agencies
Mnemonic two-letter country designators such as .ca (Canada)
Taking Apart a Hostname
Consider the name spiff.cs.washington.edu
Reading from the left, the individual computer (host) is named spiff
It is part of the cs domain, which is a collection of Internet hosts belonging to the Computer Science department
The cs domain is within the washington domain, which comprises all departmental domains at the University of Washington
The washington domain is within the .edu educational domain, along with domains for other universities
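The same reading, from the host outward to the top-level domain, can be sketched by splitting the name at its dots. The function name domain_chain is hypothetical.

```python
def domain_chain(hostname):
    """Split a hostname into the host and its enclosing domains, innermost first."""
    labels = hostname.split(".")
    host, domains = labels[0], labels[1:]
    # Each entry is one enclosing domain, from most specific to top-level.
    chain = [".".join(domains[i:]) for i in range(len(domains))]
    return host, chain

host, chain = domain_chain("spiff.cs.washington.edu")
print(host)   # spiff
print(chain)  # ['cs.washington.edu', 'washington.edu', 'edu']
```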
DNS Servers
The Domain Name System (DNS) translates human-readable hostnames into IP addresses
Each Internet host knows the IP address of its nearest DNS server, a computer that keeps a list of host/domain names and corresponding IP addresses
When you use a hostname to send information, your computer asks the DNS server to look up the IP address (this is a client/server relationship)
If the closest DNS server doesn't know the IP address, it queries a hierarchy of special DNS servers, starting at the root name servers, until it reaches the authoritative name server for the domain, which holds the definitive name translation
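The "ask the nearest server first, then go up the hierarchy" idea can be sketched without touching the network. Everything here is a simplified assumption: the class LocalDNSServer is hypothetical, the hierarchy is collapsed into one dictionary, and the pairing of the example hostname with the example address is for illustration only.

```python
# Hypothetical authoritative records; real DNS spreads these over many servers.
AUTHORITATIVE = {"spiff.cs.washington.edu": "128.208.2.44"}

class LocalDNSServer:
    """Nearest DNS server: answers from its own list, asks upstream when it must."""

    def __init__(self, upstream):
        self.cache = {}
        self.upstream = upstream
        self.upstream_queries = 0

    def lookup(self, hostname):
        if hostname not in self.cache:
            self.upstream_queries += 1
            self.cache[hostname] = self.upstream[hostname]  # KeyError if unknown
        return self.cache[hostname]

dns = LocalDNSServer(AUTHORITATIVE)
print(dns.lookup("spiff.cs.washington.edu"))  # first lookup goes upstream
print(dns.lookup("spiff.cs.washington.edu"))  # second lookup is answered locally
print(dns.upstream_queries)                   # 1
```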
Sending Information Over the Net
We now know how to specify (address) a particular computer on the Internet. How do we send information from one computer to another?
Following Protocol
A protocol describes the specific technical steps involved in how information is actually transmitted
TCP/IP (Transmission Control Protocol/Internet Protocol)
Information is broken into a sequence of small units of limited size called IP packets
Each packet has space for a chunk of data (e.g., a piece of a long document), the IP addresses of the source and destination computers, and a sequence number
The packets are sent over the Internet one at a time using whatever route is available
Each packet can take a different route, so congestion or a service interruption on one route need not stop the transmission
The receiver uses the sequence numbers to put the packets back in order
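The packet idea can be sketched in a few lines: cut a message into numbered chunks, let them arrive in any order, and rebuild the message from the sequence numbers. This is a toy model, not the real TCP/IP packet format; the function names, the 8-character chunk size, and the destination address are assumptions.

```python
import random

def to_packets(message, src, dst, size=8):
    """Cut a message into chunks, each labeled with addresses and a sequence number."""
    return [
        {"seq": n, "src": src, "dst": dst, "data": message[i:i + size]}
        for n, i in enumerate(range(0, len(message), size))
    ]

def reassemble(packets):
    """Rebuild the message by sorting packets on their sequence numbers."""
    return "".join(p["data"] for p in sorted(packets, key=lambda p: p["seq"]))

message = "It was the best of times, it was the worst of times"
packets = to_packets(message, "128.208.2.44", "192.0.2.7")
random.shuffle(packets)      # different routes: packets may arrive out of order
print(reassemble(packets))   # the original message, whatever the arrival order
```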
Connecting a Computer to the Internet
Three Common Ways
Via Internet Service Provider (ISP)
An ISP sells connections to the Internet (Comcast, Earthlink, and many others)
The user plugs a computer into the telephone system or a dedicated line to the ISP (DSL, cable)
The user's computer talks to the ISP's computer
The ISP's computer is a constantly connected host on the Internet, and relays information for its customers
Connecting a Computer to the Internet (cont'd)
Via Enterprise Network Connections (LAN)
Used by large networked organizations such as schools, businesses, or governmental units
The organization creates a LAN, or intranet
The intranet connects to the Internet by a gateway
Information from a Web computer is sent across the Internet, through the gateway, then across the LAN to the user's computer
Connecting a Computer to the Internet (cont'd)
Via wireless (variation on a LAN)
A specialized computer (access point, hub, or router) is physically connected to the Internet (wired)
Mobile computers use radio signals to connect wirelessly with the router and initiate network transmissions through it
The router assigns temporary IP addresses via DHCP (Dynamic Host Configuration Protocol)
The wireless mobile computers and the router use an Ethernet-like protocol, acting as a LAN
The router then uses Internet protocols to reach the broader physical network and relays transmissions from the mobile computers
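The idea of temporary addresses can be sketched as a pool that lends addresses out and takes them back. This is not the DHCP protocol itself, only a model of the bookkeeping; the class AddressPool, the device names, and the private network 192.168.0.0/29 are assumptions.

```python
import ipaddress

class AddressPool:
    """Toy model of a router lending temporary IP addresses to devices."""

    def __init__(self, network):
        self.free = list(ipaddress.ip_network(network).hosts())
        self.leases = {}

    def request(self, device):
        """Give the device an address, reusing its current lease if it has one."""
        if device not in self.leases:
            self.leases[device] = self.free.pop(0)  # IndexError if the pool is empty
        return self.leases[device]

    def release(self, device):
        """Return the device's address to the pool for someone else to use."""
        self.free.append(self.leases.pop(device))

pool = AddressPool("192.168.0.0/29")
print(pool.request("laptop"))  # 192.168.0.1
print(pool.request("phone"))   # 192.168.0.2
pool.release("laptop")         # the laptop leaves; its address becomes available again
```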
Search Tools
Search engines
Metasearch engines
Specialized search engines
Tips
Start with the right approach
Be as precise as possible
Use multiple words
Use Boolean operators
Check your spelling
Keep moving
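The tip about Boolean operators can be illustrated with a tiny keyword search over a handful of made-up pages. This is a sketch of the AND / OR / NOT idea only, not how any real search engine works; the function search and the sample pages are hypothetical.

```python
def search(pages, all_of=(), any_of=(), none_of=()):
    """Return titles of pages matching: every all_of word (AND),
    at least one any_of word (OR), and no none_of word (NOT)."""
    hits = []
    for title, text in pages.items():
        words = set(text.lower().split())
        if (all(w in words for w in all_of)
                and (not any_of or any(w in words for w in any_of))
                and not any(w in words for w in none_of)):
            hits.append(title)
    return sorted(hits)

pages = {
    "Page A": "jaguar car top speed",
    "Page B": "jaguar cat habitat",
    "Page C": "car repair guide",
}
print(search(pages, all_of=["jaguar"], none_of=["car"]))  # ['Page B']
print(search(pages, any_of=["cat", "repair"]))            # ['Page B', 'Page C']
```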
Search Engines
Specialized programs to assist in locating information
Types of searches
Keyword search
Directory search
Metasearch Engines
Content Evaluation
Not all information on the Web is accurate
Ways to evaluate the accuracy of Web information include:
Authority
Accuracy
Objectivity
Currency
Plug-Ins
A program that is automatically loaded and operates as part of the browser.
Filters
A program that blocks access to selected websites.
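At its simplest, a filter checks the server name in each requested URL against a block list. The sketch below assumes a made-up list and the hypothetical function is_blocked; real filtering products use much larger lists and other signals.

```python
from urllib.parse import urlsplit

BLOCKED = {"blocked.example"}  # hypothetical block list

def is_blocked(url, blocked=BLOCKED):
    """True if the URL's host is a blocked domain or lies inside one."""
    host = (urlsplit(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in blocked)

print(is_blocked("http://www.blocked.example/page.html"))  # True
print(is_blocked("http://www.widgets.com/"))               # False
```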
References
Lawrence Snyder, Fluency with Information Technology: Skills, Concepts & Capabilities, fourth edition, Pearson Education.
O'Leary & O'Leary, Computing Essentials 2011, Complete International Edition, McGraw-Hill.