Week 5: E-Business Infrastructure, Internet, and the Web
Lecture 1: Components of E-Business Infrastructure
● Goals for quality information services
○ Performance: Response time can be affected by ISPs, Networks, and third-party
services.
○ Scalability: Handling demand surges by scaling up (larger server) or scaling out
(more servers).
○ Availability and maintainability: Includes identifying single points of failure,
minimum configuration needs, self-repairing capability, diagnostics, and
emergency procedures. Key metrics are MTTF (mean time to failure) and MTTR
(mean time to repair).
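The two metrics above are commonly combined into a single steady-state availability figure, MTTF / (MTTF + MTTR). The sketch below assumes that standard formula; the function name and the sample numbers are illustrative and do not come from the lecture.

```python
def availability(mttf_hours: float, mttr_hours: float) -> float:
    """Fraction of time the service is up: MTTF / (MTTF + MTTR)."""
    return mttf_hours / (mttf_hours + mttr_hours)


# Illustrative numbers only: a server that fails every 1,000 hours on average
# and takes 2 hours to repair.
a = availability(1000.0, 2.0)
downtime_per_year = (1 - a) * 24 * 365
print(f"availability = {a:.4%}, expected downtime = {downtime_per_year:.1f} h/year")
```

Shortening MTTR raises availability just as lengthening MTTF does, which is why diagnostics and emergency procedures appear alongside failure prevention in the list above.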
● Technology Platform for E-business
○ Software Solutions: Web Languages and packaged solutions.
○ Server Platforms
○ Networking Infrastructure: Networking overview, communication protocols, and
network security.
○ Digital Payment Systems
○ Data Infrastructure
● Web System Architecture
○ Web Client
○ Web Server and Application Server
○ Database Server
○ Internet
● Web Server Elements
○ HTTP Server
○ TCP/IP
○ Operating System
○ Hardware: Processor, Disks, Network Interfaces, etc.
● Characteristics of a Web Server
○ Also known as HTTP Server/ HTTP Daemon.
○ Continuously listens for client requests and returns the requested file.
○ Handles more than one request at a time by forking or multithreading.
● Performance metrics for the Web server
○ Throughput: The rate at which the HTTP requests are serviced. Measured in HTTP
operations/second OR megabits per second (Mbps)
○ Latency: The time required to complete a request. Average latency is the average
time for handling requests.
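As a rough illustration of the two metrics, the sketch below computes throughput (in both units mentioned above) and average latency from a list of observed requests. The function name, the tuple layout, and the sample values are assumptions made for this example.

```python
from statistics import mean


def summarize(requests, window_seconds):
    """requests: list of (latency_seconds, response_bytes) seen in the window."""
    ops_per_second = len(requests) / window_seconds
    total_bits = sum(size for _, size in requests) * 8
    megabits_per_second = total_bits / 1_000_000 / window_seconds
    average_latency = mean(latency for latency, _ in requests)
    return ops_per_second, megabits_per_second, average_latency


# Hypothetical sample: four requests logged during a 2-second window.
sample = [(0.10, 250_000), (0.30, 250_000), (0.20, 250_000), (0.20, 250_000)]
ops, mbps, avg_latency = summarize(sample, window_seconds=2.0)
print(ops, mbps, avg_latency)
```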
● Dynamic Load Balancing
○ Splitting the traffic across the servers
○ Mirroring the site
○ Methods
■ DNS Based: Mapping to a cluster of servers in a round-robin fashion during
address translation.
■ Dispatcher based: The address of a special TCP router is published as the
address of the Web server. The router diverts each request to the least-loaded server.
■ Server based: The server redirects the request to another address, which
increases client response time.
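The first two methods can be sketched as two small selection policies: a round-robin rotation (as a DNS server might apply during address translation) and a dispatcher that tracks load. The class names, the server labels, and the use of "active request count" as the load measure are assumptions for illustration.

```python
from itertools import cycle


class RoundRobinBalancer:
    """DNS-style rotation: each lookup returns the next server in the cluster."""

    def __init__(self, servers):
        self._rotation = cycle(servers)

    def pick(self):
        return next(self._rotation)


class LeastLoadDispatcher:
    """Dispatcher-style routing: choose the server with the fewest active requests."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def pick(self):
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def finish(self, server):
        self.active[server] -= 1


rr = RoundRobinBalancer(["s1", "s2", "s3"])   # hypothetical server names
print([rr.pick() for _ in range(4)])
```

Round-robin ignores how busy each server is, whereas the dispatcher needs to see every request in order to track load; that trade-off is the main difference between the two approaches.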
● Application Server
○ Handles all the transactions between the Web server and the backend database
○ Supports different programming languages and/or scripting languages
● Database Server
○ Database management system
○ Structured query language
○ Database connectivity
● Other important components and concepts in E-business infrastructure
○ Mainframe and Legacy systems: Integration Technologies
○ Proxies
■ Network traffic reduction
■ Privacy and security (Firewalls)
■ Load balancing
○ Caches
■ Traffic reduction
■ Levels of Caches
■ Dedicated community proxy servers
○ Third-party Services
■ Security services, Ad servers, Trust services, Escrow services
■ A source of additional delay in the Web servers’ response time
○ Other data resources
■ Data warehouses and data marts
■ Online Analytical Processing Queries (OLAP)
■ Business Intelligence
Lecture 2: Internet and the Web
● Features of the Internet
○ Originated in the 1960s as a result of research supported by the Advanced Research
Projects Agency (ARPA) of the US DoD (ARPANET)
○ A collection of networks
○ Basic Features
■ Data Centric
■ Separation of communication from data processing
■ Packet Switching
● Features of a packet-switched Network
○ Network consists of two types of nodes
■ Hosts: Originators and destinations of data packets
■ Routers: Responsible for routing the packets
○ A connectionless system
■ No fixed routing scheme between the hosts
■ Routing tables change based on network state
○ Congestion or link failure
■ Packets may arrive out of sequence
○ A “Best-effort” delivery network
■ In case of congestion or link failure, the packets are discarded
■ Recognition of failure and the corrective action is the task of the host
computer.
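Because the network only offers best-effort delivery, the receiving host has to put packets back in order and notice losses. A minimal sketch of that host-side step follows; the sequence-number scheme and the function name are assumptions, and this is not a model of any particular protocol.

```python
def reassemble(received, total_packets):
    """Order packets by sequence number and report any that never arrived.

    received: list of (sequence_number, payload) in arrival order.
    Returns (message_or_None, missing_sequence_numbers).
    """
    by_seq = dict(received)  # duplicate packets collapse to a single copy
    missing = [seq for seq in range(total_packets) if seq not in by_seq]
    if missing:
        return None, missing  # the host must arrange for retransmission
    message = "".join(by_seq[seq] for seq in range(total_packets))
    return message, []


# Packets arriving out of sequence are still delivered in order to the application.
print(reassemble([(2, "c"), (0, "a"), (1, "b")], total_packets=3))
```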
● Connecting to the Internet
○ To connect a computer to the internet, it must be connected to a router that is a
part of the Internet
○ Routers are sponsored by universities, research centers, or commercial
companies (ISPs).
○ ISPs operate at many levels
■ Local ISPs: lease connections from the national or regional ISPs; provide
dial-up access to users and charge them
■ National or regional ISPs: have their own backbone to carry traffic; charge
local ISPs
● Domain Name System
○ Maps human-readable domain names to IP addresses (and back)
○ An application on which many other application-level protocols rely
○ Includes a distributed database system responsible for storing domain names
● How DNS works
○ The client enters a domain name (www.domainname.com) into the browser.
○ The browser contacts the Client's ISP for the IP address of the domain name.
○ The ISP first tries to answer by itself using "cached" data.
○ If the answer is found, it is returned. Since the ISP isn't in charge of the domain and is
just acting as a "DNS relay", the answer is marked "non-authoritative".
○ If the answer isn't found, or it's too old, then the ISP DNS contacts the
nameservers for the domain directly for the answer.
○ If the nameservers are not known, the ISP's DNS server looks for the information at the
'root servers' or 'registry servers'.
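The cache-first behaviour described in these steps can be sketched as a toy resolver. Everything here is hypothetical: the class name, the fixed time-to-live, and the `authoritative_lookup` callable that stands in for contacting the domain's nameservers. A real resolver also walks the root and registry servers, which this sketch leaves out.

```python
import time


class CachingResolver:
    """Toy model of an ISP resolver that answers from cache when it can."""

    def __init__(self, authoritative_lookup, ttl_seconds=300.0, clock=time.monotonic):
        self._lookup = authoritative_lookup  # hypothetical callable: name -> IP string
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache = {}                     # name -> (ip, time stored)

    def resolve(self, name):
        """Return (ip, from_cache); cached answers are the non-authoritative ones."""
        entry = self._cache.get(name)
        if entry is not None and self._clock() - entry[1] < self._ttl:
            return entry[0], True
        ip = self._lookup(name)              # missing or too old: ask the nameserver
        self._cache[name] = (ip, self._clock())
        return ip, False


resolver = CachingResolver(lambda name: "192.0.2.10")  # stand-in for a real lookup
print(resolver.resolve("www.domainname.com"))
print(resolver.resolve("www.domainname.com"))
```

The `clock` parameter exists only so that expiry can be exercised without waiting for real time to pass.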
● Getting a domain name
○ ICANN (Internet Corporation for Assigned Names and Numbers) is the private
(non-government) non-profit corporation with responsibility for IP address space
allocation, protocol parameter assignment, domain name system management,
and root server system management functions.
● Uniform Resource Locator
○ Unique address of an Internet resource
○ Protocol://domain-name:port/directory/resource
○ Example: http://www.accd.edu/sac/lrc/john/wwwtest2.htm
○ The port number can be omitted if the protocol's standard port is used.
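To see the parts of a URL concretely, the sketch below splits the example address with Python's standard `urllib.parse.urlsplit` and fills in the standard port when none is written. The `DEFAULT_PORTS` table and the `describe` helper are names introduced for this example.

```python
from urllib.parse import urlsplit

# Well-known default ports for a few protocols.
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21}


def describe(url):
    """Return (protocol, domain name, port, path) for a URL."""
    parts = urlsplit(url)
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(parts.scheme)
    return parts.scheme, parts.hostname, port, parts.path


print(describe("http://www.accd.edu/sac/lrc/john/wwwtest2.htm"))
```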
● HTTP Protocol
○ An application-level protocol
○ A client issues a request to a server, and the server returns a response
■ Request is in ASCII format
■ Response is in MIME (Multipurpose Internet Mail Extensions) format, for
example text (HTML) or image (JPEG/GIF)
○ A stateless protocol
● HTTP request-response model
○ Web client makes a TCP connection to the server (at port 80).
○ Sends HTTP request (header + data)
○ Server returns HTTP response. (Status, header, requested resource)
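The exchange above can be illustrated without touching the network by composing the request text and parsing a canned response. Both helper functions and the sample host, path, and response are assumptions for this sketch; a real client would send the request bytes over a TCP connection to the server.

```python
def build_get_request(host, path):
    """Compose a minimal HTTP/1.1 GET request as ASCII text."""
    lines = [f"GET {path} HTTP/1.1", f"Host: {host}", "Connection: close", "", ""]
    return "\r\n".join(lines).encode("ascii")


def parse_response(raw):
    """Split a raw HTTP response into (status_code, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    status_code = int(status_line.split(" ", 2)[1])
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_code, headers, body


# Hypothetical response, shaped as status line + headers + requested resource.
canned = b"HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html></html>"
print(build_get_request("www.example.com", "/index.html"))
print(parse_response(canned))
```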
● Static Web page generation
○ HTML Tags
○ Browser
● Dynamic Webpage Generation
○ Server-side programming
■ Database Connectivity
■ Passing additional data to the Web server
■ Java: Servlets, JSP
■ Microsoft: ASP
■ PHP, CGI Script
○ Client-side programming
■ JavaScript
● Cookies
○ To cope with the stateless nature of HTTP
○ Tracking a client
○ Supporting applications like shopping cart
○ Privacy issues
○ Servers set cookies by sending a Set-Cookie header in the HTTP response
■ Set-Cookie: Name=Value
○ On later requests to the same server, the client returns the cookie in the
request header
■ Cookie: Name=Value
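The header round trip can be sketched with Python's standard `http.cookies` module. The cookie name `cart_id` and its value are made up for the example, and the "client" here is simulated by string handling rather than a real browser.

```python
from http.cookies import SimpleCookie

# Server side: build the Set-Cookie header for the response.
outgoing = SimpleCookie()
outgoing["cart_id"] = "abc123"            # hypothetical cookie name and value
set_cookie_header = outgoing.output()     # header line for the HTTP response

# Client side: send the same pair back in the Cookie header of the next request.
pairs = "; ".join(f"{name}={morsel.value}" for name, morsel in outgoing.items())
cookie_header = "Cookie: " + pairs

# Server side again: parse the incoming header to recover the client's state.
incoming = SimpleCookie()
incoming.load(cookie_header.split(": ", 1)[1])
cart_id = incoming["cart_id"].value

print(set_cookie_header)
print(cookie_header)
```

This is how a shopping cart can survive across otherwise unrelated requests: the server never remembers the connection, only the identifier the client keeps sending back.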
Lecture 3: Networking Resources
● Key Concepts
○ ISO-OSI reference model
○ TCP/IP protocol stack
○ Computer Network: A set of communicating computing devices
○ Consisting of the following building blocks
■ The framework
■ Standard Organizations
■ ISO-OSI Reference Model
■ Addressing
■ Protocols
■ Protocol suite
■ Applications
■ Hardware
■ Physical Connectivity
● Standard Organizations
○ ISO (International Organization for Standardization)
○ IAB (Internet Architecture Board)
○ IEEE (Institute of Electrical and Electronics Engineers)
● The ISO-OSI Reference Model
○ Originally intended as the benchmark for the international standardization of
computer networking protocols.
○ A divide-and-conquer approach
○ Layers are used to isolate groups of related functions so that development and
flexibility are promoted through the use of well-defined interfaces.
○ Each layer is insulated from the addressing details used by the layer below.
○ Networking protocols and protocol suites can be designed and compared in the
framework of this model.
○ Today TCP/IP is the most important protocol suite
● TCP/IP – A Layered Model
○ Application Layer: Provides a specific application
○ Transport Layer: Provides end-to-end transport service between two hosts
○ Network Layer: Forwards the packets across the network
○ Link Layer: Provides interface or access to the network
● TCP/IP and the OSI Model in context
○ OSI Layers: Physical, Data Link, Network, Transport, Session, Presentation,
Application
○ TCP/IP layers and protocols: Physical; Link: LLC (Logical Link Control) and MAC
(Medium Access Control); Network: IP, ARP; Transport: TCP, UDP; Application: FTP,
HTTP, Telnet, SMTP
● Processing at Each Layer
○ Encapsulation
○ Headers: Link, IP, TCP, Application Headers added at corresponding layers
○ Data Units: Frame (Link Layer), Datagram (IP Layer), Segment/Stream (TCP Layer)
● Transfer of Packet
○ Packet moves through layers at each host (A and B).
○ Encapsulation and decapsulation at each layer.
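Encapsulation and decapsulation can be pictured as wrapping and unwrapping nested envelopes. The toy model below uses dictionaries with placeholder header strings; real headers carry addresses, ports, checksums, and so on, none of which are modelled here.

```python
# Order in which headers are added on the sending host (top of the stack first).
LAYERS = ["application", "tcp", "ip", "link"]


def encapsulate(data):
    """Wrap the data in one header per layer; the outermost unit is the frame."""
    unit = data
    for layer in LAYERS:
        unit = {"header": f"{layer}-header", "payload": unit}
    return unit


def decapsulate(frame):
    """Strip the headers in reverse order on the receiving host."""
    unit = frame
    for layer in reversed(LAYERS):
        if unit["header"] != f"{layer}-header":
            raise ValueError(f"expected {layer} header")
        unit = unit["payload"]
    return unit


frame = encapsulate("GET /index.html")
print(frame["header"], "->", decapsulate(frame))
```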
● Link Layer
○ Provides access to the network
○ Addresses physical characteristics
○ Handles many access control protocols for each physical network standard
○ Functions
■ Encapsulation of IP datagrams into frames
■ Mapping of IP addresses to the physical addresses used by the network
● Network Layer
○ Internet Protocol
■ Defining datagram
■ Defining Internet addressing scheme
■ Moving data between Network layer and Transport layer
■ Routing datagrams
■ Performing fragmentation and reassembly of datagrams
● IP Addresses
○ IPv4 – 32-bit address
○ IPv6 – 128-bit addresses
● Representation of IP Addresses
○ Dotted-decimal format
■ Ex. 128.0.0.1
■ Binary equivalent of the above is
10000000.00000000.00000000.00000001
○ Consists of two parts
■ Network number
■ Host number (within the network)
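The conversion and the two-part split can be checked with Python's standard `ipaddress` module. Where the boundary between network number and host number falls depends on the prefix length (or address class), which the notes do not state; the 16-bit prefix used below is purely an assumption for illustration.

```python
import ipaddress


def to_binary(dotted):
    """Render an IPv4 address as four 8-bit groups."""
    address = ipaddress.IPv4Address(dotted)
    return ".".join(f"{octet:08b}" for octet in address.packed)


def split_address(dotted, prefix_length):
    """Return (network number, host number) for the given prefix length."""
    interface = ipaddress.IPv4Interface(f"{dotted}/{prefix_length}")
    host_number = int(interface.ip) & int(interface.network.hostmask)
    return str(interface.network.network_address), host_number


print(to_binary("128.0.0.1"))
print(split_address("128.0.0.1", 16))   # assumed 16-bit network part
```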
● Transport Layer
○ TCP and UDP
○ TCP (Transmission control protocol)
■ Connection oriented
■ Handshaking
■ Source port, destination port, sequence number, and acknowledgment.
■ Sliding window mechanism
○ UDP (User datagram protocol)
■ Connectionless
■ No handshaking
■ Source port and destination port
■ No acknowledgment
■ No retransmission
● TCP and UDP Header Formats
○ [Diagrams of TCP and UDP headers]
● Application Layer
○ Includes all the processes that use the transport layer protocol to deliver data.
○ Example: HTTP, FTP, Telnet, SMTP
● Protocol Port and Socket
○ Data multiplexing and demultiplexing
■ Combining data from many sources for delivering to the network
■ Dividing the arriving data for delivery to multiple destinations
○ Protocol number: to identify transport protocol
○ Port number: To identify application
■ May be dynamically allocated by the system
○ Socket: The combination of IP address and Port number
■ Uniquely identifies a network process within the entire Internet
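Demultiplexing amounts to a two-step lookup: the protocol number selects the transport protocol, and the port number selects the application. The sketch below models that as a table; the listener table and function names are hypothetical, while 6 and 17 are the assigned protocol numbers for TCP and UDP.

```python
TCP, UDP = 6, 17  # IP protocol numbers for TCP and UDP

# Hypothetical table of listening applications, keyed by (protocol number, port).
listeners = {
    (TCP, 80): "web server",
    (TCP, 25): "mail server",
    (UDP, 53): "name server",
}


def demultiplex(protocol_number, destination_port):
    """Pick the application that should receive an arriving segment or datagram."""
    return listeners.get((protocol_number, destination_port), "no listener")


def socket_address(ip_address, port):
    """A socket in the sense used above: IP address and port number together."""
    return (ip_address, port)


print(demultiplex(TCP, 80), socket_address("128.0.0.1", 80))
```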
Lecture 4: Hardware and Software Resources
● Networking hardware in Context
○ [Diagram relating hardware to OSI Model Layers]
● Transceivers (Media Attachment Units)
○ Provide the means for encoding data into purely electrical or light signals ready
for transmission onto the physical media.
○ Also responsible for converting the signal back into the data at the receiving
station.
○ Ex. Network Adapter Card
● Repeaters
○ Used to extend the LAN
○ Regeneration of the Frames
○ Must be compliant with maximum acceptable delay in the network (bit-budget
delay)
○ Mostly dumb
○ Some are semi-intelligent
■ Memory
■ Inhibit regeneration of error frames and collision frames
○ Ex: 10/100BASE-T (Ethernet)
● Bridges
○ Offer filtering and forwarding capability based on Layer 2 fields, independent
of Layer 3 protocols, to increase backbone efficiency.
○ Traffic management capability at Link level
■ Associating node MAC addresses with particular interfaces and forwarding
frames accordingly
○ Responsible for preserving network topology integrity by stopping the formation
of loops
■ Using protocols such as spanning tree or its variants
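The filtering and forwarding behaviour can be sketched as a learning table that maps MAC addresses to ports. This is a simplified model with made-up port numbers and shortened MAC addresses; it does not implement spanning tree or table ageing.

```python
class LearningBridge:
    """Toy Layer 2 bridge: learns source addresses, then filters or forwards."""

    def __init__(self, ports):
        self.ports = list(ports)
        self.table = {}  # MAC address -> port on which it was last seen

    def handle(self, in_port, src_mac, dst_mac):
        """Return the list of ports the frame is sent out on."""
        self.table[src_mac] = in_port              # learn where the sender is
        out_port = self.table.get(dst_mac)
        if out_port is None:
            return [p for p in self.ports if p != in_port]  # unknown: flood
        if out_port == in_port:
            return []                              # same segment: filter
        return [out_port]                          # known: forward


bridge = LearningBridge([1, 2, 3])
print(bridge.handle(1, "aa", "bb"))   # destination not yet learned
print(bridge.handle(2, "bb", "aa"))   # destination learned from the first frame
```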
● Switches
○ Used when there is a need for higher bandwidth in a shared-access LAN
○ High-speed bridges
○ Replacing the old bridges and repeaters
● Routers
○ A special-purpose layer 3 device used instead of a host.
○ Forwards network traffic based on IP addresses rather than the MAC address
○ Communicate with one another, learning neighbours
● Gateways
○ Generic term for any network device with protocol translation capability.
○ Transport Relay devices.
○ Older literature may refer to routers as gateways.
● Computer Hardware Platforms
○ Client machines: Desktop PCs, mobile devices (PDAs, laptops).
○ Servers: Blade servers (ultrathin, in racks), Mainframes (IBM, equivalent to
thousands of blade servers).
○ Top chip producers: AMD, Intel, IBM.
○ Top firms: IBM, HP, Dell, Sun Microsystems.
● Server Definition
○ A computer or software that provides services to other computers.
○ Examples: Application server, Communications server, Database server, Fax
server, File server, Game server, Standalone server, Web server.
● Factors Influencing Server Selection
○ Applications Support, Cost, Ease of Administration, Familiarity, Homogeneity,
Interoperability, Reliability (MTBF), Scalability, Security, Vendor Support.
○ Most to least important (according to an Advisory Council study).
● Execution Time Definition
○ Response Time: Elapsed (wall-clock) time to complete a task, also called
execution time or latency; includes disk access, memory access, I/O, OS overhead.
○ CPU Time: Time CPU is computing (user + system CPU time), excluding I/O wait.
○ System performance: Elapsed time on unloaded system.
○ CPU performance: User CPU time on unloaded system.
● Performance Considerations
○ Response time depends on CPU and I/O time.
○ Neglecting I/O can lead to diminishing returns when improving CPU speed.
○ I/O performance can limit CPU performance.
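A small calculation makes the diminishing returns visible. The sketch assumes response time is simply CPU time plus I/O time with no overlap, and the 8 s / 2 s split is an invented example rather than a measured workload.

```python
def response_time(cpu_seconds, io_seconds, cpu_speedup=1.0):
    """Response time when only the CPU portion is accelerated."""
    return cpu_seconds / cpu_speedup + io_seconds


baseline = response_time(8.0, 2.0)
for speedup in (2, 4, 8, 16):
    improved = response_time(8.0, 2.0, cpu_speedup=speedup)
    print(f"CPU x{speedup:>2}: {improved:.2f} s, overall speedup {baseline / improved:.2f}x")
```

However fast the CPU becomes, the overall speedup in this model can never exceed baseline / I/O time, which is the sense in which I/O performance limits CPU performance.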
● Purchasing Decisions
○ Cost is often a constant (system/commercial requirements).
○ Speed and storage capacity are adjusted to meet the cost target.
● Memory Hierarchy Concept
○ Smaller is faster due to signal propagation delays in larger memories.
○ Faster memories are smaller and more expensive per byte.
○ Principle of locality.
● Levels in a Typical Memory Hierarchy
○ Registers, Cache, Memory, I/O Devices (ordered from fastest to slowest).
● Cache Memory
○ A small, fast memory near the CPU holding recently accessed code/data.
○ Cache hit/miss.
○ Temporal and spatial locality.
○ Cache miss time depends on memory latency and bandwidth.
○ Cache misses cause CPU stalls.
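The effect of locality on hits and misses can be seen in a short simulation. The notes do not name a replacement policy; least-recently-used (LRU) eviction is assumed here, and the address traces are invented.

```python
from collections import OrderedDict


def simulate_cache(addresses, capacity):
    """Count hits and misses for an LRU cache holding `capacity` addresses."""
    cache = OrderedDict()
    hits = misses = 0
    for address in addresses:
        if address in cache:
            hits += 1
            cache.move_to_end(address)        # mark as most recently used
        else:
            misses += 1
            cache[address] = True
            if len(cache) > capacity:
                cache.popitem(last=False)     # evict the least recently used
    return hits, misses


# Reusing two addresses fits the cache; cycling through three does not.
print(simulate_cache([1, 2, 1, 2, 1, 2], capacity=2))
print(simulate_cache([1, 2, 3, 1, 2, 3], capacity=2))
```

The first trace has strong temporal locality and mostly hits; the second has a working set one entry larger than the cache and misses on every access.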
● Main Memory:
○ Programs reside in main memory.
○ Virtual memory, pages, page fault.
○ CPU switches tasks during disk access for page faults.
● Types of Storage Devices:
○ Magnetic storage
○ Semiconductor storage
○ Optical disc storage
● Magnetic Storage:
○ Non-volatile, stores information using magnetic patterns.
○ Accessed via read/write heads.
○ Sequential access (seek, cycle).
○ Examples: Floppy disk, Hard disk, Magnetic tape.
● Semiconductor Memory:
○ Uses integrated circuits to store information (transistors, capacitors).
○ Volatile and non-volatile forms.
○ Primary storage: Dynamic volatile semiconductor memory (DRAM).
○ Flash memory: Non-volatile, for off-line storage.
○ Non-volatile semiconductor memory: Secondary storage in devices/computers.
● Optical Disc Storage:
○ Stores information in pits on a disc surface, read by laser reflection.
○ Non-volatile, sequential access.
○ Examples: CD/CD-ROM/DVD (read-only), CD-R/DVD-R/DVD+R (write-once),
CD-RW/DVD-RW/DVD+RW/DVD-RAM (rewritable).
● Network Storage:
○ Accessing information over a computer network.
○ Centralizes information management and reduces duplication.
○ Examples: Direct-Attached Storage (DAS), Network-Attached Storage (NAS), Storage
Area Networks (SAN).
● Direct-Attached Storage (DAS):
○ Storage devices are part of the host computer or directly connected to a single
server.
○ Network workstations access storage via the server.
○ Contrasts with NAS/SAN, which connect over a network.
○ Large installed base.
● Network-Attached Storage (NAS):
○ Computing-storage devices accessed over a network (TCP/IP).
○ Enables multiple computers to share storage, centrally managing hard disks.
○ Often uses RAID arrays.
○ Access via NFS, CIFS, or HTTP, allowing sharing between different OS.
● Storage Area Networks (SAN):
○ Similar to NAS but uses a block-based protocol over a specialized storage
network.
○ Server-class devices with SCSI Fibre Channel connect to SAN.
○ File sharing is OS-dependent.
○ Fibre Channel has distance limits (around 10 km).
○ Faster data transfer than NAS.
● Comparison of NAS and SAN:
○ NAS: Scalable, long-distance data transfer, slower, congestion-prone, inefficient
backup/recovery.
○ SAN: Efficient data integrity/backup, faster, no congestion, less scalable, limited
distance.
● Software Resources:
○ The Operating System:
■ Manages system resources (memory, processors, devices, information).
■ Keeps track of resources, allocates, and reclaims them.
○ Functions of Operating Systems.
Lecture 5: Data Resources
● Types of data resources.
● Logical data elements in information systems.
● Entities and relationships.
● Entity Relationship Diagram.
● Relational database structure.
● Logical User Views vs. Physical Data Views in a database management system.
● Structured Query Language.
● Major types of databases.
● Components of a data warehouse system.
● Data mining process.
● Multi-dimensional view of data.
● Online analytical processing (OLAP):
○ Consolidation: Aggregation of data (roll-ups, complex groupings).
○ Drill-down: Displaying detailed data comprising consolidated data.
○ Slicing and Dicing: Looking at the database from different viewpoints (e.g., sales
by product type or sales chan