Unit 4 - Network Layer

Chapter 4 focuses on the network layer, detailing its services, including forwarding, routing, and the functions of routers. It discusses the differences between virtual circuit and datagram networks, as well as the implementation of routing algorithms and forwarding tables. The chapter also outlines the network service model, including guaranteed delivery, in-order packet delivery, and security services.

Uploaded by

zeelsoni

Chapter 4

Network Layer

A note on the use of these ppt slides:
We’re making these slides freely available to all (faculty, students, readers). They’re in PowerPoint form so you see the animations; and can add, modify, and delete slides (including this one) and slide content to suit your needs. They obviously represent a lot of work on our part. In return for use, we only ask the following:
 If you use these slides (e.g., in a class) that you mention their source (after all, we’d like people to use our book!)
 If you post any slides on a www site, that you note that they are adapted from (or perhaps identical to) our slides, and note our copyright of this material.
Thanks and enjoy! JFK/KWR

Computer Networking: A Top Down Approach
6th edition
Jim Kurose, Keith Ross
Addison-Wesley
March 2012

All material copyright 1996-2013
J.F Kurose and K.W. Ross, All Rights Reserved
Network Layer 4-1
Chapter 4: network layer
chapter goals:
 understand principles behind network
layer services:
 network layer service models
 forwarding versus routing
 how a router works
 routing (path selection)
 broadcast, multicast

Chapter 4: outline
4.1 introduction
4.2 virtual circuit and datagram networks
4.3 what’s inside a router
4.4 IP: Internet Protocol
   datagram format
   IPv4 addressing
   ICMP
   IPv6
4.5 routing algorithms
   link state
   distance vector
   hierarchical routing
4.6 routing in the Internet
   RIP
   OSPF
   BGP
4.7 broadcast and multicast routing


Network layer
 transport segment from sending to receiving host
 on sending side encapsulates segments into datagrams
 on receiving side, delivers segments to transport layer
 network layer protocols in every host, router
 router examines header fields in all IP datagrams passing through it
[figure: end hosts run application, transport, network, data link, physical layers; each router along the path runs network, data link, physical layers]
Two key network-layer functions
 forwarding: move packets from router’s input link to appropriate output link
 routing: determine route taken by packets from source to dest.
   routing algorithms
analogy:
 routing: process of planning trip from source to dest
 forwarding: process of getting through single interchange
Interplay between routing and forwarding
 routing algorithm determines end-end-path through network
 forwarding table determines local forwarding at this router

local forwarding table
header value   output link
0100           3
0101           2
0111           2
1001           1

[figure: packet with value 0111 in its header arrives at a router with output links 1, 2, 3]


Network Service Model
 When the transport layer in the sending host passes a packet to the network layer, services that could be provided by the network layer for that packet:
   Guaranteed delivery
     This service guarantees that the packet will eventually arrive at its destination.
   Guaranteed delivery with bounded delay
     This service not only guarantees delivery of the packet, but delivery within a specified host-to-host delay bound.
Network Service Model – Cont…
 Services that could be provided by the network layer for a flow of packets between source and destination:
   In-order packet delivery
     This service guarantees that packets arrive at the destination in the order that they were sent.
   Guaranteed minimal bandwidth
     This network-layer service emulates the behaviour of a transmission link of a specified bit rate (for example, 1 Mbps) between sending and receiving hosts.
     As long as the sending host transmits bits at a rate below the specified bit rate, no packet is lost.
   Guaranteed maximum jitter
     This service guarantees that the amount of time between the transmission of two successive packets at the sender is equal to the amount of time between their receipt at the receiver (or that this spacing changes by no more than some specified value).
Network Service Model – Cont…
 Security services
   Using a secret session key known only by a source and destination host, the network layer in the source host could encrypt the payloads of all datagrams being sent to the destination host.
   The network layer in the destination host would then be responsible for decrypting the payloads.


Connection, connection-less service
 datagram network provides network-layer connectionless service
 virtual-circuit network provides network-layer connection service
 analogous to TCP/UDP connection-oriented / connectionless transport-layer services, but:
   service: host-to-host
   no choice: network provides one or the other
   implementation: in network core
Virtual circuits
“source-to-dest path behaves much like
telephone circuit”
 performance-wise
 network actions along source-to-dest path
 call setup, teardown for each call before data
can flow
 each packet carries VC identifier (not
destination host address)



VC implementation
a VC consists of:
1. path from source to destination
2. VC numbers, one number for each link along
path
3. entries in forwarding tables in routers along
path
 packet belonging to VC carries VC
number (rather than dest address)
 VC number can be changed on each link.
 new VC number comes from forwarding table

VC forwarding table
[figure: routers R1, R2, R3, R4; interfaces numbered 1, 2, 3 at R1; VC numbers 12, 22, 32 on successive links of the path]
forwarding table in R1 router:
Incoming interface   Incoming VC #   Outgoing interface   Outgoing VC #
1                    12              3                    22
2                    63              1                    18
3                    7               2                    17
1                    97              3                    87
…                    …               …                    …

VC routers maintain connection state information!
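The table lookup above can be sketched in a few lines. This is only an illustration, assuming the four rows shown for R1; the names `VC_TABLE` and `forward_vc` are hypothetical and not part of the slides.

```python
# Hypothetical sketch of the R1 table shown above:
# (incoming interface, incoming VC #) -> (outgoing interface, outgoing VC #)
VC_TABLE = {
    (1, 12): (3, 22),
    (2, 63): (1, 18),
    (3, 7): (2, 17),
    (1, 97): (3, 87),
}


def forward_vc(in_interface, in_vc):
    """Return (outgoing interface, new VC number) for an arriving packet.

    Raises KeyError when no VC has been set up for this pair, which
    mirrors the fact that a VC router only forwards on established circuits.
    """
    return VC_TABLE[(in_interface, in_vc)]


out_interface, out_vc = forward_vc(1, 12)
print(out_interface, out_vc)
```

The packet leaves carrying the outgoing VC number, which is how the VC number can change on each link.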
Virtual circuits: signaling protocols
 used to set up, maintain, and tear down VC
 used in ATM, frame-relay, X.25
 not used in today’s Internet
[figure: two hosts, each with application, transport, network, data link, physical layers, exchanging signaling messages]
1. initiate call
2. incoming call
3. accept call
4. call connected
5. data flow begins
6. receive data


Datagram networks
 In connectionless service, packets are injected into the
subnet individually and routed independently of each other.
 No advance setup is needed. The packets are frequently
called datagrams and the subnet is called a datagram
subnet.
 Only directly-connected lines can be used.

[figure: two hosts, each with application, transport, network, data link, physical layers]
1. send datagrams
2. receive datagrams


Datagram forwarding table
4 billion IP addresses, so rather than list individual destination address, list range of addresses (aggregate table entries)

routing algorithm computes the local forwarding table:
dest address      output link
address-range 1   3
address-range 2   2
address-range 3   2
address-range 4   1

[figure: IP destination address in arriving packet’s header; router with output links 1, 2, 3]


Datagram forwarding table
Destination Address Range                      Link Interface
11001000 00010111 00010000 00000000
  through                                      0
11001000 00010111 00010111 11111111

11001000 00010111 00011000 00000000
  through                                      1
11001000 00010111 00011000 11111111

11001000 00010111 00011001 00000000
  through                                      2
11001000 00010111 00011111 11111111

otherwise                                      3

Q: but what happens if ranges don’t divide up so nicely?
Longest prefix matching
when looking for forwarding table entry for given destination address, use longest address prefix that matches destination address.
Destination Address Range                 Link interface
11001000 00010111 00010*** ********       0
11001000 00010111 00011000 ********       1
11001000 00010111 00011*** ********       2
otherwise                                 3
examples:
DA: 11001000 00010111 00010110 10100001   which interface?
DA: 11001000 00010111 00011000 10101010   which interface?
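One way to check the two example addresses is a minimal longest-prefix-match sketch. It assumes the three prefixes and the "otherwise" row from the table above; the names `FORWARDING_TABLE` and `lookup` are hypothetical, and a real router would use a trie or TCAM rather than a linear scan.

```python
# Prefixes from the table above, written with spaces for readability.
FORWARDING_TABLE = [
    ("11001000 00010111 00010", 0),
    ("11001000 00010111 00011000", 1),
    ("11001000 00010111 00011", 2),
]
DEFAULT_INTERFACE = 3  # the "otherwise" row


def lookup(dest_bits):
    """Return the link interface of the longest prefix matching dest_bits."""
    dest = dest_bits.replace(" ", "")
    best_len, best_interface = -1, DEFAULT_INTERFACE
    for prefix, interface in FORWARDING_TABLE:
        bits = prefix.replace(" ", "")
        if dest.startswith(bits) and len(bits) > best_len:
            best_len, best_interface = len(bits), interface
    return best_interface


print(lookup("11001000 00010111 00010110 10100001"))
print(lookup("11001000 00010111 00011000 10101010"))
```

The second address matches both the 21-bit and the 24-bit prefix; the longer match decides the interface.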
Datagram Vs. VC network
Connection Setup
   Datagram: None
   Virtual Circuit: Required
Addressing
   Datagram: Packet contains full source and destination address
   Virtual Circuit: Each virtual circuit number entered to table on setup, used for routing.
State Information
   Datagram: None other than router table containing destination network
   Virtual Circuit: Route established at setup, all packets follow same route.
Effect of Router Failure
   Datagram: Only on packets lost during crash
   Virtual Circuit: All virtual circuits passing through failed router terminated.
Congestion Control
   Datagram: Difficult since all packets routed independently; router resource requirements can vary.
   Virtual Circuit: Simple by pre-allocating enough buffers to each virtual circuit at setup, since maximum number of circuits fixed.




Router architecture overview
two key router functions:
 run routing algorithms/protocol (RIP, OSPF, BGP)
 forwarding datagrams from incoming to outgoing link
[figure: router input ports and router output ports joined by a high-speed switching fabric, with a routing processor above]
 routing, management control plane (software): routing processor
 forwarding data plane (hardware): input ports, switching fabric, output ports
 forwarding tables computed, pushed to input ports
Input port functions
[figure: line termination, then link layer protocol (receive), then lookup, forwarding, queueing, then switch fabric]
 physical layer: bit-level reception
 data link layer: e.g., Ethernet (see chapter 5)
 decentralized switching:
   given datagram dest., lookup output port using forwarding table in input port memory (“match plus action”)
   goal: complete input port processing at ‘line speed’
   queuing: if datagrams arrive faster than forwarding rate into switch fabric
Switching fabrics
 It connects the router’s input ports to its output ports.
 transfer packet from input buffer to appropriate output buffer
 switching rate: rate at which packets can be transferred from inputs to outputs
   often measured as multiple of input/output line rate
   N inputs: switching rate N times line rate desirable
 three types of switching fabrics: memory, bus, crossbar

Switching via memory
first generation routers:
 traditional computers with switching under direct control of CPU
 packet copied to system’s memory
 speed limited by memory bandwidth (2 bus crossings per datagram)
[figure: input port (e.g., Ethernet), memory, output port (e.g., Ethernet), connected by the system bus]


Switching via a bus
 datagram from input port memory to output port memory via a shared bus
 bus contention: switching speed limited by bus bandwidth
 32 Gbps bus, Cisco 5600: sufficient speed for access and enterprise routers


Switching via interconnection network
 overcome bus bandwidth limitations
 banyan networks, crossbar, other interconnection nets initially developed to connect processors in multiprocessor
 advanced design: fragmenting datagram into fixed length cells, switch cells through the fabric.
 Cisco 12000: switches 60 Gbps through the interconnection network


Output ports
[figure: switch fabric, then datagram buffer (queueing), then link layer protocol (send), then line termination]
 buffering required when datagrams arrive from fabric faster than the transmission rate
   Datagram (packets) can be lost due to congestion, lack of buffers
 scheduling discipline chooses among queued datagrams for transmission
   Priority scheduling – who gets best performance, network neutrality
Output port queueing
[figure: switch fabric at t, packets move from input to output; switch fabric one packet time later]
 buffering when arrival rate via switch exceeds output line speed
 queueing (delay) and loss due to output port buffer overflow!


How much buffering?
 RFC 3439 rule of thumb: average buffering equal to “typical” RTT (say 250 msec) times link capacity C
   e.g., C = 10 Gbps link: 2.5 Gbit buffer
 recent recommendation: with N flows, buffering equal to RTT · C / √N
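The arithmetic behind both rules can be sketched as follows. This is an illustration only: the function names are hypothetical, and the N-flows rule is assumed to be RTT times C divided by the square root of N.

```python
import math


def buffer_bits_rule_of_thumb(rtt_seconds, capacity_bps):
    """Rule of thumb: buffering = RTT * C."""
    return rtt_seconds * capacity_bps


def buffer_bits_with_flows(rtt_seconds, capacity_bps, n_flows):
    """Assumed N-flows rule: buffering = RTT * C / sqrt(N)."""
    return rtt_seconds * capacity_bps / math.sqrt(n_flows)


# The slide's example: RTT = 250 msec, C = 10 Gbps.
print(buffer_bits_rule_of_thumb(0.25, 10e9))
# Same link shared by a hypothetical 10,000 flows.
print(buffer_bits_with_flows(0.25, 10e9, 10_000))
```

With 10,000 flows the second rule asks for one hundredth of the buffer the first rule does.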


Input port queuing
 fabric slower than input ports combined -> queueing may occur at input queues
   queueing delay and loss due to input buffer overflow!
 Head-of-the-Line (HOL) blocking: queued datagram at front of queue prevents others in queue from moving forward
[figure, first panel: output port contention: only one red datagram can be transferred. lower red packet is blocked]
[figure, second panel: one packet time later: green packet experiences HOL blocking]


The Internet network layer
host, router network layer functions:

transport layer: TCP, UDP

network layer:
 routing protocols
   • path selection
   • RIP, OSPF, BGP
 forwarding table
 IP protocol
   • addressing conventions
   • datagram format
   • packet handling conventions
 ICMP protocol
   • error reporting
   • router “signaling”

link layer

physical layer


IP datagram format
header fields (each row of the header is 32 bits):
 ver: IP protocol version number
 head. len: header length (bytes)
 type of service: “type” of data
 length: total datagram length (bytes)
 16-bit identifier, flgs, fragment offset: for fragmentation/reassembly
 time to live: max number remaining hops (decremented at each router)
 upper layer: upper layer protocol to deliver payload to
 header checksum
 32 bit source IP address
 32 bit destination IP address
 options (if any): e.g. timestamp, record route taken, specify list of routers to visit.
 data (variable length, typically a TCP or UDP segment)
how much overhead?
 20 bytes of TCP
 20 bytes of IP
 = 40 bytes + app layer overhead
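To make the field layout concrete, here is a minimal sketch that unpacks the fixed 20-byte header with Python's `struct` module. The function name `parse_ipv4_header` and the sample values are assumptions for illustration; options are ignored. Note that the header-length field is stored on the wire in 32-bit words, so the sketch multiplies it by 4 to get bytes.

```python
import socket
import struct


def parse_ipv4_header(raw):
    """Unpack the fixed 20-byte IPv4 header (options are not parsed)."""
    (ver_ihl, tos, total_len, ident, flags_frag,
     ttl, proto, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", raw[:20])
    return {
        "version": ver_ihl >> 4,
        "header_len_bytes": (ver_ihl & 0x0F) * 4,  # field counts 32-bit words
        "type_of_service": tos,
        "total_length": total_len,
        "identifier": ident,
        "flags": flags_frag >> 13,
        "fragment_offset": flags_frag & 0x1FFF,    # in units of 8 bytes
        "time_to_live": ttl,
        "upper_layer_protocol": proto,             # 6 = TCP, 17 = UDP
        "header_checksum": checksum,
        "source": socket.inet_ntoa(src),
        "destination": socket.inet_ntoa(dst),
    }


# Hypothetical sample header: 20-byte IP header carrying a 20-byte TCP
# segment (checksum left at 0 for simplicity).
sample = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 1, 0, 64, 6, 0,
                     socket.inet_aton("223.1.1.1"),
                     socket.inet_aton("223.1.2.9"))
header = parse_ipv4_header(sample)
print(header["version"], header["header_len_bytes"], header["source"])
```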
IP fragmentation, reassembly
 network links have MTU (max.transfer size) - largest possible link-level frame
   different link types, different MTUs
 large IP datagram divided (“fragmented”) within net
   one datagram becomes several datagrams
   “reassembled” only at final destination
   IP header bits used to identify, order related fragments
[figure: fragmentation: in: one large datagram, out: 3 smaller datagrams; reassembly at the final destination]


IP fragmentation, reassembly
example:
 4000 byte datagram
 MTU = 1500 bytes

one large datagram becomes several smaller datagrams:
             length   ID   fragflag   offset
original     4000     x    0          0
fragment 1   1500     x    1          0
fragment 2   1500     x    1          185
fragment 3   1040     x    0          370

1480 bytes in data field; offset = 1480/8
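The numbers in this example can be reproduced with a short sketch. It assumes a 20-byte header with no options; the function name `fragment` and the dictionary keys are hypothetical and simply follow the column names above.

```python
def fragment(total_length, mtu, header_len=20):
    """Split one datagram into fragments that fit the MTU.

    Returns a list of dicts with length, fragflag (1 = more fragments
    follow) and offset (in units of 8 bytes).
    """
    payload = total_length - header_len
    # Data carried per fragment must be a multiple of 8 bytes.
    max_data = (mtu - header_len) // 8 * 8
    fragments = []
    sent = 0
    while sent < payload:
        data = min(max_data, payload - sent)
        more = 1 if sent + data < payload else 0
        fragments.append({
            "length": data + header_len,
            "fragflag": more,
            "offset": sent // 8,
        })
        sent += data
    return fragments


for frag in fragment(4000, 1500):
    print(frag)
```

The 4000-byte datagram carries 3980 bytes of data, which is why the last fragment is 1040 bytes (1020 bytes of data plus the header).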




IP Address
 An IP address identifies a specific host in a network.
 IP addresses are 32-bit numbers divided into 4 octets. Each octet is an 8-bit binary number.
 Below is an example of an IP address:

10101100 00010000 11111110 00000001
172      16       254      1

IP addresses are divided into 2 parts: Network ID & Host ID
<NID> <HID> = IP Address
Classification of IP Addresses (Classful Addressing)
Class A: leading bit 0 (fixed) | 7 Bit Network ID | 24 Bit Host ID
Class B: leading bits 10 (fixed) | 14 Bit Network ID | 16 Bit Host ID
Class C: leading bits 110 (fixed) | 21 Bit Network ID | 8 Bit Host ID
Class D: leading bits 1110 (fixed) | Multicast address
Class E: leading bits 1111 (fixed) | Reserved address
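Because the class is fixed by the leading bits, it can be read off the first octet. A minimal sketch, with the hypothetical helper name `address_class`:

```python
def address_class(address):
    """Return the classful class (A to E) of a dotted-decimal IPv4 address."""
    first_octet = int(address.split(".")[0])
    if first_octet < 128:   # leading bit 0
        return "A"
    if first_octet < 192:   # leading bits 10
        return "B"
    if first_octet < 224:   # leading bits 110
        return "C"
    if first_octet < 240:   # leading bits 1110
        return "D"
    return "E"              # leading bits 1111


print(address_class("172.16.254.1"))
```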
Class A: (0.0.0.0 to 127.255.255.255)
leading bit 0 (fixed) | 7 Bit Network ID | 24 Bit Host ID
 Only 126 addresses are used for network address.
 All 0’s and all 1’s in the Network-ID are dedicated for special IP addresses. So the IP addresses in class A can be represented:
0.0.0.0            Special IP Address
1.0.0.1
1.0.0.2
1.0.0.3
.
.                  2^24 – 2 are Host IP
.
126.255.255.254
127.255.255.255    Special IP Address – Loopback
Class B: (128.0.0.0 to 191.255.255.255)
leading bits 10 (fixed) | 14 Bit Network ID | 16 Bit Host ID
 No special network address here. All are usable.
128.0.0.0          Special IP Address
129.0.0.1
130.0.0.2
130.0.0.3
.
.                  2^16 – 2 are Host IP
.
190.255.255.254
191.255.255.255    Special IP Address
Class C: (192.0.0.0 to 223.255.255.255)
leading bits 110 (fixed) | 21 Bit Network ID | 8 Bit Host ID
192.0.0.0          Special IP Address
193.0.0.1
194.0.0.2
194.0.0.3
.
.                  2^8 – 2 are Host IP
.
222.255.255.254
223.255.255.255    Special IP Address
Class D: (224.0.0.0 to 239.255.255.255)
 The first four bits of the first octet in Class D IP addresses are set to 1110, giving a range of:

11100000 – 11101111
224 - 239
 Class D has IP address range from 224.0.0.0 to 239.255.255.255.
 Class D is reserved for Multicasting.
 In multicasting, data is not destined for a particular host. That is why there is no need to extract a host address from the IP address, and Class D does not have any subnet mask.
Class E: (240.0.0.0 to 255.255.255.255)
 This IP Class is reserved for experimental purposes only, for R&D or study.
 IP addresses in this class range from 240.0.0.0 to 255.255.255.254.
 Like Class D, this class too is not equipped with any subnet mask.
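IP Addressing Summary
Class A
   Leading bits: 0
   Size of network number bit field: 8; size of rest bit field: 24
   Number of networks: 128 (2^7); addresses per network: 16,777,216 (2^24)
   Total addresses in class: 2,147,483,648 (2^31)
   Start address: 0.0.0.0; end address: 127.255.255.255
   Default subnet mask in dot-decimal notation: 255.0.0.0; CIDR notation: /8
Class B
   Leading bits: 10
   Size of network number bit field: 16; size of rest bit field: 16
   Number of networks: 16,384 (2^14); addresses per network: 65,536 (2^16)
   Total addresses in class: 1,073,741,824 (2^30)
   Start address: 128.0.0.0; end address: 191.255.255.255
   Default subnet mask in dot-decimal notation: 255.255.0.0; CIDR notation: /16
Class C
   Leading bits: 110
   Size of network number bit field: 24; size of rest bit field: 8
   Number of networks: 2,097,152 (2^21); addresses per network: 256 (2^8)
   Total addresses in class: 536,870,912 (2^29)
   Start address: 192.0.0.0; end address: 223.255.255.255
   Default subnet mask in dot-decimal notation: 255.255.255.0; CIDR notation: /24
Class D (multicast)
   Leading bits: 1110
   Bit field sizes, number of networks, addresses per network: not defined
   Total addresses in class: 268,435,456 (2^28)
   Start address: 224.0.0.0; end address: 239.255.255.255
   Default subnet mask, CIDR notation: not defined
Class E (reserved)
   Leading bits: 1111
   Bit field sizes, number of networks, addresses per network: not defined
   Total addresses in class: 268,435,456 (2^28)
   Start address: 240.0.0.0; end address: 255.255.255.255
   Default subnet mask, CIDR notation: not defined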
IP addressing: introduction
 IP address: 32-bit identifier for host, router interface
 interface: connection between host/router and physical link
   routers typically have multiple interfaces
   host typically has one or two interfaces (e.g., wired Ethernet, wireless 802.11)
 IP addresses associated with each interface
[figure: network with interfaces 223.1.1.1, 223.1.1.2, 223.1.1.3, 223.1.1.4, 223.1.2.1, 223.1.2.2, 223.1.2.9, 223.1.3.1, 223.1.3.2, 223.1.3.27]

223.1.1.1 = 11011111 00000001 00000001 00000001
            223      1        1        1


Subnets
 IP address:
   subnet part - high order bits
   host part - low order bits
 what’s a subnet ?
   device interfaces with same subnet part of IP address
   can physically reach each other without intervening router
[figure: network consisting of 3 subnets, with interfaces 223.1.1.1, 223.1.1.2, 223.1.1.3, 223.1.1.4, 223.1.2.1, 223.1.2.2, 223.1.2.9, 223.1.3.1, 223.1.3.2, 223.1.3.27]


Subnets
recipe
 to determine the subnets, detach each interface from its host or router, creating islands of isolated networks
 each isolated network is called a subnet
[figure: the same network split into subnets 223.1.1.0/24, 223.1.2.0/24, 223.1.3.0/24]
subnet mask: /24
Subnets
how many?
[figure: network with interfaces 223.1.1.1, 223.1.1.2, 223.1.1.3, 223.1.1.4, 223.1.2.1, 223.1.2.2, 223.1.2.6, 223.1.3.1, 223.1.3.2, 223.1.3.27, 223.1.7.0, 223.1.7.1, 223.1.8.0, 223.1.8.1, 223.1.9.1, 223.1.9.2]


Type of addresses in IPv4 Network
 Network address - The address by which we refer to the network.
   E.g.: 10.0.0.0
 Broadcast address - A special address used to send data to all hosts in the network.
   The broadcast address uses the highest address in the network range.
   E.g.: 10.0.0.255
 Host addresses - The addresses assigned to the end devices in the network.
   E.g.: 10.0.0.1
IP addressing: CIDR
CIDR: Classless InterDomain Routing
 subnet portion of address of arbitrary length
 address format: a.b.c.d/x, where x is # bits in subnet portion of address

subnet part (first 23 bits): 11001000 00010111 0001000
host part (last 9 bits):     0 00000000
200.23.16.0/23
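Python's standard `ipaddress` module can be used to inspect the a.b.c.d/x example. This is just a sketch; the helper name `describe` and the test addresses are assumptions.

```python
import ipaddress


def describe(cidr):
    """Return (netmask, number of host bits, number of addresses) for a.b.c.d/x."""
    net = ipaddress.ip_network(cidr)
    return str(net.netmask), 32 - net.prefixlen, net.num_addresses


net = ipaddress.ip_network("200.23.16.0/23")
print(describe("200.23.16.0/23"))
# Membership test: is an address inside the 23-bit subnet part?
print(ipaddress.ip_address("200.23.17.42") in net)
print(ipaddress.ip_address("200.23.18.1") in net)
```

With x = 23 the subnet covers 200.23.16.0 through 200.23.17.255, so the second membership test falls outside it.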


Subnetting
 Subnetting takes place when we extend the default subnet mask.
 We cannot perform subnetting with the default subnet mask, and every class has a default subnet mask.
 Now find the host bits borrowed to create subnets and convert them to decimal.
 For example, find the subnet mask of address 188.25.45.48/20.
1. Class B, Default Subnet mask: 255.255.0.0
2. Borrowed 4 bits from host part (20 - 16 = 4), so mask is now:
11111111 11111111 11110000 00000000
255      255      240      0
How many subnets from given subnet mask?
 To calculate the number of subnets provided by a given subnet mask we use 2^N, where N = number of bits borrowed from host bits to create subnets.
 For example, in 192.168.1.0/27, N is 3.
 By looking at the address we can determine that it belongs to class C, whose default subnet mask is 255.255.255.0 [/24 in CIDR].
 In the given address we borrowed 27 - 24 = 3 host bits to create subnets.
 Now 2^3 = 8, so our answer is 8.
What are the valid subnets?
 Calculating the valid subnets is a two-step process.
 First calculate the total number of subnets by using the formula 2^N.
 In the second step find the block size and count from zero in blocks until the subnet mask value.
 For example, calculate the valid subnets for 192.168.1.0/26
1. Borrowed host bits are 2 [26-24]
2. Total subnets are 2^2 = 4
3. Subnet mask would be 255.255.255.192
4. Block size would be 256-192 = 64
5. Start counting from zero in blocks of 64, so our valid subnets would be 0, 64, 128, 192
What are the total hosts?
 Total hosts are the hosts available per subnet
 To calculate total hosts use formula 2^H = Total hosts
 H is the number of host bits
 For example, in address 192.168.1.0/26
1. We have 32 - 26 = 6 host bits [Total bits in IP address - Bits consumed by network address]
2. Total hosts per subnet would be 2^6 = 64
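The subnetting steps above (borrowed bits N, host bits H, valid subnets, hosts per subnet) can be checked with a small sketch built on the standard `ipaddress` module. The function name `subnet_summary` and its `default_prefix` parameter (the classful default, e.g. 24 for class C) are assumptions for illustration.

```python
import ipaddress


def subnet_summary(cidr, default_prefix=24):
    """Summarise subnetting of a classful network.

    default_prefix is the class's default mask length
    (8 for class A, 16 for class B, 24 for class C).
    """
    net = ipaddress.ip_network(cidr, strict=False)
    borrowed = net.prefixlen - default_prefix   # N
    host_bits = 32 - net.prefixlen              # H
    parent = net.supernet(new_prefix=default_prefix)
    subnets = parent.subnets(new_prefix=net.prefixlen)
    return {
        "mask": str(net.netmask),
        "total_subnets": 2 ** borrowed,
        "valid_subnets": [str(s.network_address) for s in subnets],
        "total_hosts": 2 ** host_bits,
        "valid_hosts": 2 ** host_bits - 2,
    }


summary = subnet_summary("192.168.1.0/26")
for key, value in summary.items():
    print(key, value)
```

Calling `subnet_summary("188.25.45.48/20", default_prefix=16)` gives the class B mask from the earlier example.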
Network Prefixes
 The default subnet mask of class C is 255.255.255.0
 CIDR notation of class C is /24, which means 24 bits from the IP address are already consumed by the network portion.
 We have 8 host bits remaining.
 Subnetting moves from left to right. So Class C subnet masks can only be the following:
CIDR   Decimal   Binary
/25    128       10000000
/26    192       11000000
/27    224       11100000
/28    240       11110000
/29    248       11111000
/30    252       11111100
Network Prefixes - Example
 /25
   CIDR /25 has subnet mask 255.255.255.128, and 128 is 10000000 in binary.
   We used one host bit in the network address.
   N = 1 [Number of borrowed host bits]
   H = 7 [Remaining host bits]
   Total subnets (2^N): 2^1 = 2
   Block size (256 - subnet mask): 256 - 128 = 128
   Valid subnets (count blocks from 0): 0, 128
   Total hosts (2^H): 2^7 = 128
   Valid hosts per subnet (total hosts - 2): 128 - 2 = 126
IP addresses: how to get one?
Q: How does a host get IP address?
 hard-coded by system admin in a file
   Windows: control-panel->network->configuration->tcp/ip->properties
   UNIX: /etc/rc.config
 DHCP: Dynamic Host Configuration Protocol: dynamically get address from a server
   “plug-and-play”


DHCP: Dynamic Host Configuration
Protocol
goal: allow host to dynamically obtain its IP address from network
server when it joins network
 can renew its lease on address in use
 allows reuse of addresses (only hold address while connected/“on”)
 support for mobile users who want to join network (more shortly)
DHCP overview:
 host broadcasts “DHCP discover” msg [optional]
 DHCP server responds with “DHCP offer” msg [optional]
 host requests IP address: “DHCP request” msg
 DHCP server sends address: “DHCP ack” msg
DHCP client-server scenario
[figure: network with subnets 223.1.1.0/24, 223.1.2.0/24, 223.1.3.0/24 and interfaces 223.1.1.1, 223.1.1.2, 223.1.1.3, 223.1.1.4, 223.1.2.1, 223.1.2.2, 223.1.2.9, 223.1.3.1, 223.1.3.2, 223.1.3.27; a DHCP server is attached; arriving DHCP client needs address in this network (223.1.2.0/24)]
DHCP client-server scenario
DHCP server: 223.1.2.5; arriving client

DHCP discover (Broadcast: is there a DHCP server out there?)
   src: 0.0.0.0, 68
   dest: 255.255.255.255, 67
   yiaddr: 0.0.0.0
   transaction ID: 654

DHCP offer (Broadcast: I’m a DHCP server! Here’s an IP address you can use)
   src: 223.1.2.5, 67
   dest: 255.255.255.255, 68
   yiaddr: 223.1.2.4
   transaction ID: 654
   lifetime: 3600 secs

DHCP request (Broadcast: OK. I’ll take that IP address!)
   src: 0.0.0.0, 68
   dest: 255.255.255.255, 67
   yiaddr: 223.1.2.4
   transaction ID: 655
   lifetime: 3600 secs

DHCP ACK (Broadcast: OK. You’ve got that IP address!)
   src: 223.1.2.5, 67
   dest: 255.255.255.255, 68
   yiaddr: 223.1.2.4
   transaction ID: 655
   lifetime: 3600 secs
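The four-message exchange can be mimicked with a toy in-memory model. This is a sketch under loud assumptions: `DhcpMessage` and `ToyDhcpServer` are hypothetical names, nothing here touches the network or follows the real DHCP wire format, and the address pool holds just the one address from the scenario.

```python
from dataclasses import dataclass

BROADCAST = "255.255.255.255"


@dataclass
class DhcpMessage:
    kind: str            # "discover", "offer", "request" or "ack"
    src: tuple           # (IP address, UDP port)
    dest: tuple
    yiaddr: str
    transaction_id: int
    lifetime: int = 0


class ToyDhcpServer:
    """Toy server: offers the first free address, acks a matching request."""

    def __init__(self, server_ip, pool, lifetime=3600):
        self.server_ip = server_ip
        self.free = list(pool)
        self.leased = set()
        self.lifetime = lifetime

    def _reply(self, kind, yiaddr, transaction_id):
        return DhcpMessage(kind, (self.server_ip, 67), (BROADCAST, 68),
                           yiaddr, transaction_id, self.lifetime)

    def handle(self, msg):
        if msg.kind == "discover" and self.free:
            return self._reply("offer", self.free[0], msg.transaction_id)
        if msg.kind == "request" and msg.yiaddr in self.free:
            self.free.remove(msg.yiaddr)
            self.leased.add(msg.yiaddr)
            return self._reply("ack", msg.yiaddr, msg.transaction_id)
        return None


server = ToyDhcpServer("223.1.2.5", ["223.1.2.4"])
discover = DhcpMessage("discover", ("0.0.0.0", 68), (BROADCAST, 67), "0.0.0.0", 654)
offer = server.handle(discover)
request = DhcpMessage("request", ("0.0.0.0", 68), (BROADCAST, 67), offer.yiaddr, 655)
ack = server.handle(request)
print(offer)
print(ack)
```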
DHCP: more than IP addresses
DHCP can return more than just allocated IP address on subnet:
 address of first-hop router for client
 name and IP address of DNS server
 network mask (indicating network versus host portion of address)


DHCP: example
 connecting laptop needs its IP address, addr of first-hop router, addr of DNS server: use DHCP
 DHCP request encapsulated in UDP, encapsulated in IP, encapsulated in 802.3 Ethernet
 Ethernet frame broadcast (dest: FFFFFFFFFFFF) on LAN, received at router running DHCP server
 Ethernet demuxed to IP demuxed, UDP demuxed to DHCP
[figure: laptop and router (168.1.1.1) with DHCP server built into router; the DHCP message passes through the DHCP, UDP, IP, Eth, Phy layers on each side]
DHCP: example
 DHCP server formulates DHCP ACK containing client’s IP address, IP address of first-hop router for client, name & IP address of DNS server
 encapsulation at DHCP server, frame forwarded to client, demuxing up to DHCP at client
 client now knows its IP address, name and IP address of DNS server, IP address of its first-hop router
[figure: router with DHCP server built into router; the DHCP ACK passes through the DHCP, UDP, IP, Eth, Phy layers on each side]
IP addresses: how to get one?
Q: how does network get subnet part of IP addr?
A: gets allocated portion of its provider ISP’s address space

ISP's block      11001000 00010111 00010000 00000000   200.23.16.0/20

Organization 0   11001000 00010111 00010000 00000000   200.23.16.0/23
Organization 1   11001000 00010111 00010010 00000000   200.23.18.0/23
Organization 2   11001000 00010111 00010100 00000000   200.23.20.0/23
...              …..                                   ….
Organization 7   11001000 00010111 00011110 00000000   200.23.30.0/23
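The split of the ISP's /20 block into eight /23 blocks can be reproduced with the standard `ipaddress` module. A short sketch; the variable names are illustrative only.

```python
import ipaddress

isp_block = ipaddress.ip_network("200.23.16.0/20")

# Carve the /20 into equal /23 blocks, one per organization.
organizations = list(isp_block.subnets(new_prefix=23))

for number, block in enumerate(organizations):
    print(f"Organization {number}: {block}")
```

Going from /20 to /23 borrows 3 bits, which is why there are exactly eight organizations (0 through 7).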


Hierarchical addressing: route aggregation
hierarchical addressing allows efficient advertisement of routing information:
[figure: Organization 0 (200.23.16.0/23), Organization 1 (200.23.18.0/23), Organization 2 (200.23.20.0/23), ..., Organization 7 (200.23.30.0/23) connect to Fly-By-Night-ISP; Fly-By-Night-ISP and ISPs-R-Us connect to the Internet]
 Fly-By-Night-ISP: “Send me anything with addresses beginning 200.23.16.0/20”
 ISPs-R-Us: “Send me anything with addresses beginning 199.31.0.0/16”


Hierarchical addressing: more specific routes
ISPs-R-Us has a more specific route to Organization 1
[figure: Organization 0 (200.23.16.0/23), Organization 2 (200.23.20.0/23), ..., Organization 7 (200.23.30.0/23) connect to Fly-By-Night-ISP; Organization 1 (200.23.18.0/23) now connects to ISPs-R-Us; both ISPs connect to the Internet]
 Fly-By-Night-ISP: “Send me anything with addresses beginning 200.23.16.0/20”
 ISPs-R-Us: “Send me anything with addresses beginning 199.31.0.0/16 or 200.23.18.0/23”


IP addressing: the last word...
Q: how does an ISP get block of addresses?
A: ICANN: Internet Corporation for Assigned Names and Numbers http://www.icann.org/
 allocates addresses
 manages DNS
 assigns domain names, resolves disputes
NAT: network address translation
[figure: local network (e.g., home network) 10.0.0/24 with addresses 10.0.0.1, 10.0.0.2, 10.0.0.3, 10.0.0.4; NAT router address 138.76.29.7 faces the rest of Internet]
 all datagrams leaving local network have same single source NAT IP address: 138.76.29.7, different source port numbers
 datagrams with source or destination in this network have 10.0.0/24 address for source, destination (as usual)
NAT: network address
translation
motivation: local network uses just one IP
address as far as outside world is concerned:
 range of addresses not needed from ISP: just
one IP address for all devices
 can change addresses of devices in local
network without notifying outside world
 can change ISP without changing addresses
of devices in local network
 devices inside local net not explicitly
addressable, visible by outside world (a
security plus)
NAT: network address translation
implementation: NAT router must:
 outgoing datagrams: replace (source IP address, port #) of every outgoing datagram to (NAT IP address, new port #)
   . . . remote clients/servers will respond using (NAT IP address, new port #) as destination addr
 remember (in NAT translation table) every (source IP address, port #) to (NAT IP address, new port #) translation pair
 incoming datagrams: replace (NAT IP address, new port #) in dest fields of every incoming datagram with corresponding (source IP address, port #) stored in NAT table
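The three duties listed above map onto a small table-driven sketch. It is a simplification under stated assumptions: the class name `ToyNat` is hypothetical, new port numbers are handed out sequentially starting at 5001, and entries never expire.

```python
class ToyNat:
    """Toy NAT router: rewrites (IP address, port #) pairs using a translation table."""

    def __init__(self, nat_ip, first_port=5001):
        self.nat_ip = nat_ip
        self.next_port = first_port
        self.lan_to_wan = {}   # (source IP, port) -> (NAT IP, new port)
        self.wan_to_lan = {}   # reverse mapping for incoming datagrams

    def outgoing(self, src, dst):
        """Replace the source pair of an outgoing datagram and remember the mapping."""
        if src not in self.lan_to_wan:
            wan = (self.nat_ip, self.next_port)
            self.next_port += 1
            self.lan_to_wan[src] = wan
            self.wan_to_lan[wan] = src
        return self.lan_to_wan[src], dst

    def incoming(self, src, dst):
        """Replace the destination pair of an incoming datagram from the table."""
        return src, self.wan_to_lan[dst]


nat = ToyNat("138.76.29.7")
sent = nat.outgoing(("10.0.0.1", 3345), ("128.119.40.186", 80))
reply = nat.incoming(("128.119.40.186", 80), ("138.76.29.7", 5001))
print(sent)
print(reply)
```

A datagram arriving for a (NAT IP address, port #) pair that is not in the table raises `KeyError` here, which is the root of the traversal problem for servers behind a NAT.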


NAT: network address translation
NAT translation table
WAN side addr        LAN side addr
138.76.29.7, 5001    10.0.0.1, 3345
……                   ……

1: host 10.0.0.1 sends datagram to 128.119.40.186, 80
   S: 10.0.0.1, 3345        D: 128.119.40.186, 80
2: NAT router changes datagram source addr from 10.0.0.1, 3345 to 138.76.29.7, 5001, updates table
   S: 138.76.29.7, 5001     D: 128.119.40.186, 80
3: reply arrives dest. address: 138.76.29.7, 5001
   S: 128.119.40.186, 80    D: 138.76.29.7, 5001
4: NAT router changes datagram dest addr from 138.76.29.7, 5001 to 10.0.0.1, 3345
   S: 128.119.40.186, 80    D: 10.0.0.1, 3345


NAT: network address
translation
 16-bit port-number field:
 60,000 simultaneous connections with a
single LAN-side address!
 NAT is controversial:
 routers should only process up to layer 3
 violates end-to-end argument
• NAT possibility must be taken into account
by app designers, e.g., P2P applications
 address shortage should instead be
solved by IPv6
NAT traversal problem
 client wants to connect to server with address 10.0.0.1
   server address 10.0.0.1 local to LAN (client can’t use it as destination addr)
   only one externally visible NATed address: 138.76.29.7
 solution 1: statically configure NAT to forward incoming connection requests at given port to server
   e.g., (138.76.29.7, port 2500) always forwarded to 10.0.0.1 port 25000
[figure: client, NAT router (138.76.29.7 outside, 10.0.0.4 inside), server 10.0.0.1]
NAT traversal problem
 solution 2: Universal Plug and Play (UPnP) Internet
Gateway Device (IGD) Protocol. Allows NATed host to:
 learn public IP address (138.76.29.7)
 add/remove port mappings (with lease times)
i.e., automate static NAT port map configuration
[figure: NATed host 10.0.0.1 talks IGD to the NAT router]

Network Layer 4-75


NAT traversal problem
 solution 3: relaying (used in Skype)
 NATed client establishes connection to relay
 external client connects to relay
 relay bridges packets between the two connections
1. connection to relay initiated by NATed host (10.0.0.1)
2. connection to relay initiated by client
3. relaying established
[figure: client, relay, NAT router 138.76.29.7, NATed host 10.0.0.1]

Network Layer 4-76


Chapter 4: outline
4.1 introduction
4.2 virtual circuit and datagram networks
4.3 what’s inside a router
4.4 IP: Internet Protocol
 datagram format
 IPv4 addressing
 ICMP
 IPv6
4.5 routing algorithms
 link state
 distance vector
 hierarchical routing
4.6 routing in the Internet
 RIP
 OSPF
 BGP
4.7 broadcast and multicast routing

Network Layer 4-77


ICMP: internet control message protocol
 used by hosts & routers to communicate network-level information
 error reporting: unreachable host, network, port, protocol
 echo request/reply (used by ping)
 network-layer “above” IP:
 ICMP msgs carried in IP datagrams
 ICMP message: type, code plus first 8 bytes of IP datagram causing error

Type  Code  description
0     0     echo reply (ping)
3     0     dest. network unreachable
3     1     dest host unreachable
3     2     dest protocol unreachable
3     3     dest port unreachable
3     6     dest network unknown
3     7     dest host unknown
4     0     source quench (congestion control - not used)
8     0     echo request (ping)
9     0     route advertisement
10    0     router discovery
11    0     TTL expired
12    0     bad IP header
Network Layer 4-78
Traceroute and ICMP
 source sends series of UDP segments to dest
 first set has TTL = 1
 second set has TTL = 2, etc.
 unlikely port number
 when nth set of datagrams arrives at nth router:
 router discards datagrams
 and sends source ICMP messages (type 11, code 0)
 ICMP messages include name of router & IP address
 when ICMP message arrives, source records RTTs
stopping criteria:
 UDP segment eventually arrives at destination host
 destination returns ICMP “port unreachable” message (type 3, code 3)
 source stops
[figure: source sends 3 probes for each TTL value]
Network Layer 4-79
IPv6: motivation
 initial motivation: 32-bit address space
soon to be completely allocated.
 additional motivation:
 header format helps speed
processing/forwarding
 header changes to facilitate QoS

IPv6 datagram format:
 fixed-length 40 byte header
 no fragmentation allowed at intermediate routers

Network Layer 4-80


IPv6 datagram format
priority: identify priority among datagrams in flow
flow label: identify datagrams in same “flow”
(concept of “flow” not well defined).
next header: identify upper layer protocol for data

header layout (each row is 32 bits wide):
| ver | pri | flow label                   |
| payload len     | next hdr | hop limit  |
| source address (128 bits)               |
| destination address (128 bits)          |
| data                                    |
Network Layer 4-81
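The fixed 40-byte header described above can be packed with Python's standard `struct` and `ipaddress` modules. This is a sketch under stated assumptions: it uses the field widths of the current IPv6 specification (4-bit version, 8-bit traffic class in the position the slide labels "pri", 20-bit flow label), and the function name, the default hop limit of 64, and the documentation-range addresses are illustrative choices.

```python
import ipaddress
import struct


def build_ipv6_header(src, dst, payload_len, next_header,
                      hop_limit=64, traffic_class=0, flow_label=0):
    """Pack the 40-byte fixed IPv6 header."""
    # first 32-bit word: version (4 bits) | traffic class (8) | flow label (20)
    first_word = (6 << 28) | (traffic_class << 20) | flow_label
    return (struct.pack("!IHBB", first_word, payload_len, next_header, hop_limit)
            + ipaddress.IPv6Address(src).packed    # 128-bit source address
            + ipaddress.IPv6Address(dst).packed)   # 128-bit destination address


# next_header=6 identifies TCP as the upper-layer protocol
header = build_ipv6_header("2001:db8::1", "2001:db8::2",
                           payload_len=20, next_header=6)
```

The header is the same size for every datagram. Options, when present, are chained through the next header field rather than lengthening this structure.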
Other changes from IPv4
 checksum: removed entirely to reduce
processing time at each hop
 options: allowed, but outside of header,
indicated by “Next Header” field
 ICMPv6: new version of ICMP
 additional message types, e.g. “Packet Too
Big”
 multicast group management functions

Network Layer 4-82


Transition from IPv4 to
IPv6
 not all routers can be upgraded simultaneously
 no “flag days”
 how will network operate with mixed IPv4
and IPv6 routers?
 tunneling: IPv6 datagram carried as payload in
IPv4 datagram among IPv4 routers
IPv4 datagram = IPv4 header fields (IPv4 source, dest addr) + IPv4 payload
IPv4 payload = IPv6 datagram = IPv6 header fields (IPv6 source, dest addr) + UDP/TCP payload
Network Layer 4-83
Tunneling
logical view: A (IPv6), B (IPv6), IPv4 tunnel connecting IPv6 routers, E (IPv6), F (IPv6)
physical view: A (IPv6), B (IPv6), C (IPv4), D (IPv4), E (IPv6), F (IPv6)

A-to-B: IPv6 datagram (flow: X, src: A, dest: F, data)
B-to-C: IPv6 inside IPv4 (IPv4 src: B, dest: E; payload is the IPv6
        datagram with flow: X, src: A, dest: F, data)
D-to-E: IPv6 inside IPv4 (same encapsulated datagram as B-to-C)
E-to-F: IPv6 datagram (flow: X, src: A, dest: F, data)
Network Layer 4-85
IPv6: adoption
 US National Institute of Standards and Technology (NIST)
estimate [2013]:
 ~3% of industry IP routers
 ~11% of US gov’t routers
 Long (long!) time for deployment, use
 20 years and counting!
 think of application-level changes in last 20
years: WWW, Facebook, …
 Why?

Network Layer 4-86


Interplay between routing, forwarding
routing algorithm determines end-end-path through network
forwarding table determines local forwarding at this router

local forwarding table
dest address       output link
address-range 1    3
address-range 2    2
address-range 3    2
address-range 4    1

[figure: routing algorithm fills the local forwarding table; the IP destination address in the arriving packet’s header is looked up there; router output links 1, 2, 3]

Network Layer 4-88


Graph abstraction
graph: G = (N,E)
N = set of routers = { u, v, w, x, y, z }
E = set of links = { (u,v), (u,x), (u,w), (v,x), (v,w), (x,w), (x,y), (w,y), (w,z), (y,z) }
[figure: six-node graph with link costs c(u,v)=2, c(u,x)=1, c(u,w)=5, c(v,x)=2, c(v,w)=3, c(x,w)=3, c(x,y)=1, c(w,y)=1, c(w,z)=5, c(y,z)=2]
aside: graph abstraction is useful in other network contexts, e.g.,
P2P, where N is set of peers and E is set of TCP connections

Network Layer 4-89


Graph abstraction: costs
c(x,x’) = cost of link (x,x’)
e.g., c(w,z) = 5
cost could always be 1, or inversely related to bandwidth,
or directly related to congestion

cost of path (x1, x2, x3,…, xp) = c(x1,x2) + c(x2,x3) + … + c(xp-1,xp)

key question: what is the least-cost path between u and z?
routing algorithm: algorithm that finds that least-cost path
Network Layer 4-90
Routing algorithm classification
Q: global or decentralized information?
global:
 all routers have complete topology, link cost info
 “link state” algorithms
decentralized:
 router knows physically-connected neighbors, link costs to neighbors
 iterative process of computation, exchange of info with neighbors
 “distance vector” algorithms
Q: static or dynamic?
static:
 routes change slowly over time
dynamic:
 routes change more quickly
 periodic update
 in response to link cost changes

Network Layer 4-91


A Link-State Routing Algorithm
Dijkstra’s algorithm
 net topology, link costs known to all nodes
 accomplished via “link state broadcast”
 all nodes have same info
 computes least cost paths from one node (“source”) to all other nodes
 gives forwarding table for that node
 iterative: after k iterations, know least cost path to k dest.’s
notation:
 c(x,y): link cost from node x to y; = ∞ if not direct neighbors
 D(v): current value of cost of path from source to dest. v
 p(v): predecessor node along path from source to v
 N': set of nodes whose least cost path definitively known
Network Layer 4-93
Dijkstra’s Algorithm
Initialization:
  N' = {u}
  for all nodes v
    if v adjacent to u
      then D(v) = c(u,v)
    else D(v) = ∞

Loop
  find w not in N' such that D(w) is a minimum
  add w to N'
  update D(v) for all v adjacent to w and not in N' :
    D(v) = min( D(v), D(w) + c(w,v) )
  /* new cost to v is either old cost to v or known
     shortest path cost to w plus cost from w to v */
until all nodes in N'
Network Layer 4-94
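The pseudocode above translates fairly directly into Python. The sketch below is one possible rendering, not the textbook's own code: it uses a binary heap instead of a linear scan for the minimum, the names `dijkstra`, `links`, and `graph` are illustrative, and the link costs are assumed from the six-node example graph used elsewhere in this chapter.

```python
import heapq


def dijkstra(graph, source):
    """Return (D, p): least costs and predecessors from source.

    graph maps node -> {neighbor: link cost}.
    """
    D = {node: float("inf") for node in graph}
    p = {}
    D[source] = 0
    done = set()                      # N' in the slide's notation
    heap = [(0, source)]
    while heap:
        cost, w = heapq.heappop(heap)
        if w in done:
            continue                  # stale heap entry, already finalized
        done.add(w)
        for v, c_wv in graph[w].items():
            if v not in done and cost + c_wv < D[v]:
                D[v] = cost + c_wv    # D(v) = min(D(v), D(w) + c(w,v))
                p[v] = w
                heapq.heappush(heap, (D[v], v))
    return D, p


# link costs assumed from the six-node example graph
links = [("u", "v", 2), ("u", "x", 1), ("u", "w", 5), ("v", "x", 2),
         ("v", "w", 3), ("x", "w", 3), ("x", "y", 1), ("w", "y", 1),
         ("w", "z", 5), ("y", "z", 2)]
graph = {}
for a, b, c in links:
    graph.setdefault(a, {})[b] = c
    graph.setdefault(b, {})[a] = c

D, p = dijkstra(graph, "u")
```

With these costs the least cost from u to z is 4, reached through x and y. The heap makes each "find the minimum" step logarithmic rather than linear in the number of nodes.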
Dijkstra’s algorithm: example
Step  N'       D(v),p(v)  D(w),p(w)  D(x),p(x)  D(y),p(y)  D(z),p(z)
0     u        7,u        3,u        5,u        ∞          ∞
1     uw       6,w                   5,u        11,w       ∞
2     uwx      6,w                              11,w       14,x
3     uwxv                                      10,v       14,x
4     uwxvy                                                12,y
5     uwxvyz

notes:
 construct shortest path tree by tracing predecessor nodes
 ties can exist (can be broken arbitrarily)
[figure: graph on nodes u, v, w, x, y, z with link costs]
Network Layer 4-95
Dijkstra’s algorithm: another
example
Step  N'       D(v),p(v)  D(w),p(w)  D(x),p(x)  D(y),p(y)  D(z),p(z)
0     u        2,u        5,u        1,u        ∞          ∞
1     ux       2,u        4,x                   2,x        ∞
2     uxy      2,u        3,y                              4,y
3     uxyv                3,y                              4,y
4     uxyvw                                                4,y
5     uxyvwz
[figure: six-node example graph on u, v, w, x, y, z with link costs]

Network Layer 4-96


Dijkstra’s algorithm: example (2)
resulting shortest-path tree from u:
[figure: tree with links (u,v), (u,x), (x,y), (y,w), (y,z)]

resulting forwarding table in u:
destination   link
v             (u,v)
x             (u,x)
y             (u,x)
w             (u,x)
z             (u,x)
Network Layer 4-97
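The step from shortest-path tree to forwarding table can be made explicit: for each destination, walk the predecessor chain back until the node whose predecessor is the source, and record that first link. The sketch below assumes the predecessor values p(v) from the final rows of the example table; the function name is illustrative.

```python
def forwarding_table(source, pred):
    """Map each destination to the first link on its least-cost path."""
    table = {}
    for dest in pred:
        hop = dest
        while pred[hop] != source:   # walk back toward the source
            hop = pred[hop]
        table[dest] = (source, hop)
    return table


# predecessors p(v) from the example: v and x hang off u, y off x, w and z off y
pred = {"v": "u", "x": "u", "y": "x", "w": "y", "z": "y"}
table = forwarding_table("u", pred)
```

Only the first hop is stored, which is why four of the five destinations share the single entry (u,x).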
Dijkstra’s algorithm, discussion
algorithm complexity: n nodes
 each iteration: need to check all nodes, w, not in N'
 n(n+1)/2 comparisons: O(n^2)
 more efficient implementations possible: O(n log n)
oscillations possible:
 e.g., suppose link cost equals amount of carried traffic:
[figure: four nodes A, B, C, D in a ring, with traffic of 1, 1 and e sent toward A; four panels showing link costs: “initially”, then three times “given these costs, find new routing… resulting in new costs”]
Network Layer 4-98
Distance vector algorithm
Bellman-Ford equation (dynamic programming)

let
  dx(y) := cost of least-cost path from x to y
then
  dx(y) = minv { c(x,v) + dv(y) }
where c(x,v) is the cost to neighbor v, dv(y) is the cost from
neighbor v to destination y, and the min is taken over all neighbors v of x
Network Layer 4-100
Bellman-Ford example
[figure: six-node example graph on u, v, w, x, y, z with link costs]
clearly, dv(z) = 5, dx(z) = 3, dw(z) = 3

B-F equation says:
du(z) = min { c(u,v) + dv(z),
              c(u,x) + dx(z),
              c(u,w) + dw(z) }
      = min { 2 + 5,
              1 + 3,
              5 + 3 } = 4

node achieving minimum is next hop in shortest path, used in forwarding table
Network Layer 4-101
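The single Bellman-Ford evaluation above is small enough to check in code. The sketch below evaluates the equation once for node u and destination z using the numbers from the example; the function and variable names are illustrative.

```python
def bellman_ford_step(link_cost, neighbor_dist):
    """Return (least cost, next hop): the min over neighbors v of c(x,v) + d_v(y)."""
    return min((link_cost[v] + neighbor_dist[v], v) for v in link_cost)


c_u = {"v": 2, "x": 1, "w": 5}      # c(u,v), c(u,x), c(u,w)
d_to_z = {"v": 5, "x": 3, "w": 3}   # d_v(z), d_x(z), d_w(z)
cost, next_hop = bellman_ford_step(c_u, d_to_z)
```

Returning the minimizing neighbor alongside the cost gives the forwarding table entry directly: u forwards traffic for z to x.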
Distance vector algorithm
 Dx(y) = estimate of least cost from x to y
 x maintains distance vector Dx = [Dx(y): y ∊ N ]
 node x:
 knows cost to each neighbor v: c(x,v)
 maintains its neighbors’ distance vectors. For each neighbor v,
x maintains Dv = [Dv(y): y ∊ N ]
Network Layer 4-102
Distance vector algorithm
key idea:
 from time-to-time, each node sends its own distance vector
estimate to neighbors
 when x receives new DV estimate from neighbor, it updates its
own DV using B-F equation:
Dx(y) ← minv{c(x,v) + Dv(y)} for each node y ∊ N
 under minor, natural conditions, the estimate Dx(y) converges
to the actual least cost dx(y)
Network Layer 4-103
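The key idea can be simulated in a few lines. The sketch below makes a simplifying assumption that the real algorithm does not: all nodes update in lock-step rounds rather than asynchronously. The function name and the dictionary layout are illustrative; the three-node costs c(x,y)=2, c(y,z)=1, c(x,z)=7 are taken from the worked example that follows in the slides.

```python
INF = float("inf")


def distance_vector(costs):
    """Synchronous DV iteration; costs maps node -> {neighbor: link cost}.

    Returns (tables, rounds) where tables[x][y] is x's estimate D_x(y).
    """
    nodes = list(costs)
    # initially each node knows only the cost to its direct neighbors
    dv = {x: {y: (0 if x == y else costs[x].get(y, INF)) for y in nodes}
          for x in nodes}
    rounds = 0
    while True:
        # every node applies the B-F update using its neighbors' last vectors
        new = {x: {y: (0 if x == y else
                       min(costs[x][v] + dv[v][y] for v in costs[x]))
                   for y in nodes}
               for x in nodes}
        if new == dv:                 # no vector changed: converged
            return dv, rounds
        dv, rounds = new, rounds + 1


costs = {"x": {"y": 2, "z": 7}, "y": {"x": 2, "z": 1}, "z": {"x": 7, "y": 1}}
tables, rounds = distance_vector(costs)
```

After convergence x reaches z at cost 3 through y, instead of using the direct link of cost 7.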
Distance vector algorithm
iterative, asynchronous: each local iteration caused by:
 local link cost change
 DV update message from neighbor
distributed:
 each node notifies neighbors only when its DV changes
 neighbors then notify their neighbors if necessary
each node:
  wait for (change in local link cost or msg from neighbor)
  recompute estimates
  if DV to any dest has changed, notify neighbors

Network Layer 4-104


Dx(y) = min{c(x,y) + Dy(y), c(x,z) + Dz(y)} = min{2+0 , 7+1} = 2
Dx(z) = min{c(x,y) + Dy(z), c(x,z) + Dz(z)} = min{2+1 , 7+0} = 3
[figure: three nodes with c(x,y) = 2, c(y,z) = 1, c(x,z) = 7]

Each row is the distance vector (cost to x y z) known from that node;
the three column groups advance in time from left to right.

node x table   cost to x y z   cost to x y z   cost to x y z
from x         0 2 7           0 2 3           0 2 3
from y         ∞ ∞ ∞           2 0 1           2 0 1
from z         ∞ ∞ ∞           7 1 0           3 1 0

node y table   cost to x y z   cost to x y z   cost to x y z
from x         ∞ ∞ ∞           0 2 7           0 2 3
from y         2 0 1           2 0 1           2 0 1
from z         ∞ ∞ ∞           7 1 0           3 1 0

node z table   cost to x y z   cost to x y z   cost to x y z
from x         ∞ ∞ ∞           0 2 7           0 2 3
from y         ∞ ∞ ∞           2 0 1           2 0 1
from z         7 1 0           3 1 0           3 1 0
Network Layer 4-106
Distance vector: link cost changes
link cost changes:
 node detects local link cost change
 updates routing info, recalculates distance vector
 if DV changes, notify neighbors
[figure: cost of link (x,y) changes from 4 to 1; c(y,z) = 1, c(x,z) = 50]

“good news travels fast”
t0 : y detects link-cost change, updates its DV, informs its neighbors.
t1 : z receives update from y, updates its table, computes new
least cost to x , sends its neighbors its DV.
t2 : y receives z’s update, updates its distance table. y’s least costs
do not change, so y does not send a message to z.

Network Layer 4-107


Distance vector: link cost changes
link cost changes:
 node detects local link cost change
 bad news travels slow - “count to infinity” problem!
 44 iterations before algorithm stabilizes: see text
[figure: cost of link (x,y) changes from 4 to 60; c(y,z) = 1, c(x,z) = 50]

poisoned reverse:
 If Z routes through Y to get to X :
 Z tells Y its (Z’s) distance to X is infinite (so Y
won’t route to X via Z)
 will this completely solve count to infinity
problem?
Network Layer 4-108
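The slow convergence, and the effect of poisoned reverse, can be reproduced with a toy simulation of y and z exchanging their estimates of the cost to x after c(x,y) rises from 4 to 60. This is a sketch under assumptions: the two nodes alternate in fixed rounds, and a "round" here (y recomputes, then z recomputes) is a counting convention chosen for the example, so the count does not correspond one-to-one with the 44 iterations quoted above.

```python
INF = float("inf")


def rounds_to_converge(poisoned_reverse):
    """Return (D_y(x), D_z(x), rounds) once the estimates stop changing."""
    c_yx, c_yz, c_zx = 60, 1, 50       # link costs after the change
    d_y, d_z = 4, 5                    # estimates before the change (z routes via y)
    # last value y heard from z; with poisoned reverse z was advertising infinity
    heard_from_z = INF if poisoned_reverse else d_z
    rounds = 0
    while True:
        rounds += 1
        new_d_y = min(c_yx, c_yz + heard_from_z)
        y_via_z = c_yz + heard_from_z < c_yx
        adv_y = INF if (poisoned_reverse and y_via_z) else new_d_y
        new_d_z = min(c_zx, c_yz + adv_y)
        z_via_y = c_yz + adv_y < c_zx
        adv_z = INF if (poisoned_reverse and z_via_y) else new_d_z
        if (new_d_y, new_d_z, adv_z) == (d_y, d_z, heard_from_z):
            return d_y, d_z, rounds
        d_y, d_z, heard_from_z = new_d_y, new_d_z, adv_z


plain = rounds_to_converge(poisoned_reverse=False)
poisoned = rounds_to_converge(poisoned_reverse=True)
```

Both runs end with y at cost 51 (via z) and z at cost 50 (direct), but the plain run creeps upward two units per round while the poisoned run settles almost immediately. With only two nodes in the loop, poisoning is enough; loops through three or more nodes are not caught by it.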
Comparison of LS and DV algorithms
message complexity
 LS: with n nodes, E links, O(nE) msgs sent
 DV: exchange between neighbors only
 convergence time varies
speed of convergence
 LS: O(n^2) algorithm requires O(nE) msgs
 may have oscillations
 DV: convergence time varies
 may be routing loops
 count-to-infinity problem
robustness: what happens if router malfunctions?
LS:
 node can advertise incorrect link cost
 each node computes only its own table
DV:
 DV node can advertise incorrect path cost
 each node’s table used by others
• errors propagate thru network

Network Layer 4-109


Hierarchical routing
our routing study thus far - idealization
 all routers identical
 network “flat”
… not true in practice

scale: with 600 million destinations:
 can’t store all dest’s in routing tables!
 routing table exchange would swamp links!
administrative autonomy
 internet = network of networks
 each network admin may want to control routing in its own network

Network Layer 4-111


Hierarchical routing
 aggregate routers into regions, “autonomous systems” (AS)
 routers in same AS run same routing protocol
 “intra-AS” routing protocol
 routers in different AS can run different intra-AS routing protocol
gateway router:
 at “edge” of its own AS
 has link to router in another AS

Network Layer 4-112


Interconnected ASes
[figure: AS1 (routers 1a, 1b, 1c, 1d), AS2 (2a, 2b, 2c), AS3 (3a, 3b, 3c); inside a router, the intra-AS and inter-AS routing algorithms both feed the forwarding table]
 forwarding table configured by both intra- and inter-AS
routing algorithm
 intra-AS sets entries for internal dests
 inter-AS & intra-AS sets entries for external dests
Network Layer 4-113
Inter-AS tasks
 suppose router in AS1 receives datagram destined outside of AS1:
 router should forward packet to gateway router, but which one?
AS1 must:
1. learn which dests are reachable through AS2, which through AS3
2. propagate this reachability info to all routers in AS1
job of inter-AS routing!
[figure: AS1 (1a, 1b, 1c, 1d), AS2 (2a, 2b, 2c), AS3 (3a, 3b, 3c), with links to other networks]
Network Layer 4-114
Example: setting forwarding table in
router 1d
 suppose AS1 learns (via inter-AS protocol) that
subnet x reachable via AS3 (gateway 1c), but not
via AS2
 inter-AS protocol propagates reachability info to
all internal routers
 router 1d determines from intra-AS routing info that
its interface I is on the least cost path to 1c
 installs forwarding table entry (x,I)
[figure: AS1 (1a, 1b, 1c, 1d), AS2 (2a, 2b, 2c), AS3 (3a, 3b, 3c), with links to other networks; subnet x lies beyond AS3]
Network Layer 4-115
Example: choosing among multiple ASes
 now suppose AS1 learns from inter-AS protocol that
subnet x is reachable from AS3 and from AS2.
 to configure forwarding table, router 1d must
determine towards which gateway it should forward
packets for dest x
 this is also job of inter-AS routing protocol!
 hot potato routing: send packet towards closest of
two routers.

1. learn from inter-AS protocol that subnet x is reachable via
multiple gateways
2. use routing info from intra-AS protocol to determine costs of
least-cost paths to each of the gateways
3. hot potato routing: choose the gateway that has the smallest
least cost
4. determine from forwarding table the interface I that leads to
least-cost gateway. Enter (x,I) in forwarding table

Network Layer 4-117
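The hot potato steps reduce to a minimum over gateways followed by a table lookup. In the sketch below the intra-AS costs and interface names are hypothetical values as seen from router 1d; only the selection rule comes from the slides.

```python
def hot_potato(gateways, intra_as_cost, interface_toward):
    """Pick the gateway with the smallest intra-AS least cost.

    Returns (gateway, interface leading toward it).
    """
    best = min(gateways, key=lambda g: intra_as_cost[g])
    return best, interface_toward[best]


# hypothetical values as seen from router 1d
gateways = ["1b", "1c"]                        # both can reach subnet x
intra_as_cost = {"1b": 1, "1c": 2}             # least-cost path costs from intra-AS routing
interface_toward = {"1b": "I1", "1c": "I2"}    # interfaces on those least-cost paths
gateway, interface = hot_potato(gateways, intra_as_cost, interface_toward)
forwarding_table = {"x": interface}            # enter (x, I) in forwarding table
```

The rule only looks at cost inside the local AS; it ignores how far the packet still has to travel after it leaves.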


Intra-AS Routing
 also known as interior gateway
protocols (IGP)
 most common intra-AS routing
protocols:
 RIP: Routing Information Protocol
 OSPF: Open Shortest Path First
 IGRP: Interior Gateway Routing
Protocol (Cisco proprietary)

Network Layer 4-119


RIP (Routing Information Protocol)
 included in BSD-UNIX distribution in 1982
 distance vector algorithm
 distance metric: # hops (max = 15 hops), each link has cost 1
 DVs exchanged with neighbors every 30 sec in response message (aka
advertisement)
 each advertisement: list of up to 25 destination subnets (in IP addressing sense)

from router A to destination subnets:
subnet   hops
u        1
v        2
w        2
x        3
y        3
z        2
[figure: routers A, B, C, D and subnets u, v, w, x, y, z]
Network Layer 4-120
RIP: example

[figure: routers A, D, B, C connecting subnets w, x, y, z]
routing table in router D
destination subnet next router # hops to dest
w A 2
y B 2
z B 7
x -- 1
…. …. ....
Network Layer 4-121
RIP: example
A-to-D advertisement
dest   next   hops
w      -      1
x      -      1
z      C      4
….     …      ...
[figure: routers A, D, B, C connecting subnets w, x, y, z]

routing table in router D (updated by the advertisement)
destination subnet   next router   # hops to dest
w                    A             2
y                    B             2
z                    B → A         7 → 5
x                    --            1
….                   ….            ....
Network Layer 4-122
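The table change in this example (z moves from next router B at 7 hops to next router A at 5 hops) follows from a simple merge rule. The sketch below assumes the usual distance-vector rule: add one hop to each advertised distance, adopt the route if it is shorter, and always refresh a route that already goes through the advertising neighbor. The function name and table layout are illustrative.

```python
RIP_INFINITY = 16   # 16 hops means unreachable


def rip_update(table, neighbor, advertisement):
    """Merge a neighbor's advertisement into a routing table.

    table maps subnet -> (next router, hops); advertisement maps subnet -> hops.
    """
    for subnet, hops in advertisement.items():
        new_hops = min(hops + 1, RIP_INFINITY)
        next_router, old_hops = table.get(subnet, (None, RIP_INFINITY))
        # adopt a shorter route, or refresh one that already goes via this neighbor
        if new_hops < old_hops or next_router == neighbor:
            table[subnet] = (neighbor, new_hops)
    return table


# router D's table before the advertisement (None marks a directly attached subnet)
table_d = {"w": ("A", 2), "y": ("B", 2), "z": ("B", 7), "x": (None, 1)}
rip_update(table_d, "A", {"w": 1, "x": 1, "z": 4})
```

A advertises z at 4 hops, so D can reach z in 5 hops through A, which beats the existing 7-hop route through B. The entries for w, x, and y are unchanged.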
RIP: link failure, recovery
if no advertisement heard after 180 sec -->
neighbor/link declared dead
 routes via neighbor invalidated
 new advertisements sent to neighbors
 neighbors in turn send out new advertisements
(if tables changed)
 link failure info quickly (?) propagates to entire
net
 poison reverse used to prevent ping-pong
loops (infinite distance = 16 hops)

Network Layer 4-123


RIP table processing
 RIP routing tables managed by
application-level process called route-d
(daemon)
 advertisements sent in UDP packets,
periodically
[figure: two routers, each with a routed process above transport (UDP), network (IP) with its forwarding table, link, and physical layers]

Network Layer 4-124


OSPF (Open Shortest Path
First)
 “open”: publicly available
 uses link state algorithm
 LS packet dissemination
 topology map at each node
 route computation using Dijkstra’s algorithm
 OSPF advertisement carries one entry per
neighbor
 advertisements flooded to entire AS
 carried in OSPF messages directly over IP (rather
than TCP or UDP)
 IS-IS routing protocol: nearly identical to OSPF
Network Layer 4-125
OSPF “advanced” features (not
in RIP)
 security: all OSPF messages authenticated
(to prevent malicious intrusion)
 multiple same-cost paths allowed (only one
path in RIP)
 for each link, multiple cost metrics for
different ToS (e.g., satellite link cost set “low”
for best effort ToS; high for real time ToS)
 integrated uni- and multicast support:
 Multicast OSPF (MOSPF) uses same
topology data base as OSPF
 hierarchical OSPF in large domains.
Network Layer 4-126
Hierarchical OSPF
[figure: backbone containing a boundary router and backbone routers; area border routers connect the backbone to area 1, area 2, and area 3, each containing internal routers]

Network Layer 4-127


Hierarchical OSPF
 two-level hierarchy: local area, backbone.
 link-state advertisements only in area
 each node has detailed area topology; only knows
direction (shortest path) to nets in other areas.
 area border routers: “summarize” distances to nets
in own area, advertise to other area border routers.
 backbone routers: run OSPF routing limited to
backbone.
 boundary routers: connect to other AS’s.

Network Layer 4-128


Internet inter-AS routing: BGP
 BGP (Border Gateway Protocol): the de
facto inter-domain routing protocol
 “glue that holds the Internet together”
 BGP provides each AS a means to:
 eBGP: obtain subnet reachability information
from neighboring ASs.
 iBGP: propagate reachability information to all
AS-internal routers.
 determine “good” routes to other networks
based on reachability information and policy.
 allows subnet to advertise its existence to
rest of Internet: “I am here”
Network Layer 4-129
BGP basics
 BGP session: two BGP routers (“peers”) exchange
BGP messages:
 advertising paths to different destination network prefixes
(“path vector” protocol)
 exchanged over semi-permanent TCP connections
 when AS3 advertises a prefix to AS1:
 AS3 promises it will forward datagrams towards that prefix
 AS3 can aggregate prefixes in its advertisement

[figure: AS1 (1a, 1b, 1c, 1d), AS2 (2a, 2b, 2c), AS3 (3a, 3b, 3c), with links to other networks; a BGP message is shown at 3a]
Network Layer 4-130
BGP basics: distributing path information
 using eBGP session between 3a and 1c, AS3 sends prefix
reachability info to AS1.
 1c can then use iBGP to distribute new prefix info to all routers in
AS1
 1b can then re-advertise new reachability info to AS2 over 1b-to-2a
eBGP session
 when router learns of new prefix, it creates entry for prefix
in its forwarding table.
[figure: AS1 (1a, 1b, 1c, 1d), AS2 (2a, 2b, 2c), AS3 (3a, 3b, 3c), with links to other networks; legend: eBGP session, iBGP session]
Network Layer 4-131
Path attributes and BGP
routes
 advertised prefix includes BGP attributes
 prefix + attributes = “route”
 two important attributes:
 AS-PATH: contains ASs through which prefix
advertisement has passed: e.g., AS 67, AS 17
 NEXT-HOP: indicates specific internal-AS router to
next-hop AS. (may be multiple links from current AS
to next-hop-AS)
 gateway router receiving route advertisement
uses import policy to accept/decline
 e.g., never route through AS x
 policy-based routing

Network Layer 4-132


BGP route selection
 router may learn about more than 1
route to destination AS, selects route
based on:
1. local preference value attribute: policy
decision
2. shortest AS-PATH
3. closest NEXT-HOP router: hot potato
routing
4. additional criteria

Network Layer 4-133
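The first three elimination rules can be expressed as a single sort key. This is a sketch, not a BGP implementation: the `Route` class, the default local preference of 100, and the intra-AS costs are assumptions for illustration; the AS paths and next-hop addresses echo the examples used later in this chapter.

```python
from dataclasses import dataclass


@dataclass
class Route:
    prefix: str
    as_path: tuple
    next_hop: str
    local_pref: int = 100   # assumed default; set by local policy


def best_route(routes, igp_cost):
    """Apply the rules in order: highest local preference,
    then shortest AS-PATH, then closest NEXT-HOP (hot potato)."""
    return min(routes, key=lambda r: (-r.local_pref,
                                      len(r.as_path),
                                      igp_cost[r.next_hop]))


routes = [
    Route("138.16.64/22", ("AS2", "AS17"), "111.99.86.55"),
    Route("138.16.64/22", ("AS3", "AS131", "AS201"), "201.44.13.125"),
]
igp_cost = {"111.99.86.55": 3, "201.44.13.125": 1}   # hypothetical intra-AS costs
chosen = best_route(routes, igp_cost)
```

With equal local preference the shorter AS-PATH wins even though its NEXT-HOP is farther away; the NEXT-HOP distance only breaks ties. Raising `local_pref` on the second route would override both later rules.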


BGP messages
 BGP messages exchanged between peers over
TCP connection
 BGP messages:
 OPEN: opens TCP connection to peer and
authenticates sender
 UPDATE: advertises new path (or withdraws
old)
 KEEPALIVE: keeps connection alive in absence
of UPDATES; also ACKs OPEN request
 NOTIFICATION: reports errors in previous msg;
also used to close connection

Network Layer 4-134


Putting it All Together:
How Does an Entry Get Into a Router’s Forwarding Table?
 Answer is complicated!
 Ties together hierarchical routing (Section 4.5.3) with BGP (4.6.3)
and OSPF (4.6.2).
 Provides nice overview of BGP!

How does entry get in forwarding table?
Assume prefix is in another AS.

local forwarding table (entries come from the routing algorithms)
prefix          output port
138.16.64/22    3
124.12/16       2
212/8           4
…………..          …

[figure: arriving packet’s dest IP is looked up in the table; router output ports 1, 2, 3]
How does entry get in forwarding
table?

High-level overview
1. Router becomes aware of prefix
2. Router determines output port for prefix
3. Router enters prefix-port in forwarding
table
Router becomes aware of prefix
[figure: AS1 (1a, 1b, 1c, 1d), AS2 (2a, 2b, 2c), AS3 (3a, 3b, 3c), with links to other networks; a BGP message is shown at 3a]
 BGP message contains “routes”
 “route” is a prefix and attributes: AS-PATH, NEXT-HOP, …
 Example: route:
 Prefix: 138.16.64/22 ; AS-PATH: AS3 AS131 ;
NEXT-HOP: 201.44.13.125
Router may receive multiple routes
[figure: AS1 (1a, 1b, 1c, 1d), AS2 (2a, 2b, 2c), AS3 (3a, 3b, 3c), with links to other networks; a BGP message is shown at 3a]
 Router may receive multiple routes for same prefix
 Has to select one route
Select best BGP route to prefix
 Router selects route based on shortest AS-PATH
 Example: select
 AS2 AS17 to 138.16.64/22
 AS3 AS131 AS201 to 138.16.64/22
 What if there is a tie? We’ll come back to that!
Find best intra-route to BGP route
 Use selected route’s NEXT-HOP attribute
 Route’s NEXT-HOP attribute is the IP address of
the router interface that begins the AS PATH.
 Example:
 AS-PATH: AS2 AS17 ; NEXT-HOP: 111.99.86.55
 Router uses OSPF to find shortest path from 1c to 111.99.86.55
[figure: AS1/AS2/AS3 topology with interface 111.99.86.55 marked]
Router identifies port for route
 Identifies port along the OSPF shortest path
 Adds prefix-port entry to its forwarding table:
 (138.16.64/22 , port 4)
[figure: AS1/AS2/AS3 topology with router ports 1 to 4 labeled]
Hot Potato Routing
 Suppose there are two or more best inter-routes.
 Then choose route with closest NEXT-HOP
 Use OSPF to determine which gateway is closest
 Q: From 1c, choose AS3 AS131 or AS2 AS17?
 A: route AS3 AS131 since it is closer
[figure: AS1 (1a, 1b, 1c, 1d), AS2 (2a, 2b, 2c), AS3 (3a, 3b, 3c), with links to other networks]
How does entry get in forwarding
table?
Summary
1. Router becomes aware of prefix
 via BGP route advertisements from other routers
2. Determine router output port for prefix
 Use BGP route selection to find best inter-AS route
 Use OSPF to find best intra-AS route leading to
best inter-AS route
 Router identifies router port for that best route
3. Enter prefix-port entry in forwarding table
BGP routing policy
[figure: provider networks A, B, C and customer networks W, X, Y]
 A,B,C are provider networks
 X,W,Y are customers (of provider networks)
 X is dual-homed: attached to two networks
 X does not want to route from B via X to C
 .. so X will not advertise to B a route to C

Network Layer 4-145


BGP routing policy (2)
[figure: provider networks A, B, C and customer networks W, X, Y]
 A advertises path AW to B
 B advertises path BAW to X
 Should B advertise path BAW to C?
 No way! B gets no “revenue” for routing CBAW since
neither W nor C are B’s customers
 B wants to force C to route to W via A
 B wants to route only to/from its customers!

Network Layer 4-146


Why different Intra-, Inter-AS routing?
policy:
 inter-AS: admin wants control over how its traffic
routed, who routes through its net.
 intra-AS: single admin, so no policy decisions
needed
scale:
 hierarchical routing saves table size, reduced
update traffic
performance:
 intra-AS: can focus on performance
 inter-AS: policy may dominate over performance

Network Layer 4-147


Broadcast routing
 deliver packets from source to all other nodes
 source duplication is inefficient:
[figure: routers R1 to R4 shown twice, once for source duplication and once for in-network duplication, with the points of duplicate creation/transmission marked]
 source duplication: how does source
determine recipient addresses?
Network Layer 4-149
In-network duplication
 flooding: when node receives broadcast
packet, sends copy to all neighbors
 problems: cycles & broadcast storm
 controlled flooding: node only broadcasts pkt
if it hasn’t broadcast same packet before
 node keeps track of packet ids already broadcast
 or reverse path forwarding (RPF): only forward
packet if it arrived on shortest path between node
and source
 spanning tree:
 no redundant packets received by any node

Network Layer 4-150
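Controlled flooding with packet ids can be sketched as follows. The four-node topology is hypothetical and contains a cycle, and the simulation handles a single broadcast packet, so the set `seen` stands in for "packet ids already broadcast" at each node. The function name is illustrative.

```python
def controlled_flood(adjacency, source):
    """Flood one packet; each node rebroadcasts it at most once.

    Returns (number of link transmissions, set of nodes reached).
    """
    seen = {source}               # nodes that have already broadcast this packet id
    pending = [(source, None)]    # (node, neighbor the packet arrived from)
    transmissions = 0
    while pending:
        node, came_from = pending.pop(0)
        for neighbor in adjacency[node]:
            if neighbor == came_from:
                continue          # do not send back over the arrival link
            transmissions += 1
            if neighbor not in seen:
                seen.add(neighbor)
                pending.append((neighbor, node))
    return transmissions, seen


# hypothetical four-node topology with cycles
adjacency = {"R1": ["R2", "R3"], "R2": ["R1", "R3", "R4"],
             "R3": ["R1", "R2", "R4"], "R4": ["R2", "R3"]}
transmissions, reached = controlled_flood(adjacency, "R1")
```

The flood terminates despite the cycles and reaches every node, but it still uses more transmissions than the three a spanning tree over four nodes would need, which is the redundancy the spanning-tree approach removes.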


Spanning tree
 first construct a spanning tree
 nodes then forward/make copies only
along spanning tree
[figure: nodes A to G with a spanning tree, shown twice: (a) broadcast initiated at A, (b) broadcast initiated at D]

Network Layer 4-151


Spanning tree: creation
 center node
 each node sends unicast join message to
center node
 message forwarded until it arrives at a node
already belonging to spanning tree
[figure: nodes A to G: (a) stepwise construction of spanning tree (center: E), with join steps numbered 1 to 5; (b) constructed spanning tree]
Network Layer 4-152
Multicast routing: problem statement
goal: find a tree (or trees) connecting routers
having local mcast group members
 tree: not all paths between routers used
 shared-tree: same tree used by all group members
 source-based: different tree from each sender to rcvrs
[figure: two panels, “shared tree” and “source-based trees”; legend: group member, not group member, router with a group member, router without group member]
Network Layer 4-153
Approaches for building mcast
trees
approaches:
 source-based tree: one tree per source
 shortest path trees
 reverse path forwarding
 group-shared tree: group uses one tree
 minimal spanning (Steiner)
 center-based trees

…we first look at basic approaches, then specific protocols


adopting these approaches

Network Layer 4-154


Shortest path tree
 mcast forwarding tree: tree of shortest
path routes from source to all receivers
 Dijkstra’s algorithm
[figure: source s and routers R1 to R7; legend: router with attached group member, router with no attached group member, link used for forwarding (i indicates order link added by algorithm); links numbered 1 to 6]

Network Layer 4-155


Reverse path forwarding
 rely on router’s knowledge of unicast
shortest path from it to sender
 each router has simple forwarding
behavior:
if (mcast datagram received on incoming link
    on shortest path back to source)
  then flood datagram onto all outgoing links
  else ignore datagram

Network Layer 4-156
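The RPF forwarding rule is a single comparison per router. In the sketch below the router names, neighbor lists, and unicast next hops are hypothetical; only the accept-or-ignore test comes from the slide.

```python
def rpf_forward(router, arrival_link, source, next_hop_to, links):
    """Return the links to flood onto, or [] if the datagram is ignored.

    next_hop_to[router][source] is the neighbor on router's unicast
    shortest path back to the source.
    """
    if arrival_link != next_hop_to[router][source]:
        return []                                    # not on reverse shortest path
    return [n for n in links[router] if n != arrival_link]


# hypothetical fragment: R2 has neighbors R1, R3, R5 and reaches source S via R1
links = {"R2": ["R1", "R3", "R5"]}
next_hop_to = {"R2": {"S": "R1"}}
accepted = rpf_forward("R2", "R1", "S", next_hop_to, links)
ignored = rpf_forward("R2", "R3", "S", next_hop_to, links)
```

The router needs no per-packet state, only its existing unicast routing table. A copy arriving from R3 is dropped because R3 is not on R2's shortest path back to S.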


Reverse path forwarding: example
[figure: source s and routers R1 to R7; legend: router with attached group member, router with no attached group member, datagram will be forwarded, datagram will not be forwarded]
 result is a source-specific reverse SPT
 may be a bad choice with asymmetric links
Network Layer 4-157
Reverse path forwarding: pruning
 forwarding tree contains subtrees with no mcast
group members
 no need to forward datagrams down subtree
 “prune” msgs sent upstream by router with
no downstream group members
[figure: source s and routers R1 to R7; legend: router with attached group member, router with no attached group member, P = prune message, links with multicast forwarding]

Network Layer 4-158


Shared-tree: Steiner tree
 Steiner tree: minimum cost tree
connecting all routers with attached
group members
 problem is NP-complete
 excellent heuristics exist
 not used in practice:
 computational complexity
 information about entire network needed
 monolithic: rerun whenever a router needs
to join/leave
Network Layer 4-159
Center-based trees
 single delivery tree shared by all
 one router identified as “center” of tree
 to join:
 edge router sends unicast join-msg
addressed to center router
 join-msg “processed” by intermediate
routers and forwarded towards center
 join-msg either hits existing tree branch for
this center, or arrives at center
 path taken by join-msg becomes new branch of tree for this router

Center-based trees: example
suppose R6 chosen as center:
[Figure: routers R1–R7 with R6 as center. Legend: router with attached group member; router with no attached group member; numbered paths (1, 2, 3) show the order in which join messages are generated]
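The join procedure can be sketched as follows, assuming each router already knows its unicast next hop toward the center. The `next_hop_to_center` table below is invented for illustration and does not reproduce the figure's topology.

```python
def join(tree_nodes, tree_links, router, next_hop_to_center):
    """Forward a join from `router` toward the center until it hits the tree.

    The path taken by the join becomes a new branch of the shared tree.
    Both sets are updated in place.
    """
    node = router
    while node not in tree_nodes:
        nxt = next_hop_to_center[node]
        tree_links.add((nxt, node))  # data will later flow nxt -> node
        tree_nodes.add(node)
        node = nxt

# Hypothetical unicast next hops toward center R6.
next_hop_to_center = {"R1": "R2", "R2": "R5", "R3": "R6", "R4": "R5", "R5": "R6", "R7": "R6"}

tree_nodes = {"R6"}  # initially the tree is just the center
tree_links = set()
join(tree_nodes, tree_links, "R2", next_hop_to_center)  # travels R2 -> R5 -> R6
join(tree_nodes, tree_links, "R1", next_hop_to_center)  # stops at R2, already on the tree
```

The second join never reaches the center: it stops as soon as it hits the existing branch at R2.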



Internet Multicasting Routing: DVMRP
 DVMRP: distance vector multicast routing protocol, RFC 1075
 flood and prune: reverse path forwarding, source-based tree
 RPF tree based on DVMRP’s own routing tables constructed by communicating DVMRP routers
 no assumptions about underlying unicast
 initial datagram to mcast group flooded everywhere via RPF
 routers not wanting group: send upstream prune msgs

DVMRP: continued…
 soft state: DVMRP router periodically (1 min.) “forgets” that branches are pruned:
 mcast data again flows down unpruned branch
 downstream router: reprune or else continue to receive data
 routers can quickly regraft to tree
 following IGMP join at leaf
 odds and ends
 commonly implemented in commercial routers
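Soft state can be illustrated with a prune table whose entries expire on their own. This is a sketch of the idea, not DVMRP's actual data structures: the class name, the `(source, group, link)` key, and the 60-second lifetime (taken from the "1 min." on the slide) are assumptions. Time is passed in explicitly so the behavior is easy to follow.

```python
class PruneState:
    """Soft-state prune table for one router: entries expire unless refreshed."""

    def __init__(self, lifetime=60.0):
        self.lifetime = lifetime  # seconds; assumed from the slide's "1 min."
        self.expires = {}         # (source, group, link) -> expiry time

    def prune(self, source, group, link, now):
        # Downstream router sent a prune: stop forwarding for a while.
        self.expires[(source, group, link)] = now + self.lifetime

    def graft(self, source, group, link):
        # Downstream router wants data again: drop the prune immediately.
        self.expires.pop((source, group, link), None)

    def should_forward(self, source, group, link, now):
        expiry = self.expires.get((source, group, link))
        if expiry is not None and now < expiry:
            return False  # branch is still pruned
        self.expires.pop((source, group, link), None)  # prune has been "forgotten"
        return True

state = PruneState()
state.prune("S", "G", "link1", now=0.0)
during = state.should_forward("S", "G", "link1", now=30.0)
after = state.should_forward("S", "G", "link1", now=61.0)
state.prune("S", "G", "link1", now=100.0)
state.graft("S", "G", "link1")
regrafted = state.should_forward("S", "G", "link1", now=101.0)
```

Once the entry expires, data flows down the branch again, and the downstream router must reprune if it still has no members.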



Tunneling
Q: how to connect “islands” of multicast
routers in a “sea” of unicast routers?

[Figure: physical topology vs. logical topology]

 mcast datagram encapsulated inside “normal” (non-multicast-addressed) datagram
 normal IP datagram sent thru “tunnel” via regular IP unicast to receiving mcast router (recall IPv6 inside IPv4 tunneling)
 receiving mcast router unencapsulates to get mcast datagram

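Encapsulation can be sketched with a toy datagram type. The `Datagram` class and the addresses below are purely illustrative (documentation and test address ranges), not a real packet format.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Datagram:
    src: str
    dst: str
    payload: object

def encapsulate(mcast, tunnel_src, tunnel_dst):
    """Wrap a multicast datagram in a normal unicast datagram for the tunnel."""
    return Datagram(src=tunnel_src, dst=tunnel_dst, payload=mcast)

def decapsulate(outer):
    """At the receiving mcast router, recover the original multicast datagram."""
    if not isinstance(outer.payload, Datagram):
        raise ValueError("not a tunneled datagram")
    return outer.payload

# Hypothetical addresses: a multicast group and two tunnel endpoints.
inner = Datagram(src="203.0.113.5", dst="233.252.0.1", payload=b"data")
outer = encapsulate(inner, tunnel_src="192.0.2.1", tunnel_dst="198.51.100.7")
recovered = decapsulate(outer)
```

Unicast routers between the two tunnel endpoints see only the outer unicast destination, so they need no multicast support.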
PIM: Protocol Independent Multicast
 not dependent on any specific underlying
unicast routing algorithm (works with all)
 two different multicast distribution scenarios:

dense:
 group members densely packed, in “close” proximity
 bandwidth more plentiful

sparse:
 # networks with group members small relative to # interconnected networks
 group members “widely dispersed”
 bandwidth not plentiful

Consequences of sparse-dense dichotomy:

dense:
 group membership by routers assumed until routers explicitly prune
 data-driven construction of mcast tree (e.g., RPF)
 bandwidth and non-group-router processing profligate

sparse:
 no membership until routers explicitly join
 receiver-driven construction of mcast tree (e.g., center-based)
 bandwidth and non-group-router processing conservative



PIM - dense mode
flood-and-prune RPF: similar to DVMRP but…
 underlying unicast protocol provides RPF info for incoming datagram
 less complicated (less efficient) downstream flood than DVMRP reduces reliance on underlying routing algorithm
 has protocol mechanism for router to detect it is a leaf-node router

PIM - sparse mode
 center-based approach
 router sends join msg to rendezvous point (RP)
 intermediate routers update state and forward join
 after joining via RP, router can switch to source-specific tree
 increased performance: less concentration, shorter paths
[Figure: routers R1–R7; join messages travel toward the rendezvous point; all data multicast from rendezvous point]



PIM - sparse mode
sender(s):
 unicast data to RP, which distributes down RP-rooted tree
 RP can extend mcast tree upstream to source
 RP can send stop msg if no attached receivers
 “no one is listening!”
[Figure: routers R1–R7; join messages travel toward the rendezvous point; all data multicast from rendezvous point]

