6.S081 2020 Lecture 21: Networking
topics
packet formats and protocols
software stack in kernel
today's paper -- livelock
overall network architecture
diagram: apps, host, NIC, LAN, NIC, host, apps
diagram: hosts, LAN, router, ..., LAN, hosts
ethernet packet format
[printout of kernel/net.h, struct eth]
start "flag"
destination ethernet address -- 48 bits
source ethernet address -- 48 bits
ether type -- 16 bits
payload
end "flag"
[eth tcpdump output]
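the header as software sees it -- roughly struct eth from kernel/net.h
(a sketch; the start/end flags and CRC are generated and checked by the
NIC hardware, so the kernel never sees them):
    #define ETHADDR_LEN 6
    struct eth {
      uint8  dhost[ETHADDR_LEN]; // destination ethernet address
      uint8  shost[ETHADDR_LEN]; // source ethernet address
      uint16 type;               // ether type, e.g. 0x0800 for IP
    } __attribute__((packed));   // match the exact wire layout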
ethernet addresses
a simple ethernet LAN broadcasts: all host NICs see all packets
a host uses dest addr to decide if the packet is really for it
today's ethernet LAN is a switch, with a cable to each host
the switch uses the dest addr to decide which host to send a packet to
ethernet addresses are assigned by NIC manufacturer
<24-bit manufacturer ID, 24-bit serial number>
ARP
[kernel/net.h, struct arp]
a request/response protocol to translate IP address to ethernet address
"nested" inside an ethernet packet, with ether type 0x0806
ARP header indicates request vs response
request: desired IP address
request packets are broadcast to every host on a switch
all the hosts process the packet, but only owner of that IP addr responds
response: the corresponding ethernet address
[arp tcpdump output]
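roughly struct arp from kernel/net.h (nested right after struct eth):
    struct arp {
      uint16 hrd;              // hardware address format (1 = ethernet)
      uint16 pro;              // protocol address format (0x0800 = IP)
      uint8  hln;              // hardware address length (6)
      uint8  pln;              // protocol address length (4)
      uint16 op;               // 1 = request, 2 = reply
      char   sha[ETHADDR_LEN]; // sender's ethernet address
      uint32 sip;              // sender's IP address
      char   tha[ETHADDR_LEN]; // target's ethernet address
      uint32 tip;              // target's IP address
    } __attribute__((packed));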
note:
the habit is to "nest" higher-level packets inside lower-level packets
e.g. ARP inside ethernet
so you often see a sequence of headers (ether, then ARP)
one layer's payload is the next layer's header plus payload
the ethernet header is enough to get a packet to a local host
but more is needed to route the packet to a distant Internet host
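a sketch of how input code follows the nesting: cast the payload to the
next header (ETHTYPE_* as in the lab's net.h; net_rx_arp and net_rx_ip
are hypothetical handler names):
    void
    net_rx(char *buf)
    {
      struct eth *ethhdr = (struct eth *) buf;
      if(ntohs(ethhdr->type) == ETHTYPE_ARP){
        // the ARP header starts where the ethernet payload starts
        net_rx_arp((struct arp *)(ethhdr + 1));
      } else if(ntohs(ethhdr->type) == ETHTYPE_IP){
        net_rx_ip((struct ip *)(ethhdr + 1));
      }
    }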
IP header
[kernel/net.h, struct ip]
ether type 0x0800
lots of stuff, but addresses are the most critical
a 32-bit IP address is enough to route to any Internet computer
the high bits contain a "network number" that helps routers
understand how to forward through the Internet
the protocol number tells the destination what to do with the packet
i.e. which higher-level protocol to hand it to (usually UDP or TCP)
[ip tcpdump output]
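roughly struct ip from kernel/net.h:
    struct ip {
      uint8  ip_vhl;         // version << 4 | header length >> 2
      uint8  ip_tos;         // type of service
      uint16 ip_len;         // total length
      uint16 ip_id;          // identification
      uint16 ip_off;         // fragment offset field
      uint8  ip_ttl;         // time to live
      uint8  ip_p;           // protocol: 17 = UDP, 6 = TCP
      uint16 ip_sum;         // checksum
      uint32 ip_src, ip_dst; // the critical part: 32-bit addresses
    };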
UDP header
[kernel/net.h, struct udp]
once a packet is at the right host, what app should it go to?
UDP header sits inside IP packet
contains src and dst port numbers
an application uses the "socket API" system calls to tell
the kernel it would like to receive all packets sent to a particular port
some ports are "well known", e.g. port 53 is reserved for DNS servers
others are allocated as-needed for the client ends of connections
after the UDP header: payload, e.g. DNS request or response
[udp tcpdump output]
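roughly struct udp from kernel/net.h:
    struct udp {
      uint16 sport; // source port
      uint16 dport; // destination port
      uint16 ulen;  // length, including udp header, not the IP header
      uint16 sum;   // checksum
    };
and a hedged example of the socket API on an ordinary Unix (not xv6):
bind() tells the kernel to deliver packets for a port to this fd:
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>

    int
    dns_socket(void)
    {
      int fd = socket(AF_INET, SOCK_DGRAM, 0);    // a UDP socket
      struct sockaddr_in a = {0};
      a.sin_family = AF_INET;
      a.sin_port = htons(53);                     // the DNS port
      a.sin_addr.s_addr = htonl(INADDR_ANY);
      bind(fd, (struct sockaddr *)&a, sizeof(a)); // claim port 53
      return fd; // recvfrom(fd, ...) now yields UDP payloads for port 53
    }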
TCP
like UDP but has sequence number fields so it can retransmit lost packets,
and match send rate to network and destination capacity
layering view of a typical kernel network stack
apps e.g. web browser, DNS server
socket, port->fd table
UDP | TCP
IP, routing table | ARP, table
NIC drivers
-- packet buffers w/ allocator (see struct mbuf in net.h)
-- each layer parses, validates, and strips headers on the way in
discards packet if there's a problem
-- each layer prepends a header on the way out
-- software layer structure partially follows header nesting
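roughly the lab's struct mbuf (kernel/net.h), which supports this
strip-on-input / prepend-on-output pattern:
    struct mbuf {
      struct mbuf  *next;          // next mbuf on a queue
      char         *head;          // current start of valid data
      unsigned int len;            // number of valid bytes
      char         buf[MBUF_SIZE]; // backing store
    };
    // input:  mbufpull(m, sizeof(hdr)) advances head past a parsed header
    // output: mbufpush(m, sizeof(hdr)) moves head back to prepend one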
control flow view of a typical kernel network stack
multiple independent actors
each has input packets, processes them, produces output
here's a typical setup (much variation; lab setup is simpler)
NICs, with internal or DMA buffering
rx interrupt handler, copies from NIC to s/w input queue
tx interrupt handler, copies from s/w output queue to NIC
"network thread"
reads packets from s/w input queue
decides what to do with each packet: reply to ARP, forward it, or put it on a socket queue
applications -- read from per-socket queue
why all these queues of buffers?
absorb temporary input bursts
keep output NIC busy while computing
allow independent control flow (NICs vs network thread vs apps)
other arrangements are possible and sometimes much better
e.g. user-level stack
e.g. direct user access to NIC (see Intel's DPDK)
e.g. polling, as in today's paper
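a minimal sketch of the interrupt-plus-thread arrangement (hypothetical
names; locking and sleep/wakeup details are folded into the queue helpers):
    void
    nic_rx_intr(void)
    {
      struct mbuf *m;
      while((m = nic_read_packet()) != 0) // drain the NIC
        queue_put(&rx_q, m);              // s/w input queue absorbs bursts
      wakeup(&rx_q);                      // poke the network thread
    }

    void
    net_thread(void)
    {
      for(;;){
        struct mbuf *m = queue_get_or_sleep(&rx_q);
        net_rx(m); // ARP reply, forward, or append to a socket queue
      }
    }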
NIC packet buffering
paper's NIC queues packets in its own memory
driver s/w must copy to RAM
lab assignment's NIC (the Intel e1000) DMAs into host RAM
s/w prepares a "ring" of buffer pointers for rx, and tx
NIC DMAs each packet to memory pointed to by successive ring elements
why: DMA is faster than s/w copy loop
DMA can go on concurrently with compute tasks
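a sketch in the spirit of the lab's e1000 driver (the ring layout and
E1000_RD* register names follow the lab's e1000_dev.h; treat the
details as assumptions):
    void
    e1000_init_rx(void)
    {
      for(int i = 0; i < RX_RING_SIZE; i++){
        rx_mbufs[i] = mbufalloc(0);
        rx_ring[i].addr = (uint64) rx_mbufs[i]->head; // NIC DMAs here
      }
      regs[E1000_RDBAL] = (uint64) rx_ring; // base address of rx ring
      regs[E1000_RDH] = 0;                  // head: next slot NIC fills
      regs[E1000_RDT] = RX_RING_SIZE - 1;   // tail: last free slot
    }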
Today's paper: Eliminating Receive Livelock in an Interrupt-Driven Kernel,
by Mogul and Ramakrishnan
Why are we reading this paper?
To illustrate some tradeoffs in kernel network stack structure
It's a famous and influential paper
Livelock / congestion collapse comes up in many situations
Explain Figure 6-1
This is the original system, without the authors' fixes.
Why does it go up?
What determines how high the peak is?
peak at about 5000 pkts/sec => about 200 us of CPU time per packet.
Why does it go down?
What determines how fast it goes down?
What happens to the packets that aren't forwarded?
A disk uses interrupts -- would a disk cause this kind of problem?
How about the UART?
How about a host receiving TCP traffic?
Why not completely process each packet in the interrupt handler?
I.e. forward it?
(this is what the network lab does)
(still need an output queue; forwarding in the rx handler starves the
tx interrupt and other devices' rx interrupts; and there's no story
for user-level processing.)
Why not always poll, never use interrupts?
Overall goal:
once we've started to spend CPU on a packet, make sure we finish that packet!
i.e. avoid partially processing, then discarding due to overload
for special case of forwarding:
give output priority over input
interrupts allow us no control
What's the paper's solution?
No IP input queue
NIC receive interrupt just wakes up thread
Then leaves interrupts *disabled* for that NIC
Thread does all processing,
re-checks NIC for more input,
only re-enables interrupts if no input waiting
NIC intr:
wake up net thread (but don't read any packets)
disable NIC interrupts
net thread, in a loop:
if NIC packets waiting
read a few packets from NIC
process each packet
(this is the polling part)
else
enable interrupts
sleep
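the same scheme as a C-ish sketch (hypothetical names; per the paper,
the loop also enforces a quota so rx can't monopolize the CPU):
    void
    nic_intr(void)
    {
      nic_disable_intr();  // stay in polling mode while input is pending
      wakeup(&net_chan);   // note: read no packets here
    }

    void
    net_thread(void)
    {
      for(;;){
        if(nic_rx_pending()){
          for(int i = 0; i < QUOTA && nic_rx_pending(); i++)
            process_packet(nic_read_packet()); // all the way to tx queue
        } else {
          nic_enable_intr(); // nothing waiting: back to interrupts
          sleep(&net_chan);  // until the next rx interrupt
        }
      }
    }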
What happens when packets arrive too fast?
Why does this help avoid livelock?
What happens when packets arrive slowly?
Modern Linux uses a scheme -- NAPI -- inspired by this paper.
Explain Figure 6-3
This graph includes their system.
Why do the empty squares level off?
What happens to the excess packets?
Why does "Polling (no quota)" work badly?
Input still starves xmit-complete processing
Why does it immediately fall to zero, rather than gradually decreasing?
Livelock is made worse by doing even more processing before discard
I.e. each excess rx pkt consumes many tx pkts of CPU time
Explain Figure 6-4
(this is with every packet going through a user-level program)
Why does "Polling, no feedback" behave badly?
There's a queue in front of screend
We can still give 100% to input thread, 0% to screend
Why does "Polling w/ feedback" behave well?
Input thread sleeps when queue to screend near-full
Wakes up when queue near-empty
What would happen if screend hung?
Big picture: polling loop is a place to exert scheduling control
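a sketch of the feedback idea using queue watermarks (hypothetical names):
    // input thread, before taking more packets from the NIC:
    if(screend_q.len >= HIGH_WATER)
      sleep(&screend_q);    // stop polling; let screend catch up

    // screend's dequeue side: draining the queue resumes input
    struct mbuf *m = queue_get(&screend_q);
    if(screend_q.len <= LOW_WATER)
      wakeup(&screend_q);   // near-empty: wake the input thread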
Why are the two solutions different?
1. Polling thread *with quotas*
2. Feedback from full queue
Perhaps they could have used #2 for both
Feedback doesn't require magic numbers
But hard to retro-fit into existing UNIX structure
What if processing has more complex structure?
Chain of processing stages with queues?
Does feedback work?
What happens when a late stage is slow?
Split at some point, multiple parallel paths?
Not so great; one slow path blocks all paths
Can we formulate any general principles?
Don't spend time on new work before completing existing work
Design so that efficiency increases with load,
rather than decreasing. E.g. the paper's switch from
interrupts to polling under high load.
Similar phenomena arise in other areas of systems
Timeout + retransmission in networks, as number of connections grows
Spin-locks, as number of cores grows
A general lesson: complex (multi-stage) systems may need careful
scheduling of resources if they are to survive loads close to
capacity