Department of Computer Sciences THE UNIVERSITY OF TEXAS AT AUSTIN
CS 378 (Spring 2003)
Linux Kernel Programming
Yongguang Zhang
([email protected])
Copyright 2003, Yongguang Zhang
This Lecture
• Memory Memory
• Process Memory
• Questions?
• Looking for Volunteer
– Explore BitKeeper for our use
– “Community building” credit
Spring 2003 © 2003 Yongguang Zhang 2
Last Lecture
• Managing Physical Memory
• Managing Pages
• Managing Kernel Dynamic Memory
• Managing Process Address Space
• Paging/Swapping
Summary: Kernel Memory
• Kernel data objects
– Caches and slabs
– Allocation functions: kmem_cache_alloc(), kmalloc()
• Kernel virtual address space
– Page tables
– Allocation functions: alloc_pages(), vmalloc()
• Physical memory: nodes, zones, pages
VM Area: Basics
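[Figure: a task (process or thread) points to its struct mm_struct (fields mmap, pgd). The mmap field heads a list of struct vm_area_struct linked through vm_next. Each vm_area_struct (fields vm_mm, vm_start, vm_end, vm_next) covers the range vm_start to vm_end of the virtual address space, and its vm_mm points back to the mm_struct.]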
VM Area Lookup by Address
• Given a virtual address, need to look up a VM
Area fast
– Used in page fault, memory mapping, VM area
operations (like locking, etc.)
• Data structure: Red-Black tree and Cache
struct mm_struct {
....
rb_root_t mm_rb;
struct vm_area_struct * mmap_cache;
....
};
Find VMA
• More data structure
struct vm_area_struct {
....
rb_node_t vm_rb;
....
};
• Function:
struct vm_area_struct * find_vma(struct mm_struct *
mm, unsigned long addr)
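The lookup can be illustrated with a minimal user-space sketch. It is not the kernel code: the struct and function names (vm_area, mm, find_vma_sketch) are hypothetical, only the fields needed for lookup are kept, and a sorted linked list stands in for the red-black tree. It keeps the contract of find_vma(): return the first area whose vm_end lies above the address, trying the cached area first.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel structures (assumption: only the
 * fields needed for lookup are kept, and a sorted singly linked list
 * replaces the red-black tree). */
struct vm_area {
    unsigned long vm_start;      /* first address inside the area */
    unsigned long vm_end;        /* first address after the area */
    struct vm_area *vm_next;     /* next area, sorted by address */
};

struct mm {
    struct vm_area *mmap;        /* head of the sorted area list */
    struct vm_area *mmap_cache;  /* result of the last successful lookup */
};

/* Return the first area with vm_end > addr, or NULL if there is none.
 * The area returned may start above addr, so the caller still has to
 * compare addr with vm_start. */
struct vm_area *find_vma_sketch(struct mm *mm, unsigned long addr)
{
    struct vm_area *vma = mm->mmap_cache;

    /* Fast path: the cached area contains addr. */
    if (vma && vma->vm_start <= addr && addr < vma->vm_end)
        return vma;

    /* Slow path: linear scan (the kernel walks mm_rb instead). */
    for (vma = mm->mmap; vma; vma = vma->vm_next) {
        if (addr < vma->vm_end) {
            mm->mmap_cache = vma;
            return vma;
        }
    }
    return NULL;
}
```

With areas [0x1000, 0x3000) and [0x5000, 0x8000), a lookup of 0x4000 returns the second area even though the address falls in the hole below it, which is why callers check vm_start afterwards.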
Shared Memory
• Two (or more) processes can share the same
physical memory region
– May be mapped to different virtual addresses
• Through system calls
– shmat(), shmdt()
• Implicitly
– Shared library
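A small user-space sketch of the system-call route follows. To stay in one process, it attaches the same System V segment twice instead of using two processes, which is an assumption made for brevity. The two attachments get different virtual addresses but refer to the same physical memory. The function name shm_two_mappings is hypothetical.

```c
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Attach one System V segment twice and check that a write through the
 * first mapping is visible through the second. Returns 1 if so, 0 if
 * not, and -1 if a system call fails. */
int shm_two_mappings(void)
{
    int id = shmget(IPC_PRIVATE, 4096, IPC_CREAT | 0600);
    char *a, *b;
    int ok;

    if (id < 0)
        return -1;
    a = shmat(id, NULL, 0);
    b = shmat(id, NULL, 0);
    shmctl(id, IPC_RMID, NULL);   /* segment is freed after the last detach */
    if (a == (char *)-1 || b == (char *)-1) {
        if (a != (char *)-1)
            shmdt(a);
        if (b != (char *)-1)
            shmdt(b);
        return -1;
    }
    a[0] = 'x';                   /* write through the first address */
    ok = (a != b) && (b[0] == 'x');
    shmdt(a);
    shmdt(b);
    return ok;
}
```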
VM Area: Shared Memory
[Figure: two struct mm_struct (each with mmap and pgd), each heading its own list of struct vm_area_struct (vm_mm, vm_start, vm_end, vm_next). One VM area in each list maps the same physical memory, and these areas are linked to each other through vm_next_share.]
Backing Store
• Each VM area can be mapped to a file (in
secondary memory)
• Explicit memory mapping through system call
– mmap(), munmap(), mremap()
• Implicit mapping
– Code segment (loaded from an executable binary file)
– Swapping (mapped to the swap file)
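As a user-space illustration of explicit mapping, the sketch below writes a short message to a temporary file, maps the file with mmap(), and reads the bytes back through the mapping. The function name read_through_mapping is hypothetical, and error handling is kept to the minimum.

```c
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/* Write msg to a temporary file, map the file, and copy the first
 * len bytes seen through the mapping into out. len must be greater
 * than zero. Returns 0 on success, -1 on failure. */
int read_through_mapping(const char *msg, char *out, size_t len)
{
    FILE *fp = tmpfile();
    void *map;

    if (!fp)
        return -1;
    if (fwrite(msg, 1, len, fp) != len || fflush(fp) != 0) {
        fclose(fp);
        return -1;
    }
    map = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fileno(fp), 0);
    if (map == MAP_FAILED) {
        fclose(fp);
        return -1;
    }
    memcpy(out, map, len);      /* the file is the backing store here */
    munmap(map, len);
    fclose(fp);
    return 0;
}
```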
VM Area: Backing Store
• Data structure (in include/linux/mm.h)
struct vm_area_struct {
....
unsigned long vm_pgoff; /* offset within vm_file, in pages */
struct file * vm_file; /* mapped file */
....
};
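To show how vm_pgoff is used, here is a minimal sketch that converts a virtual address inside a file-backed area into a byte offset in the file. The struct file_vma and the function file_offset_of are hypothetical names, and 4 KB pages are assumed as on i386.

```c
#define PAGE_SHIFT 12   /* 4 KB pages, as on i386 */

struct file_vma {                 /* hypothetical subset of vm_area_struct */
    unsigned long vm_start;
    unsigned long vm_end;
    unsigned long vm_pgoff;       /* offset into the file, in pages */
};

/* Byte offset in the backing file that corresponds to addr.
 * The caller must ensure vm_start <= addr < vm_end. */
unsigned long file_offset_of(const struct file_vma *vma, unsigned long addr)
{
    return (addr - vma->vm_start) + (vma->vm_pgoff << PAGE_SHIFT);
}
```

For an area starting at 0x40000000 with vm_pgoff 3, address 0x40001010 corresponds to file offset 0x4010: 0x1010 bytes into the area plus three pages.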
Demand Paging
• A page in a VM area may have no page frame in core
– Page frames are not allocated when the VM area is created
– Page frames can be swapped out
• Handled by the page fault handler
Page Fault
• Exception handler
– Raised by address translation (hardware)
– Calls do_page_fault() to handle this exception
• do_page_fault()
– Architecture-specific
– i386: arch/i386/mm/fault.c
– Find a page frame in the physical memory
– Load the missing page in
– Update the page tables
do_page_fault()
…
vma = find_vma(mm, address);
if (!vma)
goto bad_area;
if (vma->vm_start <= address)
goto good_area;
…
good_area:
…
handle_mm_fault(mm, vma, address, write);
…
bad_area:
…
force_sig_info(SIGSEGV, &info, tsk);
…
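The good_area/bad_area decision can be sketched in user space as follows. All names here (area, lookup, classify_fault, fault_result) are hypothetical. The elided parts of the excerpt are not reproduced: the real handler performs further checks there, such as allowing a stack area to grow downward, which this sketch treats as a plain failure.

```c
#include <stddef.h>

struct area {                    /* hypothetical, minimal VM area */
    unsigned long vm_start, vm_end;
    struct area *vm_next;
};

enum fault_result { FAULT_HANDLED, FAULT_SIGSEGV };

/* First area whose end lies above addr (same contract as find_vma()). */
static struct area *lookup(struct area *head, unsigned long addr)
{
    for (; head; head = head->vm_next)
        if (addr < head->vm_end)
            return head;
    return NULL;
}

enum fault_result classify_fault(struct area *head, unsigned long addr)
{
    struct area *vma = lookup(head, addr);

    if (!vma)
        return FAULT_SIGSEGV;        /* bad_area: above every area */
    if (vma->vm_start <= addr)
        return FAULT_HANDLED;        /* good_area: handle_mm_fault() */
    return FAULT_SIGSEGV;            /* bad_area: in a hole below vma */
}
```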
handle_mm_fault()
…
pgd = pgd_offset(mm, address);
pmd = pmd_alloc(mm, pgd, address);
if (pmd) {
pte = pte_alloc(mm, pmd, address);
if (pte)
return handle_pte_fault(...);
}
…
handle_pte_fault() (when the page is not present):
– do_no_page() if the pte entry is all-zero
– do_swap_page() if the pte entry is non-zero
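The walk and the final decision can be modeled with a toy two-level page table. This is a sketch under stated assumptions: the i386 layout without PAE (10 + 10 + 12 address bits, so the middle level is folded away), a present bit in bit 0, and hypothetical names throughout (toy_mm, toy_pte_alloc, toy_classify_pte). It models only the control flow, not real page table entries.

```c
#include <stdint.h>
#include <stdlib.h>

#define PTRS_PER_TABLE 1024          /* i386 without PAE: 10 + 10 + 12 bits */
#define PAGE_SHIFT     12
#define PGDIR_SHIFT    22
#define PTE_PRESENT    0x1u          /* low bit marks a page that is in core */

enum pte_action { DO_NO_PAGE, DO_SWAP_PAGE, PAGE_PRESENT };

struct toy_mm {
    uint32_t *pgd[PTRS_PER_TABLE];   /* each slot points to a pte table */
};

/* Return the pte slot for addr, allocating the pte table on demand
 * (the role pte_alloc() plays in the excerpt). NULL means out of memory. */
uint32_t *toy_pte_alloc(struct toy_mm *mm, uint32_t addr)
{
    uint32_t **slot = &mm->pgd[addr >> PGDIR_SHIFT];

    if (!*slot)
        *slot = calloc(PTRS_PER_TABLE, sizeof(uint32_t));
    if (!*slot)
        return NULL;
    return &(*slot)[(addr >> PAGE_SHIFT) & (PTRS_PER_TABLE - 1)];
}

/* Decide which path a fault on this entry would take. */
enum pte_action toy_classify_pte(uint32_t pte)
{
    if (pte & PTE_PRESENT)
        return PAGE_PRESENT;         /* e.g. a write-protection fault */
    if (pte == 0)
        return DO_NO_PAGE;           /* never mapped: demand-allocate */
    return DO_SWAP_PAGE;             /* non-zero: holds a swap location */
}
```

A freshly allocated pte table is zero-filled, so the first fault on any page in it takes the do_no_page() path.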
Summary: Process Memory
[Figure: process virtual address space, memory areas, page tables, page fault, backing store, kswapd, physical memory]
Linux Process Structure
• Process: “a program in execution”
– Referred to as a “task” in the Linux kernel
• Task context:
– Kernel-mode virtual address space (code, data, and
stack)
– Program counter, CPU's registers
– Other states
Process Descriptor
• All Kernel Information about a Process is in one
Data Structure: struct task_struct
– Defined in include/linux/sched.h
– One per process/thread
– Many many fields
• No separate data structure for process and thread
– Thread is just a task
• Pointer to this task
– Macro: current
Storage for Process Descriptors
• 8K per process to Store Both Process Descriptor
and Stack
– In include/linux/sched.h:
# define INIT_TASK_SIZE 2048*sizeof(long)
…
union task_union {
struct task_struct task;
unsigned long stack[INIT_TASK_SIZE/sizeof(long)];
};
Task Union
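• current = %esp & ~8191UL
[Figure: the 8 KB task union, with the task_struct at the low end and the kernel stack growing down from the high end; %esp points into the stack.]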
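The masking trick can be tried in user space. The sketch below allocates an 8 KB block on an 8 KB boundary and recovers its base from any address inside it by clearing the low 13 bits (mask ~8191). The names toy_task, toy_task_union, toy_alloc_task_union and toy_current are hypothetical stand-ins, and aligned_alloc() (C11) plays the role of the kernel page allocator.

```c
#include <stdint.h>
#include <stdlib.h>

#define TASK_UNION_SIZE 8192UL

struct toy_task { int pid; };            /* stand-in for struct task_struct */

union toy_task_union {
    struct toy_task task;
    unsigned long stack[TASK_UNION_SIZE / sizeof(unsigned long)];
};

/* Allocate one task union on an 8 KB boundary. The alignment is what
 * makes the mask in toy_current() work. */
union toy_task_union *toy_alloc_task_union(void)
{
    return aligned_alloc(TASK_UNION_SIZE, sizeof(union toy_task_union));
}

/* Recover the descriptor from any address inside the union by clearing
 * the low 13 bits of that address. */
struct toy_task *toy_current(const void *stack_pointer)
{
    uintptr_t sp = (uintptr_t)stack_pointer;

    return (struct toy_task *)(sp & ~(uintptr_t)(TASK_UNION_SIZE - 1));
}
```

Because the descriptor sits at the start of the aligned block, no global table is needed to find the running task; the stack pointer alone is enough.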
Allocating Task Union
• To create a new task union, call
alloc_task_struct()
• In include/asm-i386/processor.h:
#define alloc_task_struct() ((struct task_struct *) \
__get_free_pages(GFP_KERNEL,1))
– This gets exactly 8K = 2 physical pages
• To release a task union, call free_task_struct()
Process State
• task_struct field:
– long state;
• 5 States (see include/linux/sched.h):
#define TASK_RUNNING 0
#define TASK_INTERRUPTIBLE 1
#define TASK_UNINTERRUPTIBLE 2
#define TASK_ZOMBIE 4
#define TASK_STOPPED 8
Process IDs
• task_struct fields:
– pid_t pid
– pid_t pgrp
– uid_t uid, euid, …
– gid_t gid, egid, …
Process Lists and Links
• List of All Processes
• PID Hash
• Run List
• Wait Queues
• Links to Related Tasks
Process List
• Doubly Linked List of All Task Descriptors
– task_struct fields:
• struct task_struct *next_task, *prev_task;
• First Process in the list: init_task
– include/linux/sched.h:
• #define INIT_TASK(tsk) …
– arch/i386/kernel/init_task.c:
• union task_union init_task_union =
{ INIT_TASK(init_task_union.task) };
– include/asm-i386/processor.h:
• #define init_task (init_task_union.task)
PID Hash
• Data Structure to Optimize PID-lookup in the
Process List: Hash by PID
– In include/linux/sched.h
#define PIDHASH_SZ (4096 >> 2)
extern struct task_struct *pidhash[PIDHASH_SZ];
#define pid_hashfn(x) ((((x) >> 8) ^ (x)) & (PIDHASH_SZ - 1))
– Hash Operations:
hash_pid(), unhash_pid(), find_task_by_pid(int pid)
– task_struct Fields: (for hash collision)
struct task_struct *pidhash_next;
struct task_struct **pidhash_pprev;
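A user-space sketch of the hash follows. The table size and hash function are the ones quoted above; the struct toy_task and the toy_ prefixed functions are hypothetical stand-ins for the kernel versions. The point of pidhash_pprev is that a task can be unlinked from its bucket chain without walking the chain.

```c
#include <stddef.h>

#define PIDHASH_SZ (4096 >> 2)
#define pid_hashfn(x) ((((x) >> 8) ^ (x)) & (PIDHASH_SZ - 1))

struct toy_task {
    int pid;
    struct toy_task *pidhash_next;     /* next task in the same bucket */
    struct toy_task **pidhash_pprev;   /* address of the pointer to us */
};

static struct toy_task *pidhash[PIDHASH_SZ];

/* Insert at the head of the bucket chain. */
void toy_hash_pid(struct toy_task *p)
{
    struct toy_task **htable = &pidhash[pid_hashfn(p->pid)];

    if ((p->pidhash_next = *htable) != NULL)
        (*htable)->pidhash_pprev = &p->pidhash_next;
    *htable = p;
    p->pidhash_pprev = htable;
}

/* Unlink in constant time through the back pointer. */
void toy_unhash_pid(struct toy_task *p)
{
    if (p->pidhash_next)
        p->pidhash_next->pidhash_pprev = p->pidhash_pprev;
    *p->pidhash_pprev = p->pidhash_next;
}

/* Walk one bucket chain; returns NULL if the pid is not hashed. */
struct toy_task *toy_find_task_by_pid(int pid)
{
    struct toy_task *p = pidhash[pid_hashfn(pid)];

    while (p && p->pid != pid)
        p = p->pidhash_next;
    return p;
}
```

PIDs 1 and 1029 both hash to bucket 1, so they make a convenient pair for exercising the collision chain.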
Run Queue
• List of all processes in TASK_RUNNING state
• task_struct fields:
– struct list_head run_list;
• struct list_head
– A generic doubly linked list data structure widely used
in the Linux kernel
– Defined in include/linux/list.h
struct list_head {
struct list_head *next, *prev;
};
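The sketch below shows how an embedded list_head is typically used: the list links only the list_head members, and a macro maps each one back to the structure that contains it. The helpers are simplified user-space rewrites written for this example, not copies of include/linux/list.h, and toy_task and sum_pids are hypothetical.

```c
#include <stddef.h>

struct list_head {
    struct list_head *next, *prev;
};

/* An empty list is a head that points to itself. */
void init_list_head(struct list_head *head)
{
    head->next = head->prev = head;
}

/* Insert entry right after head. */
void list_add(struct list_head *entry, struct list_head *head)
{
    entry->next = head->next;
    entry->prev = head;
    head->next->prev = entry;
    head->next = entry;
}

/* Unlink entry from whatever list it is on. */
void list_del(struct list_head *entry)
{
    entry->prev->next = entry->next;
    entry->next->prev = entry->prev;
}

/* Map a pointer to an embedded list_head back to its container. */
#define list_entry(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct toy_task {
    int pid;
    struct list_head run_list;   /* links the task into the run queue */
};

/* Sum the pids on a queue, to show a typical traversal. */
int sum_pids(struct list_head *runqueue)
{
    int sum = 0;
    struct list_head *pos;

    for (pos = runqueue->next; pos != runqueue; pos = pos->next)
        sum += list_entry(pos, struct toy_task, run_list)->pid;
    return sum;
}
```

Embedding the links in the object, instead of having list nodes point at objects, lets one structure sit on several lists at once by carrying several list_head members.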
Wait Queues
• List of processes waiting for a particular event
– TASK_INTERRUPTIBLE
– TASK_UNINTERRUPTIBLE
• Data structure in include/linux/wait.h
struct __wait_queue {
unsigned int flags;
struct task_struct * task;
struct list_head task_list;
};
typedef struct __wait_queue wait_queue_t;
Process Links
• Links To
– Original parent
– Parent
– Child (youngest)
– Younger sibling
– Older sibling
• task_struct fields:
– struct task_struct *p_opptr, *p_pptr, *p_cptr,
*p_ysptr, *p_osptr;
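How these pointers fit together can be sketched with a reduced structure. The field names are the ones listed above (p_opptr is left out); toy_task, toy_link_child and toy_count_children are hypothetical. A parent points only to its youngest child, and the remaining children are reached through the older-sibling pointers.

```c
#include <stddef.h>

struct toy_task {
    int pid;
    struct toy_task *p_pptr;    /* parent */
    struct toy_task *p_cptr;    /* youngest child */
    struct toy_task *p_ysptr;   /* younger sibling */
    struct toy_task *p_osptr;   /* older sibling */
};

/* Make child the newest (youngest) child of parent. */
void toy_link_child(struct toy_task *parent, struct toy_task *child)
{
    child->p_pptr = parent;
    child->p_ysptr = NULL;
    child->p_osptr = parent->p_cptr;
    if (child->p_osptr)
        child->p_osptr->p_ysptr = child;
    parent->p_cptr = child;
}

/* Visit children from youngest to oldest and count them. */
int toy_count_children(const struct toy_task *parent)
{
    int n = 0;
    const struct toy_task *c;

    for (c = parent->p_cptr; c; c = c->p_osptr)
        n++;
    return n;
}
```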
Other Fields
• Virtual Memory Management
– task_struct field: struct mm_struct *mm
• Filesystem information (root directory, current working directory)
– task_struct field: struct fs_struct *fs
• Open File Information
– task_struct field: struct files_struct *files
• Signal handlers
• Limits
The CPU State
• task_struct field:
– struct thread_struct thread;
• The Thread of Execution
• Contents
– Program counter, CPU’s registers, etc.
The Data Structure thread_struct
• In include/asm-i386/processor.h :
– struct thread_struct
• Fields
– Hardware Registers
• unsigned long esp0, eip, esp, fs, gs;
– Hardware debugging registers
• unsigned long debugreg[8];
– Fault info
• unsigned long cr2, trap_no, error_code;
– Floating point info, virtual 8086 mode info, I/O
permissions
Kernel Thread
• A Kernel Thread is a “lightweight” Process
– Has a Process Descriptor and is schedulable
– Runs only in kernel mode (has no user-mode address
space)
– Shares the kernel address space with other kernel threads
– Shares other kernel data structures
• Filesystem information (root directory, current working directory)
• Open file descriptors
• Open file descriptors
• A Typical Kernel has many Threads
– Depends on configuration
Well-known Kernel Threads
• Process 0 (swapper)
• Process 1: init
• Other Threads: (ps ax)
PID TTY STAT TIME COMMAND
  2 ?   SW   0:00 [keventd]
  3 ?   SW   0:00 [kapm-idled]
  4 ?   SWN  0:00 [ksoftirqd_CPU0]
  5 ?   SW   4:45 [kswapd]
  6 ?   SW   0:00 [kreclaimd]
  7 ?   SW   0:00 [bdflush]
  8 ?   SW   0:00 [kupdated]
  9 ?   SW<  0:00 [mdrecoveryd]
 10 ?   SW   0:00 [phpd_notify]
 14 ?   SW   0:40 [kjournald]
 80 ?   SW   0:00 [khubd]
Summary
• Process Management:
– LKP §3.1.1 & 3.1.2
– ULK §3
• Projects from now on
– 5 group projects (out of 5 instead of out of 8)
– One per week
– First group project: to be posted the coming Monday, due
a week later