Process States
[Figure: process state diagram for Task 1, Task 2, and Task 3, split into user mode and system mode. Labels: Running, Interrupt, Interrupt routine, System call, Return from system call, Scheduler, Ready, Waiting.]
Running: the task is active and running in non-privileged user mode. If an interrupt or system call occurs, it is switched to privileged system mode.
Interrupt routine
System call: software interrupt
Waiting
When the system call or interrupt is complete, the scheduler switches the process to the ready state.
Ready
Task structure
task_struct is defined in include/linux/sched.h. It is also accessed by assembly code, so the sequence of the leading fields cannot be altered and no declarations can be added in front of them.
States
TASK_RUNNING (0): ready or running
TASK_INTERRUPTIBLE (1), TASK_UNINTERRUPTIBLE (2): waiting for certain events. A task in TASK_UNINTERRUPTIBLE is not woken up by signals.
TASK_ZOMBIE (3): process terminated but still has its task structure
TASK_STOPPED (4): process has been halted
TASK_SWAPPING (5): not used
Task Structure
struct task_struct {
    /* these are hardcoded - don't touch */
    volatile long state;
    long counter;
    long priority;
volatile indicates that the value can be altered by interrupt routines.
counter holds the time in ticks that the process can still run before a mandatory scheduling action is carried out. The scheduler uses counter as the dynamic priority.
priority holds the static priority of the process.
    unsigned long signal;
    unsigned long blocked;
signal contains a bit mask of the signals received for the process. It is evaluated in the routine ret_from_sys_call(), which is called after every system call and after slow interrupts.
blocked contains a bit mask of the signals to be blocked.
    unsigned long flags;
flags contains a combination of the system status flags.
Process flags:
#define PF_ALIGNWARN  0x00000001 /* Print alignment warning msgs */
                                 /* Not implemented yet, only for 486 */
#define PF_PTRACED    0x00000010 /* set if ptrace (0) has been called. */
#define PF_TRACESYS   0x00000020 /* tracing system calls */
#define PF_FORKNOEXEC 0x00000040 /* forked but didn't exec */
#define PF_SUPERPRIV  0x00000100 /* used super-user privileges */
#define PF_DUMPCORE   0x00000200 /* dumped core */
#define PF_SIGNALED   0x00000400 /* killed by a signal */
#define PF_STARTING   0x00000002 /* being created */
#define PF_EXITING    0x00000004 /* getting shut down */
#define PF_USEDFPU    0x00100000 /* Process used the FPU this quantum (SMP only) */
#define PF_DTRACE     0x00200000 /* delayed trace (used on m68k) */
    int errno;
    int debugreg[8];
errno holds the error code of the last faulty system call. debugreg contains the 80x86's debugging registers.
    struct exec_domain *exec_domain;
Which UNIX is emulated for the process.
    struct task_struct *next_task, *prev_task;
All processes are linked through these two pointers. init_task is the start and end of this list.
    struct task_struct *next_run, *prev_run;
List of the processes that apply for the processor.
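    struct task_struct *p_opptr, *p_pptr, *p_cptr, *p_ysptr, *p_osptr;
Pointers to the original parent process, the parent process, the youngest child, the younger sibling, and the older sibling, respectively.
[Figure: family relationships. The parent's p_cptr points to its youngest child, every child's p_pptr points to the parent, and the children are chained from the youngest child to the oldest child through p_osptr and back through p_ysptr.]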
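The family pointers can be illustrated with a minimal user-space sketch. The type task_sketch and the helpers link_child, count_children, and oldest_child are hypothetical names introduced here for illustration; only the pointer field names come from the slides, and this is not the kernel's own code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for task_struct: only the family pointers are kept. */
struct task_sketch {
    int pid;
    struct task_sketch *p_pptr;   /* parent */
    struct task_sketch *p_cptr;   /* youngest child */
    struct task_sketch *p_ysptr;  /* younger sibling */
    struct task_sketch *p_osptr;  /* older sibling */
};

/* Make child the new youngest child of parent. */
static void link_child(struct task_sketch *parent, struct task_sketch *child)
{
    assert(parent != NULL && child != NULL);
    child->p_pptr = parent;
    child->p_ysptr = NULL;            /* nobody is younger yet */
    child->p_osptr = parent->p_cptr;  /* previous youngest becomes the older sibling */
    if (child->p_osptr != NULL)
        child->p_osptr->p_ysptr = child;
    parent->p_cptr = child;
}

/* Walk from the youngest child toward the oldest and count the children. */
static int count_children(const struct task_sketch *parent)
{
    int n = 0;
    const struct task_sketch *c;

    for (c = parent->p_cptr; c != NULL; c = c->p_osptr)
        n++;
    return n;
}

/* Return the oldest child, or NULL if the parent has no children. */
static const struct task_sketch *oldest_child(const struct task_sketch *parent)
{
    const struct task_sketch *c = parent->p_cptr;

    while (c != NULL && c->p_osptr != NULL)
        c = c->p_osptr;
    return c;
}
```

Because a new child is always linked in as the youngest, the parent needs only one pointer (p_cptr) to reach all of its children.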
    struct mm_struct *mm;
Memory management information.
struct mm_struct {
    int count;
    pgd_t * pgd;
    unsigned long context;
    unsigned long start_code, end_code, start_data, end_data;
    unsigned long start_brk, brk, start_stack, start_mmap;
    unsigned long arg_start, arg_end, env_start, env_end;
    unsigned long rss, total_vm, locked_vm;
    unsigned long def_flags;
    struct vm_area_struct * mmap;
    struct vm_area_struct * mmap_avl;
    struct semaphore mmap_sem;
};
Virtual Memory
    unsigned long kernel_stack_page;
Stack used when the process is running in system mode.
    unsigned long saved_kernel_stack;
Saves the old stack pointer when running the MS-DOS emulator (vm86).
    int pid, pgrp, session, leader;
Process ID, process group ID, the session the process belongs to, and the session leader flag.
    unsigned short uid, euid, suid, fsuid;
    unsigned short gid, egid, sgid, fsgid;
User ID, effective user ID, saved set-user-ID, and file system user ID; group ID, effective group ID, saved set-group-ID, and file system group ID.
uid, euid, suid, gid, egid, sgid
Each process has a real user ID and group ID and an effective user ID and group ID.
The real ID identifies the person using the system. The effective ID determines their access privileges.
execve() changes the effective user or group ID to the owner or group of the executed file if the file has the set-user-ID (suid) or set-group-ID (sgid) mode set. The real UID and GID are not affected.
The effective user ID and effective group ID of the new process image are saved as the saved set-user-ID and saved set-group-ID, respectively, for use by setuid(3V).
uid and gid are inherited from the parent. euid, egid, fsuid, and fsgid can be set at run time (owner of the executable file).
    int groups[NGROUPS];
A process may be assigned to many groups.
    struct fs_struct *fs;
File system information.
struct fs_struct {
    int count;                  /* for future expansions */
    unsigned short umask;       /* access mode */
    struct inode * root, * pwd; /* root dir and current dir */
};
    struct files_struct *files;
Open file information (file descriptors).
struct files_struct {           /* open file table structure */
    int count;
    fd_set close_on_exec;       /* files to be closed when exec is issued */
    fd_set open_fds;            /* open files (bitmask) */
    struct file * fd[NR_OPEN];
};
    long utime, stime, cutime, cstime, start_time;
Time spent in user mode, time spent in system mode, total time the child processes spent in user mode and in system mode, and the time at which the process was created, respectively.
    unsigned long it_real_value, it_prof_value, it_virt_value;
    unsigned long it_real_incr, it_prof_incr, it_virt_incr;
    struct timer_list real_timer;
Timers for the alarm system call (SIGALRM): the it_*_value fields hold the time in ticks until the timer is triggered, the it_*_incr fields hold the values for reinitialization, and real_timer is the real-time interval timer.
    struct sem_undo *semundo;
Semaphores that need to be released when the process terminates.
    struct sem_queue *semsleeping;
Semaphore waiting queue.
    struct wait_queue *wait_chldexit;
When a process calls wait4(), it halts in this queue until a child process terminates.
    struct rlimit rlim[RLIM_NLIMITS];
Limits on the use of resources (setrlimit(), getrlimit()).
    struct signal_struct *sig;
struct signal_struct {
    int count;
    struct sigaction action[32];
};
Signal handlers.
    int exit_code, exit_signal;
Return code and the signal that caused the program to abort.
    char comm[16];
Name of the program executed by the process.
    unsigned long personality;
Description of the characteristics of this version of UNIX (see also exec_domain).
    int dumpable:1;
Whether a memory dump is to be executed.
    int did_exec:1;
Whether the process is still running the old program (no execve() yet).
    struct desc_struct *ldt;
Used by WINE, the Windows emulator.
    struct linux_binfmt *binfmt;
Functions responsible for loading the program.
    struct thread_struct tss;
Holds all the data on the current processor status at the time of the last transition from user mode to system mode. All registers are saved here.
struct thread_struct can be found in asm-i386/processor.h. Among other definitions, it includes 8086-related information:
    struct vm86_struct * vm86_info;
    unsigned long screen_bitmap;
    unsigned long v86flags, v86mask, v86mode;
    unsigned long policy, rt_priority;
Scheduling policies: classic (SCHED_OTHER) and real-time (SCHED_RR, SCHED_FIFO). rt_priority is the real-time priority.
#ifdef __SMP__
    int processor;
    int last_processor;
    int lock_depth;
#endif
On a multi-processor machine, the kernel needs to know on which processor the task is running, etc.
Process Table
struct task_struct init_task;
Start of the doubly linked task list.
struct task_struct *task[NR_TASKS];
Task table.
#define current (0+current_set[smp_processor_id()])
struct task_struct *current_set[NR_CPUS];
Current process (for multi-processor architectures).
#define for_each_task(p) \
    for (p = &init_task ; (p = p->next_task) != &init_task ; )
Macro for visiting all processes. The first task (init_task) is skipped.
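A small user-space sketch can show how the for_each_task loop behaves on a circular, doubly linked list. The struct ptask and the helpers link_task, unlink_task, count_tasks, and find_task are hypothetical names for illustration; only init_task, next_task, prev_task, and the macro body follow the slide.

```c
#include <assert.h>

/* Hypothetical stand-in for task_struct: only the list pointers are kept. */
struct ptask {
    int pid;
    struct ptask *next_task, *prev_task;
};

/* The list head points to itself while the list is empty. */
static struct ptask init_task = { 0, &init_task, &init_task };

#define for_each_task(p) \
    for (p = &init_task; (p = p->next_task) != &init_task; )

/* Append t at the tail of the list, i.e. just before init_task. */
static void link_task(struct ptask *t)
{
    assert(t != &init_task);
    t->next_task = &init_task;
    t->prev_task = init_task.prev_task;
    init_task.prev_task->next_task = t;
    init_task.prev_task = t;
}

/* Remove t from the list. */
static void unlink_task(struct ptask *t)
{
    t->next_task->prev_task = t->prev_task;
    t->prev_task->next_task = t->next_task;
    t->next_task = t->prev_task = t;
}

/* Count the tasks visited by the macro (init_task itself is skipped). */
static int count_tasks(void)
{
    struct ptask *p;
    int n = 0;

    for_each_task(p)
        n++;
    return n;
}

/* Return the task with the given pid, or a null pointer if none is found. */
static struct ptask *find_task(int pid)
{
    struct ptask *p;

    for_each_task(p)
        if (p->pid == pid)
            return p;
    return 0;
}
```

Since the loop advances before it tests, init_task is never visited; that is why find_task(0) finds nothing in this sketch.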
Memory Management
Macros
#define __get_free_page(priority) __get_free_pages((priority),0,0)
#define __get_dma_pages(priority, order) __get_free_pages((priority),(order),1)
extern unsigned long __get_free_pages(int priority, unsigned long gfporder, int dma);
Defined in linux/mm.h. The page size is 4 KB.
priority: GFP_BUFFER, GFP_ATOMIC, GFP_KERNEL, GFP_NOBUFFER, GFP_NFS (what to do if not enough pages are free)
order: number of pages to be reserved, as a power of 2
dma: whether the pages must be addressable by the DMA component
Functions
extern inline unsigned long get_free_page(int priority)
{
    unsigned long page;

    page = __get_free_page(priority);
    if (page)
        memset((void *) page, 0, PAGE_SIZE);
    return page;
}
The returned page is cleared.
void *kmalloc(size_t size, int priority)
void kfree(void *__ptr)
The kernel's counterparts of malloc() and free().
Waiting Queues
Structures (include/linux/wait.h)
struct wait_queue {
    struct task_struct * task;
    struct wait_queue * next;
};
A process waits in the queue until a condition is met.
Functions (sched.h)
extern inline void add_wait_queue(struct wait_queue ** p, struct wait_queue * wait)
extern inline void remove_wait_queue(struct wait_queue ** p, struct wait_queue * wait)
Functions (kernel/sched.c)
void sleep_on(struct wait_queue ** p);
void interruptible_sleep_on(struct wait_queue ** p);
void wake_up(struct wait_queue ** p);
void wake_up_interruptible(struct wait_queue ** p);
sleep_on sets the process state to TASK_UNINTERRUPTIBLE and interruptible_sleep_on sets it to TASK_INTERRUPTIBLE. wake_up sets the process state to TASK_RUNNING.
Semaphores
Structure for semaphores (asm-i386/semaphore.h)
struct semaphore {
    int count;
    int waiting;
    struct wait_queue * wait;
};
Functions
extern inline void down(struct semaphore * sem)
extern inline void up(struct semaphore * sem)
Time is measured in units of ticks (10 ms). The global variable jiffies denotes the time in ticks since the system booted.
Structure for timers (old)
struct timer_struct {
    unsigned long expires;
    void (*fn)(void);
};
extern struct timer_struct timer_table[32];
extern unsigned long timer_active; /* which entry is valid? */
struct timer_list {
    struct timer_list *next;
    struct timer_list *prev;
    unsigned long expires;
    unsigned long data; /* arguments */
    void (*function)(unsigned long);
};
extern void add_timer(struct timer_list * timer);
extern int del_timer(struct timer_list * timer);
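To make the role of expires, data, and function concrete, here is a minimal user-space sketch of a timer list kept sorted by expiry time. Everything in it is an assumption for illustration (the names timer_sketch, add_timer_sketch, del_timer_sketch, run_expired_timers, jiffies_sketch, and record_fire are hypothetical); it treats expires as an absolute tick count, ignores counter wraparound, and is not the kernel's implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mirror of the timer_list fields shown above. */
struct timer_sketch {
    struct timer_sketch *next, *prev;
    unsigned long expires;              /* absolute tick at which to fire */
    unsigned long data;                 /* argument passed to function */
    void (*function)(unsigned long);
};

static unsigned long jiffies_sketch;    /* simulated tick counter */
static struct timer_sketch timer_head = { &timer_head, &timer_head, ~0UL, 0, NULL };

/* Demo handler: accumulates the data values of the timers that fired. */
static unsigned long fired_data_sum;
static void record_fire(unsigned long data)
{
    fired_data_sum += data;
}

/* Insert the timer so that the list stays sorted by expires. */
static void add_timer_sketch(struct timer_sketch *t)
{
    struct timer_sketch *p = timer_head.next;

    while (p != &timer_head && p->expires <= t->expires)
        p = p->next;
    t->next = p;                        /* insert t just before p */
    t->prev = p->prev;
    p->prev->next = t;
    p->prev = t;
}

/* Remove the timer; returns 1 if it was queued, 0 otherwise. */
static int del_timer_sketch(struct timer_sketch *t)
{
    if (t->next == NULL)
        return 0;
    t->next->prev = t->prev;
    t->prev->next = t->next;
    t->next = t->prev = NULL;
    return 1;
}

/* Fire every timer whose expiry time has been reached; returns the count. */
static int run_expired_timers(void)
{
    struct timer_sketch *t;
    int fired = 0;

    while ((t = timer_head.next) != &timer_head && t->expires <= jiffies_sketch) {
        void (*fn)(unsigned long) = t->function;
        unsigned long data = t->data;

        del_timer_sketch(t);            /* unlink first so the handler may re-add it */
        fn(data);
        fired++;
    }
    return fired;
}
```

Keeping the list sorted means the expiry check only ever has to look at the head of the list.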
Process Management
Signal
Interrupt
Booting
Timer
Scheduler
Signals
Signal     Number  Description
SIGHUP     1       hangup
SIGINT     2       interrupt
SIGQUIT    3       quit
SIGILL     4       illegal instruction
SIGTRAP    5       trace trap
SIGABRT    6       abort (generated by abort(3) routine)
SIGIOT     6       Input/Output Trap (obsolete)
SIGBUS     7       bus error
SIGFPE     8       arithmetic exception
SIGKILL    9       kill (cannot be caught, blocked, or ignored)
SIGUSR1    10      user-defined signal 1
SIGSEGV    11      segmentation violation
SIGUSR2    12      user-defined signal 2
SIGPIPE    13      write on a pipe or other socket with no one to read it
SIGALRM    14      alarm clock
SIGTERM    15      software termination signal
SIGSTKFLT  16
SIGCHLD    17      child status has changed
SIGCONT    18      continue after stop
SIGSTOP    19      stop (cannot be caught, blocked, or ignored)
SIGTSTP    20      stop signal generated from keyboard
SIGTTIN    21      background read attempted from control terminal
SIGTTOU    22      background write attempted to control terminal
SIGURG     23      urgent condition present on socket
SIGXCPU    24      cpu time limit exceeded (see getrlimit(2))
SIGXFSZ    25      file size limit exceeded (see getrlimit(2))
SIGVTALRM  26      virtual time alarm (see getitimer(2))
SIGPROF    27      profiling timer alarm (see getitimer(2))
SIGWINCH   28      window changed (see termio(4) and win(4S))
SIGIO      29      I/O is possible on a descriptor (see fcntl(2V))
SIGPOLL    29      same as SIGIO
SIGPWR     30      power failure (for UPS)
SIGUNUSED  31
Signal system calls
kill(pid, sig) sends the signal sig to a process or a group of processes.
If pid is greater than zero, the signal is sent to the process with the PID pid.
If pid is zero, the signal is sent to the process group of the current process.
If pid is -1, the signal is sent to all processes except the system processes and the current process.
If pid is less than -1, the signal is sent to all processes of the process group -pid.
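The four cases for pid can be written as a small decision function. This is only an illustration of the rules listed above; the enum kill_target and the functions classify_kill_pid and target_pgrp are hypothetical names, not kernel code.

```c
#include <assert.h>

/* Hypothetical labels for the four cases of the pid argument. */
enum kill_target {
    KILL_ONE_PROCESS,   /* pid > 0:  the process with that PID */
    KILL_OWN_GROUP,     /* pid == 0: process group of the current process */
    KILL_ALL,           /* pid == -1: all but system processes and the caller */
    KILL_GROUP          /* pid < -1: the process group -pid */
};

/* Map the pid argument to the kind of target it selects. */
static enum kill_target classify_kill_pid(int pid)
{
    if (pid > 0)
        return KILL_ONE_PROCESS;
    if (pid == 0)
        return KILL_OWN_GROUP;
    if (pid == -1)
        return KILL_ALL;
    return KILL_GROUP;
}

/* For pid < -1, the addressed process group is -pid. */
static int target_pgrp(int pid)
{
    assert(pid < -1);
    return -pid;
}
```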
The real or effective user ID of the sending process must match the real or saved set-user-ID of the receiving process, unless the effective user ID of the sending process is the super-user. The single exception is the signal SIGCONT, which only requires that the sending and receiving processes belong to the same session.
Errors:
EINVAL: invalid sig
ESRCH: process or process group does not exist
EPERM: no privileges
Call chain (linux/kernel/exit.c)
sys_kill() -> send_sig(), kill_pg(), kill_proc() -> generate()
See also force_sig(), kill_sl().
Also called from ret_from_sys_call() -> do_signal() -> send_sig()
    -> handle_signal() (signal.c, 223)
    -> setup_frame() (160)
    -> regs->eip = sa->sa_handler (213)
sys_kill (linux/kernel/exit.c, lines 318-339)
322-323: if pid is zero, the signal is sent to the process group of the current process.
324-334: if pid is -1, the signal is sent to all processes except the system processes (PID 0 or 1) and the current process. The for_each_task macro is defined in include/linux/sched.h, line 491. If count is zero, the error code ESRCH is returned.
335-336: if pid is less than -1, the signal is sent to all processes of the process group -pid.
338: if pid is greater than zero, the signal is sent to the process with the PID pid.
kill_pg (linux/kernel/exit.c, lines 258-275)
264-265: sig must be in [1..32] and pgrp (the process group ID) must be greater than zero.
266-273: for each process whose process group ID is pgrp, send the signal sig to it (send_sig). On success, send_sig returns zero.
274: if found = 0, no process has been found and the error ESRCH is returned; otherwise zero is returned.
kill_proc (linux/kernel/exit.c, lines 301-312)
305-306: sig must be in [1..32].
307-310: if a process with the given pid is found, send the signal sig to it (send_sig).
311: if no process has been found, return the error ESRCH.
send_sig (linux/kernel/exit.c, lines 73-101)
75-76: p cannot be null and sig must be less than or equal to 32.
77: priv is the privilege (0 for a normal process, 1 for the super-user). SIGCONT can only be sent to a process that belongs to the same session.
78-79: the real or effective user ID of the sending process must match the real or saved set-user-ID of the receiving process, unless the effective user ID of the sending process is the super-user.
80: super-user?
81: if none of the above conditions is true, return an error.
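The permission rule in lines 77-81 can be restated as a predicate. The sketch below is an assumption-laden illustration: the struct proc_ids, the function may_send_signal, and the constant SKETCH_SIGCONT (18, taken from the signal table) are hypothetical names, and the super-user is assumed to have effective user ID 0.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_SIGCONT 18   /* signal number of SIGCONT from the signal table */

/* Hypothetical subset of the task fields used by the permission check. */
struct proc_ids {
    unsigned short uid;     /* real user ID */
    unsigned short euid;    /* effective user ID */
    unsigned short suid;    /* saved set-user-ID */
    int session;            /* session the process belongs to */
};

/* Return 1 if the sender may signal the receiver, 0 otherwise (EPERM case). */
static int may_send_signal(int sig, int priv,
                           const struct proc_ids *snd,
                           const struct proc_ids *rcv)
{
    assert(snd != NULL && rcv != NULL);
    if (priv)                                   /* privileged (kernel) sender */
        return 1;
    if (sig == SKETCH_SIGCONT && snd->session == rcv->session)
        return 1;                               /* SIGCONT within one session */
    if (snd->euid == rcv->suid || snd->euid == rcv->uid ||
        snd->uid == rcv->suid || snd->uid == rcv->uid)
        return 1;                               /* user IDs match */
    if (snd->euid == 0)                         /* assumed: super-user has euid 0 */
        return 1;
    return 0;
}
```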
82-83: if sig = 0, do nothing.
84-88: if sig in the task structure is null (the process is in the zombie state), do nothing.
89-95: if sig is SIGKILL or SIGCONT and the process is in the state TASK_STOPPED, wake up the process and reset the SIGSTOP, SIGTSTP, SIGTTIN, and SIGTTOU signals.
96-97: if sig is SIGSTOP, SIGTSTP, SIGTTIN, or SIGTTOU, reset SIGCONT.
99: actually generate the signal.
generate (linux/kernel/exit.c, lines 29-51)
31: set up the signal mask.
32: action of the signal, sa = p->sig->action[sig-1].
39: if the signal is not blocked and the process is not traced,
41: and if the handler of the signal is SIG_IGN (to be ignored) and the signal does not come from a state change of a child process,
42: then return immediately.
44-46: if the handler is SIG_DFL (default action) and the signal is SIGCONT, SIGCHLD, SIGWINCH, or SIGURG, then return immediately. (The wake-up has already been done for SIGCONT.)
48: finally, set the signal.
49-50: if the receiving process is interruptible and the signal is not blocked, wake up the process.
force_sig (linux/kernel/exit.c, lines 57-70)
Forces a signal to be sent to a process (it cannot be ignored).
60: if the process is not in the zombie state,
61-62: set the signal and get the signal action structure.
63: really set the signal.
64: the signal cannot be blocked, so clear the bit in p->blocked.
65-66: if the handler is SIG_IGN, reset it to SIG_DFL.
67-68: wake up the process if it is interruptible.
kill_sl (linux/kernel/exit.c, lines 282-299)
Sends a signal to the session leader.
288-289: sig must be in [1..32] and the session must be greater than zero.
290-297: for each process, check whether its session ID is equal to sess and the process is the session leader; if so, send the signal to the session leader (send_sig).
298: return an error if no process is found.
Signal system calls
sigpending(sigset_t *set) returns in set the signals that are blocked from delivery and pending for the calling process.
ssetmask(int mask), sgetmask(): set/get the blocked signals of the current process; made obsolete by sigprocmask().
sigsuspend(int restart, unsigned long oldmask, unsigned long newmask) replaces the process's signal mask with newmask and then suspends the process until delivery of a signal.
sys_sigaction (linux/kernel/signal.c, lines 150-182)
155-156: check the signal number [1..32].
157: get the old sigaction (p).
158-170: if action (the new setting) is not null, check whether it can be read. If yes, copy the content of action to new_sa.
171-176: if oldaction is not null, store the old sigaction (p) in oldaction.
177-180: replace the sigaction with new_sa.
sys_sigprocmask (linux/kernel/signal.c, lines 29-60)
34-52: if set (the new mask) is not null, it is processed depending on how. SIG_BLOCK: add the signals in set to the blocked signals. SIG_UNBLOCK: remove the signals in set from the blocked signals. SIG_SETMASK: replace the blocked signals with set.
53-58: if oset is not null, copy old_set (the previous current->blocked) to oset.
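The three how modes are plain bit operations on the blocked mask. The following user-space sketch shows them under stated assumptions: the names apply_sigmask, sig_bit, HOW_BLOCK, HOW_UNBLOCK, HOW_SETMASK, and BLOCKABLE_SKETCH are hypothetical, signal n is assumed to occupy bit n-1, and SIGKILL (9) and SIGSTOP (19) are filtered out because the signal table says they cannot be blocked.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for the three "how" modes. */
enum { HOW_BLOCK, HOW_UNBLOCK, HOW_SETMASK };

/* Bit for signal number sig (1..32); signal n is assumed to use bit n-1. */
static unsigned long sig_bit(int sig)
{
    assert(sig >= 1 && sig <= 32);
    return 1UL << (sig - 1);
}

/* SIGKILL (9) and SIGSTOP (19) can never be blocked. */
#define BLOCKABLE_SKETCH (~((1UL << 8) | (1UL << 18)))

/*
 * Apply set to *blocked according to how and optionally report the old mask.
 * Returns 0 on success and -1 for an unknown how (nothing is changed then).
 */
static int apply_sigmask(unsigned long *blocked, int how,
                         unsigned long set, unsigned long *oset)
{
    unsigned long old = *blocked;
    unsigned long new_set = set & BLOCKABLE_SKETCH;

    switch (how) {
    case HOW_BLOCK:
        *blocked = old | new_set;       /* add the signals in set */
        break;
    case HOW_UNBLOCK:
        *blocked = old & ~new_set;      /* remove the signals in set */
        break;
    case HOW_SETMASK:
        *blocked = new_set;             /* replace the mask */
        break;
    default:
        return -1;
    }
    if (oset != NULL)
        *oset = old;
    return 0;
}
```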
sys_sigpending (linux/kernel/signal.c, lines 80-88)
Stores the signals that are pending but blocked into set.
84: check whether set can be written.
85-86: if yes, copy the signals that are pending and blocked to set.
Interrupt
Interrupts allow the hardware to communicate with the operating system.
Source files: arch/i386/kernel/irq.c, include/asm-i386/irq.h
Interrupt handlers: slow, fast, bad (irq.c, lines 142-172). Build the interrupt handler first.
Interrupt numbers
First set: 0-7
Second set: 8-15
IRQ 0 is used for the timer
On an SMP board (486 and above): irq13
On a 386: irq13
do_IRQ()
struct irqaction * action = *(irq + irq_action);
while (action) {
    do_random |= action->flags;
    action->handler(irq, action->dev_id, regs);
    action = action->next;
}
Data structure
struct irqaction {
    void (*handler)(int, void *, struct pt_regs *);
    unsigned long flags;
    unsigned long mask;
    const char *name;
    void *dev_id;
    struct irqaction *next;
};
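The loop in do_IRQ() walks a singly linked chain of actions registered for one IRQ line, which is what allows several devices to share the line. A minimal user-space sketch of that idea follows; action_sketch, irq_actions, register_action, dispatch_irq, and count_interrupt are hypothetical names, and the register-set argument of the real handler is left out.

```c
#include <assert.h>
#include <stddef.h>

#define NR_IRQS_SKETCH 16

/* Hypothetical, reduced version of struct irqaction. */
struct action_sketch {
    void (*handler)(int irq, void *dev_id);
    void *dev_id;                       /* identifies the device to the handler */
    struct action_sketch *next;         /* next action sharing the same IRQ */
};

/* One action chain per IRQ line; empty chains are NULL. */
static struct action_sketch *irq_actions[NR_IRQS_SKETCH];

/* Append an action to the end of the chain of the given IRQ. */
static void register_action(int irq, struct action_sketch *a)
{
    struct action_sketch **p;

    assert(irq >= 0 && irq < NR_IRQS_SKETCH);
    a->next = NULL;
    for (p = &irq_actions[irq]; *p != NULL; p = &(*p)->next)
        ;
    *p = a;
}

/* Call every handler registered for the IRQ; returns how many were called. */
static int dispatch_irq(int irq)
{
    struct action_sketch *action;
    int called = 0;

    assert(irq >= 0 && irq < NR_IRQS_SKETCH);
    for (action = irq_actions[irq]; action != NULL; action = action->next) {
        action->handler(irq, action->dev_id);
        called++;
    }
    return called;
}

/* Demo handler: dev_id points to an int counter that is incremented. */
static void count_interrupt(int irq, void *dev_id)
{
    (void)irq;
    (*(int *)dev_id)++;
}
```

Passing dev_id back to the handler lets one handler function serve several devices on a shared line.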
Interrupt flow
IDT[] -> interrupt[] (or fast_interrupt[], bad_interrupt[])
IRQi_interrupt (or fast or bad) -> do_IRQ() -> irqaction[]
irqaction[i]->handler -> jump to ret_from_sys_call
jump to handle_bottom_half (if bh_mask & bh_active)
do_bottom_half -> bh_base[] -> bh_base[i]
request_irq() -> setup_x86_irq() (init fn)
setup_x86_irq: fast_interrupt[], bad_interrupt[]
BUILD_IRQ macro: assembly code
interrupt, fast_interrupt: call do_IRQ
bad_interrupt
interrupt: call do_IRQ, jump to ret_from_sys_call (fast_interrupt)
do_IRQ: irqaction[] -> action -> handler
jump to ret_from_sys_call
jump to handle_bottom_half (bh_mask & bh_active)
handle_bottom_half: assembly code, call do_bottom_half
do_bottom_half: bh_base[] function
bh_base[]: init_bh()
irq: request_irq()
interrupt:
bottom half: start_kernel() -> sched_init() -> init_bh(TIMER_BH, timer_bh)
call do_timer, jump ret_from_sys_call -> handle_bottom_half -> do_bottom_half -> bh_base[0] -> timer_bh
&el3_interrupt, ) 356
interrupt: el3_interrupt()
NET_BH init? net_dev_init()
init_IRQ() (arch/i386/kernel/irq.c)
536: void init_IRQ(void): function, IRQ
545~547: outb_p, outb: output a byte to a port
548~549: for loop, set_intr_gate, bad_interrupt array; set_intr_gate: system.h, 235, 247; bad_interrupt[]: interrupt handler; request_irq(): flag, interrupt[], fast_interrupt[]
555~556: request_region(): apricot.c, function; resource.c, macro
557~558: setup_x86_irq(), Interrupt Descriptor Table (IDT)
setup_x86_irq()
395: setup_x86_irq()
401: p = irq_action + irq; irq_action: 219, 16, NULL, struct; irq 0~15
402~417: IRQ share: fast, bad interrupt; share: slow interrupt; interrupt share
426~432: IRQ share: fast interrupt, fast_interrupt[], interrupt[]
int request_irq()
437~467: device, IRQ; request_irq(): function, device, IRQ; IRQ handler, device
Boot
Boot process
The BIOS reads the first sector of the boot disk (floppy, hard disk, ..., according to the BIOS parameter setting).
It loads the boot sector (512 bytes), which contains program code for loading the operating system kernel (e.g., the Linux Loader, LILO), to 0x7C00 (arch/i386/boot/bootsect.s, 35) in real mode.
The boot sector ends with 0xAA55.
Boot disk
Floppy: the first sector
Hard disk: the first sector is the master boot record (MBR)
[Figure: layout of the MBR and of the extended partition table. From start to end: code for loading the boot sector of the active partition, Partition 1, Partition 2, Partition 3, Partition 4, 0xAA55.]
Extended partition
Used if more than 4 partitions are needed.
The first sector of the extended partition has the same structure as the MBR.
The first partition entry is for the first logical drive.
The second partition entry points to the next logical drive (MBR).
The first sector of each primary or extended partition contains a boot sector.
Partition table entry
Bytes  Field    Meaning
1      Boot     Boot flag
1      HD       Begin: head number
2      SEC CYL  Begin: sector and cylinder number of the boot sector
1      SYS      System code: 0x83 Linux, 0x82 swap, 0x05 extended
1      HD       End: head number
2      SEC CYL  End: sector and cylinder number of the last sector
4               Relative sector number of the start sector (low byte first, high byte last)
4               Number of sectors in the partition (low byte first, high byte last)
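As an illustration of the entry layout, the sketch below decodes one partition entry from a 512-byte MBR image held in memory. The names part_info, read_le32, has_boot_signature, parse_partition, and find_active_partition are hypothetical. The sketch assumes the conventional PC layout: the table starts at byte offset 446, each entry is 16 bytes, an active entry has boot flag 0x80, and the word 0xAA55 is stored low byte first in the last two bytes.

```c
#include <assert.h>
#include <stdint.h>

#define PART_TABLE_OFFSET 446   /* assumed start of the partition table */
#define PART_ENTRY_SIZE   16

/* Hypothetical decoded view of one partition table entry. */
struct part_info {
    uint8_t  boot_flag;     /* 0x80 marks the active partition */
    uint8_t  sys_code;      /* e.g. 0x83 Linux, 0x82 swap, 0x05 extended */
    uint8_t  begin_head;
    uint8_t  begin_sector;  /* low 6 bits of the SEC CYL field */
    uint16_t begin_cyl;     /* 10 bits: 2 from the sector byte, 8 from the next */
    uint32_t start_sector;  /* relative sector number of the start sector */
    uint32_t num_sectors;   /* number of sectors in the partition */
};

/* Read a 32-bit value stored low byte first. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* The sector must end with the word 0xAA55 (bytes 0x55, 0xAA). */
static int has_boot_signature(const uint8_t *sector)
{
    return sector[510] == 0x55 && sector[511] == 0xAA;
}

/* Decode entry index (0..3); returns 0 on success, -1 if the signature is missing. */
static int parse_partition(const uint8_t *sector, int index, struct part_info *out)
{
    const uint8_t *e;

    assert(index >= 0 && index < 4);
    if (!has_boot_signature(sector))
        return -1;
    e = sector + PART_TABLE_OFFSET + index * PART_ENTRY_SIZE;
    out->boot_flag = e[0];
    out->begin_head = e[1];
    out->begin_sector = e[2] & 0x3F;
    out->begin_cyl = (uint16_t)(e[3] | ((e[2] & 0xC0) << 2));
    out->sys_code = e[4];
    out->start_sector = read_le32(e + 8);
    out->num_sectors = read_le32(e + 12);
    return 0;
}

/* Return the index of the first active partition, or -1 if there is none. */
static int find_active_partition(const uint8_t *sector)
{
    int i;

    if (!has_boot_signature(sector))
        return -1;
    for (i = 0; i < 4; i++)
        if (sector[PART_TABLE_OFFSET + i * PART_ENTRY_SIZE] == 0x80)
            return i;
    return -1;
}
```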
Active Partition
Booting is carried out from the active partition, which is determined by the boot flag.
Operations of the MBR:
determine the active partition
load the boot sector of the active partition
jump into the boot sector at offset 0
Boot Process
Compressed kernel size
include/linux/config.h: DEF_SYSSIZE = 0x7F00 clicks = 508 KB (1 click = 16 bytes). A zImage is smaller than this size.
The source of the zImage boot sector is arch/i386/boot/bootsect.s. It is loaded to 0x7C00 first, then it is moved to 0x90000 and execution continues there.
setup.s is then loaded to 0x90200 and the kernel image is loaded to 0x10000 (64 KB).
setup.s moves the kernel from 0x10000 to 0x1000 (4 KB) to save memory, then enters protected mode and jumps to 0x1000 (lines 520-536).
bootsect.s, lines 59-69
Moves the code from 0x7C00 (BOOTSEG) to 0x90000 (INITSEG).
64-65: set si and di to zero
66: cld clears the DF flag in EFLAGS to 0, which makes the move go upward (the addresses increase during the data movement)
rep: repeats line 68
68: move word by word until cx = 0 (cx is initialized to 256)
Boot Process
Uncompressed kernel
The start point is at arch/i386/kernel/head.s. It initializes the system and then calls start_kernel, so the system then runs from start_kernel().
Setup starts from start: in arch/i386/boot/setup.s. setup.s is responsible for initializing the hardware, asking the BIOS for memory/disk/other parameters, and putting them in memory at 0x90000-0x901FF.
520-521: switch to protected mode
534-536: jmpi 0x1000, KERNEL_CS
More initializations
858: creates process 1 (kernel_thread(init, NULL, 0)). Process 0 is an idle process: it does nothing and runs when no other process needs the CPU. Process 1 calls init() and starts some daemons.
868: process 0 enters an infinite idle loop
bdflush is responsible for synchronizing the buffer cache contents with the file system
929: kswapd is the background pageout daemon (swapping)
937: setup initializes the file systems and mounts the root file system
986-991: connects to the console and opens file descriptors 0, 1, 2 (console)
993-997: tries to execute one of the programs /etc/init, /bin/init, /sbin/init
999-1003: if none of the three programs exists, executes /etc/rc
Then enters an infinite loop in which a shell is started for users to log in on the console.
Related Codes
ITIMER_REAL
Itimer.c/115, sched.c/606
Timer Interrupt
Important global variables
jiffies (kernel/sched.c, 96): unsigned long volatile jiffies=0; ticks (10 ms) since the system was started up
xtime (kernel/sched.c): actual time
Timer interrupt
Updates jiffies and marks the bottom half active. The bottom half is called later, after the other interrupts have been handled.
do_timer (kernel/sched.c, 1077-1095)
1079: increase jiffies
1080: increase lost_ticks (ticks since the last call of the bottom half routine)
1081: mark the bottom half active (include/linux/interrupt.h)
1082-1083: increase lost_ticks_system if in kernel mode (ticks spent in kernel mode since the last call of the bottom half routine)
1084-1092: profile
1093-1094: mark the timer queue handler active
Timer Interrupt
Bottom half
1072:
update_process_times (977-1049)
981: user time = ticks - system time
983: decrease the time quota used by the current process
984-987: if the time quota is used up, a reschedule is needed
988-992: kernel statistics
994: update the current process's times (924-975)
929-930: update the process's user and system times
932-940: check whether the process has used up its CPU limit (setrlimit sets the limits on resource usage). If it exceeds the soft limit, SIGXCPU is sent. If it exceeds the hard limit, SIGKILL is sent to kill the process.
947-953: update the interval timers. When a timer has expired, SIGVTALRM is sent.
960-966: update profile
run_timer_list (649-665)
654: check the timer list to see which timers have expired
655-662: prepare to call the timer handler
run_old_timers (667-683)
Scheduler
Classes
Real-time (soft), preemptive: rt_priority
SCHED_FIFO: a process runs until it relinquishes control or a process with a higher rt_priority wishes to run
SCHED_RR: can also be interrupted if its time slice has expired and other processes with the same priority wish to run (round robin within the same class)
Classic: SCHED_OTHER
schedule()
Called when:
a system call is made (indirectly, sleep_on -> schedule)
after a slow interrupt, ret_from_sys_call is called to check the need_resched flag
the timer interrupt also sets the need_resched flag
Major tasks:
run the routines that need to be called regularly
determine the process with the highest priority
make that process the current process
schedule()
303-304: cannot be called within a nested interrupt
306-310: run the bottom halves of the interrupt routines (the parts that are not time-critical), e.g., the timer interrupt
312: routines registered to be run in the scheduler (chap. 7)
318-321: if the current process belongs to the SCHED_RR class and its time slice has expired, move it to the end of the run queue
323-325: if the current process is in the TASK_INTERRUPTIBLE state and the signal it is waiting for has arrived, make it runnable again
326-333: if the current process is waiting for a timeout and the timeout has expired, make it runnable again
334-335: the current process must wait for an event; remove it from the run queue
357-364: look for the process with the highest priority
goodness (lines 235-281)
Return values:
-1000: don't select this task
0: out of time (no results)
+ve: the larger, the better
+1000: real-time process
255-256: real-time process
265: simply use p->counter as the weight
277-278: a slight favor to the current process
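The selection rule described above can be sketched in a few lines of user-space C. This is a simplified illustration under stated assumptions: sched_sketch, goodness_sketch, pick_next, and the POLICY_* constants are hypothetical names, a real-time task is assumed to weigh 1000 plus its rt_priority, the bonus for the current process is assumed to be 1, and the SMP case that yields -1000 is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical policy constants standing in for SCHED_OTHER/FIFO/RR. */
enum { POLICY_OTHER, POLICY_FIFO, POLICY_RR };

/* Hypothetical subset of the task fields the scheduler looks at. */
struct sched_sketch {
    long counter;               /* remaining ticks (dynamic priority) */
    long priority;              /* static priority */
    unsigned long policy;
    unsigned long rt_priority;
};

/* Weight of task p; prev is the process that is currently running. */
static int goodness_sketch(const struct sched_sketch *p,
                           const struct sched_sketch *prev)
{
    int weight;

    if (p->policy != POLICY_OTHER)      /* real-time beats every classic task */
        return 1000 + (int)p->rt_priority;
    weight = (int)p->counter;           /* 0 means the time slice is used up */
    if (weight != 0 && p == prev)
        weight += 1;                    /* slight favor to the current process */
    return weight;
}

/* Return the runnable task with the highest weight, or NULL if n is 0. */
static struct sched_sketch *pick_next(struct sched_sketch *tasks, int n,
                                      struct sched_sketch *prev)
{
    struct sched_sketch *next = NULL;
    int best = -1000;
    int i;

    assert(n >= 0);
    for (i = 0; i < n; i++) {
        int w = goodness_sketch(&tasks[i], prev);
        if (w > best) {
            best = w;
            next = &tasks[i];
        }
    }
    return next;
}
```

The small bonus for the current process means that, between otherwise equal tasks, the scheduler avoids an unnecessary context switch.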
367-370: if the counters of all processes are 0, recalculate them
386-401: make the new process the current process and do the context switch (switch_to())
switch_to() (include/asm-i386/system.h, lines 53-122)
104-105: if next is the current task, do nothing
106-109: clear the TS flag if the task being switched to was the last one to use the math co-processor
111-112: switch to the next task
114-120: reload the debug registers if necessary
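The counter recalculation can be sketched as follows. The formula used here, half of the old counter plus the static priority, is stated from memory of the kernel sources of that generation and should be read as an assumption; counter_sketch, all_out_of_time, and recalc_counters are hypothetical names.

```c
#include <assert.h>

/* Hypothetical subset of the task fields involved in the recalculation. */
struct counter_sketch {
    long counter;   /* remaining ticks */
    long priority;  /* static priority */
};

/* Return 1 if every task in the array has used up its time slice. */
static int all_out_of_time(const struct counter_sketch *t, int n)
{
    int i;

    for (i = 0; i < n; i++)
        if (t[i].counter != 0)
            return 0;
    return 1;
}

/* Assumed rule: new counter = old counter / 2 + static priority. */
static void recalc_counters(struct counter_sketch *t, int n)
{
    int i;

    for (i = 0; i < n; i++) {
        assert(t[i].counter >= 0);
        t[i].counter = (t[i].counter >> 1) + t[i].priority;
    }
}
```

Under this rule, a task that still had ticks left (for example because it was waiting) ends up with a larger counter than one that used its whole slice.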
System Call
IDT table
start_kernel(): call trap_init() (arch/i386/kernel/traps.c, 322)
trap_init(): trap, call set_system_gate(0x80, &system_call)
IDT[0x80]: system_call
trap 0x80: call system_call
system call: int 0x80
System call
fork(): include/asm-i386/unistd.h, 272
static inline _syscall0(int,fork)
_syscall0 (174) expands to:
int fork(void)
{
    long __res;
    __asm__ volatile ("int $0x80"
                      : "=a" (__res)
                      : "0" (__NR_fork)); /* 2 */
    if (__res >= 0)
        return (int) __res;
    errno = -__res;
    return -1;
}
system_call (arch/i386/kernel/entry.s, 281)
290: (system call number) sys_call_table[], function (sys_fork), null, trace flag
304: call function (sys_fork), system call
322: ret_from_sys_call, slow interrupt