The Windows Operating System

Goals
Hardware-portable
Used to support MIPS, PowerPC, and Alpha
Currently supports x86, ia64, and amd64
Multiple vendors build hardware

Software-portable
POSIX, OS/2, and Win32 subsystems
OS/2 is dead
POSIX is still supported, as a separate product
Lots of Win32 software out there in the world

Goals
High performance
Anticipated PC speeds approaching minicomputers and mainframes
Async IO model is standard
Support for large physical memories
SMP was an early design goal
Designed to support multi-threaded processes
Kernel has to be reentrant

Process Model
Threads and processes are distinct
Process:
Address space
Handle table (Handles => file descriptors)
Process default security token

Thread:
Execution Context
Optional thread-specific security token

Tokens
Who you are: a list of identities
Each identity is a SID (security identifier)

Also contains Privileges
Shutdown, Load drivers, Backup, Debug

Can be passed through LPC ports and named pipe requests
Server side can use this to selectively impersonate the client.

Object Manager
Uniform interface to kernel mode objects.
Handles are 32-bit opaque integers
Per-process handle table maps handles to objects and permissions on the objects
Implements reference-counted garbage collection
Pointer count: total number of references
Handle count: number of open handles
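
The two counts can be illustrated with a small model. This is only a sketch under stated assumptions: the struct, its field names, and the function names are hypothetical and do not reflect the real kernel object header. It assumes that each open handle also contributes one pointer reference, so the object is destroyed only when the pointer count reaches zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical object header; field names are illustrative, not the kernel's. */
typedef struct object_header {
    long pointer_count; /* every reference, including one per open handle */
    long handle_count;  /* open handles only */
    bool destroyed;     /* set once the last reference is dropped */
} object_header;

void object_init(object_header *obj) {
    assert(obj != NULL);
    obj->pointer_count = 0;
    obj->handle_count = 0;
    obj->destroyed = false;
}

/* Kernel code takes a direct pointer reference. */
void object_reference(object_header *obj) {
    assert(obj != NULL && !obj->destroyed);
    obj->pointer_count++;
}

/* Drop a pointer reference; the object goes away when the count hits zero. */
void object_dereference(object_header *obj) {
    assert(obj != NULL && obj->pointer_count > 0);
    if (--obj->pointer_count == 0) {
        obj->destroyed = true;
    }
}

/* Opening a handle counts as both a handle and a pointer reference. */
void object_open_handle(object_header *obj) {
    object_reference(obj);
    obj->handle_count++;
}

/* Closing a handle drops the handle count and the matching pointer reference. */
void object_close_handle(object_header *obj) {
    assert(obj != NULL && obj->handle_count > 0);
    obj->handle_count--;
    object_dereference(obj);
}
```

In this model an object with no open handles can still stay alive as long as some kernel component holds a pointer reference to it.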

Object Manager
Implements an object namespace
Win32 objects are under \BaseNamedObjects
Devices under \Device
This includes filesystems

Drive letters are symbolic links
\??\C: => the appropriate filesystem device

Some things have other names
Processes and threads are opened by specifying a CID: (Process.Thread)

Standard operations on handles


CloseHandle()
DuplicateHandle()
Takes source and destination process
Very useful for servers

WaitForSingleObject(), WaitForMultipleObjects()
Wait for something to happen
Can wait on up to 64 handles at once

Security Descriptors
Each object has a Security Descriptor
Owner: special SID, CREATOR_OWNER
Group: special SID, CREATOR_GROUP
DACL
Discretionary Access Control List
List of SIDs and granted or denied access rights

SACL
System Access Control List
List of SIDs and access rights to be audited

Access Rights
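typedef struct _ACCESS_MASK {
    USHORT SpecificRights;      /* bits 0-15: object-type-specific rights */
    UCHAR StandardRights;       /* bits 16-23: standard rights (DELETE, SYNCHRONIZE, ...) */
    UCHAR AccessSystemAcl : 1;  /* bit 24: access to the SACL */
    UCHAR Reserved : 3;         /* bits 25-27 */
    UCHAR GenericAll : 1;       /* bit 28 */
    UCHAR GenericExecute : 1;   /* bit 29 */
    UCHAR GenericWrite : 1;     /* bit 30 */
    UCHAR GenericRead : 1;      /* bit 31 */
} ACCESS_MASK;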

Security Use
Objects are referred to via handles
Security checks occur when an object is opened
Open requests contain a mask of requested access rights
If granted to the token by the DACL, the handle contains those access rights

Access rights are checked on use
Just a bit test: very fast

Object Open
evt = OpenEvent(EVENT_MODIFY_STATE, FALSE, "SomeName");
Finds the event object by name
Walks the DACL, looking for token SIDs
Keeps looking until all requested permissions are granted
If access is granted, inserts a handle to the object into the process's handle table, with EVENT_MODIFY_STATE access
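
The DACL walk described on this slide can be sketched roughly as follows. This is a simplified model, not the kernel's access check: SIDs are plain integers, the `ace` struct and `access_check()` are hypothetical names, and the sketch assumes that a matching deny entry for a still-needed right ends the walk with a failure. The two access-right constants use the values from the Windows SDK headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EVENT_MODIFY_STATE 0x0002u
#define SYNCHRONIZE        0x00100000u

typedef enum { ACE_ALLOW, ACE_DENY } ace_type;

/* Hypothetical access control entry; a SID is just an integer here. */
typedef struct {
    ace_type type;
    int      sid;
    uint32_t mask;
} ace;

/* Does the token's identity list contain this SID? */
bool token_has_sid(const int *sids, size_t n_sids, int sid) {
    for (size_t i = 0; i < n_sids; i++) {
        if (sids[i] == sid) {
            return true;
        }
    }
    return false;
}

/* Walk the DACL in order until every requested right has been granted. */
bool access_check(const int *token_sids, size_t n_sids,
                  const ace *dacl, size_t n_aces, uint32_t desired) {
    uint32_t remaining = desired; /* rights not yet granted */
    assert(token_sids != NULL && dacl != NULL);
    for (size_t i = 0; i < n_aces && remaining != 0; i++) {
        if (!token_has_sid(token_sids, n_sids, dacl[i].sid)) {
            continue; /* entry is for an identity the token does not hold */
        }
        if (dacl[i].type == ACE_DENY) {
            if (dacl[i].mask & remaining) {
                return false; /* a still-needed right is explicitly denied */
            }
        } else {
            remaining &= ~dacl[i].mask; /* these rights are now granted */
        }
    }
    return remaining == 0;
}
```

Because entries are evaluated in order, the placement of deny entries relative to allow entries changes the outcome in this model.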

Object Use
SetEvent(evt);
SetEvent() requires EVENT_MODIFY_STATE access, and an event object.
The kernel looks up the handle in the process's handle table.
Checks to make sure that it maps to an event object, and that the granted access bits contain the EVENT_MODIFY_STATE bit.
If all is good, the event is set.

Object Use
WaitForSingleObject(evt, INFINITE);
WaitForSingleObject() requires a synchronization object (like an event) and SYNCHRONIZE access.
evt maps to an event object
SYNCHRONIZE access was not requested when the handle was inserted.
Even if the DACL permits it, the wait fails.
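
The check on use can be modeled as a table lookup plus a bit test. The sketch below is an illustration only: `handle_table`, `handle_insert()`, and `handle_check()` are hypothetical names, the table is a tiny fixed array, and the required object type is reduced to a single enum value. The access-right constants use the Windows SDK values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EVENT_MODIFY_STATE 0x0002u
#define SYNCHRONIZE        0x00100000u
#define TABLE_SIZE         8

typedef enum { OBJ_NONE = 0, OBJ_EVENT, OBJ_FILE } object_type;

/* Hypothetical handle table entry: object type plus the rights granted at open. */
typedef struct {
    object_type type;
    uint32_t    granted;
} handle_entry;

typedef struct {
    handle_entry slots[TABLE_SIZE];
} handle_table;

void handle_table_init(handle_table *t) {
    assert(t != NULL);
    for (int h = 0; h < TABLE_SIZE; h++) {
        t->slots[h].type = OBJ_NONE;
        t->slots[h].granted = 0;
    }
}

/* Record the rights granted by the open-time security check; returns the handle. */
int handle_insert(handle_table *t, object_type type, uint32_t granted) {
    assert(t != NULL && type != OBJ_NONE);
    for (int h = 0; h < TABLE_SIZE; h++) {
        if (t->slots[h].type == OBJ_NONE) {
            t->slots[h].type = type;
            t->slots[h].granted = granted;
            return h;
        }
    }
    return -1; /* table full */
}

/* Use-time check: right object type, and all required bits were granted at open. */
bool handle_check(const handle_table *t, int h,
                  object_type required_type, uint32_t required_access) {
    assert(t != NULL);
    if (h < 0 || h >= TABLE_SIZE) {
        return false;
    }
    const handle_entry *e = &t->slots[h];
    if (e->type == OBJ_NONE || e->type != required_type) {
        return false;
    }
    return (e->granted & required_access) == required_access;
}
```

A handle inserted with only EVENT_MODIFY_STATE passes the check for setting the event and fails the check for waiting, without the DACL being consulted again.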

Types of Objects
Events
State is set or clear.
Can clear when a wait completes (auto-reset)

Mutexes
Can be acquired by a single thread at a time.
Automatically released when the owner exits.

Semaphores
Maintain a count
Waits decrement the count
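
The difference between these objects shows up in what a successful wait does to the object's state. The following single-threaded model is a sketch under simplifying assumptions: the structs and function names are hypothetical, and a "wait" is reduced to a non-blocking attempt that either succeeds or fails.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event model: auto_reset selects auto-reset vs. manual-reset. */
typedef struct {
    bool signaled;
    bool auto_reset;
} event_model;

/* Hypothetical semaphore model. */
typedef struct {
    long count;
    long max_count;
} semaphore_model;

void event_set(event_model *e) {
    assert(e != NULL);
    e->signaled = true;
}

/* A satisfied wait clears an auto-reset event; a manual-reset event stays set. */
bool event_try_wait(event_model *e) {
    assert(e != NULL);
    if (!e->signaled) {
        return false;
    }
    if (e->auto_reset) {
        e->signaled = false;
    }
    return true;
}

/* A satisfied wait decrements the semaphore count. */
bool semaphore_try_wait(semaphore_model *s) {
    assert(s != NULL);
    if (s->count == 0) {
        return false;
    }
    s->count--;
    return true;
}

/* Releasing increments the count, up to the maximum. */
bool semaphore_release(semaphore_model *s) {
    assert(s != NULL);
    if (s->count >= s->max_count) {
        return false;
    }
    s->count++;
    return true;
}
```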

More objects
Threads, Processes, Timers: like events
Registry Keys
Manipulate data in the registry: centralized store of system configuration info.

LPC Ports
Fast local RPC
Security tokens can transfer over LPC calls

Files

Files & IO
File objects maintain a current offset, and a pointer to the underlying stream.
Default internal model is asynchronous
Synchronous IO just waits for the IO to complete
Async IO can set an event, or run a callback in the thread which queued the IO, or post a message to an IO completion port.

Each request is an IRP (IO Request Packet)

IRPs
Maintain state of IO requests, independent of the thread working on the IO
IRPs are handed off through the device stack to their destinations
Threads process IRPs
Initiating thread processes the IRP until a device returns STATUS_PENDING
Subsequent processing can be done in kernel worker threads

Interrupts
IRQL: Interrupt Request Level
0 => PASSIVE_LEVEL
Processor is running threads
All usermode code is at IRQL 0

1 => APC_LEVEL: threads, APCs disabled
2 => DISPATCH_LEVEL
Running as the processor: can't stop!
Can't take a page fault
Only locks available are KSPIN_LOCKs

Interrupts
3-26 => Device Interrupt Service Routines
Device interrupts are mapped to an IRQL and an interrupt service routine; ISR is called at that IRQL

27 => PROFILE_LEVEL: profiling
28 => CLOCK2_LEVEL: clock interrupt
29 => IPI_LEVEL: interprocessor interrupt
Requests another processor to do something

30 => POWER_LEVEL: power failure
31 => HIGH_LEVEL: interrupts disabled
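
The level numbers on these two slides can be collected into one enum, with a few helper predicates that restate the rules given above. This is a sketch: the numeric values are the ones listed on the slides, while `DEVICE_LEVEL_MIN`, `DEVICE_LEVEL_MAX`, and the helper function names are hypothetical. The last helper assumes the usual rule that an interrupt is delivered only when its level is above the processor's current IRQL.

```c
#include <assert.h>
#include <stdbool.h>

/* IRQL values as listed on the slides; the DEVICE_LEVEL_* names are made up here. */
typedef enum {
    PASSIVE_LEVEL    = 0,
    APC_LEVEL        = 1,
    DISPATCH_LEVEL   = 2,
    DEVICE_LEVEL_MIN = 3,
    DEVICE_LEVEL_MAX = 26,
    PROFILE_LEVEL    = 27,
    CLOCK2_LEVEL     = 28,
    IPI_LEVEL        = 29,
    POWER_LEVEL      = 30,
    HIGH_LEVEL       = 31
} irql;

bool irql_valid(int level) {
    return level >= PASSIVE_LEVEL && level <= HIGH_LEVEL;
}

/* Page faults can only be serviced below DISPATCH_LEVEL. */
bool irql_can_page_fault(int level) {
    assert(irql_valid(level));
    return level < DISPATCH_LEVEL;
}

/* Levels 3-26 are assigned to device interrupt service routines. */
bool irql_is_device_level(int level) {
    assert(irql_valid(level));
    return level >= DEVICE_LEVEL_MIN && level <= DEVICE_LEVEL_MAX;
}

/* Assumed rule: an interrupt is delivered only if it outranks the current IRQL. */
bool irql_interrupt_deliverable(int current, int incoming) {
    assert(irql_valid(current) && irql_valid(incoming));
    return incoming > current;
}
```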

Interrupts
Hardware signals an interrupt
Interrupt's ISR runs at device IRQL
Has to be fast; get off the processor and allow other ISRs to run
Typically queues a DPC, acknowledges the interrupt, and returns

DPC: Deferred Procedure Call
Further processing at DISPATCH_LEVEL
Queues work to kernel worker threads

IO Completion
Driver calls IO Manager to complete the IRP
IO Manager queues a kernel mode APC to the initiating thread
APC: Asynchronous Procedure Call
Kernel mode APC preempts thread execution
Writes data back to user mode in the context of the thread which initiated the IO
Signals completion of the IO

IO Cache
Classic: block cache
Page mappings translate directly to blocks on the underlying partition.

Windows: stream cache


Page mappings are offsets within a stream.
IO Cache Manager uses the same mappings.
All cache management (trimming) is centralized in the memory manager
All modifications show up in mapped views.

Virtual Memory
Sections: another object type
Can be created to map a file
Can also be created off the pagefile
Optionally named, for shared memory

Reservation
Range of VA which will not be handed out for some other purpose

Committed
VA which actually maps to something

Aside: CreateProcess
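Just a user mode Win32 API
/* argument lists abbreviated */
{
    NtCreateFile(&file, szImage);   /* open the executable image */
    NtCreateSection(&sec, file);    /* create a section backed by the image */
    NtCreateProcess(&proc, sec);    /* new process mapped from the section */
    NtCreateThread(&thrd, proc);    /* initial thread in the new process */
}
WaitForSingleObject(proc);          /* process handles are waitable */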

Virtual Memory
Memory Manager maintains processor-specific page table entry mappings.
Some parts of the address space are shared between processes: for instance, the kernel's address space and the per-session space.

On a page fault, the memory manager reads in the data
Pages can be mapped without the appropriate access: what to do?

Signals
With threads, signals don't work very well.
Some software designs expect to touch inaccessible memory.
Large structured files
Concurrent garbage collection
SLists

Single global handler has to somehow know about all possible situations.

Structured Exception Handling


Exceptions unwind the stack
Almost like C++!
C++ matches against a type hierarchy
SEH calls exception filter code: filters are Turing-complete.

Two ways to deal with exceptions:
try/finally
try/except

try/finally
res = AllocateSomeResource();
try {
    SomeOperation(res);
} finally {
    /* free only when an exception is unwinding; on success the caller owns res */
    if (AbnormalTermination()) {
        FreeSomeResource(res);
    }
}
return res;

try/except
try {
    SomeOperationWhichMayAV();   /* may raise an access violation */
} except (Filter(GetExceptionCode(), GetExceptionInformation())) {
    DoSomethingElse();
}

try/except
GetExceptionCode()
A code indicating the cause of the exception

GetExceptionInformation()
Additional code-specific info
The full processor context

Filter decides what to do
EXCEPTION_EXECUTE_HANDLER
EXCEPTION_CONTINUE_SEARCH
EXCEPTION_CONTINUE_EXECUTION

Structured Exception Handling


On x86, the TEB points to a stack of EXCEPTION_REGISTRATION_RECORDs
auto structs, pointing to handler code
pushed by function prolog
popped by function epilog

On exception, RtlDispatchException() walks the list.
Runs the filters to figure out what to do
Calls handler functions
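
The list walk can be modeled in portable C. This is a deliberately simplified sketch, not the real dispatcher: in this model each record carries its own filter function directly, and `registration_record`, `dispatch_exception()`, and the two sample filters are hypothetical names. The three filter return values and the access violation status code use the values defined in the Windows SDK headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EXCEPTION_EXECUTE_HANDLER      1
#define EXCEPTION_CONTINUE_SEARCH      0
#define EXCEPTION_CONTINUE_EXECUTION (-1)
#define STATUS_ACCESS_VIOLATION      0xC0000005u

typedef int (*filter_fn)(uint32_t code);

/* Hypothetical registration record: innermost frame first, linked outward. */
typedef struct registration_record {
    struct registration_record *next; /* next outer frame, NULL at the top */
    filter_fn filter;                 /* decides what to do with the exception */
    int frame_id;                     /* identifies the frame for this demo */
} registration_record;

typedef enum {
    DISPATCH_UNHANDLED, /* no filter claimed the exception */
    DISPATCH_RESUMED,   /* a filter asked to continue at the faulting point */
    DISPATCH_HANDLED    /* a filter chose to run its handler */
} dispatch_result;

/* Walk the records from innermost to outermost, running each filter. */
dispatch_result dispatch_exception(const registration_record *head,
                                   uint32_t code, int *handled_by) {
    assert(handled_by != NULL);
    for (const registration_record *r = head; r != NULL; r = r->next) {
        int decision = r->filter(code);
        if (decision == EXCEPTION_EXECUTE_HANDLER) {
            *handled_by = r->frame_id;
            return DISPATCH_HANDLED;
        }
        if (decision == EXCEPTION_CONTINUE_EXECUTION) {
            return DISPATCH_RESUMED;
        }
        /* EXCEPTION_CONTINUE_SEARCH: try the next outer frame */
    }
    return DISPATCH_UNHANDLED;
}

/* Sample filter: handle access violations, pass everything else outward. */
int filter_access_violation_only(uint32_t code) {
    return code == STATUS_ACCESS_VIOLATION ? EXCEPTION_EXECUTE_HANDLER
                                           : EXCEPTION_CONTINUE_SEARCH;
}

/* Sample filter: never handles anything. */
int filter_never(uint32_t code) {
    (void)code;
    return EXCEPTION_CONTINUE_SEARCH;
}
```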

Structured Exception Handling


On x86, there's some overhead with pushing and popping the registration record
On ia64, there is no overhead
Stack traces are reliable
It's always possible to look up the handler

Exception handling is very slow
Especially on ia64

Used only for truly exceptional conditions

Structured Exception Handling


Used in kernel mode too!
Most user mode access will just work
Still need to validate address ranges & data
Works great for SMP when another thread might be in the middle of modifying the address space
Expected read exceptions are returned as status codes from system calls
Expected write exceptions are returned as SUCCESS
Unexpected => buggy kernel => blue screen

Top-level Exception Filter


Top frame on each thread defines a catchall exception filter
Top-level exception filter:
Notifies the debugger (if being debugged)
Launches a just-in-time debugger (if set up)
Loads faultrep.dll to report the failure

Faultrep.dll
faultrep.dll offers to report the failure back to Microsoft
We analyze the failures
A significant number are recognized instantly; we can tell the user what happened and how to fix it.
The others go through the standard triage process; developers analyze the dumps and figure out what happened.

OCA
67 million machines running XP
Tens of thousands of drivers
Over 100 drivers on any given machine
One bug in one driver => Crash
A significant number of crashes come from third-party drivers (some of which ship on the CD)
Lots of different problems, though

Driver Verifier
Controlled by verifier.exe
Special-pools allocations
Detects allocation overruns & use after free

Validates some behaviors
IRQL: touching paged memory?
DMA buffers

Can inject failures: useful for testing behavior under sub-optimal conditions

Stress
Every night, a couple hundred machines run stress on the latest build
Stress exercises filesystems, memory, GUI, scheduler, etc., trying to uncover low-memory handling problems and race conditions
Every morning, the stress test team triages failed machines
Developers debug the failures

Questions?
