Windows Bugcheck Analysis

Why Windows Crashes?
Windows crashes (that is, it stops executing and displays the blue screen) for many different reasons:
a reference to a memory address that causes an access violation, an unexpected exception or trap, a
faulting kernel-mode driver and so on. It's important to understand that Windows could keep running
even in the presence of serious problems, isolating the error and trying to recover in some way.
However, the detected problem could be the symptom of a deeper and more serious error that raises
further exceptions during operating system processing and finally leads to RAM and/or disk data
corruption. This is unacceptable, of course, so Windows adopts a sort of "fail fast and safe" policy:
it stops execution, switches the display to a low-resolution VGA mode, paints a blue background,
writes memory status and crash information to a file (the memory dump file) and displays a stop code
containing a message and some indications for the user. "Blue Screen Of Death", "Bugcheck" and
"Stop error" are different names for the same class of unhandled exception that occurs in kernel mode
execution and causes the system to shut down (and possibly reboot). The source of the issue can be
anything from a power fluctuation in the system to a damaged component or a software/hardware bug.
In Windows 7 and earlier versions, the BSOD looks like the following:

Figure 1: the "actual" BSOD.

In Windows 8, it looks like the following (a little less "scary" than the previous one):

Figure 2: BSOD.
It's interesting to observe the distribution of bugchecks according to their causes: the book
"Windows Internals, 5th Edition" provides the following chart, which displays the distribution of error
categories for Windows Vista SP1 in September 2008.

Figure 3: distribution of error categories.

Some Terminology
Blue screen: when the system encounters a hardware problem, data inconsistency, or similar error, it
may display a blue screen containing information that can be used to determine the cause of the error.
This information includes the STOP code and whether a crash dump file was created. It may also
include a list of loaded drivers and a stack trace.
Crash dump file: you can configure the system to write information to a crash dump file on your
hard disk whenever a STOP code is generated. The file (memory.dmp) contains information the
debugger can use to analyze the error. This file can be as big as the physical memory contained in the
computer. By default, memory.dmp is written to the Windows folder, while small memory dumps are
written to the Windows\Minidump folder.
Debugger: a program designed to help detect, locate, and correct errors in another program. It allows
the user to step through the execution of the process and its threads, monitoring memory, variables,
and other elements of process and thread context.
Kernel mode: the processor mode in which system services and device drivers run. All interfaces and
CPU instructions are available, and all memory is accessible.
Minidump file: a minidump is a smaller version of a complete or kernel memory dump. Microsoft will
usually want a kernel memory dump, but the debugger can analyze a minidump and quite possibly give
the information needed to resolve the problem. If a minidump is all you have, debug it rather than
waiting for the machine to crash again. Open the file in the debugger (see below) just as you would
open memory.dmp in the demonstration.
STOP code: the error code that identifies the error that stopped the system kernel from continuing to
run. It is the first set of hexadecimal values displayed on the blue screen. At a minimum, frontline
admins should be required to note this code, the four other codes displayed in parentheses and
any drivers identified on the screen. Often, this is all you really need.
Symbol files: all system applications, drivers, and DLLs are built such that their debugging
information resides in separate files known as symbol files. As a result, the system is smaller and
faster, yet it can still be debugged if the symbol files are available. You don't need to install the
symbol files in advance: the debugger can automatically download the ones it needs from Microsoft's
public symbol server.

The Blue Screen


Regardless of the reason for a system crash, the function that actually performs the crash
is KeBugCheckEx, documented in the Windows Driver Kit (WDK). This function takes a stop code
(also called a bugcheck code) and four parameters that must be interpreted on a per-stop-code basis.
After KeBugCheckEx masks out all interrupts on all processors of the system, it switches the display
into a low-resolution VGA graphics mode (one implemented by all Windows-supported video
cards), paints a blue background and displays the stop code, followed by some text suggesting what
the user can do. Finally, KeBugCheckEx calls any registered device driver bugcheck callbacks
(registered by calling the KeRegisterBugCheckCallback function), allowing drivers an opportunity
to stop their devices. It then calls registered reason callbacks (registered by calling
the KeRegisterBugCheckReasonCallback function), which allow drivers to append data to the
crash dump or write crash dump information to alternate devices. KeBugCheckEx displays the textual
representation of the stop code near the top of the blue screen, and the numeric stop code with its
four parameters at the bottom: the first line in the Technical Information section lists the stop code
and the four additional parameters passed to KeBugCheckEx. In some cases, system data structures
have been so seriously corrupted that the blue screen isn't displayed at all.

Identifying the Stop Error


Many different types of Stop errors occur: each has its own possible causes and requires a unique
troubleshooting process; therefore, the first step in troubleshooting a Stop error is to identify the Stop
error. You need the following information about the Stop error to begin troubleshooting:

stop error number: this number uniquely identifies the Stop error;

stop error parameters: these parameters provide additional information about the Stop
error. Their meaning is specific to the Stop error number;

driver information: when available, the driver information identifies the most likely source of
the problem. Not all Stop errors are caused by drivers, however.

This information is often displayed as part of the Stop message: if possible, write it down to use as a
reference during the troubleshooting process. If the operating system restarts before you can write
down the information, you can often retrieve the information from the "System" Event Log in Event
Viewer. If you are unable to gather the Stop error number from the Stop message and the System Log,
you can retrieve it from a memory dump file. By default, Windows is configured to create a memory
dump whenever a Stop error occurs. If no memory dump file was created, configure the system to
create a memory dump file. Then, if the Stop error reoccurs, you will be able to extract the necessary
information from the memory dump file.

Understanding the Stop Message


The Stop message reports information about the Stop error and helps the system administrator
(who understands how to interpret it) isolate and eventually resolve the problem that caused the
Stop error. The Stop message provides a great deal of useful information, including the Stop error
number, or bugcheck code. The Stop message uses a full-screen character mode format and consists
of several major sections, as shown in Figure 1, which display the following information:

Bugcheck Information: this section lists the Stop error descriptive name. Descriptive names
are directly related to the Stop error number listed in the Technical Information section.

Recommended User Action: this section informs the user that a problem has occurred and
that Windows was shut down. It also provides the symbolic name of the Stop error (in Figure
1, the symbolic name is DRIVER_IRQL_NOT_LESS_OR_EQUAL). It also attempts to describe the
problem and lists suggestions for recovery.

Technical Information: this section lists the Stop error number, also known as the
bugcheck code, followed by up to four Stop error-specific codes (displayed as hexadecimal
numbers beginning with a "0x" prefix and enclosed in parentheses), which identify related
parameters. In Figure 1, the Stop error number is 0x000000D1 (often written as 0xD1).

Driver Information: this section identifies the driver associated with the Stop error.

Debug Port and Dump Status Information: this section lists the serial communications
(COM) port parameters that a kernel debugger uses, if enabled. If you have enabled memory
dump file saves, this section also indicates whether one was successfully written.
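
As a minimal sketch of how the Technical Information line can be read programmatically, the following Python snippet splits such a line into the Stop error number and its parameters. The exact line layout (`*** STOP: ...`) and the sample parameter values are assumptions made for illustration; only the 0x000000D1 code comes from Figure 1.

```python
import re

# Assumed layout of the Technical Information line on a classic blue screen:
#   *** STOP: 0x000000D1 (0x0000000C,0x00000002,0x00000000,0xF86B5A89)
STOP_LINE = re.compile(r"STOP:\s*(0x[0-9A-Fa-f]+)\s*\(([^)]*)\)")


def parse_stop_line(line):
    """Return (stop_code, [parameters]) as integers, or None if no match."""
    match = STOP_LINE.search(line)
    if match is None:
        return None
    code = int(match.group(1), 16)
    params = [int(p, 16) for p in match.group(2).split(",") if p.strip()]
    return code, params


def short_form(code):
    """Format a Stop code the way it is often abbreviated, e.g. 0xD1."""
    return "0x{:X}".format(code)


# Hypothetical sample line; the parameter values are made up for illustration.
sample = "*** STOP: 0x000000D1 (0x0000000C,0x00000002,0x00000000,0xF86B5A89)"
code, params = parse_stop_line(sample)
print(short_form(code), [hex(p) for p in params])
```

Keeping the parameters as integers makes it easy to interpret them later according to the rules of the specific Stop error.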

Collecting a Kernel-Mode Crash Dump

Most modern desktop installations of Windows are configured to collect small memory dumps
automatically. The dump file generation settings can be configured from the "Advanced" tab of the
"System Properties" window (under "Startup and Recovery"), as you can see in Figure 4.

Figure 4: setting the dump generation options.


Table 1 summarizes the different locations that Windows uses to store the memory dump files (see
also the Microsoft Knowledge Base article KB254649, "Overview of memory dump file options for
Windows 2000, Windows XP, Windows Server 2003, Windows Vista, Windows Server 2008, Windows 7
and Windows Server 2008 R2").

Memory Dump Type     | Default Location (variable) | Default Location (typical) | Paging File Requirements
---------------------|-----------------------------|----------------------------|-------------------------------
Small memory dump    | %systemroot%\Minidump\      | c:\Windows\Minidump        | >2 MB
Kernel memory dump   | %systemroot%\Memory.dmp     | c:\Windows\Memory.dmp      | Large enough for kernel memory
Complete memory dump | %systemroot%\Memory.dmp     | c:\Windows\Memory.dmp      | All physical RAM + 1 MB

Table 1: memory dump file location and size.
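
The locations in Table 1 can be expressed as a small lookup. The following Python sketch is only an illustration: the function names are hypothetical, and `ntpath` is used so that Windows-style paths are built the same way on any platform.

```python
import ntpath
import os


def default_dump_locations(system_root):
    """Map each dump type from Table 1 to its default location."""
    memory_dmp = ntpath.join(system_root, "Memory.dmp")
    return {
        "small": ntpath.join(system_root, "Minidump"),  # a folder of minidumps
        "kernel": memory_dmp,
        "complete": memory_dmp,
    }


def complete_dump_paging_file_mb(physical_ram_mb):
    """Paging file size needed for a complete dump: all physical RAM + 1 MB."""
    return physical_ram_mb + 1


# %SystemRoot% is normally defined on Windows; the fallback is an assumption.
system_root = os.environ.get("SystemRoot", r"C:\Windows")
for dump_type, location in default_dump_locations(system_root).items():
    print(dump_type, "->", location)
```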

You can verify that the system correctly creates a dump file whenever a Stop error occurs by manually
forcing a system crash: read the "How To Manually Initiate a Windows Stop Error and Create a Dump
File (en-US)" page for further information.

Preparing the Environment


The first step is getting the Debugging Tools you need to analyze the crash dump files produced after a
system crash.
Older versions of the Debugging Tools were provided as standalone installers that you can download
from the Microsoft Windows Hardware Dev Center, taking care to download and install the
appropriate version for your system's architecture (32 bit or 64 bit); modern versions
are included with the Microsoft Windows SDK and the Windows Driver Kit.
If you decide to install the Windows SDK, be sure to select the check box that includes the Debugging
Tools in the installation, as you can see in Figure 5.

Figure 5: installing the Windows SDK.


After installation, the symbol path needs to be set so that the debugger has enough symbols to
determine what actually occurred and what was loaded. The entire symbol collection
offered to the public can be downloaded and placed on a local drive, or an Internet location can be
specified to pull the symbols on demand. I suggest pulling them from the Internet: the correct
version of the symbols will be downloaded on demand and will not become outdated when
hotfixes and service packs are installed. The Microsoft Knowledge Base article "Use the
Microsoft Symbol Server to obtain debug symbol files" (KB311503) provides the
instructions for using the Microsoft Symbol Server: basically, you
can create a folder (for example, C:\Symbols) and set the environment variable

_NT_SYMBOL_PATH = srv*c:\Symbols*http://msdl.microsoft.com/download/symbols

as you can see in Figure 6.
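
The value of _NT_SYMBOL_PATH follows the pattern srv*<local cache folder>*<symbol server URL>. As a small sketch (the helper name is hypothetical), it can be assembled like this:

```python
MICROSOFT_SYMBOL_SERVER = "http://msdl.microsoft.com/download/symbols"


def symbol_server_path(local_cache, server=MICROSOFT_SYMBOL_SERVER):
    """Build an srv*<cache>*<server> symbol path string."""
    return "srv*{}*{}".format(local_cache, server)


# Prints the same setting shown in the text, for the example folder C:\Symbols.
print("_NT_SYMBOL_PATH=" + symbol_server_path(r"c:\Symbols"))
```

The first field after "srv" is the local folder where downloaded symbols are cached, so changing the cache location only requires changing that argument.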

Figure 6: setting the _NT_SYMBOL_PATH variable.


Analyzing the Crash Dump File


Start WinDbg from the Start menu (the exact position of WinDbg will vary according to your Windows
version) and select File -> Open Crash Dump... (or press CTRL+D). Select the appropriate .DMP file and
let the debugger perform its initial operations: the kernel symbols are loaded and the debugger
displays some basic information about the analyzed system and the reported bugcheck, along with
an indication of the module that probably made the system crash.

Figure 7: starting the debugging process.
After that, you need to get detailed information about the current exception or bug check: in the lower
pane of the Command window, type the command "!analyze -v" and press ENTER (the "-v" option
displays verbose output).

Figure 8-a: analyzing the dump file (part 1).

Figure 8-b: analyzing the dump file (part 2).
As you can see, the system crashed because of a DRIVER_IRQL_NOT_LESS_OR_EQUAL bugcheck,
whose Stop code is 0x000000D1. The faulting module seems to be "e1k6232" (the image file is
e1k6232.sys): we enter the "lm" command with some options ("v" causes the display to be verbose,
including the symbol file name, the image file name, checksum information, version information, date
stamps, time stamps, and information about whether the module is managed code; "m" specifies a
pattern that the module name must match) as in the following

Figure 9: displaying module informations.
and we can get more information about that module.
Then we perform a quick search on the web (http://systemexplorer.net/db/e1k6232.sys.html) and
discover that "e1k6232.sys" is a driver belonging to the Intel Gigabit Adapter developed by Intel
Corporation: in this case, we could fix the issue by downloading and installing an updated version of
this driver (this DMP file comes from a PC that was actually affected by this problem, and updating the
driver effectively solved the issue). Further troubleshooting depends on the specific error. Some errors
may require Driver Verifier to be enabled to determine a root cause: this tool verifies that drivers
are not making illegal function calls or causing system corruption, and it can identify conditions such as
memory corruption, mishandled I/O request packets (IRPs), invalid direct memory access (DMA) buffer
usage and possible deadlocks. The !verifier extension in the kernel debugger can be used to monitor
and report on statistics related to Driver Verifier in the context of a debugging session.
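
The interactive steps in this section can also be scripted with the command-line kernel debugger that ships with the Debugging Tools. The sketch below assumes that kd.exe accepts -z to open a dump file and -c to run an initial command string, and that "lmvm" is accepted as the compact form of the "lm" command with the "v" and "m" options; the executable and dump paths are hypothetical examples.

```python
import subprocess


def build_debugger_command(debugger_exe, dump_file, commands):
    """Build an argument list that opens a dump and runs debugger commands.

    "q" is appended so the debugger exits once the commands have run.
    """
    script = "; ".join(list(commands) + ["q"])
    return [debugger_exe, "-z", dump_file, "-c", script]


def analyze_dump(debugger_exe, dump_file, module=None):
    """Run "!analyze -v" (and optionally "lmvm <module>") and return the output."""
    commands = ["!analyze -v"]
    if module:
        commands.append("lmvm " + module)
    args = build_debugger_command(debugger_exe, dump_file, commands)
    result = subprocess.run(args, capture_output=True, text=True, check=False)
    return result.stdout


# Hypothetical paths, shown only to illustrate the resulting command line.
print(build_debugger_command(r"C:\Debuggers\kd.exe",
                             r"C:\Windows\Memory.dmp",
                             ["!analyze -v", "lmvm e1k6232"]))
```

The symbol path is taken from the _NT_SYMBOL_PATH environment variable set earlier, so no symbol option is passed on the command line.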

Common Stop Messages


The following Stop error descriptions can help you to troubleshoot problems that cause Stop errors.

Stop 0xA (IRQL_NOT_LESS_OR_EQUAL)


The Stop 0xA message indicates that a kernel-mode process or driver attempted to access a memory
location to which it did not have permission or at a kernel IRQL that was too high. A kernel-mode
process can access only other processes that have an IRQL lower than or equal to its own. This Stop
message is typically the result of faulty or incompatible hardware or software. This Stop message has
four parameters:
1. memory address that was improperly referenced
2. IRQL that was required to access the memory
3. type of access (0x00 = read operation, 0x01 = write operation)
4. address of the instruction that attempted to reference memory specified in parameter 1

If the last parameter is within the address range of a device driver used by the system, the driver itself
can be determined by reading the line that begins with

**Address 0xZZZZZZZZ has base at <address>- <driver name>

If the third parameter is the same as the first parameter, a special condition exists in which a system
worker routine (carried out by a worker thread to handle background tasks known as work items)
returned at a higher IRQL. In that case, some of the four parameters take on new meanings:
1. address of the worker routine
2. kernel IRQL
3. address of the worker routine
4. address of the work item
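
The two interpretations of the Stop 0xA parameters described above can be sketched as follows. The function and key names are hypothetical, and the rule "third parameter equals first parameter" is applied literally as stated in the text, so a read of address 0 would match it as well; treat the result as a hint, not a diagnosis.

```python
ACCESS_TYPES = {0x00: "read", 0x01: "write"}


def describe_stop_0xa(p1, p2, p3, p4):
    """Label the four Stop 0xA parameters according to the two cases above."""
    if p3 == p1:
        # Special condition: a system worker routine returned at a higher IRQL.
        return {
            "case": "worker routine returned at a higher IRQL",
            "worker_routine": p1,
            "kernel_irql": p2,
            "work_item": p4,
        }
    return {
        "case": "improper memory reference",
        "referenced_address": p1,
        "irql": p2,
        "access": ACCESS_TYPES.get(p3, "unknown"),
        "instruction_address": p4,
    }


# Hypothetical parameter values, made up for illustration.
print(describe_stop_0xa(0x0000000C, 0x2, 0x01, 0xF86B5A89))
```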

To resolve an error caused by a faulty device driver, system service or basic input/output
system (BIOS), follow these steps:
1. restart the system;
2. press F8 at the character-based menu that displays the operating system choices;
3. select the Last Known Good Configuration option from the Windows Advanced Options menu;
   this option is most effective when only one driver or service is added at a time.

To resolve an error caused by an incompatible device driver, system service, virus scanner or
backup tool, follow these steps:
1. check the System Log in Event Viewer for error messages that might identify the device or
   driver that caused the error;
2. try disabling memory caching of the BIOS;
3. run the hardware diagnostics supplied by the system manufacturer, especially the memory
   scanner;
4. make sure the latest Service Pack and Windows updates are installed;
5. if the system has small computer system interface (SCSI) adapters, contact the adapter
   manufacturer to obtain updated Windows drivers. Try disabling sync negotiation in the SCSI
   BIOS, checking the cabling and the SCSI IDs of each device and confirming proper termination;
6. for integrated device electronics (IDE) devices, define the onboard IDE port as Primary only.
   Also, check each IDE device for the proper master/subordinate/stand-alone setting. Try
   removing all IDE devices except for hard disks.

If the Stop 0xA message is encountered while upgrading to a newer Windows version, the problem
might be due to an incompatible driver, system service, virus scanner or backup. To avoid problems
while upgrading, simplify hardware configuration and remove all third-party device drivers and system
services (including virus scanners) prior to running setup. After successfully installing Windows,
contact the hardware manufacturer to obtain compatible updates.
If the Stop error occurs when resuming from hibernation or suspend, read the Microsoft Knowledge
Base articles 941492 and 945577.
If the Stop error occurs when starting a mobile computer that has the lid closed, refer to the Microsoft
Knowledge Base article 941507.

Stop 0xD1 (DRIVER_IRQL_NOT_LESS_OR_EQUAL)


The Stop 0xD1 message indicates that the system attempted to access pageable memory using a
kernel process IRQL that was too high. Drivers that have used improper addresses typically cause this
error. This Stop message has four parameters:
1. memory referenced
2. IRQL at time of reference
3. type of access (0x00 = read operation, 0x01 = write operation)
4. address that referenced memory

Stop 0xD1 messages can occur after you install faulty drivers or system services. If a driver is
listed by name, disable, remove, or roll back that driver to resolve the error. If disabling or removing
drivers resolves the error, contact the manufacturer about a possible update. Using updated software
is especially important for backup programs, multimedia applications, antivirus scanners, DVD
playback, and CD mastering tools.

Stop 0x00000124 (WHEA_UNCORRECTABLE_ERROR)

The Stop 0x00000124 message indicates that a fatal hardware error has occurred, as reported through
the Windows Hardware Error Architecture (WHEA). A common case is a problem handling a
PCI-Express device: most often, this occurs when adding or removing a hot-pluggable PCI-Express
card; however, it can also occur with driver- or hardware-related problems for PCI-Express cards.
To troubleshoot 0x00000124 stop errors, first make sure you have applied all Windows updates and
driver updates. If you recently updated a driver, roll back the change. If the stop error continues to
occur, remove PCI-Express cards one by one to identify the problematic hardware. When you have
identified the card causing the problem, contact the hardware manufacturer for further troubleshooting
assistance. The driver might need to be updated, or the card itself could be faulty.
The meanings of the parameters are described in Table 2.

Parameter 1 | Parameter 2 | Parameter 3 | Parameter 4 | Cause of error
------------|-------------|-------------|-------------|---------------
0x0 | Address of WHEA_ERROR_RECORD structure. | High 32 bits of MCi_STATUS MSR for the MCA bank that had the error. | Low 32 bits of MCi_STATUS MSR for the MCA bank that had the error. | A machine check exception occurred. These parameter descriptions apply if the processor is based on the x64 architecture, or the x86 architecture that has the MCA feature available (for example, Intel Pentium Pro, Pentium IV, or Xeon).
0x1 | Address of WHEA_ERROR_RECORD structure. | Reserved. | Reserved. | A corrected machine check exception occurred.
0x2 | Address of WHEA_ERROR_RECORD structure. | Reserved. | Reserved. | A corrected platform error occurred.
0x3 | Address of WHEA_ERROR_RECORD structure. | Reserved. | Reserved. | A nonmaskable Interrupt (NMI) error occurred.
0x4 | Address of WHEA_ERROR_RECORD structure. | Reserved. | Reserved. | An uncorrectable PCI Express error occurred.
0x5 | Address of WHEA_ERROR_RECORD structure. | Reserved. | Reserved. | A generic hardware error occurred.
0x6 | Address of WHEA_ERROR_RECORD structure. | Reserved. | Reserved. | An initialization error occurred.
0x7 | Address of WHEA_ERROR_RECORD structure. | Reserved. | Reserved. | A BOOT error occurred.
0x8 | Address of WHEA_ERROR_RECORD structure. | Reserved. | Reserved. | A Scalable Coherent Interface (SCI) generic error occurred.
0x9 | Address of WHEA_ERROR_RECORD structure. | Length, in bytes, of the SAL log. | Address of the SAL log. | An uncorrectable Itanium-based machine check abort error occurred.
0xA | Address of WHEA_ERROR_RECORD structure. | Reserved. | Reserved. | A corrected Itanium-based machine check error occurred.
0xB | Address of WHEA_ERROR_RECORD structure. | Reserved. | Reserved. | A corrected Itanium platform error occurred.

Table 2: meanings of the parameters.
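
Table 2 lends itself to a simple lookup keyed on parameter 1. The following sketch uses hypothetical function and key names and made-up sample values; it returns the cause of the error and, for the rows where parameters 3 and 4 are not reserved, the extra values they carry.

```python
# Cause of error for each value of parameter 1, as listed in Table 2.
WHEA_CAUSES = {
    0x0: "machine check exception",
    0x1: "corrected machine check exception",
    0x2: "corrected platform error",
    0x3: "nonmaskable interrupt (NMI) error",
    0x4: "uncorrectable PCI Express error",
    0x5: "generic hardware error",
    0x6: "initialization error",
    0x7: "BOOT error",
    0x8: "Scalable Coherent Interface (SCI) generic error",
    0x9: "uncorrectable Itanium-based machine check abort error",
    0xA: "corrected Itanium-based machine check error",
    0xB: "corrected Itanium platform error",
}


def describe_stop_0x124(p1, p2, p3, p4):
    """Interpret the four Stop 0x124 parameters according to Table 2."""
    info = {
        "cause": WHEA_CAUSES.get(p1, "unknown"),
        "whea_error_record": p2,  # address of the WHEA_ERROR_RECORD structure
    }
    if p1 == 0x0:
        # Parameters 3 and 4 are the high and low halves of MCi_STATUS.
        info["mci_status"] = (p3 << 32) | p4
    elif p1 == 0x9:
        info["sal_log_length"] = p3
        info["sal_log_address"] = p4
    return info


# Hypothetical parameter values, made up for illustration.
print(describe_stop_0x124(0x0, 0x85A3B024, 0xB2000000, 0x00000135))
```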
