Jose L. Flores, All Rights Reserved.
The concept of an Interrupt ReQuest Level (IRQL) associated with each processor running Windows NT is fairly simple to understand. It is well documented (practically every device driver book devotes a significant portion of its coverage to it) and, at a conceptual level, well understood. So why the need for this document? While what IRQL does is well understood, what IRQL actually is is an entirely different matter. The internals of the IRQL implementation are so poorly understood that when I turned to a very well-known internals expert for help in understanding the IRQL dance, I was offered completely incorrect answers as to why KeRaiseIrql() and its relatives actually do anything meaningful, and why they should ever be called at all on a uniprocessor machine. This document explains these mysteries (for x86 machines, anyway), as well as how IRQL and DPCs are intertwined.
The purpose of associating an IRQL with a processor is to act as a synchronization mechanism. There are N IRQ levels, and each of the M interrupt vectors in the system is assigned one of them as its priority. The idea behind the IRQL concept is that an interrupt service routine (ISR) running at some IRQL will not be interrupted by a vector whose associated IRQL is less than or equal to the current IRQL. Likewise, a normal kernel mode routine running at a raised IRQL should not be interrupted by an ISR whose associated IRQL is less than or equal to the current raised IRQL. For example, if your driver needs to do a high precision calibration of expected bus transfer rates, processor speed, etc., you should be able to raise to the highest IRQL and expect that no interrupts (other than an unlikely NMI) will be acknowledged. Unfortunately, this is not the case. On most uniprocessor implementations, interrupts occur and are acknowledged even when code is executing at the highest IRQL.
So who's responsible for this? Clearly, synchronization handling involving interrupts is architecture specific. Not surprisingly, IRQL is completely dependent on the implementation of the HAL for a particular architecture. There are many HALs for x86 processors, but by far the most widely used are the x86 version of halmps.dll, for architectures conforming to the Intel MultiProcessor Specification, and the x86 version of hal.dll, for plain vanilla uniprocessor machines.
IRQL on an SMP machine running the halmps.dll HAL is IRQL the way it was meant to be. This implementation makes extensive use of the Intel Advanced Programmable Interrupt Controller (APIC) for everything from timers and thread scheduling to profiling and performance counting. IRQL on this HAL is defined completely by the Task Priority Register (TPR) inside each processor's Local APIC. Each change in the value of a processor's IRQL is matched by a corresponding change in that processor's TPR. The HAL internally maintains lookup tables for converting IRQL values to TPR values and vice versa. These are kept in two arrays of bytes, HalpIRQLtoTPR and HalpTPRtoIRQL. When the IRQL is lowered, the highest-priority pending interrupt that is no longer blocked fires immediately and its ISR begins executing. This process is repeated until all pending interrupts (including the software interrupts used to request APCs and DPCs) are drained. Finally, control is returned to the original interrupted code.
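The SMP behavior described above can be modeled in user mode as a rough sketch. Everything here is an assumption made for illustration: the four-entry table, its values, and all smp_ names are invented and do not reproduce the real HalpIRQLtoTPR contents. The only hardware rule relied on is the Local APIC one, that a vector is delivered only when its priority class (bits 7:4) is above the priority class held in the TPR. ISR bodies are reduced to a log entry.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned char KIRQL;

/* Invented for illustration: four IRQLs and the TPR value each maps to.
   The real HalpIRQLtoTPR contents are not reproduced here. */
static const unsigned char smp_irql_to_tpr[4] = { 0x00, 0x30, 0x40, 0xF0 };

static unsigned char smp_tpr;            /* stands in for the Local APIC TPR */
static unsigned char smp_pending[8];     /* vectors held back by the TPR     */
static size_t        smp_pending_count;
static unsigned char smp_delivered[8];   /* order in which "ISRs" ran        */
static size_t        smp_delivered_count;

/* Local APIC rule: deliver only if the vector's priority class (bits 7:4)
   is above the TPR's priority class. */
static int smp_deliverable(unsigned char vector)
{
    return (vector >> 4) > (smp_tpr >> 4);
}

/* A device raises `vector`: it runs at once if allowed, else stays pending. */
static void smp_raise_interrupt(unsigned char vector)
{
    if (smp_deliverable(vector)) {
        assert(smp_delivered_count < 8);
        smp_delivered[smp_delivered_count++] = vector;
    } else {
        assert(smp_pending_count < 8);
        smp_pending[smp_pending_count++] = vector;
    }
}

/* Change IRQL by writing the TPR, then drain whatever is now unblocked,
   highest vector first. */
static void smp_set_irql(KIRQL irql)
{
    assert(irql < 4);
    smp_tpr = smp_irql_to_tpr[irql];
    for (;;) {
        size_t best = smp_pending_count;   /* sentinel: nothing found yet */
        for (size_t i = 0; i < smp_pending_count; i++) {
            if (smp_deliverable(smp_pending[i]) &&
                (best == smp_pending_count ||
                 smp_pending[i] > smp_pending[best]))
                best = i;
        }
        if (best == smp_pending_count)
            break;
        assert(smp_delivered_count < 8);
        smp_delivered[smp_delivered_count++] = smp_pending[best];
        smp_pending[best] = smp_pending[--smp_pending_count];
    }
}
```

Raising to the top level, queuing a few vectors, and then lowering in two steps shows the drain order: only vectors whose class exceeds the new TPR class run at each step, highest first.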
While IRQL on a uniprocessor machine prevents the execution of core ISRs for lower priority interrupt vectors, it does not prevent those interrupts from firing and at least part of the ISR from being executed. IRQL on a uniprocessor machine is purely artificial. As shown below, when you raise or lower IRQL on a uniprocessor machine you are literally just changing a byte value inside the PCR structure for that processor.
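VOID KeRaiseIrql( IN KIRQL NewIrql, OUT PKIRQL OldIrql )
{
    /* Hand the current level back to the caller, then overwrite the PCR byte. */
    *OldIrql = PCR->CurrentIrql;
    PCR->CurrentIrql = NewIrql;
}
Possible implementation of KeRaiseIrql() on the uniprocessor HAL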
There is absolutely no change to the state of any hardware anywhere. There is no hidden mapping of the PCR->CurrentIrql memory location to some hardware register. Because there is no change to any hardware, it should not come as a surprise that there is nothing to prevent lower priority interrupts from firing even when running at the highest IRQL possible. The only way to physically prevent interrupts from firing on the standard HAL implementation is to either execute a CLI instruction to disable all interrupts, or write to the PIC's IMR register directly to mask specific ones. The HAL does offer two undocumented APIs to truly disable or enable interrupts: HalDisableSystemInterrupt() and HalEnableSystemInterrupt(), respectively.
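A small user-mode model makes the point concrete. It is only a sketch under stated assumptions: struct pic_cpu and both pic_ functions are hypothetical names, a single 8-line controller stands in for the PIC, and interrupt_flag stands in for the processor's IF bit (cleared by CLI, set by STI). The thing to notice is that the current IRQL never enters into whether a line reaches the processor.

```c
#include <assert.h>

typedef unsigned char KIRQL;

/* Hypothetical model of the state that matters; not real kernel structures. */
struct pic_cpu {
    KIRQL         current_irql;    /* the PCR byte the text describes          */
    unsigned char imr;             /* PIC interrupt mask register, 1 = masked  */
    int           interrupt_flag;  /* IF bit: 0 after CLI, 1 after STI         */
};

/* Raising IRQL as described in the text: only the software byte changes. */
static void pic_raise_irql(struct pic_cpu *cpu, KIRQL new_irql, KIRQL *old_irql)
{
    *old_irql = cpu->current_irql;
    cpu->current_irql = new_irql;
}

/* Does a request on `line` reach the processor? current_irql plays no
   part in the answer; only the IF bit and the mask register do. */
static int pic_line_fires(const struct pic_cpu *cpu, unsigned line)
{
    assert(line < 8);
    return cpu->interrupt_flag && !((cpu->imr >> line) & 1u);
}
```

In this model, raising to the highest x86 IRQL (31) leaves every line able to fire; setting a line's IMR bit or clearing interrupt_flag is what actually stops it.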
So if playing with IRQL values does not prevent your code from being interrupted, then why do we use it at all on uniprocessor systems? Portability could, of course, be argued as a possible reason, but a better reason is that the NT kernel performs explicit checks of the IRQL value at key points in its execution. These checks are not performed explicitly on SMP systems, because the use of the hardware Task Priority Register takes care of this automagically.
On a uniprocessor system, each time an interrupt is signaled in hardware, its vector is fired immediately. Interrupt handlers are implemented almost exclusively within ntoskrnl. Notable exceptions are the profile and performance interrupt handlers. When the vector is fired, the handler begins execution immediately and NT saves the context for the processor's current state. A few other bookkeeping tasks are performed, such as incrementing the variables the kernel's performance counters use for counting interrupts, and finally an explicit test of the vector's associated IRQL is made against the currently executing IRQL. If the vector's IRQL is greater than the current IRQL, execution in the ISR continues as normal. For interrupt objects allocated via IoConnectInterrupt(), normal execution involves acquiring the interrupt's spinlock (on an SMP machine) and calling the interrupt object's service routine.
If the vector's IRQL is less than or equal to the currently executing code's IRQL, then everything is undone, the processor's context is restored, and control is returned to the interrupted code. In quite a few cases, the interrupt is actually acknowledged and dismissed. In cases where the interrupt is dismissed, variables inside the PCR are used to mark the particular interrupt vector as having had an interrupt fire that was not completely serviced. However, the dismissal process never involves dismissing the interrupt on the device that requested it. Because this logging is done at a coarse level (using a bitmask), dismissing the interrupt is usually followed by actually masking the particular interrupt in hardware, to prevent further execution of the ISR until the pending interrupt can be serviced completely.
Each time IRQL is dropped, a check is performed to see if there are any 'pending' interrupts. If so, then a software interrupt is issued (after the IRQL is dropped) corresponding to the vector to which the hardware interrupt is mapped. For example, if the clock interrupt is mapped to vector 0x30 with an IRQL of 0x1C, and the current IRQL is dropping from 0x1F to 0x2, then the HAL will execute an INT 0x30 instruction. The ISR executes as normal, and the interrupt is now completely serviced. The pending bit is cleared, and the interrupt is unmasked.
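The defer-and-replay sequence from the last three paragraphs can be sketched as a user-mode simulation. All names here (struct up_pcr, its fields, the up_ functions) are hypothetical and do not reflect the real PCR layout; the replay that the HAL performs with an INT instruction is modeled as an ordinary call back into the entry path, and replaying the highest IRQL first is an assumption of the sketch.

```c
#include <assert.h>

typedef unsigned char KIRQL;

#define UP_LINES 8   /* one 8-line interrupt controller, for brevity */

/* Hypothetical stand-ins for per-processor state; field names are invented. */
struct up_pcr {
    KIRQL         current_irql;
    unsigned char pending;              /* bit n: line n fired but was deferred */
    unsigned char imr;                  /* bit n: line n masked in hardware     */
    KIRQL         line_irql[UP_LINES];  /* IRQL assigned to each line           */
    unsigned      isr_runs[UP_LINES];   /* how often each core ISR body ran     */
};

/* Entry path taken every time `line` fires. Returns 1 if the ISR body ran,
   0 if the interrupt was dismissed, logged as pending, and masked. */
static int up_interrupt_entry(struct up_pcr *pcr, unsigned line)
{
    assert(line < UP_LINES);
    if (pcr->line_irql[line] > pcr->current_irql) {
        pcr->isr_runs[line]++;
        return 1;
    }
    pcr->pending |= (unsigned char)(1u << line);
    pcr->imr     |= (unsigned char)(1u << line);
    return 0;
}

/* Lower IRQL, then replay every deferred line the new level no longer
   blocks (highest IRQL first, an assumption of this sketch). */
static void up_lower_irql(struct up_pcr *pcr, KIRQL new_irql)
{
    pcr->current_irql = new_irql;
    for (;;) {
        int best = -1;
        for (int line = 0; line < UP_LINES; line++) {
            if (((pcr->pending >> line) & 1u) &&
                pcr->line_irql[line] > new_irql &&
                (best < 0 || pcr->line_irql[line] > pcr->line_irql[best]))
                best = line;
        }
        if (best < 0)
            break;
        up_interrupt_entry(pcr, (unsigned)best);        /* the "INT n" replay */
        pcr->pending &= (unsigned char)~(1u << best);   /* clear pending bit  */
        pcr->imr     &= (unsigned char)~(1u << best);   /* unmask the line    */
    }
}
```

With line 0 standing in for the clock (IRQL 0x1C) and the processor at 0x1F, a call to up_interrupt_entry() defers the interrupt and masks the line; a later up_lower_irql() to 0x2 runs the ISR body once and clears both bits, matching the example in the text.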
A special case is made for APC and DPC requests. These are normally described as being implemented as software interrupts. This is not true on a uniprocessor HAL. When the pending interrupt check described above is performed, a call is made directly into the regular functions that implement APC and DPC requests, HalpAPCInterrupt() and HalpDispatchInterrupt2(), respectively.
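As a rough illustration of that special case, the sketch below dispatches pending APC and DPC requests with plain function calls when IRQL drops. The soft_ names and the two handlers are hypothetical stand-ins for the HAL routines named above, and servicing the dispatch-level request before the APC-level one is an assumption of the sketch. APC_LEVEL and DISPATCH_LEVEL use their usual NT values of 1 and 2.

```c
#include <assert.h>

typedef unsigned char KIRQL;

#define APC_LEVEL      1
#define DISPATCH_LEVEL 2

static KIRQL    soft_current_irql;
static unsigned soft_pending;   /* bit n: software interrupt at IRQL n requested */
static char     soft_log[8];    /* 'D' and 'A' letters, in call order            */
static unsigned soft_log_len;

/* Hypothetical stand-ins for the HAL's APC and DPC routines. */
static void soft_apc_handler(void)
{
    assert(soft_log_len < 7);
    soft_log[soft_log_len++] = 'A';
}

static void soft_dispatch_handler(void)
{
    assert(soft_log_len < 7);
    soft_log[soft_log_len++] = 'D';
}

/* Requesting a software interrupt only records it; nothing fires yet. */
static void soft_request(KIRQL level)
{
    soft_pending |= 1u << level;
}

/* On lowering, pending requests the new level no longer blocks are served
   by ordinary calls, with no INT instruction involved. */
static void soft_lower_irql(KIRQL new_irql)
{
    soft_current_irql = new_irql;
    if (new_irql < DISPATCH_LEVEL && (soft_pending & (1u << DISPATCH_LEVEL))) {
        soft_pending &= ~(1u << DISPATCH_LEVEL);
        soft_dispatch_handler();
    }
    if (new_irql < APC_LEVEL && (soft_pending & (1u << APC_LEVEL))) {
        soft_pending &= ~(1u << APC_LEVEL);
        soft_apc_handler();
    }
}
```

Requesting both while at DISPATCH_LEVEL and then lowering to APC_LEVEL runs only the dispatch handler; lowering further to 0 runs the APC handler.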
* IRQL doesn't prevent interrupts...only CLI, or masking interrupt flags, will...so why do it? Because there are explicit tests for IRQL...where is IRQL used? KeRaiseIrql, etc., system, interrupt handlers, spin locks, releasing a fast mutex...why do we care? Calibrations are interrupted: briefly at least.
Illustrate the difference between IRQL on an SMP machine and on a uniprocessor HAL graphically...also point out that the IRQL value in the PCR never changes on an SMP machine...makes getting the IRQL on another processor much more difficult...also point out that HalEnableSystemInterrupt only works on SYSTEM interrupts...vectors above 0x30.