To debug an issue such as this, you can use WinDbg to look at what the thread is currently doing. Open a local kernel debugger session, and start by listing the information about the Notmyfault.exe process with the !process command:
lkd> !process 0 7 notmyfault.exe
PROCESS 86843ab0 SessionId: 1 Cid: 0594 Peb: 7ffd8000 ParentCid: 05c8
DirBase: ce21f380 ObjectTable: 9cfb5070 HandleCount: 33.
Image: NotMyfault.exe
VadRoot 86658138 Vads 44 Clone 0 Private 210. Modified 5. Locked 0.
DeviceMap 987545a8
...
THREAD 868139b8 Cid 0594.0230 Teb: 7ffde000 Win32Thread: 00000000
WAIT: (Executive) KernelMode Non-Alertable
86797c64 NotificationEvent
IRP List:
86a51228: (0006,0094) Flags: 00060000 Mdl: 00000000
...
ChildEBP RetAddr Args to Child
88ae4b78 81cf23bf 868139b8 86813a40 00000000 nt!KiSwapContext+0x26
88ae4bbc 81c8fcf8 868139b8 86797c08 86797c64 nt!KiSwapThread+0x44f
88ae4c14 81e8a356 86797c64 00000000 00000000 nt!KeWaitForSingleObject+0x492
88ae4c40 81e875a3 86a51228 86797c08 86a51228 nt!IopCancelAlertedRequest+0x6d
88ae4c64 81e87cba 00000103 86797c08 00000000 nt!IopSynchronousServiceTail+0x267
88ae4d00 81e7198e 86727920 86a51228 00000000 nt!IopXxxControlFile+0x6b7
88ae4d34 81c92a7a 0000007c 00000000 00000000 nt!NtDeviceIoControlFile+0x2a
88ae4d34 77139a94 0000007c 00000000 00000000 nt!KiFastCallEntry+0x12a
01d5fecc 00000000 00000000 00000000 00000000 ntdll!KiFastSystemCallRet
...
From the stack trace, you can see that the thread that initiated the I/O realized that the IRP had been cancelled (IopSynchronousServiceTail called IopCancelAlertedRequest) and is now waiting for the cancellation or completion. The next step is to use the same debugger extension command used in the previous experiments, !irp, and attempt to analyze the problem. Copy the IRP pointer, and examine it with !irp:
lkd> !irp 86a51228
Irp is active with 1 stacks 1 is current (= 0x86a51298)
No Mdl: No System Buffer: Thread 868139b8: Irp stack trace.
cmd flg cl Device File Completion-Context
>[ e, 0] 5 0 86727920 86797c08 00000000-00000000
\Driver\MYFAULT
Args: 00000000 00000000 83360020 00000000
From this output, it is obvious which driver is the culprit: \Driver\MYFAULT, or Myfault.sys. The name of the driver emphasizes that this situation can arise only through a driver problem and not through a buggy application. Unfortunately, now that you know which driver caused this issue, there isn’t much you can do. A system reboot is necessary because Windows can never safely assume it is okay to ignore the fact that cancellation hasn’t occurred yet. The IRP could return at any time and cause corruption of system memory. If you encounter this situation in practice, you should check for a newer version of the driver, which might include a fix for the bug.
I/O Completion Ports
Writing a high-performance server application requires implementing an efficient threading model. Having either too few or too many server threads to process client requests can lead to performance problems. For example, if a server creates a single thread to handle all requests, clients can become starved because the server will be tied up processing one request at a time. A single thread could simultaneously process multiple requests, switching from one to another as I/O operations are started, but this architecture introduces significant complexity and can’t take advantage of systems with more than one logical processor. At the other extreme, a server could create a big pool of threads so that virtually every client request is processed by a dedicated thread. This scenario usually leads to thread-thrashing, in which lots of threads wake up, perform some CPU processing, block while waiting for I/O, and then, after request processing is completed, block again waiting for a new request. If nothing else, having too many threads results in excessive context switching, caused by the scheduler having to divide processor time among multiple active threads.
The goal of a server is to incur as few context switches as possible by having its threads avoid unnecessary blocking, while at the same time maximizing parallelism by using multiple threads. The ideal is for there to be a thread actively servicing a client request on every processor and for those threads not to block when they complete a request if additional requests are waiting. For this optimal process to work correctly, however, the application must have a way to activate another thread when a thread processing a client request blocks on I/O (such as when it reads from a file as part of the processing).
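The requirement described above can be illustrated with a small user-mode simulation. The following Python sketch is not the Windows API and is not how the I/O manager works internally; it is a minimal model under stated assumptions. The names ConcurrencyGate, blocking_io, handle_request, and run_server are hypothetical, the concurrency limit of 2 stands in for the number of processors, and time.sleep stands in for a blocking file read. A worker gives up its "running" slot while it waits for I/O so that another worker can be activated, and the number of workers running at once never exceeds the limit.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class ConcurrencyGate:
    """Hypothetical helper: caps the number of running (non-blocked) workers."""

    def __init__(self, limit):
        self._slots = threading.Semaphore(limit)
        self._lock = threading.Lock()
        self.active = 0        # workers currently holding a slot
        self.peak_active = 0   # highest value of `active` observed

    def enter(self):
        # Wait for a free slot, then record that this worker is running.
        self._slots.acquire()
        with self._lock:
            self.active += 1
            self.peak_active = max(self.peak_active, self.active)

    def leave(self):
        # Record that this worker stopped running, then free its slot.
        with self._lock:
            self.active -= 1
        self._slots.release()


def blocking_io(gate, seconds):
    """Simulated blocking I/O: release the slot so another worker can run."""
    gate.leave()
    try:
        time.sleep(seconds)  # stand-in for a blocking read
    finally:
        gate.enter()         # compete for a slot again before continuing


def handle_request(gate, request_id, results):
    gate.enter()
    try:
        blocking_io(gate, 0.01)     # request needs some I/O mid-processing
        results.append(request_id)  # list.append is atomic in CPython
    finally:
        gate.leave()


def run_server(num_requests=8, limit=2, pool_size=4):
    """Process all requests; return (sorted completed ids, peak running workers)."""
    gate = ConcurrencyGate(limit)
    results = []
    # The pool is larger than the limit so spare threads exist to be activated.
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        for i in range(num_requests):
            pool.submit(handle_request, gate, i, results)
    return sorted(results), gate.peak_active


if __name__ == "__main__":
    done, peak = run_server()
    print(len(done), "requests completed; peak running workers:", peak)
```

The pool holds more threads than the limit allows to run, which is the point of the design: extra threads cost little while they wait, and they are available to take over a slot the moment a running thread blocks. This sketch makes a resuming thread wait for a free slot, which is a simplification chosen to keep the model small.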
The IoCompletion Object