The Final Windows Handle Leak Guide

One of the harder lessons during development is how to deal with leaks. Most of the time you are leaking memory, but sometimes you also leak OS resources such as handles. These leaks are more difficult to detect because the application does not use excessive amounts of memory when leaking OS resources. Leaks are not just wasteful; they can also be a source of hard-to-spot bugs. Below is a list of handle types that can be leaked. Many of them are uncommon and used primarily by Windows itself:

  • ALPC Port
  • Composition
  • CoreMessaging
  • Desktop
  • Device
  • Directory
  • DxgkCompositionObject
  • DxgkDisplayManagerObject
  • DxgkSharedResource
  • DxgkSharedSyncObject
  • EnergyTracker
  • EtwConsumer
  • EtwRegistration
  • Event
  • File
  • FilterCommunicationPort
  • FilterConnectionPort
  • IRTimer
  • IoCompletion
  • IoCompletionReserve
  • Job
  • Key
  • Mutant
  • Partition
  • PcwObject
  • PowerRequest
  • Process
  • RawInputManager
  • Section
  • Semaphore
  • Session
  • SymbolicLink
  • Thread
  • Timer
  • TmEn
  • TmRm
  • TmTm
  • TmTx
  • Token
  • TpWorkerFactory
  • UserApcReserve
  • WaitCompletionPacket
  • WindowStation
  • WmiGuid

Most of the time, you are leaking event, file, process, and thread handles. Common issues include file sharing violations when you forget to close a file before opening it again. If you forget to close thread or process handles, you will see process/thread IDs reaching the range of millions. Named events, if not closed properly, can lead to hard-to-spot race conditions. When you create a named event with the same name again, you are merely reopening the existing one, which might already be in an unexpected signaled state. This can result in unexpected application behavior.

So how can you track down handle leaks? There are a few commercial options that hook into the relevant APIs, but Windows has a built-in mechanism to track handle leaks using Event Tracing for Windows (ETW).

Windows Performance Recorder (WPR) has been included with Windows since Windows 10, so you can start tracking your handle leaks right away. However, there’s a problem: the WPR version in Windows 10 is broken and often returns this infamous error message when you stop profiling:

wpr -stop problem.etl

        Cannot change thread mode after it is set.

        Profile Id: RunningProfile

        Error code: 0x80010106

See https://devblogs.microsoft.com/performance-diagnostics/wpr-start-and-stop-commands/ for more information. To fix that, you can use the latest Windows Performance Toolkit, which comes with Windows 11. The Windows 11 version works out of the box.

To start tracking handle leaks, you can use WPR:

wpr -start Handle
// Repro your issue
wpr -stop c:\temp\HandleLeak.etl

or xperf (see the section on recording data in https://github.com/Siemens-Healthineers/ETWAnalyzer/blob/main/ETWAnalyzer/Documentation/DumpObjectRefCommand.md).

That approach works, but the main problem with handle tracing is that over 90% of the handle data is related to events, which quickly fills up your ETW in-memory ring buffers. ETW is usually configured to record into memory ring buffers, which cannot exceed about 10% of your physical RAM. This limitation severely impacts the usefulness of ETW handle tracing if you need to track handles over an extended period (e.g., hours).

However, there is a solution. My recording UI, ETWController (available on GitHub: ETWController), includes a handle type filter that allows you to limit handle tracing to a specific handle type. To my knowledge, this functionality is not yet exposed via xperf or WPR. This filtering capability can help you effectively manage the amount of traced data and track handle leaks more efficiently over extended periods.

That allows you to record less common handle types for a very long time. If you need to record even longer, you can modify the supplied Handle profile to remove the CloseHandle stack walks, which reduces the amount of traced data by roughly 30-40%. Stack traces for an ETW event are usually much larger than the actual traced data.

So, how do you know which handle type you are after? The great Process Explorer clone, System Informer, can help. Here’s how:

  1. Open System Informer.
  2. Double-click on the process you’re interested in.
  3. Select the “Handles” tab.
  4. Click on the “Options” menu and choose “Statistics”

This will provide you with a handle type summary, which is a good first step in identifying the leaked handle type. This summary helps you determine which specific handles to focus on, making it easier to use tools like ETWController for more precise tracking and troubleshooting.

After recording the data, you can track down your leaks in WPA, where you can add Handles to your analysis pane from the Memory view.

A useful grouping is Creating Process, Handle Type, Object Name, Lifetime, Close Stack, Create Stack left of the yellow line. That way, you can quickly find all handles that were not closed (that is, leaked) and go directly to the leaking stacks.

That works nicely, but sometimes you need to find correlations between different handles and look at the timing. My tool ETWAnalyzer converts the binary ETL file into compressed JSON files, which are much more accessible and still easy on disk space. To extract handle data from an ETL file, you can use:

ETWAnalyzer -extract ObjectRef -fd yourETW.etl -symserver ms

C>ETWAnalyzer -extract ObjectRef -fd c:\temp\Boot_HandleTrace_MAGNON.etl -symserver ms
1 - files found to extract.
Success Extraction of c:\temp\Extract\Boot_HandleTrace_MAGNON.json7z
Extracted 1/1 - Failed 0 files.
Extracted: 1 files in 00 00:02:37, Failed Files 0

Then you can query the data with ETWAnalyzer in console mode:

ETWAnalyzer -console

Now load the extracted JSON file:

.load c:\temp\Extract\Boot_HandleTrace_MAGNON.json7z

Next, query the handle data for a system-wide summary of the top 5 handle types. For the full documentation of the .dump ObjectRef command, see https://github.com/Siemens-Healthineers/ETWAnalyzer/blob/main/ETWAnalyzer/Documentation/DumpObjectRefCommand.md

.dump ObjectRef -ShowTotal Total -TopN 5

To see all handle events (-Dump ObjectRef) of explorer (-pn or -ProcessName) for the registry key \REGISTRY\USER\S-1-5-21-3592500153-2310523904-3374296524-1001 (-ObjectName), use this query:

.dump ObjectRef -pn explorer.exe -ObjectName \REGISTRY\USER\S-1-5-21-3592500153-2310523904-3374296524-1001

This displays a long list of events because most of the handles are closed. At the end, a summary is printed:

Objects Created/Destroyed: 2672/2667 Diff: 5, Handles Created/Closed/Duplicated: 3186/3181/0 Diff: 5, RefChanges: 0, FileMap/Unmap: 0/0

This tells you that 5 kernel objects and 5 handles were not closed during the recording. To show only the leaking handles, add -Leak to the previous query. It restricts the output to handles that were still open when the recording ended:

.dump ObjectRef -pn explorer.exe -ObjectName \REGISTRY\USER\S-1-5-21-3592500153-2310523904-3374296524-1001 -Leak

To view the stack traces of the pending allocations, add -ShowStack, and you have your leaking stacks.

You need to check the stacks to decide whether this is a real leak or simply normal application behavior. For 5 handles, it is most likely not a leak; explorer just has not gotten around to releasing them yet. But if you lose thousands of handles in such stacks, you might have found a leak.

Handles reference kernel objects which are reference counted. You can open an existing kernel object (e.g. a named event) and just increase the reference count by one. The underlying kernel object is only released when the last handle reference is closed.

ETWAnalyzer shows object activity sorted by time grouped by

  • Handle Create
  • Handle Duplicate
  • Handle Close

calls. This makes it easy to spot handle leaks caused by incorrect threading, for example a data race in which one thread assigns a newly created handle to a memory location while another thread does the same. Because the locking is missing, one of the two handles is overwritten and you “forget” (that is, leak) it; it is then never closed. The code below is an example of such a data race if you call GetOrCreateEvent from multiple threads:

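class EventManager
{
private:
    HANDLE m_Event;
    int m_Id;

public:
    HANDLE GetOrCreateEvent()
    {
        // Here is a lock missing! Two threads can both see nullptr and both
        // create an event; the handle that is stored first gets overwritten and leaks.
        if (m_Event == nullptr)
        {
            // Shorthand: the real Win32 CreateEvent takes four arguments.
            m_Event = CreateEvent(m_Id);
        }
        return m_Event;
    }

    ~EventManager()
    {
        if (m_Event != nullptr)
        {
            ::CloseHandle(m_Event);
            m_Event = nullptr;
        }
    }
....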

More details are at https://github.com/Siemens-Healthineers/ETWAnalyzer/blob/main/ETWAnalyzer/Documentation/DumpObjectRefCommand.md

Handle leak tracking has never been easier. If you find a leak with this tooling, drop me a note to let me know it was helpful.
