MS Performance HUD – Analyze EventLog Reading Performance In Realtime

This post explains why your home machine can work fast while many corporate PCs show high CPU consumption at random times especially after boot for several minutes with no apparent explanation.  Event Tracing for Windows (ETW) is a fantastic piece of technology. I love it because it enables me and perhaps you to tackle systemic issues regardless if the problem is my application, an Explorer plugin, a slow device driver or your Antivirus Scanner gone wild. Last time I have complained about slow WMI queries (https://aloiskraus.wordpress.com/2019/11/17/the-wmi-query-from-hell/) which were induced by some System Management software. It turns out that reading from the Windows Eventlog is a quite complex undertaking when you try to understand the slow read performance. For starters I try to read all Application event log messages via a WMI query

 

Slow Event Log Read Performance

SELECT * FROM Win32_NTLogEvent WHERE Logfile = ‘Application’

This runs ca. 35s which is quite a while. To put the number into perspective lets check the size of the event log which is 25 MB where 60232 messages were stored. This means we are reading a binary file with 0,7MB/s on my i7-4770K CPU @ 3.50GHz. That is really slow! Lets record some data with ETW and analyze it with the new Performance HUD Tool which was released at 29. May 2020. When I select the event log service which shows high CPU I can easily find why it is so busy:

image

We are reading 60K messages. While we are doing that the event log service process accesses the Registry over 800K times to find the currently configured language! The method EvtRpcMessageRender formats an event log message which in turn calls FormatMessage which does not reuse any cached thread culture but always reads the data from the registry. This means we have for every read message 13 registry accesses which is way to much. The Performance HUD shows on the lower grid performance insights and a tooltip what you can do about it. The performance of FormatMessage seems to be less important than to ensure 100% correctness when the language is switched because it must use the updated value instantaneously. The HUD Tool makes ETW data much more accessible than it was before. If I want to know how much data the Event Log service has read I just need to switch over to the File I/O tab which shows me the 25 MB parsed Application event log. If I need it I can also get the call stack by hovering over the blue underlined and press Ctrl+Click to find the corresponding stack traces.

image

Now we have proven that the event log service reads 25 MB of data but to format the read data it needs  800K registry reads just to determine the current language which costs in total over 35s of execution time. At least that is what I can see with the HUD Tool inside a single process. This tool is definitely not a v1.0 and is used within MS for quite some time based on the many helpful tooltips from performance experts. The HUD Tools is a great addition to the toolbox which warrants more detailed posts in the future. If you have problems to get it running drop me a note and I will see if I can help.

Performance HUD can be run in two modes

  • Realtime attached to a specific process
  • Offline while parsing data from an existing recorded ETL file
    • This requires CPU Profiling, Context Switch, File IO, Registry tracing with stacks enabled
    • For UI Hangs and User/GDI Handle leaks you need additionally to record the Microsoft-Windows-Win32k ETW Provider with 0x8000030000240000 flags + Stacks
      • The new GDI/User Leak tracking flags are (Win 10 1903. Windows Server 2019 e.g. is based on Win 10 1807. No server OS or Win 10 LTS release supports that yet …)
        • 0x0000010000000000  GdiHandleOperation
        • 0x0000020000000000  UserHandleOperation

Realtime ETW tracing is a great thing which makes it much easier for the average performance investigator to analyze ETW data right away. Especially UI thread hangs due to blocking calls got fantastic support which makes finding your UI hangs a no brainer.

 

System Wide Analysis

A first check is great to do in HUD but a more detailed analysis is still best done in WPA (and a 4K monitor). Lets see how the CPU consumption across processes is spent:

 

image

image

The WMI Query is executed from my test process named EventLogFlusherAndReader.exe (14796) which consumes 10s mostly reading registry keys for the COM class factory of some WMI COM objects. …

Next is is the svchost(1676) which is the Event Log Service Process with also 10s of CPU. There it is spending also most of the time as we have seen earlier with Registry Reading. In our case it is 5s/10s. Below is a picture how a WMI Event Log Query is actually executed:

 

image

WMI can do much but only with the help of many lackeys which work for us. We are reading a binary file and decode its messages but with WMI this causes significant CPU in 4 other system processes! Most time is spent inside the Event Log Service and the WMI Worker process and our WMI reading process. I was wondering why the EventLogReader process consumes so much CPU while it should only receive the data from WMI. It turns out there is more behind it because the COM objects which are used are not only for WMI but also for VirtualBox. Seem the VM host intercepts quite a bit of calls on my machine. It did uninstall VirtualBox (this was quite a journey because I have move the C:\Windows\Installer folder to a different drive which breaks many things. Note to self: Do this never again!) and checked again. Sure enough the additional Registry Queries are gone but the total runtime did not change much. The reason is that the Performance HUD Tool enables expensive ETW Logging event for every Registry access which exaggerates the costs of Registry accesses quit a bit. With HUD profiling enabled I did see application slowdowns of up to 40%. You should therefore still measure without Performance HUD enabled to get a less distorted picture what is really our bottleneck.

So what can we do about slow event log reading performance with WMI? As it turns out not much. MS should really check if it needs to access the language registry keys so often without any caching. In theory we can format the message by ourselves but this is only part of the story since ETW supports for an ETW event log message up to 3 resource dlls.

image

 

  1. MessageFileName: Same as old event log message table dlls which contains the localized resource string where the paramters are expanded at %1 %2, … as classic events
  2. ParameterMessageFile:  This contains constants such as Access Denied, Granted, …  which are defined with two %% followed by the resource number e.g. %%8100
  3. ResourceFileName: Defines the WEVT_TEMPLATE resource which contains event names, task names etc inside this binary structure

What will happen when I format messages in the old way by loading the resource dll and format the expansion parameters by myself? For an ETW Provider which is nice and which defines only resource based strings in the EventLog like this

image

I can correctly decode the message with the old EventLog API of .NET:

image

But when I remove the EventMessageFile registry key then I get an unresolved message just like any “modern” ETW provider would declare it in the EventLog if it is not nice to legacy APIs: logMsg.Message would error out and give me a message

The description for Event ID ‘xxx’ in Source ‘xxxxx’ cannot be found. The local computer may not have the necessary registry information or message DLL files to display the message, or you may not have permission to access them.

Event Log providers since Windows Vista only need a ProviderGuid below the event log registry key which makes it impossible to format resource based messages without the newer ETW Event Rendering APIs.

This is an issue if you are monitoring log messages issued by low level device drivers to report e.g. disk failures and other hardware issues. These important messages will slip through your monitoring system if you use the .NET System.Diagnostics.EventLog class. It is the fastest way to read messages but unfortunately you will miss out messages. The situation can be fixed to some extent by creating a registry key EventMessageFile for the ETW provider below the corresponding Event Log Source registry node. You need to look up the ProviderGuid below Computer\HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\WINEVT\Publishers\ to find the corresponding MessageFileName. That will enable proper rendering of ETW based providers.

 

image

 

There is one caveat though. Fixed string constants which are formatted via %% in the resource based messages that are baked up by the ParameterFileName resource dll will not be formatted and will stay as placeholders in your formatted message. I have seen so far mostly for security related ETW providers to use that more advanced string expansion machinery.

The best option is still to use the newer .NET APIs in the form of System.Diagnostics.Eventing.Reader.EventLogReader which will will properly format all messages. The only issue is that this API just like WMI calls into the event log service where the messages formatting is done out of process which is much slower. I might be obsessed with performance when I complain about newer but more powerful APIs. Lets check things by reading and formatting 10K messages from the Application and Security Event log with the old System.Diagnostics.EventLog (readoldapi) and the newer System.Diagnostics.Eventing.Reader.EventLogReader (read) API.

C>EventLogFlusherAndReader.exe read -n 10000 -p 1 -l Application
Did read 10000 messages from log Application in 0.10s, 97,749 Messages/s, Format Repeat Count: 1
Total Formatting time 10.4 s.

C>EventLogFlusherAndReader.exe readoldapi -f -n 10000 -l Application
Read up to 10000 entries from Application log
Did read 10000 message table based event log messages from log Application in 1.45s, 6,899 Messages/s
Formatting time 0.4 s.
Reading and formatting for 10000 did take 1.85s, 5,410 Messages/s

 

C>EventLogFlusherAndReader.exe read -n 10000 -p 1 -l Security
Did read 10000 messages from log Security in 0.12s, 82,068 Messages/s, Format Repeat Count: 1
Total Formatting time 129.4 s.

C>EventLogFlusherAndReader.exe readoldapi -f -n 10000 -l Security
Read up to 10000 entries from Security log
Did read 10000 message table based event log messages from log Security in 1.37s, 7,277 Messages/s
Formatting time 0.6 s.
Reading and formatting for 10000 did take 1.98s, 5,041 Messages/s

 

With the “new” API we are over 20 times slower because formatting is now done in the EventLog Service which causes like WMI a lot of overhead. But what stands out is that reading data from the Security Event Log is > 200 times slower compared to the “old” EventLog API. Since .NET is open source it is easy to debug this and I have opened an issue https://github.com/dotnet/runtime/issues/34568 with a request to fix. As usual there is little motivation to change such old legacy code but I was invited to provide a PR which I did. After a bit of back and forth due to test failures I have landed this single line fix: https://github.com/dotnet/runtime/pull/36188/commits/dd0e212d75218f3b0eef3bc1ab812d7eb763279e to remove a superfluous check which did not serve any real purpose. The underlying issue was that for every message format call a check was done to format all task names. The Security Event Log defines 60 task names which explains the bad parsing performance for the Security Event Log. By removing a single line it did get 26 times faster which is not bad. This change will light up in .NET Core 5.0 so stay tuned. In the meantime you can still recompile the event log reading dll at your own from .NET and use it from a regular .NET application without any issues. It is even so easy to copy the source code from .NET Core and put it into your own project. Here are the results:

 

C>EventLogFlusherAndReader.exe read -n 10000 -p 1 -l Application
Did read 10000 messages from log Application in 0.10s, 99,245 Messages/s, Format Repeat Count: 1
Total Formatting time 2.7 s.

 

C>EventLogFlusherAndReader.exe read -n 10000 -p 1 -l Security
Did read 10000 messages from log Security in 0.12s, 80,601 Messages/s, Format Repeat Count: 1
Total Formatting time 4.3 s.

 

That speeds up reading event logs with .NET between 3-30 times depending on how many task names the log you are parsing has defined. It is still ca. 7 times slower compared to the old API which might be ok if you are not overdoing things.

Conclusions

The Security Event Log can be much bigger than 20 MB because some corporate security departments follow the MS security guidance to audit everything which needs a huge 256 MB or even a 1 GB Security Event Log file. If it would be much smaller it would overwrite the oldest events within hours depending on how your audit policies are configured. To check your current Audit Policies you can run auditpol.exe. On my home machine everything looks good but in a corporate setup things might be different. Especially Success audit messages can fill up your log files …

C>auditpol /get /category:*
System audit policy
Category/Subcategory                      Setting
System
  Security System Extension               No Auditing
  System Integrity                        Success and Failure
  IPsec Driver                            No Auditing
  Other System Events                     Success and Failure
  Security State Change                   Success
Logon/Logoff
  Logon                                   Success and Failure
  Logoff                                  Success
  Account Lockout                         Success
  IPsec Main Mode                         No Auditing
  IPsec Quick Mode                        No Auditing
  IPsec Extended Mode                     No Auditing
  Special Logon                           Success
  Other Logon/Logoff Events               No Auditing
  Network Policy Server                   Success and Failure
  User / Device Claims                    No Auditing
  Group Membership                        No Auditing
Object Access
  File System                             No Auditing
  Registry                                No Auditing
  Kernel Object                           No Auditing
  SAM                                     No Auditing
  Certification Services                  No Auditing
  Application Generated                   No Auditing
  Handle Manipulation                     No Auditing
  File Share                              No Auditing
  Filtering Platform Packet Drop          No Auditing
  Filtering Platform Connection           No Auditing
  Other Object Access Events              No Auditing
  Detailed File Share                     No Auditing
  Removable Storage                       No Auditing
  Central Policy Staging                  No Auditing
Privilege Use
  Non Sensitive Privilege Use             No Auditing
  Other Privilege Use Events              No Auditing
  Sensitive Privilege Use                 No Auditing
Detailed Tracking
  Process Creation                        No Auditing
  Process Termination                     No Auditing
  DPAPI Activity                          No Auditing
  RPC Events                              No Auditing
  Plug and Play Events                    No Auditing
  Token Right Adjusted Events             No Auditing
Policy Change
  Audit Policy Change                     Success
  Authentication Policy Change            Success
  Authorization Policy Change             No Auditing
  MPSSVC Rule-Level Policy Change         No Auditing
  Filtering Platform Policy Change        No Auditing
  Other Policy Change Events              No Auditing
Account Management
  Computer Account Management             No Auditing
  Security Group Management               Success
  Distribution Group Management           No Auditing
  Application Group Management            No Auditing
  Other Account Management Events         No Auditing
  User Account Management                 Success
DS Access
  Directory Service Access                No Auditing
  Directory Service Changes               No Auditing
  Directory Service Replication           No Auditing
  Detailed Directory Service Replication  No Auditing
Account Logon
  Kerberos Service Ticket Operations      No Auditing
  Other Account Logon Events              No Auditing
  Kerberos Authentication Service         No Auditing
  Credential Validation                   No Auditing

 

Reading a big event log by **** Some State Of The Art Security Solution ****  to check for malicious activity might kill interactive user experience since you are now maxing out two cores (the reader process and the event log service formatting your messages) for many minutes. A 1 GB Security Event Log file would need even with the fixed .NET version ca. 10 minutes to read. The current way how .NET handles event log messages would cause it to scan for over 5h through a 1 GB Security Event Log file. I hope such a security “solution” is not in production anywhere but one never knows for sure if some Admin has created a nice powershell script which uses Get-WinEvent to read all messages with a non deprecated up to date API in his Active Directory Forest. This post exists to make more people aware of the performance issues related to event log reading and applying tight security auditing policies. If you see minutes of high CPU spent in svchost.exe (it is a generic service host) which hosts the Event log service you can be pretty sure that someone is reading large event log file/s on your machine via powershell, WMI, or other ways.

image

Most often this is some security related software which is not fully optimized yet. Lets not give up but improve the tools you use and check if the policies which generate most log messages are really needed. If not you can reduce the size of the log and still keep the rollover time for the oldest log messages constant.

With Performance HUD it should be much easier to come to the same conclusions than I did. Try out the new tool and tell me what you have found! And lets hope that someone makes FormatMessage faster.