How Fast is your Virus Scanner?

I am sure everyone has experienced slowdowns due to Antivirus solutions, but very few are able to attribute it to the right component on your Windows box. Most people do not even know what AV solution/s are running on their machine. The end result can be a secure but slow system which is barely usable.

Is your CPU fan always on and your notebook getting hot although you are not running anything? You want to find out what is consuming so much CPU with a command line tool?

Then I have a new tool for you: ETWAnalyzer. It is using the Trace Processing Library which powers WPA (Windows Performance Analyzer). WPA is a great UI to analyze tricky systemic issues, but if you need to check a trace the n-th time if the same issue is there again you want to automate that analysis.

ETWAnalyzer is developed by Siemens Healthineers as its first public open source project. Full Disclosure: I work for Siemens Healthineers. One motivation to make it public was to show how incredibly useful ETW (Event Tracing for Windows) is to track down performance issues in production and development environments. It is one of the best technologies for performance troubleshooting Microsoft has invented, but it is valued by far too few people. ETW is part of Windows and you need no additional tools or code instrumentation to record ETW data! Starting with Windows 10 the Windows Performance Recorder (wpr.exe) is part of the OS. You have a slow system? In that case I would open an Administrator cmd shell and type

C>wpr -start CPU -start DiskIO -start FileIO

Now execute your slow use case and then stop profiling with

C>wpr -stop C:\temp\SlowSystem.etl

This will generate a large file (hundreds MB – several GB) which covers the last ca. 60-200s (depends on system load and installed memory) what your system was doing. If you have a lot of .NET applications installed stopping the first first time can take a long time (10-60 min) because it will create pdbs for all running .NET applications. Later profiling stop runs are much faster tough.

To record data with less overhead check out MultiProfile which is part of FileWriter. You need to download the profile to the machine to a directory e.g. to c:\temp and to record the equivalent to the before command with

C>wpr -start C:\temp\MultiProfile.wprp!CSwitch -start C:\temp\MultiProfile.wprp!File

Getting the data is easy. Analyzing is hard when you do not know where you need to look at.

This is where ETWAnalyzer can help out to ask the recorded data questions about previously analyzed problems. You ask: Previously? Yes. Normally you start with WPA to identify a pattern like high CPU in this or that method. Then you (should) create a stacktag to assign a high level description what a problematic method is about. The stacktags are invaluable to categorize CPU/Wait issues and correlate them with past problems. The supplied stacktag file of ETWAnalyzer contains descriptions for Windows/Network/.NET Framework/Antivirus products even some Chrome stacktags. To get started you can download the latest ETWAnalyzer release from und unzip it to your favorite tool directory and extend the PATH environment to the directory. If you want maximum speed you should use the self contained .NET 6 version.

Now you are ready to extract data from an ETL file with the following command:

ETWAnalyzer -extract all -filedir c:\temp\SlowSystem.etl -symserver MS

After extraction all data is contained in several small Json files.

Lets query the data by looking at the top 10 CPU consumers

ETWAnalyzer -dump CPU -filedir c:\temp\Extract\SlowSystem.json -topN 10

The second most CPU hungry process is MsMpEng.exe which is the Windows Defender scan engine. Interesting but not necessarily a bad sign.

Lets check if any known AV Vendor is inducing waits > 10ms. We use a stacktag query which matches all stacktags with Virus in their name.

ETWAnalyzer -dump cpu -fd c:\temp\Extract\SlowSystem.json -stacktags *virus* -MinMaxWaitMs 10

We find a 1349ms slowdown caused by Defender in a cmd shell. I did observe slow process creation after I hit enter to start the fresh downloaded executable from the internet.

Explorer seems also to be slowed down. The numbers shown are the sum of all threads for CPU and wait times. If multiple threads were executing the the method then you can see high numbers for the stacktags/methods. For single threaded use cases such as starting a new process the numbers correlate pretty good with wall clock time.

The currently stacktagged AV solutions are

  • Applocker
  • CrowdStrike
  • CyberArk
  • Defender
  • ESET
  • Mc Afee
  • Palo Alto
  • Sentinel One
  • Symantec
  • Trend Micro

Lets dump all slow methods with a wait time between 1300-2000ms and the timing of all process and file creation calls in explorer.exe and cmd.exe

ETWAnalyzer -dump cpu -fd c:\temp\Extract\SlowSystem.json  -stacktags *virus* -MinMaxWaitMs 1300-2000 -methods *createprocess*;*createfile* -processName explorer;cmd

Antivirus solutions usually hook into CreateProcess and CreateFile calls which did obviously slow down process creation and file operations in explorer and process creation in cmd.exe.

Was this the observed slowness? Lets dump the time when the file and process creation calls were visible the first time in the trace

ETWAnalyzer -dump cpu -fd c:\temp\Extract\SlowSystem.json  -stacktags *virus* -MinMaxWaitMs 1300-2000 -methods *createprocess*;*createfile* -processName explorer;cmd -FirstLastDuration local

That correlates pretty well with the observed slowness. To blame the AV vendor we can dump all methods of device drivers which have no symbols. Microsoft and no other AV vendor delivers symbols for their drivers. By this definition all unresolved symbols for system drivers are likely AV or hardware drivers. ETWAnalyzer makes this easy with the -methods *.sys -ShowModuleInfo query to show all called kernel drivers from our applications for which we do not have symbols :

ETWAnalyzer -dump cpu -fd c:\temp\Extract\SlowSystem.json -methods *.sys -processName explorer;cmd -ShowModuleInfo

The yellow output is from an internal database of ETWAnalyzer which has categorized pretty much all AV drivers. The high wait time comes from WDFilter.sys which is an Antivirus File System driver which is part of Windows Defender.

What have we got?

  • We can record system wide ETW data with WPR or any other ETW Recoding tool e.g. which can capture screenshots to correlate the UI with profiling data
  • With ETWAnalyzer we can extract aggregated data for
    • CPU
    • Disk
    • File
    • .NET Exceptions
  • We can identify CPU and wait bottlenecks going from a system wide view down to method level
    • Specific features to estimate AV overhead are present

This is just the beginning. You can check out the documentation for further inspiration how you can make use of system wide profiling data ( which can be queried from the command line.

ETWAnalyzer will not supersede WPA to analyze data. ETWAnalyzers main strength is in analyzing known patterns quickly in large amounts of extracted ETW data. Querying 10 GB of ETW data becomes a few hundred MB of extracted JSON data which has no slow symbol server lookup times anymore.

Happy bug hunting with your new tool! If you find bugs or have ideas how to improve the tool it would be great if you file an issue.