MemAnalyzer v2.5 Released

Managed (and unmanaged) memory leak detection is still not as easy as it should be. MemAnalyzer is a simple command line tool that aims to help with the easy problems but also with the hard issues which surface only after hours of stress testing. Often you have a mixture of managed and unmanaged memory leaks where you need memory dumps and VMMap information to get the full picture. MemAnalyzer helps to automate these boring tasks. It is open source on GitHub (https://github.com/Alois-xx/MemAnalyzer). The executable can be downloaded here: https://github.com/Alois-xx/MemAnalyzer/releases.

If you are looking for a nice UI to look into memory dumps I recommend MemoScope.NET (https://github.com/fremag/MemoScope.Net) which lets you look into dump files without the need to fall back to Windbg. By the way, if you have Visual Studio Ultimate you can already analyze managed memory dumps. But installing Visual Studio is not an option when you need to analyze issues in production. PerfView on the other hand is a great tool, but its numbers are only approximations which can make it hard to spot handle leaks. The object counts reported by PerfView are often off by an order of magnitude. MemAnalyzer tries to get exact metrics of the objects that are really alive with the -live switch, which gives you the equivalent of Windbg's !DumpHeap -stat -live as CSV output.

MemAnalyzer Features

  • Single self contained executable
  • Supports x86 and x64 processes and memory dumps
  • .NET Core on Windows x86 and x64 support (.NET Core 1.0, 1.1 and 2.0, …)
  • Create memory dumps with associated VMMap data
  • Analyze managed heap, unmanaged, private bytes and file mappings when VMMap is present
  • Memory dump diff
  • Optional CSV output

Usage – Leak Detection

Why bother with a command line tool if nicer UIs are around? Because MemAnalyzer can track not only your managed memory but also the unmanaged parts. When a managed application leaks memory you first need to find out if the leak happens on the managed heap or somewhere else. Depending on the leaked memory type you need different approaches/tools to track the leak down.

The memory consumption of a process can be categorized as

  • Managed Heap
  • Unmanaged Heap
  • Private Bytes
  • Page File Allocated Shared Memory (Shareable in VMMap lingo)
  • Memory Mapped Files

Since there are quite a few different memory types inside a process it is important to know where you need to look. MemAnalyzer uses VMMap to determine the size of each region and prints them in a nice summary which can be written to a CSV file, e.g. to get a summary after each test run during an automated test.

C>MemAnalyzer.exe -pid 17888 -vmmap

AllocatedBytes          Instances(Count)        Type
4,648,204               105,374                 System.String
918,824                 22                      System.String[]
697,640                 27,607                  System.Object[]
662,424                 27,601                  System.Int32
1,512                   27                      System.RuntimeType
1,072                   2                       System.Globalization.CultureData
830                     5                       System.Char[]
580                     8                       System.Int32[]
432                     2                       System.Globalization.NumberFormatInfo
26,130                  1,087                   Managed Heap(Free)!
6,936,367               160,704                 Managed Heap(Allocated)!
7,158,288                                       Managed Heap(TotalSize)
25,165,824                                      Reserved_Stack
54,398,976                                      Committed_Dll
1,994,752                                       Committed_Heap!
4,177,920                                       Committed_MappedFile!
565,248                                         Committed_Private!
3,825,664                                       Committed_Shareable!
73,629,696                                      Committed_Total
17,499,952                                      Allocated(Total)

The output consists of the following parts:

  • Allocated managed objects. This is very similar to !DumpHeap -stat in Windbg, only with more options.
    • If you add -live then the metric will contain no temporary objects which were not reclaimed by the GC yet.
  • Managed heap summary which shows an overall metric how big the heap is and how much of it is allocated and free.
  • Additional VMMap information that gives you an overview which other memory types are allocated in the process.
    • MemAnalyzer needs the VMMap tool in the path to get that data.
  • Allocated = Managed Heap(Allocated) + Heap + MappedFile + Private Bytes + Shareable
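
As a quick sanity check of that formula, the following C# sketch adds up the values from the sample output above. It is only an illustration: the class name AllocatedSum and the parameter names are made up and are not part of MemAnalyzer.

```csharp
using System;

static class AllocatedSum
{
    // Adds up the memory types that go into Allocated(Total).
    // Hypothetical helper for illustration only, not MemAnalyzer code.
    public static long Sum(long managedAllocated, long heap, long mappedFile,
                           long privateBytes, long shareable)
        => managedAllocated + heap + mappedFile + privateBytes + shareable;

    static void Main()
    {
        // Values copied from the sample output above.
        long total = Sum(
            managedAllocated: 6_936_367, // Managed Heap(Allocated)!
            heap: 1_994_752,             // Committed_Heap!
            mappedFile: 4_177_920,       // Committed_MappedFile!
            privateBytes: 565_248,       // Committed_Private!
            shareable: 3_825_664);       // Committed_Shareable!

        Console.WriteLine(total);
    }
}
```

The sum comes to 17,499,951 bytes, which agrees with the reported Allocated(Total) of 17,499,952 to within one byte.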

The Allocated value is important because if it rises over time you have a leak in one of the memory types that make up the sum. If you print this value over time and it does not rise you have no leak (warning: simplified!). That is simple enough to repeat from e.g. a script to verify that your long running test behaves well. Since repeated measurements are key to detecting a memory leak, MemAnalyzer lets you append the output to a CSV file along with some context, e.g. Iteration 1, 100, to get more clues.
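
To make "does it rise?" a bit more concrete, here is a minimal C# sketch that fits a straight line through a series of Allocated(Total) values and reports the slope in bytes per measurement. It is not part of MemAnalyzer. The class name LeakTrend and the sample values are invented for illustration, and how you read the values out of your CSV file is left open.

```csharp
using System;
using System.Collections.Generic;

static class LeakTrend
{
    // Least-squares slope of the samples in bytes per measurement.
    // The x values are simply the measurement indices 0, 1, 2, ...
    public static double Slope(IReadOnlyList<long> samples)
    {
        int n = samples.Count;
        if (n < 2)
        {
            throw new ArgumentException("Need at least two samples.", nameof(samples));
        }

        double meanX = (n - 1) / 2.0;
        double meanY = 0;
        foreach (long sample in samples)
        {
            meanY += sample;
        }
        meanY /= n;

        double numerator = 0;
        double denominator = 0;
        for (int i = 0; i < n; i++)
        {
            numerator += (i - meanX) * (samples[i] - meanY);
            denominator += (i - meanX) * (i - meanX);
        }
        return numerator / denominator;
    }

    static void Main()
    {
        // Invented Allocated(Total) values from four hypothetical iterations.
        long[] allocated = { 17_499_952, 17_620_000, 17_790_000, 17_910_000 };
        Console.WriteLine($"Trend: {Slope(allocated):F0} bytes per iteration");
    }
}
```

A slope that stays clearly positive over many iterations is a hint to look closer. A single noisy run is not.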

Inside your tracking script a more realistic command line would be

MemAnalyzer -pid {0} -vmmap -o leak.csv -dtn 5000;N#200 -live -silent -context "{1}"

This will append the output of -dtn (Dump Type by Number) for up to 5K types with an instance count > 200 to the CSV file leak.csv. Each line gets a context column which can be e.g. your test run number or whatever else makes it easier to correlate the data with the time it was taken. To get additional information you can add automatic memory dumps to the mix with

MemAnalyzer -procdump -ma {0} {1}\PROCESSNAME_{0}_YYMMDD_HHMMSS.dmp

This will take a memory dump of the process with pid {0} with procdump and also gather VMMap information automatically (both procdump and VMMap need to be in the path). The upper case placeholders will be expanded by procdump automatically. That way you can e.g. take a full memory dump after 1, 10, 100 and 500 iterations which contains everything, while the trending data for every iteration is contained in the CSV file, which makes it much easier to track down the real memory leaks.

Based on personal experience it is pretty easy to be led down the wrong path by a few memory dumps created by coworkers. The first dump might be created before anything was loaded into the application and the last dump might still have the test data loaded. That looks like a pretty big leak, but it is not the leak you are after when you have lost 500 MB after 100 iterations. Having more data points at hand which can easily be graphed in Excel is a big help to concentrate on the important incidents and to identify stable patterns and trends without the need to take a gazillion memory dumps.
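
A tracking script built around the first command line could look roughly like the following C# sketch. It only uses the switches shown above. Everything else is an assumption: the class name LeakTracker, the fixed 100 iterations, and the commented-out RunTestIteration call, which stands in for whatever drives your test.

```csharp
using System;
using System.Diagnostics;

static class LeakTracker
{
    // Fills the {0} (pid) and {1} (context) placeholders of the command line shown above.
    public static string BuildArguments(int pid, string context)
        => $"-pid {pid} -vmmap -o leak.csv -dtn 5000;N#200 -live -silent -context \"{context}\"";

    // Starts MemAnalyzer once and waits until it has appended its data to leak.csv.
    // MemAnalyzer.exe (and VMMap) must be reachable via the path.
    public static void Measure(int pid, string context)
    {
        var startInfo = new ProcessStartInfo("MemAnalyzer.exe", BuildArguments(pid, context))
        {
            UseShellExecute = false
        };
        using (Process process = Process.Start(startInfo))
        {
            process.WaitForExit();
        }
    }

    static void Main(string[] args)
    {
        if (args.Length != 1 || !int.TryParse(args[0], out int pid))
        {
            Console.WriteLine("Usage: LeakTracker <pid of the process under test>");
            return;
        }

        for (int iteration = 1; iteration <= 100; iteration++)
        {
            // RunTestIteration();  // hypothetical: execute one round of your test here
            Measure(pid, $"Iteration {iteration}");
        }
    }
}
```

Passing the arguments directly to the process also keeps the ; in -dtn 5000;N#200 from being interpreted by a shell.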

Usage – Memory Optimization

If you want to optimize the memory usage of an existing application MemAnalyzer is also a big help because you can quickly diff a memory dump which is your baseline against the currently running application. To get started you should take a memory dump of your current state.

MemAnalyzer -procdump -ma pid C:\temp\Baseline.dmp

After you have optimized the data structures of your application to (hopefully) consume less memory you can compare the running application against your saved baseline

MemAnalyzer -f2 baseline.dmp -pid ddd

When you use -f2, the first input is subtracted from the second one (2-1) and you get a nice diff summary. To keep the output short the diff is sorted by absolute values, which makes it easy to spot the top memory additions and deletions along with the totals.
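
The idea behind such a diff can be sketched in a few lines of C#. This is not MemAnalyzer's actual implementation, only an illustration of "second minus first, sorted by absolute value". The class name HeapDiff is made up, and the sample numbers are taken from the example further down.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class HeapDiff
{
    // Returns (type, delta) pairs with delta = second - first,
    // ordered so that the largest absolute change comes first.
    public static List<KeyValuePair<string, long>> Diff(
        IDictionary<string, long> first, IDictionary<string, long> second)
    {
        return first.Keys.Union(second.Keys)
            .Select(type =>
            {
                // A type that is missing in one input counts as 0 bytes there.
                first.TryGetValue(type, out long bytes1);
                second.TryGetValue(type, out long bytes2);
                return new KeyValuePair<string, long>(type, bytes2 - bytes1);
            })
            .OrderByDescending(pair => Math.Abs(pair.Value))
            .ToList();
    }

    static void Main()
    {
        // Allocated bytes per type: optimized application (1) and baseline (2).
        var optimized = new Dictionary<string, long>
        {
            ["coreapp.DataInstance"] = 200_000_000
        };
        var baseline = new Dictionary<string, long>
        {
            ["System.Func<System.String>"] = 320_000_000,
            ["coreapp.DataInstance"] = 240_000_000
        };

        foreach (var pair in Diff(optimized, baseline))
        {
            Console.WriteLine($"{pair.Value,15:N0}  {pair.Key}");
        }
    }
}
```

With the baseline as the second input, a positive delta means the optimized application uses less memory for that type.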

Let’s do a step by step example of what that means for your usual application development workflow. First we start with our memory hungry application and isolate the memory issue into a single reproducer like this:

using System;
using System.Collections.Generic;

namespace coreapp
{
    class DataInstance : IDisposable
    {
        Func<string> Checker;
        long Instance;
        bool IsDisposed;
        DataInstance[] Childs;

        public DataInstance(int instance)
        {
            Instance = instance;
            Checker = () => $"Instance {Instance} already disposed";
        }

        public void Dispose()
        {
            if (IsDisposed)
            {
                throw new ObjectDisposedException(Checker());
            }
            IsDisposed = true;
        }
    }

    class Program
    {
        static void Main(string[] args)
        {
            var instances = new List<DataInstance>();
            for(int i=1;i<=5*1000*1000;i++)
            {
                instances.Add(new DataInstance(i));
            }
            Console.ReadLine();
        }
    }
}

We give MemAnalyzer the process id to create a baseline memory dump. Since MemAnalyzer uses procdump and VMMap you should have both tools downloaded and in your path to make it work.

MemAnalyzer.exe -procdump -ma 11324 DotNetCoreApp_1.0.dmp

Ok we have a dump of a .NET Core application. How can we look into it?

 

MemAnalyzer.exe -f DotNetCoreApp_1.0.dmp
Error: Is the dump file opened by another process (debugger)? If yes close the debugger first.
       If the dump comes from a different computer with another CLR version v1.0.25211.02 that you are running on your machine you need to download the matching mscordacwks.dll first. Check out https://1drv.ms/f/s!AhcFq7XO98yJgoMwuPd7LNioVKAp_A and download the matching version/s.
       Then set _NT_SYMBOL_PATH=PathToYourDownloadedMscordackwks.dll  e.g. _NT_SYMBOL_PATH=c:\temp\mscordacwks in the shell where you did execute MemAnalyzer and then try again.

Got Exception: System.IO.FileNotFoundException: mscordaccore_Amd64_Amd64_1.0.25211.02.dll

 

Oops, we have got an error. Most people stop reading when an error occurs because error messages are often not that helpful. But this case is different. You need to download the contents of my OneDrive folder from the link in the error message to get nearly all .NET/Core debugging dlls you could ever need. Download them into e.g. C:\PerfTools. Then you need to tell MemAnalyzer where to look for them with the -dacdir option, or you can set the environment variable _NT_SYMBOL_PATH=c:\PerfTools so that you do not need to specify the dac directory manually every time.

MemAnalyzer.exe  -dts -f DotNetCoreApp_1.0.dmp -dacdir c:\PerfTools

AllocatedBytes          Instances(Count)        Type
320,000,000             5,000,000               System.Func<System.String>
240,000,000             5,000,000               coreapp.DataInstance
100,663,368             3                       coreapp.DataInstance[]
24,530                  145                     System.String
33,627,594              139                     Managed Heap(Free)!
660,714,944             10,000,277              Managed Heap(Allocated)!
694,348,008                                     Managed Heap(TotalSize)

We have 660 MB allocated on the managed heap, which is quite a lot of data. There are 5 million Func<string> and 5 million DataInstance instances. But why do we have 3 DataInstance arrays with 100 MB? These look like temporary arrays left over from our List<DataInstance> while it was growing its internal array. To get rid of garbage data you can either call GC.Collect() before taking the dump or tell MemAnalyzer to only track objects which are still alive.

MemAnalyzer.exe  -f DotNetCoreApp_1.0.dmp -dacdir c:\PerfTools -live

AllocatedBytes          Instances(Count)        Type
320,000,000             5,000,000               System.Func<System.String>
240,000,000             5,000,000               coreapp.DataInstance
67,108,912              2                       coreapp.DataInstance[]
24,530                  145                     System.String
627,160,448             10,000,275              Managed Heap(Allocated)!
694,348,008                                     Managed Heap(TotalSize)

There is still one array left which does not belong there, but the numbers are better now. While looking at the data I decided that we should get rid of the many delegate instances, which cost 64 bytes per instance and add up to 320 MB for the instances alone. But since each DataInstance object also keeps a reference to its delegate (8 bytes on x64) we have even more memory to spare. If we get rid of the delegate and remove the class member we should be able to save 5m*(64+8)=360MB of memory. That’s a plan. Let’s measure things. Our refactored class becomes

    class DataInstance : IDisposable
    {
        long Instance;
        bool IsDisposed;
        DataInstance[] Childs;

        public DataInstance(int instance)
        {
            Instance = instance;
        }

        public void Dispose()
        {
            if (IsDisposed)
            {
                throw new ObjectDisposedException($"Instance {Instance} already disposed");
            }
            IsDisposed = true;
        }
    }
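
Before measuring, the savings estimate from above can be spelled out as a tiny C# calculation. The class and parameter names are made up for illustration. The numbers are the ones from the text: 5 million instances, 64 bytes per delegate and 8 bytes for the reference on x64.

```csharp
using System;

static class SavingsEstimate
{
    // Bytes saved when every instance loses its delegate object
    // and the field that referenced it (hypothetical helper).
    public static long Estimate(long instances, int delegateBytes, int referenceBytes)
        => instances * (delegateBytes + referenceBytes);

    static void Main()
    {
        long saved = Estimate(instances: 5_000_000, delegateBytes: 64, referenceBytes: 8);
        Console.WriteLine(saved); // 360000000
    }
}
```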

By taking a second dump we can diff both dump files with

MemAnalyzer.exe -f DotNetCoreApp_NoFuncDelegate.dmp -f2 DotNetCoreApp_1.0.dmp -dacdir c:\PerfTools

Delta(Bytes)    Delta(Instances)        Instances       Instances2      Allocated(Bytes)        Allocated2(Bytes)       AvgSize(Bytes)  AvgSize2(Bytes) Type
320,000,000     5,000,000               0               5,000,000       0                       320,000,000                             64              System.Func<System.String>
40,000,000      0                       5,000,000       5,000,000       200,000,000             240,000,000             40              48              coreapp.DataInstance
0               0                       1               1               160                     160                     160             160             System.Globalization.CalendarData
360,000,000     5,000,000               5,000,277       10,000,277      300,714,930             660,714,930                                             Managed Heap(Allocated)!
360,010,320     0                       0               0               334,337,688             694,348,008                                             Managed Heap(TotalSize)

As expected we got rid of 5 million Func<String> instances. After removing one field in DataInstance the instance size shrank by 8 bytes from 48 down to 40 bytes, which saved another 40MB. That is already quite good. But can we do better? The dispose check is an extra bool flag which needs 4 bytes anyway due to padding. To eliminate the bool field we can reuse the Instance field and negate it on dispose, so we keep the stored value, which is always > 0. When you look closely you find that Instance is of type long, but we only need an int because the ctor always assigns the value from an int. The revised DataInstance class is now

    class DataInstance : IDisposable
    {
        int Instance;
        DataInstance[] Childs;

        public DataInstance(int instance)
        {
            Instance = instance;
        }

        public void Dispose()
        {
            if (Instance < 0)
            {
                throw new ObjectDisposedException($"Instance {-1*Instance} already disposed");
            }

            Instance *= -1; 
        }
    }
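
The instance sizes of the three DataInstance versions can be estimated up front with a rough model of the x64 object layout. The sketch below assumes 16 bytes of per-object overhead (object header plus method table pointer) and field data that is rounded up to the next multiple of 8 bytes. The real CLR layout rules are more involved, so treat this as a back-of-the-envelope check. ObjectSizeModel is a made-up name.

```csharp
using System;
using System.Linq;

static class ObjectSizeModel
{
    // Assumption: object header + method table pointer on x64.
    const int HeaderBytes = 16;

    // Rough estimate: overhead plus the field bytes rounded up to a multiple of 8.
    public static int Estimate(params int[] fieldSizes)
    {
        int fieldBytes = fieldSizes.Sum();
        int paddedBytes = (fieldBytes + 7) / 8 * 8;
        return HeaderBytes + paddedBytes;
    }

    static void Main()
    {
        // Field sizes on x64: reference = 8, long = 8, int = 4, bool = 1
        Console.WriteLine(Estimate(8, 8, 1, 8)); // Func reference, long, bool, array reference: 48
        Console.WriteLine(Estimate(8, 1, 8));    // long, bool, array reference: 40
        Console.WriteLine(Estimate(4, 8));       // int, array reference: 32
    }
}
```

The first two estimates match the AvgSize values of 48 and 40 bytes from the previous diff, and the model predicts 32 bytes for the revised class.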

When we diff things again

MemAnalyzer.exe -f DotNetCoreApp_NoFuncDelegate_intFieldNoDisposeFlag.dmp -f2 DotNetCoreApp_1.0.dmp -dacdir c:\PerfTools

Delta(Bytes)    Delta(Instances)        Instances       Instances2      Allocated(Bytes)        Allocated2(Bytes)       AvgSize(Bytes)  AvgSize2(Bytes) Type
320,000,000     5,000,000               0               5,000,000       0                       320,000,000                             64              System.Func<System.String>
80,000,000      0                       5,000,000       5,000,000       160,000,000             240,000,000             32              48              coreapp.DataInstance
33,554,456      1                       2               3               67,108,912              100,663,368             33554456        33554456        coreapp.DataInstance[]
24              1                       1               2               24                      48                      24              24              System.Int32
0               0                       2               2               208                     208                     104             104             System.Globalization.CultureInfo
0               0                       2               2               912                     912                     456             456             System.Globalization.CultureData
433,554,480     5,000,002               5,000,275       10,000,277      227,160,450             660,714,930                                             Managed Heap(Allocated)!
400,011,856     0                       0               0               294,336,152             694,348,008                                             Managed Heap(TotalSize)

Since we compare against the original baseline we directly see that memory consumption improved by 433MB. That is 65% less memory! Not bad. If you want to keep going fast you can directly compare a memory dump against a running process to check if a temporary optimization pays off. I have found the VS profiler to break when larger x86 applications were profiled, because VS seems to load the data into an x86 process as well, where the fancier object graph calculations fail because VS runs out of memory…

VS 2017.3 does not yet recognize CoreClr memory dumps as managed processes, so managed heap analysis still requires PerfView, Windbg or MemAnalyzer.

SOS and mscordacwks, mscordaccore Collection

Even if you are not interested in MemAnalyzer you might stop by for the biggest collection of SOS and mscordacwks debugging dlls for all .NET versions I could get my hands on. When you analyze memory dumps taken from other machines you need a close version match with Windbg or an exact version match with PerfView / ClrMd. Inside Microsoft this is a non-issue because their symbol servers distribute the matching binaries without any hassle. We outsiders have to copy the corresponding debugging libraries from the original machine or from the corresponding .NET installer. To spare you the time to hunt for the matching debugging dlls I share my collection of mscordacwks files as a OneDrive link: https://1drv.ms/f/s!AhcFq7XO98yJgoMwuPd7LNioVKAp_A

Currently it contains the versions

2.0.50727.1434
2.0.50727.3607
2.0.50727.3615
2.0.50727.4016
2.0.50727.4062
2.0.50727.4200
2.0.50727.4408
2.0.50727.4455
2.0.50727.4927
2.0.50727.4952
2.0.50727.5403
2.0.50727.5420
2.0.50727.5444
2.0.50727.5448
2.0.50727.5456
2.0.50727.8009
4.0.30319.01
4.0.30319.1008
4.0.30319.1022
4.0.30319.17379
4.0.30319.18052
4.0.30319.18408
4.0.30319.18444
4.0.30319.2034
4.0.30319.225
4.0.30319.237
4.0.30319.269
4.0.30319.296
4.0.30319.33440
4.0.30319.34003
4.0.30319.34011
4.0.30319.34014
4.0.30319.34209
4.0.30319.46960
4.6.100.00
4.6.1085.00
4.6.127.01
4.6.1586.00
4.6.1590.00
4.6.1637.00
4.6.81.00
4.6.96.00
4.7.2053.00

.NET Core

.NET Core 1.0     1.0.25211.02
.NET Core 1.1     4.6.25211.01
.NET Core 2.0 x64 4.6.25519.02
.NET Core 2.0 x86 4.6.25519.03

It is interesting to note that .NET Core 2.0 has different build numbers for the x86 and x64 versions. It looks like one blocking issue needed fixing before they released it to a wider audience.

Conclusions

Your toolbox has just got a little bigger. As always, use the right tool for the job. MemAnalyzer is not the silver bullet for all of your memory problems, but it tries its best to give you fast feedback without the overhead of a fancy UI, which makes it easy to put it into your existing leak tracking/reporting scripts. If you want to share success stories, sound off in the comments. If you want to report bugs/issues, please open an issue at https://github.com/Alois-xx/MemAnalyzer/issues. Now go and improve the memory footprint of your app!
