Windbg is one of the most powerful yet underused tool in every Windows Developers toolbox. Some time ago a new fancier version of Windbg (Windbg Preview) which looks like a todays modern application was released as Windows Store App. So far the UI has got a nice ribbon but under the hood no real changes were visible. Today a new version was released which supports a long time internally used tool called time travel tracer (TTT).
The official documentation https://docs.microsoft.com/de-de/windows-hardware/drivers/debugger/debugging-resources is quite good and here is the blog post announcing Time Travel Debugging https://blogs.msdn.microsoft.com/windbg/2017/09/25/time-travel-debugging-in-windbg-preview/.
If you have a hard to debug race condition you can now attach the Windbg and check the “Record process with Time Travel Debugging”
The trace file can grow many GB in size which can make it cumbersome if you want to reproduce an error which needs some time to surface. Luckily Windbg is, although a Store App, still xcopy deployable. Search with Process Explorer where the debugger exe sits and copy the contents of the directory C:\Program Files\WindowsApps\Microsoft.WinDbg_184.108.40.206_x86__8wekyb3d8bbwe\ to your own tool folder like Windbg_220.127.116.11. Then you can run it e.g. from a memory stick or a network share as usual. After searching a bit in the debugger directories you will find the command line version of the Time Travel Trace tool located at
- x64 Windbg_1.0.13\amd64\TTD\TTD.exe
- x86 Windbg_1.0.13\amd64\TTD\wow64\TTD.exe
Now you can go to your problematic machine where the problem occurs without the need to install a store app which might not be possible due to corporate policy/firewall/isolated test network, …. To record a time travel trace from the command line I normally use it in ring buffer mode with a 2 GB buffer which should cover a few seconds or minutes depending on the application activity.
D:\Tools\Windbg_1.0.13\amd64\TTD>ttd -ring -maxfile 2048 -out c:\temp\test.run -launch D:\cppError.exe
Microsoft (R) TTD 1.01.02
Copyright (C) Microsoft Corporation. All rights reserved.
cppError.exe(x86) (1040): Tracing stopped after 31ms
Ring trace dumped to c:\temp\test.run
You get a 2 GB file although the actually recorded data might be much smaller. If you have a short repro it might be better to skip the ring buffer setting.
Once you have the data it is time to leave the crime scene pack the .run file and analyze it at your favorite developer machine. You can double click the .run file or fire up Windbg and select the Open Trace option. Once you have loaded the trace you can press g to let the application run until the first exception happens or you can set breakpoints. But if nothing is set the application will stop a the first exception with the actual source window and the current values of the local function:
We find that whenever i is 5 we run into an issue which you could also have found also with a memory dump. But now you can travel back in time by entering p- to check the values were just before the crash. This is incredibly powerful to find the root cause how you did get into a situation. If Time Travel Debugging works it is a great tool. Just keep in mind that it makes the application around 10x times or more slower. You should not expect great performance if Time Travel recording is running.
As far as I can tell it looks like Time Travel Tracing is built upon the Intel Processor Tracing feature https://software.intel.com/en-us/blogs/2013/09/18/processor-tracing which enables recording the full instruction stream along with all touched memory which is a really powerful feature.
With every great tool there are things left to be desired.
- Windbg supports no managed source code window
- No managed breakpoint setting in source code
- No managed local variables
- No managed call stack
- SOS.dll seems not to work at all with time travel traces
The debugger shell seems to exist only in the x64 flavor which makes it impossible to load SOS.dll for 32 bit applications into Windbg because the bitness of SOS.dll must match with the bitness of the debugger executable. When I try to get a mixed mode stack inside Windbg SOS.dll can be loaded but it seems to be missing vital information. I really would like to use Windbg with time travel tracing support for managed code (regular .NET Framework and .NET Core) but until now this is a C/C++ fan boys tool only.
Time Travel Tracing is a novel debugging technique which enables us developers to solve otherwise nearly impossible to find data races in a deterministic way. I hope that the Windbg team will add managed code support in the near future which would bring back feature parity between languages. Why should only C/C++ developer get all the great tools?