This afternoon, I pushed out a couple of bug fixes for Taptics to the Windows Phone Dev Center. I feel like ranting a bit about it, as the fixes could have been done and pushed out a lot earlier were it not for the rudimentary level of detail I got from the Dev Center crash logs and the bug reports generated by Telerik’s RadDiagnostics component.
What it all boils down to: The Task-based Asynchronous Pattern, known to most of us mortals as async/await. And, to a slightly lesser degree, parallelization.
You see, in Taptics, when a score is submitted in game, I have to submit it multiple times because of how leaderboards work on the Scoreloop platform (which is what I’m using for that, as well as for the upcoming achievements feature planned for the next big release). Players can also connect Twitter and Facebook to the game and send out a message with their score. Rather than chain these operations together serially (one at a time, waiting for each to complete before starting the next), I start them all right away and just wait for them all to finish with the help of Task.WaitAll.
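As a rough sketch of that start-everything-then-wait idea (this is not the actual Taptics code; `SubmitAsync`, `SubmitEverywhere` and the target names are made-up stand-ins for the real Scoreloop, Twitter and Facebook calls):

```csharp
using System;
using System.Threading.Tasks;

public static class ScoreSubmission
{
    // Hypothetical stand-in for one network call (a leaderboard, Twitter, Facebook).
    private static async Task<string> SubmitAsync(string target, int score)
    {
        await Task.Delay(100); // simulate network latency
        return target + " accepted " + score;
    }

    // Starts every submission immediately, then blocks until all of them finish.
    public static string[] SubmitEverywhere(int score)
    {
        // Each call returns a task that is already running, so nothing here
        // waits for the previous submission to complete.
        Task<string>[] submissions =
        {
            SubmitAsync("daily leaderboard", score),
            SubmitAsync("all-time leaderboard", score),
            SubmitAsync("Twitter", score),
            SubmitAsync("Facebook", score)
        };

        // Blocks the calling thread until every task is done.
        // If any task faulted, this throws an AggregateException.
        Task.WaitAll(submissions);

        // Safe to read Result now: every task has completed successfully.
        return Array.ConvertAll(submissions, t => t.Result);
    }
}
```

Calling `ScoreSubmission.SubmitEverywhere(1200)` takes roughly as long as the slowest submission rather than the sum of all four. One caveat: `Task.WaitAll` blocks, so this belongs on a background thread. Blocking a UI thread on tasks that need to resume on that same thread can deadlock.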
So, what happens if an exception is thrown within one of the tasks you’re waiting on? In the case of await with a single task, it’s more or less the same as an exception being thrown in classic synchronous code. The stack trace is a bit different, though, as the exception has been thrown twice: first in the method or lambda that the task is running asynchronously, and then again in the state machine hidden behind your await call. This was problem number one: the stack traces I was getting from these asynchronous method exceptions let me know where the exception was thrown the second time, but gave me no (or almost no) information about the original throw. That led to tedious work looking for possible bugs, often with the help of both JustDecompile and ILSpy to peek into the state machine’s inner workings.
What happens when you’re waiting on a group of tasks, though? Here we run into the issue I had with RadDiagnostics. You see, Task.WaitAll doesn’t throw you back any old exception. What would happen if multiple tasks threw? To cover that situation, Microsoft introduced the AggregateException class, which acts as a collection of exceptions. You can enumerate the InnerExceptions property of AggregateException to get the actual exceptions thrown by the tasks. Unfortunately, RadDiagnostics isn’t bright enough on its own to do that, so its reports in that and similar situations where tasks were run side-by-side were practically useless.
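To make that concrete, here is a minimal sketch of catching the AggregateException from Task.WaitAll and walking InnerExceptions. The `FailAsync` helper and the error messages are invented for illustration; they just simulate two submissions failing at once.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class WaitAllFailures
{
    // Hypothetical helper: a task that fails after a short delay.
    private static async Task FailAsync(Exception error)
    {
        await Task.Delay(50);
        throw error;
    }

    // Runs two failing tasks side by side and returns a description
    // of each real exception hidden inside the AggregateException.
    public static List<string> CollectFailureMessages()
    {
        var messages = new List<string>();

        Task first = FailAsync(new InvalidOperationException("leaderboard rejected score"));
        Task second = FailAsync(new TimeoutException("Twitter timed out"));

        try
        {
            Task.WaitAll(first, second);
        }
        catch (AggregateException aggregate)
        {
            // The aggregate itself says little; the useful details
            // are in the exceptions it wraps.
            foreach (Exception inner in aggregate.InnerExceptions)
            {
                messages.Add(inner.GetType().Name + ": " + inner.Message);
            }
        }

        return messages;
    }
}
```

Both failures show up in the returned list, which is exactly the information a report that only logs the outer AggregateException leaves out.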
Fortunately, that was easier to fix than the earlier case, even though it meant pushing a Taptics update that did nothing but improve bug reporting. RadDiagnostics provides an event that is fired before actually submitting reports, in case a developer wants to transmit them via a method other than email. It can also be used to tack on custom data for error reports, and so I was able to add in some code to handle AggregateException. Now, whenever such an exception percolates up and gets caught by RadDiagnostics, I at least know what’s causing it and where.
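The RadDiagnostics hook itself isn't reproduced here, since guessing at Telerik's event names would do more harm than good. The part that matters is turning an exception into useful report text, and that can be sketched independently. `ExceptionReport.Describe` below is a hypothetical helper, one possible shape for such code rather than what actually shipped in Taptics:

```csharp
using System;
using System.Text;

public static class ExceptionReport
{
    // Builds report text for any exception, expanding AggregateException
    // so the wrapped exceptions (and their stack traces) are included.
    public static string Describe(Exception error)
    {
        var aggregate = error as AggregateException;
        if (aggregate == null)
        {
            // Ordinary exception: ToString() already has type, message and stack trace.
            return error.ToString();
        }

        // Flatten() collapses AggregateExceptions nested inside other
        // AggregateExceptions into a single level.
        var inners = aggregate.Flatten().InnerExceptions;

        var report = new StringBuilder();
        report.AppendLine("AggregateException with " + inners.Count + " inner exception(s):");
        for (int i = 0; i < inners.Count; i++)
        {
            report.AppendLine("[" + i + "] " + inners[i]);
        }
        return report.ToString();
    }
}
```

The string this returns could then be attached as custom data wherever the reporting component allows it.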
By the way, I should note that even if only one of the tasks passed to Task.WaitAll throws, the exception is still wrapped in an AggregateException. I only ever had AggregateExceptions with a single inner exception, but even in that case, you should go through InnerExceptions. The regular InnerException property looked null in my reports, though on the desktop .NET Framework it normally points at the first entry of InnerExceptions, so I wouldn’t rely on it either way.
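The difference between the two cases is easy to see side by side. In this small sketch (`FailAsync` and the class name are invented for the demo), the same failing task surfaces as its original type when awaited, but as an AggregateException when waited on with Task.WaitAll:

```csharp
using System;
using System.Threading.Tasks;

public static class WrappingDemo
{
    // Hypothetical failing operation.
    private static async Task FailAsync()
    {
        await Task.Delay(50);
        throw new InvalidOperationException("score rejected");
    }

    // await rethrows the original exception, so the catch block
    // sees InvalidOperationException directly.
    public static async Task<string> CaughtByAwaitAsync()
    {
        try
        {
            await FailAsync();
            return "nothing thrown";
        }
        catch (Exception error)
        {
            return error.GetType().Name;
        }
    }

    // Task.WaitAll wraps the failure, even though only one task threw.
    public static string CaughtByWaitAll()
    {
        try
        {
            Task.WaitAll(FailAsync());
            return "nothing thrown";
        }
        catch (Exception error)
        {
            return error.GetType().Name;
        }
    }
}
```

`CaughtByAwaitAsync` reports the original exception type, while `CaughtByWaitAll` reports `AggregateException`, so any catch-all handler further up needs to be ready for both.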
In the end, I’ve strengthened my ability to deal with pesky exceptions crawling out of the async woodwork, and hopefully the next update of Telerik’s controls will offer a more AggregateException-friendly RadDiagnostics.
UPDATE March 22: Telerik has listened! Future versions of RadDiagnostics will keep an eye out for AggregateException and give expanded reports when they’re encountered.