As you may recall, I have the great pleasure of participating again in Google Summer of Code this year with Mono.
Since we are nearing the midterm evaluation, I thought I would do a kind of status report like last year. Unfortunately, I haven't been as active as I wanted to be this time (exams were more time-consuming this semester). However, some cool stuff has already landed, and it is described below.
What has been done since last year
Actors and Software-transactional-memory goodness
I took some time to implement rudimentary versions of those two parallel programming paradigms earlier this year.
See this post, which describes the ideas behind them in more detail and gives some examples.
New, more efficient scheduler deque
The scheduler's deque that was used before was quite complex because its underlying storage was a doubly linked list, which is rather hard to get right once you add parallelism and concurrency constraints (see the ABA problem, for instance).
The algorithm I was using was mostly designed with C++ in mind, where you can mess around with pointers pretty easily and freely use CAS on pointers as if they were integers. Since I wanted to avoid any kind of unsafe or native code in the library, I tried to port that algorithm to C#.
After some mail exchanges with a fellow user (hey Susan o/) who was running Mono's ParallelFx on a big box in a laboratory, we started to see some concurrency problems in my code. It turned out that the ABA prevention code wasn't really working in my C# rewrite. I therefore decided to hunt for another, more C#-friendly kind of scheduler deque.
I did find one, and it's the one used now under the CyclicDeque name. It's particularly swift because it only does integer manipulations, which are very fast with the C# Interlocked methods. It also doesn't suffer from the ABA problem because it's based on a forward-only algorithm: its 64-bit indices only ever increase, and the range of a 64-bit integer is vast enough that a value is never reused in practice.
In the tests I was able to run, this new deque works more reliably and faster than the previous one. It's currently enabled by default, but I need to do some checks on 32-bit platforms to see if it behaves as expected there.
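To give a rough idea of what "forward-only" means here, below is a minimal sketch of a fixed-size work-stealing deque in that spirit. It is not the actual CyclicDeque code: the class name ForwardOnlyDeque, its method names and the fixed capacity are all made up for illustration, and a production version would need a careful review against the memory model. The point is only that both ends are plain 64-bit counters, that the stealing end is advanced exclusively through Interlocked.CompareExchange, and that a counter value never comes back once it has been passed.

```csharp
using System;
using System.Threading;

// Hypothetical, simplified sketch; NOT Mono's actual CyclicDeque.
public class ForwardOnlyDeque<T>
{
    const int Size = 1 << 10;   // fixed capacity (assumption), power of two
    const int Mask = Size - 1;

    readonly T[] slots = new T[Size];
    long bottom;                // moved only by the owner thread
    long top;                   // only ever incremented, always through CAS

    // Owner thread only: add an item at the bottom end.
    public bool TryPushBottom(T item)
    {
        long b = Interlocked.Read(ref bottom);
        long t = Interlocked.Read(ref top);
        if (b - t >= Size)
            return false;                          // deque is full
        slots[(int)(b & Mask)] = item;
        Interlocked.Exchange(ref bottom, b + 1);   // publish the new item
        return true;
    }

    // Owner thread only: take the most recently pushed item.
    public bool TryPopBottom(out T item)
    {
        long b = Interlocked.Decrement(ref bottom);
        long t = Interlocked.Read(ref top);
        if (b < t)
        {
            Interlocked.Exchange(ref bottom, t);   // empty: restore bottom
            item = default(T);
            return false;
        }
        item = slots[(int)(b & Mask)];
        if (b > t)
            return true;                           // no thief can reach this slot
        // Last item left: race against thieves by trying to advance top.
        bool won = Interlocked.CompareExchange(ref top, t + 1, t) == t;
        Interlocked.Exchange(ref bottom, t + 1);
        if (!won)
            item = default(T);
        return won;
    }

    // Any thread: steal the oldest item from the top end.
    public bool TryPopTop(out T item)
    {
        long t = Interlocked.Read(ref top);
        long b = Interlocked.Read(ref bottom);
        if (t >= b)
        {
            item = default(T);
            return false;                          // nothing to steal
        }
        item = slots[(int)(t & Mask)];
        // top never decreases, so this CAS succeeds only if nobody else
        // claimed index t in the meantime: a stale value cannot reappear.
        if (Interlocked.CompareExchange(ref top, t + 1, t) != t)
        {
            item = default(T);
            return false;
        }
        return true;
    }
}

static class Program
{
    static void Main()
    {
        var deque = new ForwardOnlyDeque<string>();
        deque.TryPushBottom("a");
        deque.TryPushBottom("b");
        deque.TryPushBottom("c");

        string stolen, popped;
        deque.TryPopTop(out stolen);      // a thief takes the oldest item
        deque.TryPopBottom(out popped);   // the owner takes the newest item
        Console.WriteLine(stolen + " " + popped);   // prints "a c"
    }
}
```

Compared with a linked list, there are no nodes to recycle, so there is no pointer that can be freed and reallocated behind a thread's back. The slot index is derived from the counter with a mask, and it is the counter, not the slot, that the CAS checks.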
Building on the new types introduced during my first SoC and the two parallel paradigms described above, I have written some more parallel and concurrent code, meant for both internal and public use.
One of those is a new collection, ConcurrentSkipList, which provides a thread-safe implementation of a skip-list (a nice tree-ish list container). This skip-list implementation is also used for the ConcurrentDictionary type.
The other is a stripped-down CountdownEvent called Snzi (Scalable non-zero indicator). It basically does the same thing, except that instead of exposing the count itself, Snzi only tells you, in a binary fashion, whether or not there is a count remaining. Those weaker semantics open the door to more scalable and efficient implementations.
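As an illustration of that contract only, here is a deliberately naive sketch. It is not Mono's Snzi: the name NonZeroIndicator and its members are hypothetical, and it uses a single shared counter, whereas the whole point of a real scalable non-zero indicator is to spread arrivals over several counters so that threads do not all contend on one memory location. What the sketch does show is the narrowed interface: callers can arrive, depart and ask "is anything left?", but never read the count.

```csharp
using System;
using System.Threading;

// Hypothetical flat sketch of the contract; NOT Mono's Snzi implementation.
public class NonZeroIndicator
{
    long count;   // never exposed to callers

    // Signal that one more participant is present.
    public void Arrive()
    {
        Interlocked.Increment(ref count);
    }

    // Signal that a participant is done; must be paired with an earlier Arrive.
    public void Depart()
    {
        Interlocked.Decrement(ref count);
    }

    // The only query available: is there at least one participant left?
    public bool IsSet
    {
        get { return Interlocked.Read(ref count) > 0; }
    }
}

static class Program
{
    static void Main()
    {
        var indicator = new NonZeroIndicator();
        indicator.Arrive();
        indicator.Arrive();
        Console.WriteLine(indicator.IsSet);   // True
        indicator.Depart();
        Console.WriteLine(indicator.IsSet);   // True
        indicator.Depart();
        Console.WriteLine(indicator.IsSet);   // False
    }
}
```

Because callers can only observe zero versus non-zero, an implementation is free to keep the exact count distributed and only propagate the transitions between "empty" and "not empty", which is where the scalability comes from.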
Optimizing, fixing and hardening
The final task that occupied me during the inter-SoC period was mostly tuning and bug-fixing the existing parts, with a focus on Task reliability and on PLinq performance and correctness.
What I have begun to do for this SoC
Currently I'm working hard on the .NET 4 port of Mono's ParallelFx, as it comes with a whole lot of new stuff and API changes.
The ParallelFx team over at Microsoft has been publishing posts these last months about the new things coming down the pipe (check out their blog if you haven't done so yet).
At the moment, the System.Threading.Tasks namespace port is fairly complete and Tasks/Future unit tests are all back to green.
I'm now working a bit on the Collections namespace, adapting some of my code to the new API (notably ConcurrentDictionary) and seeing how to implement the new Partitioner pattern.
What to expect next
First of all, the following weeks are going to be much more productive because, with big thanks to Alan and Miguel, I'm going to spend a month in Dublin hacking in the Novell offices. Looking forward to this.
As for what comes next, the plan is to continue porting the existing code to .NET 4, first the Parallel loops class (probably with some further optimizations to data source partitioning) and then PLinq.
In addition, since Mono recently enabled the .NET 4 profile in SVN, some of the ParallelFx code will soon move from the Google Code repository to Mono's official trunk for early mass consumption.
Finally, last but not least, I'm going to devote the rest of the summer to testing ParallelFx more extensively, both by improving the existing test suite with harder parallel stress tests and by developing a Chess-like parallel correctness checker.
See you at the end of the SoC for another full report ;-) .