Being a coding monkey (again)

Category : English, Life, Mono

Monkey Face by Sentrawoods

Following the informal one-month internship I did last summer, I will be back at Novell, in the lovely Dublin offices, this time for a full-blown seven-month session starting in August.

As you can expect, my main area of work will be ParallelFx and other 4.0 goodness but I hope to do some other interesting work on the side.

I very much look forward to this new opportunity of working full time on the ever awesome Mono and with its talented hacker team.

See you around a (Guinness) pint!

Tying Mono and OpenCL

Category : C#, English, Mono, Programming

One of the cool buzzwords these days, besides parallel computing, is GPGPU, as in General-Purpose GPU.

The base idea of GPGPU is to allow a programmer to tap into the parallel processing capacities of a GPU to do something other than pixel crunching (even though it mostly boils down to that very application most of the time).

GPGPU isn’t new, and for a long time vendors have provided their own rather low-level toolkits to program a GPU arbitrarily. As often happens, someone stepped up one day and proposed a unified API on top of all the vendor specificities. In our case, that API is a standard defined by the Khronos group (think OpenGL) and is named OpenCL.

OpenCL defines both a high-level C API to manipulate GPUs (plus other friends) and a C-like intermediate language that is compiled and ultimately run on the device. It’s actually the same model as shader programming.

In the pursuit of Mono world domination, and following the implementation of PLinq, which strives to be an easy way for programmers to access the multicore architecture of our processors via Linq, I hereby introduce GLinq (an unoriginal acronym for GPU Linq), which this time uses your graphics card to execute Linq queries via OpenCL.

GLinq is still very much in its infancy and currently only supports Linq’s Select operator, but I hope to be able to implement several other operators (at least the classic Where-Aggregate combo). The end goal is also to provide a new set of operators to allow OpenGL code to access the results of the computation without doing a roundtrip between CPU and GPU.

As I said previously, GLinq has the same design goal as PLinq, which was to provide a totally new execution model for programmers in the most transparent way possible and with little modification to existing code. In our case, this prerequisite is rather tricky since you are juggling two totally different worlds.

GLinq tries to provide that seamless experience with two features. The first one is the use of expression trees and C# compiler lambda magic to automatically capture the meaning of a C# expression and rewrite it in the OpenCL intermediate language. The second one is an automatic mapping of C# types and methods (e.g. Math class functions) to the corresponding OpenCL symbols, without sacrificing OpenCL specificities, which are also exposed through a C# API (not yet complete).

Of course, the GLinq API is still a transcription of standard Linq, so you use it in the same way via the GpuEnumerable class. At the moment, the only supported inputs are Range and Repeat, but that will easily be changed to support any kind of IEnumerable.

Below is an example of the kind of query you can already execute:


var query = GpuEnumerable.Range (-20, 50)
			 .Select ((i) => i * -2 + 3)
			 .Select ((i) => Math.Abs (i))
			 .Select ((i) => ExtraMath.Ldexp ((float)i, 2f))
			 .Select ((i) => (int)i);

foreach (var i in query)
	Console.Write ("{0}, ", i.ToString ());

As you can see, there are a lot of Select operators here. This comes from the fact that the compiler is only able to convert a pure expression lambda to its corresponding expression tree. You can circumvent this limitation by building the expression tree yourself or by using any compiler/interpreter that outputs expression trees (e.g. IronPython, IronRuby, Bechamel).

On a final note, there is an acknowledged bug in at least the NVidia implementation of OpenCL which prevents Mono from playing nicely with it. This is due to the fact that (most) OpenCL implementations use LLVM for code generation/optimization, which in turn messes with Mono’s internal workings.

This problem is going to be fixed in the next NVidia driver iteration, but as a work-around you can apply this simple patch to Mono. I put SIGXFSZ (initially SIGWINCH) there because it looks cool and seems generally unused, but any other signal should do the trick.

Code in its proof-of-concept form is living here: http://git.neteril.org/glinq/

PLinq is in

Category : C#, English, Mono, Programming

The Delta by ecstaticist under CC by-nc-sa

It's all about splitting work

At long last, PLinq has finally found its way into Mono trunk as what can be considered a preview release. The code has seen a major rewrite compared to the initial implementation done during an earlier Google Summer of Code, which lacked flexibility and correctness.

The public API is complete, except for some operators that are still not implemented. In addition, and as implied by the preview label, the current code hasn’t been tested against all possible flavors of query, so it might or might not work for you (e.g. it could deadlock). If that’s the case, don’t hesitate to file a bug report with a query reproducing the problem. The more queries I have, the more robust we can make future releases.

By the way, these PLinq performance tips still hold true.

Mono @ OpenJam.ie

Category : English, Mono, Programming

The Ubuntu-ie folks are running a slightly tuned Ubuntu Global Jam here in Dublin on Saturday 27th March, where basically everyone is welcome to drop by and slackwork on any project they want.

The event has been dubbed OpenJam and every Mono hacker/enthusiast/bystander is most welcome to register on the Mono Corner of the Signup page.

More information and the program for the day are available on Laura’s blog.

FOSDEM wrap up

Category : English, Mono, Programming

This is kinda late, but better late than never.

This 10th edition of FOSDEM was a blast again. Got to see good friends again and meet new people.

This year was a bit special too because we happened to have a dedicated room for Mono thanks to the work of Ruben and Stéphane.

It was even more special for me since I also gave a talk about ParallelFx, your favorite parallel toolkit under Mono. Thanks to the work of Andrius, we have complete and high-quality video coverage of the day, and Ruben has nicely summed up all the different links to each presentation on his blog.

All talks were great and you have no excuse for not watching all of them. On a personal note and related to ParallelFx, I recommend Alan’s talk on MonoTorrent (especially the part on threading) and Stéphane’s talk about Mono.Simd. I also really enjoyed Jim Purbrick’s talk on Second Life.

See you next year!

Working on Mono with git-svn

Category : English, Mono, Programming

CC by-nd by W. T. L.

I won’t go into depth here on how to use git; the official hub already has tutorials to get you started. I’m more interested here in the workflow you can apply and the tools you can use when working on Mono with git-svn (although most of it can be used in a general way too).

Note: this post is heavily based on the materials written by the accessibility team.

Note²: Before anything else, you should first read this guide, which explains how to set up your git-svn local copy using an existing git mirror. At this point, I expect that you have successfully completed that step.

Mirroring your work

The workflow I will describe here is rather destructive as you are going to end up merging some bits of git history when sending your commit to Subversion. As such, I strongly advise that you have another git mirror of your work somewhere to act as a backup with all your history.

You could either go public with your own server/repo.or.cz/github/gitorious/whatever or simply use a local copy situated e.g. on another drive.

In any case, this repository is not really meant for public consumption (i.e. people directly working on your git tree) as it’s going to be quite unstable with frequent history breaks.

Once you have decided where you want to host your mirror, add a remote pointing to its location with the git remote command:

git remote add name foo@yourdomain.org:mcs.git

When you want to update your remote copy, just issue git push with the remote name and the --mirror option:

git push --mirror name
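To make the two commands above concrete, here is a small sketch you can run in a scratch directory. It is only an illustration: the bare repository on the local disk stands in for whatever host you picked, and the names backup, mcs and feature-xyz are placeholders of mine, not something the workflow requires.

```shell
set -e
work=$(mktemp -d)

# Stand-in for the remote host: a bare repository on the local disk.
git init -q --bare "$work/backup.git"

# Stand-in for your git-svn working copy.
git init -q "$work/mcs"
cd "$work/mcs"
git symbolic-ref HEAD refs/heads/master
git config user.name "John Doe"
git config user.email "john.doe@foobar.com"
echo "hello" > file.txt
git add file.txt
git commit -q -m "Initial commit"
git branch feature-xyz

# Register the backup location, then mirror every ref to it.
git remote add backup "$work/backup.git"
git push -q --mirror backup

# List the branches the backup now holds.
git --git-dir="$work/backup.git" for-each-ref --format='%(refname)' refs/heads
```

The final command should list both master and feature-xyz, since --mirror pushes every ref rather than only the current branch.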

Synchronizing with trunk

The first thing to know is how to pull Subversion revisions back into your local repository. As with any command that has to deal with Subversion, the common prefix to use is git svn. This action must be done on the master branch, so if you are on another one, switch back to master with git checkout before doing anything else.

Here, the command we are interested in is git svn rebase. This command will fetch the revisions from trunk, convert them to git commits and replay your local changes (if any) on top of them:

git svn rebase

Making changes

The equivalent of trunk in git is the master branch. This branch should remain clean of any change and should only be used when you are ready to commit your work to Subversion.

In git, branches are everywhere and are the most straightforward way to organize your changes hierarchically. As such, anything you plan to do should be separated into its own branch with a name like feature-xyz. You can also have a feature branch that depends on another feature branch; I will talk about these later, and especially about how to merge them back.

The command:

git checkout -b feature-xyz

will switch you to a new branch where you can happily do your stuff.

What works well here is to follow the scheme: code a piece → commit → test → fix → commit. It will especially help when you have to crawl back through your history at a later point with, for instance, git bisect.

Notice that commit message style doesn’t matter at this point; use the style that suits you best because you are the only one who will read these messages.

During the lifetime of your development branch, it’s likely that some change introduced in Subversion will conflict with what you have done. As such, it’s always a good idea to frequently sync your local copy with Subversion (see above) and then rebase your development branches on top of master.

The git rebase command serves exactly that purpose and lets you rewrite the history of a branch partially or totally. In practice, this is as simple as issuing from the development branch:

git rebase master
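If you want to see what that rebase does before trying it on the real tree, the following sketch plays it out in a throwaway repository. It is a simplification under my own assumptions: the commit made directly on master stands in for revisions that would normally arrive via git svn rebase, and the file names are invented.

```shell
set -e
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/master
git config user.name "John Doe"
git config user.email "john.doe@foobar.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Base revision"

# Start a development branch and commit on it.
git checkout -q -b feature-xyz
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Added part foo"

# Meanwhile master moves forward (in real life: after git svn rebase).
git checkout -q master
echo "upstream" > upstream.txt
git add upstream.txt
git commit -q -m "Upstream change"

# Replay the development branch on top of the new master.
git checkout -q feature-xyz
git rebase -q master

# The branch history now starts from the tip of master.
git log --format=%s
```

After the rebase, the feature commit sits on top of the upstream change instead of next to it, which is what keeps the later squash merge trivial.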

As you can see, git rebase is a powerful tool, but it’s also a destructive one in the sense that it rewrites your history. Rewriting history is the sort of thing that makes git go crazy when you are trying to pull from a repository modified that way. This is why your git backup should be considered unstable and why you always have to use the mirror or force option with git push.

Merging work and sending it to Subversion

Let’s say you are happy with what you did and would now like to send all that stuff to Subversion. Normally, your branch should be filled with small commits with messages mainly consisting of « Ooops », « Added part foo » or « Debugging », which you definitely don’t want to see in the Subversion commit.

That’s why we are going to make a clean summary of the changes you made, update the ChangeLogs accordingly and then use a pretty, automagically generated log message for the Subversion commit.

First of all, switch back to the master branch. Then issue the following command:

git merge --squash feature-xyz

What this does is take the most recent version of the branch tree, generate a diff and then apply it to your master tree without committing. If you check with git status, you will see that these changes are listed as « Changes to be committed ».
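Here is a minimal sketch of the squash step on a scratch repository, so you can check for yourself that nothing gets committed. The file name and the two throwaway commit messages are made up for the example.

```shell
set -e
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/master
git config user.name "John Doe"
git config user.email "john.doe@foobar.com"

echo "base" > file.txt
git add file.txt
git commit -q -m "Base revision"

# A development branch full of small, messy commits.
git checkout -q -b feature-xyz
echo "one" >> file.txt
git commit -q -am "Ooops"
echo "two" >> file.txt
git commit -q -am "Debugging"

# Back on master, bring in the result as one staged change.
git checkout -q master
git merge -q --squash feature-xyz

# The change is staged but master has no new commit yet.
git status --short
git log --format=%s
```

The log still shows a single commit on master, while git status reports file.txt as staged: that is the window in which the ChangeLogs get edited.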

That’s when we use our first tool : clng.py.

I suggest making an alias to it or putting it in a PATH directory to be able to invoke it directly from the command line.

As you will notice, the script expects some environment variables to be set. EDITOR tells which editor you want to use to edit the ChangeLog (e.g. emacs), CHANGE_LOG_NAME contains the name that should be used in the ChangeLog (e.g. John Doe) and, finally, CHANGE_LOG_EMAIL_ADDRESS contains an email address that will be put in the ChangeLog next to your name (e.g. john.doe@foobar.com).

Once you have set up those environment variables properly, then from the root of the repository (i.e. where the .git directory is), just call the script with no parameters; it will prompt you to edit the ChangeLog corresponding to the files you have changed. This time, use a meaningful entry.

When all ChangeLogs have been edited, you still have the option to fine-tune them (if you made a typo for instance). When you are finished, call the following command to validate the ChangeLogs:
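For example, the three variables could be set like this in the shell session (or shell profile) from which you run the script. The values are simply the sample ones from the paragraph above, so swap in your own.

```shell
# Example values taken from the text above; replace them with your own.
export EDITOR=emacs
export CHANGE_LOG_NAME="John Doe"
export CHANGE_LOG_EMAIL_ADDRESS="john.doe@foobar.com"

# Quick check that the variables are exported to child processes such as the script.
env | grep -E '^(EDITOR|CHANGE_LOG_NAME|CHANGE_LOG_EMAIL_ADDRESS)='
```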

git add -u

Now comes the second script, clm.py, which will gather and pretty-print what you added to each ChangeLog. It uses the same environment variables as before.

Then, simply type git commit and copy and paste the commit message given by the script.

The last two steps are, first, to run git svn rebase again to make sure that no change got in while we were doing the merge and, finally, to launch the dcommit command to send your commit to Subversion:

git svn dcommit

Damn, I stumbled upon a bug

It’s quite usual that, during development, you will notice bugs in the code that is already in Subversion. Of course you would like to commit the fix right away because it’s an easy enough one. The only problem is that you are, most of the time, stuck in the middle of something else with several changes in your working tree that haven’t been committed yet!

Enter git stash. This command sets all your current changes aside in a temporary commit and then cleans your working tree. That way you can painlessly switch to your bug-fixing branch, make some commits to solve the problem, and then push the fix to trunk with the same procedure as described above. When you come back, git stash pop restores the changes you had set aside.

Now, it would be nice if what you were doing before took advantage of that fix (maybe it’s even a prerequisite). The good news is that you can use the same trick as in the sync section with git rebase to make your branch start from the commit you just created.
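The whole detour can be rehearsed in a scratch repository. This sketch takes a shortcut I should be upfront about: the fix is committed directly on master instead of going through a separate bug-fixing branch and git svn dcommit, and the file names are invented.

```shell
set -e
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/master
git config user.name "John Doe"
git config user.email "john.doe@foobar.com"

printf 'buggy\n' > lib.txt
printf 'draft\n' > work.txt
git add lib.txt work.txt
git commit -q -m "Base revision"

# In the middle of something: an uncommitted change on a feature branch.
git checkout -q -b feature-xyz
echo "half-finished idea" >> work.txt

# Set the unfinished work aside and fix the bug on master.
git stash -q
git checkout -q master
printf 'fixed\n' > lib.txt
git commit -q -am "Fix the bug"
# (this is where the fix would be sent to Subversion)

# Come back, start the branch from the fix, then restore the work.
git checkout -q feature-xyz
git rebase -q master
git stash pop -q

cat lib.txt
git status --short
```

At the end the feature branch contains the fix and the half-finished change is back in the working tree, still uncommitted.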

Working with a Subversion branch

When you commit a fix, you probably want it to also live in the current stable branch of your software (Mono in our case). That requires two things. First, you have to tell git-svn where the branch lives and, second, you have to backport the fix.

Normally, when you clone the repository following the guide above, you also get all the remote branches with it. If that’s the case, then all is good and you can work from that point. If you don’t have the remote branch you are interested in, you have two options. Either you use git fetch to retrieve all the missing symbols (which can take a good deal of time) or you directly put the path of the branch you are interested in into the config. Here is for instance the section to set up your Mono 2.6 branch:

[svn-remote "mono-2.6"]
  url = svn+ssh://foo@mono-cvs.ximian.com/source
  fetch = branches/mono-2-6/mcs:refs/remotes/git-svn/mono-2-6

[branch "mono-2.6"]
  remote = .
  merge = refs/remotes/git-svn/mono-2-6

The only remaining step is to duplicate the change you made to trunk onto this maintenance branch, which can easily be achieved with the git cherry-pick command. After the conflicts (if any) are resolved, just issue git svn dcommit to validate this change.
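As a rough illustration of the backport itself, here is cherry-pick at work in a scratch repository. The local branch named mono-2-6 is only a stand-in for the real maintenance branch, there is no Subversion involved, and the files are invented.

```shell
set -e
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/master
git config user.name "John Doe"
git config user.email "john.doe@foobar.com"

echo "v1" > lib.txt
git add lib.txt
git commit -q -m "Release 2.6"

# Stand-in for the maintenance branch, cut at the release.
git branch mono-2-6

# Trunk keeps moving: one new feature, then the bug fix.
echo "new feature" > feature.txt
git add feature.txt
git commit -q -m "Trunk-only feature"
echo "v1 fixed" > lib.txt
git commit -q -am "Fix the bug"
fix=$(git rev-parse HEAD)

# Backport only the fix to the maintenance branch.
git checkout -q mono-2-6
git cherry-pick "$fix" > /dev/null

git log --format=%s
```

Only the fix lands on the maintenance branch; the trunk-only feature stays where it belongs.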

Merging branch of branch

Sometimes you are developing two things at the same time and one of them builds on the other, which means that one of the branches depends on the other branch.

In that case, you are certainly going to end up committing the first branch first, polishing the second one a bit more and ultimately committing it too. The problem is that, once the first branch gets committed, the second one should in turn follow what happened on trunk/master.

Fortunately, git rebase comes to the rescue again with the --onto switch. Simply merge and commit the first branch as described in the section above and then, from the second development branch, issue (feature-first being a placeholder for the name of the first branch):

git rebase --onto master feature-first

It will move your second branch to depend on master which should happen flawlessly since the first changes are now mainline.
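Here is a sketch of that situation in a scratch repository. Note that the rebase is given the first branch as its upstream argument, so that only the commits specific to the second branch get replayed; the branch names feature-first and feature-second are placeholders of mine.

```shell
set -e
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/master
git config user.name "John Doe"
git config user.email "john.doe@foobar.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Base revision"

# First feature branch.
git checkout -q -b feature-first
echo "first" > first.txt
git add first.txt
git commit -q -m "First feature (WIP)"

# Second feature branch, built on top of the first one.
git checkout -q -b feature-second
echo "second" > second.txt
git add second.txt
git commit -q -m "Second feature (WIP)"

# The first branch goes mainline as a single squashed commit.
git checkout -q master
git merge -q --squash feature-first
git commit -q -m "First feature"

# Move the second branch so that it depends on master instead.
git checkout -q feature-second
git rebase -q --onto master feature-first

git log --format=%s
```

The second branch now sits directly on the squashed commit, and the work-in-progress commits of the first branch are no longer part of its history.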

Conclusion

This post is of course far from exhaustive, and if you have any more tips, share them in the comments.

A FOSDEM talk primer

Category : English, Google Summer of Code 2009, Mono

(Shamelessly inspired by Stéphane)

Since image processing is both trendy and a good candidate for parallel optimizations, I took the time to implement a little program that computes a part of the Mandelbrot set (a well-known fractal) in a fancy way:

Lolipop

Now for the facts & numbers :

Sequential generation : 26.5s
Parallel generation : 13.7s
Effective speedup : 2 times faster (dual core computer)
# changes between sequential and parallel : 3 lines

Oh, and this was done using the ParallelFx bundled with Mono 2.6 that you can already use today in your applications.

More information and tips on Sunday 7th @ FOSDEM in the Mono room. Don’t miss it!

PS: Also, don’t forget Mono Hackaday on Monday February 8th.

Mono happening @ FOSDEM

Category : English, Mono, Programming

Mono room

So if you weren’t aware of it yet, Mono is going to have its own dedicated room at FOSDEM. In order to spread Mono awesomeness, submit talks here before the 20th. You can even decide for yourself how much time you are going to use, so don’t hesitate to speak about a cool piece of software you are working on, a nice hack you have done or a how-to on a library, for instance.

Mono Hackaday

As a side event, there will also be a Mono Hackaday the day after FOSDEM, i.e. Monday 8th, at Hacker Space Brussels (HSB) with all the vital hacker facilities (location details here). Everyone is welcome to drop by from 10am to 7pm. There is no precise goal for the hackaday; it’s just about enjoying your normal and random hacking with other Mono fellows.

Conclusion

Anyway, in any case:

FOSDEM, the Free and Open Source Software Developers' European Meeting

See you there (and bring your Rupert too)!

Wicd support patch for Banshee

Category : C#, English, Mono, Programming

Shameless plug to say I’m alive :-) .

If you are using Wicd and usually stare at Banshee trying to download cover art or post to last.fm while you are disconnected, the following patch adds just the support needed to fix this.

Here is the associated bug report to get the patch integrated.

In other news, school restarted with its bunch of new responsibilities, limiting significantly free hacking time.

How to get the max out of your PLinq query

Category : C#, English, Google Summer of Code 2009, Mono, Programming

Here are some tips you should follow if you want to get the maximum performance out of a Linq query parallelized with PLinq (at least with the upcoming Mono version):

  • Use an indexed data structure as your source like an array, a list or anything which implements the generic IList<T> interface.

    You can also use ParallelEnumerable.Repeat and ParallelEnumerable.Range as input. Notice the ‘Parallel’ word in front as it’s not the same as Enumerable.(Range|Repeat)

  • Never use an ordering operator like OrderBy and never assume that the query should be ordered.

    The overloads of some operators that provide an integer index should also be avoided.

  • As a general rule, try to stick to the general scheme of Select-Where-Aggregate with any number of Select and Where (like the famous MapReduce).

    The syntactic sugar operators based on Aggregate like Min, Max, Average, etc… can also be used.

  • If you can manage it in your code, use the ForAll method with an action delegate instead of iterating over the query with foreach.

  • Of course, avoid any synchronization (locks, semaphores, …) inside the operator selectors/predicates. The purer your lambdas are, the better.

  • The functions used with operators should be predictable and stable, i.e. they should not depend on something uncertain time-wise like a network call and should yield approximately the same execution time for each input.

    Sure, this is not always possible but if it is, it does prevent the engine from having to repeatedly balance the query execution itself.

The PLinq MSDN page contains some additional tricks and explanations if you are interested.