UPDATE: 12 April 2011 – added /nodeReuse info.
A few months ago I wrote a blog post on Running Targets in Parallel in MSBuild. This presented a simple sample which illustrated how to run a target in parallel over some files. I recently looked at whether an MSBuild automated development build process could be made faster by executing parts of it in parallel.
The build process consisted of around 85 targets in total, with various combinations which could be run together (via MSBuild Explorer) to accomplish different goals. Although a full build was rarely required, it would take around 30 minutes. The goal was to reduce this time and, in doing so, the time of the partial builds.
The following process was implemented:
- Define buckets of targets which could be executed in parallel and buckets which could not
- Create temporary sets of projects to execute
- Execute the projects in the applicable order, choosing whether to execute in parallel or not
- Repeat for additional top level targets.
In code, this would look something like this:
<!-- Do your real work on each target here. -->
<!-- Define sets of parallel and non-parallel targets -->
<!-- Create an item group of temporary projects using the current MSBuild file and passing in the files -->
<ParallelProjectSet01 Include="$(MSBuildProjectFile)" >
<ParallelProjectSet10 Include="$(MSBuildProjectFile)" >
<ProjectSet01 Include="$(MSBuildProjectFile)" >
<ProjectSet10 Include="$(MSBuildProjectFile)" >
<!-- Execute the temporary projects in the applicable order -->
<MSBuild Projects="@(ParallelProjectSet01)" BuildInParallel="true" Targets="TheWork" />
<MSBuild Projects="@(ProjectSet01)" BuildInParallel="false" Targets="TheWork" />
<MSBuild Projects="@(ProjectSet10)" BuildInParallel="false" Targets="TheWork" />
<MSBuild Projects="@(ParallelProjectSet10)" BuildInParallel="true" Targets="TheWork" />
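A fuller, self-contained version of this pattern might look like the following sketch. It is an illustration under assumptions, not the original build: the `BucketTarget` property, the `Properties` metadata used to pass it, and the worker targets (`CompileWeb`, `CompileServices`, `DeployDatabase`) are hypothetical names chosen for the example.

```xml
<Project ToolsVersion="4.0" DefaultTargets="Build" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">

  <!-- Entry point for each temporary project instance.
       BucketTarget is a hypothetical property supplied through item metadata below. -->
  <Target Name="TheWork">
    <Message Text="Running $(BucketTarget)" Importance="high" />
    <CallTarget Targets="$(BucketTarget)" />
  </Target>

  <!-- Hypothetical stand-ins for the real build targets. -->
  <Target Name="CompileWeb">
    <Message Text="CompileWeb" />
  </Target>
  <Target Name="CompileServices">
    <Message Text="CompileServices" />
  </Target>
  <Target Name="DeployDatabase">
    <Message Text="DeployDatabase" />
  </Target>

  <Target Name="Build">
    <!-- Define sets of parallel and non-parallel targets. -->
    <ItemGroup>
      <ParallelTargets01 Include="CompileWeb;CompileServices" />
      <SequentialTargets01 Include="DeployDatabase" />
    </ItemGroup>

    <!-- One temporary project item per target name: same project file,
         different Properties metadata (created by batching over the target lists). -->
    <ItemGroup>
      <ParallelProjectSet01 Include="$(MSBuildProjectFile)">
        <Properties>BucketTarget=%(ParallelTargets01.Identity)</Properties>
      </ParallelProjectSet01>
      <ProjectSet01 Include="$(MSBuildProjectFile)">
        <Properties>BucketTarget=%(SequentialTargets01.Identity)</Properties>
      </ProjectSet01>
    </ItemGroup>

    <!-- Execute the buckets in order; only the first call may fan out across nodes. -->
    <MSBuild Projects="@(ParallelProjectSet01)" BuildInParallel="true" Targets="TheWork" />
    <MSBuild Projects="@(ProjectSet01)" BuildInParallel="false" Targets="TheWork" />
  </Target>
</Project>
```

The design relies on MSBuild treating the same project file with different global properties as distinct project instances, which is what allows the items in a parallel set to be scheduled on separate nodes.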
Remember, you must use the /m switch when calling MSBuild.exe for parallelism to be enabled.
So that sounds easy enough. The build process has now been optimised to run in parallel and all is good.
If you have a fairly basic build, with few dependencies and large bottlenecks which could run in parallel, then you may see a big improvement in your build times. If you have a more complex scenario with multiple buckets, you may not see a considerable improvement. Remember that the buckets themselves still execute synchronously, one after another. You may want to take this a level further and identify buckets which could execute in parallel with each other; however, I would urge caution, as even this first level of parallelism may yield unexpected results.
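As a rough sketch of what that next level could look like, two parallel buckets might be merged into a single item list and handed to one MSBuild call. This is only valid if the two buckets are truly independent of each other and of any sequential sets that would otherwise run between them; the `CombinedParallelSet` item name and the `BuildCombined` target are hypothetical.

```xml
<Target Name="BuildCombined">
  <ItemGroup>
    <!-- Hypothetical merged bucket. Items copied from other item lists keep their
         metadata, so each project retains its own Properties value. -->
    <CombinedParallelSet Include="@(ParallelProjectSet01);@(ParallelProjectSet10)" />
  </ItemGroup>

  <!-- A single parallel call replaces two synchronous bucket boundaries. -->
  <MSBuild Projects="@(CombinedParallelSet)" BuildInParallel="true" Targets="TheWork" />
</Target>
```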
Here are some potential downsides to using BuildInParallel in this way:
- Your computer may not cope – depending on what products you are using, you may hit race conditions or simply max out your system. A typical error would be deadlocked transactions in SQL Server.
- Delayed failure – if one target fails, all those in the bucket will continue to run and MSBuild will only exit once it has completed the directing target. If you are watching the console you could always terminate it manually, but be aware.
- Locked files – by design, multiple MSBuild processes hang about for re-use. I found that these locked various files during builds which caused the builds to fail. If you experience this, try setting /nodereuse:false as covered in Node Reuse in MultiProc MSBuild.
- Varying workstation specification – if you don’t max out your workstation, you may max out a colleague’s. It’s typical to have varying machine specifications. If you are going to do this, try it on the slowest spec you need to support.
- Complex logging – if you are used to reading a nice sequential log file, then the format of the parallel logging may surprise you. It’s not a big issue, but be aware that the log is not as easy to follow as a sequential log.
- Maintainability, Supportability, Complexity – it may take someone a few minutes to understand what is actually going on in the scripts. If you add new targets to your solution, it may also not be immediately clear where they should go. Be sure to document your process.
- Random failures – this is the worst downside. Unexplainable random failures. It works, then it doesn’t, then it does. There may be an explanation, but do you have the time and resource to investigate and resolve it?
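One way to soften several of these downsides is to make the parallelism switchable, so that a slower workstation or a troubleshooting session can fall back to the sequential behaviour without editing the script. The following is a minimal sketch; the `RunInParallel` property and the `BuildBuckets` target are hypothetical names, not part of the original build.

```xml
<PropertyGroup>
  <!-- Hypothetical switch: defaults to true, override with /p:RunInParallel=false -->
  <RunInParallel Condition="'$(RunInParallel)' == ''">true</RunInParallel>
</PropertyGroup>

<Target Name="BuildBuckets">
  <!-- Buckets that tolerate parallel execution follow the switch. -->
  <MSBuild Projects="@(ParallelProjectSet01)" BuildInParallel="$(RunInParallel)" Targets="TheWork" />

  <!-- Buckets that must stay sequential ignore it. -->
  <MSBuild Projects="@(ProjectSet01)" BuildInParallel="false" Targets="TheWork" />
</Target>
```

With the switch set to false, the same script produces a sequential log again, which may also help when chasing an intermittent failure.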
Here are some potential upsides to using BuildInParallel in this way:
- Speed – pure and simple speed. You may be able to save a lot of time on MSBuild automated systems by using this process. It’s definitely worth investigating.
In the end this approach was not adopted and the synchronous process was maintained. The main driver was:
- Reliability – over 50 developers and testers working together need a stable and reliable build process. The build engineering team will typically be a small outfit with limited resource. Because parts of the build could be executed quickly (via MSBuild Explorer), the full build time is not such a major issue.
I’d be interested in hearing about your experiences. Do you have a different pattern which is working well for you?