All Your Base Are Belong To Us

Mostly .NET internals and other kinds of gory details

SDP 2009: Parallel Programming with .NET 4.0 and Visual Studio 2010
A few days ago, at the Sela Developer Practice, Eran and I delivered a session titled “Parallel Programming with .NET 4.0 and Visual Studio 2010”. In this session we wanted to highlight the new features for parallel programming in .NET 4.0 – the Task Parallel Library and PLINQ – as well as the new Visual Studio 2010 features in the debugging and profiling areas. We started with what we call explicit parallelism – manually creating tasks and specifying what to do when they execute and when they complete...
Fairness is Highly Overrated
Fairness with respect to synchronization mechanisms is a highly overrated property. When I talk about concurrency, parallelism, Windows synchronization and similar subjects, I’m often asked whether the specific algorithm, mechanism or feature is fair in some respect. First, let’s define fairness. I’ll use a simplistic yet rigorous definition of a fair lock. (Other synchronization mechanisms may have fairness defined in a similar fashion.) To begin with, a lock is exactly what you think it...
PDC 2009 Day 3: DirectCompute: Capturing the Teraflop
Chas Boyd’s session on DirectX11 DirectCompute is going to focus on bringing the power of the GPU to general-purpose computing (and not necessarily graphics applications). A modern CPU would have 4 cores, run at 3GHz, 4 float-wide SIMDs, peak theoretical performance of 48-96GFlops, 2x hyperthreaded capability, 64KB L1 cache, a memory interface of about 20GB/s, and take about 200W out of the wall at a cost of about $200. A GPU is usually constructed from 32 cores, each 32-float wide, at 1GHz, giving...
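As a rough check on the CPU figures quoted above, peak single-precision throughput can be estimated as cores × clock rate × SIMD width × operations per lane per cycle. The last factor is not stated in the excerpt; this sketch assumes 1 for the lower bound and 2 (for example, a multiply and an add issued together) for the upper bound.

\[
4 \times 3\,\text{GHz} \times 4 \times 1 = 48\ \text{GFlops},
\qquad
4 \times 3\,\text{GHz} \times 4 \times 2 = 96\ \text{GFlops}
\]

Under the same assumption of one operation per lane per cycle, the GPU figures quoted above would multiply out to 32 × 1 GHz × 32 = 1024 GFlops, or roughly 1 TFlop. This is only the arithmetic on the numbers in the excerpt, not a figure taken from the session.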
PDC 2009 Day 3: Developing Applications for Scale-Up Servers Running Windows Server 2008 R2
Pedro Teixeira is going to talk about processes and threads in systems with more than 64 logical processors as well as user-mode scheduling. Surprisingly for some people, NUMA is not an esoteric hardware architecture. Even high-end gaming rigs today are NUMA; Pedro is going to use a machine on loan from HP that has 256 processors with 1TB of physical memory. Processor Groups: Adding support for more than 64 logical processors required a breaking app compat change, because CPU masks were represented in...
PDC 2009 Day 3: Lighting Up Windows Server 2008 R2 Using the ConcRT on UMS
Dana Groff, Senior Program Manager on the ConcRT team, is going to talk about the new Concurrency Runtime – an abstraction on top of the underlying operating system, supported from Windows XP through Windows Server 2008 R2. The ConcRT Resource Manager is an abstraction over the hardware that allows vendors like Microsoft and Intel (OpenMP, TBB) to program at a higher layer and compose these platforms, as well as coming up with one set of concepts for providing parallel code such as tasks, task groups...
PDC 2009 Day 2: The State of Parallel Programming
Burton Smith’s session on the state of parallel programming was standing-room only – I’m sitting on the floor with some chairs blocking my view of the presentation :-) Generally, Burton Smith lays out a theory of parallel programming that I tried to cover in the notes below. Imperative languages prefer putting values in parameters, and they are prone to data races which are rather hard to detect considering the number of possible paths. Pure functional languages avoid variables – they compute new...
PDC 2009 Day 1: Data-Intensive Computing on Windows HPC Server with the DryadLINQ Framework
Dryad is a distributed execution environment and framework (by Microsoft Research) that strives to optimize data flow across nodes in a computation network and to optimize the computation itself across nodes. A Dryad job is a directed acyclic graph (DAG) of inputs, processing vertices and outputs. This is very similar to Unix pipes, only in 2D (scale-out). It’s very natural to attempt to run LINQ queries and operations on a distributed cluster of machines. PLINQ is the first, intra-machine step for...
PDC 2009 Day 1: Future Directions for C# and Visual Basic
I just learned that the C# compiler is being ported to C# and the VB.NET compiler is being ported to VB.NET. (Right now they are both written in C++, of course :-)) The beginning of the presentation focused on the three trends – dynamic, declarative, concurrent – that are prevalent in the latest releases of the .NET languages. Luca Bolognese showed an example of converting an imperative for-loop to a LINQ query, which is very declarative. Next, he used the AsParallel() extension...
Parallelism in Visual Studio 2010: Demos Updated to Beta 2
As you probably know, Visual Studio 2010 has reached the Beta 2 milestone, with a go-live license available to start coding your production applications using this release. There have been some minor changes around the System.Threading.Tasks namespace and classes, with the most significant change being around the cancellation model for tasks. Instead of explicitly reaching out for a Task object and then calling its Cancel method (which has been removed, along with the static Task.Current property...
Parallel Programming in Visual Studio 2010: MSDN Event Deck and Demos
A couple of days ago I delivered a session on Parallel Programming in Visual Studio 2010 at the Microsoft offices in Raanana. The deck and code will appear on the official website (of Israel MSDN Events) shortly, but for now you can download them from my SkyDrive: Deck, Demo code. This session complements my earlier session on Concurrent Programming fundamentals, in which I provided a theoretical overview of some of the architectural and implementation issues in modern concurrent applications. In...
Concurrent Programming MSDN Event
Last Monday (March 30) I had the pleasure of presenting an MSDN event at Microsoft Raanana on the subject of Concurrent Programming. The idea was to show the design patterns, methodology and fundamentals of concurrency and parallelism in applications. An opening line (which I also used for the summary) which I really liked was along the lines of “we resisted object-oriented programming 20 years ago, so it’s only natural that we resist concurrent programming now”. I really think...
Why Concurrency Is Hard (Or: TimedLock Can Get You in Trouble)
I’ve just noticed a post by Guy Kolbis discussing a possible solution for deadlocks – ensuring that all locks are taken with a timeout. To do so, Guy cites the TimedLock struct, originally introduced by Ian Griffiths. The general idea is that instead of using a standard lock{…} block, you wrap your critical section in the following statement: using (TimedLock.Lock(...)) { } Ian even introduces a refinement of this idea by including a sentinel reference type in the struct, in Debug builds...
PDC Pre-Conference: Concurrent and Parallel Programming
Today I finally started the business-related (but still entertaining!) part of my trip - the PDC pre-conference day. I attended the track on concurrent and parallel programming, which was led by Stephen Toub, David Callahan and Joe Duffy. All in all, there was nothing astonishingly new in the talk - they covered parallelism and concurrency in general (and established a common vocabulary), threads in Windows, in .NET in particular, discussed the Asynchronous Programming Model (APM - a.k.a. Delegate...
Practical Concurrency Patterns: Immutability (Freezables)
Another very simple pattern builds on the foundation of the Safe-Unsafe Cache pattern. What is the easiest way to protect data from multi-threaded access and to incur the minimal performance cost while doing so? Making it read-only! Read-only (immutable) objects are extremely convenient to use because there’s no need to enforce thread-safety – all accesses to read-only objects are thread-safe. Additionally, immutable objects decrease complexity – it’s impossible for one part of...
Practical Concurrency Patterns: Spinlock
Previously in the series we have examined the performance differences between concurrency patterns based on kernel synchronization (critical sections, events, mutexes etc.) and concurrency patterns based on wait-free synchronization (such as the interlocked family of operations). Kernel synchronization tends to be expensive if locks are frequently acquired and released because of the associated context switches, introducing anti-patterns such as lock convoys. Wait-free synchronization tends to be...