Trends in Multi-core Computing
Two posts have sparked a healthy discussion on the topic:
- "Mr Rossum, tear down that GIL" - a request by Juergen Brendel
- "It isn't Easy to Remove the GIL" by Guido van Rossum
I was pointed to a good write-up giving a hardware/software overview of trends in multi-core computing. It sheds some light on the progress made in the areas Guido touched on, across different languages, systems and platforms.
If we’re going to live in a world where CPUs are cheap and parallelism is the norm, then we need to think in those terms. If we need Perl/Python/Ruby/etc. programs to interact with parallelized libraries written in C/C++/Fortran, where’s the harm in spawning another process? Let the two halves of the program communicate over some IPC mechanism (sockets, or perhaps HTTP + REST). That model is well-known, well-tested, well-understood, widely deployed, and has been shipping for decades. Plus, it is at least as language-agnostic as Parrot hopes to become. (+2 points if the solution uses JSON instead of XML.)
Fourth, there’s Patrick Logan, who rightly points out that the issue isn’t simply about a multi-core future, but about a multi-node future as well. Some applications will run in parallel on a single machine, others will run across multiple nodes on a network, and still others will be a hybrid of both approaches. Running programs across a network of nodes is done today, with tools like MapReduce, Hadoop and their kin.
If you have a grid of dual-core machines today and need to plan how best to use the network of 64-core machines you will have a decade from now, here’s a very simple migration plan for you: run 32x as many processes on each node!
With that said, here is my recipe for taming the multi-core dust bunny:
- Determine what kind of parallelism makes sense for you: none, flyweight, fine-grained, or coarse-grained.
- Avoid troublesome low-level concurrency primitives wherever possible.
- Use tools like GHC’s Nested Data Parallelism for flyweight concurrency (one program, lots of data, spread out over multiple CPUs on a single machine).
- Use tools like GHC’s Software Transactional Memory for lightweight concurrency (many interoperating processes managing shared data on a single machine).
- Use tools like MapReduce and friends for heavyweight concurrency (work spread out across multiple cooperating processes, running on one or many machines).