mapped out

For the last couple of weeks I slaved over refactoring part of the code base of one of our customers. The task at hand was to take elaborate spaghetti code, built with way (waaaay) too many multiprocessing queues, and turn it into a functioning, debuggable, decently performing piece of software.

As it turns out, most of the work is embarrassingly parallel [1], but with all the different processes handling all the different tasks, the code was unmaintainable. So off we went, trying to find more subtle routes to multiprocess our way out of the mess.

It was clear: we needed a map. Not of the paper kind, not even of the Google kind [2], but of the parallel kind.

that kind of map
from: Distributed and Cloud Computing: From Parallel Processing to the Internet of Things by Kai Hwang, Jack Dongarra and Geoffrey C. Fox

Very well, a map. The code is in Python, and Python has one. Should be pretty simple, shouldn’t it?

No.

The following gist is the skeleton of the code we used to move the existing code to its (hopefully) final, map-using form. It is, as you can see, less than trivial, which leads me to think the whole concept of “mappable” code is not as solid as you might expect in a language as mature as Python.

link to code
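For a rough idea of what “mappable” means here, this is a minimal sketch using the standard library's multiprocessing.Pool. It is not the gist linked above; process_item and run are hypothetical names made up for illustration. The assumption is that the work can be expressed as one top-level function applied to each item independently, with no shared queues.

```python
from multiprocessing import Pool


def process_item(item):
    """Hypothetical per-item work (an assumption, not the real task).

    It lives at module level so it can be pickled and sent to workers.
    """
    return item * item


def run(items, workers=1):
    """Apply process_item to every item, serially or across processes."""
    if workers <= 1:
        # Serial fallback: same results, but easy to step through in a debugger.
        return list(map(process_item, items))
    with Pool(processes=workers) as pool:
        # Pool.map keeps the input order and blocks until all results are in.
        return pool.map(process_item, items)


if __name__ == "__main__":
    # The guard matters on platforms that spawn workers instead of forking.
    print(run(range(10), workers=4))
```

The serial branch is a design choice rather than a requirement: running the same function through the built-in map with workers=1 keeps the code debuggable, and the parallel path is only a change of argument.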

Feel free to use, abuse, share or disregard this. I wish I had found something like this before we started on that task; now it’s here, but the damage is done. The major lesson is that refactoring existing code into parallel code is far more complex than starting from a clean slate and reusing only what you must. Next time I’ll make sure to try that.

 


1. If you don’t know what this term means, read this

2. Depends on how you define “Google map”

Adam Lev-Libfeld

A long distance runner, a software architect, an HPC nerd (order may change).
