Sunday, April 3, 2011

My understanding of parallel and functional programming

I just want to write down what is on my mind today. Maybe in the future I will look back and think how stupid this was.

A few days ago, a team member complained and asked me to reduce my use of LINQ because it is hard to debug. First of all, that is not true. Debugging a function-intensive program is as easy as debugging a traditional one. When you write for (i = 0; i < 10; i++), you think about how to make something happen ten times. When you write Enumerable.Range(0, 10).Select(...), you think about an input sequence being pushed into the system. The system accepts the input sequence, and the sequence is transformed by several functions, just like what happens in a chip factory. You pump in sand and get a CPU after several processing steps. Nothing is left on the machines except the garbage that nobody wants or cares about. I do not really know how people get from a for loop to parallel code, but the input-sequence way of thinking gives me some advantages.
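To make the contrast concrete, here is a minimal sketch of the same job written both ways. The class and method names (PipelineSketch, SquaresWithLoop, SquaresWithLinq) and the squaring step are made up for illustration; only for, Enumerable.Range and Select come from the text above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical example: names and the squaring step are illustrative only.
public static class PipelineSketch
{
    // Loop style: we spell out how to repeat the work ten times.
    public static List<int> SquaresWithLoop()
    {
        var result = new List<int>();
        for (int i = 0; i < 10; i++)
        {
            result.Add(i * i);
        }
        return result;
    }

    // Sequence style: the numbers 0..9 flow through one transformation stage.
    public static List<int> SquaresWithLinq()
    {
        return Enumerable.Range(0, 10)
                         .Select(i => i * i)
                         .ToList();
    }

    public static void Main()
    {
        // Both lines print the same ten squares.
        Console.WriteLine(string.Join(", ", SquaresWithLoop()));
        Console.WriteLine(string.Join(", ", SquaresWithLinq()));
    }
}
```

When debugging the second version, a breakpoint inside the lambda shows each element as it passes through that stage, which is roughly what a breakpoint in the loop body would show.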

If you want to produce more chips, what can you do? Build new factories. Partition the data sequence into small chunks, pump each chunk into the set of functions, and you get parallelism. Actually, you do not have to chunk the data yourself: applying a function through LINQ in its parallel form (PLINQ, via AsParallel) gives you the same thing, because the partitioning is done for you. Because we only use functions that keep no memory of state, it is pretty safe to make the whole system parallel. That is why I sometimes hate using classes and complex design patterns to solve a problem: they were all designed when parallelism was not the main concern of system design.
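As a rough sketch of the "build new factories" idea, the same stateless function can be run sequentially or through PLINQ. AsParallel and AsOrdered are real PLINQ methods in System.Linq; the ParallelSketch class and the Process function are hypothetical stand-ins for whatever processing step you actually have.

```csharp
using System;
using System.Linq;

// Hypothetical example: Process stands in for any stateless processing step.
public static class ParallelSketch
{
    // A pure function: the output depends only on the input.
    public static int Process(int x)
    {
        return x * x + 1;
    }

    // One factory: elements are processed one after another.
    public static int[] RunSequential(int count)
    {
        return Enumerable.Range(0, count)
                         .Select(x => Process(x))
                         .ToArray();
    }

    // Several factories: PLINQ partitions the input and may use several threads.
    // AsOrdered asks for the results in the original input order.
    public static int[] RunParallel(int count)
    {
        return Enumerable.Range(0, count)
                         .AsParallel()
                         .AsOrdered()
                         .Select(x => Process(x))
                         .ToArray();
    }

    public static void Main()
    {
        int[] a = RunSequential(1000);
        int[] b = RunParallel(1000);
        Console.WriteLine(a.SequenceEqual(b));
    }
}
```

Because Process touches no shared state, the two versions should produce the same array. Whether the parallel one is actually faster depends on how expensive each step is, so treat this as an illustration rather than a benchmark.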

When you use a class, you will most likely use fields to keep some state information. Sooner or later, you will find that this state is the enemy of parallelism unless you keep all of it non-public and non-static. I am not disciplined enough to follow that rule, so.. :-)
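A small sketch of the difference, with made-up names (StatefulTotal, StateSketch, SumOfSquares). The stateful class is fine when used from one thread, but its field would be updated without synchronization if Add were called from several threads at once. The stateless version has nothing to share, so it can go through AsParallel directly.

```csharp
using System;
using System.Linq;

// Hypothetical example: a class that keeps a running total in a field.
public class StatefulTotal
{
    // Mutable state: concurrent calls to Add would race on this field.
    private int total;

    public void Add(int x)
    {
        total += x;
    }

    public int Total
    {
        get { return total; }
    }
}

public static class StateSketch
{
    // Stateless style: each element is squared independently,
    // and Sum combines the partial results.
    public static int SumOfSquares(int count)
    {
        return Enumerable.Range(0, count)
                         .AsParallel()
                         .Select(x => x * x)
                         .Sum();
    }

    public static void Main()
    {
        // Single-threaded use of the stateful class is safe.
        var t = new StatefulTotal();
        foreach (int x in Enumerable.Range(0, 100))
        {
            t.Add(x * x);
        }
        Console.WriteLine(t.Total);

        // The stateless version gives the same total without a shared field.
        Console.WriteLine(SumOfSquares(100));
    }
}
```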

Let me come back to the topic. This morning when I woke up, I realized that "a program = data structure + algorithm", or "data structure + function". If I can partition the data, I can also partition the functions. For example, if I have 10 elements in an array, then instead of thinking of 10 data elements, I can make up a new function list (length = 10) and apply this function list to the data set. People might say, yes, it is the same thing. I would not agree. Once you manipulate the functions, you are in functional programming mode, which is where I really want to explore more.
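One possible reading of the function-list idea, sketched with Enumerable.Zip: build a list of 10 functions and pair each function with one data element. The names (FunctionListSketch, BuildAdders, ApplyEach) and the choice that function i adds i to its input are assumptions made for this example only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical example: a list of functions applied element by element.
public static class FunctionListSketch
{
    // Builds n functions; function i adds i to its input (an arbitrary choice).
    public static List<Func<int, int>> BuildAdders(int n)
    {
        return Enumerable.Range(0, n)
                         .Select(i => (Func<int, int>)(x => x + i))
                         .ToList();
    }

    // Pairs function k with data element k and applies it.
    public static int[] ApplyEach(IEnumerable<Func<int, int>> functions,
                                  IEnumerable<int> data)
    {
        return functions.Zip(data, (f, x) => f(x)).ToArray();
    }

    public static void Main()
    {
        int[] data = Enumerable.Range(0, 10).ToArray();
        List<Func<int, int>> functions = BuildAdders(10);

        // Element k becomes k + k, because function k adds k.
        int[] result = ApplyEach(functions, data);
        Console.WriteLine(string.Join(", ", result));
    }
}
```

The point of the sketch is that the function list is now ordinary data: it can be filtered, reordered, or partitioned with the same LINQ operators as the array it is applied to.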
