On Feb 8, 2004, at 10:55 AM, sr ferenczy wrote:

> true, inter-machine communications is a result of infiniband and some
> custom software, but intra-machine communication is still by the OS.
> if OS X couldnt get above 60% efficiency with SMP, then the whole
> cluster couldnt. They probably have kernel level controls with their
> software, but i would think intra-machine, all code would have to go
> through the X kernel... maybe not though...?
>
> sandor

Actually, it doesn't really mean anything to say the OS gets X% scaling efficiency. If the threads are highly decoupled, you can get near 100% scaling benefit with any fabric and OS. If the threads are highly coupled, it can quickly become an intractable software problem for any cluster.

The factors that determine scalability are how much synchronization the application requires, and how fast that synchronization is across nodes. The kernel overhead is generally a minor factor, sometimes orders of magnitude smaller.

Think about a system like SETI@home. Each node in that system works independently on a block of data, and the time spent retrieving and sending data is a relatively small part of the overall effort. The system achieves nearly 100% scaling efficiency even over a slow dial-up fabric; 10 equivalent machines will do 10 times the work of one machine, and it really doesn't matter what kind of machines they are or what OS they are running.

On the other hand, systems that depend entirely on symmetric multiprocessing (SMP) usually break down with more than a handful of processors. That's why SMP machines are usually limited to 2-4 processors. Unlike a cluster, an SMP machine must share resources almost all the time. In particular, memory bandwidth can easily become choked when more than about 4 processors share it. That's why big-iron SGIs and Suns, and even the bigger Xeon machines, use asymmetric multiprocessing to squeeze as many as 256 processors into one system.
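The point about synchronization dominating scalability is usually formalized as Amdahl's law: speedup is capped by the fraction of work that must run serially (synchronization, shared resources), no matter how fast the fabric or kernel is. A minimal sketch, with illustrative serial fractions of my own choosing:

```python
def amdahl_speedup(serial_fraction: float, n_procs: int) -> float:
    """Ideal speedup on n_procs machines when serial_fraction of the
    work cannot be parallelized (no fabric or kernel overhead modeled)."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_procs)

# A SETI@home-style workload (almost no coupling) scales nearly linearly:
print(round(amdahl_speedup(0.001, 10), 2))   # ~9.91x on 10 machines

# A tightly coupled workload with 10% synchronization stalls early:
print(round(amdahl_speedup(0.10, 10), 2))    # ~5.26x on 10 machines
print(round(amdahl_speedup(0.10, 256), 2))   # still under 10x on 256
```

This is why "X% scaling efficiency" is a property of the workload's coupling, not of the OS: with serial_fraction near zero, any OS and fabric look nearly perfect.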
Bottom line: for ideal applications, clusters will always scale perfectly. Of course, there are no ideal applications that do much of anything useful. The challenge is to architect your problems so they make the least possible use of synchronization, and at the same time to make your system's synchronization as efficient as possible. The range of applications that can be tackled on a massive cluster grows with both efforts.
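To connect this back to the "60% efficiency" figure from the quoted message: under the same simple serial-fraction model, cutting the synchronization share directly raises the number of nodes you can use before efficiency drops below a given floor. A rough sketch (the 60% threshold and the serial fractions are just illustrative assumptions):

```python
def max_nodes_at_efficiency(serial_fraction: float, target: float = 0.60) -> int:
    """Largest cluster size whose per-node efficiency (speedup / n)
    stays at or above `target`, under a simple serial-fraction model."""
    n = 1
    while True:
        speedup = 1.0 / (serial_fraction + (1.0 - serial_fraction) / (n + 1))
        if speedup / (n + 1) < target:
            return n
        n += 1

print(max_nodes_at_efficiency(0.10))    # 7 nodes  (10% synchronization)
print(max_nodes_at_efficiency(0.01))    # 67 nodes (1% synchronization)
print(max_nodes_at_efficiency(0.001))   # 667 nodes (0.1% synchronization)
```

Each 10x reduction in synchronization buys roughly a 10x larger usable cluster, which is the "both efforts" point above in numbers.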