[G4] OS X SMP
Jonathan Armstrong
karlarmstrong at mac.com
Sun Feb 8 14:53:48 PST 2004
On Feb 8, 2004, at 10:55 AM, sr ferenczy wrote:
> true, inter-machine communications is a result of infiniband and some
> custom software, but intra-machine communication is still by the OS.
> if OS X couldnt get above 60% efficiency with SMP, then the whole
> cluster couldnt. They probably have kernel level controls with their
> software, but i would think intra-machine, all code would have to go
> through the X kernel... maybe not though...?
>
> sandor
Actually, it doesn't really mean anything to say the OS gets X% scaling
efficiency. If the threads are highly decoupled, you can get near 100%
scaling benefit with any fabric and OS. If the threads are highly
coupled, it can quickly become an intractable software problem for any
cluster. The factors that determine scalability are how much
synchronization is required by the application, and the synchronization
performance across nodes. The kernel overhead will generally be a minor
factor, sometimes orders of magnitude less.
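To put rough numbers on that, here is a minimal sketch using Amdahl's law as a stand-in model. It assumes synchronization can be treated as a share of single-node run time that does not spread across nodes. The function name and the sync shares are hypothetical, chosen only for illustration, and are not measurements from any real cluster.

```python
def scaling_efficiency(n_nodes, sync_fraction):
    """Amdahl-style efficiency: share of the ideal n-fold speedup achieved.

    sync_fraction is the (assumed) share of single-node run time spent in
    synchronization that cannot be spread across nodes.
    """
    speedup = 1.0 / (sync_fraction + (1.0 - sync_fraction) / n_nodes)
    return speedup / n_nodes


if __name__ == "__main__":
    # Hypothetical sync shares, from fully decoupled to tightly coupled.
    for sync in (0.0, 0.01, 0.10, 0.50):
        row = ", ".join(
            f"{n} nodes: {scaling_efficiency(n, sync):.0%}" for n in (2, 8, 64)
        )
        print(f"sync={sync:.0%} -> {row}")
```

In this model the OS never appears as a term: efficiency is set entirely by how much the application has to synchronize, which is the point being made above.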
Think about a system like SETI@home. Each node in this system works
independently on a block of data, and the time spent retrieving and
sending data is a relatively small part of the overall effort. This
system achieves nearly 100% scaling efficiency even with a slow
dial-up fabric; 10 equivalent machines will do 10 times the work of one
machine, and it really doesn't matter what kind of machines they are, or
what OS they are running.
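A small sketch of that arithmetic, assuming each node repeatedly fetches a block, crunches it, and sends the result back with no coordination between nodes. The 10 hours of compute and 0.1 hours of transfer per block are made-up figures, not real project numbers, and the function names are hypothetical.

```python
def blocks_per_hour(n_nodes, compute_hours, transfer_hours):
    """Throughput of n independent nodes, each handling one block at a time."""
    return n_nodes / (compute_hours + transfer_hours)


def node_efficiency(compute_hours, transfer_hours):
    """Share of each node's time spent on useful computation."""
    return compute_hours / (compute_hours + transfer_hours)


if __name__ == "__main__":
    compute, transfer = 10.0, 0.1  # hypothetical hours per block
    one = blocks_per_hour(1, compute, transfer)
    ten = blocks_per_hour(10, compute, transfer)
    print(f"10 nodes do {ten / one:.1f}x the work of 1 node")
    print(f"each node is busy computing {node_efficiency(compute, transfer):.1%} of the time")
```

Because the nodes never wait on each other, the slow link only shaves a little off each node's own efficiency; it does not change how throughput grows with the node count.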
On the other hand, systems that depend entirely on symmetric
multiprocessing (SMP) usually break down with more than a handful of
processors. That's why SMP machines are usually limited to 2-4
processors. Unlike a cluster, an SMP machine must share resources
almost all the time. In particular, the memory bandwidth can easily
become choked if more than about 4 processors are sharing it. That's
why big iron SGIs and Suns, and even the bigger Xeon machines, use
asymmetric multiprocessing to squeeze as many as 256 processors into
one system.
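The memory bandwidth ceiling can be sketched with a very simple model: every processor wants a fixed amount of memory traffic, and a single shared bus caps the total. The figures below (1 GB/s of demand per CPU, a 4 GB/s bus) are hypothetical, picked so the curve flattens at 4 processors to mirror the rule of thumb above; real hardware, with caches and contention, is messier.

```python
def smp_speedup(n_cpus, demand_per_cpu, bus_bandwidth):
    """Speedup over one CPU when all CPUs share a single memory bus.

    demand_per_cpu: memory traffic (GB/s) one CPU needs to run at full speed.
    bus_bandwidth:  total traffic (GB/s) the shared bus can deliver.
    """
    delivered = min(n_cpus * demand_per_cpu, bus_bandwidth)
    single = min(demand_per_cpu, bus_bandwidth)
    return delivered / single


if __name__ == "__main__":
    for n in (1, 2, 4, 8, 16):
        # Hypothetical figures: 1 GB/s per CPU, 4 GB/s shared bus.
        print(f"{n:2d} CPUs -> {smp_speedup(n, 1.0, 4.0):.1f}x")
```

Past the saturation point, adding processors to the shared bus buys nothing in this model, which is why larger machines have to split memory up rather than share one path to it.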
The bottom line is that for ideal applications, clusters will always
scale perfectly. Of course, there are no ideal applications that do much
of anything useful. The challenge is to architect your problems in a way
that makes the least use of synchronization, and at the same time make
your system synchronization as efficient as possible. The range of
applications that can be tackled on a massive cluster grows with both
efforts.