> - How costly would it be, in terms of performance penalty, to access
> another CPU's cache from a CPU, either within a single system or within
> two TruCluster members?
When you talk about databases (OPS on TruCluster in your example), it
is important to define what you mean by cache. To a database, cache is
the buffer pool in memory- as opposed to the rest of data, out on disk
somewhere. We're NOT talking about accessing CPU onboard caches.
An Oracle instance on a node in a TruCluster has its database cache in
shared memory- so all CPUs in that instance/on that node can access it
equally. No performance penalty.
An Oracle instance on another node in the TruCluster, if it needs
access to the data in the first instance's cache, must access it via
the TruCluster-provided communications mechanisms- over Memory Channel,
ultimately, using the Distributed Lock Manager and some aspect of BSS/BSC
(the Block Shipment Server and Client pieces of DRD). (I'm not intimately
familiar with this part of the mechanism.) This will be slower than
accessing the shared memory on the same node would be. How much
slower depends upon a large number of factors- mainly, how much MC
bandwidth is consumed by other DLM and BSS traffic, which is in
turn dependent upon data partitioning: the goal is to keep as much data
as possible localized to one particular instance, so these internode
transfers don't have to happen most of the time.
> - How does such a performance hit compare with our competitors'
> hybrid SMP/MPP servers?
That's an evolving science. In the general case, we refer to
scalability- measure performance (X) on one node, compare it to
performance (Y) on N nodes; if NX = Y, you've achieved 100%
scalability, and you ask the other guys what they can do. Our
scalability achievements in specific benchmarks so far range
widely; one year ago, at OPS announcement, TPC-C numbers on a 4-node
cluster were roughly 30K, compared to 11K for a single node; 100%
scalability would have been 44K for 4 nodes, so we achieved 30K/44K,
or a little better than 68% (I repeat: a year ago, on this one
application). Each application must be carefully tuned. This area is
getting a lot of attention right now, in OPS' case, and I'm not one of
the people doing the work, so I'm not in a position to comment further.
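The scalability arithmetic above can be written down as a small helper. The 11K single-node and 30K 4-node TPC-C figures come straight from the example in the text; the function itself is just an illustration of the NX-versus-Y comparison:

```python
def scalability(single_node_perf: float, n_nodes: int, cluster_perf: float) -> float:
    """Fraction of ideal (linear) scaling achieved: Y / (N * X)."""
    return cluster_perf / (n_nodes * single_node_perf)

# The year-ago OPS TPC-C example: 11K on one node, roughly 30K on a
# 4-node cluster; perfect linear scaling would have been 44K.
eff = scalability(11_000, 4, 30_000)
print(f"{eff:.1%}")   # a little better than 68% of linear
```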
DougO