NumaConnect Adapter Card N323

 

→Download the NumaConnect N323 Flyer

 

 

 


→Download the N323 Technical Data Sheet

NumaConnect Adapter Card N323
Cable Connect

Scalable Cache Coherent Shared Memory on your Cluster Budget

Numascale's SMP adapter is designed for commodity servers with AMD processors and connects to HyperTransport via a cable.

 

 

N323

The N323 is connected to the server motherboard via a cable and pick-up module.

 

An adapter set is available for the IBM x3755 server.

 

N323 in an IBM x3755

 

NumaConnect Pick-Up Cable

 

 

NumaConnect Pick-Up Card

 


NumaConnect Essentials

ccNUMA and NUMA low-latency shared memory interconnect

Virtualizes Everything, Including Memory and IO

>10x price/performance benefit over proprietary solutions

Seamless Scaling of Application Size and Performance - NO Porting Efforts

Scalable, Cache Coherent, Shared Memory System Interconnect
AMD processor nodes with Coherent HyperTransport
Based on field proven design
Enables commodity cost level for high-end servers

 

NumaConnect Features

Converts between snoop-based (broadcast) and directory based coherency protocols

Write-back to Remote Cache

Non-coherent transactions (for optimized MPI)

Pipelined memory access (16 outstanding transactions + 16 non-coherent)

Remote Cache size up to 4GBytes (remote data)
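
The difference between the two coherency styles named in the first feature can be illustrated with a toy model. This is only a sketch of the general idea, not the NumaChip protocol: the `Directory` class, `snoop_targets` function, and node numbering are hypothetical names introduced here. A snoop-based scheme probes every other node on a write, while a directory tracks which nodes hold a copy and contacts only those.

```python
class Directory:
    """Toy directory: remembers which nodes hold a copy of each cache line."""

    def __init__(self):
        self.sharers = {}  # cache line address -> set of node ids with a copy

    def read(self, node, line):
        # A read adds the requesting node to the sharer set for that line.
        self.sharers.setdefault(line, set()).add(node)

    def write(self, node, line):
        # Only current sharers (other than the writer) need an invalidation.
        targets = self.sharers.get(line, set()) - {node}
        self.sharers[line] = {node}
        return targets


def snoop_targets(node, all_nodes):
    # A broadcast (snoop) scheme probes every other node on each write.
    return set(all_nodes) - {node}


nodes = range(64)
directory = Directory()
directory.read(3, 0x1000)
directory.read(7, 0x1000)

print(len(snoop_targets(5, nodes)))         # probes sent when broadcasting
print(sorted(directory.write(5, 0x1000)))   # nodes the directory must contact
```

In this model the broadcast cost grows with the number of nodes, while the directory cost grows only with the number of sharers, which is the usual argument for directory-based coherence across many nodes.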

 

NumaConnect RAS Features

ECC for single bit correction and double bit detection

Automatic scrubbing after single bit error detection

Automatic background scrubbing to minimize probability of soft error accumulation

Flexible micro-coded coherence processing engine

Watch-bus for internal activity observation in real-time

Built-in Performance Counters
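
To illustrate what "single bit correction and double bit detection" and scrubbing mean in practice, here is a minimal sketch using an extended Hamming(8,4) code. The code actually used in NumaConnect is not stated on this page, so the choice of code, the word size, and the function names `encode`, `decode`, and `scrub` are all assumptions made for illustration.

```python
def encode(nibble):
    """Encode 4 data bits as an 8-bit extended Hamming codeword (list of bits)."""
    code = [0] * 8  # index 0: overall parity, indices 1..7: Hamming(7,4)
    code[3], code[5], code[6], code[7] = [(nibble >> i) & 1 for i in range(4)]
    for p in (1, 2, 4):  # parity bit p covers every position with bit p set
        code[p] = sum(code[i] for i in range(1, 8) if i & p) % 2
    code[0] = sum(code[1:]) % 2
    return code


def decode(code):
    """Return (nibble, status); nibble is None on an uncorrectable error."""
    code = list(code)
    syndrome = 0
    for i in range(1, 8):
        if code[i]:
            syndrome ^= i  # XOR of set positions is 0 for a valid codeword
    overall = sum(code) % 2
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        # Odd number of flips: treat as a single-bit error at `syndrome`
        # (syndrome 0 means the overall parity bit itself flipped).
        code[syndrome] ^= 1
        status = "corrected"
    else:
        return None, "double-bit error detected"
    nibble = code[3] | code[5] << 1 | code[6] << 2 | code[7] << 3
    return nibble, status


def scrub(memory):
    """One scrubbing pass: rewrite every word that needed a single-bit fix."""
    corrected = 0
    for addr, word in enumerate(memory):
        nibble, status = decode(word)
        if status == "corrected":
            memory[addr] = encode(nibble)
            corrected += 1
    return corrected


memory = [encode(n) for n in (0x3, 0xA, 0xF)]
memory[1][6] ^= 1       # inject a soft error
print(scrub(memory))    # number of words repaired in this pass
print(decode(memory[1]))
```

Scrubbing in the background matters because a second flip in an already damaged word turns a correctable error into one that can only be detected.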

 

NumaConnect Specifications

Bandwidth to the node-local CPUs
- 1 cHT link (16+16) @800MHz DDR = 6.4GB/s over HT Proprietary Cable
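
The 6.4GB/s figure can be reproduced with a short calculation. The sketch below assumes that "(16+16)" means 16 bits in each direction and that 6.4GB/s is the sum of both directions; the function name and parameters are illustrative only.

```python
def ht_link_bandwidth_gbytes(width_bits=16, clock_mhz=800, ddr=True, directions=2):
    """Bandwidth of one HyperTransport link in GBytes/s (decimal units)."""
    # DDR signalling moves data on both clock edges: 800MHz -> 1600 MT/s.
    transfers_per_sec = clock_mhz * 1e6 * (2 if ddr else 1)
    bytes_per_transfer = width_bits / 8
    return transfers_per_sec * bytes_per_transfer * directions / 1e9


per_direction = ht_link_bandwidth_gbytes(directions=1)
aggregate = ht_link_bandwidth_gbytes()
print(f"{per_direction:.1f} GB/s per direction, {aggregate:.1f} GB/s aggregate")
```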

 

Latency for remote accesses
- Short time in node by-pass FIFOs
- Few "hops" for average access patterns
- At worst, only one or two dimension-switch delays for 2-D or 3-D Torus topologies
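
The hop and dimension-switch counts can be sketched for a small torus. The example assumes dimension-order routing (each dimension is traversed completely before moving to the next), which the page does not state explicitly, and uses a hypothetical 4-node ring per dimension. The helper names are illustrative.

```python
from itertools import product


def ring_distance(a, b, size):
    """Shortest distance between two positions on a bidirectional ring."""
    d = abs(a - b)
    return min(d, size - d)


def route_cost(src, dst, dims):
    """Return (hops, dimension switches) under dimension-order routing."""
    per_dim = [ring_distance(s, d, n) for s, d, n in zip(src, dst, dims)]
    hops = sum(per_dim)
    dims_used = sum(1 for d in per_dim if d)
    # Travelling in k dimensions requires k - 1 changes of dimension.
    return hops, max(dims_used - 1, 0)


def worst_case(dims):
    """Worst-case hops and dimension switches from one node to any other."""
    nodes = list(product(*(range(n) for n in dims)))
    origin = nodes[0]  # a torus is symmetric, so one origin is enough
    costs = [route_cost(origin, dst, dims) for dst in nodes]
    return max(h for h, _ in costs), max(s for _, s in costs)


print(worst_case((4, 4)))     # 2-D torus -> (4, 1)
print(worst_case((4, 4, 4)))  # 3-D torus -> (6, 2)
```

Under this assumption the worst case is one dimension switch in 2-D and two in 3-D, matching the figures quoted above.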


Link Speed and capacity
- 4-lane SerDes at 4Gb/s per lane on each link, 6 links => 96Gb/s = 9.6GBytes/s, x2 => 19.2GBytes/s
- Net average throughput on a ring is ≈1.4 times the unidirectional link speed with random access patterns, less link and fabric overhead; total for 6 links => 26.9GBytes/s (multiple senders can be active simultaneously)
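
The link figures follow from a short chain of multiplications, sketched below. Two points are assumptions inferred from the numbers rather than stated on the page: the conversion 96Gb/s = 9.6GBytes/s implies 10 bits on the wire per byte, and the "x2" is read as counting both directions. The 26.9GBytes/s total is reproduced when the 1.4 ring factor is applied to the 19.2GBytes/s figure.

```python
LANES_PER_LINK = 4
GBITS_PER_LANE = 4
LINKS = 6
WIRE_BITS_PER_BYTE = 10   # assumption: implied by 96Gb/s = 9.6GBytes/s
RING_FACTOR = 1.4         # net throughput factor quoted for random access

raw_gbits = LANES_PER_LINK * GBITS_PER_LANE * LINKS   # all links, one direction
one_way_gbytes = raw_gbits / WIRE_BITS_PER_BYTE
both_ways_gbytes = one_way_gbytes * 2                 # assumption: x2 = both directions
net_gbytes = both_ways_gbytes * RING_FACTOR

print(f"{raw_gbits} Gb/s raw")
print(f"{one_way_gbytes:.1f} GBytes/s one way")
print(f"{both_ways_gbytes:.1f} GBytes/s both ways")
print(f"{net_gbytes:.1f} GBytes/s net on rings")
```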

 

Remote Cache (RMC)
- 2 or 4 GBytes per node, configurable
- System performance is expected to depend more on a large cache size than on faster access time, hence the use of DRAM
- RMC access time will be close to the access time of a neighboring CPU's node-local memory


Address Range
- 12-bit Node ID = 4k nodes max. (multiple sockets per node possible)
- 48-bit address (256 Terabytes)

- Local Node address range:

• N323-22 56 GigaBytes

• N323-44 112 GigaBytes

• N323-48 240 GigaBytes
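
The address-range limits above can be checked with a few lines of arithmetic. The first two results follow directly from the bit widths. The `nodes_that_fit` helper is a hypothetical illustration that goes beyond the page: it assumes binary units (1 GigaByte = 2^30 bytes) and that each node's local range occupies its own contiguous slice of the 48-bit space, so its output is only a rough upper bound under those assumptions.

```python
NODE_ID_BITS = 12
ADDRESS_BITS = 48
GIB = 2 ** 30
TIB = 2 ** 40

max_nodes = 2 ** NODE_ID_BITS                    # 4096 ("4k nodes")
address_space_tib = 2 ** ADDRESS_BITS // TIB     # 256 Terabytes


def nodes_that_fit(local_range_gib):
    """Upper bound on node count if every node maps its full local range."""
    by_address = (2 ** ADDRESS_BITS) // (local_range_gib * GIB)
    return min(by_address, max_nodes)


print(max_nodes, address_space_tib)
for model, local_gib in (("N323-22", 56), ("N323-44", 112), ("N323-48", 240)):
    print(model, nodes_that_fit(local_gib))
```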

 

NumaConnect OS Support

Linux, Windows Server, Solaris, Unix

 

 


The Change is on

 

From Cluster


To Scalable ccNUMA SMP

 

 

Highlights

 

  • Product No. N323-xx
  • Scalable, Directory Based Cache Coherence Protocol
  • Write-back cache for Remote Data: 2-4 GigaBytes options, standard SDIMMs
  • ECC protected with background scrubbing of soft errors
  • 32 outstanding memory transactions
  • Mix of coherent and non-coherent memory transactions
  • Support for single-image or multi-image OS partitions
  • 7-way on-chip distributed switching for 1D, 2D or 3D Torus topologies
  • 19.2 GigaBytes/s link capacity per card
  • HTX connected - 6.4GigaBytes/s
  • <20W power dissipation