Memory Modules: DIMM, RDIMM, and LRDIMM Performance


This post addresses memory architectures for Intel Xeon SP (Skylake, Cascade Lake) and earlier Intel Xeon (Broadwell, Haswell, Ivy Bridge, and Sandy Bridge) CPUs.

Overall server performance is affected by the memory subsystem, which, when properly configured, provides both high memory bandwidth and low memory access latency. Balancing memory across the memory controllers and memory channels produces configurations that can efficiently interleave memory references among the DIMMs, yielding the highest possible memory bandwidth. An unbalanced memory configuration can reduce total memory bandwidth to as little as 16% of that of a balanced configuration.
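
As a rough illustration of interleaving, here is a minimal Python sketch in which consecutive cache lines map round-robin to successive channels, so sequential accesses spread across all of them. The 64-byte line size and the simple modulo policy are illustrative assumptions, not the exact address hash any particular memory controller uses.

```python
CACHE_LINE = 64  # bytes; assumed line size for illustration

def channel_for(address: int, num_channels: int) -> int:
    """Map a physical address to a memory channel (round-robin by cache line)."""
    return (address // CACHE_LINE) % num_channels

# Sequential 64-byte reads across a hypothetical 4-channel system
# touch every channel in turn, which is what keeps bandwidth high:
for addr in range(0, 8 * CACHE_LINE, CACHE_LINE):
    print(f"address {addr:4d} -> channel {channel_for(addr, 4)}")
```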

Dual In-Line Memory Module (DIMM):

When Intel came out with the E5-2600 v2 CPUs, it introduced a new memory type called the Load-Reduced DIMM (LRDIMM). At the time, servers could accept three different types of memory: LRDIMMs, Registered DIMMs (RDIMMs), and Unbuffered DIMMs (UDIMMs). UDIMM memory later fell out of use in servers because of its lower bandwidth and capacity.

Which DIMM type is faster?

Registered DIMMs (RDIMMs) improve signal integrity by placing a register on the DIMM that buffers the address and command signals between the memory controller and the Dynamic Random-Access Memory (DRAM) chips on the DIMM. This permits each memory channel to drive up to three dual-rank DIMMs, greatly increasing the amount of memory the server can support. The partial buffering in RDIMMs slightly increases both power consumption and memory latency.

What is “Dual Rank”?

The rank of a DIMM is the number of independent 64-bit-wide sets of DRAM it contains. You can think of a single-rank DIMM as having DRAM on one side of the module, providing one 64-bit chunk of data. DIMMs with DRAM on both sides typically have a 64-bit chunk on each side and are therefore dual rank. There are even quad-rank DIMMs, where each side of the DIMM carries two 64-bit chunks of data.
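
Rank counts show up in DIMM organization labels such as 1Rx8 or 2Rx4 (ranks x DRAM device width), a notation that appears again below. Here is a small, hypothetical Python helper to decode them; the labels and the 64-data-bit rank width are standard, but the helper itself is just a sketch.

```python
import re

def parse_rank_label(label: str) -> tuple[int, int]:
    """Parse a DIMM organization label like '1Rx8' or '2Rx4'
    into (ranks, DRAM device width in bits)."""
    m = re.fullmatch(r"(\d)Rx(\d+)", label)
    if not m:
        raise ValueError(f"unrecognized organization label: {label}")
    return int(m.group(1)), int(m.group(2))

for label in ("1Rx8", "2Rx4", "4Rx4"):
    ranks, width = parse_rank_label(label)
    chips_per_rank = 64 // width  # 64 data bits per rank (ECC chips extra)
    print(f"{label}: {ranks} rank(s), x{width} devices, "
          f"{chips_per_rank} data chips per rank")
```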

Several factors influence memory latency in a system. Comparing the access latency of RDIMMs vs. LRDIMMs at 2400 MT/s on an Intel E5-2600 v4 server, the loaded latency of the single-rank (1Rx8) RDIMMs is higher than that of the higher-capacity RDIMMs and LRDIMMs. Single-rank DIMMs do not allow the processor to parallelize memory accesses the way DIMMs with two or more ranks can.

DIMM Speed

Faster DIMM speeds deliver lower latency, particularly loaded latency. Under loaded conditions, the largest contributor to access latency is the time memory requests spend queued, waiting to be executed. The faster the DIMM speed, the more quickly the memory controller can drain the queued accesses. For example, memory running at 2400 megatransfers per second (MT/s) has about 5% lower loaded latency than memory running at 2133 MT/s.
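
The arithmetic behind that comparison is straightforward: a 64-bit DDR channel moves 8 bytes per transfer, so peak bandwidth scales directly with the transfer rate. A quick sketch (idealized peak numbers, ignoring protocol overhead):

```python
def peak_channel_bandwidth_gbs(mts: int) -> float:
    """Idealized peak bandwidth of one 64-bit channel in GB/s."""
    return mts * 8 / 1000  # 8 bytes per transfer

for speed in (2133, 2400):
    print(f"{speed} MT/s -> {peak_channel_bandwidth_gbs(speed):.1f} GB/s per channel")

# A queued request drains roughly in proportion to the transfer rate, so the
# step from 2133 to 2400 MT/s shortens queue time by about 11%; the smaller
# ~5% figure above reflects the fixed latency components that do not scale.
print(f"transfer-time ratio: {2133 / 2400:.2%}")
```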

Ranks

While more ranks on a channel give the memory controller a greater ability to parallelize the processing of memory requests and shrink its request queues, they also require the controller to issue more refresh commands. Up to four ranks, the benefit of the added parallelism outweighs the penalty of the additional refresh cycles, so latency drops slightly as a channel goes from two to four ranks. With more than four ranks on a channel, latency increases slightly.
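
To make the sweet spot concrete, here is a small sketch that tallies ranks per channel for a few common populations; the two-to-four-rank rule of thumb is the one described above, and the configuration names are illustrative.

```python
def ranks_per_channel(dimms_per_channel: int, ranks_per_dimm: int) -> int:
    """Total ranks the controller must manage on one channel."""
    return dimms_per_channel * ranks_per_dimm

configs = [
    ("1 DPC single-rank", 1, 1),
    ("1 DPC dual-rank",   1, 2),
    ("2 DPC dual-rank",   2, 2),
    ("3 DPC dual-rank",   3, 2),
]
for name, dpc, rpd in configs:
    total = ranks_per_channel(dpc, rpd)
    note = ("sweet spot" if 2 <= total <= 4
            else "latency penalty" if total > 4
            else "limited parallelism")
    print(f"{name}: {total} ranks/channel ({note})")
```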

CAS Latency

CAS (Column Address Strobe) latency represents the basic DRAM response time. It is specified as the number of clock cycles (e.g., 13, 15, 17) that the controller must wait after issuing the column address before data is available on the bus. CAS latency contributes a fixed component to every latency measurement (lower values are better).
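
Because CAS latency is quoted in clock cycles, the same CL value means different wall-clock time at different speeds. Since the DDR memory clock runs at half the transfer rate, t_CAS = CL x 2000 / (MT/s) nanoseconds. A quick check in Python:

```python
def cas_latency_ns(cl_cycles: int, mts: int) -> float:
    """Convert CAS latency in clock cycles to nanoseconds.
    DDR clock period in ns = 2000 / transfer rate in MT/s."""
    return cl_cycles * 2000 / mts

# CL17 at 2400 MT/s works out to nearly the same absolute time
# as CL15 at 2133 MT/s (~14.2 ns vs ~14.1 ns):
for cl, speed in ((15, 2133), (17, 2400)):
    print(f"CL{cl} at {speed} MT/s = {cas_latency_ns(cl, speed):.2f} ns")
```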

Utilization

Increased memory bus utilization does not change the low-level read latency on the memory bus. Individual read and write commands are always completed in the same amount of time regardless of the amount of traffic on the bus. However, increased utilization causes increased memory system latency due to latencies accumulating in the queues within the memory controller.
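
An idealized queueing model illustrates the trend: the raw service time per request stays fixed, but average latency rises sharply as the bus approaches saturation. Real memory controllers are far more complex; the M/M/1-style formula and the 14 ns service time below are assumptions for illustration only.

```python
def loaded_latency(service_time_ns: float, utilization: float) -> float:
    """Average latency including queueing delay (M/M/1 approximation)."""
    assert 0 <= utilization < 1, "model only valid below saturation"
    return service_time_ns / (1 - utilization)

for u in (0.10, 0.50, 0.80, 0.95):
    print(f"utilization {u:.0%}: ~{loaded_latency(14.0, u):.0f} ns average latency")
```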

Actual memory throughput remains fairly consistent unless you run three DIMMs per channel (DPC) or move to 128 GB LRDIMMs.

LRDIMMs Provide Increased Capacity

LRDIMMs use memory buffers to consolidate the electrical loads of all the ranks on the module into a single electrical load, allowing up to eight ranks on a single DIMM. Using LRDIMMs, you can configure systems with the largest possible memory footprints. However, LRDIMMs also consume more power and have higher latencies than lower-capacity RDIMMs.

More or Less: Which is Faster?

Assuming your DIMM installation is balanced, always use the fewest DIMMs possible at the highest frequency. Remember, mixing DIMM frequencies and CAS latencies is allowed, but the memory controller(s) will clock down to the lowest speed supported by all installed DIMMs.
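
A hypothetical sketch of that clock-down rule: with mixed DIMMs, assume the controller runs every channel at the slowest speed present and the most relaxed (highest) CAS latency. The exact behavior depends on the platform and the timing profiles each DIMM advertises, so treat this as a simplification.

```python
def effective_settings(dimms: list[tuple[int, int]]) -> tuple[int, int]:
    """dimms: list of (speed in MT/s, CAS latency in cycles).
    Returns the (speed, CL) the channel is assumed to actually run at:
    slowest speed, most relaxed CL."""
    speed = min(s for s, _ in dimms)
    cl = max(c for _, c in dimms)
    return speed, cl

mixed = [(2400, 17), (2133, 15), (2400, 17)]
print("effective:", effective_settings(mixed))  # -> (2133, 17)
```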

Why are more DIMMs slower?

The short answer: bank interleaving (also known as rank interleaving) enhances server performance, but as more DIMMs are added, the electrical load on the memory controller(s) increases. In addition, the memory frequency and/or CAS timings may be “relaxed” so the controller can address all of the DIMMs.

For the details, I recommend two sources:

Installing Memory DIMMs for the best performance

Configuring a server with balanced memory is important for maximizing its memory bandwidth and overall performance. It is important to understand what is considered a balanced configuration and what is not.

Intel Xeon SP (Scalable Processors – Bronze, Silver, Gold & Platinum) CPUs (Skylake, Cascade Lake) use 6 memory channels, each supporting up to 2 DIMMs per channel (DPC), allowing a maximum of 12 DIMMs per CPU.

Intel Xeon CPUs have 4 or 8 memory channels per processor and allow up to three DIMMs per channel.
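
As a sketch of the balance rule (a simplification of the vendor guidance), a population can be treated as balanced when every channel carries the same total capacity, ideally from identical DIMMs. The six-channel example matches the Xeon SP layout above.

```python
def is_balanced(channels: list[list[int]]) -> bool:
    """channels: per-channel lists of DIMM sizes in GB.
    Balanced here means every channel holds the same total capacity."""
    totals = [sum(c) for c in channels]
    return len(set(totals)) == 1

# Xeon SP: 6 channels, 1 DPC of 16 GB everywhere -> balanced.
print(is_balanced([[16]] * 6))               # True
# Same population plus one extra DIMM on channel 0 -> unbalanced.
print(is_balanced([[16, 16]] + [[16]] * 5))  # False
```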

In the following document published by Lenovo Press, balanced and unbalanced memory configurations are presented for servers with Intel Xeon CPUs (not Intel Xeon SP; that will be a future update), along with their relative measured memory bandwidths, to show the effect of unbalanced memory. Suggestions are also provided on how to produce balanced memory configurations. Please note that the Intel memory controller does not allow mixing LRDIMMs with RDIMMs.
