[Review]: HPE Advanced Memory Protection Technologies
Advanced Memory Protection
HPE offers Advanced Memory Protection (AMP) technologies for HPE ProLiant servers to increase the availability of critical services. System administrators can configure these technologies in addition to ECC (Error-Correction Code) and Advanced ECC.
Three types of AMP are available on HPE ProLiant servers:
- Online Spare Memory
- Mirrored Memory
  - Non-hot plug Mirrored Memory mode
  - Hot Plug Mirrored Memory mode
- Hot Plug RAID Memory
Online Spare Memory
Online Spare memory mode is a higher level of memory protection that complements Standard Memory mode with Advanced ECC. With Online Spare mode, a DIMM with a rank of memory at least as large as the other ranks in the system, or memory board, is designated as the Online Spare rank. If one of the other DIMMs exceeds a threshold rate of correctable memory errors, the affected rank of memory within that DIMM is taken offline and the data is copied to the Online Spare rank. This capability maintains server availability and memory reliability without service intervention or server interruption.
The DIMM that exceeded the error threshold can be replaced at the customer’s convenience during a scheduled shutdown. Online Spare reduces the chance of an uncorrectable error bringing down the system; however, it does not fully protect the system against uncorrectable memory errors.
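The failover behavior described above can be sketched in a few lines of Python. This is only an illustrative model, not HPE firmware logic: the `Rank` and `OnlineSpareController` names are hypothetical, and the "threshold rate" is simplified to a plain count of correctable errors.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Rank:
    """Hypothetical model of one rank of memory."""
    name: str
    data: List[int] = field(default_factory=list)
    correctable_errors: int = 0
    online: bool = True


class OnlineSpareController:
    """Illustrative only: copies a degrading rank to the spare rank."""

    def __init__(self, ranks: List[Rank], spare: Rank, threshold: int):
        self.ranks = {r.name: r for r in ranks}
        self.spare = spare
        self.threshold = threshold          # assumption: a simple error count
        self.replaced: Optional[str] = None  # rank the spare now stands in for

    def report_correctable_error(self, name: str) -> bool:
        """Record one correctable error; return True if failover happened."""
        rank = self.ranks[name]
        rank.correctable_errors += 1
        if (rank.online and self.replaced is None
                and rank.correctable_errors > self.threshold):
            self.spare.data = list(rank.data)  # copy data to the spare rank
            rank.online = False                # take the affected rank offline
            self.replaced = name
            return True
        return False

    def read(self, name: str, offset: int) -> int:
        """Reads for a replaced rank are served from the spare."""
        rank = self.spare if name == self.replaced else self.ranks[name]
        return rank.data[offset]
```

Note that the sketch has a single spare: once it is in use, further ranks that cross the threshold are not protected, which mirrors the point that Online Spare reduces but does not eliminate the risk of an outage.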
For more information, administrators should consult the QuickSpecs document for their server.
Mirrored Memory
Mirrored Memory mode is a fault-tolerant memory option that provides a higher level of availability than Online Spare mode. While Online Spare mode protects against single-bit errors and the failure of an entire DRAM device, Mirrored Memory mode provides full protection against single-bit and multi-bit errors.
There are two types of Mirrored Memory mode: non-hot plug and hot plug. Non-hot plug Mirrored Memory mode is beneficial to sites that cannot afford unscheduled downtime. Hot Plug Mirrored Memory mode is beneficial to sites that cannot risk waiting until scheduled downtime to replace degraded memory modules. Mirrored Memory requires no operating system support; all required software and drivers are in the system BIOS.
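As a rough illustration of why mirroring survives multi-bit errors, the sketch below keeps two copies of every write and falls back to the second copy when a read from the first one hits an uncorrectable error. The class names (`MemoryBank`, `MirroredMemory`) and the `bad` set used to simulate failed locations are hypothetical; the real behavior is implemented in the chipset and system BIOS, not in software like this.

```python
class UncorrectableError(Exception):
    """Raised when a simulated read hits a multi-bit (uncorrectable) error."""


class MemoryBank:
    """Hypothetical bank of memory with simulated failed addresses."""

    def __init__(self, size: int):
        self.cells = [0] * size
        self.bad = set()  # addresses that return uncorrectable errors

    def write(self, addr: int, value: int) -> None:
        self.cells[addr] = value

    def read(self, addr: int) -> int:
        if addr in self.bad:
            raise UncorrectableError(addr)
        return self.cells[addr]


class MirroredMemory:
    """Illustrative only: every write goes to both banks."""

    def __init__(self, size: int):
        self.primary = MemoryBank(size)
        self.mirror = MemoryBank(size)

    def write(self, addr: int, value: int) -> None:
        self.primary.write(addr, value)
        self.mirror.write(addr, value)

    def read(self, addr: int) -> int:
        try:
            return self.primary.read(addr)
        except UncorrectableError:
            # The mirror still holds a good copy of the data.
            return self.mirror.read(addr)
```

The cost of this design is visible in the constructor: two banks of the same size are needed to present one bank's worth of usable memory.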
Hot Plug RAID Memory
Hot Plug RAID Memory protects the server against uncorrectable memory errors that would otherwise result in a server failure. Hot Plug RAID Memory allows the memory subsystem to operate continuously, even in the event of a complete memory device failure.
The Hot Plug RAID memory implementation in ProLiant servers requires four memory boards. The implementation is conceptually similar to Redundant Array of Independent Disks (RAID) Level 5 in that the north bridge uses an exclusive-OR engine to generate a parity check line for every three cache lines. The north bridge interleaves the cache lines and the parity check line across all four memory boards.
Because the cache lines are striped across the memory boards, all four boards must have the same total amount of memory. If an uncorrectable memory error is encountered, the server can re-create the proper data using the parity information and the information from the other memory boards that contain no failed DIMMs.
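The parity scheme can be illustrated with a short Python sketch. It XORs three cache lines into a parity line and then rebuilds any one missing line from the other three. The function names are hypothetical, the cache lines are shortened to a few bytes, and the parity line is always placed on the fourth board for simplicity, whereas the text describes the north bridge interleaving lines across all four boards.

```python
from functools import reduce
from typing import List, Sequence


def xor_lines(lines: Sequence[bytes]) -> bytes:
    """Byte-wise XOR of equal-length cache lines."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*lines))


def write_stripe(cache_lines: Sequence[bytes]) -> List[bytes]:
    """Return four lines (one per memory board): three data lines plus parity."""
    if len(cache_lines) != 3:
        raise ValueError("expected exactly three cache lines per parity line")
    return list(cache_lines) + [xor_lines(cache_lines)]


def rebuild(stripe: Sequence[bytes], failed_board: int) -> bytes:
    """Re-create the line held by the failed board from the three survivors."""
    survivors = [line for i, line in enumerate(stripe) if i != failed_board]
    return xor_lines(survivors)


stripe = write_stripe([b"\x01\x02", b"\x10\x20", b"\xff\x00"])
# Pretend board 1 failed: its line is recovered from boards 0, 2, and 3.
recovered = rebuild(stripe, failed_board=1)
```

Because XOR is its own inverse, the same `rebuild` call works whether the lost line held data or parity. It also shows why all four boards must hold the same amount of memory: every stripe needs one line on each board.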
Advanced Memory Protection Comparison
| | Advanced ECC Technology | Online Spare Memory | Non-Hot Plug Mirrored Memory | Hot Plug Mirrored Memory | Hot Plug RAID Memory |
|---|---|---|---|---|---|
| Device Failure Protection | Yes | Yes | Yes | Yes | Yes |
| Failed DIMM Replacement | Offline | Offline | Offline | Online | Online |
| Additional Memory Expense | 0% | 10%-50% | 100% | 100% | 25% |
These Advanced Memory Protection technologies help administrators increase service availability, but at additional cost. Of course, unplanned downtime usually costs more than buying the additional hardware.