For example, a server with error correction technology can continue to function after a soft error, while a PC would need to be rebooted. A server would also correct a hard error each time a processor attempted to read the affected bit. The DRAM in a PC has no error correction, so a hard error would crash the system or the application using that memory, and the module would need to be replaced, Handy said.
"The study shows hard errors are more common than soft. That means modules are running and running and running in servers and every time a hard error bit is encountered, it's corrected so the memory module never gets replaced," Handy said. "If that happened to a PC user, the machine would stop working."
If an error is uncorrectable, as when more bits fail than the ECC can correct, a server will shut down.
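The difference between a correctable and an uncorrectable error can be illustrated with a toy example. The Python sketch below is not taken from the Google study and does not represent any particular server's ECC. It assumes a small single-error-correcting, double-error-detecting (SECDED) Hamming code over 4 data bits, whereas ECC memory modules typically protect much wider words. The function names `encode` and `decode` are hypothetical.

```python
def encode(data):
    """Encode 4 data bits into an 8-bit SECDED codeword (toy example)."""
    d1, d2, d3, d4 = data
    word = [0] * 8
    # Data bits sit at positions 3, 5, 6, 7; parity bits at 1, 2, 4.
    word[3], word[5], word[6], word[7] = d1, d2, d3, d4
    word[1] = d1 ^ d2 ^ d4
    word[2] = d1 ^ d3 ^ d4
    word[4] = d2 ^ d3 ^ d4
    # Position 0 holds an overall parity bit over the other seven bits.
    word[0] = sum(word[1:]) % 2
    return word


def decode(word):
    """Return (status, data); data is None if the error is uncorrectable."""
    word = list(word)
    # The syndrome is the XOR of the positions of all set bits (1..7).
    syndrome = 0
    for pos in range(1, 8):
        if word[pos]:
            syndrome ^= pos
    overall = sum(word) % 2
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        # Odd number of flips: assume one, located at index `syndrome`
        # (a syndrome of 0 means the overall parity bit itself flipped).
        word[syndrome] ^= 1
        status = "corrected"
    else:
        # Nonzero syndrome with even parity: two bits flipped.
        return "uncorrectable", None
    return status, [word[3], word[5], word[6], word[7]]


if __name__ == "__main__":
    stored = encode([1, 0, 1, 1])

    # A "hard" error modeled as one bit stuck at 0: every read is corrected.
    for _ in range(3):
        read = list(stored)
        read[6] = 0
        print(decode(read))

    # Two flipped bits exceed what this code can correct.
    read = list(stored)
    read[3] ^= 1
    read[6] ^= 1
    print(decode(read))
```

In this sketch the stuck bit is repaired on every read, which mirrors the behavior Handy describes for servers, while the two-bit case is only detected, which is the situation in which a server would shut down.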
"In many production environments, including ours, a single uncorrectable error is considered serious enough to replace the dual in-line memory module that caused it," the Google report read.
Handy said such problems often result in system downtime and service outages.
The study states that memory errors are expensive in terms of the system failures they cause and the repair costs associated with them. They can also open the door to security problems.
"In production sites running large-scale systems, memory component replacements rank near the top of component replacements and memory errors are one of the most common hardware problems to lead to machine crashes," the report stated. "Moreover, recent work shows that memory errors can cause security vulnerabilities."