From: www.itworld.com

Windows Tip: Tools for testing for bad memory

by Mitch Tulloch

April 23, 2008

 

I recently upgraded memory in half a dozen of our office systems, and of course I had problems: two of the machines wouldn't boot afterward. I ran some memory tests on them, but they turned up nothing. So I ended up following the Monte Carlo approach, which in my case involved randomly swapping memory between the two machines to see what happened. Eventually I got both of them working, but it got me thinking a bit.

Surely there must be better tools available for testing memory integrity. Vista has one built into it: just press F8 when your computer is booting to bring up the Windows Boot Manager menu. Then hit the ESC key, press TAB, hit ENTER, and the test commences. By default the "standard" memory test is performed; you can change this to an "advanced" test by pressing F1, selecting ADVANCED, and pressing F10. Let the test run to completion and don't interrupt it. Unfortunately, Vista's test, while good, doesn't always help in identifying problem memory.

So I asked my colleagues (including people working at Microsoft) whether they knew of anything better. One tool that came up several times was memtest86, which I've used before and which is pretty good. Development of this tool seems to have stalled, however, and some colleagues indicated they prefer Prime95 (Note: This is an .exe file), which tests both your CPU and memory by stressing them in various ways. The cool thing about Prime95 is that it also works with multicore CPUs, and you can use it to perform pure research concerning a special type of prime number called a Mersenne prime. In brief, a Mersenne prime is a prime number equal to 2^n - 1 for some positive integer n. There are 44 such primes currently known, with the largest being 2^32,582,657 - 1 and the smallest 2^2 - 1.
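To make the definition concrete, here is a minimal Python sketch that finds the small exponents n for which one less than the n-th power of two is prime. It uses the Lucas-Lehmer test, the classic primality check for numbers of this form. The function names are my own, chosen for illustration, and this is not Prime95's actual code, which is far more heavily optimized; treat it as a toy that only shows the idea.

```python
def is_prime(n):
    """Trial division; adequate for the small exponents used here."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def lucas_lehmer(p):
    """Return True if 2**p - 1 is prime. Assumes p itself is prime."""
    if p == 2:
        return True  # 2**2 - 1 = 3, the smallest Mersenne prime
    m = (1 << p) - 1  # the candidate Mersenne number
    s = 4
    # Square, subtract 2, and reduce modulo m, repeated p - 2 times.
    for _ in range(p - 2):
        s = (s * s - 2) % m
    return s == 0


def mersenne_exponents(limit):
    """List the exponents p <= limit for which 2**p - 1 is prime."""
    return [p for p in range(2, limit + 1) if is_prime(p) and lucas_lehmer(p)]


print(mersenne_exponents(130))
# prints [2, 3, 5, 7, 13, 17, 19, 31, 61, 89, 107, 127]
```

Only prime exponents are tried, because a composite exponent can never yield a Mersenne prime. Even so, most prime exponents fail: 2^11 - 1 = 2047 = 23 x 89, for instance. Repeated squaring of ever-larger numbers is the kind of arithmetic that keeps a CPU and its memory busy, which fits the tool's second life as a stress test.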

Now maybe number theory doesn't get you all that excited. Be that as it may, Prime95 is a cool example of a computing tool that was intended for pure research but that also has a spin-off use, namely stressing the CPU and memory to test for problems. A couple of colleagues even tell me that when you run Prime95 you need to make sure your system's cooling is working well, as the stress can heat the CPU to the point of permanent failure. Boom! Well, sizzle maybe. Here's a useful thread to read before you try using this tool.

Hardware troubleshooting can be fun -- if living for you means facing constant challenges and difficult puzzles. Another colleague recently told me about a co-worker whose computer was having random, intermittent RAS issues, with VPN connections being dropped without warning. After mucking around for a while with the software, he finally tried something simple: removing and reinserting the LAN cable in the back of the system. That solved it. It turned out that the cleaning person had tried to move the computer to clean behind it, and this had stressed the cable, causing intermittently poor contact with the Ethernet interface.

Go figure.

Got hardware troubleshooting tips or stories of your own you'd like to share? Email me and I'll include them in a future column.