The day a software bug almost killed the Spirit rover

The Spirit rover’s Mars mission almost ended before it really got going because of a DOS-related software bug that went uncaught during a rushed development schedule.

Image credit: NASA/JPL
The first picture from Spirit after its memory malfunction

This Friday, January 25th, marks the ninth anniversary of the successful landing of the Opportunity rover on Mars, where it’s still rolling, digging and exploring to this day. This past Monday, January 21st, however, marked the ninth anniversary of a less happy event on Mars: the near-premature end to the mission of Opportunity’s twin rover, Spirit, due to a software bug known as the “flash memory management anomaly.”


Spirit, like Opportunity, had landed and deployed successfully on Mars a few weeks earlier, on January 4th, 2004, and had spent those first couple of weeks happily tooling around and exploring the red planet. However, on January 21st, Sol 18 of the mission (a sol, or Martian solar day, lasts a little over 24 and a half hours), the mission team at NASA’s Jet Propulsion Laboratory didn’t get a signal from Spirit as expected.

After some initial testing, the operations team was able to get a response from the rover, but nothing more than a beep confirming it was alive. While they were able to rule out problems with the rover’s antenna, they were unable to get any telemetry data from Spirit. By the end of the day, they knew the rover was alive, and they knew the problem lay either with the interface card to the radios or with the flight software (FSW). “Panic started to set in for the operations team,” wrote Glenn Reeves and Tracy Neilson in an official JPL report on the incident.

Over the next two solar days, the operations team was finally able to coax some diagnostic data from the rover and figure out what was happening, if not why. Basically, the flight software was stuck in a continuous reboot cycle. Each time it tried to restart, it encountered an error, which triggered another restart. They suspected the problem was with the rover’s flash memory, where the DOS-based file system was stored. By the end of Sol 20, the operations team still didn’t know the root cause of the problem. They did know that, because the rover couldn’t properly shut down as it was meant to do nightly, its battery power was getting low and it was in danger of overheating, which would end the mission before it had really begun.
