August 02, 2012, 8:38 AM — Some Linux server administrators found out that time was not on their side yesterday, when an errant signal from some Network Time Protocol (NTP) servers broadcast a new leap second that adversely affected many servers unprepared for the change.
If that lede seems like a bit of déjà vu, don't worry, it's still August, not July. But Leapocalypse apparently had one more present to deliver to Linux servers, even a month later. Some are speculating that this fake leap second may have been part of a coordinated attack.
Just before midnight UTC July 31 (1900 EDT), servers polling some NTP servers for the usual time and date information started announcing that client applications should apply a single leap second.
This immediately caused consternation among humans and computers alike. Initially, server admins discussing the presence of "kernel: Clock: inserting leap second 23:59:60 UTC" in their logs suspected that something was wrong with their local ntp daemons. It soon became apparent that the cause lay in the actual servers themselves.
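For readers wondering how a server "announces" a leap second at all: in the NTP packet format, the first byte of the header carries a two-bit leap indicator alongside the version and mode fields. The following is a minimal Python sketch of decoding that byte, not a description of what any particular ntp daemon does internally. The function name and the sample byte are made up for illustration.

```python
# Meanings of the two-bit leap indicator in the NTP header.
LEAP_MEANINGS = {
    0: "no warning",
    1: "last minute of the day has 61 seconds",
    2: "last minute of the day has 59 seconds",
    3: "unknown (clock not synchronized)",
}

def parse_first_byte(first_byte):
    """Split the first NTP header byte into (leap, version, mode)."""
    leap = (first_byte >> 6) & 0b11      # top two bits
    version = (first_byte >> 3) & 0b111  # next three bits
    mode = first_byte & 0b111            # low three bits
    return leap, version, mode

# Hypothetical sample byte: 0x64 is binary 01 100 100,
# i.e. leap=1, version=4, mode=4 (a server reply).
leap, version, mode = parse_first_byte(0x64)
print(leap, version, mode, "->", LEAP_MEANINGS[leap])
```

A client that trusts its upstream servers will act on a leap value of 1, which is consistent with the kernel message admins saw in their logs.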
"This is just to warn you that there are now some NTP servers around the globe spreading a leap second announcement for tomorrow 00:00:00 UTC (so, basically, in a few hours now)," NTP team member Marco Marongiu announced to the [ntp:questions] mailing list about four hours before midnight in Greenwich.
For servers, the cause didn't matter: the effect was pretty much the same. Any Linux machine that was patched to correct last month's leap second bug would be unaffected. But any server that was not patched with the hrtimer fix that was released on July 17 for Linux 3.0.38, 3.2.24, 3.4.6, and the development branch of Linux 3.5 by kernel developer John Stultz would be affected by the same problems that brought servers down last month, when a planned leap second was added to the NTP servers on June 30.
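As a rough illustration, the release numbers above can be turned into a quick check against a kernel release string. This is only a sketch under loud assumptions: the function name is hypothetical, it knows only the three stable releases named here, and it says nothing reliable about distribution kernels, which often backport fixes without changing the upstream version number.

```python
import re

# Stable releases named in the article as carrying the hrtimer fix,
# keyed by (major, minor) series. Nothing else is known to this sketch.
FIXED_IN = {(3, 0): 38, (3, 2): 24, (3, 4): 6}

def has_hrtimer_fix(release):
    """Return True or False for a listed series, None when the series is not listed."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        return None
    major, minor = int(match.group(1)), int(match.group(2))
    patch = int(match.group(3) or 0)
    fixed_patch = FIXED_IN.get((major, minor))
    if fixed_patch is None:
        return None  # series not covered by the article
    return patch >= fixed_patch

print(has_hrtimer_fix("3.2.24"))  # at the fixed release
print(has_hrtimer_fix("3.0.37"))  # one release short of the fix
print(has_hrtimer_fix("2.6.32"))  # series not listed, so no answer
```

On a live machine the release string could come from Python's platform.release(), but a None or False result here should be read as "check with your distributor," not as proof of exposure.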
Leap seconds are added to the world's official clocks from time to time to keep clocks in sync with the various changes in the rate of the Earth's rotation. Typically those changes are scheduled and announced well in advance. But last month, a bug in the high-res timer (hrtimer) choked on the leap second and sent signals to sleeping applications to wake them up one second in the future, relative to a system's clock. All those applications waking up and running at once ultimately locked the CPUs up.
Another bug, this one in Cassandra, meant the leap second did not pause Java processes, so those processes would eventually build up and overload the processors, too.
Yesterday's problems were not as far-reaching as those of the end-of-June leap second, but it was clear that the fake leap second was still causing problems for administrators who had failed to patch their machines last month. On the one hand, this would be understandable--after all, no leap seconds were scheduled, and they are typically only added at the end of June and December.
But vulnerabilities left open because of human error are still the bane of IT, and this was a patch that--like all the others--should have been caught.
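The point that leap seconds are typically only added at the end of June and December suggests a simple sanity check an admin could apply to an unexpected announcement. The sketch below is a heuristic invented for illustration, not something ntpd or the kernel is known to do, and the function name is hypothetical.

```python
from datetime import date

# Months in which leap seconds are typically inserted, per the article.
USUAL_LEAP_MONTHS = {6, 12}

def announcement_looks_unusual(utc_day):
    """Flag a leap-second announcement seen outside June or December."""
    return utc_day.month not in USUAL_LEAP_MONTHS

print(announcement_looks_unusual(date(2012, 7, 31)))  # the fake announcement
print(announcement_looks_unusual(date(2012, 6, 30)))  # the planned leap second
```

A check like this would have flagged the July 31 announcement while letting the planned June 30 leap second through, though it is no substitute for patching.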
No one is sure yet what the cause was for the fake leap second, but Marongiu raised a chilling possibility on Wednesday.
"I tried to collect some information around the globe, but with scarce/no feedback. I am suspecting that this could be a rather imaginative attempt to DOS worldwide," Marongiu wrote.
Imaginative indeed. If this was a deliberate attempt to tamper with servers based on NTP signals, it leaves a new vector for attack wide open. If NTP servers start sending time data that's very far off, it is not clear what that would do to servers that depend on NTP to keep time. Given the effect one second had both last month and this, it's hard to imagine it would be good.
Read more of Brian Proffitt's Open for Discussion blog and follow the latest IT news at ITworld.