How to build wireless apps that fail three times in a trillion (and a wireless bike brake)

Brake experiment reveals the key to building super-reliable wireless applications

German computer scientists working on a product that seems either doomed or useless to its theoretical customers have actually done more than create a bicycle brake controlled using a wireless network.

They have highlighted, quantified and laid out in clearly definable terms the false assumptions, poor decisions and sloppy systems design that make building or using wireless applications chancy.

In building a wireless bike brake no one will buy, they've provided all the evidence you'll ever need on how to build wireless apps that work quickly and reliably no matter how thin the bandwidth, or intolerant the users or other applications are of any delay.

If you're a cyclist, or interested in cycling gear, read the press release here, because it may be the only time you ever see anything about wireless brakes again.

If you're interested in wireless application development or deployment, go straight to the research paper itself (PDF). The wireless brake may be destined to fail, but the things its inventor learned about how to make machines talk efficiently over wireless networks are definitely worth knowing.

"Wireless brake" and "hit by a truck" sound the same to a cyclist

Despite the impression you may have gotten from Lance Armstrong's obsession with gear and all the new designs, fashions and colorful stuff packed into your local bike store, the cycling industry and cyclists themselves are not quick adopters of new technology.

Sure, bike companies manipulate carbon fiber with the best of them and obsolete their own products so customers can buy replacements nearly as quickly as computer-industry vendors do.

They don't change the basics much, though.

The basic double-diamond shape of the bicycle itself hasn't changed in more than a hundred years. Design and functions of the components evolve slowly. New generations often look identical to old generations, with a few percentage points of improvement in performance or reliability built in or a few grams of weight shaved off.

So word that a group of computer scientists at a German university has built a set of brakes that uses a small motor as the braking mechanism and a wireless signaling device to tell it when to brake and how hard is unlikely to cause cyclists to line up to try it.

Even its inventor only wanted to teach his wireless brake to talk

Making a popular set of bike brakes wasn't really the point of the project, however.

The project was to find out how to make the wireless connection between two components of a real-time system, where milliseconds separate success from failure, more reliable than systems connected by a wire normally are.

Bike brakes are small and cheap, compared to controls for a locomotive or chemical plant, for example. So they're easy to work on.

And the timing requirements make for a pretty demanding obstacle to overcome.

On a bicycle – which gives the rider no protection at all from obstacles and on which even the most expert riders frequently crash – a brake that responds precisely as you expect it to exactly when you need it is not optional. There is no time to wait for a lag caused by static or conflicting radio signals or magnetic interference or the million other things that make your cell phone or laptop freeze up on a WLAN once in a while.

The need to slow suddenly from 30 MPH to zero leaves so small a window of opportunity that the brake has only 250 milliseconds to engage from the time the wireless control is pressed, according to the functional analysis done by the German research team. Their aim is to figure out how to build wireless controls that respond quickly and accurately within the tiny slivers of time that make up the normal operating requirements of real-time, computer-driven control systems.

The Germans don't really care about bikes, let alone bike brakes. They care about stopping trains, cranes, airplanes, drawbridge motors, industrial machinery and every other bit of technical appliance or machine being designed with wireless controls to make them more convenient and up-to-date, according to a paper published by IEEE called A Verified Wireless Safety Critical Hard Real-Time Design.

ProTip on adding reliability to wireless: Add a wire

Although "wireless networks are never a fail-safe method" for controls of any kind because of the limitations and difficulties of broadcasting complex digital commands via radio, almost all new complex industrial systems are being designed with wireless controls, according to Holger Hermanns, chairman of the Dependable Systems and Software department at Saarland University.

Everything from pacemakers to chemical-plant controls is going wireless; freight and passenger locomotive systems that rely on wireless for brakes and other controls are being tested in Europe already, and will be in commercial use within half a decade, Hermanns said.

Making wireless applications like that reliable is far more important than bulletproofing wireless brakes for a bicycle. The quick response time, inability to tolerate failure and even stark limits on size and energy make bikes an ideal test bed for experiments in making wireless controls more reliable, Hermanns said.

If a bike brake fails during a test, someone's probably going to bounce off a tree or wall. Even if it's kept out of live-traffic situations, failing to stop a moving locomotive has much weightier consequences.

"The wireless bicycle brake gives us the necessary playground to optimize these methods for operation in much more complex systems," according to Hermanns who, with his research team, tested the brakes using quality assurance processes and algorithms normally used for aircraft or chemical factories.

The goal was to build the radio-frequency send/receive systems that were as reliable as possible, test configurations that should deliver the best performance, and examine the process and protocol involved in sending commands from brake lever to brake, to spot errors that might stretch out stopping times.

The end result was a system that responded quickly and accurately enough every time but three out of a trillion. That's a reliability rate of 99.999999999997 percent. It's also 13 nines, just so you don't have to count, significantly higher than the data-center quality test of "five nines," or 99.999 percent uptime.

Repetition is not the answer; repetition is not the answer; repetition is not the answer

While the result was good, the way Hermanns and his team found to get there was exactly opposite the one they thought would work.

As you'd expect, they assumed the main reason a wireless control fails is that the signal doesn't reach the receiver in time.

In this case that meant a radio signal sent from a hand-operated controller on the handlebars of a cruiser bike to a brake on the wheel.

Rather than rely on just one point from which commands could be broadcast, researchers put five senders on various points of the bike, each of which would send the same message several times. With all of them sending the same message over and over, all at once, the chances that the signal would not go through in time would have to be divided by the number of senders and the number of times each sent the message, right? Assume three repeats of each command and you cut the likelihood of failure by 15x?
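The back-of-the-envelope reasoning in that paragraph can be written out as a rough sketch. The function names and the 50 percent per-message loss rate below are hypothetical, chosen only for illustration. The second function assumes every copy of a message is lost independently of the others, which is an assumption of this example and not something the researchers claimed.

```python
def naive_division_estimate(p_loss: float, copies: int) -> float:
    """The rule of thumb from the text: divide the failure odds by the number of copies."""
    return p_loss / copies


def independent_loss_estimate(p_loss: float, copies: int) -> float:
    """Chance that every copy is lost, ASSUMING each loss is independent of the others."""
    return p_loss ** copies


senders = 5                  # senders mounted on the bike
repeats = 3                  # repeats of each command per sender
copies = senders * repeats   # 15 copies of every command in total
p_loss = 0.5                 # hypothetical per-message loss rate, for illustration only

print(naive_division_estimate(p_loss, copies))    # the "15x" guess
print(independent_loss_estimate(p_loss, copies))  # far smaller, if losses were independent
```

Either way the arithmetic looks encouraging on paper, but only as long as one copy's failure says nothing about the next copy's chances.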

Sender and receiver communicated using the gMAC networking protocol, which relies on TDMA (time-division multiple access) – a call-and-response system in which each component gets to send just one data point before having to stop and wait for a response from the other.

Each round-trip data exchange made up one slot in a frame of TDMA requests; the length of the frame itself was determined by how long it took the completed message to arrive.

The command language allowed slots in a TDMA frame to be assigned to sender and receiver randomly, using a scheme called Dynamic Slot Allocation (DSA), or it could hand out a seating chart that would tell both sender and receiver which got to speak when and in what order each should send the bits of an overall message. The scripted process was called Fixed Slot Allocation (FSA).

DSA is easier for programmers to use because it doesn't require them to decide every detail about which slot to fill when, and it is far more common in wireless systems than FSA.

Experiments don't always turn out the way you expect; that's why you do them

The team started with a single sender and receiver, but realized they had a problem from the get-go. The quickest round-trip response they managed to get from the brake was 125 milliseconds – 25 milliseconds longer than they wanted to average.

The reason had nothing to do with the radios or interference. The receiver wasn't getting half the messages the sender put out.

That's where redundancy should come in. More broadcast points = fewer lost messages, and it worked, kind of.

The number of messages completely lost dropped by 25 percent – which put the message-loss in the still-unacceptable 37 percent range overall.

Actual response was worse, though. Response times were longer and failures more common because the messages that were getting through were too old – they repeated bits of message the receiver already had, or were part of commands that were already out of date. They were sending 'slow down,' when the delay had caused the message to become 'SSSTOOOOOOOOPPPP.' Not a good characteristic in a brake.
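To make that failure mode concrete, here is a minimal sketch of how a receiver could tell a useful message from a repeated or out-of-date one. Everything in it is hypothetical: the BrakeMessage fields, the classify function and the 100-millisecond deadline are illustration only, not the design described in the research paper.

```python
from dataclasses import dataclass


@dataclass
class BrakeMessage:
    seq: int      # sequence number assigned by the sender (hypothetical field)
    sent_ms: int  # time the command was issued, in milliseconds
    command: str  # e.g. "slow down" or "stop"


def classify(msg: BrakeMessage, last_seq: int, now_ms: int, deadline_ms: int = 100) -> str:
    """Label a message as 'duplicate', 'stale' or 'fresh' from the receiver's point of view."""
    if msg.seq <= last_seq:
        return "duplicate"  # the receiver already has this part of the conversation
    if now_ms - msg.sent_ms > deadline_ms:
        return "stale"      # the command is too old to act on safely
    return "fresh"


print(classify(BrakeMessage(3, 0, "slow down"), last_seq=3, now_ms=40))   # duplicate
print(classify(BrakeMessage(4, 0, "slow down"), last_seq=3, now_ms=180))  # stale
print(classify(BrakeMessage(5, 170, "stop"), last_seq=3, now_ms=180))     # fresh
```

A message that lands in either of the first two buckets counts as "delivered" in a raw loss tally, yet does nothing to stop the bike.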

What's the problem?

The slots allocated in a TDMA frame boil down to opportunities to talk. If the sender is allocated slots 1, 3, 5, and 7, then it says its piece during those slots of time, and the receiver answers during slots 2, 4 and 6.
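The alternating pattern in that example can be sketched in a few lines. The function name fixed_slot_schedule and the seven-slot frame are assumptions made for illustration; the real gMAC frame layout is not described in this article.

```python
def fixed_slot_schedule(num_slots: int) -> dict[int, str]:
    """Give odd-numbered slots to the sender and even-numbered slots to the receiver."""
    return {
        slot: "sender" if slot % 2 == 1 else "receiver"
        for slot in range(1, num_slots + 1)
    }


schedule = fixed_slot_schedule(7)
sender_slots = [slot for slot, owner in schedule.items() if owner == "sender"]
receiver_slots = [slot for slot, owner in schedule.items() if owner == "receiver"]

print(sender_slots)    # [1, 3, 5, 7]
print(receiver_slots)  # [2, 4, 6]
```

Because the schedule is decided ahead of time, each side knows when to talk and when to listen without negotiating anything over the air.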

Dynamically allocating those slots didn't result in an orderly conversation in which sender and receiver each waited its turn so both could say what they needed to most quickly. No, they fought over who got to talk when, often both talking at the same time, with no one listening.

The result is a lot like the collision of packets in a local-area Ethernet network. If more packets try to squeeze through the port onto an external network than it can carry, packets bump into each other and both have to wait their turn.
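A toy simulation shows why random slot choice invites this kind of shouting match. It is a deliberately simplified model, not the gMAC DSA algorithm: each node picks one slot per frame at random, and a frame counts as a collision if two nodes pick the same slot. The function names, frame size and seed are all assumptions of this sketch.

```python
import random


def count_collisions_random(num_frames: int, slots_per_frame: int, nodes: int, seed: int = 0) -> int:
    """Count frames in which at least two nodes randomly pick the same slot."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    collisions = 0
    for _ in range(num_frames):
        picks = [rng.randrange(slots_per_frame) for _ in range(nodes)]
        if len(set(picks)) < nodes:  # a repeated slot means two nodes talked at once
            collisions += 1
    return collisions


def count_collisions_fixed(num_frames: int, slots_per_frame: int, nodes: int) -> int:
    """Count collisions when node i always transmits in slot i."""
    if nodes > slots_per_frame:
        raise ValueError("need at least one slot per node")
    collisions = 0
    for _ in range(num_frames):
        picks = list(range(nodes))  # every node has its own reserved slot
        if len(set(picks)) < nodes:
            collisions += 1
    return collisions


print(count_collisions_random(1000, 8, 2))  # some frames collide
print(count_collisions_fixed(1000, 8, 2))   # 0
```

In this model the fixed assignment can never collide, while the random one collides in a noticeable share of frames even with only two nodes and eight slots to choose from.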

IT's simplest solution to that is to expand the bandwidth so there are fewer collisions.

What Hermanns and his team found is that it's much better to avoid collisions in the first place.

While DSA let both sender and receiver shout without listening, FSA told each when to speak. The result was that, even with only one sender rather than five, commands nearly always got through with little or no delay.

Average response times dropped far below the 100 millisecond limit, and the percentage of dropped or failed messages fell from between 35 percent and 50 percent to...0.003 percent.

Once the messages are actually getting through, then you can test for the kinds of things you assumed were the problem in the first place – external sources of static that cause problems with the radio signal itself.

You might not need to, though.

"The key to arrive at a safe [reliable] design is to drastically reduce the individual message loss probabilities," Hermanns and his team concluded. "For the [wireless brake] system this is achieved – maybe not surprising – by avoiding randomness in slot assignment, using the fixed slot allocation scheme...this twist results in a design with very high reliability guarantees, far beyond the 'five-nines' yardstick."

The dryly worded research paper doesn't drive the point home too sharply, but for developers and networking engineers, the message should be clear: redundant signals and overabundant bandwidth do not deliver rapid response times or eliminate lag on their own. Adding more bandwidth is an inefficient way to fix a bottleneck within an application, especially one that is very time sensitive.

The best way to make a wireless network an efficient and reliable medium for time-sensitive command-and-control signals is to make sure the messages being sent are as clearly defined as possible and that the developer or the application itself determines ahead of time seemingly trivial variables like whether client or server gets to talk first and when it's acceptable for them to just talk over each other.

I hate to tell you this if you work in IT, because you're almost certainly working on one set of wireless applications or another, whether that means supporting WLANs, building apps to run on handhelds and communicate via 3G, or working on more complex system-programming jobs like teaching locks, RFID cards and building controls to talk using Near Field Communications, Bluetooth or some other short-distance wireless protocol.

Whichever it is, and whatever limits you face because you're using commercial software, not something whose root you can grab sufficiently to change the way it creates data frames or packets, the best way to make a networked application run fast and reliably is to pare down the options it has to communicate, give it the shortest most concrete commands and ways to communicate them, and don't bother with signal replicators until you know for sure the unacceptable lag times you're trying to troubleshoot aren't due to bits of the application contending for space to talk and not giving any other component a chance.

It's a critical difference that will become more important as wireless networks become the norm for corporate networks, rather than an add-on that's considered a benefit, but not a reliable resource.

Holger Hermanns and his unmarketable wireless bike brake, circumlocutory, academic writing style and abstruse choice of topics may seem an odd source from which to get confirmation of what seems like a core principle of good wireless application design, but you take insight where you can get it.

This seems like a good one. Much better than the assumption that the world needs another failed high-tech bicycle component, anyway.

Read more of Kevin Fogarty's CoreIT blog and follow the latest IT news at ITworld. Follow Kevin on Twitter at @KevinFogarty. For the latest IT news, analysis and how-tos, follow ITworld on Twitter and Facebook.
