Benchmarking wireless LANs

Farpoint Group

For the past few weeks, I've been locked in various dark rooms (OK, it's not as bad as that) performing benchmark tests on a number of wireless LAN systems. Some of the products tested were residential in nature (router/access point combinations), and some were aimed more at the enterprise (standalone fat APs and systems based on wireless switches). I was (and, for that matter, remain more than ever) interested in various aspects of performance, including throughput, time-bounded (or isochronous) behavior critical to real-time services like voice and streaming video, load balancing and optimization, and ease of use. While I'm not going to cover the specific results here (they'll appear in various magazines and other articles), I would like to present you with a little background on some of the core issues, subtle and otherwise, surrounding the benchmarking of WLANs in the event you ever need (or just want) to spend hours or even days on this often-revealing task.

First of all, benchmarking has been around for a long time. The core idea is to present two different configurations of equipment that perform the same function with an identical set of conditions and then see how each does. To use a simple example, consider drag racing (the legal kind, mind you). Take two cars, have their respective drivers hit the gas at the same time and the first one to reach the finish wins. But note the variables - the driver, for example; some drivers may have slower reaction times than others, and thus the potentially superior performance of a given car is masked by a variable that, when and if addressed, could yield a better result on the next run. Thus, the core principle behind benchmarking is to eliminate all variables wherever and whenever possible.

And therein lies the biggest problem for anyone benchmarking WLANs - the rather large number of variables. Here are the important ones you need to be aware of:

  • The radio environment: WLANs operate in the unlicensed bands, often called the "kitchen sink" or even "garbage" bands. This is because the spectrum we use is already occupied by other services and systems, and we need to put up with any radio interference that might be present - whether from other WLANs, an increasingly likely phenomenon, or from other types of devices that use the same frequency bands. Because WLANs use various forms of spread-spectrum communications, we are somewhat protected, but not completely. If there's a lot of other traffic of any form, or just transient but stronger signals, the benchmark may be affected. We can deal with this problem by monitoring the radio environment with a spectrum analyzer (still too expensive for most casual users), and by doing multiple (or longer) benchmark runs and averaging the results. Really anomalous (completely unexpected) results are suspect until interference is eliminated as a possibility.

  • Antenna orientation: Radio propagation often produces dead spots, which are actually deep fades resulting from the geometric relationship of transmitter and receiver, as well as the specific frequency being used in a particular case. You've experienced this, no doubt, while listening to the radio while driving. You might notice, while stopped at a traffic light, that the quality of the radio program degrades. Pull forward a meter or so, and the problem goes away. The same thing can happen with any radio signal. To combat this, we've started using turntables that allow the client end of the connection to revolve at roughly one RPM. The turntable is the guts of a motorized display case, often used for jewelry or watches in retail stores. The net result is that the rotation minimizes the effect of dead zones. You can build your own turntable inexpensively. Just make sure you turn off any power-conservation features in the notebook computer, since you'll be running on batteries. The power reductions will often have a negative effect on radio performance.

  • Distance between endpoints: It's universally true that radio performance varies inversely with range - the farther you go, the slower you go. So I usually run the same benchmark tests over a number of distances so as to get a better idea of what real-world performance will be like. In general, I like to start with the AP and the client very close to one another (perhaps three meters or so, and certainly in the same room), and then test over more realistic distances, especially between rooms and floors in residential settings. Note that building construction and materials can also be factors here. We've also seen issues when positioning multiple clients too close to one another, or when using adjacent 802.11a channels. It's best to spread things out a little in both cases.

  • Driver settings: I almost always use whatever the default driver settings might be. I set the SSID, turn off security (at least initially), and otherwise strive for as plain a vanilla configuration as possible. You might want to experiment with manually setting driver parameters in some cases, but carefully note what you're doing and avoid making too many changes at once so as to understand the effect of each change individually. In fact, it's best to set one parameter at a time - and note that a simple, five-minute benchmark exercise can easily turn into an all-day (or even multi-day) affair once you start digging in. You must carefully document your work and check and re-check your settings!

  • Clients and loads: The number of client devices (typically notebook computers) and the synthetic workloads they present to the network (in terms of both total volume and transaction frequency) will have a significant impact on the results obtained. The most important thing here is to present precisely the same set of conditions to all configurations being compared as part of the benchmarking exercise.
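The advice above about doing multiple runs, averaging them, and treating anomalous results as suspect can be sketched in a few lines of Python. This is only an illustration: the function name, the 25% tolerance, and the throughput figures are all assumptions made up for the example, not values from any actual test.

```python
from statistics import mean, median


def summarize_runs(throughputs_mbps, tolerance=0.25):
    """Average repeated benchmark runs and flag suspect ones.

    A run is flagged when it deviates from the median by more than
    `tolerance` (a fraction of the median). The 25% default is an
    arbitrary, illustrative choice.
    """
    if not throughputs_mbps:
        raise ValueError("need at least one run")
    mid = median(throughputs_mbps)
    suspects = [
        (index, value)
        for index, value in enumerate(throughputs_mbps)
        if abs(value - mid) > tolerance * mid
    ]
    return {"mean": mean(throughputs_mbps), "median": mid, "suspects": suspects}


# Hypothetical throughput figures (Mbps) from five runs of one configuration.
runs = [21.4, 20.9, 21.8, 9.7, 21.1]
summary = summarize_runs(runs)
```

Here the fourth run would be flagged. A flagged run is not necessarily wrong; it is simply a candidate for a re-run once interference has been ruled out.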

Again, the key to a good benchmark is to hold the conditions as constant as possible during the test. The test should run at least a couple of minutes so as to average out the effect of any transient interference (including interference caused by you walking around). I will occasionally run tests over several hours, but I think that's less important than it used to be, given the improved performance of contemporary radios.

The next step is to decide on the actual benchmark to be used. I don't think it's possible to come up with a single figure of merit, given all of the above issues, so you need to pick a test that's indicative of what you want to do. For raw throughput, I suggest IxChariot from Ixia or Iperf from the National Laboratory for Applied Network Research. The latter is freely distributed but not as robust (or as complex) as the former.
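As a rough sketch of what an Iperf throughput run might look like when scripted, the Python below builds a client command using Iperf's standard `-c` (server to connect to), `-t` (test duration in seconds), and `-i` (reporting interval) options. It assumes `iperf -s` is already running on a machine on the wired side of the AP; the address 192.168.1.10, the function names, and the two-minute default are placeholders chosen for illustration.

```python
import subprocess


def iperf_client_command(server_host, duration_s=120, report_interval_s=10):
    """Build the argument list for a TCP throughput run with the Iperf client."""
    return [
        "iperf",
        "-c", server_host,              # connect to the Iperf server
        "-t", str(duration_s),          # total test time in seconds
        "-i", str(report_interval_s),   # seconds between interim reports
    ]


def run_iperf(server_host, **options):
    """Run the client and return its raw text report (requires iperf installed)."""
    result = subprocess.run(
        iperf_client_command(server_host, **options),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Hypothetical server address; run_iperf() is not called here because it
# needs a live Iperf server to talk to.
command = iperf_client_command("192.168.1.10")
```

Keeping the command construction separate from the execution makes it easy to log exactly what was run, which helps when documenting settings across many test passes.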

For time-boundedness benchmarking, it's a bit more complex. Since a human is in the loop in the actual use of the device, it makes sense to have one in the loop during benchmarking. When testing voice or video performance, perceived quality is what matters the most. Hence, what most researchers use is something called mean opinion score, or MOS. This is a subjective rating of the quality of a connection on a scale of 1 to 5, with 5 being best and anything above a 4 considered quite good. Since any given individual will have a different idea of just exactly what each number of the scale means and corresponds to, a large sample is required. But, of course, test conditions will vary for the reasons noted above, so, well, benchmarking isn't perfect.
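To make the point about sample size concrete, here is a minimal sketch of computing a MOS from a panel of ratings, along with a rough indication of how uncertain it is. The 95% interval uses a simple normal approximation, which is an assumption of this example and tends to be optimistic for small panels; the ratings themselves are invented.

```python
from math import sqrt
from statistics import mean, stdev


def mean_opinion_score(ratings):
    """Return (MOS, half-width of an approximate 95% confidence interval).

    Ratings must be on the 1-to-5 scale. The interval uses a normal
    approximation (1.96 standard errors), which is only a rough guide.
    """
    if len(ratings) < 2:
        raise ValueError("need at least two ratings")
    if any(not 1 <= rating <= 5 for rating in ratings):
        raise ValueError("ratings must be between 1 and 5")
    half_width = 1.96 * stdev(ratings) / sqrt(len(ratings))
    return mean(ratings), half_width


# Hypothetical ratings from ten listeners for one voice call.
ratings = [4, 4, 5, 3, 4, 4, 5, 4, 3, 4]
mos, half_width = mean_opinion_score(ratings)
```

With only ten listeners the interval is wide (roughly plus or minus 0.4 around a MOS of 4), which is why a large sample is needed before two scores can be meaningfully compared.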

And it's for that reason that benchmark results need to be taken with a grain or two of salt. I don't pay much attention to the specific numbers, but I do note the broad trends which usually appear after many test runs. Is the throughput of a given product usually higher than that of another (note my emphasis on the relative, not the absolute)? Do certain combinations of gear work better than others? Are some products just easier to use or more robust or functional than others? Without regard to other factors, do some products have features which might recommend them even if they fall short in other areas?
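One way to focus on the relative rather than the absolute is to compare two products run by run under matched conditions. The sketch below is an assumption about how one might do that, not a description of any particular test method: it reports the ratio of median throughputs and the fraction of paired runs in which the first product came out ahead. The product figures are made up.

```python
from statistics import median


def compare_products(runs_a, runs_b):
    """Compare paired throughput runs taken under the same conditions."""
    if not runs_a or len(runs_a) != len(runs_b):
        raise ValueError("need equal, non-empty lists of paired runs")
    wins_a = sum(1 for a, b in zip(runs_a, runs_b) if a > b)
    return {
        "median_ratio": median(runs_a) / median(runs_b),
        "a_win_fraction": wins_a / len(runs_a),
    }


# Hypothetical throughput (Mbps) for two products at four test locations.
product_a = [22.0, 21.0, 23.0, 20.0]
product_b = [18.0, 19.0, 21.0, 20.5]
comparison = compare_products(product_a, product_b)
```

A product that wins most pairings by a modest ratio tells you more than any single headline number.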

The rapidly evolving nature of the wireless-LAN industry makes benchmarking a transient activity at best. Given the steady stream of new products, features, and enhancements, it's a safe bet that whatever you test today won't have a lot of relevance in a few months. Still, with so many firms now contemplating the installation of WLANs, I expect to be doing a lot more testing over the next couple of years. And it's also become evident to me that a vendor-neutral resource for WLAN testing and benchmarking is required. I'm working on this, and I hope to have progress to report to you on this front in a few months.
