Bottlenecks in information processing

February 3, 2004, 12:00 AM —  ITworld — 

I have recently finished reading an interesting novel. It is a thriller
based around the concepts of cost accounting. No, I am not joking. It
really is a thriller - a novel - and the main topic really is cost
accounting.

The book I'm referring to is called The Goal[1] by Eliyahu M. Goldratt.
It uses a passable, attention-holding domestic story line as a hook on
which to hang a very interesting exploration of manufacturing processes
and how best to manage them for financial gain.

As a software engineer, I found the book fascinating from two very
different angles. First, it is interesting to think how information
technology can best support the process of physical goods manufacturing
through areas such as robotic automation and telemetry for decision
support. Secondly, it is interesting to think of software systems as
examples of manufacturing systems in which the raw material is data and
the "product" is information.

Goldratt's book brilliantly illustrates how the behavior of bottlenecks
in a manufacturing process impacts every other part of the process in a
fundamental way. You need to be intimately aware of all aspects of your
bottlenecks as the health of your entire operation depends on them.
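
As a rough illustration of that point (not something taken from the book), a serial process can be modeled as a set of stages, each with its own rate. The stage names and rates below are hypothetical. In this simple model, the throughput of the whole line is just the rate of its slowest stage, so speeding up any other stage changes nothing.

```python
def pipeline_throughput(stage_rates):
    """Return (bottleneck_stage, rate) for a serial pipeline.

    stage_rates maps a stage name to the items per hour it can handle.
    In this simplified model the whole line moves at the slowest stage.
    """
    bottleneck = min(stage_rates, key=stage_rates.get)
    return bottleneck, stage_rates[bottleneck]


# Hypothetical stages and rates, chosen only for illustration.
rates = {"read": 500, "transform": 120, "write": 400}
print(pipeline_throughput(rates))

# Making a non-bottleneck stage ten times faster does not help.
faster_read = dict(rates, read=5000)
print(pipeline_throughput(faster_read))
```

Both calls report the "transform" stage at 120 items per hour: only work on the bottleneck itself moves the number.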

Reading about manufacturing bottlenecks in the book caused my mind to
wander to the software-system-as-manufacturing-process analogy and to
ask "what are the bottlenecks in software applications?"

In any software system with a lot of data to process, the obvious target
for attention as a possible bottleneck is CPU/RAM. After all, this is
the part of the assembly line through which all data must pass at some
stage. From that conclusion, it is a short step to the follow-on
conclusion that speed of CPU/RAM and by extension efficiency of
processing algorithms executed on that CPU/RAM combination make up the
core of the bottleneck.

Before we pat ourselves on the back and declare the bottleneck found,
let us switch back to physical manufacturing for a quick reality check.
We have machines - computers - that are pretty cheap in comparison to
the cost of most manufacturing equipment. We have lots and lots of data
to process with these computers. Typically many orders of magnitude more
than one machine can process at any one time.

We could either optimize every last scintilla of performance out of one
of those machines or we could get lots of them working on the data in
parallel. The former route costs us lots of time and money in terms of
labor costs (developers) and capital costs for a small number of
top-of-the-range computers. Also, the outcome of the investment in
terms of improved throughput is uncertain. The latter route - lots and
lots of cheap "throwaway" machines - will cost us a fixed amount of
money (low labor costs as we are not optimizing any algorithms) and we
can accurately measure the improvements in throughput we expect to see.

Looking at the problem this way, as a form of manufacturing, it is
pretty much a no-brainer to conclude that splitting inventory into
chunks and getting cheap machines to process the stuff in parallel is
compelling. Common sense, right?
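
A minimal sketch of the idea, assuming the work on each chunk is independent of every other chunk. The function names are hypothetical, and the per-chunk "work" is a stand-in (a simple sum). Worker threads stand in for the cheap machines here; for CPU-bound work in CPython you would more likely use separate processes or, as the column suggests, separate computers.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, chunk_size):
    """Split the 'inventory' into consecutive chunks of at most chunk_size."""
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]


def process_chunk(chunk):
    """Hypothetical unit of work: turn raw data into 'information' (a sum)."""
    return sum(chunk)


def process_in_parallel(items, chunk_size, workers):
    """Process every chunk on a pool of workers and combine the results."""
    chunks = split_into_chunks(items, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the chunks.
        partial_results = list(pool.map(process_chunk, chunks))
    return sum(partial_results)


if __name__ == "__main__":
    data = list(range(1000))
    print(process_in_parallel(data, chunk_size=100, workers=4))  # 499500
```

In this sketch no algorithm was tuned: adding capacity is a matter of raising the worker count, which is the kind of predictable, measurable change the manufacturing view favors.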

If so, how come this approach is so uncommon in application software?

Another book I read recently provides clues. Here is I.L. Auerbach,
quoted from a paper he wrote in 1970[2]:

"The problem [in information systems] has been compounded by laying too
much stress on what poses for efficiency as a design criterion; namely,
speed of computation..."

In the same paper, he says:

"Too frequently, too many designers approach too many problems as
exercises requiring original creativity. The predictable result is that
the system built is unique and is (a) not adaptable to different problem
domains, (b) not easily maintained, and (c) not economic."

There is food for thought there. In the thirty-four years since Auerbach
wrote that paper, very little has changed. We still think the cure for
the bottleneck of software lies in speeding up the computation part and
using the creativity of expensive engineers to get it.

Maybe we should turn our gaze away from CPU/RAM and away from
hyper-efficient algorithms in our search for the bottlenecks. Maybe the
bottleneck lies elsewhere? If we conceptualize the problem as a true
manufacturing process, then splitting the processing into batches and
doing the work in parallel on cheap machines jumps off the page as the
obvious thing to do.

Alas, we don't do it. We software engineers just do not conceptualize
what we do in manufacturing terms.

Perhaps the real bottleneck lies between our ears?

[1] "The Goal"
http://www.starvingmind.net/detail/0884270610/The_Goal_A_Process_of_Ongo...

[2] "The Skyline of Information Processing"
http://www.bookfinder.com/dir/i/The_Skyline_of_Information_Processing/04...

» posted by ITworld staff
