APIs Considered Harmful

I have an uneasy relationship with APIs. Part of me thinks that the

concept of an API is the most useful abstraction since binary logic.

Part of me thinks that APIs are a root cause of a wide range of ills,

including vendor lock-in and exploding software maintenance costs.

This week, I am firmly in "I don't need no stinkin' API" mode. I have

been doing some Web application development that involves uploading XML

instances to a Web server programmatically by performing HTTP POST

requests. Basically, my task was to do, programmatically, what a

browser does when it submits an HTML form.

My first port of call was the truly indispensable Ethereal

(http://www.ethereal.com). This is a free network protocol analyzer

that allows you to monitor what is happening on a TCP/IP network.

Using Ethereal, I was able to launch my browser and get to the point

where I was submitting the "Upload XML" form. I then started recording

HTTP traffic with Ethereal.

When the upload was done, I was able to look at the HTTP POST request

and response pairs to figure out what my program needed to do to

emulate what the browser had done. So far, so good. I dug out my trusty

development tools, which positively ooze APIs for doing this and that

on the Web, and set to work.

It was all downhill from there. Each HTTP API I tried had a different

conceptual model describing what was going on underneath. Some allowed

me to accumulate HTTP headers by tacking them together. Others provided

a hashtable interface that enforced the uniqueness of HTTP header

names. The former type did not support MIME-encoded payloads; the

latter supported MIME, but in using them I could not control the order

in which the headers were emitted, making it *very* difficult to know

if the stuff I was generating was the same as the stuff recorded in the

Ethereal traffic log.

Some APIs blurred the distinction between URL-encoded parameters and

body-encoded parameters. Some handled redirects transparently, others

did not. Every one of them had a different view on cookies, ranging

from "cookies are just another HTTP header" all the way to "cookies are

the one true reason for living."

After a few days of this, I gave up and went back to basics. "How hard

can it be?" I said to myself. "I have a URL, I can create a TCP socket,

I can send stuff over the wire in plain text and read stuff back in

plain text."

So, I created a socket and started to send and receive stuff "by hand,"

as it were.
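
To give a feel for what "by hand" means here, the following is a minimal sketch in Python, not the code I actually wrote. The host name, the path, the text/xml content type, and the function names build_post_request and send_raw are all assumptions made for illustration. It uses HTTP/1.0 so that the server closes the connection when the response is complete, which keeps the reading loop trivial.

```python
import socket


def build_post_request(host, path, body, content_type="text/xml"):
    """Assemble the request bytes exactly as they will appear on the wire.

    The headers are kept in a plain list, so they are sent in the order
    written here and can be compared line by line with a traffic capture.
    """
    header_lines = [
        f"POST {path} HTTP/1.0",
        f"Host: {host}",
        f"Content-Type: {content_type}",
        f"Content-Length: {len(body)}",
    ]
    # A blank line (CRLF CRLF) separates the headers from the body.
    head = "\r\n".join(header_lines) + "\r\n\r\n"
    return head.encode("ascii") + body


def send_raw(host, port, request):
    """Open a TCP socket, write the request, and read until the server closes."""
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(request)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


if __name__ == "__main__":
    # Hypothetical payload and server; substitute whatever the capture shows.
    xml = b"<?xml version='1.0'?><doc/>"
    request = build_post_request("upload.example.com", "/upload", xml)
    print(request.decode("ascii"))
    # response = send_raw("upload.example.com", 80, request)
```

A real browser form upload would most likely use a multipart/form-data body rather than a bare XML payload; the point of the sketch is only that the whole request is a few lines of text followed by the body.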

A couple of interesting things happened:

1) I had broken the back of the problem in under an hour.

2) I got to see my kids before they went to bed in the evening

because I no longer had to work till 10 p.m.

I was truly shocked at how simple it all really was -- once I got rid

of the APIs and took a look at what was really going on under the hood.

This experience taught me a number of lessons.

Firstly, APIs are no substitute for actually understanding what is

going on in any system. In some cases -- such as complex calculations

and the like -- I can see why APIs are vital. But for things like HTTP

-- protocols -- I have my doubts. The APIs get in the way of

understanding the concepts, which are really quite simple. When the API

hits the wall -- as happened in my case -- your productivity hits the

wall too, unless you can think past the API to the underlying reality.

Top amongst my doubts about APIs for things like HTTP is that HTTP,

when all is said and done, is a document exchange protocol. I send you

a document, consisting of various headers and an optional body, and you

send me back a document, also consisting of various headers and an

optional body. That's it. That is all there is to it.
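
The "headers plus optional body" view can be made concrete with a few lines of Python. This is a rough sketch under stated assumptions: parse_message is a name invented here, and the code ignores details such as folded header lines and chunked bodies. It splits a raw message at the first blank line and keeps the headers as an ordered list of pairs, so repeated names such as Set-Cookie survive intact.

```python
def parse_message(raw):
    """Split a raw HTTP message into (start_line, headers, body).

    headers is a list of (name, value) pairs in wire order; it is
    deliberately not a dictionary, so duplicates and ordering are kept.
    """
    # The first blank line (CRLF CRLF) ends the header section.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    start_line = lines[0]
    headers = []
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers.append((name.strip(), value.strip()))
    return start_line, headers, body


if __name__ == "__main__":
    # A made-up response, used only to exercise the function.
    raw = (
        b"HTTP/1.0 200 OK\r\n"
        b"Content-Type: text/xml\r\n"
        b"Set-Cookie: a=1\r\n"
        b"Set-Cookie: b=2\r\n"
        b"\r\n"
        b"<ok/>"
    )
    start_line, headers, body = parse_message(raw)
    print(start_line)
    for name, value in headers:
        print(name, "=", value)
    print(body)
```

The same function works on a request as well as a response, since both have the same shape: a start line, some headers, a blank line, and an optional body.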

Secondly, I have learned that APIs can play into the hands of those who

don't really want you to understand what is going on underneath, as

that would threaten their control over your conceptual model of how the

system works. I can think of numerous examples over the years where

APIs have had this effect in the industry.

Thirdly, I have learned to really appreciate the power of

"distributed systems as stateless document exchange" model, which I

believe lies at the heart of what makes HTTP so stunningly successful.

Fourthly, I have learned to be even more skeptical than before about

the slew of APIs doing the rounds in the XML development community. An

XML instance is just a document, guys; you need to understand the

document structure and document interchange choreography of your

systems. Don't let some API get in the way of your understanding of XML

systems at the document level. If you do, you run the risk of becoming a

slave to the APIs and hitting a wall when the APIs fail you.

With an "everything is a document exchange, choreographed over time"

view of the world, one can make some interesting connections between

XML and the technologies that preceded it, such as email. Take RFC 2821

(http://www.faqs.org/rfcs/rfc2821.html) for example. It describes the

SMTP protocol. Look inside and what do you see? Essentially, an

augmented BNF (Backus-Naur Form) description of email *documents*.

And a BNF description is? Well, it's a grammar -- a schema, if you

like, not a million miles removed from XML Schema, DTD, or Relax NG.

Formal descriptions of document structures, along with details of how

such documents are exchanged over time...

The biggest conclusion I have come to is that APIs are fundamentally

good as shortcuts to getting your work done, but *only* if you fully

understand what is going on underneath. Using APIs as a substitute for

understanding what is really going on under the hood is a bad idea

which will come back to haunt you.
