Normally in enterprise application integration (EAI), when we say "send X to
Y", there are two distinct aspects to be ironed out. The first is the "how"
part. The second is the "what" part. When thinking of the "how"
part, thoughts turn to things like interfaces and APIs and function calls and
so on. When thinking of the "what" part, thoughts turn to things like
strings and tables and integers and data structures.
All successful examples of interconnecting A and B in EAI feature both "how"
and "what". Unfortunately, the two can get terribly intertwined, leading
to integration strategies that are difficult to develop, deploy and debug.
A useful pattern exists for combining the "how" with the "what"
in a way that keeps the two separate and leads to clean, manageable integrations.
It is best illustrated with a worked example.
Imagine that system A has some data that needs to be regularly sent to system
B. System B has an API that can be used to send it data -- i.e. it allows the
development of custom client applications. The obvious first thing to do is
to program right to this API and thereby inject the data from A directly into
B. The trouble with this obvious approach is that A and B end up temporally
coupled: a transfer succeeds only if both systems are up and responsive at
the same moment.
This is fine, and everything will work, as long as (a) system B is never
down for maintenance, (b) system B never slows down to the point where your
injecting program times out and (c) there are never debugging nightmares in
which A says it has sent something, B claims not to have got it and you are
sitting in the middle scratching your head...
Temporal coupling bites, and when it bites, it bites hard.
Let us go back to basics for a moment. A sends data to B. Okay. We can split
the problem by pumping the data out of A into some persisted file format. We
can develop and debug this piece standalone - without having a B running. Now
the problem reduces to injecting into B - not from A - but from the data previously
serialized out of A. And guess what, by being clever about it, we might find
we don't have to inject at all! It could be that we can target a file format
that B already understands and use an existing "import" facility to
load it up. Now we have successfully sent data from A to B in two independently
debuggable hops.
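
The two hops can be sketched as follows. This is a minimal illustration, not a real integration: the JSON-lines file format, the function names export_from_a and import_into_b, and the use of a plain list to stand in for B are all assumptions made for the example.

```python
import json
import tempfile
from pathlib import Path


def export_from_a(records, out_path):
    """Hop 1: serialize A's records to a file, one JSON object per line."""
    with open(out_path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")


def import_into_b(in_path, b_store):
    """Hop 2: load a previously exported file into B (here, a plain list)."""
    with open(in_path, encoding="utf-8") as f:
        for line in f:
            b_store.append(json.loads(line))


# Hypothetical data standing in for whatever A holds.
a_records = [{"id": 1, "name": "widget"}, {"id": 2, "name": "gadget"}]
b_store = []

export_path = Path(tempfile.mkdtemp()) / "a_export.jsonl"
export_from_a(a_records, export_path)  # can be run and inspected without B
import_into_b(export_path, b_store)    # can be run and debugged without A
```

Each function touches only the file, so either one can be tested with the other system switched off.
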
Now let's imagine that both hops are working fine and we want to go the whole
hog - making the two hops behave as one. We write data out of A as before but
now, at the end of that process, we also use B's API to invoke the data import
into B. We achieve the full effect of seamless integration from A to B but with
some major benefits, which the following common scenarios illustrate.
Common scenario. A is ready to send data to B but darn it, B is down at the
moment! What to do? Just send the data to disk and queue it up. When B comes
online, do the import. A can go on about its business.
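
A rough sketch of this queue-on-disk behaviour, under stated assumptions: FakeB is a hypothetical stand-in for system B and its import API, and the spool directory layout and file naming are invented for the example.

```python
import itertools
import json
import tempfile
import time
from pathlib import Path

_counter = itertools.count()  # tie-breaker so file names stay unique and ordered


class FakeB:
    """Hypothetical stand-in for system B and its import API."""

    def __init__(self):
        self.online = False
        self.imported = []

    def import_file(self, path):
        if not self.online:
            raise ConnectionError("B is down")
        self.imported.append(json.loads(Path(path).read_text(encoding="utf-8")))


def drain(spool_dir, b):
    """Import queued files oldest first; stop at the first failure."""
    for path in sorted(spool_dir.glob("*.json")):
        try:
            b.import_file(path)
        except ConnectionError:
            return False  # B is still down; leave the rest queued
        path.unlink()     # remove a file only after a successful import
    return True


def send(record, spool_dir, b):
    """Always write to disk first, then try to push the queue into B."""
    name = f"{time.time_ns():020d}-{next(_counter):06d}.json"
    (spool_dir / name).write_text(json.dumps(record), encoding="utf-8")
    return drain(spool_dir, b)


spool = Path(tempfile.mkdtemp())
b = FakeB()
send({"id": 1}, spool, b)  # B is down: the record waits on disk
send({"id": 2}, spool, b)  # A carries on regardless
b.online = True
drain(spool, b)            # B is back: queued records are imported in order
```

Writing to disk before attempting the import is the design choice that matters here: A never blocks on B, and nothing is lost if the import fails partway.
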
Common scenario. A says the data was sent to B but there is no sign of it in
B. Start your diagnostics by looking at the data serialized out of A. If all
is well there on the disk then the problem is downstream of A. If not, the problem
is in A. There are few things more frustrating than bug reports that claim A
is broken when the problem is really in B, and vice versa.
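
The diagnostic step amounts to a simple check against the file on disk. The sketch below assumes a JSON-lines export in which every record carries an "id" field; the function name locate_fault and the set of ids found in B are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def locate_fault(record_id, export_path, ids_in_b):
    """Say which side to investigate for a record that should be in B."""
    exported_ids = set()
    with open(export_path, encoding="utf-8") as f:
        for line in f:
            exported_ids.add(json.loads(line)["id"])
    if record_id in ids_in_b:
        return "delivered"
    if record_id in exported_ids:
        return "downstream of A"  # A did its job; look at the import or at B
    return "in A"                 # the record never reached the disk


# Hypothetical state: A exported records 1 and 2, but B only holds record 1.
export_path = Path(tempfile.mkdtemp()) / "a_export.jsonl"
export_path.write_text('{"id": 1}\n{"id": 2}\n', encoding="utf-8")
ids_in_b = {1}

verdicts = [locate_fault(i, export_path, ids_in_b) for i in (1, 2, 3)]
```
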
Common scenario. You wish to stress test A's ability to output a gazillion
files for B. By targeting the data exported from A, you can stress test and
validate all of A's processing independently of B. The reverse is also true:
B's import can be stress tested with a pile of exported files, independently of A.
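
As an illustration of stress testing the export side alone, the sketch below writes a batch of files and then validates every one of them without B being involved. The function export_batch is a hypothetical stand-in for A's export step, and the one-JSON-file-per-record layout is an assumption.

```python
import json
import tempfile
from pathlib import Path


def export_batch(records, out_dir):
    """Hypothetical stand-in for A's export step: one JSON file per record."""
    for i, record in enumerate(records):
        (out_dir / f"{i:06d}.json").write_text(json.dumps(record), encoding="utf-8")


def count_valid_exports(out_dir):
    """Parse every exported file and count those that carry an 'id' field."""
    valid = 0
    for path in out_dir.glob("*.json"):
        data = json.loads(path.read_text(encoding="utf-8"))
        if "id" in data:
            valid += 1
    return valid


out_dir = Path(tempfile.mkdtemp())
export_batch(({"id": n} for n in range(1000)), out_dir)  # scale up as needed
valid = count_valid_exports(out_dir)
```
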
Common scenario. B has changed and now the integration with A is no longer
working. Management are worried because there is no planned future development
of A. It is an unwitting victim of the way B was upgraded. With the data-centric
approach, it is often possible to leave the A part alone, perhaps transform the
data exported from A, and change only the B side, which needs to change anyway
because B has changed.
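
A transformation step of this kind can sit between the two hops without A knowing about it. In the sketch below, the upgrade to B is imagined to have renamed a field from "name" to "title"; that rename, the JSON-lines format and the function name are all assumptions for illustration.

```python
import json


def transform_for_new_b(line):
    """Rewrite one record exported by A into the shape the upgraded B expects."""
    record = json.loads(line)
    record["title"] = record.pop("name")  # hypothetical rename made by B's upgrade
    return json.dumps(record)


# Lines exactly as A has always exported them; A itself is left untouched.
old_export = ['{"id": 1, "name": "widget"}', '{"id": 2, "name": "gadget"}']
new_import = [transform_for_new_b(line) for line in old_export]
```
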
The moral of the story is this: just because you can envision a single-hop
strategy for integrating A and B does not mean that it is best implemented as
a single hop.