In last month's column, we explained how scripting languages can manage subprocesses written in C. We explained that the examples we offered were indifferent to language, and we demonstrated how to wrap a program written in C++, Fortran, or any other language with a fresh GUI face. In fact, with typical scripted pipes, you don't even have to know what language your legacy application was written in. All that matters is its interface behavior as a command-line application, which communicates through the standard input-output channels.
That's an important realization. Students of reuse theory recognize that the best, least expensive, and safest components are those defined through their interfaces, not by their internal mechanisms. For an existing program, the difference between having to make zero changes to prepare it for wrapping and having to make even one can be enormous.
Reader Katja Cremer points out one common pitfall of pipes. Several scripting languages with UNIX heritages emphasize the parallelism of pipes and conventional files by using open to launch the former. They mark the difference with a subtle syntax that's easy for a beginner to overlook. In Perl, for example:
    open(CHANNEL, "program");
reads the contents of the file program, while
    open(CHANNEL, "program |");
launches program as a subprocess and reads its output.
Sometimes it's necessary to break open black boxes such as program. Our last column showed that pipes are a response to the complaint, "It's working, but it waits and prints all the results at once." Even with proper pipes set up, developers sometimes observe that a GUI shows intermediate results in fits and starts: nothing at first, then 20 lines, then another pause, and so on.
That symptom usually hints at output buffering. It is easiest to see with a small example written in C:
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        int i;

        /* Print one integer per second. */
        for (i = 0; i < 100; i++) {
            printf("%i\n", i);
            sleep(1);
        }
        return 0;
    }
If you generate an executable from that source and run it from the command line, you'll see it evenly count to 99, with each integer printed on a separate line after only a second's delay.
Launch that as a subprocess from a scripting language, though, and you might get bursts of results after delays of dozens of seconds. The C runtime library notices that its standard output is a pipe rather than a terminal, concludes that a machine process is observing the count rather than your eyes, and so tries to make the operation more economical by buffering the data it writes to the pipe.
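One way to confirm that diagnosis is to ask whether standard output is a terminal. C's stdio layer typically line-buffers a terminal but block-buffers a pipe or file. The following is a minimal sketch for a POSIX system; the helper names likely_buffering and report_stdout_buffering are our own inventions for illustration, not part of any library.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: names the buffering that C stdio typically
   chooses for a stream attached to file descriptor fd. */
const char *likely_buffering(int fd)
{
    /* isatty() returns 1 for a terminal and 0 for a pipe or file. */
    return isatty(fd) ? "line-buffered (terminal)"
                      : "block-buffered (pipe or file)";
}

/* Report on standard output. The message goes to stderr, which is
   not block-buffered, so it shows up immediately. */
void report_stdout_buffering(void)
{
    fprintf(stderr, "stdout is probably %s\n",
            likely_buffering(STDOUT_FILENO));
}
```

Call report_stdout_buffering() at the top of main() in the counting program, then run the executable once directly and once with its output piped through another command such as cat; the two runs should report different answers.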
What can you do about that? It's almost beyond the ability of the scripting process, because the buffering happens inside the subprocess. In many cases, managing the pipe with a scripting extension from the Expect family helps: Expect runs the subprocess on a pseudo-terminal, which convinces it that it shouldn't buffer data. Expect intimidates some newcomers to GUI scripting, though. A more common fix for that situation is to change the source code of the piped process slightly:
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        int i;

        for (i = 0; i < 100; i++) {
            printf("%i\n", i);
            /* This is the only change. */
            fflush(stdout);
            sleep(1);
        }
        return 0;
    }
fflush() explicitly tells the C language runtime library, "send those results now; don't wait." Most languages have their own spellings of fflush(), so this technique is available for almost any program.
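If sprinkling fflush() calls through a large program is impractical, standard C offers a second route: ask once, before any output, for line buffering with setvbuf(). The sketch below assumes that approach suits your program; use_line_buffering and count_to are hypothetical names chosen for illustration.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: ask stdio to flush stdout at every newline,
   even when stdout is a pipe. Call it before the first output.
   Returns 0 on success and nonzero on failure, as setvbuf() does. */
int use_line_buffering(void)
{
    return setvbuf(stdout, NULL, _IOLBF, BUFSIZ);
}

/* The counting loop from the text, with no explicit fflush(). */
void count_to(int limit, unsigned delay_seconds)
{
    int i;

    for (i = 0; i < limit; i++) {
        printf("%i\n", i);      /* the newline triggers the flush */
        sleep(delay_seconds);
    }
}
```

A main() that calls use_line_buffering() and then count_to(100, 1) should behave like the fflush() version. The trade-off is that every line costs a write to the pipe, which matters only for programs that print a great deal.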
Decorations for the pipeline architecture
Once you have results flowing smoothly between your piped legacy subprocess and scripted GUI wrapper, you can quickly polish the result's appearance. A progressbar is a common ornament for piped subprocesses. Although progressbars don't yet seem to be common core widgets, there are many available as open source add-ons for various GUI toolkits and language bindings. Tcl engineer and University of Manchester researcher Donal Fellows gives an example of how easy it is to create and use a progressbar in a page available through the resources below.
In our last column, we presented event-based pipes as a way to keep a GUI responsive during a long-lasting legacy subprocess. But what if you have a problem going the other way? That is, what if your pipe is behaving properly, but the scripting process is performing poorly and starving the C-coded program of processor time?
That might happen if the wrapper script needs to read an external file and compute a result. If the file is big enough, it might take several seconds or even minutes to read. During that interval, a simple-minded implementation typically locks out any other operations.
That isn't strictly a problem of GUI wrappers of legacy applications. It can arise in a single-process application coded entirely in a pure scripting language, that is, a scripting language with no C in sight. However, as the problem often turns up when developers try to get programs to cooperate, we mention the solutions here.
Concurrent alternatives
Scripting languages that support threads, coroutines, or comparable concurrency constructs have an obvious solution: put the long-running function inside its own thread, coroutine, or similar construct.
More consistent with the event-based structure of most GUI toolkits, though, is an event-based recursion. Transformation between iteration and recursion is a basic engineering competence. How does it apply in that case? Think of the operation of reading a large file as an iteration over the bytes or lines of the file. The way to keep an application responsive is to rewrite the iteration as a recursion through the event loop. In pseudocode, that looks something like the change from:
    subroutine big_job:
        foreach line ...
to:
    subroutine big_job:
        schedule process_one_line;

    subroutine process_one_line:
        if (eof(...))
            return;
        read_one_line(...)
        schedule process_one_line;
That multiplication of line count impairs comprehensibility, of course. Correctly coded threading is often even trickier than event-based recursion, though, and the latter frequently performs better than a threaded solution. The idea of event-based recursion is important enough that we offer a couple of executable instances in Tcl.
The first just runs a collection of counters. If you naively program several counters, they'll run in sequence and print out:
first counter: 1
first counter: 2
...
first counter: 10
second counter: 1
...
second counter: 10
third counter: 1
...
To get them all to run concurrently, recurse through the event loop:
    # run creates a proc named $name that counts a global array
    # element indexed by the same name from 1 to $limit, then
    # launches the procedure. The procedure $name also reschedules
    # itself after $delay milliseconds. $name is taken in turn
    # from each element of $args.
    proc run {delay limit args} {
        foreach name $args {
            set ::count($name) 0
            proc $name {} "
                global count
                incr count($name)
                if {\$count($name) > $limit} {
                    return
                }
                puts \"count($name) is \$count($name).\"
                after $delay $name
            "
            $name
        }
    }
    # Start two counters, named "a" and "b", each running
    # to 10, with a one-millisecond delay.
    run 1 10 a b
    # Start a single counter that runs as fast as possible.
    run 0 10 c
    # vwait enters the event loop; catch absorbs the error Tcl
    # raises once no events remain to wait for.
    catch {vwait done}
When you run that, you'll see each of the three counters taking turns in reaching 10.
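The same pattern carries over to the file-reading pseudocode, and it is not tied to Tcl. Here is a rough sketch in C, built on a toy event queue. Everything in it (schedule, run_event_loop, struct line_job, process_one_line) is hypothetical scaffolding of our own, not the API of any real GUI toolkit, and a real event loop would also service display and input events between callbacks.

```c
#include <stdio.h>

/* A toy event loop: a fixed-size FIFO of pending callbacks. */
#define MAX_EVENTS 16

typedef void (*callback_fn)(void *data);

struct event {
    callback_fn fn;
    void *data;
};

static struct event queue[MAX_EVENTS];
static int head = 0;
static int count = 0;

/* Append a callback to the queue; returns 0 on success, -1 if full. */
int schedule(callback_fn fn, void *data)
{
    if (count == MAX_EVENTS)
        return -1;
    queue[(head + count) % MAX_EVENTS] = (struct event){ fn, data };
    count++;
    return 0;
}

/* Run queued callbacks in order until none remain. */
void run_event_loop(void)
{
    while (count > 0) {
        struct event ev = queue[head];
        head = (head + 1) % MAX_EVENTS;
        count--;
        ev.fn(ev.data);
    }
}

/* State for one line-at-a-time job. */
struct line_job {
    FILE *in;
    const char *label;
    int lines_seen;
};

/* Handle exactly one line, then reschedule; stop at end of file.
   A line longer than the buffer is counted as several lines. */
void process_one_line(void *data)
{
    struct line_job *job = data;
    char line[256];

    if (fgets(line, sizeof line, job->in) == NULL)
        return;                 /* end of file: do not reschedule */
    job->lines_seen++;
    printf("%s: %s", job->label, line);
    schedule(process_one_line, job);
}
```

Schedule process_one_line for two jobs on different files, then call run_event_loop(): the lines from the two files interleave, much as the Tcl counters take turns, because each callback does one small step and goes to the back of the queue.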
Fellows shows off the performance of that technique in his realization of a Frogger game. Another longtime Tcl contributor, Jeffrey Hobbs, does much the same with Tetris, which cleverly squeezes dense geometric calculations between display events.
Next time
In the third and final installment of this series, we'll survey interprocess communication methods other than pipes as well as in-process communication between C and scripting languages.