Are scripting languages the wave of the future?

ITworld.com –

When Robert Martin talks, programmers listen.

XP! Patterns! UML! Java! C++! OOD! SE! For most of the last 30 years, Martin has worked full time at the center of modern programming practice. His specialties and concerns have been exactly those of the silent majority of professional developers: How do I use STL? Should our team practice extreme programming? Which analysis methodology is effective? Throughout his career, Bob has been a working programmer himself, informing his "outreach" as an author, editor, publisher, manager, entrepreneur, pundit, and trainer.

Last week, Martin talked to ITworld.com's Cameron Laird and readers in the ITworld.com Interviews forum about the programming languages of the coming decade and other topics. Here's an excerpt:

ITworld.com: Does language make a difference? Your emphasis on code hygiene and expressiveness is welcome. Why, then, do we spend so much time working in C and C++? If an organization comes to you with a coherent set of requirements, and says it's willing to trust you about what language will help it best meet its goals, what advice are you likely to give?

Robert Martin: Language certainly makes a difference; but any language can be written well.

I don't mind C or C++. There are certainly cases where those are the best languages for the job. DSP code, if not written in assembler, is probably best written in C. Hard embedded realtime apps are well done in C++. And many Web apps or MIS/IT apps are nicely done in Java.

However, I think there is a trend in language that will become more and more evident as the decade progresses. I think we are seeing an end to the emphasis on statically typed (type-safe) languages like C++, Java, Eiffel, Pascal, and Ada. These languages force you to declare the types of variables before you can use them.

As this decade progresses I expect to see an ever increasing use of dynamically typed languages, such as Python, Ruby, and even Smalltalk. These languages are often referred to as "scripting languages." I think this is a gross injustice. It is these languages, and languages of their kind, that will be mainstream industrial languages in the coming years.

Why do I think this? Because these languages are much easier to refactor. What's more, they have virtually zero compile time.

As an industry we became enamored of type-safety in the early '80s. Many of us had been badly burned by C or other type-unsafe languages. When type-safe languages like C++, Pascal, and Ada came to the fore, we found that whole classes of errors were eliminated by the compiler. This safety came at a price. Every variable had to be declared before it was used. Every usage had to be consistent with its declaration. In essence, a kind of "dual-entry bookkeeping" was established for languages. If you wanted to use a variable (first entry) you had to declare it (second entry). This double checking gave the compiler vast power to detect inconsistency and error on the part of the programmer, but at the cost of the double entries, and of making sure that the compiler had access to the declarations.

With the advent of agile processes like extreme programming (XP), we have come to find that unit testing is far more important than we had at first expected. In XP, we write unit tests for absolutely everything. Indeed, we write them before we write the production code that passes them. This, too, is a kind of dual-entry bookkeeping. But instead of the two entries being a declaration and a usage, the two entries are a test and the code that makes it pass.

Again, this double-checking eliminates lots of errors, including the errors that a type-safe compiler finds. Thus, if we write unit tests in the XP way, we don't need type safety. If we don't need type safety, then its costs become very severe. It is much easier to change a program written in a dynamically typed language than it is to change a program written in a type-safe language. That cost of change becomes a great liability if type safety isn't needed.

But there is another cost: The cost of compilation and deployment. If you want to compile a program in C++, you must give the compiler access to all the declarations it needs. These declarations are typically held in header files. Each header file containing a declaration used by the program must be compiled by the compiler. If you have a program with N modules, then to compile just one of them you may have to read in all N header files. With a little thought you'll realize that this means that compile time goes up with the square of the number of modules.

As code size increases, compile time rises in a distinctly nonlinear manner. For C++, the knee of the curve comes at about half a million lines. Prior to that point, compiles are pretty short. After that point compile times start to stretch and can get absurdly long. I know of one company that compiles 14.5 million lines of code using 50 SPARC-20s overnight.

This massive compile time can be mitigated by making good use of dependency management. By carefully controlling the coupling of your modules you can reduce compile time to be proportional to NlogN. But even this can be very long for large projects. Java found a way out -- sort of. In Java, declarations are compiled and stored in classfiles. To compile a module requires only that the declarations be read from the classfiles of the used classes. Transitive dependencies are not followed. Thus, Java compile times can be drastically smaller than C++ compile times.

Still, I often run into clients who have Java compile times that run close to half an hour or more. By using good dependency management, and very fast compilers, I think this can be drastically reduced. But the problem still exists.

Dynamically typed languages, however, have virtually zero compile time. There are no declarations to hunt down. Compilation can occur while the code is being edited. As projects get larger and larger, this ability to be instantaneously compiled will become ever more important. Finally, there is the issue of deployment. In C++ or Java, if I change a base class, I must recompile and redeploy all classfiles for classes that are derived from that base. (Java folks, you can try to get away without doing this, but you'll be sorry.) Thus, even a small change to a single class can cause large redeployment issues. In dynamically typed languages, redeployment problems are not eliminated, but they are vastly reduced. So, Bob Martin's prediction for this decade: Keep an eye on languages like Python, Ruby, and Smalltalk. They are likely to become extremely important.

***

Want to read the whole interview? Click here to see the conversation with Martin and other IT pros.

From CIO: 8 Free Online Courses to Grow Your Tech Skills
Join the discussion
Be the first to comment on this article. Our Commenting Policies