Ruby, DSLs and the free lunch that was not so free
About a decade ago, the world of IT underwent a period of language creation hype. At the center of that hype was XML. The power attributed to XML stemmed in large part from the fact that it was not a language in the ordinary sense. It was a language for creating languages - known as schemas or DTDs - for any domain you liked: recipes, stock quotes, washing machine documentation, and so on.
Fast forward to today and the world of IT is embarking, I suspect, on another period of language creation hype. This time, the hype epicenter is Ruby and the facilities it provides to create custom languages. The word "schema" now exits stage left and the acronym "DSL" (domain-specific language) enters stage right. As a language-oriented (as distinct from mathematically-oriented) IT person, I am as happy as a clam about this. I have got the popcorn, the gallon drum of soda and I'm settling into my easy-chair for what should be a fascinating show. The question on the game show is this: DSL or schema? Is the right answer "DSL", "schema", "neither", "both" or the classic "it depends"? Fingers on buzzers...
As an incurable predictor of IT futures, I also cannot help but predict the outcome. I think the answer is that all-time classic "it depends". However, before the world settles on that answer, it will need to pass through a trough of disillusionment about DSLs. I think that trough will come soon. I think it will come hard and steep; I'm not sure that Ruby - an excellent general purpose programming language - will come out well at the end of it all. But, hey, I have been wrong before! Herewith is my reasoning. I welcome all constructive feedback and discussion. I have split the discussion into a number of necessarily overlapping areas...
1 - the inevitable language learning curve and when you need to climb it
Most successful general purpose programming languages of recent times have had a flexible verb/noun system and a fixed sentential form system. By that I mean that most common programming languages allow the creation of new nouns (e.g. objects, variables, record structures) and new verbs (e.g. functions, procedures). Object oriented programming languages allow the creation of verbs in the context of nouns: methods. However, for the most part, the sentential forms of most heretofore successful general purpose programming languages have been fixed. That is, if you need to repeat something, do something based on a condition, or decompose your program into manageable pieces, the language provides only a fixed set of sentential forms. You cannot roll your own.
For example, a typical conditional sentential form might look like this: "if [cond] then [block]". You may have great flexibility in how you create your conditions and great flexibility in how you structure your blocks, but the conditional sentence form is fixed by the language. This is just a simple example but similar strictures apply to creating modules, organizing modules into programs, organizing programs into libraries and so on.
When a developer says they "know language X" for any given value of X, what they really mean, in my opinion, is that they are familiar with all the sentential forms and (generally) have a good grasp of the common idioms, nouns and verbs (i.e. libraries) used with the language X. In other words, to know a programming language is, first and foremost, to know how to wield the built-in sentential forms.
DSLs allow you to create new sentential forms. This raises a question: if I create a DSL in Ruby or in Lisp or whatever and I expose it to my users, am I expecting them to learn a new language? I believe the answer is yes. This creates a linguistic learning curve. I'm not saying that that is good or bad. I'm just saying it is there. I believe that the phrase "it is just a Ruby DSL underneath" understates the learning curve that custom languages create, and we need to be careful about downplaying it. To be blunt about it, language learning curves cost money.
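To make "new sentential forms" concrete, here is a minimal sketch of the kind of thing I have in mind. Everything in it (the Recipe class and the recipe, ingredient and step words) is made up for illustration; it is one common way to build such a thing in Ruby, not the only one.

```ruby
# Hypothetical recipe DSL, for illustration only.
class Recipe
  attr_reader :name, :ingredients, :steps

  def initialize(name)
    @name = name
    @ingredients = []
    @steps = []
  end

  # New sentential form: ingredient <item>, <quantity>
  def ingredient(item, quantity)
    @ingredients << [item, quantity]
  end

  # New sentential form: step <text>
  def step(text)
    @steps << text
  end
end

# New sentential form: recipe <name> do ... end
# The block is evaluated against a fresh Recipe, so the bare words
# "ingredient" and "step" inside it resolve to the methods above.
def recipe(name, &block)
  r = Recipe.new(name)
  r.instance_eval(&block)
  r
end

pancakes = recipe "Pancakes" do
  ingredient "flour", "200 g"
  ingredient "milk", "300 ml"
  step "Whisk everything together"
  step "Fry in a hot pan"
end

puts "#{pancakes.name}: #{pancakes.ingredients.length} ingredients, #{pancakes.steps.length} steps"
```

It is all legal Ruby, but a user who only ever sees the "recipe ... do" part still has to learn which words are allowed inside the block and what they mean. That is the learning curve.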
2 - the rise and rise of GUIs as Visual Domain Specific Languages
IDEs have long since ceased to be mere text editors. To pick just one example, what is Microsoft Visual C++ really? Is it a convenient editor for C++ or is it more than that? Let us look at what it does. It understands the sentential forms of the C++ standard but it has many many other useful concepts for C++ programmers – especially those that are targeting Microsoft Windows. It adds new concepts on top of the C++ language. Concepts like "projects", "event handlers", "doc/view architectures" and so on.
I think of these as sentential forms. It seems to me that the sentential forms that Visual C++ adds on top of the core C++ language are so significant that the phrases "I know how to program C++" and "I know how to program Visual C++" are two very different statements. I would go so far as to suggest that they are really, when all is said and done, two very different languages.
Language, it seems to me, does not have to fit into a text file to merit being called a language. Microsoft Visual C++ creates and gives life to a set of abstractions over C++ that make it, from my perspective, a domain specific language. If Microsoft chooses to give alphanumeric syntax to the DSL, document it and store it in a text file, it is clearly a domain specific language. However, if the model is in some proprietary binary format, is it now somehow not a domain specific language any more?
3 - the "good enough" power of the simplistic pre-processors
Language parsing, when all is said and done, works with syntax. The parser cares not how the sentential forms it knows about came to be as long as they are there. Over the years, language designers (and inventive programmers) have created all sorts of ways to let developers create their own sentential forms while tricking the parser into thinking that the finite set it knows about is all there is. An early example was the C pre-processor. A common example on Unix platforms is the m4 pre-processor.
Today, the programming language pre-processor concept is alive and well but doing most of its living inside modern IDEs. These IDEs allow developers to think in terms of higher level sentential form concepts like project, event handler, Model-View-Controller and then generate the lower level set of sentential forms automatically. Granted, the code generated may be ugly and difficult to maintain but it does seem to be wildly popular in practice. This raises the question: if a proper pure DSL is better, is it sufficiently better to cause a generation of developers (and developers working on developer tooling) to change tack? I have my doubts.
4 - the inevitability of the custom GUI
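The trick is easy to sketch. The toy below is not the C pre-processor or m4; it is a made-up, one-rule text rewriter in Ruby (the "repeat N times:" form and the preprocess function are my own inventions for illustration) that turns a sentential form the parser has never heard of into one it already knows.

```ruby
# Hypothetical one-rule pre-processor, for illustration only.
# Invented form:   repeat <n> times: <statement>
# Expands to:      <n>.times { <statement> }
REPEAT_FORM = /\A(\s*)repeat (\d+) times: (.+)\z/

def preprocess(source)
  source.lines.map do |line|
    m = REPEAT_FORM.match(line.chomp)
    if m
      # Keep the indentation, rewrite the rest into ordinary Ruby.
      "#{m[1]}#{m[2]}.times { #{m[3]} }\n"
    else
      # Lines that do not use the invented form pass through untouched.
      line
    end
  end.join
end

puts preprocess("repeat 3 times: puts 'hello'\nputs 'done'\n")
```

The Ruby parser only ever sees the expanded text. As far as it is concerned, the finite set of forms it knows about is still all there is.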
The reason I doubt that proper in-a-text-file DSLs will supplant the arguably inferior pre-processors is that developers and end-users alike are becoming more and more GUI-driven in their interactions with computers. Let us start with the ultimate end-user of some business application. What do I know for sure? I know that my application will need a custom GUI. I know that because telling business users that they need to crack open Notepad and type sentential forms into a text file tends to go down like a lead balloon. Now, if I have to write a custom GUI to help the end-user interact with my abstractions, what difference does it make if I spit out pure Ruby or Ruby corresponding to the DSL I created? My big time sink in development will be the GUI, not the back-end stuff. So how much does the custom DSL really benefit me? Is the delta sufficient to merit it?
Switching now to the developer side. What do I know for sure? It is becoming increasingly clear that developers sit in IDEs all day long. They expect their IDEs to know about sentential forms so that they don't have to worry about grammar. IDEs that understand Java syntax, IDEs that know how to construct Ant build scripts, EJB deployment descriptors, resource files... Again, my big time sink in development will be the GUI, not the back-end stuff. So how much does the custom DSL really benefit me? For DSLs that you yourself will use, probably a lot. For a mass market of developers? I'm not so sure. They will demand GUIs and that is where most of the semantics of your DSL's sentential forms will ultimately have to live.
5 - the brain hurt of the meta-programmer
When XML came along, a lot of folks thought it was a new thing in the world. A small coterie of folks, generally with notable flecks of gray in their hair, knew about its familial relationship to SGML and GML before that. To this day, a lot of folks find XML a bridge too far. The level of abstraction it uses – one step above a normal language – moves it beyond the pale for many people.
Now, as Ruby DSLs come along, a lot of folks appear to think they are a new thing in the world. A small coterie of folks, generally with very notable flecks of gray in their hair, know about their familial relationship with Lisp and in particular the Lisp macro system, which can be traced back at least to the Sixties. Lisp evangelists (a wonderfully persistent group of people) often argue that Ruby's DSLs are limited in comparison to what 1960s Lisp could do but I digress... To this day, a lot of folks find the concepts in Lisp macros to be a bridge too far. The level of abstraction they use – one step above a normal language – moves them beyond the pale for many people.
As soon as language goes "meta" it crosses a Rubicon. History has shown that introducing meta-linguistics into computing – however beneficial – is no easy matter. Furthermore, it seems to me that we are already absolutely swimming in DSLs. Every GUI tool that auto-generates "low level" code from a UML model, or allows you to drag-n-drop a relational data mapping or click'n'drag an event handler onto a form... DSLs everywhere.
So what exactly would Ruby DSLs do differently? I think it boils down to this. Ruby – just like Lisp macros before it – provides text-editable access to the high level sentential forms, i.e. you don't need to run a custom GUI to create your domain specific sentential forms, you can just type them into vi or emacs or notepad or whatever.
DSLs – in the pure sense of a meta-language you can type into a plain old text editor – have been around for a long long time and have had quite limited success if you define success in terms of mass market uptake. On the other hand, GUIs that allow you to work in high level sentential forms and generate the low level sentential forms seem to be going from strength to strength. Is that trend going to change? I don't think so. If anything, I think it will accelerate.
As I said earlier, I am a language-oriented IT person and I value the bits-on-the-wire syntax greatly. Statistically speaking, I am very likely to use a tool like Lisp macros or a Ruby DSL because it corresponds to how I think. However, I don't consider myself in any way representative of the mass of programmers on the planet. I'm too fond of the very concept of language for that.
There is no free lunch. If you create a text-editor-based DSL and use Ruby to express it, it's important not to pretend that you are avoiding a learning curve by doing so. If the language is aimed at a set of users who are less technical than you are, it might be the case that a GUI that embodies the concepts of your domain specific language and spits out clean, straight-ahead Ruby is a better way to go. Especially if your user community is going to insist that you provide an easy to use GUI editor anyhow.
Finally, let me circle back to the XML hysteria of the last decade. Look at what has happened in the intervening decade. XML made it possible to create any number of domain specific languages. I remember seeing lists a mile long covering everything from HR records to stock tickers. Where are they now? Only a handful have crossed the chasm. The same will, I suspect, prove true for Ruby-based DSLs. When this happens – and I believe it is inevitable – I fear that Ruby will bear the brunt of the backlash and its many excellent attributes as a general purpose programming language will not get the attention they deserve.
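As a rough sketch of that alternative: suppose the GUI hands its back end a plain data structure describing what the user filled in. The form layout and the generate_plain_ruby function below are hypothetical, made up purely to show the shape of the idea. The generator emits ordinary Ruby with no custom sentential forms in it at all.

```ruby
# Hypothetical back end for a GUI editor, for illustration only.
# Input: the data a GUI form might collect from a non-technical user.
# Output: plain Ruby source text, with no DSL involved.
def generate_plain_ruby(form)
  lines = ["recipe = {"]
  lines << "  name: #{form[:name].inspect},"
  lines << "  ingredients: ["
  form[:ingredients].each do |item, quantity|
    lines << "    { item: #{item.inspect}, quantity: #{quantity.inspect} },"
  end
  lines << "  ]"
  lines << "}"
  lines.join("\n") + "\n"
end

form = {
  name: "Pancakes",
  ingredients: [["flour", "200 g"], ["milk", "300 ml"]]
}

puts generate_plain_ruby(form)
```

The end-user learns the GUI and the maintainer reads plain Ruby. Nobody has to learn a third language sitting in between.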