Review: 4 Java clouds face off
CloudBees, Google App Engine, Red Hat OpenShift, and VMware Cloud Foundry reveal the pleasures and perils of coding on a public cloud platform
At the movies, almost every thriller seems to include a moment when a character says, "That was easy ... a bit too easy." Then everything falls apart.
When I set out to test some of the top Java clouds on the marketplace, I found myself repeating this script. The bottom never dropped out on me. Nothing ever stopped working. Nothing fell apart. But time and time again, I kept fretting about how easy it was -- too easy. When the whole thing exploded, where would my Web app be?
Enterprise developers need to be a bit more paranoid about these possibilities than others. The average computer user is thrilled when a new package in the cloud makes life easier. They can embrace cloud-based email, and if the email gets lost, they can just shrug because email always gets lost and can sometimes be a blessing.
Enterprise developers can't be so sanguine. All of the fancy tools take as well as give. Every slick configuration option that's activated with one push of a button locks us in forever with the same push of the button -- or so it feels to those who worry. If we adapt to the cloud too readily and let it do too much for us, it may not be possible to go anywhere else.
The danger of lock-in seems to lurk around every corner, and that's not necessarily the worst part. What if we're happy with everything about our cloud except we need one missing feature that the cloud's masters either can't or don't want to deliver? The cloud can be a one-size-fits-all world.
If it's any consolation, cloud developers also appear flummoxed by the trade-off. They know customers want one-touch solutions and plenty of automation to make life easier. But it means delivering interfaces that won't be as standard or as flexible as customers would like. The cloud builders must figure out whether the marketplace wants everything done by the cloud or whether customers want to do enough for themselves to avoid lock-in.
To see where things stand, I set up some accounts on four leading Java clouds, built a few toy Java applications, and watched the gauges turn. Even among the four I tried -- Google App Engine, Cloud Foundry, CloudBees, and Red Hat OpenShift -- there is a wide variety of approaches. Some of the clouds rely upon standard tools that take standard WAR files and deliver their information to the world. Others have so many proprietary twists that you might as well tattoo the code on your arm -- it's going to be with you for the rest of your life.
The cloud experiment: Java version
The Java cloud offerings are steadily growing better and more sophisticated, but they're far from a finished set of products. Several of the tools here are perfectly open about their half-baked state. The sign-up forms often insist that we understand the cloud is just a beta application, for development only and not for production work. In fact, it might be more accurate to call the clouds postalpha or prebeta.
Even the more established clouds are constantly shifting because this is all something of an experiment. No one really knows how the loads and the costs will add up, so the prices seem to be changing, sometimes in dramatic ways. The cloud sellers don't really know how their costs will shake out, so they're guessing when they say it costs X dollars for Y million transactions. As the old joke goes, they're losing money on every click but hoping to make it up in volume.
Pricing may be the most difficult and challenging issue for both buyers and sellers for years to come. People are already cheesed off at the way Google stopped subsidizing its App Engine. Some users are complaining that their costs doubled or tripled with, irony of ironies, one click of a button. But who can blame Google? While the company has excellent financial engineers, I'm not sure they can know the fair price for a round trip to the Bigtable data store. It probably fluctuates with the rainfall in the Northwest, where hydropower is the cheapest power source for some of Google's newest data centers.
Perhaps I'm overthinking it. Things can go wrong anywhere. Prices will fluctuate. The cloud can be more flexible and automated, thus saving us money on people who will minister to the racks and make sure the data is flowing smoothly. If that Web 3.0 application turns out to be a big hit but the cloud is too expensive, it will still bring in enough revenue to pay for all of the reprogramming required to move the app to a set of in-house servers. If it's one of those Web things where the revenues never scale with the costs, well, the price of experimentation couldn't be lower. That's what clouds are ultimately about: They simplify experimentation and change.
Just choosing a cloud can involve plenty of experimentation. The simplest option is to turn up a raw machine from the Amazon or Rackspace cloud, but these don't offer much of what the cloud marketeers promise. Sure, I pushed the button and started up a new machine in just a few seconds, but then I spent more than a few hours logged in as root installing the JVM and the rest of the stack. Once I finally got a machine configuration I liked, I was so proud of it that I wanted to put a picture of it on the fridge. I made sure to store it away so I could start it up as many times as I like.
If you've got the time and the inclination to build up a machine image with the software you like, raw cloud machines can offer you most of what you want from the cloud with few problems of lock-in. Both Amazon and Rackspace make it easy to store an image and hit the replication button again and again. You choose the software and you decide how many machines you want. In theory, there are more machines there whenever you need them. I experimented with spinning up new machines for the daily housekeeping work, and it was nice to spend only 1.5 cents per hour for them. After the work is done, they're gone.
Of course, you've got to do all of the thinking yourself. Do you want 100 machines or 102? Yes, you control your costs, but you can't react quickly to changing demand unless you build more intelligence on top of the raw machines.
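To make the "100 machines or 102?" question concrete, here is a minimal sketch of the kind of arithmetic you end up writing yourself on raw machines. The class name, the workload, and the jobs-per-machine figure are all hypothetical; only the 1.5-cents-per-hour rate comes from the experiment described above, and real bills will include storage and bandwidth that this ignores.

```java
// Hypothetical sizing helper; not part of any Amazon or Rackspace API.
final class FleetSizer {
    // 1.5 cents per machine-hour, stored as tenths of a cent to avoid floating point.
    static final long TENTHS_OF_CENT_PER_MACHINE_HOUR = 15;

    // Machines needed to clear `jobs` within one hour, rounded up.
    static long machinesNeeded(long jobs, long jobsPerMachineHour) {
        if (jobs < 0 || jobsPerMachineHour <= 0) {
            throw new IllegalArgumentException("jobs must be >= 0 and throughput must be > 0");
        }
        return (jobs + jobsPerMachineHour - 1) / jobsPerMachineHour;
    }

    // Bill for the whole fleet, in tenths of a cent.
    static long costTenthsOfCent(long machines, long hours) {
        return machines * hours * TENTHS_OF_CENT_PER_MACHINE_HOUR;
    }

    public static void main(String[] args) {
        // Assumed workload: 10,150 housekeeping jobs, 100 jobs per machine-hour.
        long machines = machinesNeeded(10_150, 100);
        long cost = costTenthsOfCent(machines, 1);
        System.out.println(machines + " machines for one hour cost "
                + (cost / 10) + "." + (cost % 10) + " cents");
    }
}
```

With those made-up inputs the answer happens to be 102 machines, not 100. The point is that nobody computes this for you on a raw machine; you have to measure the throughput and wire the result into whatever starts and stops the instances.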
Java clouds: Google App Engine
There's something warm and comfortable about using Google's App Engine. What began as a fairly radical tool has slowly matured into an asset that's easier to understand and use, if only because the world has adopted many of the ideas.
The basic architectural themes have remained the same. You upload a small kernel of code with your business logic, and App Engine deploys enough instances to satisfy the demand. If you want to store data or synchronize your work between sessions, you have to use Google's proprietary data stores and caching, but everything else feels fairly standard. The first versions of App Engine used Python, but now you can push up Java WAR files filled with JSPs, servlets, and server-side logic. The administration is handled through a separate Web interface. Command-line issues are pretty much relegated to the past.
I think the biggest challenges for programmers will be adjusting to Google's nonrelational data stores. When App Engine first appeared, there weren't so many NoSQL projects around, and the idea of storing collections of name-value pairs was more of a novelty. Anyone approaching App Engine with a bit of experience with NoSQL won't be shocked at all by the simple solution that App Engine forces upon everyone who wants to keep data around. But anyone who still thinks of JOINs and normalized data will need to break from the table-oriented, relational past and adjust to a new way of doing things.
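For anyone still thinking in JOINs, a rough sketch of the shift may help. Plain Java maps stand in for schema-less entities here; this is not the App Engine API, and every name in it (the class, the methods, the property names) is invented for illustration. Instead of joining an orders table to a customers table, each order carries its own copy of the customer fields.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustration only: maps stand in for entities made of name-value pairs.
final class DenormalizedOrders {
    // Builds one entity. The customer fields are copied in, not referenced by key.
    static Map<String, Object> orderEntity(String orderId, String customerName,
                                           String customerCity, long totalCents) {
        Map<String, Object> entity = new LinkedHashMap<>();
        entity.put("orderId", orderId);
        entity.put("customerName", customerName);
        entity.put("customerCity", customerCity);
        entity.put("totalCents", totalCents);
        return entity;
    }

    // A single-property filter: the kind of query that needs no JOIN.
    static List<Map<String, Object>> findByProperty(List<Map<String, Object>> entities,
                                                    String name, Object value) {
        List<Map<String, Object>> matches = new ArrayList<>();
        for (Map<String, Object> entity : entities) {
            if (value.equals(entity.get(name))) {
                matches.add(entity);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> orders = new ArrayList<>();
        orders.add(orderEntity("o-1", "Ann", "Oslo", 1999));
        orders.add(orderEntity("o-2", "Bob", "Lima", 4500));
        orders.add(orderEntity("o-3", "Ann", "Oslo", 250));
        System.out.println(findByProperty(orders, "customerName", "Ann").size()
                + " orders for Ann");
    }
}
```

The trade is the usual one: reads get simpler because everything needed sits in one entity, while an update to a customer's city now has to touch every order that copied it.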
App Engine offers two classes of the data store, so the architect must decide whether to pay for additional power. The basic model makes one data center the master and all the others slaves. If the master data center fails or starts a scheduled maintenance, your data can't be stored. You must be ready to live with a "planned read-only period." Many modern Web applications (think Facebook) can easily survive these kinds of glitches, but applications requiring banklike levels of availability and consistency will need to look elsewhere.
The low-rent, master-slave configuration is supposed to be about one-third the cost of the "high replication" version, with each low-rent write costing about five-eighths of the high-rent equivalent. The low-rent version may be twice as slow on writes as the high replication cloud; then again, it might not. You have to be careful with some of these numbers because the mechanism includes lots of hidden overhead. Each name-value pair, for instance, includes all the names as overhead because there's no schema. That's the price you pay for the flexibility to store any old thing in any old row.
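The hidden overhead of repeating property names is easy to underestimate, so here is a back-of-the-envelope sketch. It assumes a deliberately naive model in which an entity costs the UTF-8 bytes of its names plus the UTF-8 bytes of its values; Google's actual storage accounting is not described here and will differ. The class and property names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Naive estimate only: counts UTF-8 bytes of names and values, nothing else.
final class NameOverhead {
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Bytes spent on property names, which a fixed schema would store once.
    static int nameBytes(Map<String, String> entity) {
        int total = 0;
        for (String name : entity.keySet()) {
            total += utf8Length(name);
        }
        return total;
    }

    // Bytes spent on the actual data.
    static int valueBytes(Map<String, String> entity) {
        int total = 0;
        for (String value : entity.values()) {
            total += utf8Length(value);
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, String> entity = new LinkedHashMap<>();
        entity.put("customerName", "Ann");
        entity.put("customerCity", "Oslo");
        System.out.println("names: " + nameBytes(entity) + " bytes, values: "
                + valueBytes(entity) + " bytes");
    }
}
```

In this toy entity the names take 24 bytes and the values only 7, and the names are paid for again in every row. Short property names are one blunt way to trim the bill, at some cost to readability.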
For some reason, I found myself fretting about the lack of access to the file system. I realize that the idea is to store the bits as blobs so that App Engine can optimize everything, but I still like the ability to write to the file system for some projects.
One of the biggest changes in App Engine is the appearance of economic reality. While Google heavily subsidized the first crop of applications by offering so much for free, it has slowly been squeezing the free apps and pushing people to pay for what they get. Free apps can now access the data store 50,000 times a day -- which adds up quickly if you keep more than a few items in the data store.
The new price list includes a long set of quotas and rates. All seem tiny and reasonable, but I have no easy way to compare them. Is 1 cent per 10,000 reads from the data store a good price? Should I hold out for 1 cent per 12,000 reads? Somehow it seems easier to go to the boss and ask for a box with two quad-core processors, a ton of RAM, and some fast disks, then cross our fingers. It's not very scientific, but it's so much easier than thinking through all of the details.
One interesting detail is that the App Engine sort of strips away your ability to tune your application's performance to handle peaks: 100 emails will always cost one penny. You can't save money with a cheaper processor and a queue that delays the email.
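One way to get a feel for these numbers is to turn them into a tiny calculator. The sketch below uses only the figures quoted above (50,000 free data store accesses a day, 1 cent per 10,000 reads, 100 emails per penny) and treats them as flat rates, which is an assumption; the real price list has many more line items, and the class and method names are made up. Amounts are held in microdollars (millionths of a dollar) so that everything stays in whole numbers.

```java
// Hypothetical calculator built from the rates quoted in this review.
final class AppEngineBill {
    static final long FREE_DATASTORE_OPS_PER_DAY = 50_000;
    static final long MICRODOLLARS_PER_READ = 1;    // 1 cent per 10,000 reads
    static final long MICRODOLLARS_PER_EMAIL = 100; // 100 emails per cent

    // Page views a free app can serve per day if each view touches the data store this often.
    static long freePageViewsPerDay(long opsPerPageView) {
        if (opsPerPageView <= 0) {
            throw new IllegalArgumentException("opsPerPageView must be > 0");
        }
        return FREE_DATASTORE_OPS_PER_DAY / opsPerPageView;
    }

    static long readCostMicrodollars(long reads) {
        return reads * MICRODOLLARS_PER_READ;
    }

    static long emailCostMicrodollars(long emails) {
        return emails * MICRODOLLARS_PER_EMAIL;
    }

    public static void main(String[] args) {
        // Assumed traffic: 10 data store accesses per page, 3 million reads and 20,000 emails a month.
        System.out.println("free page views per day: " + freePageViewsPerDay(10));
        long monthly = readCostMicrodollars(3_000_000) + emailCostMicrodollars(20_000);
        System.out.println("monthly reads and email: " + monthly + " microdollars");
    }
}
```

At an assumed 10 data store accesses per page, the free quota covers 5,000 page views a day. Note that the email line has no knob to turn: the cost scales with the count and nothing else, which is exactly the loss of tuning described above.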
Java clouds: Cloud Foundry
Spring has always been one of the cleanest frameworks in the Java enterprise world. It makes sense that someone would use it as the foundation for a Java cloud. That someone in this case is SpringSource, one of the Cloud Foundry project leaders and a division of VMware. It should come as no surprise that this cloud is built on top of VMware virtual machines.
(The VMware-hosted Cloud Foundry is a bit of a departure from past offers. The first version deployed Spring apps to the Amazon EC2 cloud. It's still available from classic.cloudfoundry.com if that's what you want.)
The easiest way to use Cloud Foundry is to create a Spring project from a template with SpringSource's customized version of Eclipse called the SpringSource Tool Suite. I tried installing some of SpringSource's tools into my own version of Eclipse, but the right collection of libraries was not easy to find. The SpringSource Tool Suite is simpler.
The Cloud Foundry is not limited to Spring. There's support for Rails, Sinatra, Scala, Grails, and Node.js. It's all running on the JVM even if you don't write any Java. Cloud Foundry just announced PHP and Python/Django support as well. The VM image that you get also comes ready with MySQL, MongoDB, and Redis databases waiting to suck up your information.
VMware has kept mum about the pricing. The product is still in beta, and VMware has been kind enough not to charge for it. Will the rates be too high? How can you plan? You can't, but the Cloud Foundry virtual machine is fairly open. You can download the Micro Cloud Foundry -- a portable virtual machine image of the Cloud Foundry environment -- and run it on your own system with VMware Player. The core code is open sourced at cloudfoundry.org and largely covered by the Apache license.
Java clouds: CloudBees
One joke floating around the Net is a list of lies that companies tell potential hires to seem more with it. Running a continuous integration server is near the top. Everyone likes the idea of constantly checking the code to make sure it works, but no one wants to do all of the work required to both maintain the code and keep the continuous integration server up and running.
CloudBees would like to change this. Not only does the company offer a cloud for deploying your applications, but it provides a cloud to build them too. Your account is more than just a way to serve your data to the masses. There's a code repository (Git or Subversion) and a Jenkins server watching every piece of code you check in.
After a bit of fiddling, I was able to check in code and wait for Jenkins to build it, test it, compile the documentation, and deploy it to the server. If I needed more, there were plenty of other services, plug-ins, and switches to flip.
The theory is that CloudBees has plenty of high-end boxes working in parallel to build your huge pile of code. Instead of waiting for your desktop machine to page in the right libraries, you can let Jenkins parcel out your build to the racks at CloudBees.
I didn't see this advantage, but my Web application had just one class and one JSP. The Web interface to Jenkins comes with a neat progress bar and a flashing blue ball that made it clear that my local machine could build these few files faster than the leviathan in CloudBees' data center.
Even if small projects won't tap the power of the cloud, they'll still be able to use the discipline of Jenkins. It took me a few minutes of poking and prodding to get the code to flow all the way through the build pipeline, but after that I was golden. It's nice to let someone else worry about keeping Jenkins running.
The CloudBees cloud is essentially Tomcat and MySQL, but other databases are available, including some from third parties with tight integration. Cloudant, for instance, offers CouchDB services, and MongoHQ serves up MongoDB.
CloudBees offers servers as "app cells," a unit of power that's roughly one-eighth of a standard Amazon EC2 server. The memory and compute cycles are tied together, so you essentially buy the servers by the eighth.
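Buying by the eighth means rounding up, which a short sketch makes plain. It assumes the rough one-eighth ratio quoted above and measures demand in thousandths of a standard server; the class, the method, and that unit are hypothetical, not anything CloudBees publishes.

```java
// Hypothetical sizing helper based on the one-eighth ratio quoted above.
final class AppCells {
    static final long CELLS_PER_SERVER = 8;

    // Demand is given in thousandths of a standard server; the result is whole cells, rounded up.
    static long cellsNeeded(long milliServers) {
        if (milliServers < 0) {
            throw new IllegalArgumentException("milliServers must be >= 0");
        }
        long milliServersPerCell = 1000 / CELLS_PER_SERVER; // 125
        return (milliServers + milliServersPerCell - 1) / milliServersPerCell;
    }

    public static void main(String[] args) {
        // An app that needs 30 percent of a standard server (an assumed figure).
        System.out.println(cellsNeeded(300) + " app cells");
    }
}
```

An app needing 30 percent of a server lands on three cells, or 37.5 percent. Because memory and compute are tied together, whichever of the two runs out first sets the number.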
CloudBees offers a generous set of free services, but the constraints are tight. Only the casual developers will be happy within them. Anyone engaged in serious work will quickly need to upgrade to a paying service.
Java clouds: Red Hat OpenShift
Red Hat was never content to be just a collection of Linux tools. Its new foray into the cloud, called OpenShift, offers a quick way to deploy Java, Python, PHP, or Ruby apps to a collection of machines waiting to accept them. When you're done developing, the Red Hat cloud offers a collection of tools for deploying your app on Amazon EC2.
OpenShift is not Java-centric by any means. Whether you create a Java app or another kind, it handles many of the deployment issues. The standard Java application is a JBoss Application Server 7 stack built by Maven. This is a fairly new option, and I didn't find it listed in the fancy HTML documentation. Instead, I stumbled on it by hitting -h on the command line.
Yes, OpenShift is a good tool for those who like to use the command line. I typed a few lines, and boom! A JBoss application was deployed, running, and ready for customization. Updating is also simple. After you change the code, you commit to Git and push to the main server. This is more than a typical push, though, because the push triggers a deployment and you can watch the Maven build run automatically. Using a version control system to run a deployment is more and more common, especially because it makes rolling back easier. Git is a modern choice.
Once you hit the "push" button with Git, the code ends up in the Amazon EC2 cloud. You provide the account information and the Red Hat tool called Flex handles the deployment issues. You have 30 days of free trial if you want to experiment on Red Hat's dime. These tools are all said to be in beta and strictly for development work.
Clouds versus hosting services
Long before there was the cloud, there were the hosting services such as Mochahost.com, DailyRazor.com, Javaservlethosting.com, and HostJava.net. They're still here. All will take a WAR file and connect it with a domain name, usually giving you a running copy of Tomcat and as much memory as you care to buy. Often they toss in the cPanel control panel and other features from their PHP and HTML stacks.
Despite the fact that these services began before someone coined the word "cloud," it's worth ending with a comparison to them because they all offer fairly cloudlike features. It usually takes just a few seconds to start up a machine. If you have your credit card number handy, then Tomcat and the database are ready and waiting.
The biggest differences from the clouds will be in the data store and the scaling. These services are only for WAR files that are happy on one machine with one database. While you can build out the intermachine communication, you're left to do all of the work. If you want your database backed up, you get to push a few buttons.
This is fine for small and even not-so-small experimental sites as long as you don't use too much memory. In other words, they're good options for fledgling startups and proof-of-concept projects. The prices are predictable, and the setup is easy. The downside: Scaling can't be done easily without moving to a new platform.
But the more you play with the hosting services, the more the differences help you realize what the cloud services are trying to offer. It's not just the ability to start up a machine in seconds. It's not the preconfiguration. These precloud services offer all of that. It's the chance to buy computers by the minute or by the transaction. The hosting services don't come close to such offers, and that's where the current trend is going.
The more I played with the cloud and the precloud machines, the more I recognized there will be a wide range of opinions about clouds. Those with a fairly predictable and steady demand will wonder what's so great about the cloud. But those who suddenly need big blocks of compute cycles for short periods of time will be excited by the new options.
This article, "Review: 4 Java clouds face off," was originally published at InfoWorld.com.