No, HealthCare.gov doesn’t require half a billion lines of code

New data confirms that a previous claim about the size of the Obamacare portal’s code base was way off the mark

By Phil Johnson

Screenshot of HealthCare.gov website

Does this really require 500 million lines of code?

Image credit: REUTERS/Mike Segar

Whether you’re a fan of Obamacare or against it, there’s no denying that it’s a topic that gets people worked up. Even the technical aspects of its implementation can generate a lot of discussion. Take, for instance, a claim made to the New York Times last year by an anonymous specialist who reportedly worked on HealthCare.gov, the federal government's online marketplace for health care under the Affordable Care Act: that the website consisted of 500 million lines of code.

This claim immediately caught the attention - and derision - of software developers who felt that it was not only unrealistic, but flat-out impossible. Sure, the site is complex, but not that complex. Besides, it just wouldn’t be possible to write that much code in the amount of time it took to build the site.

To put that 500 million number into perspective, two years ago I did a little research into the number of lines of code behind some well-known software over the years. At the low end, the guidance system for the Apollo 11 spacecraft used 145,000 lines of code. The Mars Curiosity rover uses 2.5 million lines of code. At the upper end, there’s Mac OS X Tiger (version 10.4) which had 86 million lines of code.

500 million lines of code for a transactional website - more than five times as much code as that behind OS X - just didn’t pass the sniff test. But just how many lines of code does it take to generate HealthCare.gov?
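The comparison above is simple division, and a minimal sketch makes the scale explicit. The figures are the ones quoted in this article; the names `CLAIMED_LINES`, `KNOWN_CODE_BASES` and `times_larger` are made up for illustration.

```python
# Line counts as quoted in the article. All names here are illustrative.
CLAIMED_LINES = 500_000_000

KNOWN_CODE_BASES = {
    "Apollo 11 guidance system": 145_000,
    "Mars Curiosity rover": 2_500_000,
    "Mac OS X Tiger (10.4)": 86_000_000,
}


def times_larger(claimed: int, reference: int) -> float:
    """Return how many times larger `claimed` is than `reference`."""
    return claimed / reference


# Ratio of the 500 million claim to each well-known code base.
ratios = {
    name: times_larger(CLAIMED_LINES, lines)
    for name, lines in KNOWN_CODE_BASES.items()
}

for name, ratio in ratios.items():
    print(f"500 million lines is about {ratio:,.1f}x the size of {name}")
```

Run as written, this should put the claim at well over five times the size of OS X Tiger, which is the "sniff test" the number fails.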

This question came up on Reddit again last week and it appears that we may now have an answer. One commenter who goes by the handle agenaille, and who claimed to have worked on HealthCare.gov as part of the post-launch clean-up crew at the end of 2013, provided counts of the lines of code behind HealthCare.gov, broken down by programming/markup language. There’s no way to know if this person is telling the truth, but the Reddit community certainly seems to believe him or her; one redditor awarded agenaille Reddit gold for the post. For the sake of argument (and fun), let’s assume the numbers provided by this person are correct.

Here’s the breakdown:

HealthCare.gov Lines of Code by Language

Language                 File Count   Lines of Code
Java                         13,481       2,399,683
HTML                          1,635         515,494
JavaScript                    1,631         322,192
XSD                           5,227         156,696
XML                             659         136,827
CSS                             205         109,815
Maven                           275          47,449
XSLT                            383          21,624
Bourne Shell                    248           8,830
SQL                              28           8,487
JavaServer Faces                 35           3,770
DOS Batch                        48             849
Ant                               8             810
Perl                             18             646
Visualforce Component            39             626
Groovy                            4             361
Python                            5             263
Visual Basic                      1              25
DTD                               1              17
JSP                               3              13
ASP.Net                           1              11
Totals                       23,935       3,734,488

Source: Reddit/agenaille

You see that the total lines of code count provided by agenaille is 3.7 million, nowhere near 500 million. Agenaille notes that this doesn’t include code used for administrative tools related to the site. In the end, s/he guesstimates the total lines of code behind HealthCare.gov to be somewhere between 5 and 15 million. Again, way less than half a billion, in any case.
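For readers who want to check the arithmetic, here is a rough sketch that re-adds the per-language counts from the table and compares the result with the 500 million claim. The data is copied from agenaille's breakdown as reproduced above; the variable names are made up for illustration.

```python
# (language, file count, lines of code), copied from the table in this article.
ROWS = [
    ("Java", 13_481, 2_399_683),
    ("HTML", 1_635, 515_494),
    ("JavaScript", 1_631, 322_192),
    ("XSD", 5_227, 156_696),
    ("XML", 659, 136_827),
    ("CSS", 205, 109_815),
    ("Maven", 275, 47_449),
    ("XSLT", 383, 21_624),
    ("Bourne Shell", 248, 8_830),
    ("SQL", 28, 8_487),
    ("JavaServer Faces", 35, 3_770),
    ("DOS Batch", 48, 849),
    ("Ant", 8, 810),
    ("Perl", 18, 646),
    ("Visualforce Component", 39, 626),
    ("Groovy", 4, 361),
    ("Python", 5, 263),
    ("Visual Basic", 1, 25),
    ("DTD", 1, 17),
    ("JSP", 3, 13),
    ("ASP.Net", 1, 11),
]

STATED_FILES = 23_935         # "Totals" row, file count
STATED_LINES = 3_734_488      # "Totals" row, lines of code
CLAIMED_LINES = 500_000_000   # figure reported to the New York Times
UPPER_GUESS = 15_000_000      # top of agenaille's 5-15 million estimate

total_files = sum(files for _, files, _ in ROWS)
total_lines = sum(lines for _, _, lines in ROWS)

print("File total matches table:", total_files == STATED_FILES)
print("Line total matches table:", total_lines == STATED_LINES)

# How many times larger the claim is than the counted code.
overshoot = CLAIMED_LINES / total_lines
# The same comparison against the most generous estimate.
overshoot_upper = CLAIMED_LINES / UPPER_GUESS

print(f"Claim vs. counted lines: about {overshoot:.0f}x")
print(f"Claim vs. upper guess:   about {overshoot_upper:.0f}x")
```

Even against the most generous end of agenaille's estimate, the 500 million figure is off by more than an order of magnitude.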

I took the numbers and generated the following chart to demonstrate the percentage breakdown of lines of code by language behind HealthCare.gov.

Pie chart showing the percent of total lines of code behind HealthCare.gov by programming language (excluding blank lines and code comments). Java is 64%, HTML 14%, JavaScript 9%, XSD 4%, XML 4%, CSS 3% and other 2%.

Image credit: ITworld/Phil Johnson; Data source: Reddit/agenaille

As you can see, nearly two-thirds (64%) of the code behind HealthCare.gov is Java. Another 14% is HTML markup, followed by JavaScript (9%), XSD (4%), XML (4%) and CSS (3%).
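The percentages come straight from dividing each language's line count by the total. Below is a minimal sketch of that calculation, assuming (as the chart appears to) that everything outside the six largest languages is lumped into a single "Other" slice; the function name `percent_shares` is made up for illustration.

```python
# Lines of code for the six largest languages, from agenaille's breakdown.
TOP_LANGUAGES = {
    "Java": 2_399_683,
    "HTML": 515_494,
    "JavaScript": 322_192,
    "XSD": 156_696,
    "XML": 136_827,
    "CSS": 109_815,
}
TOTAL_LINES = 3_734_488  # total across all 21 languages in the table


def percent_shares(top: dict[str, int], total: int) -> dict[str, float]:
    """Return each language's share of `total` in percent, plus an 'Other' remainder."""
    shares = {lang: 100 * lines / total for lang, lines in top.items()}
    # Whatever is not covered by the named languages becomes "Other".
    shares["Other"] = 100 * (total - sum(top.values())) / total
    return shares


shares = percent_shares(TOP_LANGUAGES, TOTAL_LINES)

for lang, share in shares.items():
    print(f"{lang:<12}{share:5.1f}%")
```

Computed this way, the "Other" remainder lands at roughly 2.5%, right on the rounding boundary, so it can show up as 2% or 3% depending on how a charting tool rounds.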

These data, for what they’re worth, support the belief that HealthCare.gov doesn’t, as many people thought, require anywhere near 500 million lines of code. Still, nearly 4 million lines of code seems like quite a bit for this kind of thing. That’s 8 times as much code as was required for the space shuttle’s primary flight software. So, apparently, running HealthCare.gov is harder than rocket science. Who knew?

Read more of Phil Johnson's #Tech blog and follow the latest IT news at ITworld. Follow Phil on Twitter at @itwphiljohnson. For the latest IT news, analysis and how-tos, follow ITworld on Twitter and Facebook.
