I always find it interesting when people use GitHub for sharing things other than software code. Whether it’s music, laws or campaign platforms, people are increasingly using GitHub to open-source all sorts of non-code things. Not surprisingly, then, the following recent tweet caught my eye.
“I just open sourced my DNA https://t.co/FeXnaiq2gE” (Baishampayan Ghose, @ghoseb, June 13, 2014)
Wow! In this age when we’re all hyper-concerned about the privacy of our online data, here was somebody choosing to share some of his genetic data on GitHub. In his GitHub repository, Baishampayan Ghose says that he decided to share his data because “.. it might help researchers, programmers who are looking for real genetic data.”
A quick search of GitHub for “DNA” turned up a number of other people who’ve similarly chosen to share some of their genetic data on GitHub. Most of these people are obtaining portions of their genetic data from companies like 23andMe. They seem to be motivated to share their data out of a desire to be helpful, by providing scientists and researchers with a sample of real genetic data to study.
However, a gent named Manu Sporny, who posted some of his genetic data on GitHub about three years ago, had a different motivation, which he wrote about in detail at the time. Genetic data, even for one person, tends to be large and complex, so Sporny sees a need for software tools that let anybody better query and understand their own DNA. He posted his data in the hope that software developers would create open source tools for just that purpose.
Are developers using these data from GitHub to create such tools? I was able to identify some code repositories on GitHub that use genetic data, such as gql, which is a genetic query language for querying your genetic data once it’s in JSON format (an accompanying repository, dna2json, will convert the data you get from a company like 23andMe to JSON). There’s also 3dna, which will generate a 3D printable version of your genome, though I’m not sure how useful that would be to the average non-scientist.
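To give a feel for what "converting your data to JSON and querying it" might involve, here's a rough Python sketch. It assumes the raw export is a tab-separated text file with rsid, chromosome, position and genotype columns and "#" comment lines, which is the general layout of a 23andMe raw data download. The sample rows, the function names and the JSON shape are all made up for illustration; this is not how dna2json or gql actually work, just one plausible way to do something similar.

```python
import json

# Made-up sample in the assumed tab-separated layout:
# rsid, chromosome, position, genotype. Lines starting with "#" are comments.
SAMPLE = """\
# rsid\tchromosome\tposition\tgenotype
rs0000001\t1\t1000\tAA
rs0000002\t1\t2000\tCT
rs0000003\tX\t3000\t--
"""


def parse_raw(text):
    """Map each rsid to its chromosome, position and genotype (hypothetical helper)."""
    snps = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the comment/header lines.
        if not line or line.startswith("#"):
            continue
        rsid, chromosome, position, genotype = line.split("\t")
        snps[rsid] = {
            "chromosome": chromosome,
            "position": int(position),
            "genotype": genotype,
        }
    return snps


def genotype_of(snps, rsid):
    """Return the genotype recorded for an rsid, or None if it is absent."""
    entry = snps.get(rsid)
    return entry["genotype"] if entry else None


snps = parse_raw(SAMPLE)
as_json = json.dumps(snps, indent=2)  # the JSON form a query tool could load
```

With the data in that shape, a "query" can be as simple as `genotype_of(snps, "rs0000002")`, and the JSON string can be saved to a file for other tools to read.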
If you really want to contribute your genetic data for scientific research, though, is putting it up on GitHub the best way to go? Probably not. There are other sites devoted to collecting genetic data for research purposes that you should consider first, such as Harvard’s Personal Genome Project and openSNP. Your data is more likely to get into the hands of researchers through one of those sites than through GitHub.
Still, if you’re willing to share your genetic data with the world in the first place, you may as well also put it up on GitHub. Maybe a sharp software developer will use it to build a tool for the greater good (or for your own good). Just promise that you’ll refrain from the “fork me” jokes.
Read more of Phil Johnson's #Tech blog at ITworld.