August 05, 2008, 12:34 PM - Snapping away recently with my lovely new digital camera, it occurred to me that the many megabytes of data I was chewing through as I shutter-bugged my way through the vineyard would be so much more useful with the addition of a small number of bytes of well-chosen meta-data. A grape-sized lump of meta-data alongside the magnum-sized lumps of "real" data would add so much value to the data itself...if only there were some way to create it easily. What is this picture about? What is it of? Where was it taken? Why was it taken? What are those people in the background doing? Who are they? And so on.
A great tragedy lies herein. A sad law of this universe seeps forth like an Einsteinian nightmare. A law that goes something like this:
"The chances of any normal human being taking the time to add incredibly valuable meta-data to the great wads of digital data they
create daily approximate zero." An alternative formulation, using the classic dentistry analogy, goes like this: "Most people would
prefer root canal work to the utter tedium and ambient feeling of futility that accompanies meta-data creation. Besides, everyone is too
busy. Oh, and besides that again, it always seems to be more fun to create new stuff than to create stuff about old stuff."
This all sounds a bit depressing. Here is a much more upbeat formulation of the same law: "Any gadget, technique or service that
adds even vaguely useful meta-data without requiring significant human intervention will find a very receptive market out there."
Two examples from digital photo-land spring to mind. The first is the question of "where?". Given the general availability of GPS, it is
surely only a matter of time before digital cameras automatically record the "where" as well as the "when" without the shutter-bug having to do anything. The "when" is already a standard piece of meta-data, as long as you take the time to teach your camera the date and time. Both of those dimensions, time and space, are perfectly suited to the poster child of Web 2.0: mashups. I suspect we will see some amazing applications of this on social networks in the years ahead.
Another "w" word that is regularly relevant in photography is "who": who is in the photo? Now this is a tough one, an entirely more complex issue than automatically recording time and space meta-data.
Face recognition is, not to put too fine a point on it, insanely complicated. And yet there is hope, I think, thanks to the way Web 2.0 facilitates crowd-sourcing.
Most people - I suspect - take multiple photos of the same people. Family being the obvious grouping. Grandma and little sister
Sue at the Bar Mitzvah. Uncle Bob and Aunt Sally on Thanksgiving.
Now, imagine a service whereby you - as the owner of a burgeoning set of photos - add meta-data to a representative subset of your people photos. This is Uncle Bob, this is Aunt Sally and so on. At this point your online photo service provider lets its merry band of face
recognizers (people), working for a fee, add meta-data to your people photos. You don't need to know Uncle Bob to spot him in photos once you have one or two known examples of Uncle Bob. The infrastructure to do this sort of thing is already in place. Mechanical Turk is an example (although I do hope Amazon change the name to something less subject to unfortunate connotation).
One of the things I like about this human-generated meta-data service is that it exploits the volume asymmetry of data and meta-data
nicely. Broadband services tend to have much higher bandwidth on the downlink than on the uplink. Annotating photos is gigabytes down and kilobytes up. Perfect.
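To put rough numbers on that asymmetry, here is a back-of-the-envelope sketch in Python. The figures are my own assumptions for illustration, not measurements: about 3 MB per photo and about 200 bytes of "who" labels per photo. The function name `transfer_profile` is likewise just a hypothetical helper.

```python
# Hypothetical figures, chosen only for illustration.
PHOTO_BYTES = 3 * 1024 * 1024   # assume roughly 3 MB per photo
ANNOTATION_BYTES = 200          # assume roughly 200 bytes of labels per photo


def transfer_profile(photo_count: int) -> dict:
    """Return the bytes an annotator downloads and uploads for a batch of photos."""
    down = photo_count * PHOTO_BYTES       # photos travel down to the annotator
    up = photo_count * ANNOTATION_BYTES    # only the labels travel back up
    return {"down_bytes": down, "up_bytes": up, "ratio": down / up}


profile = transfer_profile(1000)
print(f"down: {profile['down_bytes'] / 1024**3:.2f} GiB")
print(f"up:   {profile['up_bytes'] / 1024:.1f} KiB")
print(f"ratio: about {profile['ratio']:,.0f} to 1")
```

Under those assumptions, a thousand photos means a few gigabytes heading down the wire and a couple of hundred kilobytes heading back up. Change the two constants to suit your own camera and the shape of the result stays the same.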
Okay. So where are we now with respect to the famous list of "w's": who/what, where, when and why? I can see ways to get the where, the when and the who/what. That leaves the "why". I think the originator of the photo is stuck with having to create that one.