One way to improve document searches is to put a human expert, someone who knows the topic and terminology, in the loop, says Gordon Cormack, coordinator of TREC's legal track and a professor of computer science at the University of Waterloo.
"You can do a lot better job of searching for relevant documents if you use a combination of an expert who knows the data set working with individuals who are actually running automated search queries," agrees Jason Baron, director of litigation at the U.S. National Archives and Records Administration and a founding coordinator of the TREC legal track.
Meanwhile, vendors are scrambling to provide alternative search technologies to overcome the limitations of today's tools. E-discovery software vendor Clearwell Systems Inc., for example, has developed what it calls "transparent search," which lets users select specific variations of keywords to reduce the likelihood of false positives.
With Clearwell's tool, a user searching for, say, the word "false" could catch derivations such as "falsify" and "falsifying" while excluding irrelevant terms, such as "falsetto," that a standard keyword search might include.
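The general idea can be sketched in a few lines of Python. This is only an illustration of selecting keyword variations by hand, not Clearwell's implementation, which is not described here; the function names, the stem "fals" and the sample memos are all assumptions made up for the example.

```python
import re

def tokenize(text):
    """Split text into lowercase alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())

def expand_variants(stem, vocabulary):
    """List every indexed word that begins with the stem, for user review."""
    return sorted(w for w in vocabulary if w.startswith(stem))

def search(documents, approved_terms):
    """Return ids of documents containing at least one approved term."""
    return [doc_id for doc_id, text in documents.items()
            if set(tokenize(text)) & approved_terms]

# Hypothetical sample collection.
documents = {
    "memo-1": "He agreed to falsify the quarterly report.",
    "memo-2": "The tenor sang the aria in falsetto.",
    "memo-3": "Falsifying invoices is a serious offense.",
    "memo-4": "The statement was false.",
}

vocabulary = {w for text in documents.values() for w in tokenize(text)}

# Step 1: show the user every variation a wildcard search would match.
candidates = expand_variants("fals", vocabulary)
# candidates == ['false', 'falsetto', 'falsify', 'falsifying']

# Step 2: the user deselects the irrelevant variation.
approved = set(candidates) - {"falsetto"}

# Step 3: search with only the approved variations.
hits = search(documents, approved)
print(hits)  # ['memo-1', 'memo-3', 'memo-4']
```

The memo about the tenor drops out because the reviewer, not the software, decided that "falsetto" was a false positive.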
Another alternative is "concept search" technology, which retrieves information related to a concept rather than to a specific keyword or phrase. For example, a concept search based on the word "oil" would recognize that documents about petroleum are also relevant.
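A toy version of the difference, again in Python, might look like the following. It assumes a hand-built table that maps a concept to related terms; commercial concept-search products typically derive such relationships automatically, and the table, function names and sample documents here are purely hypothetical.

```python
import re

# Hypothetical concept table: each concept maps to terms treated as related.
CONCEPTS = {
    "oil": {"oil", "petroleum"},
}

def tokenize(text):
    """Split text into a set of lowercase alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))

def keyword_search(documents, keyword):
    """Return ids of documents that contain the literal keyword."""
    return [doc_id for doc_id, text in documents.items()
            if keyword.lower() in tokenize(text)]

def concept_search(documents, concept):
    """Return ids of documents that contain any term tied to the concept."""
    terms = CONCEPTS.get(concept.lower(), {concept.lower()})
    return [doc_id for doc_id, text in documents.items()
            if tokenize(text) & terms]

# Hypothetical sample collection.
documents = {
    "doc-a": "Oil prices rose sharply last quarter.",
    "doc-b": "The petroleum shipment was delayed at the port.",
    "doc-c": "The board approved the new travel policy.",
}

print(keyword_search(documents, "oil"))  # ['doc-a']
print(concept_search(documents, "oil"))  # ['doc-a', 'doc-b']
```

The keyword search misses the petroleum document entirely, while the concept search returns it because the two terms are linked in the table.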
"Conceptual technology is a much more effective tool than keyword searching," says Oleh Hrycko, president of H&A eDiscovery in Toronto. H&A offers a search engine called eExamine Conceptual that groups together documents that address related concepts. The company says the tool cuts search time by up to 70%.
Other search technologies rely on taxonomies of industry terms or on mathematical techniques, such as clustering and latent semantic indexing, that estimate the probability that a document contains a particular term or concept.
Ultimately, the best approach might be a combination of the new search technologies. "A startling statistic one of our studies revealed is that 25% of relevant documents were found by Boolean search, while 75% were found by using other methods combined," says Baron.
Despite the increasing availability of more advanced search technologies, the new tools aren't being snapped up by old-school law firms. "In terms of comfort, the more senior practitioners might pine for the good old days when they had boxes of documents [to search through]," says Richard Braman, executive director of The Sedona Conference, an e-discovery think tank in Arizona.