The tool will soon be available to end users through a web interface. Research into improving text mining will continue, said Korhonen; one of the biggest current challenges is developing adaptive technology that can be ported easily between different text types, tasks and scientific fields.
"Although still under development, the system can be used to make connections that would be difficult to find, even if it had been possible to read all the documents," said Korhonen. "In a recent experiment we studied a group of chemicals with unknown modes of action and used the CRAB tool to suggest a new hypothesis that might explain their male-specific carcinogenicity in the pancreas."
The Cambridge development comes as IBM is making a play to help manage patient care with its Watson (http://www.computerworlduk.com/news/applications/3313622/ibms-dr-watson-...) data analytics computing platform. Watson is designed to understand natural language in unstructured data and is now being applied to medical diagnostics.
Watson's capabilities allow patients' medical histories to be overlaid with their symptoms and their family histories of illness, helping clinicians reach what is hoped will be a more accurate diagnosis.