July 19, 2005, 10:13 AM — New search tools promise to make web multimedia as easy to search as text.
Blinkx and Podscope have announced services that index the audio content of multimedia files. The services use proprietary audio-to-text conversion processes to create a text version of the speech within media files. That text can then be indexed to make the media file searchable.
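The indexing step can be illustrated with a minimal sketch. Neither company has published its method, so this is not their implementation: it assumes the speech-to-text step has already produced plain-text transcripts, and the file names, transcripts, and the functions `tokenize`, `build_index`, and `search` are all hypothetical.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def build_index(transcripts):
    """Map each word to the set of media files whose transcript contains it."""
    index = defaultdict(set)
    for media_file, text in transcripts.items():
        for word in tokenize(text):
            index[word].add(media_file)
    return index

def search(index, query):
    """Return the media files whose transcripts contain every query word."""
    words = tokenize(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

# Hypothetical transcripts, standing in for the output of a speech-to-text step.
transcripts = {
    "episode1.mp3": "Welcome to the show. Today we talk about broadband adoption.",
    "episode2.mp3": "This week: video blogs and the future of search.",
}
index = build_index(transcripts)
```

Calling `search(index, "broadband")` on this sample data returns only the first episode, because the query is matched against the transcript rather than the file name or description.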
Podscope
Podscope describes its site as "the first search engine that actually allows you to search for spoken words within any audio or video file." While the site has started by indexing podcasts, it plans to add other types of multimedia in the future. The site is currently in public beta testing.
According to Dave Seltzer, Podscope Systems Architect, "The beta is to evaluate the system under load to determine areas needing improvement. Believe me when I say, we are very interested in feedback. So if you're a listener or a podcaster, please, tell us what you like and what you think should be changed."
The site was created by a company called TVEyes, which also offers a service that makes radio and TV broadcasts searchable by keyword, phrase or topic.
Blinkx
blinkx is a search engine that is optimized for rich media. It uses voice recognition software to transcribe the content of audio and video segments, so people can find the content they're looking for.
"The volume of rich media content online continues to explode, but traditional search engines such as Google and Yahoo were developed for text-based keyword searches, not for audio or video content," said blinkx founder Suranga Chandratillake. "The prevalence of broadband and multimedia is driving demand for next generation search capabilities."
blinkx's podcast spider crawls the Web in search of rich media content, automatically identifying and processing podcast and video blog data, and indexing hundreds of hours of casts every hour.
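How blinkx's spider identifies podcast data is not described. As a rough sketch of one piece of such a crawler, the example below assumes the podcast is published as an RSS 2.0 feed whose items carry `enclosure` elements pointing at media files. The function `find_enclosures` and the sample feed are hypothetical.

```python
import xml.etree.ElementTree as ET

def find_enclosures(rss_xml):
    """Return (episode title, media URL, MIME type) for each enclosure in an RSS 2.0 feed."""
    root = ET.fromstring(rss_xml)
    found = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        # Items without an enclosure (plain text posts) are skipped.
        for enclosure in item.findall("enclosure"):
            found.append((title, enclosure.get("url"), enclosure.get("type")))
    return found

# Hypothetical feed: one audio episode and one text-only post.
SAMPLE_FEED = """<rss version="2.0"><channel><title>Example Cast</title>
<item><title>Episode 1</title>
<enclosure url="http://example.com/ep1.mp3" length="123456" type="audio/mpeg"/></item>
<item><title>Text-only post</title></item>
</channel></rss>"""

episodes = find_enclosures(SAMPLE_FEED)
```

A real crawler would also need to fetch feeds, follow links, and download the media files before any transcription could happen. None of that is shown here.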
blinkx TV combines speech recognition and transcription techniques with intelligent Context Clustering Technology (CCT) and other technologies to analyze the content (spoken words) of an audio/video file. The outputs of these analytical sub-processes are stored as further metadata tracks, alongside the digitally encoded content.
As more users adopt broadband, and as more individuals create and publish personal media content to the web, services that make this content searchable will become increasingly important.