ITworld.com
How Will You Store Your XML Data?
By Mark Leon
XML IN PRACTICE --- 07/26/2001



Customers say they want them, vendors are scrambling to provide them, and opinions vary as to how to set them up correctly. They are XML databases, a way to store, search, and retrieve all that mission-critical business data that is finding expression in XML format. Currently, XML rivals HTTP, HTML, and SQL as one of the big hits on the top 10 chart of information management standards.
But XML's strength, its great capability of facilitating the flow of semistructured data among applications and heterogeneous systems, also introduces several new problems. One of the more pressing problems is how to store and manage XML data.

"There are really three ways you can do this," says John Matranga, CTO of Omicron Consulting, in Philadelphia. "You can store XML in a database designed specifically for XML, in a modified object database, or in a relational database."

Matranga goes on to say that because the relational database is still the undisputed king, most people will probably choose this option. "But if the relational database does not have XML extensions, you will need to 'teach' it how to handle all the hierarchies associated with an XML document," he says.
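
To make the idea of "teaching" a relational database about hierarchy concrete, here is a minimal sketch in Python using the standard sqlite3 and xml.etree.ElementTree modules. It is not taken from any vendor's product: the purchase order, the node table, and the shred function are all hypothetical names chosen for illustration. Each element becomes a row that points at its parent row, which is one common way to flatten a tree into a table. Attributes are ignored to keep the sketch short.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical purchase order; element names are illustrative only.
ORDER_XML = """
<order id="1001">
  <customer>Acme Construction</customer>
  <item sku="LB-60W"><qty>24</qty></item>
  <item sku="SW-01"><qty>10</qty></item>
</order>
""".strip()


def shred(conn, xml_text):
    """Store each element as a row that points at its parent row."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS node ("
        " id INTEGER PRIMARY KEY, parent_id INTEGER,"
        " tag TEXT, text TEXT)"
    )

    def walk(elem, parent_id):
        cur = conn.execute(
            "INSERT INTO node (parent_id, tag, text) VALUES (?, ?, ?)",
            (parent_id, elem.tag, (elem.text or "").strip()),
        )
        node_id = cur.lastrowid
        for child in elem:
            walk(child, node_id)
        return node_id

    return walk(ET.fromstring(xml_text), None)


conn = sqlite3.connect(":memory:")
root_id = shred(conn, ORDER_XML)

# Direct children of the root element, in document order.
children = [row[0] for row in conn.execute(
    "SELECT tag FROM node WHERE parent_id = ? ORDER BY id", (root_id,))]
print(children)
```

The hierarchy survives only as parent_id links, so reassembling any subtree means following those links row by row. That bookkeeping is the work the XML extensions are meant to take off the developer's hands.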

In fact, Microsoft, Oracle, and IBM have already added XML extensions to their relational databases, but these efforts will not satisfy everyone for a number of reasons.

"I think most people will want to stick with their relational technology when it comes to XML storage," says Josh Walker, an analyst at Forrester Research, in Cambridge, Mass. "But XML has breathed some new life into the niche market of specialty databases."

A second chance for object databases?
If you are over the age of 25, you may recall that only a few years ago the object database was hailed as the next big thing in data storage.

But outside of some very specialized markets -- high-end science research applications, for example -- the object database never really caught on.

The reason, according to Deborah Hess, senior analyst at Gartner in Stamford, Conn., was complexity. "An object database requires you to learn a whole new language," Hess says. "That is one of the reasons the object database market started to die in late 1995."

It is also why Object Design, an object database vendor in Burlington, Mass., changed its name in 1999 to eXcelon. This was actually the culmination of a project begun in 1997 by then-CTO Larry Alston to make ObjectStore, the company's database, into a repository for XML documents.

Alston has since left the company, but the commitment to XML remains strong. "The relational database design does not easily support indexing or searching XML," says Satish Maripuri, president and COO of eXcelon. "An object database such as ours offers a more natural way to store, search, and retrieve XML data. This is why we took a bet with XML."

Hess agrees: "An object database stores data in hierarchical form; this enables it to handle all the classes and inheritance properties of objects. Now XML documents are themselves hierarchical, so the two fit very well together."

And Maripuri says the complexity issue is no longer the barrier to entry it once was. "We have built a graphical XML interface into the product," he explains. "You don't need complex development expertise to use it."

Analysts give eXcelon high marks for what it has been able to accomplish both in simplifying its product and in adapting it to XML, but few are willing to predict the bet will pay off.

"I like their technology," Forrester's Walker says. "But I really see it as something that will be built into a larger infrastructure, and I am cautious about their long-term prospects."

Martin Marshall, managing director at Zona Research, in Redwood City, Calif., doesn't think the company will be able to make it alone. "I think they are an acquisition target," he says.

No longer an academic issue
Meanwhile, folks such as John Conte, director of IT at Wesco Distribution in Pittsburgh, will tell you that analysts aren't the only ones debating these questions.

"We distribute electrical products -- things like lightbulbs and switches -- primarily to construction firms," Conte says. "About a year ago our small- to medium-sized customers started asking IS to provide XML integration."

The reasons were the by-now-familiar advantages of XML: It is more flexible than EDI (electronic data interchange) and cheaper because it can, via the Internet, bypass expensive, private VANs (value-added networks).

So Wesco hired Keane, a services company in Boston, to help implement an XML-based system to process orders.

Conte says the project has been a success. "But," he explains, "there is one thing that gives us pain on a daily basis. When we started this project, there were not many database tools for storing XML, so we tried to create a standard relational database schema that would support all the various flavors and formats of XML we could anticipate."

The combinations of different purchase order formats and XML vocabularies, however, proved impossible to anticipate. "It means that we frequently need to change our schema," Conte says. And this, as any DBA (database administrator) will tell you, is something you want to do as seldom as possible.
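
The following Python sketch illustrates the kind of mismatch Conte describes. It is not Wesco's system: the column set, the element names, and the unmapped_elements helper are assumptions made up for the example. A schema designed around the order formats anticipated up front has no place to put the fields a new variant introduces, and each such field implies a schema change.

```python
import xml.etree.ElementTree as ET

# Columns a hypothetical fixed order table was designed around.
KNOWN_COLUMNS = {"customer", "sku", "qty"}


def unmapped_elements(xml_text):
    """Return leaf element names that have no column in the fixed schema."""
    root = ET.fromstring(xml_text)
    leaves = {e.tag for e in root.iter() if len(e) == 0}
    return sorted(leaves - KNOWN_COLUMNS)


# A format the schema was built for.
anticipated = (
    "<order><customer>Acme</customer>"
    "<sku>LB-60W</sku><qty>24</qty></order>"
)

# A later variant from another customer with two extra fields.
variant = (
    "<order><customer>Acme</customer>"
    "<sku>LB-60W</sku><qty>24</qty>"
    "<shipBy>2001-08-15</shipBy><jobSite>Lot 7</jobSite></order>"
)

print(unmapped_elements(anticipated))  # []
print(unmapped_elements(variant))      # ['jobSite', 'shipBy']
```

Every name in the second list would need a new column, or a new table, before the document could be stored without losing data.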

So Conte is very interested in finding a better way to store XML. "Six months ago it wasn't such a big deal, but now it is obvious we need a solution," he says.

Wesco's database is Microsoft SQL Server, but Conte says he is open to any number of new solutions. "From what I have seen, I think the XML extensions Microsoft and Oracle are adding will be enough to do what we need to do. We could switch to Oracle if it makes sense, but we are also interested in some of the new XML database products."

As noted, the big players of data storage have not been sitting on their hands. Oracle, IBM, and Microsoft have added XML extensions to their relational offerings.

"The object database folks thought XML would give them a new lease on life," says John Magee, senior director of product marketing at Oracle, in Redwood City, Calif. "But it hasn't panned out."

Magee says Oracle currently offers XML support in the 8i release. "We are going to offer additional support for XML as a data type in the 9i release due out in the next few months," he adds. The 9i release will also add SQL operators designed specifically for querying XML.

XML at IBM is part of the WebSphere e-commerce infrastructure, as is the DB2 database. "We have an XML extender for DB2 that allows you to query the database directly for rich XML content," says Scott Hebner, director of marketing for WebSphere.

WebSphere's primary competition is BizTalk Server from Microsoft. And SQL Server with XML extensions is an important component of the BizTalk architecture.

"We didn't have any XML functions built into SQL Server 7," says Jeff Ressler, lead product manager for SQL Server at Microsoft. "But with the SQL Server 2000 release last September, you can load XML documents directly into the database and retrieve them with a simple 'Select' statement."

European invasion?
This kind of action from the heavies might be enough to scare even the most optimistic of newcomers, but another database giant is getting into the act.

Not exactly a household name in the United States, Software AG of Germany owns a substantial share of the global database market with its Adabas product. And now the company, with U.S. headquarters in Reston, Va., thinks it has an edge in the XML storage space.

"We released Tamino in September of 1999," says John Taylor, director of product marketing at Software AG. "Tamino is not a relational database, nor is it an object database modified for XML. It is, rather, a database built from the ground up specifically for XML."

The interface for Tamino is HTTP, and Taylor says his company is working with the World Wide Web Consortium (W3C) to develop the next XML query language.

"The issue of query and retrieval is key," Taylor says. "You can use extensions to SQL for this, but to do that you need to break the XML hierarchy into a set of relational tables. This means queries will necessarily contain a complex set of join statements. With XPath, our query language, we can replace all that with one line."
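
Taylor's contrast can be sketched in a few lines of Python. This is an illustration only, not Tamino's interface: the sample orders, the table names, and the variable names are hypothetical, and ElementTree's findall supports only a limited subset of XPath. The same question ("which SKUs did Acme order?") is answered once by shredding the document into two tables and joining them, and once with a single path expression over the original document.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical order data; element names are illustrative only.
DOC = ET.fromstring(
    "<orders>"
    "<order id='1001'><customer>Acme</customer>"
    "<item><sku>LB-60W</sku><qty>24</qty></item>"
    "<item><sku>SW-01</sku><qty>10</qty></item></order>"
    "<order id='1002'><customer>Bolt</customer>"
    "<item><sku>LB-60W</sku><qty>5</qty></item></order>"
    "</orders>"
)

# Relational route: shred into tables, then join them back together.
conn = sqlite3.connect(":memory:")
conn.executescript(
    "CREATE TABLE orders (id TEXT PRIMARY KEY, customer TEXT);"
    "CREATE TABLE items (order_id TEXT, sku TEXT, qty INTEGER);"
)
for order in DOC.findall("order"):
    conn.execute(
        "INSERT INTO orders VALUES (?, ?)",
        (order.get("id"), order.findtext("customer")),
    )
    for item in order.findall("item"):
        conn.execute(
            "INSERT INTO items VALUES (?, ?, ?)",
            (order.get("id"), item.findtext("sku"),
             int(item.findtext("qty"))),
        )

sql_skus = [row[0] for row in conn.execute(
    "SELECT i.sku FROM orders o JOIN items i ON i.order_id = o.id "
    "WHERE o.customer = 'Acme' ORDER BY i.rowid")]

# Path route: one expression walks the hierarchy directly.
path_skus = [e.text for e in
             DOC.findall("./order[customer='Acme']/item/sku")]

print(sql_skus)
print(path_skus)
```

Both routes return the same two SKUs here. With only two levels of nesting the join is still manageable; each additional level in the document adds another table and another join on the relational side, while the path expression simply grows by one step.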

Taylor says that more than 280 customers are currently using Tamino. One of these is the California Board of Equalization in Sacramento, Calif. The board collects about $37 billion in taxes (primarily sales tax) for California.

"We started looking at XML to facilitate the electronic filing of taxes," says Larry Hanson, data architect for the board. "Before long we also realized XML would be the best way to store tax returns, tax schedules, and tax-related messages."

The board was already a big Adabas shop with ties to Software AG, a fact that helped drive the adoption of Tamino.

Hanson says the product works as advertised. "With the query language, XPath, you can retrieve all of the information in a given XML document. There is a bit of a learning curve: XPath is similar to SQL, but you need to be more aware of the structured hierarchical nature of a given XML document."

Taylor says that Software AG has no plans to challenge the relational vendors on their own turf. "We are not going to make the mistake that the object database vendors did in the mid-'90s," Taylor says. "We know that transactions will still be relational, and we know we are in a niche market. We just think it is a very big niche."

He could be right. Until recently most of the vendor action around XML was focused on using it as a transport mechanism. And users, such as Wesco's Conte, were also concerned mainly with XML connections.

But the issue of storage is rapidly moving to the forefront. "At some point," Conte says, "any customer using XML in significant volume will need to store the documents."

 




Copyright © Computerworld, Inc. All rights reserved

Reproduction in whole or in part in any form or medium without express written permission of Computerworld Inc. is prohibited. Computerworld and Computerworld.com and the respective logos are trademarks of International Data Group Inc.