ITworld.com
DTDs Limited for Data
 

XML IN PRACTICE --- 06/28/2001



Mark Johnson

A Document Type Definition (DTD), as you probably know, constrains which elements and attributes may appear in an XML document and how they relate to one another. The DTD expresses rules about how elements must be ordered, which elements may contain other elements, the possible values of attributes, and so on. A program that uses XML can apply a DTD to an XML document to ensure that the document's contents follow the rules.

For example, imagine your veterinarian has a brand-new clinic management system that uses XML. Now imagine you have a balding cat. The veterinarian diagnoses the cat with alopecia and enters the diagnosis into the system. The clinic management system creates the following XML document and sends it (as a message) to a billing system somewhere on the network:

<Episode ID="5" PATIENT="804591">
  <Date>2001-06-22T14:31:00.000-06:00</Date>
  <Diagnoses>
    <Diagnosis>
      <Code>
        <ICD9>704.00</ICD9>
      </Code>
      <Desc>Alopecia, unspecified baldness</Desc>
    </Diagnosis>
  </Diagnoses>
</Episode>

This document describes an "episode of care" for your cat, including the date of the episode, the diagnosis code from ICD9 (a standardized vocabulary for diagnoses), and an English description of the condition.

The billing system can use a DTD to ensure that an incoming Episode follows the rules for Episodes. The DTD for an Episode might look something like this:

<!ELEMENT Episode (Date, Diagnoses)>
<!ATTLIST Episode
    ID      CDATA #REQUIRED
    PATIENT CDATA #REQUIRED>
<!ELEMENT Date (#PCDATA)>
<!ELEMENT Diagnoses (Diagnosis)*>
<!ELEMENT Diagnosis (Code, Desc?)>
<!ELEMENT Code (ICD9)>
<!ELEMENT ICD9 (#PCDATA)>
<!ELEMENT ICD10 (#PCDATA)>
<!ELEMENT Desc (#PCDATA)>
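
One way to read the content models above is as regular expressions over the names of an element's children. The Python sketch below only illustrates that idea and is not a DTD validator: Python's standard library parsers are non-validating, so the rules are transcribed by hand. The names CONTENT_MODELS, REQUIRED_ATTRS, and structure_errors are hypothetical, and both sample messages are made up for the example (the first wraps ICD9 in Code, as the DTD requires).

```python
import re
import xml.etree.ElementTree as ET

# Content models from the Episode DTD, hand-transcribed as regular
# expressions over the child element names (each name followed by a space).
# None marks a #PCDATA element: text only, no child elements.
CONTENT_MODELS = {
    "Episode": r"Date Diagnoses ",
    "Diagnoses": r"(Diagnosis )*",
    "Diagnosis": r"Code (Desc )?",
    "Code": r"ICD9 ",
    "Date": None,
    "ICD9": None,
    "ICD10": None,
    "Desc": None,
}
REQUIRED_ATTRS = {"Episode": ("ID", "PATIENT")}


def structure_errors(elem):
    """Return a list of structural problems in elem and its descendants."""
    if elem.tag not in CONTENT_MODELS:
        return [f"undeclared element <{elem.tag}>"]
    errors = []
    for attr in REQUIRED_ATTRS.get(elem.tag, ()):
        if attr not in elem.attrib:
            errors.append(f"<{elem.tag}> is missing required attribute {attr}")
    children = "".join(child.tag + " " for child in elem)
    model = CONTENT_MODELS[elem.tag]
    if model is None:
        if children:
            errors.append(f"<{elem.tag}> must contain only text")
    elif not re.fullmatch(model, children):
        errors.append(
            f"<{elem.tag}> has children [{children.strip()}], expected {model!r}"
        )
    for child in elem:
        errors.extend(structure_errors(child))
    return errors


# Sample message that follows the DTD.
VALID = """<Episode ID="5" PATIENT="804591">
  <Date>2001-06-22T14:31:00.000-06:00</Date>
  <Diagnoses>
    <Diagnosis>
      <Code><ICD9>704.00</ICD9></Code>
      <Desc>Alopecia, unspecified baldness</Desc>
    </Diagnosis>
  </Diagnoses>
</Episode>"""

# Sample message with PATIENT omitted and the two children out of order.
INVALID = """<Episode ID="5">
  <Diagnoses/>
  <Date>2001-06-22T14:31:00.000-06:00</Date>
</Episode>"""

if __name__ == "__main__":
    print(structure_errors(ET.fromstring(VALID)))
    for problem in structure_errors(ET.fromstring(INVALID)):
        print(problem)
```

A real billing system would hand the DTD to a validating parser rather than re-encode the rules this way. The sketch also skips parts of DTD validation, such as rejecting undeclared attributes.
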

This DTD defines the structure of the XML document, including element nesting, attributes, possible attribute values, and the number of occurrences of each element. However, the DTD "language" isn't powerful enough for use with data like these, for several reasons:

  • No way to constrain data. For example, there's no way to ensure that ICD9 diagnostic codes match a specific format.
  • No primitive data types. There's no way to indicate that the Episode attributes ID and PATIENT are numbers.
  • No richer data types, such as dates.
  • Non-XML syntax. The DTD is written, not in XML, but in its own peculiar syntax. So there's one more language to learn, and one more thing to get wrong.
  • Weak namespace control. DTDs are not namespace-aware; they treat a prefixed element name as a plain string.
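
Because the DTD cannot express the data constraints listed above, such checks end up in application code. The Python sketch below shows roughly what that might look like for an Episode. It rests on assumptions: the ICD9 pattern is a deliberately simplified guess at the code format (three digits, optionally a dot and one or two more digits), and the names ICD9_PATTERN and data_errors, along with the second sample message, are hypothetical.

```python
import re
import xml.etree.ElementTree as ET
from datetime import datetime

# Assumed, simplified ICD9 shape: three digits, optionally followed by a
# dot and one or two more digits. The real code set is richer than this.
ICD9_PATTERN = re.compile(r"\d{3}(\.\d{1,2})?")


def data_errors(episode):
    """Check value-level rules that the Episode DTD cannot express."""
    errors = []
    # ID and PATIENT are declared CDATA, so the DTD accepts any string.
    for attr in ("ID", "PATIENT"):
        value = episode.get(attr, "")
        if not re.fullmatch(r"[0-9]+", value):
            errors.append(f"{attr} is not a number: {value!r}")
    # Date is declared #PCDATA, so the DTD accepts any text.
    date_text = episode.findtext("Date", default="")
    try:
        datetime.fromisoformat(date_text)
    except ValueError:
        errors.append(f"Date is not an ISO 8601 timestamp: {date_text!r}")
    # The same holds for every ICD9 element, wherever it is nested.
    for code in episode.iter("ICD9"):
        text = code.text or ""
        if not ICD9_PATTERN.fullmatch(text):
            errors.append(f"ICD9 code has the wrong format: {text!r}")
    return errors


GOOD = (
    '<Episode ID="5" PATIENT="804591">'
    "<Date>2001-06-22T14:31:00.000-06:00</Date>"
    "<Diagnoses><Diagnosis>"
    "<Code><ICD9>704.00</ICD9></Code>"
    "<Desc>Alopecia, unspecified baldness</Desc>"
    "</Diagnosis></Diagnoses>"
    "</Episode>"
)

# Structurally the same as GOOD, so the DTD would accept it, but three of
# its values are nonsense.
BAD = (
    '<Episode ID="five" PATIENT="804591">'
    "<Date>June 22, 2001</Date>"
    "<Diagnoses><Diagnosis>"
    "<Code><ICD9>alopecia</ICD9></Code>"
    "</Diagnosis></Diagnoses>"
    "</Episode>"
)

if __name__ == "__main__":
    print(data_errors(ET.fromstring(GOOD)))
    for problem in data_errors(ET.fromstring(BAD)):
        print(problem)
```

The BAD message has exactly the element and attribute structure the DTD asks for, which is the point: every problem the sketch reports is one the DTD has no way to describe.
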

Next week, you'll see an example of XML Schema, which the W3C recently made a Recommendation. XML Schema solves the above problems and more, but at the cost of great complexity, one of the problems XML was supposed to solve.

 

Mark Johnson is president of Elucify Technical Communications, a Colorado-based training and consulting company dedicated to clarifying novel or complex ideas through clear explanation and examples.


Copyright © Computerworld, Inc. All rights reserved

Reproduction in whole or in part in any form or medium without express written permission of Computerworld Inc. is prohibited. Computerworld and Computerworld.com and the respective logos are trademarks of International Data Group Inc.