The Golden DTD Rules
XML IN PRACTICE --- 02/15/2001



Mark Johnson

XML documents follow a stricter set of rules than HTML. Every open tag must have a corresponding close tag, and "empty" tags may use a shorthand form (<tag/>). An XML document should begin with an XML declaration, written in the <? ?> notation (for example, <?xml version="1.0"?>). HTML has no such declaration; its document type declaration, which browsers treat as optional, uses the <!DOCTYPE> notation rather than <? ?>. These rules, and many more, define a "well-formed" document.
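As a rough illustration of well-formedness, the sketch below hands two strings to the parser in Python's standard xml.etree.ElementTree module, which rejects text that breaks these rules. The helper name is_well_formed and the sample strings are assumptions made for this example, not part of the original column.

```python
import xml.etree.ElementTree as ET


def is_well_formed(document):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(document)
    except ET.ParseError:
        return False
    return True


# Matching open and close tags, plus the empty-tag shorthand.
print(is_well_formed('<?xml version="1.0"?><p>text<br/></p>'))
# HTML-style markup that never closes <br> is rejected.
print(is_well_formed("<p>text<br></p>"))
```

This checks well-formedness only; the parser used here does not validate a document against a DTD.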
A “valid” XML document is well-formed and also conforms to the structure defined by its Document Type Definition (DTD). The DTD defines valid tags and attributes, and their permissible contents. Just as rules govern the writing of XML documents, so too do rules govern the writing of XML DTDs. XML is a subset of SGML, and XML’s DTD language is, to a large extent, a subset of SGML’s. Consequently, some SGML DTD features (like optional close tags) aren't allowed in XML.

Remember these basic rules when writing XML DTDs:

  • Element definitions may not be repeated. The <!ELEMENT> declaration defines a tag in the XML DTD. Defining the same tag twice in a DTD is an error. For example,

<!ELEMENT p EMPTY>
<!ELEMENT p EMPTY>

A DTD containing the above element definitions violates the XML 1.0 validity constraint that an element type be declared only once, even though the definitions are identical.
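A non-validating parser may read repeated element declarations without complaint, so it can help to check for them directly. The sketch below is one possible way to do that with the expat bindings in Python's standard library, counting the names passed to ElementDeclHandler. The function name repeated_element_declarations and the SAMPLE document (including its doc root element) are assumptions made for this example.

```python
from collections import Counter
import xml.parsers.expat


def repeated_element_declarations(document):
    """Return the sorted names of elements declared more than once."""
    declared = []
    parser = xml.parsers.expat.ParserCreate()
    # The handler receives the element name and its content model for each
    # <!ELEMENT> declaration in the internal DTD subset.
    parser.ElementDeclHandler = lambda name, model: declared.append(name)
    parser.Parse(document, True)
    return sorted(name for name, count in Counter(declared).items() if count > 1)


# Hypothetical document whose internal DTD subset declares p twice.
SAMPLE = """<!DOCTYPE doc [
<!ELEMENT doc (p)*>
<!ELEMENT p EMPTY>
<!ELEMENT p EMPTY>
]>
<doc><p/></doc>"""

print(repeated_element_declarations(SAMPLE))
```

The check only sees declarations in the internal subset of the document it is given; a DTD kept in a separate file would need to be wrapped or loaded differently.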

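  • Attributes should not be defined multiple times for an element. The <!ATTLIST> declaration names an element and defines its valid attributes. An attribute should appear only once in an element's attribute declarations. For example,

<!ATTLIST p something CDATA #IMPLIED
something_else CDATA #REQUIRED something CDATA #IMPLIED>

The above declaration defines the attribute something twice for the element p. XML 1.0 treats the first definition as binding and ignores later ones, and a parser may issue a warning, so the repeated definition has no effect and is best removed.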
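Repeated attribute definitions can be spotted in much the same way as repeated element definitions. The sketch below assumes the attributes belong to a hypothetical element named p and uses the AttlistDeclHandler callback of Python's standard expat bindings to record each (element, attribute) pair. The function name and the SAMPLE document are assumptions made for this example.

```python
from collections import Counter
import xml.parsers.expat


def repeated_attribute_definitions(document):
    """Return sorted (element, attribute) pairs defined more than once."""
    declared = []
    parser = xml.parsers.expat.ParserCreate()

    def on_attlist(element, attribute, attr_type, default, required):
        # Called once for every attribute definition in an <!ATTLIST>.
        declared.append((element, attribute))

    parser.AttlistDeclHandler = on_attlist
    parser.Parse(document, True)
    return sorted(pair for pair, count in Counter(declared).items() if count > 1)


# Hypothetical document; the element name p is assumed for illustration.
SAMPLE = """<!DOCTYPE p [
<!ELEMENT p EMPTY>
<!ATTLIST p something CDATA #IMPLIED
            something_else CDATA #REQUIRED
            something CDATA #IMPLIED>
]>
<p something_else="x"/>"""

print(repeated_attribute_definitions(SAMPLE))
```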

  • “Mixed-content models” may be defined, but the order and number of occurrences of text and tags cannot be constrained. A “mixed-content model” describes an element that allows a mix of tags and text.

Using the <!ELEMENT> declaration, list #PCDATA first, followed by the permitted tags, and mark the whole group as occurring zero or more times to define a “mixed-content model”.

<!ELEMENT textstuff (#PCDATA|tag1|tag2|tag3)* >

The asterisk above indicates that zero or more occurrences of anything inside the parentheses are allowed with no other restrictions. This rule simplifies XML parser writing.
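Because a mixed-content model cannot constrain order or count, checking an element against one reduces to checking the names of its child tags. The sketch below shows that idea with Python's standard xml.etree.ElementTree; it is a simplified stand-in for what a validating parser does, and the function name fits_mixed_content is an assumption made for this example.

```python
import xml.etree.ElementTree as ET

# Tag names taken from the textstuff declaration above.
ALLOWED_TAGS = {"tag1", "tag2", "tag3"}


def fits_mixed_content(xml_text, allowed=ALLOWED_TAGS):
    """Check the root element's direct children against the allowed names.

    Text may appear anywhere and tags may repeat in any order, so only the
    child tag names need checking.
    """
    root = ET.fromstring(xml_text)
    return all(child.tag in allowed for child in root)


# Any order, any number of occurrences, text in between.
print(fits_mixed_content("<textstuff>a<tag2/>b<tag1/>c<tag2/></textstuff>"))
# A tag that the model does not list.
print(fits_mixed_content("<textstuff>a<other/></textstuff>"))
```

The sketch looks only at the direct children of the root element and does not check the content of tag1, tag2, or tag3 themselves.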

The XML 1.0 specification provides the best, and most complete, information about writing correct DTDs: http://www.w3.org/TR/2000/REC-xml-20001006

The annotated XML specification, along with many other XML resources, is available from: http://www.xml.com.

 

Mark Johnson is president of Elucify Technical Communications, a Colorado-based training and consulting company dedicated to clarifying novel or complex ideas through clear explanation and examples.


Copyright © Computerworld, Inc. All rights reserved

Reproduction in whole or in part in any form or medium without express written permission of Computerworld Inc. is prohibited. Computerworld and Computerworld.com and the respective logos are trademarks of International Data Group Inc.