ITworld.com
Well-formed vs. Valid XML Documents

XML IN PRACTICE --- 11/02/2000



Mark Johnson

The XML specification states that XML documents must be "well-formed" to be considered XML. The specification also talks about "valid" documents. Of course, an XML document might be "well-formed" without being "valid".




A well-formed document complies with all of the well-formedness constraints in the XML 1.0 specification document. These constraints include things such as:

  • Tag nesting may not overlap. For example, <a><b></a></b> is not well-formed, because the "b" tag does not close before its enclosing "a" tag.
  • Special characters, such as < and &, must be represented as "entities", which keeps the XML parser from getting confused. You've probably seen entities in HTML. They look like: &amp; (for &) and &lt; (for <).
  • All references to external information must be resolved. For example, other files included in any XML file must be present at the time of parsing.
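The first two constraints above are easy to try out. Here is a minimal sketch, assuming Python's standard xml.etree.ElementTree module stands in for "the XML parser"; the helper name is_well_formed is made up for this illustration and is not part of any library.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text, False if it reports an error.

    Hypothetical helper for illustration only.
    """
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Overlapping tags: "b" is still open when "a" closes.
overlapping = "<a><b></a></b>"
# Properly nested version of the same tags.
nested = "<a><b></b></a>"
# A bare ampersand in text, versus the same text using the &amp; entity.
bare_ampersand = "<a>fish & chips</a>"
escaped = "<a>fish &amp; chips</a>"

for sample in (overlapping, nested, bare_ampersand, escaped):
    print(sample, "->", is_well_formed(sample))

# The parser turns the entity back into the character it stands for.
decoded_text = ET.fromstring(escaped).text
```

The parser rejects both the overlapping tags and the bare ampersand, and accepts the other two samples.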

A valid document matches the grammar defined in its Document Type Definition (DTD). A DTD describes the structure an XML file must have in order to be valid. A DTD optionally appears at the top of an XML file and describes the valid tag names, the order of the tags, the allowed attribute values, and so on.

A file might be well-formed, but still not comply with the rules described in its DTD. Valid files, however, are both well-formed and consistent with the grammar the DTD defines. All valid documents are well-formed, but not vice versa. Documents that are never validated need not have a Document Type Definition at all.
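To make the distinction concrete, here is a small sketch of a document that is well-formed but not valid. The "note" vocabulary and its DTD are invented for this example. The sketch assumes Python's standard xml.etree.ElementTree parser, which reads the DTD but does not validate against it, so the final comparison is a hand-written stand-in for one DTD rule, not real DTD validation. A validating parser would be needed for that.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the DTD at the top says a <note> must contain
# a <to> element followed by a <body> element, but <to> is missing.
DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE note [
  <!ELEMENT note (to, body)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
]>
<note><body>Lunch at noon?</body></note>
"""

# Parsing succeeds: tags nest properly and nothing needs escaping,
# so the document is well-formed.
root = ET.fromstring(DOCUMENT)

# Stand-in for the DTD rule "note (to, body)": compare the child tags
# with the sequence the DTD requires.
child_tags = [child.tag for child in root]
matches_dtd = child_tags == ["to", "body"]

print("well-formed: True")
print("children:", child_tags)
print("matches the DTD's content model:", matches_dtd)
```

Adding a <to> element before <body> would make the same document satisfy this rule as well.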

You can read the XML 1.0 specification for yourself at: http://www.w3.org/TR/1998/REC-xml-19980210

For more on validation and DTDs, see:
http://faq.oreillynet.com//XML/fetch.pl?CompanyID=414&ContentID=174&FaqID=149&word=writing%20a%20dtd&faq_template=http://faq.oreillynet.com//XML/searchfaq.html&topic=&back_refr=http://faq.oreillynet.com//XML/&topicname=SGML/HTML%20authors

 

Mark Johnson is president of Elucify Technical Communications, a Colorado-based training and consulting company dedicated to clarifying novel or complex ideas through clear explanation and examples.


Copyright © Computerworld, Inc. All rights reserved

Reproduction in whole or in part in any form or medium without express written permission of Computerworld Inc. is prohibited. Computerworld and Computerworld.com and the respective logos are trademarks of International Data Group Inc.