Mark Johnson
XML documents follow a stricter set of rules than HTML. Every open tag
must have a corresponding close tag; "empty" tags may use a shorthand
form (<tag/>); XML documents should begin with an XML declaration,
written in the <? ?> notation (XML 1.0 recommends it but does not
strictly require it); the HTML declaration is optional and doesn't use
the <? ?> notation. These rules, and many more, define a "well-formed"
document.
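As a rough illustration of the well-formedness rules above, the sketch
below feeds a few small documents to Python's standard
xml.etree.ElementTree parser, which rejects input that is not
well-formed. The helper name is_well_formed and the sample strings are
made up for this example; they are not part of any specification.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts `text` as well-formed XML."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# Hypothetical sample documents, chosen only to exercise the rules above.
samples = {
    "matching close tag": "<p>text</p>",
    "empty-tag shorthand": "<doc><br/></doc>",
    "with XML declaration": '<?xml version="1.0"?><doc/>',
    "missing close tag": "<doc><br></doc>",
}

for label, text in samples.items():
    print(label, "->", is_well_formed(text))
```

Only the last sample should be rejected, because <br> is opened but
never closed. This checks well-formedness only; it says nothing about
validity against a DTD.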
A “valid” XML document conforms to the structure defined by its
Document Type Definition (DTD). The DTD defines the valid tags and
attributes, and their permissible contents. Just as rules govern the
writing of XML documents, so too do rules govern the writing of XML
DTDs. XML’s DTD language is, to a large extent, a subset of SGML’s,
just as XML itself is a subset of SGML. As a result, some SGML DTD
features (like optional close tags) aren't allowed in XML.
Remember these basic rules when writing XML DTDs:
- Element definitions may not be repeated.
The <!ELEMENT> declaration defines a tag in the XML DTD.
Defining the same tag twice in a DTD evokes an error. For example,
<!ELEMENT p EMPTY>
<!ELEMENT p EMPTY>
A DTD containing the above element definitions is not valid, even
though the definitions are identical: XML 1.0 requires that each
element type be declared at most once.
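- Attributes should not be defined multiple times for an element.
The <!ATTLIST> declaration defines an element’s valid
attributes. An attribute should appear only once in an attribute
declaration. For example,
<!ATTLIST p something CDATA #IMPLIED
something_else CDATA #REQUIRED
something CDATA #IMPLIED>
The above declaration defines “something” twice for the element p.
XML 1.0 does not reject such a DTD outright: the first definition is
binding and later ones are ignored, and a processor may issue a
warning. For interoperability, define each attribute only once.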
- “Mixed-content models” may be defined, but the order or number of
occurrences of text or tags cannot be constrained. “Mixed-content
models” are elements that allow a mix of tags and text.
Using the <!ELEMENT> declaration, indicate zero or more
occurrences of its contents to define a “mixed-content model”.
<!ELEMENT textstuff (#PCDATA|tag1|tag2|tag3)* >
The asterisk above indicates that zero or more occurrences of
anything inside the parentheses are allowed with no other
restrictions. This rule simplifies XML parser writing.
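The rules above can be explored with a small sketch built on Python's
standard xml.parsers.expat module, which reports each <!ELEMENT>
declaration it reads from a document's internal DTD subset. Expat is a
non-validating parser, so it does not itself reject a repeated element
declaration; the code below has to look for repeats on its own. The
SAMPLE document and the inspect_dtd helper are hypothetical names
invented for this illustration.

```python
import xml.parsers.expat
from xml.parsers.expat import model as cm

# Hypothetical sample: the internal DTD subset repeats <!ELEMENT p>
# and declares one mixed-content element.
SAMPLE = """<?xml version="1.0"?>
<!DOCTYPE textstuff [
  <!ELEMENT p EMPTY>
  <!ELEMENT p EMPTY>
  <!ELEMENT textstuff (#PCDATA|tag1|tag2|tag3)*>
]>
<textstuff>some <tag2/> text</textstuff>
"""


def inspect_dtd(document):
    """Return (repeated element names, {mixed element: allowed tags})."""
    seen = set()
    duplicates = []
    mixed = {}

    def on_element_decl(name, content_model):
        # Expat calls this once per <!ELEMENT> declaration it reads.
        if name in seen:
            duplicates.append(name)
        seen.add(name)
        # The model is a tuple: (type, quantifier, name, children).
        ctype, _quant, _name, children = content_model
        if ctype == cm.XML_CTYPE_MIXED:
            # Each child is itself a model tuple; index 2 is the tag name.
            mixed[name] = [child[2] for child in children]

    parser = xml.parsers.expat.ParserCreate()
    parser.ElementDeclHandler = on_element_decl
    parser.Parse(document, True)
    return duplicates, mixed


duplicates, mixed = inspect_dtd(SAMPLE)
print("declared more than once:", duplicates)
print("mixed-content elements:", mixed)
```

The mixed-content entry lists only the tags that may appear alongside
text; as described above, the model carries no information about their
order or how often they occur.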
The XML 1.0 specification provides the best, and most complete,
information about writing “well-formed” DTDs.
http://www.w3.org/TR/2000/REC-xml-20001006
The annotated XML specification, along with many other XML resources,
is available from: http://www.xml.com.