ITworld.com
XML with an Accent

XML IN PRACTICE --- 11/23/2000



Mark Johnson

HTML programmers are accustomed to having a large set of valid character entities for producing characters outside the printable ASCII range from space to ~ (hex 20 through hex 7E). These characters include:

  • "international" characters, like characters with an accent (à = a with a grave accent) or a circumflex (ô = o with a circumflex or caret above it);
  • "special characters" like œ, the smashed-together o and e in archaic spellings like "encyclopœdia";
  • various symbols like the letters of the Greek alphabet (α, β and so on) and the "for all" symbol (∀ = an upside-down capital "A").




But how do you encode such characters in XML?

Character entities, like all other entities, can be defined in a DTD with an <!ENTITY> declaration. XHTML (the new reformulation of HTML 4 as an XML document type) defines these entities, but XML itself does not, with the exception of the five predefined entities &amp;, &lt;, &gt;, &apos; and &quot;. So, what do you do if you want to use these characters in XML?
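
For one or two characters, you can write the definition yourself. The following is a minimal sketch, not taken from the original column: the Greeting element is a hypothetical name chosen for illustration, and the agrave declaration maps the entity name to the character's decimal Unicode code point (224, hex E0).

```xml
<?xml version="1.0"?>
<!DOCTYPE Greeting [
  <!ELEMENT Greeting (#PCDATA)>
  <!-- Declare one character entity by hand: agrave = U+00E0 (decimal 224) -->
  <!ENTITY agrave "&#224;">
]>
<Greeting>Voil&agrave;</Greeting>
```

A parser that reads the internal DTD subset should deliver the element's text as "Voilà". Declaring every character this way quickly becomes tedious, which is where ready-made entity files help.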

The World Wide Web Consortium (W3C), an international consortium dedicated to open Web standards, provides three entity definition files as part of XHTML. These files define the character entities for XHTML, but they're usable in any XML document as well. You simply have to include the contents of those files in the DTD for your document. These files are:

  • http://www.w3.org/TR/xhtml1/DTD/xhtml-lat1.ent (Latin characters)
  • http://www.w3.org/TR/xhtml1/DTD/xhtml-special.ent (Special characters)
  • http://www.w3.org/TR/xhtml1/DTD/xhtml-symbol.ent (Symbols)

To include one of these files in your DTD, place the following declaration and reference in your DTD:

<!ENTITY % HTMLsymbol PUBLIC
    "-//W3C//ENTITIES Symbols for XHTML//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml-symbol.ent">
%HTMLsymbol;

(This is the declaration for xhtml-symbol. Change the entity name, public identifier and URL accordingly to use the other two files.) The http: URL above indicates the .ent file's location. Because you may not always be online when the document is parsed, you may prefer to copy the .ent file to a local directory and replace the http: URL with a reference to the local file.
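
As a sketch of the local-copy approach, assume xhtml-symbol.ent has been saved in the same directory as the document (or external DTD) that contains the declaration. The system identifier can then be a relative URI. Only the last quoted string changes; the public identifier stays the same.

```xml
<!-- Assumption: xhtml-symbol.ent sits next to the file containing this declaration -->
<!ENTITY % HTMLsymbol PUBLIC
    "-//W3C//ENTITIES Symbols for XHTML//EN"
    "xhtml-symbol.ent">
%HTMLsymbol;
```

A relative system identifier is resolved against the location of the file in which the declaration appears, so the document and the .ent file can be moved together without editing the DTD.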

The following small XML document demonstrates the use of these character entities:

<?xml version="1.0"?>

<!-- Start DTD -->
<!DOCTYPE ThisDoc [
<!ELEMENT ThisDoc (AnyEntity)>
<!ELEMENT AnyEntity (#PCDATA)>

<!-- Define entities for symbols -->
<!ENTITY % HTMLsymbol PUBLIC
    "-//W3C//ENTITIES Symbols for XHTML//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml-symbol.ent">
%HTMLsymbol;

<!-- Define entities for special characters -->
<!ENTITY % HTMLspecial PUBLIC
    "-//W3C//ENTITIES Special for XHTML//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml-special.ent">
%HTMLspecial;

<!-- Define entities for Latin and other characters -->
<!ENTITY % HTMLlat1 PUBLIC
    "-//W3C//ENTITIES Latin 1 for XHTML//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml-lat1.ent">
%HTMLlat1;

]> <!-- End DTD -->

<!-- Start document -->
<ThisDoc>
  <AnyEntity>
    Try some international characters: &aacute; &zeta; &eta; &theta; &dagger;
  </AnyEntity>
</ThisDoc>
<!-- End document -->

<!-- End of example -->
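
The characters used in the example above, á ζ η θ †, correspond to the XHTML entity names aacute, zeta, eta, theta and dagger. As an illustration that is not part of the original example, the same document can be sketched with numeric character references, which need no DTD at all:

```xml
<?xml version="1.0"?>
<ThisDoc>
  <AnyEntity>
    <!-- aacute=225, zeta=950, eta=951, theta=952, dagger=8224 (decimal code points) -->
    Try some international characters: &#225; &#950; &#951; &#952; &#8224;
  </AnyEntity>
</ThisDoc>
```

The named entities are easier to read and remember; the numeric form is useful when a parser cannot fetch the external entity files.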

 

Mark Johnson is president of Elucify Technical Communications, a Colorado-based training and consulting company dedicated to clarifying novel or complex ideas through clear explanation and examples.


Copyright © Computerworld, Inc. All rights reserved

Reproduction in whole or in part in any form or medium without express written permission of Computerworld Inc. is prohibited. Computerworld and Computerworld.com and the respective logos are trademarks of International Data Group Inc.