Document Type Definition (DTD)

This is a cheat sheet for the DTD language. It is a follow-up to the XML Cheat Sheet and summarizes this DTD Tutorial. There is also an XML Schema Cheat Sheet.

For reference, see the XML Specification (which also defines DTDs) and the version annotated by Tim Bray.

  • A Document Type Definition (DTD) defines the legal building blocks of an XML document. It defines the document structure with a list of legal elements and attributes.
  • A DTD can be internally defined e.g.
    <!DOCTYPE root-element [
      element-declarations
    ]>

    or externally defined e.g.

    <!DOCTYPE root-element SYSTEM "filename.dtd">

    where the element-declarations are in filename.dtd

  • A simple example of DTD element-declarations
      <!ELEMENT note (to,from)>
      <!ELEMENT to      (#PCDATA)>
      <!ELEMENT from    (#PCDATA)>

    This says that note contains exactly two child elements, to followed by from, and that to and from each contain parsed character data (#PCDATA).

  • From a DTD's point of view, an XML document is made up of the following building blocks
    • elements
    • attributes
    • entities (e.g. character entities such as &lt;)
    • PCDATA (parsed character data: text in which the parser expands entities and recognizes markup)
    • CDATA (character data that the parser does not examine for entities or markup)
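
As a minimal sketch of how these building blocks fit together, the document below carries its DTD in an internal subset. The body element, the priority attribute, the sig entity and the sample text are assumptions added for illustration only.

```xml
<?xml version="1.0"?>
<!DOCTYPE note [
  <!-- elements: note holds to, from and body, in that order -->
  <!ELEMENT note (to,from,body)>
  <!ELEMENT to   (#PCDATA)>
  <!ELEMENT from (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
  <!-- attribute: an enumerated type with a default value -->
  <!ATTLIST note priority (low|high) "low">
  <!-- entity: a shortcut for a piece of standard text -->
  <!ENTITY sig "Kind regards">
]>
<note priority="high">
  <to>Tove</to>
  <from>Jani</from>
  <!-- PCDATA: the parser expands the sig entity reference here.
       CDATA section: its content is taken literally, not parsed. -->
  <body>&sig;, <![CDATA[<not markup> & not an entity]]></body>
</note>
```

A validating parser should accept this document; moving from before to, or omitting body, would make it invalid against the declarations above.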

DTD Elements

  • An element is defined as
    <!ELEMENT element-name category>

    or

    <!ELEMENT element-name (element-content)>
  • category can be either EMPTY or ANY
  • element-content is either mixed content, written (#PCDATA) or (#PCDATA|a|b)*, or a content model built from child elements with the operators below
  • child_element+ - one or more occurrences
  • child_element* - zero or more occurrences
  • child_element? - zero or one occurrence
  • (a|b) - either a or b; #PCDATA may only appear as the first alternative, and the group must then be written (#PCDATA|a|b)*
  • a,b - a and then b in that order
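
The operators above can be combined in one content model. The following declarations are a sketch using hypothetical element names (library, book, summary and so on), not taken from any real schema.

```xml
<!-- one or more book children -->
<!ELEMENT library (book+)>

<!-- title, then zero or more authors, then an optional year,
     then either isbn or issn, then an optional summary -->
<!ELEMENT book (title, author*, year?, (isbn|issn), summary?)>

<!ELEMENT title  (#PCDATA)>
<!ELEMENT author (#PCDATA)>
<!ELEMENT year   (#PCDATA)>
<!ELEMENT isbn   (#PCDATA)>
<!ELEMENT issn   (#PCDATA)>

<!-- mixed content: text interleaved with em elements;
     #PCDATA comes first and the group ends with * -->
<!ELEMENT summary (#PCDATA|em)*>
<!ELEMENT em (#PCDATA)>

<!-- category forms: EMPTY allows no content, ANY allows any declared content -->
<!ELEMENT pagebreak EMPTY>
<!ELEMENT appendix  ANY>
```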

DTD Attributes

  • An attribute specified in a DTD takes the form
    <!ATTLIST element-name
      attribute-name attribute-type default-value
      attribute-name2 attribute-type2 default-value2
         ... >

    for example:

    <!ATTLIST payment
       currency CDATA "US Dollars"
       amount CDATA #REQUIRED
       form (Cash|Cheque) "Cheque"> 
  • attribute-type can be one of the following
    • CDATA - character data
    • (en1|en2|..) - one from an enumerated list
    • ID - a unique id
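    • IDREF - the id of another element in the same document
    • IDREFS - a whitespace-separated list of ids of other elements
    • NMTOKEN - a name token (name characters only; unlike an XML name it may start with a digit)
    • NMTOKENS - a whitespace-separated list of name tokens
    • ENTITY - the name of an unparsed entity declared in the DTD
    • ENTITIES - a whitespace-separated list of unparsed entity names
    • NOTATION - the name of a notation declared in the DTD
    • xml: - not a type in its own right; the xml: prefix marks reserved attribute names such as xml:lang and xml:space, which are declared with one of the types above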
  • default-value can be one of the following
    • value - the default value of the attribute
    • #REQUIRED - attribute is required but with no default
    • #IMPLIED - attribute is optional and with no default
    • #FIXED value - the attribute always has this value; a document may omit the attribute or repeat the value, but may not change it
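
To illustrate several attribute types and default-value forms together, here is a small sketch. The staff and person elements and all attribute names and values are assumptions made up for this example.

```xml
<?xml version="1.0"?>
<!DOCTYPE staff [
  <!ELEMENT staff (person+)>
  <!ELEMENT person EMPTY>
  <!ATTLIST person
     id      ID                 #REQUIRED
     manager IDREF              #IMPLIED
     role    (staff|contractor) "staff"
     tags    NMTOKENS           #IMPLIED
     country CDATA              #FIXED "NZ">
]>
<staff>
  <!-- only the required id is given; role defaults to "staff"
       and country is supplied as the fixed value "NZ" -->
  <person id="p1"/>
  <!-- manager must match the ID of some element in this document;
       the tokens in tags may start with a digit -->
  <person id="p2" manager="p1" role="contractor" tags="2024 remote"/>
</staff>
```

A validating parser would be expected to reject a second person with id="p1" (IDs must be unique), a manager value that matches no id, or a country value other than "NZ".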

DTD Entities

  • Entities are variables used to define shortcuts to standard text or special characters.
  • Entities are declared in DTDs and can be declared internally e.g.
    <!ENTITY writer "Donald Duck.">

    or externally e.g.

    <!ENTITY writer SYSTEM "http://a.com/entities.dtd">
  • An entity reference in an XML document is replaced by the entity's value when the document is parsed.
  • An entity reference has three parts: an ampersand (&), an entity name, and a semicolon (;), e.g.
    <author>&writer;</author>
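
Putting the declaration and the reference together, a complete document might look like the sketch below. The book and copyright elements and the copy entity are assumptions added for illustration; writer is the entity declared above.

```xml
<?xml version="1.0"?>
<!DOCTYPE book [
  <!ELEMENT book (author, copyright)>
  <!ELEMENT author    (#PCDATA)>
  <!ELEMENT copyright (#PCDATA)>
  <!-- internal entity: a shortcut for standard text -->
  <!ENTITY writer "Donald Duck.">
  <!-- internal entity: a shortcut for a special character (the copyright sign) -->
  <!ENTITY copy "&#169;">
]>
<book>
  <!-- parsed as: <author>Donald Duck.</author> -->
  <author>&writer;</author>
  <copyright>&copy; &writer;</copyright>
</book>
```

Referencing an entity that has not been declared, for example &editor; here, is an error, so each entity used in the document needs a matching declaration in the DTD.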