XML Schema

This is a cheat sheet for the XML Schema language. It is a follow-up to the XML Cheat Sheet and summarizes this XML Schema Tutorial. There is also a DTD Cheat Sheet.

For reference, this is the XML Schema homepage, which includes the specifications and a primer.

  • XML Schema is an XML-based alternative to DTDs. It is the successor to DTDs because it is richer and more extensible. It describes the structure of an XML document.
  • The XML Schema language is also referred to as XML Schema Definition (XSD).
  • An XML Schema defines which elements can appear in an XML document, what their order and relationships are, and how many of them there can be. It also defines data types and fixed and default values for elements and attributes.
  • A simple XML Schema:
    <?xml version="1.0"?>
    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    targetNamespace="http://a.com" xmlns="http://a.com"
    elementFormDefault="qualified">
    
      <xs:element name="note">
        <xs:complexType>
          <xs:sequence>
            <xs:element name="to" type="xs:string"/>
            <xs:element name="from" type="xs:string"/>
          </xs:sequence>
        </xs:complexType>
      </xs:element>
    
    </xs:schema> 
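  • A reference to this schema in a note XML document would look like the following. The default namespace must match the schema's targetNamespace, and xsi:schemaLocation pairs that namespace with the location of the schema file.
    <note xmlns="http://a.com"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://a.com note.xsd">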
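  • As a rough sketch, a complete note document that should validate against the schema above could look like this. It assumes the schema is saved as note.xsd alongside the document and that the default namespace is set to the schema's targetNamespace (http://a.com); the element values are invented for illustration.

    ```xml
    <?xml version="1.0"?>
    <note xmlns="http://a.com"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://a.com note.xsd">
      <!-- Children inherit the default namespace, which is required
           because the schema sets elementFormDefault="qualified". -->
      <to>Tove</to>
      <from>Jani</from>
    </note>
    ```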

Simple Types

  • A simple element contains only text, no other elements or attributes. But the text may be any of the XSD types or a custom type and may have restrictions on it. e.g.
     <xs:element name="start_date" type="xs:date"/>
  • Simple XSD types include xs:string, xs:decimal, xs:integer, xs:boolean, xs:date, xs:time.
  • An attribute is always a simple type (even though a simple element can't have any attributes) e.g.
    <xs:attribute name="start_date" type="xs:date"/>
  • A simple element or an attribute can also have a default or fixed value e.g. default="red" or fixed="red".
  • By default an attribute is optional. To make it required, add use="required".
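  • The default, fixed and use settings can be sketched together as below. All element and attribute names here (color, currency, item, lang, id, version) are hypothetical. Attributes that carry use="required" have to be declared inside a complex type, so a small one is included only as a container.

    ```xml
    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

      <!-- An empty <color/> is treated as if it contained "red". -->
      <xs:element name="color" type="xs:string" default="red"/>

      <!-- <currency> must be empty or contain exactly "EUR". -->
      <xs:element name="currency" type="xs:string" fixed="EUR"/>

      <xs:element name="item">
        <xs:complexType>
          <!-- Optional; reads as "en" when the attribute is omitted. -->
          <xs:attribute name="lang" type="xs:string" default="en"/>
          <!-- Must be present on every <item>. -->
          <xs:attribute name="id" type="xs:integer" use="required"/>
          <!-- Optional; if present its value must be "1.0". -->
          <xs:attribute name="version" type="xs:string" fixed="1.0"/>
        </xs:complexType>
      </xs:element>

    </xs:schema>
    ```

    Note that default cannot be combined with use="required": a default only makes sense for an attribute that may be left out.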

Facets / Restrictions

  • XML Facets are restrictions on the acceptable values for elements or attributes.
  • A simple example which can be used to restrict age to be between 0 and 120 inclusive. This example inlines the type :
    <xs:element name="age">
      <xs:simpleType>
        <xs:restriction base="xs:integer">
          <xs:minInclusive value="0"/>
          <xs:maxInclusive value="120"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:element> 
  • Another example to restrict the model of cars to be one from a list. This example creates the type with a name so it can be used by multiple elements.
    <xs:element name="car" type="carType"/>
    
    <xs:simpleType name="carType">
      <xs:restriction base="xs:string">
        <xs:enumeration value="Audi"/>
        <xs:enumeration value="Golf"/>
        <xs:enumeration value="BMW"/>
      </xs:restriction>
    </xs:simpleType>
  • Restriction of a string to a specific regexp pattern (one or more lowercase letters):
      <xs:restriction base="xs:string">
        <xs:pattern value="([a-z])+"/>
      </xs:restriction>
  • Whitespace restrictions control how whitespace in a value is handled:
      <xs:restriction base="xs:string">
        <xs:whiteSpace value="preserve"/>
      </xs:restriction>
    • preserve means to leave whitespace alone
    • replace means to replace each tab, line feed and carriage return with a space
    • collapse means to do the same as replace, then collapse each sequence of spaces to a single space and remove leading and trailing spaces.
  • A list of all possible restrictions:
    Restriction     Description
    enumeration     Defines a list of acceptable values
    fractionDigits  Specifies the maximum number of decimal places allowed. Must be equal to or greater than zero
    length          Specifies the exact number of characters or list items allowed. Must be equal to or greater than zero
    maxExclusive    Specifies the upper bound for numeric values (the value must be less than this value)
    maxInclusive    Specifies the upper bound for numeric values (the value must be less than or equal to this value)
    maxLength       Specifies the maximum number of characters or list items allowed. Must be equal to or greater than zero
    minExclusive    Specifies the lower bound for numeric values (the value must be greater than this value)
    minInclusive    Specifies the lower bound for numeric values (the value must be greater than or equal to this value)
    minLength       Specifies the minimum number of characters or list items allowed. Must be equal to or greater than zero
    pattern         Defines a regular expression that the value must match
    totalDigits     Specifies the maximum number of digits allowed. Must be greater than zero
    whiteSpace      Specifies how white space (line feeds, tabs, spaces, and carriage returns) is handled
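  • Several facets can be combined in one restriction. The sketch below shows three hypothetical named types (priceType, passwordType, zipType) that are not part of the original examples; the limits are arbitrary and chosen only to illustrate the facets.

    ```xml
    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

      <!-- Non-negative amount with at most 6 digits in total,
           of which at most 2 may follow the decimal point.
           1234.56 is accepted; 1234.567 and 1234567 are rejected. -->
      <xs:simpleType name="priceType">
        <xs:restriction base="xs:decimal">
          <xs:minInclusive value="0"/>
          <xs:totalDigits value="6"/>
          <xs:fractionDigits value="2"/>
        </xs:restriction>
      </xs:simpleType>

      <!-- Between 8 and 20 characters long. -->
      <xs:simpleType name="passwordType">
        <xs:restriction base="xs:string">
          <xs:minLength value="8"/>
          <xs:maxLength value="20"/>
        </xs:restriction>
      </xs:simpleType>

      <!-- Exactly five decimal digits. -->
      <xs:simpleType name="zipType">
        <xs:restriction base="xs:string">
          <xs:pattern value="[0-9]{5}"/>
        </xs:restriction>
      </xs:simpleType>

      <xs:element name="price" type="priceType"/>

    </xs:schema>
    ```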

Complex Elements

  • Complex elements are elements that are not simple. There are 4 types :
    • empty elements
    • elements that contain only other elements
    • elements that contain only text
    • elements that contain both other elements and text
  • An example of a named complex type that contains only other elements :
    <xs:element name="employee" type="personinfo"/>
    <xs:element name="student" type="personinfo"/>
    
    <xs:complexType name="personinfo">
      <xs:sequence>
        <xs:element name="firstname" type="xs:string"/>
        <xs:element name="lastname" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>

    Complex types can also be inlined as with simple types.

  • Complex types can also extend other complex types, e.g.
    <xs:complexType name="fullpersoninfo">
      <xs:complexContent>
        <xs:extension base="personinfo">
          <xs:sequence>
            <xs:element name="address" type="xs:string"/>
            <xs:element name="city" type="xs:string"/>
            <xs:element name="country" type="xs:string"/>
          </xs:sequence>
        </xs:extension>
      </xs:complexContent>
    </xs:complexType> 
  • Example of an empty element (attributes only) :
    <xs:complexType name="prodtype">
      <xs:attribute name="prodid" type="xs:positiveInteger"/>
    </xs:complexType>
  • Complex text-only elements contain only simple content (text and attributes), so we add a simpleContent element around the content. When using simple content, you must define an extension OR a restriction within the simpleContent element. e.g.
    <xs:complexType name="shoetype">
      <xs:simpleContent>
        <xs:extension base="xs:integer">
          <xs:attribute name="country" type="xs:string" />
        </xs:extension>
      </xs:simpleContent>
    </xs:complexType>
  • In a mixed-content complex-type, character data can appear between the child-elements. This is done by setting mixed to true e.g.
    <xs:complexType name="lettertype" mixed="true">
      <xs:sequence>
        <xs:element name="name" type="xs:string"/>
        <xs:element name="orderid" type="xs:positiveInteger"/>
      </xs:sequence>
    </xs:complexType>
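  • To illustrate mixed content, an instance of the lettertype above might look like the following sketch. It assumes a hypothetical declaration such as <xs:element name="letter" type="lettertype"/>; the text and values are invented.

    ```xml
    <?xml version="1.0"?>
    <!-- Character data may appear before, between and after the child
         elements, but name and orderid must still occur in sequence order. -->
    <letter>
      Dear Mr. <name>John Smith</name>,
      your order <orderid>1032</orderid> will be shipped soon.
    </letter>
    ```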

Element Indicators

  • Indicators control how elements are to be used within a complex element. There are three types of indicators: order indicators, occurrence indicators and group indicators.
  • Order Indicators include :
    • all - child elements can appear in any order, and each can occur at most once (exactly once unless minOccurs="0")
    • choice - exactly one of the listed elements can occur
    • sequence - elements must occur in the order given
  • Occurrence Indicators include :
    • maxOccurs - maximum number of times an element can occur, or "unbounded" (default is 1)
    • minOccurs - minimum number of times an element can occur (default is 1)
  • There are two types of groups: element groups and attribute groups. Groups define related sets of elements or attributes. Once a group is created it can be referenced elsewhere.
  • Example of an element group. Note that an order indicator must appear within a group element.
    <xs:group name="persongroup">
      <xs:sequence>
        <xs:element name="firstname" type="xs:string"/>
        <xs:element name="lastname" type="xs:string"/>
      </xs:sequence>
    </xs:group>

    A reference to it within a sequence would look like

    <xs:group ref="persongroup"/>

  • An attribute group is similar, for example:
    <xs:attributeGroup name="personattrgroup">
      <xs:attribute name="firstname" type="xs:string"/>
      <xs:attribute name="lastname" type="xs:string"/>
    </xs:attributeGroup>

    A reference to the attribute group in a complex type would look like

    <xs:attributeGroup ref="personattrgroup"/>
  • Using <xs:any /> as an element allows elements not specified in the schema to occur.
  • Using <xs:anyAttribute/> in a complex type allows use of attributes not specified by the schema.
  • A substitution group allows other elements to substitute for a head element. Both the head element and the substitutable elements must be global elements (direct children of the schema element) e.g.
    <xs:element name="name" type="xs:string"/>
    <xs:element name="navn" substitutionGroup="name"/>

    Substitution can be blocked with

    <xs:element name="name" type="xs:string" block="substitution"/>
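  • The indicators can be combined in one content model. The following is a minimal sketch, assuming a hypothetical person element; the auditattrs attribute group and the email, phone and child_name elements are invented for illustration.

    ```xml
    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

      <xs:group name="persongroup">
        <xs:sequence>
          <xs:element name="firstname" type="xs:string"/>
          <xs:element name="lastname" type="xs:string"/>
        </xs:sequence>
      </xs:group>

      <xs:attributeGroup name="auditattrs">
        <xs:attribute name="created" type="xs:date"/>
        <xs:attribute name="author" type="xs:string"/>
      </xs:attributeGroup>

      <xs:element name="person">
        <xs:complexType>
          <xs:sequence>
            <!-- firstname then lastname, pulled in from the group -->
            <xs:group ref="persongroup"/>
            <!-- exactly one of email or phone -->
            <xs:choice>
              <xs:element name="email" type="xs:string"/>
              <xs:element name="phone" type="xs:string"/>
            </xs:choice>
            <!-- any number of child_name elements, including none -->
            <xs:element name="child_name" type="xs:string"
                        minOccurs="0" maxOccurs="unbounded"/>
          </xs:sequence>
          <!-- attribute declarations follow the content model -->
          <xs:attributeGroup ref="auditattrs"/>
        </xs:complexType>
      </xs:element>

    </xs:schema>
    ```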

Data Types

String Data Types

  • Apart from xs:string there are two other string types :
    • xs:normalizedString - contains no CR, LF or TAB characters (the processor replaces them with spaces)
    • xs:token - contains no CR, LF or TAB characters, no leading or trailing spaces, and no sequences of more than one space (the processor collapses the whitespace)
  • There are many other types derived from string e.g. language, Name, NCName, NMTOKEN, ID, IDREF. (QName is a separate primitive type and is not derived from string.)
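  • The difference between the three string types is mostly in whitespace handling. A small sketch, using hypothetical element names (entry, raw, line, tag), with the expected effect of each type noted in the comments:

    ```xml
    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
      <xs:element name="entry">
        <xs:complexType>
          <xs:sequence>
            <!-- xs:string: whitespace is preserved as written -->
            <xs:element name="raw" type="xs:string"/>
            <!-- xs:normalizedString: each tab, line feed and
                 carriage return becomes a single space -->
            <xs:element name="line" type="xs:normalizedString"/>
            <!-- xs:token: as above, then runs of spaces collapse to one
                 and leading/trailing spaces are dropped, so
                 "  hello   world " is read as "hello world" -->
            <xs:element name="tag" type="xs:token"/>
          </xs:sequence>
        </xs:complexType>
      </xs:element>
    </xs:schema>
    ```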

Date Data Types

  • The date data type is used to specify a date in the format “YYYY-MM-DD” where all components are required.
  • A time data type must be specified in the following format “hh:mm:ss” where all components are required.
  • A dateTime datatype must be specified in the following format “YYYY-MM-DDThh:mm:ss” e.g. 2002-05-30T09:00:00
  • A timezone can be added to a date/time/dateTime by adding a Z (for UTC) or a signed offset at the end e.g. 2002-09-24Z , 2002-09-24+06:00
  • A duration data type must be specified in the following format “[-]PnYnMnDTnHnMnS”. P is required, T is required if any time component is used, and the other parts are optional e.g. P5Y, P5Y2M10DT15H, -P1Y
  • This is a list of all date types:
    Type        Description
    date        Defines a date value (“YYYY-MM-DD”)
    dateTime    Defines a date and time value (“YYYY-MM-DDThh:mm:ss”)
    duration    Defines a time interval ([-]PnYnMnDTnHnMnS)
    gDay        Defines a part of a date - the day (---DD)
    gMonth      Defines a part of a date - the month (--MM)
    gMonthDay   Defines a part of a date - the month and day (--MM-DD)
    gYear       Defines a part of a date - the year (YYYY)
    gYearMonth  Defines a part of a date - the year and month (YYYY-MM)
    time        Defines a time value (“hh:mm:ss”)
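  • For illustration, these are values that should be valid for some of the date types. The element names are hypothetical and no schema is shown; the type assumed for each element is given in the comment next to it.

    ```xml
    <?xml version="1.0"?>
    <event>
      <day>2002-09-24Z</day>                   <!-- xs:date in UTC -->
      <opens>09:00:00+06:00</opens>            <!-- xs:time with a signed offset -->
      <starts>2002-05-30T09:00:00</starts>     <!-- xs:dateTime, no timezone -->
      <length>PT1H30M</length>                 <!-- xs:duration: 1 hour 30 minutes -->
      <warranty>P5Y2M10DT15H</warranty>        <!-- xs:duration: 5 years, 2 months, 10 days, 15 hours -->
      <billing_month>2002-05</billing_month>   <!-- xs:gYearMonth -->
    </event>
    ```

    In a duration the T separates the date part from the time part, which is why PT1H30M needs it even though no date components are present.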

Numeric Data types

  • The xs:decimal data type specifies a positive or negative numeric value (with fractional part).
  • The xs:integer data type specifies a positive or negative integer.
  • These are the numeric types derived from the decimal type (decimal itself is listed for completeness):
    Name                Description
    byte                A signed 8-bit integer
    decimal             A decimal value
    int                 A signed 32-bit integer
    integer             An integer value
    long                A signed 64-bit integer
    negativeInteger     An integer containing only negative values (.., -2, -1)
    nonNegativeInteger  An integer containing only non-negative values (0, 1, 2, ..)
    nonPositiveInteger  An integer containing only non-positive values (.., -2, -1, 0)
    positiveInteger     An integer containing only positive values (1, 2, ..)
    short               A signed 16-bit integer
    unsignedLong        An unsigned 64-bit integer
    unsignedInt         An unsigned 32-bit integer
    unsignedShort       An unsigned 16-bit integer
    unsignedByte        An unsigned 8-bit integer

Other Data Types

  • xs:boolean is a boolean type whose value must be true or false (1 and 0 are also accepted)
  • xs:base64Binary and xs:hexBinary specify base-64 and hexadecimal encoded binary data respectively
  • xs:anyURI can be used for any URI (a URL or a URN).
  • There are also xs:double and xs:float data types.
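  • As a brief sketch, values of these types could appear in a document as follows. The element names are hypothetical and the assumed type of each element is given in its comment.

    ```xml
    <?xml version="1.0"?>
    <attachment>
      <inline>true</inline>          <!-- xs:boolean -->
      <checksum>0FB7A1</checksum>    <!-- xs:hexBinary: two hex digits per byte -->
      <payload>SGVsbG8=</payload>    <!-- xs:base64Binary: the bytes of "Hello" -->
      <source>http://example.com/files/report.pdf</source>   <!-- xs:anyURI -->
      <ratio>1.5E2</ratio>           <!-- xs:double: exponent notation is allowed -->
    </attachment>
    ```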