XSD

The term schema is commonly used in the database community, where it refers to the organization or
structure of a database. In the XML community, it refers to the structure (or model) of a class of
documents. This model describes the hierarchy of elements and the allowable content in a valid XML
document. In other words, the schema defines the constraints for an XML vocabulary.

The limitations imposed by DTDs have made new standards for defining XML documents desirable. The
XML Schema Definition (XSD) language, whose schemas are often simply called XML schemas, is a formal
language for defining the structure of a class of XML documents. The sheer volume of text involved in
defining the XML schema language can be overwhelming to an XML novice, or even to someone making
the move from DTDs to XML schemas. As stated before our detour into namespaces, XML schemas evolved
in response to problems with DTDs, the W3C’s first attempt at data validation. DTDs are a legacy
inherited from SGML to provide content validation and, although they do a good job of validating XML,
there is room for improvement. Some of the more important concerns expressed about DTDs are the
following:

  DTDs use Extended Backus-Naur Form (EBNF) syntax, which is dissimilar to XML.

  DTDs aren’t intuitive, and they can be difficult to interpret from a human-readable point of view.

  The metadata of DTDs is programmatically difficult to consume.

  DTDs offer almost no support for data types; element content cannot be declared as, say, a number or a date.

  DTDs cannot be inherited.
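
The first and fourth concerns can be illustrated with a small sketch in Python. The `price` element is hypothetical and is not taken from this chapter. The sketch assumes only the standard library's `xml.etree.ElementTree` parser and shows that a DTD declaration is not itself XML, so a general-purpose XML parser cannot read it. The declaration also has no way to say that the content of `price` must be a number.

```python
import xml.etree.ElementTree as ET

# Hypothetical DTD declaration: the content of <price> can only be
# described as character data; there is no way to say "decimal".
DTD_DECLARATION = "<!ELEMENT price (#PCDATA)>"


def is_well_formed_xml(text):
    """Return True if text parses as a standalone XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# The DTD declaration uses its own syntax and is rejected by the XML parser.
print(is_well_formed_xml(DTD_DECLARATION))
# An ordinary XML element is accepted.
print(is_well_formed_xml("<price>9.99</price>"))
```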

To address these concerns, the W3C developed a new validating mechanism, called XML schemas, to
replace DTDs. Schemas provide the same features DTDs provide, but they were designed with the previous
issues in mind and thus are more powerful and flexible. The design principles outlined by the XML
Schema Requirements document are fairly straightforward. According to that document, the XML schema
language should be as follows:

  More expressive than XML DTDs

  Expressed in XML

  Self-describing

  Usable in a wide variety of applications that employ XML

  Straightforwardly usable on the Internet

  Optimized for interoperability

  Simple enough to implement with modest design and runtime resources

  Coordinated with relevant W3C specs, such as XML Information Set, XML Linking Language
  (XLink), Namespaces in XML, Document Object Model (DOM), HTML, and the Resource
  Description Framework (RDF) schema
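
To make the "expressed in XML" and "self-describing" goals concrete, here is a minimal sketch in Python. The `price` element and its `xs:decimal` type are made up for illustration and do not come from this chapter. The point is only that an ordinary XML parser (here the standard library's `xml.etree.ElementTree`) can read a schema like any other XML document, and that the data type is declared in the schema itself.

```python
import xml.etree.ElementTree as ET

# Namespace URI of the W3C XML Schema language.
XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema: a single <price> element typed as a decimal.
SCHEMA_TEXT = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="price" type="xs:decimal"/>
</xs:schema>
"""


def declared_elements(schema_text):
    """Return {element name: declared type} for top-level xs:element declarations."""
    root = ET.fromstring(schema_text)
    return {
        el.get("name"): el.get("type")
        for el in root.findall(f"{{{XS}}}element")
    }


print(declared_elements(SCHEMA_TEXT))
```

Note that `ElementTree` only parses the schema here; it does not validate instance documents against it. Validation requires a schema-aware processor.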

As mentioned earlier in this chapter, an XML schema describes the elements and attributes of an XML
document. The schema is itself written in XML, which provides many benefits over other validation
techniques, such as DTDs. These benefits include the following:
