The term schema is commonly used in the database community, where it refers to the organization or structure of a database. In the XML community, it refers to the structure (or model) of a class of documents. This model describes the hierarchy of elements and the allowable content in a valid XML document. In other words, the schema defines the constraints for an XML vocabulary.
New standards for defining XML documents have become desirable because of the limitations imposed by DTDs. An XML Schema Definition (XSD) schema, sometimes referred to simply as an XML schema, is a formal definition of the structure of a class of XML documents. The sheer volume of text involved in defining the XML schema language can be overwhelming to an XML novice, or even to someone making the move from DTDs to XML schemas. As stated before our detour into namespaces, XML schemas evolved as a response to problems with DTDs, the W3C’s first attempt at data validation. DTDs are a legacy inherited from SGML to provide content validation and, although they do a good job of validating XML, there is certainly room for improvement. Some of the more important concerns expressed about DTDs are the following:
DTDs use Extended Backus-Naur Form (EBNF) syntax, which is dissimilar to XML.
DTDs aren’t intuitive, and they can be difficult for a human reader to interpret.
The metadata of DTDs is difficult to consume programmatically.
No support exists for data types.
DTDs cannot be inherited.
To address these concerns, the W3C developed XML schemas, a new validating mechanism intended to replace DTDs. Schemas provide the same features DTDs provide, but they were designed with the previous issues in mind and thus are more powerful and flexible. The design principles outlined in the XML Schema Requirements document are fairly straightforward. XML schema documents should be as follows:
More expressive than XML DTDs
Expressed in XML
Self-describing
Usable in a wide variety of applications that employ XML
Straightforwardly usable on the Internet
Optimized for interoperability
Simple enough to implement with modest design and runtime resources
Coordinated with relevant W3C specs, such as XML Information Set, XML Linking Language (XLink), Namespaces in XML, Document Object Model (DOM), HTML, and the Resource Description Framework (RDF) schema
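Two of these principles, "expressed in XML" and "more expressive than XML DTDs," can be illustrated with a small sketch. The Python example below is only an illustration under stated assumptions: the `order` vocabulary, the names `DTD_TEXT` and `XSD_TEXT`, and the helper `declared_types` are all invented here and do not come from any specification. The DTD fragment is shown purely for comparison and is not parsed. The schema fragment, being ordinary XML, can be read with a general-purpose XML parser, and each of its element declarations carries a data type, which a DTD cannot express.

```python
import xml.etree.ElementTree as ET

# DTD declarations use their own non-XML syntax, and element text is
# always plain #PCDATA. (Hypothetical "order" vocabulary, for comparison only.)
DTD_TEXT = """\
<!ELEMENT order (item, quantity)>
<!ELEMENT item (#PCDATA)>
<!ELEMENT quantity (#PCDATA)>
"""

# A roughly equivalent schema fragment: it is itself XML, and each
# leaf element declares one of the built-in XML Schema data types.
XSD_TEXT = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="order">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="item" type="xs:string"/>
        <xs:element name="quantity" type="xs:positiveInteger"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""

# ElementTree spells namespaced tag names as "{namespace-uri}local-name".
XS = "{http://www.w3.org/2001/XMLSchema}"


def declared_types(xsd_text):
    """Return {element name: declared type} for every typed element declaration."""
    root = ET.fromstring(xsd_text)
    return {
        el.get("name"): el.get("type")
        for el in root.iter(XS + "element")
        if el.get("type") is not None  # skip "order", which has an inline complex type
    }


if __name__ == "__main__":
    print(declared_types(XSD_TEXT))
```

Note that this sketch only reads the schema's metadata. It does not validate a document against the schema, which requires a schema-aware processor rather than the plain XML parser used here.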
As mentioned earlier in this chapter, an XML schema is a method used to describe XML attributes and elements. This description of the XML file is itself written in XML, which provides many benefits over other validation techniques, such as DTDs. These benefits include the following: