Learn XML: CDATA, PCDATA, and Entity References

Textual data contained in an XML element can be expressed as Character Data (CDATA), Parsed Character Data (PCDATA), or a combination of the two. Data that appears between <![CDATA[ and ]]> tags is CDATA; any other data is PCDATA. The following element contains PCDATA:

<title>XSLT Programmers Reference</title>

The next element contains CDATA:

<title><![CDATA[XSLT Programmers Reference]]></title>

And the following contains both:

<title>XSLT Programmers Reference <![CDATA[Author – Michael Kay]]></title>

As you can see, CDATA is useful when you want some parts of your XML document to be passed through by the parser as literal text rather than interpreted as markup. This means you can put almost anything between <![CDATA[ and ]]> tags and an XML parser won’t care (the one exception is the sequence ]]>, which ends the section); however, data not enclosed in <![CDATA[ and ]]> tags must conform to the rules of XML. Often, CDATA sections are used to enclose code for scripting languages like VBScript or JavaScript.
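To see how a parser hands CDATA back to an application, here is a minimal sketch using Python's standard xml.etree.ElementTree module. The sample strings and the element_text helper are hypothetical, modeled on the title elements above; the script element is an assumed example of embedded scripting code.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample documents modeled on the elements discussed above.
pcdata_doc = "<title>XSLT Programmers Reference</title>"
mixed_doc = (
    "<title>XSLT Programmers Reference "
    "<![CDATA[Author - Michael Kay]]></title>"
)
# Assumed example: script code containing <, >, and & inside a CDATA section.
script_doc = "<script><![CDATA[if (a < b && b > c) { run(); }]]></script>"


def element_text(xml_source: str) -> str:
    """Parse an XML string and return the root element's text content."""
    return ET.fromstring(xml_source).text


# The CDATA delimiters are consumed by the parser; only the enclosed
# characters reach the application, merged with any surrounding PCDATA.
print(element_text(pcdata_doc))
print(element_text(mixed_doc))
print(element_text(script_doc))
```

Note that the application sees one string per element: ElementTree does not report where the CDATA section began or ended, only the characters it contained.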

XML parsers pass CDATA through untouched but parse PCDATA, that is, scan it for markup. You might wonder why an XML parser distinguishes between CDATA and PCDATA. Certain characters, notably <, >, and &, have special meaning in XML and must be enclosed in CDATA sections if they’re to be used verbatim. For example, suppose you wanted to define an element named range whose value is ‘0 < counter < 1000’. Because < is a reserved character, you can’t define the element this way:

<range>0 < counter < 1000</range>

You can, however, define it this way:

<range><![CDATA[0 < counter < 1000]]></range>

As you can see, CDATA sections are useful for including mathematical equations and code listings.
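The range example can be checked against a real parser. The sketch below, again using Python's standard xml.etree.ElementTree, feeds it three hypothetical spellings of the range element: one with a bare <, one wrapped in a CDATA section, and one using the &lt; entity reference. The try_parse helper is an illustrative name, not part of any library.

```python
import xml.etree.ElementTree as ET


def try_parse(xml_source: str):
    """Return the root element's text, or None if the XML is not well-formed."""
    try:
        return ET.fromstring(xml_source).text
    except ET.ParseError:
        return None


# A bare < inside PCDATA looks like the start of a tag, so parsing fails.
bare = "<range>0 < counter < 1000</range>"
# Inside a CDATA section the same characters are taken literally.
wrapped = "<range><![CDATA[0 < counter < 1000]]></range>"
# The predefined entity reference &lt; is the PCDATA way to write <.
escaped = "<range>0 &lt; counter &lt; 1000</range>"

for label, source in [("bare", bare), ("wrapped", wrapped), ("escaped", escaped)]:
    print(label, "->", try_parse(source))
```

The wrapped and escaped forms yield the same text to the application; the choice between them is mostly a matter of readability when many reserved characters appear together.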