A DTD is a simple text file that contains the instructions for the elements contained within the corresponding XML document.
DTD syntax resembles Extended Backus-Naur Form (EBNF), the notation the XML specification uses to define its own grammar.
Instructions in a DTD include which elements are used and whether these elements contain parsed character data (#PCDATA),
other elements, or both. Instructions also include which attributes can be used with those elements, and how those elements relate within the document's tree structure.
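As a rough illustration of these instructions, the sketch below embeds a small DTD in a document and asks Python's standard-library Expat bindings to report each declaration. The `note` vocabulary (`note`, `to`, `body`, `em`, and the `priority` attribute) is invented for this example and is not part of any real XML language.

```python
import xml.parsers.expat

# Hypothetical vocabulary, made up for illustration only.
DOC = """<?xml version="1.0"?>
<!DOCTYPE note [
  <!ELEMENT note (to, body)>          <!-- contains other elements -->
  <!ELEMENT to (#PCDATA)>             <!-- contains parsed character data -->
  <!ELEMENT body (#PCDATA | em)*>     <!-- contains both (mixed content) -->
  <!ELEMENT em (#PCDATA)>
  <!ATTLIST note priority CDATA #IMPLIED>
]>
<note priority="high"><to>Ann</to><body>Read <em>this</em>.</body></note>
"""

declared_elements = []
declared_attributes = []

def on_element_decl(name, model):
    # Called once for every <!ELEMENT ...> declaration.
    declared_elements.append(name)

def on_attlist_decl(elname, attname, type_, default, required):
    # Called once for every attribute declared in an <!ATTLIST ...>.
    declared_attributes.append((elname, attname, type_))

parser = xml.parsers.expat.ParserCreate()
parser.ElementDeclHandler = on_element_decl
parser.AttlistDeclHandler = on_attlist_decl
parser.Parse(DOC, True)

print(declared_elements)
print(declared_attributes)
```

The three content-model styles in the DTD correspond to the three cases named above: child elements only, character data only, and a mix of the two.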
The basic unit in a DTD is an element. XML documents are required to have a root element.
Every other element must appear between the beginning and ending tags for the root element. The Slideshow shows the process for determining the root element.
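One way to see the root element in practice is to record the first start tag a parser reports. The sketch below does that with Python's standard-library Expat bindings and also records the name given in the document type declaration, which for a valid document must match the root element. The `catalog` and `book` names are hypothetical.

```python
import xml.parsers.expat

# Hypothetical vocabulary, made up for illustration only.
DOC = """<!DOCTYPE catalog [
  <!ELEMENT catalog (book+)>
  <!ELEMENT book (#PCDATA)>
]>
<catalog><book>First</book><book>Second</book></catalog>
"""

seen = {"doctype": None, "start_tags": []}

def on_doctype(name, system_id, public_id, has_internal_subset):
    # The name after <!DOCTYPE identifies the expected root element.
    seen["doctype"] = name

def on_start(name, attrs):
    # Start tags arrive in document order, so the first one is the root.
    seen["start_tags"].append(name)

parser = xml.parsers.expat.ParserCreate()
parser.StartDoctypeDeclHandler = on_doctype
parser.StartElementHandler = on_start
parser.Parse(DOC, True)

root_name = seen["start_tags"][0]
print(root_name, seen["doctype"])
```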
When the vocabulary and structure of potential XML documents for a given purpose are considered together, you can talk about the type of the documents:
the elements and attributes in these documents, and the ways they interrelate, are designed to cover a particular subject of interest. Generally speaking, this amounts to nothing more than using a specific XML language, such as Mathematical Markup Language (MathML) or X3D (for 3D graphics).
But for validation purposes, the nature of an XML language can be much more specific, and Document Type Definitions (DTDs) are a way to describe fairly precisely the shape
of the language.
This idea has parallels in human language. For example, if you want to read or write in English or German, you must have some understanding of the grammar of the language in question.
Man muss die Grammatik verstehen ("One must understand the grammar").
In a similar fashion, it is useful to make sure the structure and vocabulary of XML documents are valid against the grammatical rules of the appropriate XML language. Fortunately, XML languages are considerably simpler than human languages.
As you would expect, the grammars of XML languages are expressed with computer processing in mind.
Breaking a human-language sentence down into its grammatical components is known as parsing.
The same applies to XML, where a machine parser performs the parsing.
Parsers are the software subsystems
that read the information contained in XML documents into our programs.
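As a small sketch of a parser reading document content into a program, the example below uses Python's standard-library `xml.etree.ElementTree` to turn a document into ordinary Python values. The `order` and `item` elements and their attributes are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical document, made up for illustration only.
DOC = "<order id='42'><item qty='2'>pencil</item><item qty='1'>notebook</item></order>"

order = ET.fromstring(DOC)   # the parser builds an element tree from the text
order_id = order.get("id")   # attribute values are returned as strings
items = [(item.text, int(item.get("qty"))) for item in order.iter("item")]

print(order_id, items)
```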
The XML specification separates parsers into two categories:
- validating and
- non-validating.
Validating parsers must implement validity checking using DTDs.
With a validating parser, much of the content-checking code you might otherwise need in your application becomes unnecessary:
you can depend on the parser to verify the content of the XML document against the DTD.
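To make the difference concrete, the sketch below feeds a well-formed but invalid document to Python's standard-library `xml.etree.ElementTree`, which is built on the non-validating Expat parser. The parser raises no error, so the application has to carry its own check. The `note` vocabulary and the `check_note` helper are hypothetical and stand in for whatever checks a real application would need.

```python
import xml.etree.ElementTree as ET

# Well-formed, but it violates its own DTD: <note> is missing the required <to>.
INVALID_DOC = """<!DOCTYPE note [
  <!ELEMENT note (to, body)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
]>
<note><body>No recipient here.</body></note>
"""

# A non-validating parser only checks well-formedness, so this succeeds.
root = ET.fromstring(INVALID_DOC)

def check_note(note):
    """Hand-written stand-in for the DTD rule <!ELEMENT note (to, body)>."""
    return [child.tag for child in note] == ["to", "body"]

is_valid = check_note(root)
print(root.tag, is_valid)
```

With a validating parser, a helper like `check_note` would not be needed, because the parser itself would report the violation of the content model.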