XML Programming
Lesson 4: HTML limitations
Objective: Describe the limitations of HTML.

Limitations of HTML

The primary limitation of HTML is that HTML tags do not describe the meaning of the data included in an HTML document.
HTML uses a fixed, predefined tag set that specifies formatting and instructs a browser how to render data included in these tags. But the tags do not convey the meaning or semantics of data contained in the tags. In many cases, the meaning of the data included in a document is critical. XML is designed to overcome this and other limitations of HTML.
A classic example in which this limitation of HTML is problematic is Web searches.
When you search the Internet for a word or phrase, you can get thousands of matches, many of them irrelevant.
The search engine cannot learn what the data means from the HTML tags; it relies instead on keywords in the text and on meta tags in the HTML document.
For example, if you want to search for information relating to the box office performance of the 1997 film Titanic, you might be deluged with articles about the ship Titanic, the numerous books about its fatal voyage, or pages of sales promotions touting "titanic" discounts.
An advanced search might return the exact information you want in the first few search results, but the search is still not efficient.
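
The keyword-matching problem described above can be sketched in a few lines of Python. This is only an illustration, not how any real search engine works: the three page fragments and the `keyword_search` helper are hypothetical, invented for this example.

```python
import re

# Hypothetical page fragments (not from any real site). All three mention
# "Titanic", but only the first one is about the 1997 film.
PAGES = {
    "film": "<p><b>Titanic</b> (1997) was a box office success.</p>",
    "ship": "<p>The <i>Titanic</i> sank on its maiden voyage in 1912.</p>",
    "sale": "<h1>Titanic discounts this weekend!</h1>",
}

def keyword_search(pages, word):
    """Return the names of pages whose visible text contains `word`."""
    hits = []
    for name, html in pages.items():
        # The tags only describe formatting, so a keyword search drops them.
        text = re.sub(r"<[^>]+>", " ", html)
        if re.search(rf"\b{re.escape(word)}\b", text, re.IGNORECASE):
            hits.append(name)
    return hits

print(keyword_search(PAGES, "titanic"))  # ['film', 'ship', 'sale']
```

Every page matches, because tags such as `<b>`, `<i>`, and `<h1>` say how to display the word and nothing about whether it names a film, a ship, or a sale.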

The Advent of XML

Unlike HTML tags, XML tags do convey meaning. XML tags make searching for information more efficient.
For example, to describe the movie Titanic we may use the following set of elements:

 <FILM>
   <TITLE>Titanic</TITLE>
   <PRODUCER>James Cameron, Jon Landau</PRODUCER>
 </FILM>

Titanic as the title

When these elements are included as part of a document on a server, a search program can readily identify Titanic as the title of a film. The program can also identify other properties of the film, such as its producer and director.
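
As a rough sketch of such a search program, the Python below uses the standard `xml.etree.ElementTree` module to look for films by title. The `<FILM>` wrapper follows the structure described later in this lesson; the `<SHIP>` and `<NAME>` elements, the `DOCS` dictionary, and the `films_titled` helper are assumptions made up for this example.

```python
import xml.etree.ElementTree as ET

# Two hypothetical documents that both contain the word "Titanic".
DOCS = {
    "film": "<FILM><TITLE>Titanic</TITLE>"
            "<PRODUCER>James Cameron, Jon Landau</PRODUCER></FILM>",
    "ship": "<SHIP><NAME>Titanic</NAME></SHIP>",
}

def films_titled(docs, title):
    """Return the names of documents that hold a FILM with the given TITLE."""
    hits = []
    for name, xml_text in docs.items():
        root = ET.fromstring(xml_text)
        # iter() visits the root element and all of its descendants.
        for film in root.iter("FILM"):
            if film.findtext("TITLE", default="").strip() == title:
                hits.append(name)
    return hits

print(films_titled(DOCS, "Titanic"))  # ['film']
```

Because the element names state what the data is, the document about the ship is skipped even though it contains the same word.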

Structural definition

XML elements not only convey the meaning of data but also give the data a well-defined structure. Because XML elements may contain other elements, a document forms a tree whose shape is easy to identify. The structure of an HTML document is not as easily discernible.
In a complete version of the above example, the XML <FILM> element contains
  1. <TITLE>,
  2. <PRODUCER>,
  3. <DIRECTOR>,
  4. <DISTRIBUTOR>, and
  5. <BOX-OFFICE> elements.
Each of these elements, in turn, contains data.
The well-defined structure of an XML document is important when you use XML parsers, as you will see later in the course.
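
To make the tree structure concrete, here is a minimal sketch that uses Python's standard `xml.etree.ElementTree` parser. The `outline` and `is_well_formed` helpers are hypothetical names, and the `<DIRECTOR>` value is supplied here for illustration; it does not come from the example earlier in this lesson.

```python
import xml.etree.ElementTree as ET

# A FILM element with three child elements (a shortened version of the list above).
FILM_XML = """
<FILM>
  <TITLE>Titanic</TITLE>
  <PRODUCER>James Cameron, Jon Landau</PRODUCER>
  <DIRECTOR>James Cameron</DIRECTOR>
</FILM>
"""

def outline(element, depth=0):
    """Return one indented line per element, parents before children."""
    text = (element.text or "").strip()
    lines = ["  " * depth + element.tag + (f": {text}" if text else "")]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines

def is_well_formed(xml_text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

print("\n".join(outline(ET.fromstring(FILM_XML))))
# A TITLE that is never closed breaks the tree, so the parser rejects it.
print(is_well_formed("<FILM><TITLE>Titanic</FILM>"))  # False
```

The outline prints `FILM` on the first line with its three children indented beneath it, which is the parent and child relationship a parser relies on.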
The next lesson defines XML.

MetaLanguages Markup - Quiz

Click the Quiz link below to test your understanding of metalanguages, markup, and HTML limitations.
MetaLanguages Markup - Quiz