XML Programming

Lesson 1

XML Data Representation and Markups

Separate Content from Presentation

Imagine that while driving through unknown territory, you could ask an onboard car computer for directions to the nearest gas station.
For that to be possible, the markup language used for this application must describe not only the structure of a document but also the meaning of the content it contains. XML (eXtensible Markup Language) is a meta-markup language that expressly separates content from presentation. User-defined XML elements, or tags, give meaning to the data in a document. An XML document can be presented in several different formats, one of which is rendering in a browser.
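To make the idea of user-defined elements concrete, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree module. The element names (stations, station, name, distance_km) and the data are invented for illustration; they do not come from any real navigation format.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the element names and values below are
# assumptions made for this sketch, not part of any standard.
STATIONS_XML = """\
<stations>
  <station>
    <name>Main Street Fuel</name>
    <distance_km>4.2</distance_km>
  </station>
  <station>
    <name>Highway Stop</name>
    <distance_km>1.7</distance_km>
  </station>
</stations>
"""


def nearest_station(xml_text):
    """Return the name of the station with the smallest distance_km."""
    root = ET.fromstring(xml_text)
    nearest = min(
        root.findall("station"),
        key=lambda station: float(station.findtext("distance_km")),
    )
    return nearest.findtext("name")


print(nearest_station(STATIONS_XML))  # Highway Stop
```

Because the tags say what each value means, the program can answer the question without knowing anything about how the data might be displayed.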
This module defines XML, discusses its origins and applications, and describes the evolution of markup and metalanguages.

Module learning objectives

By the end of the module, you will have the skills and knowledge necessary to:
  1. Describe markup languages
  2. Describe metalanguages
  3. Describe the limitations of HTML
  4. Define XML
  5. List the goals of XML
  6. Describe approaches to using XML

Throughout this course the terms "elements" and "tags" are used interchangeably.
XML will be referred to as a language and as a metalanguage. The next lesson describes markup languages in general.


Steps leading up to XML: Data Representation and Markups

There are two main uses for XML:
  1. One is a way to represent low-level data, for example configuration files.
  2. The second is a way to add metadata to documents; for example, you may want to stress a particular sentence in a report by putting it in italics or bold.
The first use of XML is meant to replace the more traditional ways of storing such data, usually lists of name/value pairs such as those found in Java Properties files.
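As a rough comparison, the sketch below holds the same configuration data once as name/value pairs and once as XML, and shows that both can be read into the same result. The keys (db.host, db.port) and the XML element names are assumptions chosen for the example, and the properties parser handles only plain key=value lines, not the full Java Properties syntax.

```python
import xml.etree.ElementTree as ET

# Traditional name/value form (illustrative keys only).
PROPERTIES_TEXT = """\
db.host=localhost
db.port=5432
"""

# One possible XML form of the same data; element names are assumptions.
CONFIG_XML = """\
<config>
  <db>
    <host>localhost</host>
    <port>5432</port>
  </db>
</config>
"""


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and '#' comments."""
    pairs = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        pairs[key.strip()] = value.strip()
    return pairs


def parse_config_xml(text):
    """Flatten <config><section><item> into dotted 'section.item' keys."""
    root = ET.fromstring(text)
    return {
        f"{section.tag}.{item.tag}": item.text
        for section in root
        for item in section
    }


# Both representations carry the same information.
assert parse_properties(PROPERTIES_TEXT) == parse_config_xml(CONFIG_XML)
```

The XML form is more verbose, but its nesting expresses grouping directly instead of encoding it in dotted key names.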
The second application of XML is similar to how HTML files work. The document text is contained in an overall container, the <body> element, with individual phrases surrounded by <i> or <b> tags.
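The document-markup use can be sketched the same way. The fragment below is an invented sentence wrapped in a body element with b and i tags, written so that it is also well-formed XML; Python's xml.etree.ElementTree is used here only as a convenient parser.

```python
import xml.etree.ElementTree as ET

# Invented sample text; the tags mark phrases inside running prose.
REPORT = "<body>Sales rose <b>12 percent</b> in the <i>third</i> quarter.</body>"

root = ET.fromstring(REPORT)

# The marked-up phrases, in document order.
emphasized = [(element.tag, element.text) for element in root]

# The document text with all markup stripped away.
plain_text = "".join(root.itertext())

print(emphasized)  # [('b', '12 percent'), ('i', 'third')]
print(plain_text)  # Sales rose 12 percent in the third quarter.
```

Here the markup adds information about the text without changing the text itself, which is what separates this use from the configuration-data use.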
Many techniques have been devised over the years for both of these scenarios. The problem with these disparate approaches has become more apparent than ever with the growth of the Internet and the spread of distributed applications, particularly those that rely on components designed and managed by different parties. That problem is one of intercommunication.
It is certainly possible to design a distributed system with two components, one that outputs data as a Windows INI file and another that converts it into Java Properties format.
Unfortunately, this means considerable development effort on both sides, which should not be necessary and which takes resources away from the main objective: developing new functionality that delivers business value.
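The kind of glue code this implies might look like the sketch below, which reads INI text with Python's standard configparser module and emits dotted key=value lines. The section and key names are invented, and the output is a simplification that ignores the escaping rules of the real Java Properties format.

```python
import configparser

# Illustrative INI input; the section and keys are assumptions.
INI_TEXT = """\
[database]
host = localhost
port = 5432
"""


def ini_to_properties(ini_text):
    """Convert INI sections into dotted key=value lines (simplified)."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    lines = []
    for section in parser.sections():
        for key, value in parser.items(section):
            lines.append(f"{section}.{key}={value}")
    return "\n".join(lines)


print(ini_to_properties(INI_TEXT))
# database.host=localhost
# database.port=5432
```

Every pair of formats needs a converter like this one, and each converter has to be written, tested, and maintained by someone.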
XML was conceived as a solution to this kind of problem. It is meant to make passing data between components much easier and to remove the need to worry continually about different input and output formats, freeing developers to concentrate on more important aspects of coding such as the business logic. XML is also seen as an answer to the question of whether files should be easily readable by software or by humans: it attempts to fulfill both objectives.
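A minimal sketch of that idea, assuming two components that agree on a small XML layout: one function writes settings and the other reads them back, both using the same standard parser. The settings and setting element names and the name attribute are hypothetical choices for this example.

```python
import xml.etree.ElementTree as ET


def write_settings(settings):
    """Producer side: serialize a dict of strings as XML text."""
    root = ET.Element("settings")
    for name, value in settings.items():
        ET.SubElement(root, "setting", {"name": name}).text = value
    return ET.tostring(root, encoding="unicode")


def read_settings(xml_text):
    """Consumer side: rebuild the dict from the XML text."""
    root = ET.fromstring(xml_text)
    return {el.get("name"): el.text for el in root.findall("setting")}


original = {"host": "localhost", "port": "5432"}
xml_text = write_settings(original)

print(xml_text)  # plain text that a person can also read
assert read_settings(xml_text) == original
```

Neither side needs a format-specific converter, and the intermediate text remains legible to a person inspecting it.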