SGML and XML are metalanguages for capturing document structures in textual form. They define a syntax for adding markup tags to text, but do not themselves define any markup tags. It is left to individual languages to specify the markup tags and their meaning. HTML is one such language. It began as an application of SGML, and its syntax closely resembles that of XML with some deviations. The current version is HTML5, which is now maintained as the HTML Living Standard.
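As a rough illustration of a metalanguage that defines syntax but no tags, the sketch below feeds an invented vocabulary to Python's standard XML parser. The tag names recipe, ingredient, and step are made up for this example and are not part of any standard. The parser accepts them because they follow XML syntax, and rejects the second string only because its tags are not properly nested.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary: XML attaches no meaning to these tag names.
document = "<recipe><ingredient>flour</ingredient><step>Mix well.</step></recipe>"

root = ET.fromstring(document)
# iter() visits the root first, then its descendants in document order.
tags = [element.tag for element in root.iter()]
print(tags)

# The syntax rules are still enforced: <step> is never closed here.
try:
    ET.fromstring("<recipe><step>Mix well.</recipe>")
    well_formed = True
except ET.ParseError:
    well_formed = False
print(well_formed)
```

What the tags mean, and which ones are allowed, is left to the language built on top of XML.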

Many languages used in computing need to capture complex structures. Programming languages, for example, must capture the complex organization of program data and operations that act on the data. But there are other important computing languages that capture the structure of textual information such as web pages. There are also metalanguages that capture the structure of other languages.

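A flexible document structuring language must be able to model hierarchical structures, much like the Composite design pattern. Generally such languages need to deal with three problems: identifying the type of each element, specifying the properties of the element, and delimiting the content of the element, which may itself contain nested elements.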

Programming languages such as C, C++, and Java, for example, have a structure for if statements. In the form shown below, the keyword if identifies the type of element, the parenthesized condition specifies its properties, and the braces delimit its nested content.

    if (condition) {
        statements
    }

XML-based languages and HTML meet the same three needs with a different syntax, as shown below.

    <tag-name attributes-list>
        content
    </tag-name>

Here tag-name identifies the type of element, such as head, body, ul, or table. attributes-list is a space-separated sequence of attributes, each with the form

    attribute-name=attribute-value

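where attribute-name is the name of an attribute and attribute-value is a single-quoted or double-quoted string specifying the value of the attribute. XML requires the quotes, while HTML also accepts unquoted values that contain no spaces. Most elements, for example, can have attributes named class and id. The value of id is a name unique to one element, and the value of class is one or more space-separated names that may be shared by many elements. Style sheet rules use these names to control the styling of all elements with a given class or of the individual element with a given id.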
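To make the attribute syntax concrete, here is a minimal sketch that uses Python's standard html.parser module to list the attributes of each start tag. The class name AttributeCollector and the sample markup are assumptions made for illustration. The sample mixes double and single quotes to show that both forms give the same kind of result.

```python
from html.parser import HTMLParser


class AttributeCollector(HTMLParser):
    """Records (tag-name, attributes) for every start tag it sees."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs with the quotes removed.
        self.seen.append((tag, dict(attrs)))


collector = AttributeCollector()
collector.feed("""<ul id="menu" class='nav'><li class="item">Home</li></ul>""")
collector.close()

for tag, attributes in collector.seen:
    print(tag, attributes)
```

A style sheet rule could then select the list through its id (menu) or every element that shares the class item.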

content is a mixture of ordinary text and elements, allowing the overall structure to be viewed as an example of the Composite design pattern.
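The Composite view of content can be sketched in a few lines of Python. This is an illustrative model only, not how any browser or parser represents documents. The Element class and its field names are hypothetical, chosen to mirror the tag-name, attributes-list, and content placeholders used above. Plain strings play the role of leaves, and an Element is a composite whose content may hold both strings and further elements.

```python
from dataclasses import dataclass, field
from html import escape
from typing import Dict, List, Union


@dataclass
class Element:
    """A composite node: its content mixes text leaves and child elements."""

    tag_name: str
    attributes: Dict[str, str] = field(default_factory=dict)
    content: List[Union[str, "Element"]] = field(default_factory=list)

    def render(self) -> str:
        attrs = "".join(
            f' {name}="{escape(value, quote=True)}"'
            for name, value in self.attributes.items()
        )
        # Text is escaped and emitted directly; elements render themselves recursively.
        inner = "".join(
            escape(child) if isinstance(child, str) else child.render()
            for child in self.content
        )
        return f"<{self.tag_name}{attrs}>{inner}</{self.tag_name}>"


page = Element("ul", {"class": "nav"}, [
    Element("li", content=["Home"]),
    Element("li", content=["About ", Element("em", content=["us"])]),
])
markup = page.render()
print(markup)
```

The second list item shows mixed content: the text "About " sits next to a nested em element inside the same parent.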

Books

Chuck Musciano, Bill Kennedy, and Estelle Weyl, "HTML5: The Definitive Guide", 7th Edition, O'Reilly, 2014.

Online References

  1. HTML: HyperText Markup Language [developer.mozilla.org]
  2. HTML Tutorial [www.w3schools.com]
  3. HTML 5.1 Specification [W3C]