Semantic Structure

The Nature of HTML

The originators of HTML were scientists who wanted a standard means to share particle physics documents. They had little interest in the exact visual form of the document as seen on the computer screen. In fact, HTML was originally designed to enforce a clean separation of content structure and graphic design. The intent was to create a World Wide Web of pages that would display in every system and browser available, including browsers that "read" web page text to visually impaired users, and that could be accurately interpreted by automated search and analysis engines.

The inventors of the web did not foresee its graphical and display potential, and as such, HTML was not designed with display considerations in mind. They were so concerned about making web documents machine-friendly that they produced documents that only machines (or particle physicists) would want to read. In focusing solely on the structural logic of documents, they ignored the need for the visual logic of graphic design and typography. This lack of visual emphasis is what causes web designers such stress in trying to get pages to look the way they want. That pressure is what led browser software companies to ignore the standards of proper HTML and to support additional visual and layout features, or extensions of HTML, within their browsers.

For example, most graphic designers avoid using the standard heading elements in HTML (<h1>, <h2>, and so on) because they lack subtlety: in most web browsers these elements make headlines look absurdly large (<h1>, <h2>) or ridiculously small (<h4>, <h5>, <h6>). But the heading elements in HTML were not created with graphic design in mind. Their sole purpose is to designate a hierarchy of headline importance, so that both human readers and automated search engines can look at a document and easily determine its information structure. Only incidentally did browser designers create a visual hierarchy for HTML headings by assigning a different type size and level of boldness to each heading element, and these default sizes tend to be somewhat limiting.

Fortunately, CSS allows authors to change the visual presentation of elements to meet their design preferences while maintaining the underlying semantic meaning (the word "semantic" means "relating to meaning"). In keeping with the original intention of the web, screen readers and other assistive technologies largely ignore visual styling and focus primarily on semantics and structure.
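
As a minimal sketch of this separation, the page below restyles its headings with CSS while leaving the heading elements themselves in place. The page title, heading text, and style values are invented for illustration and are not taken from any particular site.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <title>Annual Report</title>
  <style>
    /* Visual presentation only: these rules change how the headings
       look, not what they mean to browsers or assistive technologies. */
    h1 { font-size: 1.6em; font-weight: normal; }
    h2 { font-size: 1.25em; font-weight: bold; }
  </style>
</head>
<body>
  <!-- Still a first-level and a second-level heading, whatever their size. -->
  <h1>Annual Report</h1>
  <h2>Financial Summary</h2>
  <p>Revenue grew during the year.</p>
</body>
</html>
```

A screen reader would still present these as a level 1 and a level 2 heading, because the structure comes from the elements rather than from the style rules.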

Using Headings for Content Structure

When encountering a lengthy web page, sighted users often scroll the page quickly and look for big, bold text (headings) to get an idea of the structure and content of the page. Screen reader and other assistive technology users also have the ability to navigate web pages by heading structure, assuming true headings are used (as opposed to text that is styled to be big and/or bold). This means that the user can view a list of all of the headings on the page, or can read or jump by headings, or even navigate directly to top level headings (<h1>), next level headings (<h2>), third level headings (<h3>), and so on.

Example

View the content structure of one of your web pages in WAVE. Enter the web page URL into WAVE's text box, press the button, and then select the Outline tab in the sidebar.

Pages should be structured in a hierarchical manner, generally with one first-level heading (<h1>) being the most important (usually the page title or main content heading), then second-level headings (<h2>, usually major section headings), down to third-level headings (<h3>, sub-sections of the <h2>), and so on. Technically, lower-level headings should be contained within headings of the next highest level (i.e., one should not skip heading levels, such as from an <h2> to an <h4>, going down the document).
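
A heading hierarchy of this kind might be marked up as in the following sketch. The topic and heading text are hypothetical, and the indentation is there only to make the nesting easier to see; it has no effect on the structure.

```html
<!-- Hypothetical page: all heading text is invented for illustration. -->
<h1>Baking Bread at Home</h1>

  <h2>Ingredients</h2>
    <h3>Flour</h3>
    <h3>Yeast</h3>

  <h2>Method</h2>
    <h3>Mixing</h3>
    <h3>Proofing</h3>
```

Each <h3> sits under an <h2>, and no level is skipped on the way down the document.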

Using Headings Correctly

Do not use text formatting, such as font size or bold, to give the visual appearance of headings. Use actual heading elements (<h1> - <h6>) for all content headings. Assistive technologies and browsers rely upon the literal markup of the page to determine structure. Items that are bolded or displayed in a bigger font are not interpreted as structural elements.
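
The difference can be illustrated with two fragments that may look similar on screen. The heading text and inline style values are assumptions made for this sketch.

```html
<!-- Avoid: this only looks like a heading. It is an ordinary paragraph,
     so it will not appear in a list of the page's headings. -->
<p style="font-size: 1.5em; font-weight: bold;">Shipping Policy</p>

<!-- Prefer: a true heading element. Its appearance can still be
     adjusted with CSS if the default size is not wanted. -->
<h2>Shipping Policy</h2>
```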

Likewise, do not use headings to achieve visual results only. For instance, if you want to highlight or emphasize an element within your content that is not a heading, do not use heading elements to achieve the visual appearance you want. Instead, use styles (CSS) to achieve visual results such as a larger font size, bold, or italics. If you want to emphasize something, you technically should use the <strong> element instead of <b> and the <em> element instead of <i>. Bold (<b>) and italics (<i>) both connote visual emphasis, whereas strong (<strong>) and emphasis (<em>) suggest semantic emphasis. Visually, <b> and <strong> look exactly the same, as do <i> and <em>, and each pair is, unfortunately, generally treated the same (if differentiated at all) in screen readers, but developers should still use the more proper HTML elements.
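
The sketch below shows one way to apply this advice. The class name "callout", the style values, and the sentence text are hypothetical.

```html
<!-- Placed in the document head: a style rule for visual emphasis. -->
<style>
  .callout { font-size: 1.2em; font-weight: bold; }
</style>

<!-- Avoid: a heading element used only for its appearance. -->
<h3>Orders placed after 5 p.m. ship the next business day.</h3>

<!-- Prefer: ordinary content, styled with CSS. -->
<p class="callout">Orders placed after 5 p.m. ship the next business day.</p>

<!-- Semantic emphasis inside running text. -->
<p>Please <strong>do not</strong> refresh the page while your
   <em>payment</em> is being processed.</p>
```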

Using Lists Correctly

HTML lists - <ul>, <ol>, and <dl> - also convey a hierarchical content structure. Each of these has rules regarding its use as well. Unordered lists should be used when there is no order of sequence or importance. Ordered lists suggest a progression or sequence. Definition lists should be used explicitly for presenting a structure for definitions. As with headings, lists should be used correctly and for the right purposes. Unordered and ordered lists should always contain list items. Definition lists must always have definition descriptions. Empty lists are incorrect HTML. Lists should never be used merely for indenting or other layout purposes. Nested lists should be coded properly, with each nested list placed inside a list item of its parent list.
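
The three list types, along with one properly nested list, could be written as in the following sketch. All of the list content is invented for illustration.

```html
<!-- Unordered list: the items have no particular sequence. -->
<ul>
  <li>Apples</li>
  <li>Oranges</li>
</ul>

<!-- Ordered list: the items form a progression. The nested list
     sits inside the <li> it belongs to, not directly inside the <ol>. -->
<ol>
  <li>Preheat the oven.</li>
  <li>Mix the batter.
    <ul>
      <li>Combine the dry ingredients.</li>
      <li>Add the wet ingredients.</li>
    </ul>
  </li>
  <li>Bake for 30 minutes.</li>
</ol>

<!-- Definition list: each term (<dt>) has a description (<dd>). -->
<dl>
  <dt>Screen reader</dt>
  <dd>Software that reads page content aloud to the user.</dd>
</dl>
```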