Back in 1999, Tim Berners-Lee described his vision of the Semantic Web:
I have a dream for the Web [in which computers] become capable of analyzing all the data on the Web – the content, links, and transactions between people and computers. A ‘Semantic Web’, which should make this possible, has yet to emerge, but when it does, the day-to-day mechanisms of trade, bureaucracy and our daily lives will be handled by machines talking to machines. The ‘intelligent agents’ people have touted for ages will finally materialize.
The word semantic relates to meaning, and a semantic web is one that describes things in a way computers can understand. In a semantic web, the web is not just about links; it is also about the relationships between things and the properties of things.
Since the Semantic Web was first envisioned, there have been many efforts to make it a reality, such as RDF (Resource Description Framework). Some people also had the mistaken impression that a website is semantic simply because its <div> and other elements are labeled with IDs and classes.
With the introduction of HTML5, it becomes easier to create semantic web pages, thanks to its new structural and content tags.
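To make the contrast concrete, here is a small sketch of the same page header written both ways. The id values, link targets and title text are made up for illustration.

```html
<!-- Before: generic containers; the ids are only hints for people and stylesheets -->
<div id="header">
  <h1>Site Title</h1>
  <div id="nav">
    <a href="/">Home</a>
    <a href="/about">About</a>
  </div>
</div>

<!-- After: the element names themselves describe the role of the content -->
<header>
  <h1>Site Title</h1>
  <nav>
    <a href="/">Home</a>
    <a href="/about">About</a>
  </nav>
</header>
```

Both versions can be styled to look identical; the difference is that a user agent can recognize the second one as a header containing navigation without guessing from id names.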
HTML5 NEW STRUCTURAL TAGS
- <section></section> – generic section of a document or application. Thematic grouping of content with a heading.
- <article></article> – represents a self-contained composition in a document, page, application or site that is independently distributable or reusable, such as in syndication (e.g., a forum post, blog entry, or user comment).
- <aside></aside> – represents a section of a page that consists of content that is tangentially related to the content around the aside element and which could be considered separate from that content.
- <header></header> – represents a group of introductory or navigational aids. It is intended to contain the section's heading (an h1-h6 or an hgroup element), but this is not required. It can also be used to wrap a section's table of contents, a search form, or logos.
- <hgroup></hgroup> – used to group a set of h1-h6 elements when the heading has multiple levels, such as subheadings, alternative titles, or taglines.
- <footer></footer> – represents a footer for its nearest ancestor sectioning content or sectioning root element. It can contain information about the section, such as its author, related links, or copyright data. A document may contain multiple footers: a footer is not limited to the whole document but can also be used for individual sections.
- <nav></nav> – represents a section of a page that links to other pages or to parts within the page.
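As a rough sketch of how these structural tags might fit together, consider the page skeleton below. The blog name, link targets and placeholder text are hypothetical, and hgroup is left out to keep the example short.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Example Blog</title>
</head>
<body>
  <!-- Introductory content and site-wide navigation -->
  <header>
    <h1>Example Blog</h1>
    <nav>
      <a href="/">Home</a>
      <a href="/archive">Archive</a>
    </nav>
  </header>

  <!-- A thematic grouping of content with its own heading -->
  <section>
    <h1>Latest Posts</h1>
    <!-- A self-contained entry that could be syndicated on its own -->
    <article>
      <header>
        <h1>First Post</h1>
      </header>
      <p>Post content...</p>
      <!-- A footer that belongs to this article only -->
      <footer>
        <p>Written by a hypothetical author</p>
      </footer>
    </article>
  </section>

  <!-- Content tangentially related to the main content -->
  <aside>
    <h1>Related Links</h1>
    <p>Links related to the posts...</p>
  </aside>

  <!-- A footer for the whole document -->
  <footer>
    <p>Copyright information for the whole site</p>
  </footer>
</body>
</html>
```

Note that the page has two footers, one for the article and one for the document, which is the case described in the footer entry above.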
HTML5 NEW CONTENT TAGS
- <figure></figure> – represents some flow content optionally with a caption. Can be used to annotate illustrations, diagrams, photos, code listings etc.
- <video></video> – a media element whose data is video data, possibly with associated audio data.
- <canvas></canvas> – provides scripts with a resolution-dependent bitmap canvas, which can be used for rendering graphs, game graphics, or other visual images on the fly.
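The fragment below is a minimal sketch of the three content tags in use. The file names (outline-diagram.png, intro.mp4), the id chart, and the drawing itself are assumptions chosen for illustration; figcaption is the HTML5 element that holds a figure's optional caption.

```html
<!-- A figure with an optional caption -->
<figure>
  <img src="outline-diagram.png" alt="Diagram of a document outline">
  <figcaption>Figure 1: A document outline (hypothetical image)</figcaption>
</figure>

<!-- A video with built-in playback controls and fallback text -->
<video src="intro.mp4" controls width="480">
  Your browser does not support the video element.
</video>

<!-- A canvas that a script draws on -->
<canvas id="chart" width="200" height="100">
  Your browser does not support the canvas element.
</canvas>
<script>
  // Look up the canvas and ask for its 2D drawing context
  var canvas = document.getElementById('chart');
  var context = canvas.getContext('2d');
  // Draw one filled bar, as the start of a simple bar graph
  context.fillStyle = '#3366cc';
  context.fillRect(20, 30, 40, 60);
</script>
```

The text inside video and canvas is fallback content, shown only by user agents that do not support those elements.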
HTML5 introduces an outline algorithm that tells user agents how to interpret the sectioning content within a document. Understanding the outline algorithm helps ensure that your documents are structured the way you intend. This matters from a semantic standpoint, for accessibility, and for making your content easier to syndicate.
One way to picture the outline algorithm is to think of your content as a table of contents:
- Site Title
  - Section 1
    - Article 1
    - Article 2
  - Section 2
    - Article 1
Outline Algorithm Parsing
- Sectioning content and heading content are used to define the outline
- The body element is established as the outline root
- Items are added to the outline as sectioning content is found
- Heading content is used to name sections
- Sectioning content within other sectioning content is nested in the outline
Example HTML5 Code:
<body>
  <h1>Ideyatech Inc. - Java Outsourcing Philippines</h1>
  <section>
    <h1>Products and Services</h1>
    <article>
      <h1>Our Services</h1>
      <p>Content...</p>
    </article>
    <article>
      <h1>Our Products</h1>
      <p>Content...</p>
    </article>
  </section>
</body>
The outline algorithm would generate this outline based on the above code:
- Ideyatech Inc. - Java Outsourcing Philippines
  - Products and Services
    - Our Services
    - Our Products
As you can see, HTML5 lets us create semantic web pages not only through more meaningful markup but also through proper document structure, provided we understand HTML5's document outline algorithm.
HTML5 will also affect SEO and SEM. As more websites adopt HTML5 and more user agents support it, new SEO and SEM strategies targeted at HTML5 may emerge. The way search engines crawl web pages may also change to take advantage of HTML5.
Although HTML5 is still in its infancy, it has already gained a great deal of attention as companies like Apple and Google have started using it. The W3C has also announced that it expects HTML5 to reach Recommendation status by 2014.
With all the effort going into HTML5, now is a good time to start exploring it. Who knows, in the next 2 to 3 years HTML5 may well be the new standard.