We have every reason to call this the era of the personalized web, in which the distance between us humans and computers is being sharply reduced by the technologies emerging this century. This undoubtedly brings advantages in many respects, and probably in the most important one: the relationship itself, which has been ruled from the start by the obvious discrepancy between the two. Nonetheless, we are still struggling to narrow that gap.
Semantics, from the Greek sēmantiká[ 1 ], is the study of meaning; in other words, it refers to the relation between a unit and its meaning: a word stands for something, a symbol represents something, and so on. Originally reserved for linguistics, the field has expanded beyond its origins and can now be found in almost every discipline.
In 1999 Tim Berners-Lee, the inventor of the World Wide Web, imagined a semantic web in which information would be easily interpreted by machines. But for a device to interpret something, be it a word, a phrase, a location or anything else, it has to know what it is reading and what that particular thing means to us humans.
I have a dream for the Web [in which computers] become capable of analyzing all the data on the Web – the content, links, and transactions between people and computers. A ‘Semantic Web’, which should make this possible, has yet to emerge, but when it does, the day-to-day mechanisms of trade, bureaucracy and our daily lives will be handled by machines talking to machines. The ‘intelligent agents’ people have touted for ages will finally materialize.
Tim Berners-Lee, 1999 [ 2 ]
It is fair to say that HTML has carried a small degree of semantics since its earliest conception. Any machine can be programmed to read and interpret text as a heading, paragraph, list, quote or link; beyond that, there were tags for emphasized words within a paragraph, for instance <i> and <b>.
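To make this concrete, here is a minimal sketch of a program reading structural roles from plain HTML tags. It uses Python's standard html.parser module; the ROLES table, the RoleCollector class and the collect_roles helper are names made up for this illustration, and the sample markup is invented.

```python
from html.parser import HTMLParser

# Hypothetical lookup table: the structural role a program can infer
# from the tag name alone, with no knowledge of what the text means.
ROLES = {
    "h1": "heading", "h2": "heading", "h3": "heading",
    "p": "paragraph", "li": "list item",
    "blockquote": "quote", "a": "link",
    "i": "emphasis", "b": "emphasis",
}


class RoleCollector(HTMLParser):
    """Records (role, text) pairs for text found inside known tags."""

    def __init__(self):
        super().__init__()
        self.stack = []   # names of the currently open tags
        self.found = []   # collected (role, text) pairs

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, ignoring stray end tags.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        if text and self.stack:
            role = ROLES.get(self.stack[-1])
            if role:
                self.found.append((role, text))


def collect_roles(html):
    """Return the (role, text) pairs a tag-based reader can extract."""
    parser = RoleCollector()
    parser.feed(html)
    parser.close()
    return parser.found


if __name__ == "__main__":
    sample = "<h1>Contact</h1><p>Call <b>now</b>: 555-0100</p>"
    for role, text in collect_roles(sample):
        print(role, "->", text)
```

The sketch only handles simple, well-nested markup, but it shows the level at which plain HTML describes a page: the program learns that something is a heading or a paragraph, and nothing more.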
Regardless of how many HTML tags were introduced in the early years, the language itself has proven limited when it comes to semantics, for the simple reason that each of those tags is too general for a machine to draw meaning from.
Search engines still rely on these types of HTML tags: What kind of text is this? Is it a heading? If so, how important is it to the user? Is it a list? If so, how many items does it contain? But a search engine, or any other machine interpreter, cannot determine whether a specific word within a document is your company’s address, a phone number or the result of a mathematical equation; from a robot’s point of view it is nothing but a string of characters. Because HTML was so limited in this respect, microformats were introduced, and the vision of a semantic web was once again pushed forward.
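As a rough illustration of the difference microformats make, the sketch below reads the same contact details twice: once from bare markup and once from markup annotated with hCard class names (fn, org, street-address, tel and so on, wrapped in a vcard container). The class names come from the hCard microformat; the company details, the HCARD_PROPERTIES subset, and the HCardReader and extract_hcard names are assumptions made for this example. It is not a complete microformats parser.

```python
from html.parser import HTMLParser

# A small, illustrative subset of hCard property class names.
HCARD_PROPERTIES = {
    "fn", "org", "tel",
    "street-address", "locality", "postal-code", "country-name",
}


class HCardReader(HTMLParser):
    """Collects text from elements whose class names an hCard property."""

    def __init__(self):
        super().__init__()
        self.stack = []    # (tag, [hCard properties]) for open elements
        self.fields = {}   # property name -> extracted text

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        props = [c for c in classes if c in HCARD_PROPERTIES]
        self.stack.append((tag, props))

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, ignoring stray end tags.
        if any(open_tag == tag for open_tag, _ in self.stack):
            while self.stack.pop()[0] != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        # Attribute the text to the innermost element carrying a property.
        for _, props in reversed(self.stack):
            if props:
                for name in props:
                    joined = (self.fields.get(name, "") + " " + text).strip()
                    self.fields[name] = joined
                break


def extract_hcard(html):
    """Return a dict of hCard properties found in the markup."""
    reader = HCardReader()
    reader.feed(html)
    reader.close()
    return reader.fields


# Invented sample data: the same details, without and with hCard classes.
PLAIN = "<div><span>Example Ltd.</span> <span>555-0100</span></div>"
MARKED_UP = """
<div class="vcard">
  <span class="fn org">Example Ltd.</span>
  <span class="street-address">1 Example Street</span>
  <span class="tel">555-0100</span>
</div>
"""

if __name__ == "__main__":
    print(extract_hcard(PLAIN))
    print(extract_hcard(MARKED_UP))
```

From the bare markup the reader recovers no labeled fields at all, while the annotated version yields a name, an organization, a street address and a phone number. The text on the page is identical in both cases; only the class names tell the machine what each string stands for.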
Depending on the context of your project, there are advantages and disadvantages to adding these code blocks to a web page’s markup. From the standpoint of search engine optimization, for instance, inserted microformats may eventually lead to better rankings in search results.[ 3 ]
Nevertheless, setting aside their current advantages and disadvantages, these formats and this vision of a semantic web bring a unity between the data we share and its significance to humankind.
This post was updated in 2011.
