Lately I’ve been absorbed in microformats; I love the idea behind them. Their backwards compatibility is great, and these days it’s really easy to implement microformat readers with off-the-shelf XML parsers. In fact, microformats are designed to work that way.
So what else is so great about microformats?
When you visit a blog, it’s usually easy to pick out its parts: blog posts, blogrolls, tags and so on. That’s because your brain already knows the blog pattern and its elements. Unfortunately, the same thing isn’t that easy for machines. As an example, take a web crawler that regularly visits your blog and indexes the updates. Can it identify a blog post block, or even tell that your blog is a blog? In practice, it’s costly to make it learn blog patterns from pure HTML/XHTML mark-up without any helpers. This is one of the major problems of today’s web. Luckily, instead of complex pattern recognition techniques, microformats come with a simple idea: add extra information to your mark-up to label segments, so that machines can easily tell what they are.
This technique is very similar to the philosophy behind the fundamental HTML/XHTML elements. For instance, we know that an a tag denotes a link and that h1, h2, h3... tags are used for headings. They all have well-defined meanings, so web browsers know how to react when they come across one while parsing the source mark-up. What’s more, some of today’s search engine crawlers, such as Googlebot, find headings and give them extra meaning.
So instead of using only the standard HTML or XHTML tags, why don’t we use richer structures? For instance, if I want my name on a web page to be machine readable, I can mark it up in the following way:
<author>JBF</author>
An XML parser can then easily find the author tag and learn my name. But this mark-up doesn’t validate against the standards, so we need another way to embed this extra information without fighting every web browser and the W3C. The microformats community does it with class. Tags are not the only option for putting extra information in the source; any tag attribute would do. But class is the most suitable of all the attributes. Let’s see what the mf community came up with back in 2004:
<div class="vcard"><span class="fn">JBF</span></div>
The hCard specification says that every element with the vcard class holds contact information (note that the root class name is vcard, not hcard). hCard is essentially a one-to-one representation of the IETF vCard standard in HTML. Inside the hCard you can see the fn class, which marks the person’s formatted name. The rest of the hCard properties are listed on the hCard page of the mf wiki. So again, a parser can easily walk through the source and pick out vcard and the classes inside it to extract the author information.
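To make the parsing step concrete, here is a minimal sketch in Python using the standard library's XML parser. It assumes the page is well-formed XHTML, and the function names (has_class, extract_formatted_names) are my own, not part of any microformats library. The hCard specification names the root class vcard, so that is what the sketch looks for.

```python
import xml.etree.ElementTree as ET


def has_class(element, name):
    """True if `name` is one of the space-separated class names on the element."""
    return name in element.get("class", "").split()


def extract_formatted_names(xhtml):
    """Return the text of every `fn` element found inside a `vcard` element.

    Assumption: `xhtml` is well-formed, so a plain XML parser can read it.
    Nested vcards are not handled specially in this sketch.
    """
    root = ET.fromstring(xhtml.strip())
    names = []
    for card in root.iter():
        if not has_class(card, "vcard"):
            continue
        for child in card.iter():
            if has_class(child, "fn"):
                # itertext() also collects text from nested tags such as <b>.
                names.append("".join(child.itertext()).strip())
    return names


sample = '<div class="vcard"><span class="fn">JBF</span></div>'
print(extract_formatted_names(sample))  # ['JBF']
```

Because class holds a space-separated list, the check splits the attribute instead of comparing the whole string, so an element marked up as class="author vcard" is still recognised.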
The mf community has already defined many specifications, such as hCard, hCalendar, hAtom, hResume and geo, to capture many of today’s most common patterns on the web.
But is it all perfect?
Nothing in life is, of course. Microformats seem to be a simple and easy-to-adopt idea, but in real life we usually don’t list all the information about ourselves in one place, one item after another. So I wonder why hCard wasn’t designed to support contact information that is distributed rather than gathered in a single block. And what about the extra effort we have to spend on microformats while posting? Is everyone crazy enough about mf to spend that extra energy on it? Is it even possible to abstract information into patterns? And what is the optimal number of patterns that can be specified as open standards?
Another question comes to mind when I hear that hAtom can be used instead of feeds. That way we wouldn’t need a separate feed to let our readers follow us through different mediums. But why waste network traffic on presentation mark-up and then spend processor power extracting the information from it? Isn’t separating data from the UI a better idea?
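To give a feel for the extraction work a reader would have to do, here is a rough Python sketch that pulls entries out of a page marked up with hAtom class names (hfeed, hentry, entry-title, entry-content). The sample page and the function names are made up for illustration, and the sketch assumes well-formed XHTML; a real reader would also have to cope with dates, authors, permalinks and messy mark-up.

```python
import xml.etree.ElementTree as ET


def has_class(element, name):
    """True if `name` is one of the space-separated class names on the element."""
    return name in element.get("class", "").split()


def first_text(parent, class_name):
    """Text of the first descendant carrying `class_name`, or None if absent."""
    for element in parent.iter():
        if has_class(element, class_name):
            return "".join(element.itertext()).strip()
    return None


def extract_entries(xhtml):
    """Collect title and content for every `hentry` element in the page."""
    root = ET.fromstring(xhtml.strip())
    entries = []
    for element in root.iter():
        if has_class(element, "hentry"):
            entries.append({
                "title": first_text(element, "entry-title"),
                "content": first_text(element, "entry-content"),
            })
    return entries


# Hypothetical blog page; the rest of the layout mark-up is left out.
page = """
<div class="hfeed">
  <div class="hentry">
    <h2 class="entry-title">First post</h2>
    <div class="entry-content"><p>Hello, microformats.</p></div>
  </div>
  <div class="hentry">
    <h2 class="entry-title">Second post</h2>
    <div class="entry-content"><p>Still no separate feed.</p></div>
  </div>
</div>
"""

for entry in extract_entries(page):
    print(entry["title"], "->", entry["content"])
```

Even in this tiny case the reader has to download and walk the whole page to find two entries, which is exactly the overhead I am asking about.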
These are my questions and maybe you have the answers!