Enthusiasm for Microformats Premature

Earlier this year I raved about Microformats here on my blog. When Tantek Çelik gave his presentation at the Future of Web Apps Conference, I had numerous epiphanies. As I am wont to do, I projected my own ideas, envisioned how Microformats could solve several problems on the web, and came away completely enthused. On the strength of its topic alone, I felt it was the best presentation at the show.

I have since spent many hours on uf-discuss[1], and I’ve come to the conclusion that my enthusiasm for Microformats was unfortunately premature. But before explaining my concerns, let me give a quick overview.

30 Second Overview of Microformats

Microformats are developed through a community process. They allow web developers to provide semantic information within an HTML 4.01 document using defined keywords in class attributes[2]. This allows software to extract the semantic information from the HTML, much as a program could extract information from an XML file. The following example, if included in a web page, would indicate that the content on the page was licensed under a Creative Commons license:

<a rel="license" href="http://creativecommons.org/licenses/by/2.0/">License</a>
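To make "extract the semantic information" concrete, here is a minimal sketch of how a program might find such license links using only Python's standard library. The class name `LicenseLinkParser` and the helper `find_licenses` are made up for illustration; this is not an official Microformats parser, just one plausible way to read the `rel` attribute.

```python
from html.parser import HTMLParser


class LicenseLinkParser(HTMLParser):
    """Collects the href of every <a> or <link> whose rel includes "license"."""

    def __init__(self):
        super().__init__()
        self.licenses = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        attr_map = dict(attrs)
        # rel holds a space-separated list of link types, so split before matching.
        rel_values = (attr_map.get("rel") or "").lower().split()
        if "license" in rel_values and attr_map.get("href"):
            self.licenses.append(attr_map["href"])


def find_licenses(html):
    """Return the license URLs declared in an HTML string (hypothetical helper)."""
    parser = LicenseLinkParser()
    parser.feed(html)
    parser.close()
    return parser.licenses


page = '<p><a rel="license" href="http://creativecommons.org/licenses/by/2.0/">License</a></p>'
print(find_licenses(page))  # prints ['http://creativecommons.org/licenses/by/2.0/']
```

The point of the sketch is that the same link a person clicks is also the data a program reads; no separate file is involved.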

This example marks up a description of the time and place for the Future of Web Apps conference I attended:

<span class="vevent">
   <a class="url" href="http://www.futureofwebapps.com/pastevents.html">
      <span class="summary">Carson Workshops' Future of Web Apps</span>
   </a> was held
   <abbr class="dtstart" title="2006-09-13">September 13</abbr>-
   <abbr class="dtend" title="2006-09-14">14</abbr>, 2006
   at
   <span class="location">
      The Presidio's Palace of the Arts in San Francisco, California
   </span>.
</span>

The previous markup[3] would display as:

Carson Workshops’ Future of Web Apps was held September 13-14, 2006 at The Presidio’s Palace of the Arts in San Francisco, California.
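For a sense of what a consumer of that markup might do, the following is a simplified sketch that pulls the event's properties out of the HTML with Python's standard library. It assumes well-formed markup and handles only the handful of class names used above (`vevent`, `url`, `summary`, `dtstart`, `dtend`, `location`); the real hCalendar parsing rules cover considerably more, and the names `VEventParser` and `parse_events` are hypothetical.

```python
from html.parser import HTMLParser

TEXT_PROPS = {"summary", "location"}
DATE_PROPS = {"dtstart", "dtend"}
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}  # tags with no end tag


class VEventParser(HTMLParser):
    """Simplified reader: builds one dict per element with class="vevent"."""

    def __init__(self):
        super().__init__()
        self.events = []
        self._event = None  # the event being filled in, if we are inside one
        self._stack = []    # per open element: text properties it is collecting

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        attr_map = dict(attrs)
        classes = set((attr_map.get("class") or "").split())
        if self._event is None:
            if "vevent" not in classes:
                return
            self._event = {}
        text_props = set()
        for name in classes:
            if name == "url" and attr_map.get("href"):
                self._event["url"] = attr_map["href"]
            elif name in DATE_PROPS and tag == "abbr" and attr_map.get("title"):
                # Machine-readable date lives in the abbr's title attribute.
                self._event[name] = attr_map["title"]
            elif name in TEXT_PROPS or name in DATE_PROPS:
                self._event.setdefault(name, "")
                text_props.add(name)
        self._stack.append(text_props)

    def handle_data(self, data):
        if self._event is None:
            return
        # Text belongs to every enclosing element that is collecting a property.
        for props in self._stack:
            for name in props:
                self._event[name] += data

    def handle_endtag(self, tag):
        if self._event is None or tag in VOID_TAGS:
            return
        self._stack.pop()
        if not self._stack:  # the vevent element itself just closed
            cleaned = {k: " ".join(v.split()) for k, v in self._event.items()}
            self.events.append(cleaned)
            self._event = None


def parse_events(html):
    """Return a list of event dicts found in an HTML string (hypothetical helper)."""
    parser = VEventParser()
    parser.feed(html)
    parser.close()
    return parser.events


SAMPLE = """
<span class="vevent">
   <a class="url" href="http://www.futureofwebapps.com/pastevents.html">
      <span class="summary">Carson Workshops' Future of Web Apps</span>
   </a> was held
   <abbr class="dtstart" title="2006-09-13">September 13</abbr>-
   <abbr class="dtend" title="2006-09-14">14</abbr>
   at
   <span class="location">San Francisco, California</span>.
</span>
"""

for event in parse_events(SAMPLE):
    print(event)
```

The sample string is a condensed copy of the event markup, with the location shortened. Note that the dates come from the `title` attributes rather than the visible text, which is how the markup serves people and programs at the same time.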

To learn more about Microformats, visit http://microformats.org.

Our Mismatched Vision

I had envisioned a community process defining specific Microformats for different vertical needs, and web developers then using these Microformats to expose extractable data in their web pages. Business partners and other interested parties could then simply scrape these structured pages to retrieve the information, all without anyone having to create separate XML files and related navigation. This would give 80% of the benefit of the semantic web with 20% of the effort[4].

Unfortunately, the Microformat community’s vision didn’t align.

So where was the mismatch? Read on:

So, after many vision-limiting responses, I’ve become both disheartened and disenchanted with Microformats, especially after I had envisioned them solving so many real-world problems.

After the letdown

After an extremely compelling vision, it’s hard to backtrack and just ignore it. But unfortunately, the Microformats community’s vision doesn’t sync with mine. Continuing to advocate for an alternate vision will likely just waste my time and certainly upset everyone on the list, so that’s not a viable option. Instead, I’ll ponder the issue, and will post again if an alternate solution presents itself.

Microformats good, just know what to expect

However, I do want to clarify that I didn’t write this to trash Microformats or Tantek or the community. I still think the Microformat concept is brilliant, even with its differing vision. I still respect Tantek and the others on the Microformat list and appreciate their efforts. And I’m still impressed by existing Microformats created by the community and would love to see them implemented on all applicable web pages.

No, I didn’t write this to trash Microformats. Instead, I wrote it to tell people they should take great care in setting their expectations regarding Microformats. Otherwise they’ll go through the same cycle of elation, frustration, and then disappointment that I did, and that won’t do anybody any good. And in fairness, I wrote it in small part to officially register my issues with the governance of the Microformat community.

  1. “u” is the symbol for “micro”, and “f” is the first character of “format”, so “uf-discuss” is the mailing list for discussing Microformats. Get it? Uh, huh, too cute for words.
  2. The “class” attribute is the main one used by Microformats, although they also use “rev” and “rel” and a few more, depending on the specific Microformat.
  3. Carson Workshops actually uses this Microformat, called “hCalendar”, to mark up their entire conference schedule for the next time this conference is run; you can see it here. As an aside, they had a link on their schedule page for the San Fran conference that would add the entire conference into a calendar such as Outlook. At the moment their current page doesn’t do that; why, I don’t know.
  4. Please don’t debate the percentages; I was being convenient and the percentages are tangential to the point of the post. Thanks in advance for your support. :)

11 Replies to “Enthusiasm for Microformats Premature”

  1. Mike,

    You should really take a look at <a href="http://skimstone.x-port.net/introduction-to-rdfa">RDFa</a>. It’s being worked on over at the W3C by leading standards proponents from the HTML, RDF and Creative Commons communities. It is similar to microformats in that the intention is to allow HTML to carry rich metadata, so making publishing of information like vCards, calendars, items for sale, and so on, much easier.

    But a key difference is that we don’t spend time trying to define vocabularies, in the way that the microformats ‘community’ does, since vocabularies are best defined by experts in the particular field that needs the language (and they invariably already exist anyway).

    Another major difference is that we have a generic processing model. This means that not only do you only need one RDFa parser, but it also means that the rules for making different vocabularies work together are clear. Microformats seems to be getting bogged down in the need to create and maintain parsers for each language, and has no clear way to work out how to combine metadata from different languages.

    One last thing worth mentioning is that RDFa takes great pains to specify what exactly something ‘means’ in terms of RDF. That doesn’t make the syntax any more difficult–we also have <code>rel="license"</code>, for example–and it doesn’t mean that authors need to know anything about RDF either. But what it does do is allow the processing of documents that embed RDFa in such a way that they can benefit from the large number of rich processing tools that already exist for the Semantic Web.

    By using RDFa with HTML we don’t create an <em>alternative</em> to the Semantic Web, but we allow HTML documents to <em>join</em> it; as Ben Adida from Creative Commons memorably put it once, we’re <strong>bridging</strong> the clickable and semantic webs.

    There is a <a href="http://rdfa.info">site devoted to RDFa</a> which has links to the main documentation, bookmarklets, blog entries, demos, and so on.

    All the best,

    Mark Birbeck

  2. Mmm…the form says "Some html is allowed", but none of my mark-up has worked! Apologies for the way it looks.

    I’ve also put the wrong link to my own blog, which is actually at http://internet-apps.blogspot.com/. I mention it because it has quite a lot on RDFa, web applications, mash-ups, microformats, etc.

  3. Thanks for a very thoughtful post. I for one am continuing to listen to your input and feedback.

  4. >I for one am continuing to listen to your input and feedback.

    While describing it as a "vague emotional statement which was not actually a criticism of microformats per se"

  5. I think the most interesting thing about "Microformats" (sorry, gotta quote that, it’s quite annoying that an intrinsically generic term would be dominated by such a limited standard), is that they feel the need for top-down control of something as wide-ranging as

    It seems clear the idea is good, but the process has to essentially throw open the doors to openness if they want to really succeed. Again, such a shame that such a generic term is associated with the top-down approach.

    On the other hand, the only thing that bothers me is their project name. Other than that, the concept is so simple and easy to use that proliferation certainly doesn’t rely on their ideas or concepts, or even community.

    Really pretty funny this post is labeled as vague and emotional by Tantek. It’s anything but that. I think you should wear your censorship as a badge of honor.

    Design by committee == always bad for implementations. The minute they stepped out of the conceptual design, into the concrete implementation, they should have left the committee driven control far behind, but they didn’t, IMHO.

  6. Mike, it appears that your blog has been chosen by a particular individual as a place for exaggeration and drama. I stand by my statements – your blog post as a whole is quite thoughtful, and I’m still absorbing everything in it.

    Unfortunately that particular individual chose to extract only a vague emotional statement from your blog post, which was not really a criticism of microformats per se but more about the enthusiasm around them, and then add that statement to the "criticism" page, which I considered off topic and thus removed. Furthermore, this same individual exaggerates/inflames with terms like "censorship" when obviously anything on a wiki is always present in diffs and therefore not censored.

    Mike, I encourage you to add to the criticism page yourself, you don’t need a member of the community who is currently moderated on the mailing list due to his misbehavior speaking for you and potentially misrepresenting you.

    Thanks,

    Tantek

  7. What a pity that Tantek feels the need to resort to dishonesty, "snarky" insinuation and ad hominem abuse.

    My posts to the moderated list are currently being censored by him, because I failed to comply with a *request* he made, to cease posting reasoned criticism of his unreasonable and hypocritical behaviour, before that request had even reached me.

    His claim that non-visible past-content is not censored is ludicrous. It goes against his own claims about hidden meta-data, and is akin to the censoring of books in a library by removing them from shelves and catalogues. Any fool can see through that argument.

    Tantek: If you objected to the quote I extracted from Mike’s post, "I’ve come to the conclusion that my enthusiasm for Microformats was unfortunately premature" (which strikes me as a pretty fair one-sentence representation of his comments, which he titled "Enthusiasm for Microformats Premature"; I’m sure he’ll say if that’s not so), why didn’t you amend it, or append your view, rather than simply censoring both it *and* the link to the comment?

  8. Andy,

    You’ve proved my point about your tendency to overdramatize, inflame, and exaggerate. In addition, you continue to provide weak arguments by analogy (wiki diffs accessible through a user interface and hidden meta-data?), perhaps for the purpose of simply being argumentative as you have on the microformats mailing list.

    The tone of discussion on Mike’s blog (and his comments) is up to him to moderate, but your above statements are good examples why you have been moderated on the microformats lists.

    I want to end this on a positive note however (as this will be my last comment on this post regarding your comments) – you have made some positive contributions to microformats, both on the wiki and in the mailing lists, and I continue to hope that you focus on such positive contributions and avoid negative outbursts, drama-baiting, inflaming, exaggeration, and argumentativeness.

    Thanks,

    Tantek

  9. What a pity that Tantek again feels the need to resort to dishonesty, "snarky" insinuation and ad hominem abuse…

  10. @Mark: From the sounds of it, Microformats are just not what you’re looking for. It sounds like you want RDF or GRDDL or something more explicitly constructed, maybe with an XSLT transform to make it human readable.

    The Microformat way to avoid "naming conflicts in the wild" is by avoiding "a large number of interoperable Microformats." As for "non-visible Microformats," it says "Designed for humans first and machines second" right on the front page of microformats.org – what can humans do with invisible information?

    From my understanding of it, Microformats are a way to pave the cow paths already trod upon by blogs, wikis, social sites, et al. Trying to "address… vertical and vendor-specific needs" seems like a different goal to me.

    It just sounds like you’re just reaching for a different tool, and your hand happened to land on Microformats.
