28 February 2005

The battle between flat and deep hierarchies

Beelerspace has a nice introduction to the del.icio.us tagging system in which he refers back to his praise of flat hierarchies. The argument goes that storing data in folders locks that data into one tag (albeit one tag that exists in a hierarchy), and that this limitation actually removes information from the data. For example, an email about music from a friend could be filed in the Personal > From Friends folder, eliminating the chance to categorize it in the Entertainment > Music folder. What to do? By flattening the hierarchy to tags, you can tag the email with both categories and find it under either one.
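To make the flat approach concrete, here is a minimal sketch of a tag index in Python. It is not how del.icio.us works internally; the `TagIndex` class and the email identifier are made up for illustration.

```python
from collections import defaultdict


class TagIndex:
    """Hypothetical flat tag index: each tag maps to a set of items."""

    def __init__(self):
        self._by_tag = defaultdict(set)

    def tag(self, item, *tags):
        # One item can carry any number of tags at once.
        for t in tags:
            self._by_tag[t].add(item)

    def find(self, tag):
        # Return a copy so callers cannot mutate the index.
        return set(self._by_tag.get(tag, set()))


index = TagIndex()
index.tag("email-about-music-from-a-friend", "From Friends", "Music")

from_friends = index.find("From Friends")
music = index.find("Music")
```

The same email turns up under both tags, which is exactly what a single folder cannot do.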

I've always had a problem with flat hierarchies because of the loss of semantic depth. Very specific tags, such as From Friends, lose their place in a larger semantic space, such as Personal. What if I want to view all Personal emails? The flat tags force me either to give that up or to create tags that somehow contain their full semantic path. Maybe this is too obsessive, but it points out the ultimate limitation of the system. Tags must eventually be able to facilitate access to hundreds or thousands of pieces of data. If tags don't know that From Friends is Personal information, they have broken down in a way that is similar to, but different from, the way the deep hierarchy of folders breaks down.
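The "full semantic path" workaround might look something like the sketch below. The slash-separated tag format, the sample emails, and the `items_under` helper are all assumptions for illustration, not a feature of any tagging system mentioned here.

```python
# Hypothetical data: each email carries tags that spell out their full path.
emails = {
    "concert-invite": {"Personal/From Friends", "Entertainment/Music"},
    "tax-reminder": {"Personal/Finance"},
    "newsletter": {"Entertainment/Music"},
}


def items_under(items, ancestor):
    """Return items having a tag equal to, or nested beneath, `ancestor`."""
    prefix = ancestor + "/"
    return sorted(
        name
        for name, tags in items.items()
        if any(t == ancestor or t.startswith(prefix) for t in tags)
    )


personal = items_under(emails, "Personal")
```

This gets back the "all Personal emails" view, but only because every tag drags its ancestry along by hand. The tags themselves still know nothing about each other.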

The solution is to marry the two ideas and allow semantic tags. Somehow. Maybe WordNet could be used. A tag could inherit from its hypernyms [Wikipedia], the broader terms it is a kind of, as queried from WordNet. WordNet says that "friend" is a kind of "person," so searches for "person" would include emails tagged with "friend." Not so interesting, but it's a start.
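Here is a rough sketch of what hypernym inheritance could look like. To keep it self-contained, it uses a tiny hand-written `HYPERNYMS` table as a stand-in for a real WordNet query; only the "friend" is a kind of "person" link comes from the paragraph above, and the rest of the table, the sample emails, and the function names are invented for illustration.

```python
# Hand-written stand-in for WordNet lookups (an assumption, not real WordNet data).
HYPERNYMS = {
    "friend": ["person"],
    "person": ["organism"],
}


def expand(tag):
    """Return the tag plus every hypernym reachable from it."""
    seen = set()
    stack = [tag]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guard against cycles in the table
        seen.add(current)
        stack.extend(HYPERNYMS.get(current, []))
    return seen


def search(items, query):
    """Find items with a tag that is the query or a kind of the query."""
    return sorted(
        name
        for name, tags in items.items()
        if any(query in expand(t) for t in tags)
    )


emails = {"concert-invite": {"friend"}, "invoice": {"vendor"}}
matches = search(emails, "person")
```

A search for "person" picks up the email tagged "friend" without anyone having tagged it "person" by hand. Swapping the table for actual WordNet queries would be the next step, with the usual trouble of picking the right sense of each word.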

28 February 2005 at 11:55:30 AM