22 August 2005

More on semantics

This HP paper on tagging systems explains well what I had previously explained so poorly. From the paper:

Like a Venn diagram, the set of all the items marked cats and those marked africa would intersect in precisely one way, namely, those documents that are tagged as being about African cats. Even this is not perfect, however. For example, a document tagged only cheetah would not be found in the intersection of africa and cats, though it arguably ought to; like the foldering example above, a seeker may still need to search multiple locations.

This illustrates the limitations of both folders and tags, and shows that an ontology, however achieved, is required to provide more encompassing results. A user should be able to specify "Africa" and "cats", and the system should understand all of the hyponyms of "cats" ("cheetah" etc.) as well as the places that fall within "Africa" ("Egypt" etc.). A taxonomy gives us this.
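
To make the idea concrete, here is a minimal sketch of taxonomy-aware tag search. Everything in it is an assumption made for illustration: the NARROWER table, the sample documents, and the function names are hypothetical and do not come from the HP paper or from any real tagging system.

```python
# Hypothetical taxonomy: each term maps to its directly narrower terms.
NARROWER = {
    "cats": ["cheetah", "lion"],
    "africa": ["egypt", "kenya"],
}

# Hypothetical documents, each with the set of tags a user gave it.
DOCUMENTS = {
    "serengeti-notes": {"africa", "cats"},
    "cheetah-in-kenya": {"cheetah", "kenya"},
    "cheetah-only": {"cheetah"},
    "housecat-care": {"cats"},
}


def expand(term):
    """Return the term plus every narrower term beneath it."""
    found = {term}
    stack = [term]
    while stack:
        for child in NARROWER.get(stack.pop(), []):
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


def search(documents, *terms):
    """Return names of documents with at least one tag from each expanded term."""
    groups = [expand(term) for term in terms]
    return sorted(
        name for name, tags in documents.items()
        if all(tags & group for group in groups)
    )


print(search(DOCUMENTS, "africa", "cats"))
```

A plain tag intersection would return only "serengeti-notes"; with expansion, "cheetah-in-kenya" is found as well. The document tagged only "cheetah" is still missed in this sketch, because nothing in this small taxonomy says that cheetahs are African. Closing that gap would take a richer ontology than a single narrower-term table.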

People have complained about the brittleness of the tagging systems implemented so far. I agree that they are not ideal, but as this HP paper shows, their existence and popularity allow us to examine how a system could improve on them.

posted by sstrader on 22 August 2005