Facet-based Information Navigation

Facet-based site navigation systems provide a way of browsing complex data sets without forcing a formal taxonomy onto the data. Facet-based browsing is different from other methods of navigating information. A common navigation system is a taxonomy, where you start at the top of the tree and browse your way down to the information that you are looking for. Many web sites use a taxonomy for browsing information. At ConsumerReports.org, you click on Appliances -> Large Kitchen -> Freezers. Another example would be looking for a MedLine article in a physical library: you find the Journal -> Volume -> Issue -> Pages.

In a facet-based system, you can select any of several facets to browse the information by: Date, Author, Journal, Keyword. You can click on a value in each facet to limit the results. Browsing a faceted system will never lead you to an empty result. You can see a demo at Siderean Software’s Seamark Demo. Siderean Software’s products crawl the repository to create a facet-based information browser. The selections are generated on the fly with each sub-selection.
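To make the idea concrete, here is a minimal sketch of facet narrowing in Python. It is not how Seamark works internally; the records, field names, and the helpers `narrow` and `facet_choices` are all made up for illustration. The point it shows is that if the choices offered are computed from the records that remain, every choice leads to at least one result.

```python
from collections import Counter

# Hypothetical article records; the field names mirror the facets named above.
ARTICLES = [
    {"date": 2003, "author": "Smith", "journal": "JAMA", "keyword": "asthma"},
    {"date": 2003, "author": "Jones", "journal": "Lancet", "keyword": "asthma"},
    {"date": 2004, "author": "Smith", "journal": "JAMA", "keyword": "diabetes"},
    {"date": 2004, "author": "Lee", "journal": "BMJ", "keyword": "asthma"},
]
FACETS = ("date", "author", "journal", "keyword")


def narrow(records, selections):
    """Keep only the records that match every selected facet value."""
    return [
        r for r in records
        if all(r[facet] == value for facet, value in selections.items())
    ]


def facet_choices(records, selections):
    """Count the values still available in each facet not yet selected.

    The counts come from the remaining records only, so every value
    offered here matches at least one record.
    """
    remaining = narrow(records, selections)
    return {
        facet: Counter(r[facet] for r in remaining)
        for facet in FACETS
        if facet not in selections
    }


if __name__ == "__main__":
    # After picking keyword=asthma, list what can still be chosen.
    for facet, counts in facet_choices(ARTICLES, {"keyword": "asthma"}).items():
        print(facet, dict(counts))
```

Each click adds one entry to `selections`, and the choices are recomputed from whatever is left, which is the "generated on the fly with each sub-selection" behavior described above.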

Why this blog? Folksonomy as integration element

Why did I build this blog now? To demonstrate the power of tagging and of folksonomy as an enterprise collaboration and communication tool. There are several key parts of this integration picture that have finally come together (at least for me).

The first element is the growth of tagging – the ability of users to assign keywords to objects that they place in a repository. Tagging allows users to mark URLs in del.icio.us with keywords that describe those URLs. Tagging allows users to mark photos that they upload into Flickr with keywords. Tagging allows bloggers to mark their entries with keywords that Technorati will capture. All this tagging builds a folksonomy – a taxonomy of the people, by the people, for the people.
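As a rough illustration of how individual tags add up to a folksonomy, the sketch below pools tags from several users into one index from tag to items. The users, URLs, and the `build_folksonomy` helper are invented for the example; none of the services mentioned above are being called.

```python
from collections import defaultdict

# Hypothetical (user, item, tags) triples, as users might submit them.
TAGGINGS = [
    ("alice", "http://example.org/a", ["Folksonomy", "tagging"]),
    ("bob", "http://example.org/a", ["folksonomy", "web"]),
    ("bob", "http://example.org/b", ["tagging"]),
]


def build_folksonomy(taggings):
    """Pool everyone's tags into a mapping of tag -> set of tagged items."""
    index = defaultdict(set)
    for _user, item, tags in taggings:
        for tag in tags:
            # Lowercase so "Folksonomy" and "folksonomy" land in one bucket.
            index[tag.lower()].add(item)
    return dict(index)


if __name__ == "__main__":
    for tag, items in sorted(build_folksonomy(TAGGINGS).items()):
        print(tag, sorted(items))
```

Nobody designs the categories up front; they are simply whatever keys end up in the index.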

The second element is the growth of simple HTML, REST or RSS interfaces. Many of these interfaces have guessable formats. If you want to get a list of all of my objects (URLs with descriptions and tags) from del.icio.us, you enter the URL http://del.icio.us/jimphelps. If you want to see just those objects that have to do with folksonomy, you enter http://del.icio.us/jimphelps/folksonomy. It is pretty easy to guess the rest. The interfaces that Flickr, Technorati and del.icio.us expose are very simple to use. This means that people have started building cool stuff against these interfaces.
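The guessable format can be captured in a few lines of Python. This sketch only builds the URLs following the user/tag pattern shown in the two examples above; the function name `bookmarks_url` is my own, and it makes no request, so it assumes nothing about what the service returns.

```python
from urllib.parse import quote

BASE = "http://del.icio.us"


def bookmarks_url(user, tag=None):
    """Build a URL following the BASE/user[/tag] pattern from the examples."""
    parts = [quote(user, safe="")]
    if tag is not None:
        # Percent-encode the tag so spaces or slashes cannot break the path.
        parts.append(quote(tag, safe=""))
    return BASE + "/" + "/".join(parts)


if __name__ == "__main__":
    print(bookmarks_url("jimphelps"))
    print(bookmarks_url("jimphelps", "folksonomy"))
```

Swapping in a different user or tag is all it takes to "guess" another page, which is what makes these interfaces easy to build against.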

The third element is the open source movement around these services. Many people are developing code that leverages these various services. The code is out there and available for use, expansion and adaptation.

Those key elements let me construct this blog as a demonstration of the power of folksonomy as an integration element in enterprise communication and collaboration.


Clay Shirky – Ontology is Overrated Presentation Notes

IT Conversations: Clay Shirky – Ontology is Overrated

Clay Shirky gave a presentation at ETech titled Ontology is Overrated. You can listen to the presentation at the link above (ITConversations).

Highlights [with my expansions in square brackets]:

(1) Ontologies are left over from times when we had to file objects on shelves. This is no longer true with data on the web [or in an enterprise].

(2) The ontological goal of finding the perfect categorization scheme for the “essence” of the objects you are categorizing is a false goal in this era.

(3) The Library of Congress categorization scheme (hierarchical buckets without overlap between buckets) is optimized for the number of books on the shelves, not for conceptual ideas or intellectual aspects. Books need to be in one place, but ideas can be all over the place. We have confused the container with the things within the container.

(4) There is no shelf. There is no physical constraint that we have to enforce upon the web.

