Swimming with the Razorfishes

Monday, December 01, 2003

Oddly enough, Dave Winer and the Longhorn development people both got me thinking about something.

Dave is organizing his blog into a directory of sorts. It seems that date/time metadata plus Manila categories will route stories into various "directories".

The people developing the Longhorn filesystem and its APIs are pushing the boundaries by attaching metadata to filesystem objects and by abstracting filesystem objects into things that are file-backed, network-backed, etc. Wrapping this into a relational store adds an aspect of "queryability."

But here is my take on top-down, imposed taxonomies: they are fine, as long as you don't assume they'll be useful to anyone but you.

The idea of putting something "into" a directory or category is a premature optimization. When the simple act of parsing and categorizing data was a taxing procedure, like when we all had honkin' 286 desktops with 256k of RAM, it made sense to place the burden of categorization on the end-user; that way, the smart person imposed some kind of taxonomy up front, and the computer just had to handle the task of indexing and searching. "This file goes in this directory." That kind of thing.

But my PowerBook is damn fast. Way faster than I need for most of the software I use (with the notable exceptions of Radio UserLand, Photoshop, and compiling large EJB-based software systems). It is so fast that real lexical discovery is possible with my information.

This is the Google model. "Just give me all your stuff. I'll analyze the soup and find my own connections."

That way, Google can figure out that my 'blog entry from February 16 should fall into the "fucktard" category; certainly not one I'd think of placing myself in. Google can also discover that in addition to my sparkling entry, Real Live Preacher and Chris should fall into the "fucktard" category. This leaves the human to ponder the nature of the connection between these three things, rather than to establish the connection itself.
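To make the "analyze the soup" idea concrete, here is a toy sketch in Python. It is emphatically not how Google works; it just builds a plain inverted index and treats any word shared by two or more documents as a category nobody assigned up front. The document names, the sample text, and the helper names `build_index` and `emergent_categories` are all made up for illustration.

```python
import re
from collections import defaultdict


def build_index(docs):
    """Map each lowercase word to the set of document names that contain it."""
    index = defaultdict(set)
    for name, text in docs.items():
        for word in re.findall(r"[a-z']+", text.lower()):
            index[word].add(name)
    return index


def emergent_categories(index, min_docs=2):
    """Treat every word found in at least min_docs documents as a category."""
    return {
        word: sorted(names)
        for word, names in index.items()
        if len(names) >= min_docs
    }


# Hypothetical sample data; nobody filed these entries under anything.
docs = {
    "my_entry": "the golden taxonomy is a trap",
    "preacher": "a sermon about a golden calf",
    "chris": "notes on compiling software",
}

index = build_index(docs)
categories = emergent_categories(index)
print(categories["golden"])  # -> ['my_entry', 'preacher']
```

Notice that the word "a" also turns up as a "category" here. A real system would have to weight or filter terms to separate signal from noise, which this sketch does not attempt.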

The Google model is dynamic; it allows hierarchy, taxonomy, and connections to grow over time. A top-down taxonomy isn't better or worse than one allowed to emerge, but it inevitably reflects the mind that created it.

Beware the golden taxonomy.
