Recently there has been massive growth in the use of tags as a simple, flexible way to categorize resources. Tags are often used collaboratively to help share information on websites such as del.icio.us. However, the number of tags used in such a service is extremely large, and the unstructured nature of tags limits their value when navigating these websites and prevents users from fully exploiting tags added by others. Clustering similar tags can improve this by adding structure. In this paper we discuss techniques for deriving tag similarity and explain two tag clustering algorithms. We applied the algorithms to two datasets containing tags provided by users with common interests. The first dataset is from a tagging service used by a small group of colleagues and the second is from a public, web-based service. The paper examines the effectiveness of both clustering algorithms and their robustness to the different types of data, and suggests possible ways to improve the algorithms.
Wednesday, December 26, 2007
Edwin Simpson has published HP technical report HPL-2007-190, "Clustering Tags in Enterprise and Web Folksonomies".