Friday, December 28, 2007


I'm adding COinS and DOIs to our Contribution pages. Don't look for them yet; they aren't live. I've also updated the RSS feed for the pages. This is not cataloging, or is it? The metadata supports some of the FRBR user tasks: find, identify, and obtain are all aided by these bits of info. In any event, anything that makes our work easier to find, use, and cite is good for the Institute. For me it makes a nice change of pace from ISBD/MARC/AACR2.
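For anyone curious what a COinS looks like, here is a minimal sketch in Python. A COinS is an empty span with class "Z3988" whose title attribute carries an OpenURL ContextObject as URL-encoded key/value pairs. The function name, the choice of the journal format, and the sample values are my assumptions for illustration, not what our pages actually use.

```python
from html import escape
from urllib.parse import urlencode


def coins_span(title, author_last, author_first, year, doi=None):
    """Build a COinS <span> for an article-like item (hypothetical helper)."""
    fields = [
        ("ctx_ver", "Z39.88-2004"),
        # Assumption: the item is described with the journal KEV format.
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:journal"),
        ("rft.atitle", title),
        ("rft.aulast", author_last),
        ("rft.aufirst", author_first),
        ("rft.date", str(year)),
    ]
    if doi:
        # A DOI travels as an info URI in the rft_id key.
        fields.append(("rft_id", "info:doi/" + doi))
    # URL-encode the pairs, then HTML-escape so "&" becomes "&amp;".
    query = urlencode(fields)
    return '<span class="Z3988" title="{}"></span>'.format(escape(query, quote=True))


if __name__ == "__main__":
    # Made-up values; 10.1234/example is a placeholder, not a real DOI.
    print(coins_span("An Example Title", "Doe", "Jane", 2007, doi="10.1234/example"))
```

A browser tool that understands COinS can pick the span out of the page and hand the citation to a link resolver or reference manager.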

Wednesday, December 26, 2007

Clustering Tags

Edwin Simpson has published HP technical report HPL-2007-190, Clustering Tags in Enterprise and Web Folksonomies.
Recently there has been massive growth in the use of tags as a simple, flexible way to categorize resources. Tags are often used collaboratively to help share information using websites. However, the number of tags used in such a service is extremely large, so the unstructured nature of tags limits their value when navigating these websites, and prevents users from fully exploiting tags added by others. Clustering similar tags can improve this by adding structure. In this paper we discuss techniques for deriving tag similarity and explain two tag clustering algorithms. We applied the algorithms to two datasets containing tags provided by users with common interests. The first dataset is from a tagging service used by a small group of colleagues and the second is a public, web-based service. The paper examines the effectiveness of both clustering algorithms and their robustness to the different types of data, giving suggestions of possible ways to improve the algorithms.
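To make "tag similarity" a bit more concrete, here is a small Python sketch of one common approach: treat each tag as the set of resources it was applied to and compare tags by cosine similarity. This is a generic illustration, not the method from the report; the function names, the threshold, and the sample data are all made up.

```python
from collections import defaultdict
from math import sqrt


def tag_vectors(tagged):
    """Map each tag to the set of resources it was applied to."""
    vectors = defaultdict(set)
    for resource, tags in tagged.items():
        for tag in tags:
            vectors[tag].add(resource)
    return vectors


def cosine(a, b):
    """Cosine similarity of two tags, each given as a set of resources."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


def similar_pairs(tagged, threshold=0.5):
    """Return (tag1, tag2, score) for tag pairs scoring at or above threshold."""
    vectors = tag_vectors(tagged)
    tags = sorted(vectors)
    pairs = []
    for i, t1 in enumerate(tags):
        for t2 in tags[i + 1:]:
            score = cosine(vectors[t1], vectors[t2])
            if score >= threshold:
                pairs.append((t1, t2, score))
    return pairs


if __name__ == "__main__":
    # Hypothetical bookmarks and their tags.
    sample = {
        "r1": ["marc", "cataloging"],
        "r2": ["marc", "cataloging"],
        "r3": ["python"],
    }
    print(similar_pairs(sample))
```

Pairs that score high are candidates for the same cluster; a real clustering algorithm would go on to group tags using scores like these.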

Titles in Retail and Publisher Data

There has been much talk about using metadata from other communities to enrich our catalogs and/or lower the costs of cataloging. Recently there was quite a flap on AUTOCAT when distributors dumped minimum-level records into OCLC. Now Karen Coyle has looked at Titles in Retail and Publisher Data. Real data.