Problems with subject access to online catalogues and databases are not new. Studies on the use of OPACs have revealed two apparently endemic problems: on the one hand, the large number of searches with zero hits (failed searches) and on the other, the retrieval of an excessive amount of bibliographic records (information overload).
In this paper we describe a new information retrieval technique based on the combination of descriptor weighting and the use of the Universal Decimal Classification (UDC) call numbers.
The use of classification call numbers in order to search the catalogue has traditionally been very restricted. In most catalogues, call numbers are used only as topographical indicators and are not searchable. The new system described here makes much fuller use of them.
The system is based on the hypothesis that a set of descriptors correspond to a UDC call number. Through the analysis of the frequency of distribution of descriptors and call numbers, we create a set of clusters that allow increasing precision and recall. At the same time, these clusters offer alternative search modes, making it possible to systematize the indexing process and increase its consistency. Here we present a case study of the use of the system with the ERIC database.
Monday, June 02, 2008
Improving subject searching in databases through a combination of descriptors and UDC by Granados, Mariangels and Nicolau, Anna (2008) In Proceedings BOBCATSSS'08: Providing acces for everyone, Zadar (Croatia)