Wednesday, May 07, 2008

Using Wikipedia

Two new reports from HP Labs show interesting uses of Wikipedia in information management.

Boosting Inductive Transfer for Text Classification using Wikipedia by Somnath Banerjee. HPL-2008-42
Inductive transfer is applying knowledge learned on one set of tasks to improve the performance of learning a new task. Inductive transfer is being applied in improving the generalization performance on a classification task using the models learned on some related tasks. In this paper, we show a method of making inductive transfer for text classification more effective using Wikipedia. We map the text documents of the different tasks to a feature space created using Wikipedia, thereby providing some background knowledge of the contents of the documents. It has been observed here that when the classifiers are built using the features generated from Wikipedia they become more effective in transferring knowledge. An evaluation on the daily classification task on the Reuters RCV1 corpus shows that our method can significantly improve the performance of inductive transfer. Our method was also able to successfully overcome a major obstacle observed in a recent work on a similar setting. Publication Info: Published and presented at ICMLA 2007, the Sixth International Conference on Machine Learning and Applications (ICMLA'07), 13-15 Dec. 2007 Cincinnati, Ohio, USA
Clustering Short Texts using Wikipedia by Somnath Banerjee, Krishnan Ramanathan, and Ajay Gupta. HPL-2008-41
Subscribers to the popular news or blog feeds (RSS/Atom) often face the problem of information overload as these feed sources usually deliver large number of items periodically. One solution to this problem could be clustering similar items in the feed reader to make the information more manageable for a user. Clustering items at the feed reader end is a challenging task as usually only a small part of the actual article is received through the feed. In this paper, we propose a method of improving the accuracy of clustering short texts by enriching their representation with additional features from Wikipedia. Empirical results indicate that this enriched representation of text items can substantially improve the clustering accuracy when compared to the conventional bag of words representation. Publication Info: Published and presented at SIGIR 2007, the 30th Annual International ACM SIGIR Conference, 23-27 July 2007, Amsterdam, Netherlands

Monday, May 05, 2008

Slick Deal

Here is a bargain offered by Amazon, OCLC - MARC Record. It has free shipping too! This was seen on Slick Deals.

Don't they know they can get all the free MARC records they want from their local library?

Thanks Walter.

Thursday, May 01, 2008

myLOC

I may have missed this news, maybe while I was at TxLA, but I've not seen it elsewhere; the Library of Congress now has a "my" portal, myLOC.

Statement of International Cataloging Principles

The Statement of International Cataloging Principles is available for worldwide review.
As Chair of the IFLA Meeting of Experts on an International Cataloging Code (IME ICC) I am pleased to invite comments from the worldwide library community on the final draft of the Statement of International Cataloguing Principles and its accompanying Glossary.

In order to provide the appropriate review period and to schedule adequate time to cumulate, analyze, and incorporate comments before the General Meeting of IFLA in August, the Statement is being posted today on a public Wiki. The IFLA Headquarters Office is closed for holiday April 30-May 5th, but as soon as they return we will move the files there and redirect from the Wiki. In the meantime please link to: http://catprinciples.pbwiki.com/ and view and/or download the Statement for your review; and please use the accompanying voting document for your response.

MARC Records

Ed Summers has "created a bittorrent of the concatenated MARC files donated to the Internet Archive by Scriblio (7,030,372 records)":

http://inkdroid.org/torrents/lc-bib.torrent
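
For anyone who wants to poke at a file like this without installing anything, here is a rough sketch of reading concatenated MARC records (ISO 2709 transmission format) with only the Python standard library. The function names are my own, and decoding subfields as UTF-8 is an assumption; many LC records are in MARC-8 and would need a proper converter.

```python
# Minimal ISO 2709 (MARC transmission format) reader: a sketch, not a full parser.
FIELD_END = b"\x1e"   # field terminator
RECORD_END = b"\x1d"  # record terminator
SUBFIELD = b"\x1f"    # subfield delimiter


def iter_records(data: bytes):
    """Yield raw records from a byte string of concatenated MARC records."""
    pos = 0
    while pos + 24 <= len(data):
        length = int(data[pos:pos + 5])  # leader bytes 0-4: record length
        if length <= 0:
            break
        yield data[pos:pos + length]
        pos += length


def parse_record(raw: bytes):
    """Return a list of (tag, field_bytes) pairs for one raw record."""
    base = int(raw[12:17])        # leader bytes 12-16: base address of data
    directory = raw[24:base - 1]  # directory ends with a field terminator
    fields = []
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12]
        tag = entry[0:3].decode("ascii")
        field_length = int(entry[3:7])
        start = int(entry[7:12])
        # Drop the trailing field terminator from the field data.
        value = raw[base + start:base + start + field_length - 1]
        fields.append((tag, value))
    return fields


def subfields(value: bytes):
    """Split a variable data field into indicators and (code, text) subfields."""
    parts = value.split(SUBFIELD)
    indicators = parts[0].decode("ascii")
    # Assumption: UTF-8 text; MARC-8 records need real character conversion.
    pairs = [(p[:1].decode("ascii"), p[1:].decode("utf-8", "replace"))
             for p in parts[1:] if p]
    return indicators, pairs
```

Reading the whole file into memory is fine for a test file but not for seven million records; for real work a streaming reader or an existing MARC library is the better choice.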

Wednesday, April 30, 2008

Library of Congress Subject Heading Suggestion Blog-a-Thon

The results for the Library of Congress Subject Heading Suggestion Blog-a-Thon are in. The effort resulted in 24 subject headings, 6 cross-references, and 2 subdivision suggestions.

Tuesday, April 29, 2008

Transparency

Get Satisfaction looks like a unique 2.0 tool to make the organization transparent.
Get Satisfaction is a direct connection between people and companies that fosters problem-solving, promotes sharing, and builds up relationships. Thousands of companies use this neutral space to support customers, exchange ideas, and get feedback about their products and services. Get Satisfaction is open, transparent, and free. You’re free to ask, free to answer, and free to start a new conversation. Everyone is invited and encouraged to participate: companies, employees, customers — anyone with an opinion, an answer, or something to say.
A few libraries are represented. Michael Stephens needs to see this.

Monday, April 28, 2008

Free Comic Book Day

Free Comic Book Day is this weekend, May 3.

Additions to the MARC Code Lists for Relators, Sources, Description Conventions

The codes listed below have been recently approved for use in MARC 21 records. The codes will be added to the online MARC Code Lists for Relators, Sources, Description Conventions.

The codes should not be used in exchange records until after June 25, 2008. This 60-day waiting period is required to provide MARC 21 implementers time to include newly defined codes in any validation tables they may apply to the MARC fields where the codes are used.

Category Code Sources
The following codes are for use in subfield $2 in field 072 in Authority and Bibliographic records (Subject Category Code) and in subfield $z in field 073 (Subdivision Usage) in Authority records.

Additions:

bisacsh
BISAC Subject Headings
(http://www.bisg.org/standards/bisac_subject/index.html) [use only after June 25, 2008]
bisacmt
BISAC Merchandising Themes
(http://www.bisg.org/standards/merchandising.html) [use only after June 25, 2008]
bisacrt
BISAC Regional Themes
(http://www.bisg.org/standards/region_codes.html) [use only after June 25, 2008]
Classification Sources
The following code is for use in subfield $2 in field 084 in Bibliographic and Community Information records (Other Classification Number), in subfield $2 in field 084 in Classification records (Classification Scheme and Edition) and in subfield $2 in field 065 in Authority records (Other Classification Number).

Addition:
blissc
British Library Inside service subject classification. (London: British Library) [use only after June 25, 2008]
Term, Name, Title Sources
The following codes are for use in subfield $2 in fields 600-657 and 662 in Bibliographic and Community Information records, and in subfield $f in field 040 (Cataloging Source) in Authority records.

Additions:
bisacsh
BISAC Subject Headings
(http://www.bisg.org/standards/bisac_subject/index.html) [use only after June 25, 2008]
bisacmt
BISAC Merchandising Themes
(http://www.bisg.org/standards/merchandising.html) [use only after June 25, 2008]
bisacrt
BISAC Regional Themes
(http://www.bisg.org/standards/region_codes.html) [use only after June 25, 2008]
quiding
Quiding, Nils Herman. Svenskt allmant forfattningsregister for tiden fran ar 1522 till och med ar 1862. (Stockholm: Norstedt) [use only after June 25, 2008]
skon
Att indexera skönlitteratur: Ämnesordslista, vuxenlitteratur.
(Stockholm: Svensk biblioteksförening) [use only after June 25, 2008]
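
The waiting period is there so validation tables can catch up. As a minimal sketch of what such a table might look like, here is a lookup that accepts a new source code only after its embargo date. The structure and names are hypothetical, not taken from any particular ILS; only the codes and the date come from the notice above.

```python
from datetime import date

# First date on which the new codes may appear in exchange records:
# the notice says "use only after June 25, 2008", so the test is strictly after.
EMBARGO = date(2008, 6, 25)

# Hypothetical validation table: source code -> embargo date.
NEW_SOURCE_CODES = {
    "bisacsh": EMBARGO,
    "bisacmt": EMBARGO,
    "bisacrt": EMBARGO,
    "blissc": EMBARGO,
    "quiding": EMBARGO,
    "skon": EMBARGO,
}


def code_allowed(code, record_date, established=frozenset()):
    """Return True if a subfield $2 source code may be used on record_date.

    `established` holds codes that were already valid before this notice.
    """
    if code in established:
        return True
    embargo = NEW_SOURCE_CODES.get(code)
    return embargo is not None and record_date > embargo


print(code_allowed("bisacsh", date(2008, 6, 25)))  # still inside the waiting period
print(code_allowed("bisacsh", date(2008, 6, 26)))  # first day the code is usable
```

Keeping the date in the table, rather than adding the codes on the day itself, lets the table be updated as soon as the notice arrives.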

Friday, April 25, 2008

More Comments on TLA

The drive from Houston to Dallas was beautiful. The bluebonnets had passed, except for a few scattered patches. However, the brown-eyed Susans, winecups, Indian paintbrushes, and a white flower (cow's parsley?) were spectacular.

At the RDA preconference I had the pleasure of hearing Carol Seiler, from AMIGOS, speak. Great presenter.

Watch New Records Enter WorldCat

Watch new records enter WorldCat.

Wednesday, April 23, 2008

DCMI Abstract Model

At the RDA preconference I noticed that RDA seems to have been based, at least in part, on the DCMI Abstract Model. I knew RDA had some basis in FRBR, but this was something new to me. Getting to know the DCMI Abstract Model before RDA hits has been added to my to-do list.
This document specifies an abstract model for Dublin Core metadata. The primary purpose of this document is to specify the components and constructs used in Dublin Core metadata. It defines the nature of the components used and describes how those components are combined to create information structures. It provides an information model which is independent of any particular encoding syntax. Such an information model allows us to gain a better understanding of the kinds of descriptions that we are encoding and facilitates the development of better mappings and cross-syntax translations.

What is a Work?

Good news from Martha Yee.
...all of my "What is a Work?" articles published in Cataloging & Classification Quarterly in 1994-1995 are now available at the UC eScholarship repository, as follows:

"What is a Work? Part 1, The User and the Objects of the Catalog." Cataloging & Classification Quarterly 1994; 19:1:9-28.
http://repositories.cdlib.org/postprints/2709

"What is a Work? Part 2, The Anglo-American Cataloging Codes." Cataloging & Classification Quarterly 1994; 19:2:5-22.
http://repositories.cdlib.org/postprints/2710

"What is a Work? Part 3, The Anglo-American Cataloging Codes, Continued." Cataloging & Classification Quarterly 1995; 20:1:25-45.
http://repositories.cdlib.org/postprints/2755

"What is a Work? Part 4, Cataloging Theorists and a Definition." Cataloging & Classification Quarterly 1995; 20:2:3-23.
http://repositories.cdlib.org/postprints/2711

Another relevant article that I wrote about FRBR-izing OCLC is available as well:

"Musical Works on OCLC, or, What if OCLC Were Actually to Become a Catalog?" Music Reference Services Quarterly 2002: 8:1:1-26.

http://repositories.cdlib.org/postprints/2713

In addition, my recent article analyzing the differences among cataloging, metadata, descriptive bibliography, and abstracting and indexing services is now available:

"Cataloging Compared to Descriptive Bibliography, Abstracting and Indexing Services, and Metadata." Invited for Ruth Carter festschrift, Cataloging & Classification Quarterly 2007; 44:3/4:307-328.

http://repositories.cdlib.org/postprints/2721

LCSH Suggestion Blog-a-Thon

The Radical Reference folks are having a Library of Congress Subject Heading Suggestion Blog-a-Thon.
Do subject headings still matter? We say they do.

Does the Library of Congress always identify accessible and appropriately named headings and implement them in a timely manner? We say not always. All you have to do is spend one day behind a reference desk to see examples of biased, non-inclusive, and counterintuitive classifications that slow down, misdirect, or even obscure information from library users. As librarians and library workers, providing access to information is important-and classifying it in ways that are inclusive and intuitive strengthens our egalitarian mission.

Between now and Sunday, April 27, Radical Reference invites you to suggest subject headings and/or cross-references which will then be compiled and sent to the Library of Congress. You can either choose one previously suggested by Sandy Berman (pdf or spreadsheet) or propose your own.

This is a chance to positively impact the catalog of the de facto national library of the United States, which also impacts cataloging all over the world!

Tuesday, April 22, 2008

Recommender System for the DSpace

A Recommender System for the DSpace Open Repository Platform by Desmond Elliott, James Rutherford, and John Erickson. HPL-2008-21.
We present Quambo, a recommender system add-on for the DSpace open source repository platform. We explain how Quambo generates content recommendations based upon a user selected set of examples, our approach to presenting content recommendations to the user, and our experiences applying the system to a repository of technical reports. We consider how Quambo could be combined with the peer-federated DSpace add-on to extend the item-space from which recommendations can be generated; a larger item-space could improve the diversity of the set from which to make recommendations. We also consider how Quambo could be extended to add collaboration opportunities to DSpace. Publication Info: Submitted to Open Repositories 2008, Southampton, UK, April 1-4, 2008

Monday, April 21, 2008

TLA Recap

TLA is over for the year. Always an excellent conference. Here are a few observations. The RDA preconference had 135 registered. Some had to be turned away; the most the room would hold was 135. There is definitely an interest in this.

Walt Crawford shows that common sense is not so common but in the right forum always interesting.

No graphic novel/comic vendors. No Marvel, DC, Antarctic Press, Strangers in Paradise. Missed them. Rod Espinosa did a presentation and autograph session. And the author of American Born Chinese did a presentation. Have to check out his stuff, very well-spoken.

The keynote panel was fun. Roy Tennant was a very good moderator.

OPALS looks like an open-source ILS worth investigating.

Post any failures at the Library Success wiki. Examples of things that did not work and even better info on why are important and useful to others.

The KIC copier looks interesting. Too expensive for us right now, $20,000 or so. But a flat scanner that produces a PDF or TIFF and then can email or move the file to a thumb drive looks like the future.

The Nasher Sculpture Center is a beautiful setting. The willows, irises, and water at the end of the row of oaks were stunning.

The District Caucuses were the same time as the alumni dinners. I went for the dinner. Nice view from the 69th floor.

TLA 2009

It looks like the Lunar and Planetary Institute (LPI) Education Dept. will be having a preconference at TLA 2009. Explore! Fun with Science. Never too early to get this penciled in your daytimer.

RDF Tool

RDFify your data with Triplify.
Triplify provides a building block for the "semantification" of Web applications. Triplify is a small plugin for Web applications, which reveals the semantic structures encoded in relational databases by making database content available as RDF, JSON or Linked Data.

Triplify is very light weight: It consists just of few files with less than 500 lines of code. For a typical Web application a configuration for Triplify can be created in less than one hour and if this Web application is deployed multiple times (as most open-source Web applications are) the configuration can be reused without modifications.

Triplify makes Web applications easier mashable and lays the foundation for next generation, semantics based Web searches.
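
To make the idea concrete, here is a rough sketch of the general technique of exposing relational rows as RDF triples. This is not Triplify's code (Triplify is a separate plugin with its own configuration); the URI layout and the function name are assumptions of mine for illustration.

```python
def rows_to_ntriples(table, rows, base_uri, key="id"):
    """Turn relational rows (dicts) into N-Triples lines, one per non-key column.

    Assumed URI layout (hypothetical): <base>/<table>/<key> for subjects and
    <base>/vocab/<table>#<column> for predicates.
    """
    def escape(value):
        # Escape the characters that N-Triples string literals require.
        text = str(value)
        return (text.replace("\\", "\\\\").replace('"', '\\"')
                    .replace("\n", "\\n").replace("\r", "\\r"))

    lines = []
    for row in rows:
        subject = f"<{base_uri}{table}/{row[key]}>"
        for column, value in row.items():
            if column == key or value is None:
                continue  # the key becomes the subject URI; NULLs yield no triple
            predicate = f"<{base_uri}vocab/{table}#{column}>"
            lines.append(f'{subject} {predicate} "{escape(value)}" .')
    return lines


rows = [{"id": 1, "title": "Using Wikipedia", "year": 2008}]
for line in rows_to_ntriples("post", rows, "http://example.org/"):
    print(line)
```

Every value comes out as a plain literal here; a real mapping would also turn foreign keys into links between resources, which is where the Linked Data part comes in.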

23 Things

23 Things is all the rage among the Library 2.0 folks. I had an idea: how about 23 Things for the Semantic Web? COinS, Microformats, RDF, Topic Maps, SKOS, etc. There would be plenty to investigate. Not sure the concepts could be grasped quite as fast though.

Friday, April 11, 2008

VALE OLS Materials

Streaming video, audio podcasts, and PowerPoint presentations from VALE's Next Generation Academic Library System Symposium OLS (Open Library System) are now available on the VALE website.

Genre/Form Headings for Radio Programs

In August 2007, the Cataloging Policy and Support Office (CPSO) announced a project to begin issuing genre/form authority records (MARC 21 tag 155) for motion pictures, television programs, and videos. As the next step in the development of genre/form headings at the Library of Congress, CPSO has begun a project to create genre/form headings for radio programs. These headings are being created by catalogers in the Motion Picture, Broadcasting, and Recorded Sound Division (MBRS) and will join those already being established for moving images. They are based chiefly on the concepts represented in the Radio Form/Genre Terms Guide (RADFG). Existing LCSH headings in the area of radio programming (MARC 21 tag 150) will also be considered for inclusion.

To support the creation and application of these headings, CPSO and MBRS have drafted a Subject Cataloging Manual (SCM) instruction sheet, H 1969.5, which is available in PDF format on CPSO’s website. Interested parties are invited to send comments on this instruction sheet to Janis Young at jayo@loc.gov.

CPSO reminds SACO participants that change requests and proposals for genre/form headings are not being accepted at this time.

TLA Conference

Postings next week will be sporadic, at best, possibly non-existent. I'll be at TLA and though I will have the laptop I may not feel like posting at the end of long, very full days. I'll start the week off at the preconference on RDA. Last count I heard for that was 135 registered, blows my mind. Later on Tuesday I'll be at dinner with some catalogers, good folks all. Then if time permits catch the end of the welcome party. Looking forward to seeing some folks I've not seen in too long and meeting some new people.

Thursday, April 10, 2008

TLA Conference News

Cali Lewis has been moved out of the NetFair location into a regular room. I think the time has stayed the same. Have to check when I get my conference schedule. I'm no longer the room host, but I plan on being there.

So far my conference Twitter experiment is a flop. I've got no one following, nor anyone to follow. I guess TLA is a bit different than CiL. I will keep it up for a bit just to make sure it is not the right tool at this time.

Tuesday, April 08, 2008

OPAC Enhancement

Here is an interesting enhancement to an OPAC, Answer Tips. The American University of Rome Library did this. Now double clicking on any unlinked word brings up a short pop-up explanation. Quick and easy to do. How much value does it add? Interesting.

Monday, April 07, 2008

TechNet 2008

Looks like fun. "TechNet 2008 is the first annual North Texas conference focusing on technology in libraries" June 12, 2008.

TX Library Association Annual Conference

I've started my Twitter for the Texas Library Assoc. Conference. Follow along if you'll be there and want to keep in touch.

Friday, April 04, 2008

New Version of Omeka

News from Omeka.
Omeka 0.9.1 is our first release since the initial public launch on February 20, 2008. It fixes 20+ bugs, and we highly recommend that all users upgrade their existing Omeka installations. The API hasn’t changed since the 0.9.0 release, so existing themes and plugins should continue to work after the upgrade.
BTW
Omeka is a web platform for publishing collections and exhibitions online. Designed for cultural institutions, enthusiasts, and educators, Omeka is easy to install and modify and facilitates community-building around collections and exhibits. Omeka is free and open source.

PREMIS Data Dictionary for Preservation Metadata

The PREMIS Editorial Committee is pleased to announce the release of the PREMIS Data Dictionary for Preservation Metadata, version 2.0. This document is a revision of Data Dictionary for Preservation Metadata: Final report of the PREMIS Working Group, issued in May 2005. The PREMIS Data Dictionary and its supporting documentation is a comprehensive, practical resource for implementing preservation metadata in digital archiving systems. Preservation metadata is defined as information that preservation repositories need to know to support digital materials over the long term.

This document is a specification that emphasizes metadata that may be implemented in a wide range of repositories, supported by guidelines for creation, management and use, and oriented toward automated workflows. It is technically neutral in that no assumptions are made about preservation technologies, strategies, syntaxes, or metadata storage and management. Members of the PREMIS Editorial Committee revised the original data dictionary based on comments and experience from implementers and potential implementers since its release. The Editorial Committee kept the preservation community informed about issues being discussed, solicited comments on proposed revisions, and consulted outside experts where appropriate.

The international Editorial Committee is a part of the PREMIS Maintenance Activity sponsored by the Library of Congress. The Maintenance Activity also includes PREMIS tutorials and promotional activities, and an active PREMIS Implementers Group.

Major changes in this revision include:
  • Expanded rights metadata
  • More extensive significant properties and preservation level information
  • Mechanism for extensibility for a number of metadata units
The PREMIS Data Dictionary for Preservation Metadata, version 2.0 is available. An XML schema to support implementation is currently in draft and is available. This is an extensive revision of the earlier PREMIS version 1.1 schemas.

After a one month review, the schema will be finalized. Please send comments about the XML schema by April 24 to Ray Denenberg, rden@loc.gov.

Monday, March 31, 2008

NISO Website

The NISO website has a new look.

Friday, March 28, 2008

Additions to the MARC Code Lists for Relators, Sources, Description Conventions

The codes listed below have been recently approved for use in MARC 21 records. The codes will be added to the online MARC Code Lists for Relators, Sources, Description Conventions.

The codes should not be used in exchange records until after May 28, 2008.
This 60-day waiting period is required to provide MARC 21 implementers time to include newly defined codes in any validation tables they may apply to the MARC fields where the codes are used.

Category Code Sources
The following code is for use in subfield $2 in field 072 (Subject Category Code/Code Source) in Authority and Bibliographic records.

Addition:
ekz
Systematiken der ekz [use only after May 28, 2008]
Classification
The following codes are for use in subfield $2 in field 084 in Bibliographic and Community Information records (Other Classification Number), in subfield $2 in field 084 in Classification records (Classification Scheme and Edition) and in subfield $2 in field 065 in Authority records (Other Classification Number).

Additions:
dopaed
DOPAED der UB Erlangen [use only after May 28, 2008]
methepp
Methode Eppelsheimer [use only after May 28, 2008]
ssgn
Sondersammelgebiets-Nummer [use only after May 28, 2008]
Description Conventions
The following codes are for use in subfield $e in field 040 in Bibliographic and Authority records (Description Conventions).

Additions:
din1505
Titelangaben von Dokumenten (Berlin: Beuth) [use only after May 28, 2008]
vd16
Formalerschliessung nach dem Verzeichnis der Drucke des 16. Jahrhunderts (VD 16) [use only after May 28, 2008]
vd17
Formalerschliessung nach dem Verzeichnis der Drucke des 17. Jahrhunderts (VD 17) [use only after May 28, 2008]
rakddb
Ansetzungsform gemaess der RAK - Anwendung Der Deutschen Bibliothek [use only after May 28, 2008]
Other codes
The following code is for use in subfield $2 in field 210 in Bibliographic records (Abbreviated Title).

Addition:
din1430
Key Title nach DIN 1430 (Berlin: Beuth) [use only after May 28, 2008]
The following code is for use in subfield $2 in field 044 (Country of Publishing/Producing Entity Code) in bibliographic records.

Addition:
swdl
Ländercode der Schlagwortnormdatei (SWD) (Leipzig, Frankfurt am Main, Berlin: Deutsche Nationalbibliothek) [use only after May 28, 2008]
Term, Name, Title Sources
The following code is for use in subfield $2 in fields 600-657 in Bibliographic and Community Information records, and in subfield $f in fields 040 (Cataloging Source) and subfield $2 in fields 700-788 (Heading Linking Entries / Source of heading or term) in Authority records.

Addition:
rswkaf
Alternativform zum Hauptschlagwort [use only after May 28, 2008]
The codes listed below were previously defined for use in subfield $2 in fields 600-651 in Bibliographic and Community Information records, and in subfield $f (Subject heading or thesaurus conventions) in field 040 in MARC 21 Authority records.

Usage has been expanded to subfield $2 in fields 654-657 and 662 in Bibliographic records (Subject Added Entries/Index Terms); subfield $2 in fields 654-657 in Community Information records (Subject Added Entries/Index Terms); and subfield $2 in fields 700-788 (Heading Linking Entries / Source of heading or term) in Authority records.
rswk
Regeln für den Schlagwortkatalog (Leipzig, Frankfurt am Main, Berlin: Deutsche Nationalbibliothek) (3MB PDF file) [use in new fields only after May 28, 2008]
swd
Schlagwortnormdatei (Leipzig, Frankfurt am Main, Berlin: Deutsche Nationalbibliothek) [use in new fields only after May 28, 2008]

TLA Conference

At TLA I'll be a room host for the session by Cali Lewis. Tuesday, 2 PM @ the NetFair. I'll also be at the RDA preconference.

I lost the election as councilor for the Digital Libraries group; I do think the best person won. So I'll be passing on the DL business meeting, but will most likely hit most of their sessions. I'll be starting a Twitter for the conference. I'm looking forward to seeing some folks soon.

This morning I restarted my Facebook account. I got bored with it about a year ago and shut it down. Today I reactivated it. Everything was still there. The information does not get erased when you close it down.

USEMARCON Plus

A new version of USEMARCON Plus, The Universal MARC Record Convertor, is available.
USEMARCON facilitates the conversion of catalogue records from one MARC format to another e.g. from UKMARC to UNIMARC. The software was designed as a toolbox-style application, allowing users with detailed knowledge of the source and target MARC formats to develop rules governing the behaviour of the conversion. Rules files may be supplemented by additional tables for more accurate conversion of MARC-specific character sets or coded information. The tables and rules files are simple ASCII text files and can be created using any standard text editor such as MS Windows Notepad.

Wednesday, March 26, 2008

Microformats

Microformats University: 100+ Articles and Resources by Jessica Hupp.
Microformats are small formatting pieces designed to make your data easier to read by both users and software. Although their use is not widespread, it’s important that every web developer becomes familiar with them, as they’re sure to be an integral part of the web’s future. Because of this, there are a number of articles and resources out there devoted to microformats. We’ve compiled more than 100 of the best here.

Tuesday, March 25, 2008

Additions to the MARC Country and Geographic Area Code Lists

As the result of Kosovo declaring independence from Serbia in February 2008, new country and geographic area codes have been defined for use in MARC records.
  1. MARC country code change

    The new country code is:
    kv
    Kosovo
    Kosovo was previously coded rb for Serbia from February 2007-May 2008. From 1992-April 2007 it was coded yu for Serbia and Montenegro. Prior to October 1992, yu was used for Yugoslavia, which included the Socialist republics of Bosnia and Herzegovina, Croatia, Macedonia, Montenegro, Serbia, and Slovenia.
  2. MARC geographic area code change

    The new geographic area code is:
    e-kv
    Kosovo
    Kosovo was previously coded e-rb for Serbia from February 2007-May 2008.
Yugoslavia [e-yu] will be retained for works on Yugoslavia as a whole (including the Kingdom of Yugoslavia, the Federal Republic of Yugoslavia, and the Socialist Federal Republic of Yugoslavia) and former Yugoslav republics before they separated.

Code4Lib Journal

The 2nd issue of Code4Lib Journal is now available. Plenty of good articles.

LibraryThing API

Tim Spalding has released an API for LibraryThing.
I just released a Javascript/JSON API to LibraryThing core work data.

http://www.librarything.com/thingology/2008/03/first-cut-works-json-api.php

It's basically a riff on what Google did recently—a way to link to LibraryThing if we have a book, and not if we don't. It also includes copy and review counts, and the average rating. It takes ISBNs, LCCNs and OCLC numbers.

Next up will be a JSON API into member books, so members can design their own widgets and mash their library up with the contents of a page.

It's all very beta, and my ears are wide open.

Monday, March 24, 2008

Organizing Without Organizations

Here Comes Everybody: The Power of Organizing Without Organizations is a talk by Clay Shirky discussing his new book, Here Comes Everybody: The Power of Organizing Without Organizations. (WorldCat, Amazon) It is available in both video and audio. Seen on Thing-ology.

OAI Toolkit

Now available on Sourceforge: OAI4J, a client library for PMH and ORE.
OAI4J is an open-source client library for OAI-PMH and OAI-ORE created by the National Library of Sweden. The library is object-oriented in its design and written in Java. It can be used to harvest metadata from OAI-PMH compliant repositories. It can also be used to create new OAI-ORE Resource Maps from scratch, to parse existing ones and to serialize them to xml.
This is the 1st tool I've noticed that works on the new OAI-ORE specs.
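
For a sense of what harvesting involves at the protocol level, here is a minimal sketch of one OAI-PMH ListRecords step in Python. It does not use OAI4J (which is Java) and the function names are mine; it only builds the request URL and pulls record identifiers and the resumption token out of a response, leaving the HTTP fetch to the caller.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL.

    In OAI-PMH a resumptionToken is an exclusive argument, so when one is
    given it replaces metadataPrefix instead of being sent alongside it.
    """
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Return (header identifiers, next resumption token or None)."""
    root = ET.fromstring(xml_text)
    identifiers = [el.text for el in root.iter(OAI_NS + "identifier")]
    token_el = root.find(f".//{OAI_NS}resumptionToken")
    token = token_el.text if token_el is not None and token_el.text else None
    return identifiers, token
```

A harvester would loop: fetch the URL, parse, and request again with the returned token until the token comes back empty or missing.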

Tagging Structures and the Organization of Information

Analyzing Communal Tag Relationships for Enhanced Navigation and User Modeling by Edwin Simpson and Mark H. Butler (HPL-2008-24)
The increasing amount of available information has created a demand for better, more automated methods of finding and organizing different types of information resource. This chapter investigates methods of improving navigation, personalization and recommendation of information resources using collaboratively generated tags to model resources and users. We discuss the advantages and limitations of tags, and describe using relationships between tags to discover latent structures that could be used to automatically organize a community's tags. We give a hierarchical clustering algorithm for extracting latent structure and explain methods for determining tag specificity. Next we explain how latent structure visualizations could enhance navigation. Finally we discuss future trends including using latent tag structures to model users and their current tasks for recommendation and user interface personalization. Publication Info: Submitted to (Book) Collaborative & Social Information Retrieval and Access: Techniques for Improved User Modeling, Edited by Max Chevalier, Christine Julien and Chantal Soule-Dupuy, published by IGI Global.

RDF Browsing Tool

A new HP report of possible interest, Humboldt: Exploring Linked Data by Georgi Kobilarov and Ian Dickinson (HPL-2008-23)
Abstract: We present Humboldt, a novel user interface for browsing RDF data. Current user interfaces for browsing RDF data are reviewed. We argue that browsing tasks require both a facet browser's ability to select and process groups of resources at a time and a 'resource at a time' browser's ability to navigate anywhere in a dataset. We describe Humboldt which combines these two features in a single coherent interface. Our approach is based on the operation of pivoting, which enables the user to move the focus of a browsing from one set of resources to a set of related resources. With repeated use of the pivot operation the user can browse anywhere in the data. We describe a preliminary evaluation of our approach and discuss its implications for further development. Publication Info: To be presented and published in Linked Data on the Web (LDW'08), Beijing, China, April 2008
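
The pivot operation is easy to picture over a plain list of triples. The following is only my own toy sketch of the idea as the abstract describes it, not the Humboldt implementation; the function names and sample data are made up.

```python
def pivot(triples, focus, prop):
    """Move the focus from a set of resources to the resources they point at via prop."""
    return {o for (s, p, o) in triples if s in focus and p == prop}


def pivot_back(triples, focus, prop):
    """Pivot in the inverse direction: find resources that point at the focus via prop."""
    return {s for (s, p, o) in triples if o in focus and p == prop}


# Made-up sample data: (subject, property, object)
triples = [
    ("report1", "creator", "author-a"),
    ("report2", "creator", "author-b"),
    ("report3", "creator", "author-a"),
    ("author-a", "affiliation", "lab-x"),
]

authors = pivot(triples, {"report1", "report3"}, "creator")
labs = pivot(triples, authors, "affiliation")
print(authors, labs)
```

Chaining pivots like this is what lets a user start from one group of resources and end up anywhere in the data.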

Friday, March 21, 2008

New Book

Added to my must read list, Responsible Librarianship: Library Policies for Unreliable Systems by David Bade and Thomas Mann (WorldCat, Amazon)

Wednesday, March 19, 2008

Working Group on Aggregates Report

The report of the Durban meeting of the Working Group on Aggregates has been posted on IFLANET.

Tuesday, March 18, 2008

VuFind, New Release

The latest release of VuFind, the open source library resource discovery platform, is now available. Version 0.8 Beta can be downloaded from VuFind or from SourceForge.

The major enhancement in version 0.8 is the new MARC import tool developed by Wayne Graham. This should help resolve issues with importing records and should also improve speed.

If you are interested in trying VuFind, have a look at the live demo.

CC:DA Discussion List

Distributed to several email lists.
CC:DA announces a public read-only email discussion list to allow non-CC:DA members to "view" committee discussions and work that takes place outside of the ALA Annual Conference and Midwinter Meeting.

In order for the committee to do its work efficiently, non-CC:DA members will be able to read CC:DA email, but will not be able to post messages. However, ideas and comments on discussions are welcome, and can be funneled to the list through your CC:DA representative or liaison. The committee roster is located at http://www.libraries.psu.edu/tas/jca/ccda/roster.html

To subscribe to the list, send an email to sympa@ala.org with the phrase "subscribe rules" in the subject line and the body of the message blank. Your email address will be captured automatically.

FRBR Object-Oriented Definition and Mapping to FRBRER

Comments are welcomed on FRBR Object-Oriented Definition and Mapping to FRBRER. This is a 140 page document. Message distributed to email lists.
The FRBR Review Group and the Working Group on FRBR and CIDOC CRM Harmonisation welcomes comments on FRBRoo (object-oriented definition and mapping to FRBRer) version 0.9 (January 2008) and also at: http://cidoc.ics.forth.gr/frbr_drafts.html

This document includes a substantive introduction to the purposes and methodology of the work, a graphical overview of the resulting model, complete FRBRoo class and property definitions, a mapping between FRBRer and FRBRoo, all CIDOC CRM class and property definitions referenced, and an appendix on the modelling of identifier creation.

The goal of the FRBRoo project is to express the conceptualisation behind FRBR using the object-oriented methodology as used in the CIDOC CRM. FRBRoo is defined as an extension to the CIDOC CRM, however, the FRBRoo document is self-contained in that all definitions referenced are included. This has provided the opportunity to verify FRBR's internal consistency, extend the scope of both FRBR and CIDOC CRM, enable interoperability and extend mutual understanding between the museum and library documentation communities by working towards a common ontology.

Comments on this work are appreciated on an ongoing basis, however, comments received prior to April 21, 2008 will be considered at the next meeting of the Working Group in May 2008.

Please send all comments to:

Pat Riva
(Chair, FRBR Review Group)
patricia.riva@banq.qc.ca

Monday, March 17, 2008

Google Books API and the OPAC

Google has an API for Google Books that can add information to the OPAC.
The Google Book Search Book Viewability API enables developers to:
  • Link to Books in Google Book Search using ISBNs, LCCNs, and OCLC numbers
  • Know whether Google Book Search has a specific title and what the viewability of that title is
  • Generate links to a thumbnail of the cover of a book
  • Generate links to an informational page about a book
  • Generate links to a preview of a book
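To give a feel for how an OPAC might ask about several identifiers at once, here is a minimal Python sketch that only builds the request URL. The endpoint and the jscmd, bibkeys, and callback parameters are written from memory of Google's documentation and should be checked there before use; the function name and the sample callback name are my own inventions.

```python
from urllib.parse import urlencode

# Assumed endpoint; verify against Google's current documentation.
BASE_URL = "https://books.google.com/books"


def viewability_url(isbns=(), lccns=(), oclcs=(), callback="handleBooks"):
    """Build a Viewability API request URL (hypothetical helper)."""
    # Each identifier is prefixed with its type, e.g. ISBN:0451526538.
    bibkeys = (
        [f"ISBN:{value}" for value in isbns]
        + [f"LCCN:{value}" for value in lccns]
        + [f"OCLC:{value}" for value in oclcs]
    )
    if not bibkeys:
        raise ValueError("at least one identifier is required")
    query = urlencode(
        {"jscmd": "viewapi", "bibkeys": ",".join(bibkeys), "callback": callback},
        safe=":,",  # keep the type prefixes and separators readable
    )
    return f"{BASE_URL}?{query}"


print(viewability_url(isbns=["0451526538"], oclcs=["36792831"]))
```

The response is meant to be handed to the named JavaScript callback in the page, so in an OPAC template the URL would normally go into a script tag rather than be fetched server side.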

Dublin Core Geospatial Application Profile

The draft of the Geospatial Application Profile for Dublin Core is now available.
This document describes a DC Application Profile for describing geospatial resources. Resources of many types have geospatial properties. This profile provides guidance on the use of DC elements that are relevant to geospatial aspects of resource description, with the expectation that these shall be used together with the elements from other profiles.

RDA/MARC Working Group Established

Sent to the MARC email list.
Under the auspices of the British Library, the Library and Archives Canada, and the Library of Congress, an RDA/MARC Working Group has been established to collaborate on the development of proposals for changes to the MARC 21 formats to accommodate the encoding of RDA data. With the implementation of RDA anticipated for late 2009, the Working Group will be drafting proposals for review and discussion by the MARC community in June 2008.

Although the MARC 21 formats support the encoding of descriptions created according to a wide range of content standards, the close relationship between AACR and MARC 21 has contributed to the efficient exchange of information among libraries for decades. The RDA/MARC Working Group will identify what changes are required to MARC 21 to support compatibility with RDA and ensure effective data exchange into the future.

Members of the RDA/MARC Working Group are:

  • Everett Allgood (New York University and CC:DA Liaison to MARBI)
  • Corine Deliot (British Library)
  • Rebecca Guenther (Library of Congress)
  • Bill Leonard (Library and Archives Canada)
  • Sally McCallum (Library of Congress)
  • Marg Stewart (JSC Liaison to the RDA/MARC Working Group)
  • Martha Yee (UCLA Film and Television Archive)

Tuesday, March 11, 2008

Photographic Metadata

I was recently asked about adding metadata to photographs. After describing our tools, I did a quick search and found a MS image metadata tool, Microsoft Photo Info. Who could have guessed so many folks would want to catalog images? I have yet to give it a try, but it looks like a decent basic tool.

Grants?

An organization for music teachers is looking for a grant to help with their lending library. They need funds for:
  • converting videos to DVD format
  • acquiring videos or DVDs of the "big name" presenters who are no longer with us
  • setting up a system for delivering "videos" over the Internet to borrowers
Any suggestions? Thanks.

FRBR Text

The full text of the Functional Requirements for Bibliographic Records (FRBR) incorporating the amended definition of the expression entity as well as the errata identified to date has been made available on IFLANET in both PDF and HTML formats.

For the first time, the HTML versions of both the current text and the original 1998 text include the tables, rather than just references to the PDF version.

Notice distributed via e-mail.

Cataloging Research

“Just where’s the damn book?,” or, Rediscovering the art of cataloging by Chad P. Abel-Kops is now available on E-LIS.
Current discussions on the future of cataloging describe a "crisis" that has been going on longer than most realize. However, new challenges posed by the Internet have given increased attention to a more complete transformation of bibliographic control. Contributions by Calhoun and others have shown that much can be gleaned from research in fields beyond library and information science, namely in documenting how people actually react to information and the process they employ in its discovery. While many technical solutions have been offered in these discussions, the author considers the more elusive social and moral dimensions which help explain why what has been described as a "crisis" continues.

Friday Humor on Tuesday

Very funny. “Steroid” Scandal Rocks Major League Libraries by Daniel Cohen.

Map Cataloging Resource

News of this MAGERT publication was distributed in email to several lists.
Did you miss ALA's preconference on cataloging early maps and atlases, Rare, Antiquarian or Just Plain Old: Cataloging Pre-Twentieth Century Cartographic Resources, which was held last June at the Library of Congress prior to the American Library Association Annual Conference in Washington, D.C.? It was organized by the Map and Geography Round Table, and co-sponsored by ALCTS, GODORT, and RBMS. The workbook used in the preconference and issued to participants has been reprinted and is available for purchase from MAGERT for $40. Our current supply will soon be sold out, but we are doing another printing in response to your firm orders. So to be sure to obtain a copy, send your requests without delay to the address below. The workbook includes illustrations and cataloging examples taken from sheet maps, atlas plates and atlases, focusing on early and pre-twentieth century cartographic materials. Some of the areas covered by the workbook include elements of description, transcription, mathematical data and supportive research. The $40 price includes shipping and handling.

Orders for the workbook, Rare, Antiquarian, or Just Plain Old, should be sent to:
Jim Coombs
MAGERT Publications Distribution Manager
Maps Library
Missouri State University
901 S. National, #175
Springfield, MO 65897 USA
Email: jimcoombs@missouristate.edu

Friday, March 07, 2008

LibraryThing Local

LibraryThing Local maps libraries, bookstores, and book events.
LibraryThing Local is a gateway to thousands of local bookstores, libraries and book festivals—and to all the author readings, signings, discussions and other events they host. It is our attempt to accomplish what hasn't happened yet—the effective linking of the online and offline book worlds. Books still don't fully "work" online; this is a step toward mending them.

LibraryThing Local is a handy reference, but it's also interactive. You can show off your favorite bookstores and libraries (e.g., mine include the Harvard Bookstore, Shakespeare and Company and the Boston Athenaeum) and keep track of interesting events. Then you can find out who else loves the places you do, and who else is going to events. You can also find local members, write comments about the places you love and more.

Is your library included? If not, it is easy to add it. I added the Lunar and Planetary Institute Library.

Tuesday, March 04, 2008

Oreo Ad

Here in the States there is a new Oreo ad with pitched percussion music that sounds like something from Orff's Schulwerk. Is that so? Which volume?

ORE Specification and User Guide

The latest version of the ORE Specification and User Guide has been released.
Open Archives Initiative Object Reuse and Exchange (OAI-ORE) defines standards for the description and exchange of aggregations of Web resources. This document provides an introduction and lists the specifications and user guide documents that make up the OAI-ORE standards.

Monday, March 03, 2008

FRBR in Chinese

A translation of the Functional Requirements for Bibliographic Records (FRBR) into Chinese has just been made available on IFLANET.

Peeps @ the Library

Peeps are in the stores, so it is time once again to point to Peep Research. Brings a smile every year.

New Look

Back on March 5, 2002 I started Catalogablog. Over the years I made many changes to the look and contents of the weblog, but always within the same template. Now I feel the need to change. Let me know what you think about the new look.

FRBR & RDA

The FRBR e-mail list brought the news that Barbara Tillett has updated the RDA section of her chapter in the book Understanding FRBR (WorldCat Amazon) edited by Arlene Taylor.

"Due to publishing schedules, the section published in the book reflected the way RDA was shaping up prior to the October changes that now more clearly show the relationship of RDA to FRBR."

Friday, February 29, 2008

Radical Cataloging

Soon to be available: Radical cataloging : essays at the front by K R Roberto with an introduction by Sandy Berman. (Amazon WorldCat)

Wednesday, February 27, 2008

Interactive Course Assignment Pages

I saw the Interactive Course Assignment Pages (ICAP) mentioned on the Library Web Chic's weblog. Not cataloging related, but it still looks useful. For colleges and universities, of course, but how about for the homework help area at the public library? School libraries? So many places this could be useful.
Librarians have enough to do and maintaining static HTML pages is tedious and time-consuming. The ICAP tool enables librarians with minimal technical expertise to create dynamic web pages that integrate Web 2.0 features, such as chat and RSS feeds, with traditional library content, such as catalogs and article databases.

The ICAPs use a module layout to display content written and produced by librarians, as well as library resources and interactive widgets.

PICS -> ICRA

Somewhere along the way the Platform for Internet Content Selection (PICS) evolved into the Internet Content Rating Association (ICRA).
As a web author, we invite you to use our system to describe, that is, label, your online content in a way that can be processed by computers. The system is designed to be as objective as possible: ICRA makes no value judgements at all about any content.

Users, principally parents of young children, then apply their own judgement in deciding which sites should and should not be available in their homes or workplaces. This is done by means of software that can read and interpret the labels found.

Tuesday, February 26, 2008

Simple Knowledge Organization System Document

The W3C Semantic Web Deployment Working Group has announced the publication of the SKOS Primer as a W3C First Public Working Draft.
This is a substantial update to and replacement for the previous SKOS Core Guide W3C Working Draft dated 2 November 2005. It is a companion document to the SKOS Simple Knowledge Organization System Reference W3C Working Draft dated 25 January 2008.

The Weblog

I was thinking of moving this weblog over to Wordpress. It is just time for a change. However, that would break too many links. I think I'll just do a complete redesign of this site for March 5. That was the date in 2002 this got started. Just seems like a good time for a new look. Comments?

FRBR and Moving Image Materials

Greenwood Publishing Group kindly gave Martha Yee permission to post her chapter (Chapter 11, FRBR and Moving Image Materials: Content (Work and Expression) versus Carrier (Manifestation)) from Arlene Taylor's book, Understanding FRBR, at the UC eScholarship repository.
Some of the major problems with Anglo-American Cataloguing Rules, Second Edition (AACR2R) stem from the failure to clearly analyze the FRBR entities work and expression (content) so as to distinguish them from manifestation (carrier) for nonbook materials such as moving image materials. In this chapter, a clearer and more logical analysis of these concepts is attempted, and, at the end of the chapter, the progress made so far in RDA (Resource Description and Access) development is assessed as well.

Tagging and Culture

Collaborative and Social Tagging Networks by Emma Tonkin, Edward M. Corrado, Heather Lea Moulaison, Margaret E. I. Kipp, Andrea Resmini, Heather D. Pfeiffer and Qiping Zhang appears in Ariadne issue no. 54. Covers "a series of international perspectives on the practice of social tagging of documents within a community context".

Friday, February 22, 2008

Yee's Cataloging Rules

Martha M. Yee has updated her suggested cataloging rules and RDF model.
This is still a work in progress, so I would love to hear more suggestions for improvement from anyone who can afford the time to look it all over. James Weinheimer is helping me work on a wiki site for the cataloging rules, so keep your eye on this space (smile)...

Thursday, February 21, 2008

CONSER/BIBCO ALA At-Large Meeting Summary

The CONSER/BIBCO ALA At-Large Meeting Summary is now available. Topics discussed include:
  • CONSER standard record
  • Title presentation on e-resource web sites
  • PCC Series discussion paper
  • Integrating resource cataloging manual issues

Omeka Now Public

Omeka 0.9.0 is now available to the public.
Omeka is a web platform for publishing collections and exhibitions online. Designed for cultural institutions, enthusiasts, and educators, Omeka is easy to install and modify and facilitates community-building around collections and exhibits. Omeka is free and open source.
Here is the news release.
The Omeka team has worked very hard over the past few months to bring you the public beta, Omeka version 0.9.0, which is now available for everyone to download.

Here’s what you get bundled in your installation:
  • Basic themes that are easy to adapt with simple CSS changes
  • Exhibit building with 12 basic page layouts
  • Tagging for items and exhibits
  • RSS feed for new items
  • COinS plug-in making all Omeka content readable by Zotero (zotero.org)
Find additional functionality by downloading plug-ins:
  • Bilingual plug-in for adding language fields to item metadata
  • Contribution plug-in for collecting items from visitors
  • Dropbox plug-in for batch adding items
  • Geo-location plug-in for displaying items on a map
  • Sitenotes plug-in for administrators to leave instructions for users
  • Tag Suggest plug-in for suggesting tags based upon their frequency in the item text areas
Lots of metadata there, COinS, tags, and RSS.

Wednesday, February 20, 2008

Work Begins on the RDA Vocabularies

The DCMI/RDA Task Group was formed in April of 2007, when members of the Joint Steering Committee for the Development of RDA, Dublin Core and the W3C Semantic Web Deployment Working Group met in London. At that meeting, two tasks relating to RDA vocabularies were identified:
  1. definition of an RDA Element Vocabulary
  2. disclosure on the public web of RDA Value Vocabularies using RDF/RDFS/SKOS technologies
The RDA Vocabularies Project proposes to surface these underlying bibliographic elements in the form of Semantic Web vocabularies, thereby making them reusable in Semantic Web applications and citable with Uniform Resource Identifiers (URIs). This will be based on RDF (Resource Description Framework), a generic grammar for expressing data for use not just by humans, but also in automated processes of data integration and "intelligent" reasoning.

The work will be led by the DCMI/RDA Task Group chairs: Gordon Dunsire of the University of Strathclyde and Diane Hillmann of Cornell University (with support from Tom Baker of the Dublin Core Metadata Initiative). Other participants working closely with the project are:
  • Karen Coyle (independent consultant well known in the library world)
  • Alistair Miles (editor for the Simple Knowledge Organization System (SKOS) and member of the W3C SWDWG)
  • Mikael Nilsson (researcher in the Knowledge Management Research Group, Royal Institute of Technology, Sweden and co-chair of the DCMI Architecture Forum)
Partial funding for the effort has been secured, and sources of additional funding are still being sought. Potential funders should contact Diane Hillmann at dih1@cornell.edu for further information.

Public information on the progress of the project is available on the DCMI/RDA Task Group wiki. Continuing discussion on the work of the Task Group will take place on the public mailing list maintained by the task group and available for open subscription. Feedback, comment and experimentation with the products that the group will be presenting is both welcome and essential to the success of the work.

Tuesday, February 19, 2008

MARC and RDF

Semantic MARC, MARC21 and the Semantic Web by Rob Styles, Danny Ayers, and Nadeem Shabir is available as a preprint.
The MARC standard for exchanging bibliographic data has been in use for several decades and is used by major libraries worldwide. This paper discusses the possibilities of representing the most prevalent form of MARC, MARC21, as RDF for the Semantic Web, and aims to understand the tradeoffs, if any, resulting from transforming the data. Critically our approach goes beyond a simple transliteration of the MARC21 record syntax to develop rich semantic descriptions of the varied things which may be described using bibliographic records. We present an algorithmic approach for consistently generating URIs from textual data, discuss the algorithmic matching of author names and suggest how RDF generated from MARC records may be linked to other data sources on the Web.
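The idea of consistently generating URIs from textual data is easier to picture with a toy example. The Python sketch below is my own illustration and is not the algorithm from the paper: it simply folds case, strips diacritics and punctuation, and appends the result to a base URI. The base URI, the "person" path segment, and both function names are hypothetical.

```python
import re
import unicodedata


def slug(label):
    """Reduce a heading to a lowercase ASCII token (illustrative only)."""
    # Split accented letters into base letter plus combining mark,
    # then drop anything that is not plain ASCII.
    decomposed = unicodedata.normalize("NFKD", label)
    ascii_text = decomposed.encode("ascii", "ignore").decode("ascii")
    # Collapse every run of punctuation or spacing into a single hyphen.
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")


def mint_uri(base, kind, label):
    """Join a base URI, an entity kind, and the slug of a heading."""
    return f"{base.rstrip('/')}/{kind}/{slug(label)}"


# Two differently punctuated forms of one heading yield the same URI.
print(mint_uri("http://example.org/id", "person",
               "Tolkien, J. R. R. (John Ronald Reuel), 1892-1973"))
print(mint_uri("http://example.org/id", "person",
               "  TOLKIEN, J.R.R.  (John Ronald Reuel) 1892-1973."))
```

A scheme this blunt will also merge headings that ought to stay apart, which is presumably why the paper treats the matching of author names as a separate problem.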

Friday, February 15, 2008

Consolidated Edition of the International Standard Bibliographic Description

The consolidated edition of the International Standard Bibliographic Description (ISBD) is now available online.

Due to arrangements with the publisher, K.G. Saur, the file cannot be printed or copied from.

Princeton's Slavic Cataloging Manual

Princeton's Slavic Cataloging Manual is a resource I'd not heard of before. Bookmarked. Thanks to all who helped create this resource.

Thursday, February 14, 2008

LCCN Permalink

The Library of Congress is pleased to announce "LCCN Permalink" -- a new persistent URL service for creating links to bibliographic records in the Library of Congress Online Catalog using the Library of Congress Control Number (LCCN).

LCCN Permalink is a convenient way to cite items from the Library's collection in your bibliographies, reference guides, emails, blogs, databases, web pages, etc. Not only can you easily construct a permalink yourself, but we also display them as part of the bibliographic record in the LC Online Catalog (http://catalog.loc.gov/).

How to create an LCCN Permalink

Simply begin your URL with the LCCN Permalink domain name -- http://lccn.loc.gov/ -- then add an LCCN.*
Examples: http://lccn.loc.gov/2003556443 or http://lccn.loc.gov/82643250 or http://lccn.loc.gov/mm78044693

* LCCNs should be formatted according to the info:lccn URI specification. Instructions are also available in the LCCN Permalink FAQ.
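For anyone generating these links in bulk, here is a minimal Python sketch of the normalization as I understand it from the info:lccn URI specification: remove blanks, drop a forward slash and everything after it, and replace a hyphen by left-padding the serial number to six digits. Treat it as a sketch to check against the specification and the FAQ; the function names are my own.

```python
PERMALINK_BASE = "http://lccn.loc.gov/"


def normalize_lccn(raw):
    """Normalize an LCCN following the info:lccn rules (as I read them)."""
    # 1. Remove all blanks.
    lccn = "".join(raw.split())
    # 2. Remove a forward slash and everything to its right.
    lccn = lccn.split("/", 1)[0]
    # 3. Remove a hyphen, left-padding what follows to six digits.
    if "-" in lccn:
        prefix, serial = lccn.split("-", 1)
        if not serial.isdigit() or len(serial) > 6:
            raise ValueError(f"unexpected serial number in {raw!r}")
        lccn = prefix + serial.zfill(6)
    return lccn


def lccn_permalink(raw):
    """Build an LCCN Permalink from an LCCN as printed or transcribed."""
    return PERMALINK_BASE + normalize_lccn(raw)


print(lccn_permalink("2003556443"))
print(lccn_permalink("n78-890351"))
print(lccn_permalink("85-2"))
```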

How LCCN Permalink works

An LCCN Permalink retrieves a MARCXML-formatted bibliographic record using the Z39.50/SRU protocol. Both valid and cancelled LCCNs (MARC 21 fields 010a and 010z) are searched. LCCN Permalink displays are based on the Full Record display in the LC Online Catalog. Not only can you link directly into the LC Online Catalog, but you can also view the record in MARCXML, MODS, and Dublin Core formats.

More Information

The LC Permalink FAQ provides additional information on this new service. Specific questions can also be sent to the Library's Ask-A-Librarian service.

MARC Advisory Committee Papers

The cover sheets for the proposals and discussion papers presented at the 2008 Midwinter meetings of the MARC Advisory Committee have been updated with the results of the discussions. They are available at:
  • Proposal 2008-01: Representation of the Dewey Decimal Classification (DDC) System in MARC 21 formats
  • Proposal 2008-02: Definition of field 542 for information related to copyright status in the MARC 21 bibliographic format
  • Proposal 2008-03: Definition of first indicator value in field 041 (Language code) of the MARC 21 bibliographic format
  • Discussion Paper 2008-DP01: Identifying headings that are appropriate as added entries, but are not used as bibliographic main entries
  • Discussion Paper 2008-DP02: Making field 440 (Series Statement/Added Entry--Title) obsolete in the MARC 21 Bibliographic Format
  • Discussion Paper 2008-DP03: Definition of subfield $3 for recording information associated with series added entry fields (800-830) in the MARC 21 Bibliographic Format
  • Discussion Paper 2008-DP04: Encoding RDA, Resource Description and Access data in MARC 21

Tuesday, February 12, 2008

Text Mark-up

Calais looks like an interesting tool for semantic mark-up of text. Not sure how good it is; experiments to generate keywords or a summary from texts have been just OK at best. Still, it may be useful in some instances and is something to be aware of. What would this mean for TEI encoding, for example?
The Calais initiative seeks to help make all the world's content more accessible, interoperable and valuable via the automated generation of rich semantic metadata, the incorporation of user defined metadata, the transportation of those metadata resources throughout the content ecosystem and the extension of its capabilities by user-contributed components.
Seen on LISNews.

Monday, February 11, 2008

Wordpress Plug-in

The CrossRef Citation Plugin is a "WordPress plugin that allows blog entry authors to search CrossRef's metadata using full or partial citations and then insert the formatted and DOI-linked citation into their blog posting along with COINs metadata." Can these plug-ins work on the hosted version of Wordpress? Have to investigate. If so, makes my decision to move much easier.

Friday, February 08, 2008

Wordpress

I've made a copy of this weblog on the Wordpress site. Thinking of moving over. Comments?

Cali Lewis @ TLA

I notice that Cali Lewis is scheduled as part of Net Fair. In the past this has been a draped-off space in the exhibit area, both small and noisy. I'm not sure this is the best venue for a Web 2.0 star. As someone who has been on MSNBC and CBC TV, as well as having a very big Web presence, she deserves a better space. I think she may also draw a larger crowd than the Net Fair can handle. I'm going by the spaces I've seen in the past. Maybe this year's Net Fair is both quiet and spacious. I hope so.

Having someone like Ms Lewis speak well of the conference and profession is excellent PR. This is a chance to show the Web 2.0 crowd what the library 2.0 crowd is doing. I just hope we don't waste the opportunity.

Thursday, February 07, 2008

TLA Conference

The program for the Texas Library Association Annual Conference is now available. What a line-up. Cali Lewis, Walt Crawford, Roy Tennant, Stephen Abram, Karen Schneider, etc. Most time slots have too many sessions I want to catch. Hope to see some of you there.

I plan to Twitter at the conference, to see if that helps make connections for meals and drinks.

Wednesday, February 06, 2008

Taxonomies

Better Living Through Taxonomies by Heather Hedden appears in the latest Digital Web Magazine.
Large websites and intranets can benefit from improved methods of search and navigation. These include site maps, A-Z indexes, sophisticated search engines, and generally improved navigational design--and playing a potential role in all of these methods is well-planned taxonomy.

Tuesday, February 05, 2008

THATCamp

The Center for History and New Media, George Mason University is having an unconference.
Short for “The Humanities and Technology Camp”, THATCamp is a BarCamp-style, user-generated “unconference” on digital humanities. THATCamp is organized and hosted by the Center for History and New Media at George Mason University, Digital Campus, and THATPodcast.
May 31 - June 1, 2008. Limited space, so apply for a spot early.

THAT Podcast

The Humanities and Technology, THAT podcast sounds interesting. Brought out by the folks at The Center for History and New Media @ George Mason University. The 1st show is about Wordpress.
Our inaugural episode of The Humanities and Technology Podcast explores Wordpress, the popular open source blogging platform. We interview Matt Mullenweg, the founder of WordPress, and demonstrate how to install the ScholarPress Courseware course management plugin used to set up a course website and blog.

Monday, February 04, 2008

IFLA Cataloging News

IFLA Cataloging News.
The annual report of the Cataloguing Section for 2007 has been posted on IFLANET and is available from the section's home page.

Also, draft version 0.9.1 of the object-oriented definition of FRBR (Functional Requirements for Bibliographic Records) is available from the page of the Working Group on FRBR/CRM Dialogue.

Friday, February 01, 2008

Capturing Government Documents

Managing Web Harvested Content: Results from the EPA Harvesting Pilot Project describes the results of a crawl of the EPA site by the GPO. They have questions about their use of PURLs, keeping local copies of the harvested items, and the bibliographic level considered useful. Comments accepted through Feb. 8.
LSCM believes that providing access to the monographs and serials harvested as part of the EPA Pilot Project via the CGP best serves the needs of the depository community and the general public. As can be seen from the sample of 300 publications, making the content from the EPA Pilot Project accessible to the public is a multi-step process and involves the commitment of a significant amount of time. However, as staff become more familiar with the new brief bibliographic record format the time required to create one of these records will decrease. The identification of complete publications, the identification of all the parts or issues of a title scattered within the results of the harvest and the de-duplication of the contents will continue to require a significant amount of time and staff to complete.

Additionally 1,000 monographs within scope of the FDLP have been identified from EPA Pilot Project for inclusion in the Automated Metadata Extraction Project. This is a two year project with the Defense Technical Information Center (DTIC) and Old Dominion University (ODU) to use automated metadata extraction software tools to create metadata for groups of electronic publications in GPO’s electronic collection. This is a two year project and the results are not expected until near the end of the project.

Tuesday, January 29, 2008

Physics and Astronomy Classification Scheme

Some classification news.
The 2008 edition of the AIP's Physics and Astronomy Classification Scheme (PACS)--an essential tool for classification and efficient retrieval of scientific literature--has been released. PACS, used by AIP and other international publishers, is a hierarchical subject classification scheme, comprised of ten broad subject categories subdivided into narrower categories. For PACS 2008, five categories received extensive revisions based on the contributions of experts from the physics community.

We have also prepared a Special Edition of PACS, which contains embedded mapping instructions for transforming the deleted 2006 PACS codes into the new 2008 codes.

Free downloads.

Monday, January 28, 2008

Blacklight, Another OPAC Option

Blacklight, an open source OPAC using Ruby on Rails and Solr, has now been released.
A next generation library catalog written in ruby, using solr as the underlying search engine. All you have to do is export your marc records, index them with the scripts provided, start up ruby on rails, and you're on your way to faceted browsing bliss.

Thursday, January 24, 2008

Metadata Object Description Schema Revision

MODS version 3.3 is now available. Changes from version 3.2 are documented online.

ISBN Service

LibraryThing has a new API, one that corrects ISBNs and returns both a 10 and 13 digit ISBN. Very RESTful. Just send an ISBN to http://www.librarything.com/isbncheck.php?isbn= and it will:
  • Do the math to return the ISBN10 and ISBN13 forms of any old ISBN, if both exist.
  • Remove dashes and other junk.
  • Transparently fix missing initial zeroes. This is a common problem with data from Excel files, which turn 0765344629 into 765344629.
  • Return an error if the ISBN isn't valid and can't be easily fixed.
They ask that you not send more than 10 requests per second.
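The "math" involved is just the two standard ISBN check digit calculations, so the same clean-up can be sketched locally. The Python below is my own illustration of that arithmetic, not LibraryThing's code, and the function names are made up for the example.

```python
import re


def isbn10_check(first9):
    """Check character for an ISBN-10: weights 10 down to 2, modulo 11."""
    total = sum(int(d) * w for d, w in zip(first9, range(10, 1, -1)))
    check = (11 - total % 11) % 11
    return "X" if check == 10 else str(check)


def isbn13_check(first12):
    """Check digit for an ISBN-13: alternating weights 1 and 3, modulo 10."""
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(first12))
    return str((10 - total % 10) % 10)


def normalize_isbn(raw):
    """Return (isbn10, isbn13); isbn10 is None when no 10-digit form exists."""
    cleaned = re.sub(r"[^0-9Xx]", "", raw).upper()  # drop dashes and junk
    if not cleaned:
        raise ValueError("no ISBN digits found")
    if len(cleaned) < 10:
        cleaned = cleaned.zfill(10)  # restore leading zeroes lost in Excel
    if re.fullmatch(r"\d{9}[\dX]", cleaned):
        if isbn10_check(cleaned[:9]) != cleaned[9]:
            raise ValueError(f"bad ISBN-10 check digit: {raw!r}")
        body = "978" + cleaned[:9]
        return cleaned, body + isbn13_check(body)
    if re.fullmatch(r"\d{13}", cleaned):
        if isbn13_check(cleaned[:12]) != cleaned[12]:
            raise ValueError(f"bad ISBN-13 check digit: {raw!r}")
        isbn10 = None
        if cleaned.startswith("978"):  # only 978 ISBNs have a 10-digit form
            isbn10 = cleaned[3:12] + isbn10_check(cleaned[3:12])
        return isbn10, cleaned
    raise ValueError(f"not a recognizable ISBN: {raw!r}")


print(normalize_isbn("765344629"))
```

Run on the Excel-damaged example above, this pads the number back to 0765344629 and derives 9780765344625 as its 13 digit form.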

Wednesday, January 23, 2008

Photo Preservation Metadata

Photoplus: Auxiliary Information for Printed Images Based on Distributed Source Coding by Ramin Samadani and Debargha Mukherjee (HPL-2008-2) discusses some metadata for photographs that may be useful for preservation.
A printed photograph is difficult to reuse because the digital information that generated the print may no longer be available. This paper describes a mechanism for approximating the original digital image by combining a scan of the printed photograph with small amounts of digital auxiliary information kept together with the print. The auxiliary information consists of a small amount of digital data to enable accurate registration and color-reproduction, followed by a larger amount of digital data to recover residual errors and lost frequencies by distributed Wyner-Ziv coding techniques. Approximating the original digital image enables many uses, including making good quality reprints from the original print, even when they are faded many years later. In essence, the print itself becomes the currency for archiving and repurposing digital images, without requiring computer infrastructure. Publication Info: To be published and presented at VCIP 2008 - Visual Communications and Image Processing 2008, San Jose, CA

Cataloging Streaming Media

Good news from OLAC.
The Best Practices for Cataloging Streaming Media document is available on the OLAC website. Created by the CAPC Streaming Media Best Practices Task Force, it presents best practice guidelines and examples for cataloging both streaming video and audio, based on AACR2. It also presents definitions and examples of resources that can be considered as streaming media.

This document is available in both HTML and PDF formats.
Thanks.

Tuesday, January 22, 2008

Freebase Books Schema

Freebase is an interesting project. They accept data sets and then provide a platform to access them. Something like the Talis platform? They have a section for books and a schema for book information. I'm not sure of the mechanics behind it all. I'd guess RDF would make the links across data sets easier. Here is an example of bibliographic data being just one type of data in a much larger system with connections to other data sets. Interesting.
Freebase is an open database of the world’s information. It is built by the community and for the community--free for anyone to query, contribute to, build applications on top of, or integrate into their websites.

Already, Freebase covers millions of topics in hundreds of categories. Drawing from large open data sets like Wikipedia, MusicBrainz, and the SEC, it contains structured information on many popular topics, like movies, music, people and locations--all reconciled and freely available via an open API. This information is supplemented by the efforts of a passionate global community of users, who are working together to add structured information on everything from philosophy to European railway stations to the chemical properties of common food ingredients.

In fact, part of what makes Freebase unique is that it spans domains--but requires that a particular topic exist only once in Freebase, even if it might normally be found in multiple databases. For example, Arnold Schwarzenegger would appear in a movie database as an actor, a political database as a governor and a bodybuilder database as a Mr. Universe. In Freebase, there is only one topic for Arnold Schwarzenegger, with all three facets of his public persona brought together. The unified topic acts as an information hub, making it easy to find and contribute information about him.

For books they have a work-like idea, a bit FRBR-like.
"Book" represents the abstract notion of a particular book, rather than a particular edition. It is on this level that articles or discussion about a book should generally occur (e.g., the article about Mary Shelley's "Frankenstein" is on the book topic, rather than on one or more of the hundreds of editions it has gone through). The book topic should also be used for connections to other types, such as films that have been adapted from a book.

Addition to the MARC Code Lists for Relators, Sources, Description Conventions

The code listed below has been recently approved for use in MARC 21 records. The code will be added to the online MARC Code Lists for Relators, Sources, Description Conventions.

This code should not be used in exchange records until after March 18, 2008. This 60-day waiting period is required to provide MARC 21 implementers time to include newly defined codes in any validation tables they may apply to the MARC fields where the codes are used.

Term, Name, Title Sources

The following code is for use in subfield $2 in fields 600-657 (Subject Added Entries) in Bibliographic and Community Information records, field 662 (Subject Added Entry) in Bibliographic records, fields 700-788 (Heading Linking Entries) in Authority records and in subfield $f in field 040 (Cataloging Source) in Authority records.

Addition:
qlsp
Queens Library Spanish language subject headings (Queens, NY: Queens Library) [use only after March 18, 2008]

Friday, January 18, 2008

Authority Tools

OLAC has announced an update to one of their resources.
The online resource Authority Tools for Audiovisual and Music Catalogers: an Annotated List of Useful Resources has been revised and updated. Along with some editorial updates of URLs and new edition information, reviews were added for the following titles:
  • Opera : an encyclopedia of world premieres and significant performances, singers, composers, librettists, arias and conductors, 1597-2000
  • A dictionary-catalog of modern British composers
  • Encyclopedia of the blues (ISBN: 0415926998)
I've also just received my OLAC Newsletter.

Thursday, January 17, 2008

FRBR Talk

William Denton mentions on his FRBR Blog that he will be speaking at the Ontario Library Association’s 2008 Superconference. I just finished his essay in Arlene Taylor’s Understanding FRBR: What It Is and How It Will Affect Our Retrieval Tools (Amazon WorldCat), and if he speaks even half as well as he writes, it should be an enjoyable session. If you want or need a history of cataloging, his essay would be a good start.

LOC Tagging Experiment on Flickr

The Library of Congress has uploaded two collections of photographs to Flickr and invited people to add tags. The sets are 1930s-40s in Color and News in the 1910s. It will be interesting to see how successful the tagging project is.

These images have the rights statement "no known copyright restrictions", an experimental rights statement for Flickr. Reaction from Flickr users seems to be extremely positive.

Monday, January 14, 2008

NISO to Develop Standard Identification for Institutions

FOAF for institutions from the folks at NISO.
Members of the National Information Standards Organization (NISO) have voted to approve the creation of a working group to explore issues surrounding institutional identification. This working group will be charged with proposing an identifier that will uniquely identify institutions and that will describe relationships between entities within institutions. This new NISO group will also consider what minimum set of data is required for unique identification as well as what other data may be used to support the business models of respective organizations, while also taking into account privacy and security issues.
There are and have been other schemes to identify institutions. The MARC Code List for Organizations is a good source for libraries and their parent institutions. You can even request codes there. Does your institution have a code? Then there is the SAN (Standard Address Number) for organizations in (or served by) the publishing industry. Once there was someone who kept track of institution numbers in barcodes, but I'm sure that has gone the way of the 8-track.