Wednesday, March 11, 2009

Normalize your LCCNs

Jonathan Rochkind points out the need to normalize your LCCNs when using some services. OCLC Identities apparently does not do this. LC does provide an explanation of the process, and it seems simple enough. Has anyone in the Code4Lib crowd written anything? Just how useful would this generally be?

2 comments:

jrochkind said...

This is crucially important if you want to compare two LCCNs gotten from two different records to see if they are the same LCCN.

One example of where you'd want to do this is taking an LCCN from a record in your ILS, and then searching Google Books or HathiTrust to see if they have a record representing the same LCCN. Unless GBS/HathiTrust normalizes before indexing and you normalize before making your query, you could get a false negative when the two parties have the same LCCN in different forms.

Same for any other circumstance where you need to compare or search for an LCCN.

"Useful" isn't the right word. It's CRUCIAL.

I have written code to do the normalizing, if that's what you're asking. It's not particularly hard.
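
To give a sense of how little is involved, here is a minimal Python sketch. It is not the code mentioned above, just an illustration. It assumes the steps in LC's explanation are: remove blanks, drop a forward slash and everything after it, then remove the hyphen and left-pad the digits after it with zeros to six places. The function names normalize_lccn and same_lccn are made up for this example, and returning None for a malformed serial is a choice of this sketch, not part of LC's text.

```python
import re


def normalize_lccn(raw):
    """Return the normalized form of an LCCN string.

    Returns None when the part after a hyphen is not 1 to 6 digits,
    since the padding step is then not defined (a choice of this sketch).
    """
    # Step 1: remove blanks (this sketch strips all whitespace).
    lccn = "".join(raw.split())
    # Step 2: drop a forward slash and everything to its right.
    lccn = lccn.split("/", 1)[0]
    # Step 3: remove the hyphen and zero-fill what follows to six digits.
    if "-" in lccn:
        prefix, serial = lccn.split("-", 1)
        if not re.fullmatch(r"[0-9]{1,6}", serial):
            return None
        lccn = prefix + serial.zfill(6)
    return lccn


def same_lccn(a, b):
    """Compare two LCCNs from different records after normalizing both."""
    norm_a = normalize_lccn(a)
    norm_b = normalize_lccn(b)
    return norm_a is not None and norm_a == norm_b
```

With this, same_lccn("85-2", " 85000002 ") is True, while comparing the raw strings directly would report a mismatch.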

edipretoro said...

There is already code to normalize LCCNs. I don't know if Anirvan Chatterjee is a member of Code4Lib, but he has written a Perl module named Business::LCCN (http://search.cpan.org/~anirvan/Business-LCCN-0.12/). With this tool, you can normalize your LCCNs.