Sunday, August 10, 2008

Panama City Beach Library


Originally uploaded by dbigwood
While on vacation I stopped into the library to check my email. I was greeted by a person at the front desk who was very friendly, not like the Wal-Mart greeters. I was able to use the machine to check my email and Facebook page, and to OK some comments to this weblog. Nothing seemed to be blocked. There was a 30-minute-a-day time limit, which seems a silly rule if the machines are empty, but....

While I was there, other folks were using the computers, browsing the fiction, or reading a magazine. It was a very small library, but they did have a children's collection and provided story time.

They caught me taking this picture and asked about it; they were curious, not snooping.

Koha 3.0

The Koha folks have announced that a packaged release of Koha 3.00.00 is now available. It can be downloaded from the usual location:

http://download.koha.org/koha-3.00.00.tar.gz
http://download.koha.org/koha-3.00.00.tar.gz.sig

The 3.0 manual is available and will continually be updated.

Friday, August 01, 2008

LibraryThing API

News from LibraryThing.
LibraryThing just released a free, CC-attribution-licensed Web Services XML API to our "Common Knowledge" system, including series data, fictional characters, author dates and much else. I'm particularly stoked about the series data. I think it's of exceptional quality, suitable for use in OPACs (e.g., Star+Wars). Anyway, in a catalog or not, there are a lot of cool things to do with it.

OCLC Crosswalk Web Service Demo

New demo tool from OCLC Research, Crosswalk Web Service.
The purpose of the Crosswalk Web Service (CWS) is to translate a group of metadata records from one format into another.

For this service, a metadata format is defined as a triple of:
  • standard - The metadata standard of the record (e.g. MARC, DC, MODS, etc ...)
  • structure - The structure of how the metadata is expressed in the record (e.g. XML, RDF, ISO 2709, etc ...)
  • encoding - The character encoding of the metadata (e.g. MARC8, UTF-8, Windows 1251, etc ...)
To use the service you will have to write your own client software. With the aid of the WSDL file, this should be relatively easy. This documentation, however, does not cover how to write the client.
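
As a rough illustration of the format triple described above, a client might model it along these lines. This is only a minimal Python sketch: the class name `MetadataFormat`, its `label` method, and the example values are assumptions made for illustration and are not taken from the OCLC WSDL or documentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetadataFormat:
    """Hypothetical model of the (standard, structure, encoding) triple."""
    standard: str   # e.g. "MARC", "DC", "MODS"
    structure: str  # e.g. "XML", "RDF", "ISO 2709"
    encoding: str   # e.g. "MARC8", "UTF-8"

    def label(self) -> str:
        # Human-readable form; the "/" separator is an arbitrary choice.
        return f"{self.standard}/{self.structure}/{self.encoding}"


# A translation request is then a pair of triples: source and target.
source = MetadataFormat("MARC", "ISO 2709", "MARC8")
target = MetadataFormat("DC", "XML", "UTF-8")
print(f"{source.label()} -> {target.label()}")
```

How the real service expects these three values to be passed is something only the WSDL can answer.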

Facebook Blog Network

Still need a few more confirmations on the Blog Network on Facebook that I'm responsible for New and Noteworthy. Still need plenty for Recent Additions. Catalogablog is doing fine.

Vacation

I'll be on vacation next week. No WiFi in the beach house. I may or may not bring a laptop. So, most likely, no news for a week after today. Any good dancing, eating, hiking, gardens in the Panama City, Fla. area?

ORE Challenge at RepoCamp

There will be a cash prize of $2000, sponsored by Microsoft Research, for the best prototype that uses and promotes OAI-ORE. This challenge is open to teams from anywhere, whether or not they attend RepoCamp. The competition deadline for prototype entries is August 8th (two weeks on from RepoCamp).

Training PDF Products Available for Free Download

Good news from LC.
On October 1, 2008, CDS will discontinue selling PDF training products. Instead, the following PDF training courses will be made available for free download:
  • The workshop materials from the Serials Cataloging Cooperative Training Program (SCCTP): Basic Serials Cataloging, Advanced Serials Cataloging, Integrating Resources Cataloging, Electronic Serials Cataloging, and Serials Holdings.
  • The workshop materials from Cooperative Cataloging Training (CCT): Basic Subject Cataloging using LCSH, Basic Creation of Name and Title Authorities, Fundamentals of Series Authorities, and Fundamentals of Library of Congress Classification.
  • The workshop materials from Cataloging for the 21st Century (Cat21): Rules and Tools for Cataloging Internet Resources, Metadata Standards and Applications, Metadata and Digital Library Development, Digital Project Planning and Management Basics, Principles of Controlled Vocabulary and Thesaurus Design.
The maintenance of these PDF training products will be handled by the Instructional Development and Training Division of the Acquisitions and Bibliographic Control Directorate at the Library of Congress. Additional information about these workshops is available online.

CDS will continue to sell printed training products such as Cataloging Concepts and MARC Content Designation for the present.

CDS will not be issuing refunds to customers who purchased PDF course materials prior to October 1, 2008.

Wednesday, July 30, 2008

Database of Databases

The Internet Search Environment Number (ISEN) intends to catalog catalogs and other databases.
You know how the ISBN is assigned to books. Over 1 million books are assigned ISBNs each year. What ISEN plans to do is emulate that system for databases. We would assign over 1 million databases ISEN or Internet Search Environment Numbers once the system is in place in its first year. There may be as many as 5 million in the backlog for cataloging by a social network of librarians. Life Science databases would be cataloged by life science librarians, law resources by law librarians, etc...

Then we would create a database of databases or search engine only for databases. Your hit list would only be databases instead of PDF files, blog postings and random HTML files. We pull out the databases. The hits you get would be the interface to databases which provides access to upwards of 500 to 650 times the amount of information available on the "surface web" indexed by the major search engines. ISEN reveals what is called the "deep web".

They have a weblog and mailing list.

Tuesday, July 29, 2008

Vacation

Next week I'll be on vacation in Panama City Beach, Fla. Anyone have suggestions on things to do or see in the area? Thanks.

FRBR Tool for ISIS

Roberto Sturman has announced that the IFPA2 (ISIS FRBR Prototype Application - ver. 2) is now online.

(username/password for dataentry: ifpa2/demo2)

The new implementation of the prototype is based on WebLis.

Its main features are:
  • new database design: relationships are managed in dedicated records, one relationship per record;
  • unlimited no. of relationships for each Entity (within the database capability);
  • creation of Entities/Relationships by hyperlinks; picklist assisted relationship management;
  • WEB based interface for all functions, data entry included;
  • pseudo-tree view of FRBR bibliographic "towers"
He asks us to "Please note as the user interface design is still in fluctuation and the application has still many bugs, inconsistencies, so it is not yet available for download. I hope to make it downloadable shortly."

The requirements are: Firefox, Opera, IE6 or IE7; cookies, JavaScript and pop-ups enabled. That last requirement might prove to be a problem.

IESR

A Registry of collections and their services: from metadata to implementation by Ann Apps appears in the Proceedings of the International Conference on Dublin Core and Metadata Applications (DC2004), pp. 67-73, Shanghai (China).
The JISC Information Environment Service Registry (IESR) is a machine-to-machine middleware shared service providing a single central catalogue of quality descriptions of collections of resources available to researchers, learners and teachers in the UK, along with details of the services that provide access to those collections. The collections and services are described according to a set of metadata, which is defined by IESR, but is based on open standards wherever possible. The prototype registry is implemented as an XML repository indexed with the Cheshire II information retrieval software, with an associated meta-registry to support browsing and data capture. Several interfaces for server-to-server retrieval of IESR XML descriptions are available, as well as a Web interface.
Some other related papers by Ann Apps include:

Monday, July 28, 2008

Additions to the MARC Code Lists for Relators, Sources, Description Conventions

The codes listed below have been recently approved for use in MARC 21 records. The codes will be added to the online MARC Code Lists for Relators, Sources, Description Conventions.

The codes should not be used in exchange records until after September 25, 2008. This 60-day waiting period is required to provide MARC 21 implementers time to include newly defined codes in any validation tables they may apply to the MARC fields where the codes are used.

Description Conventions

The following code is for use in subfield $e in field 040 in Bibliographic and Authority records (Description Conventions).

Addition:

dcrms
Descriptive Cataloging of Rare Materials (Serials) (Washington, DC: Library of Congress) [use only after September 25, 2008]

Term, Name, Title Sources

The following codes are for use in subfield $2 in fields 600-657 and 662 in Bibliographic and Community Information records (Subject Added Entries/Index Terms); subfield $f in field 040 (Cataloging Source) in Authority records; and subfield $2 in fields 700-788 (Heading Linking Entries) in Authority records.

Additions:

chirosh
Chiropractic Subject Headings (http://www.chiroindex.org/abouticl.php) [use only after September 25, 2008]
eet
European education thesaurus (http://redined.r020.com.ar/en/) [use only after September 25, 2008]
pkk
Predmetnik za katoliske knjiznice (Ljubljana: Maribor) [use only after September 25, 2008]
ssg
Splosni slovenski geslovnik (http://www.nuk.uni-lj.si/ssg/ssg.html) [use only after September 25, 2008]
The code listed below was previously defined for use in subfield $2 in Bibliographic and Community Information records in fields 600-651 and field 040, subfield $f in Authority records. Usage has been expanded and this code is now available for use in subfield $2 in fields 600-657 and 662 in Bibliographic and Community Information records (Subject Added Entries/Index Terms); subfield $f in field 040 (Cataloging Source) in Authority records; and subfield $2 in fields 700-788 (Heading Linking Entries) in Authority records.

Change:

gtt
GOO-trefwoorden thesaurus (Den Haag: Koninklijke Bibliotheek) [use in new fields after September 25, 2008]

Other codes

The following code is for use in subfield $2 in field 047 (Form of Musical Composition Code) in the Bibliographic format.

Addition:

iamlmf
International Association of Music Libraries Musical forms codes (http://www.iaml.info/en/activities/cataloguing/unimarc/forms) [use only after September 25, 2008]
The following code is for use in subfield $2 in field 048 (Number of Musical Instruments or Voices Codes) in the Bibliographic format.

Addition:

iamlmp
International Association of Music Libraries Medium of performance codes (http://www.iaml.info/en/activities/cataloguing/unimarc/medium) [use only after September 25, 2008]
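
The validation tables mentioned in the announcement could be as simple as a lookup of each new code plus the date after which it may be used. Here is a minimal Python sketch of that idea. The table layout and the names `NEW_CODES`, `USE_AFTER` and `is_usable` are hypothetical; only the codes, the fields, and the September 25, 2008 date come from the announcement.

```python
from datetime import date

# Waiting-period cutoff stated in the announcement.
USE_AFTER = date(2008, 9, 25)

# Newly added codes and where they apply (hypothetical table layout).
NEW_CODES = {
    "dcrms": "description conventions (field 040 $e)",
    "chirosh": "term, name, title sources",
    "eet": "term, name, title sources",
    "pkk": "term, name, title sources",
    "ssg": "term, name, title sources",
    "iamlmf": "form of musical composition (field 047 $2)",
    "iamlmp": "instruments or voices (field 048 $2)",
}


def is_usable(code: str, on: date) -> bool:
    """True if the code is one of the new ones and the waiting period is over."""
    return code in NEW_CODES and on > USE_AFTER


print(is_usable("dcrms", date(2008, 9, 25)))  # still inside the waiting period
print(is_usable("dcrms", date(2008, 9, 26)))  # first day the code may be used
```

A real validation table would also need the codes that were already defined; this sketch covers only the additions listed above.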

European APIs

The JISC Information Environment Service Registry (IESR):
  • is a machine readable registry of electronic resources;
  • contains information about these electronic resources, and details of how to access them;
  • aims to make it easier for other applications to discover and use materials which will help their users' learning, teaching and research.

Wednesday, July 23, 2008

Twitter

What is happening with Twitter? A week ago almost 400 people were following LPI_Library and I was following about half that many. Now both those numbers are less than half of what they were a week ago. It seems folks are leaving in droves. If it were just my followers going down I might reevaluate how I was posting, but the number of people I follow has also been dropping, so I can only assume either 1) Twitter has lost people's accounts or 2) people are leaving Twitter for other services.

I have created an account on FriendFeed. I'm capturing my Facebook, LPI_Library tweets, and postings here. Maybe this is where all the cool kids are hanging? Maybe Pownce or Jaiku or ?

This does raise a problem. Just what is our attention span with new tech tools? Twitter is not yet old enough to have been mentioned in any books and already it is passé. How can anyone keep up with this? How far ahead of the curve are we going to be? If all our users are a year or two behind us, are we serving them by continuing to move on?

OPAC 2.0

Chalon, Patrice X. and Di Pretoro, Emmanuel and Kohn, Laurence (2008) OPAC 2.0: Opportunities, development and analysis. In Proceedings 11th European Conference of Medical and Health Libraries, Helsinki (Finland).
Web 2.0 has raised new expectations from the library users: after reading a book, they wish to rate it, provide some comments or review about it and tag it for themselves or for others. They also expect to discover other interesting books thanks to the contribution of other people. Those functions, summarized under OPAC 2.0, are now provided by several Integrated Library Systems (ILS), at least partially. But, due to the slow development of some products, other paths were also explored: Content Management Systems (CMS) or specific software. CMS does provide the required functionalities like tagging and commenting. Some pioneers thus decided to develop a new Web OPAC based on CMS. Another approach was to build an OPAC that is independent from any ILS and which offers the required functionalities. In this paper, we propose to review the options available for the librarians wishing to offer Web 2.0 functionalities to their users. We also provide a synthesis of our own experience in implementing an OPAC 2.0 into our Library.

Breaking the Librarian Stereotype

This certainly breaks the stereotype, if she really is a librarian. My Spanish is not good enough to know if it is serious or meant to be ironic. Is there a Metal Librarian blog yet?

Tuesday, July 22, 2008

Small Libraries and OCLC

Are there any other libraries, other than here at the LPI, that would like to be an OCLC member but just don't have the funds?

How about OCLC services or products that you desire, but are out of reach? For instance, I want access to the authority files so that we could become NACO participants.

I'm asking because OCLC has a task force on small libraries and would like to hear from anyone in the same situation as we are. We would love to share our collection on WorldCat and Open WorldCat but find the set-up fees too large a hurdle. Too much of our cataloging is original, so the copy-cataloging-only option is not for us. There are no Groups we are able to join. Anyone want to start a space science group or Houston group? In the end, our very rich, unique collection is not visible via OCLC.

Now seems to be a good time to voice concerns to the Task Force or the folks at OCLC, since they are looking at small libraries.

Bibliographic Citation Tool in Facebook

OCLC has a Facebook app for those needing to create citations, CiteMe.
Get formatted citations in APA, Chicago, Harvard, MLA, or Turabian style. Start by searching for an item in WorldCat, the world's largest network of library content and services. Find your title in the results, select your favorite format, and you're done.
It also allows you to find other editions and find in a local library. I've added it to my Facebook account.

Monday, July 21, 2008

Saturday, July 19, 2008

Library APIs

Roy Tennant has posted a list of library APIs. If you know of any that deserve to be included, let him know.

Tuesday, July 15, 2008

From Awareness to Funding: A study of library support in America

The OCLC report on library funding, From Awareness to Funding: A study of library support in America, has been released. One non-intuitive finding is that library use and library support are not correlated. Marketing to and mobilizing our users at election time is not the best use of our resources.

SCATNews

The latest issue of SCATNews, Newsletter of the Standing Committee of the IFLA Cataloguing Section (Number 29) is now available on the IFLA website.

Facebook

I've added this weblog to the Facebook Blog Network; now you can read it there if that is your preference.

Making your content available in more places makes metrics hard. Before Bloglines, Google Reader, Facebook Blog Network, Planet Cataloging, and all the rest, I could get a feel for the number of readers. It didn't matter too much to me; this is done for my own benefit as well as the community's. However, if I were in a position where I needed numbers to justify the work, this would make it difficult.

Collocate and Disambiguate

Here's a new weblog of interest to catalogers, Collocate and Disambiguate. Not yet on Planet Cataloging, so grab the RSS feed for your reader.
Created by Lois Reibach, this blog will discuss news and trends in authority control, and new uses of authority data. Developments in controlled vocabularies will also be covered.

Monday, July 14, 2008

Moving Image Genre/Form Project Report

In early 2007 the Cataloging Policy and Support Office (CPSO) of the Library of Congress initiated a project to create authority records for genre/form headings (MARC tag 155), which indicate what a work is, as opposed to what it is about....

This past Tuesday members of CPSO presented a report on the moving image genre/form project to LC managers. The report
  • explains the function of genre/form headings, including the impact that they have on both cataloging operations and end-user searching;
  • reviews the history of genre/form headings in MARC format and at LC over the last decade;
  • explains the logic of choosing moving image headings as the experimental group and the principles and policies that CPSO developed as the project progressed; and,
  • recommends the expansion of genre/form headings beyond moving images and radio programs into such disciplines as law, music, literature, cartography, and religion.
From an email message.

The report says that the preferred method of entering genre/form information is field 655 rather than subfield $v. Is this the general consensus? Has any research been done? Has any MLIS student even written a paper on the pros and cons of each approach?

Thursday, July 10, 2008

Classify from OCLC

Classify is a service from OCLC. You run a search, the resulting FRBR set is checked, and then the classification numbers used are displayed. It is a quick, simple way to get a class number. No need to be an OCLC member. It does Dewey, NLM, and LCC at least. I'm not sure about other less-used classification schemes, like the one at the US Geological Survey.

Seen on Lorcan Dempsey's weblog.

Wednesday, July 09, 2008

PRISM News

PRISM (Publishing Requirements for Industry Standard Metadata) has announced the availability of the new PRISM Cookbook.
The PRISM Cookbook builds on the PRISM Specification and assumes users have a basic understanding of metadata and PRISM. It does not answer questions such as “What is metadata?”, “What is PRISM?”, and “Why choose PRISM?”, but assists implementers by providing a set of practical implementation steps for a chosen set of use cases and provides insights into more sophisticated PRISM capabilities.
There is also an online video about the Cookbook.

A Best Buy

Special offer for NEW members: JOIN WAML FOR 1/2 OFF

The Western Association of Map Libraries (WAML) is looking for folks who want to expand their knowledge of maps and geospatial information through fun-filled networking opportunities and information-packed meetings and journals!

$15 (normally $30 a year) -- Good for NEW members only. Membership offer good from now till July 31, 2008.

The Western Association of Map Libraries (WAML) is an independent association of map librarians and other people with an interest in maps and map librarianship. Membership in WAML is open to any individual interested in furthering the purpose of the Association which is "to encourage high standards in every phase of the organization and administration of map libraries."

BENEFITS:
  • Subscription to the Information Bulletin (IB)
  • Discounted registration fees to WAML's bi-annual meetings
  • Practical workshops on topics such as aerial photos, scanning projects, and map cataloging
  • Networking regarding geospatial and cartographic information
  • Participation in WAML's electronic discussion board

INFORMATION BULLETIN
WAML's Information Bulletin is issued three times a year and enjoys worldwide readership. It includes feature articles, photo essays, Association business, book and electronic resources reviews, new map lists, and selected news and notes.

MEETINGS!!!
WAML meetings are THE most fun-filled library-related events you can attend!! They occur in the Spring and Fall. They are small (around 50 people), held in great locations such as Las Vegas, Denver, Flagstaff, and Pasadena, and have great field trips and delicious banquets. The presentations deal only with geospatial topics.
Roundtable discussions and workshops take place at every meeting. The registration fee runs from $35 to $60. The accommodations are reasonably priced, the camaraderie is great, and the tone is relaxed. Often, WAML has a 'map exchange' where attendees bring their withdrawn and extra copies of maps and make them available for others.

We are headed to San Diego in October 2008!!

Field trips have taken WAML members to national parks, volcanoes, mountain tops, museums, and vineyards/wineries.

In the last 5 years, WAML has met in Las Vegas, Denver, Flagstaff, Pasadena, Vancouver, Fairbanks, Chico California, and Santa Cruz. Future meeting sites include San Diego, Salt Lake City, and Yosemite National Park.

If that weren't enough, you are invited to give presentations at the conferences OR write articles for the Information Bulletin. Presentations and papers run from the very formal to 'how I done good.' In the past WAML presenters and IB authors have been not just librarians but scholars, novelists, artists, map collectors, map dealers, scientists, and cartographers.

Come join us. The price is right. The offer is available for a limited time. Good times, good friends and good maps await you!

Copied from email on distribution list.

Tuesday, July 08, 2008

Viewzi

Viewzi is a new search tool, a search mash-up (smash?). They have made it possible to create different views and parameters for a search. One search brings up screens for photos, videos, 4 search engines combined, etc. It's an interesting approach, and they will have an open API where custom views can be constructed.

This inspired a couple of thoughts. First, there is no book search. There is an Amazon view. How about one with WorldCat, LibraryThing, Open Content Alliance, Google Books, and Project Gutenberg? Or whatever sites/collections make sense.

Second, is there anything here that could make our OPACs, i.e. the front ends to our catalogs, better? Which ideas, or ways of presenting results, work? The views often break things up by facets: MP3s, videos, websites, etc. Is faceting the results useful? Other times they provide results from just one resource, TechCrunch for instance. Can this inform our metasearch tool development? Maybe not, but maybe there is something worth considering.

Open Shelves Classification

LibraryThing is building the Open Shelves Classification (OSC), a free, "humble," modern, open-source, crowd-sourced replacement for the Dewey Decimal System.
The vision. The Open Shelves Classification should be:
  • Free. Free both to use and to change, with all schedules and assignments in the public domain and easily accessible in bulk format. Nothing other than common consent will keep the project at LibraryThing. Indeed, success may well entail it leaving the site entirely.
  • Modern. The OSC should map to current mental models--knowing these will eventually change, but learning from the ways other systems have and haven't grown, and hoping to remain useful for some decades, at least.
  • Humble. No system--and least of all a two-dimensional shelf order--can get at "reality." The goal should be to create something limited and humble--a "pretty good" system, a "mostly obvious" system, even a "better than the rest" system--that allows library patrons to browse a collection physically and with enjoyment.
  • Collaboratively written. The OSC itself should be written socially--slowly, with great care and testing--but socially. (I imagine doing this on the LibraryThing Wiki.)
  • Collaboratively assigned. As each level of OSC is proposed and ratified, members will be invited to catalog LibraryThing's books according to it. (I imagine using LibraryThing's fielded bibliographic wiki, Common Knowledge.)
I also favor:
  • Progressive development. I see members writing it "level-by-level" (DDC's classes, divisions, etc.), in a process of discussion, schedule proposals, adoption of a tentative schedule, collaborative assignment of a large number of books, statistical testing, more discussion, revision and "solidification."
  • Public-library focus. LibraryThing members are not predominantly academics, and academic collections, being larger, are less likely to change to a new system. Also, academic collections mostly use the Library of Congress System, which is already in the public domain.
  • Statistical testing. To my knowledge, no classification system has ever been tested statistically as it was built. Yet there are various interesting ways of doing just that. For example, it would be good to see how a proposed shelf-order matches up against other systems, like DDC, LCC, LCSH and tagging. If a statistical cluster in one of these systems ends up dispersed in OSC, why?
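
One very simple way to look at the "dispersed cluster" question raised above is to count how many different OSC classes the books of a single DDC class end up in. This is only a sketch of that idea in Python. The sample rows and OSC class names are made up, and the function name `dispersion` is hypothetical; nothing here comes from LibraryThing.

```python
from collections import defaultdict

# Made-up sample rows: (book id, DDC class, proposed OSC class). Not real data.
assignments = [
    ("b1", "520", "Astronomy"),
    ("b2", "520", "Astronomy"),
    ("b3", "520", "Space travel"),
    ("b4", "641", "Cooking"),
    ("b5", "641", "Cooking"),
]


def dispersion(rows):
    """For each DDC class, count the distinct OSC classes its books land in."""
    spread = defaultdict(set)
    for _book, ddc, osc in rows:
        spread[ddc].add(osc)
    return {ddc: len(oscs) for ddc, oscs in spread.items()}


print(dispersion(assignments))
```

A count of 1 means the DDC cluster stayed together in OSC, while higher counts flag clusters worth a closer look. Real testing would need something more careful than a raw count.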

Monday, July 07, 2008

Universal Decimal Classification

Maintenance of the Universal Decimal Classification: overview of the past and preparations for the future by Aida Slavic, Maria Ines Cordeiro, and Gerhard Riesthuis appears in International Cataloguing and Bibliographic Control 37(2), pp. 23-29.
The paper highlights some aspects of the UDC management policy for 2007 and onwards. Following an overview of the long history of modernization of the classification, which started in the 1960s and has influenced the scheme's revision and development since 1990, major changes and policies from the recent history of the UDC revision are summarized. The perspective of the new editorial team, established in 2007, is presented. The new policy focuses on the improved organization and efficiency of editorial work and the improvement of UDC products.

Better Targeted Ads

Computing Semantic Similarity Using Ontologies by Rajesh Thiagarajan, Geetha Manjunath, and Markus Stumptner is a new HP Lab Report.
Determining semantic similarity of two sets of words that describe two entities is an important problem in web mining (search and recommendation systems), targeted advertisement and domains that need semantic content matching. Traditional Information Retrieval approaches even when extended to include semantics by performing the similarity comparison on concepts instead of words/terms, may not always determine the right matches when there is no direct overlap in the exact concepts that represent the semantics. As the entity descriptions are treated as self-contained units, the relationships that are not explicit in the entity descriptions are usually ignored. We extend this notion of semantic similarity to consider inherent relationships between concepts using ontologies. We propose simple metrics for computing semantic similarity using spreading activation networks with multiple mechanisms for activation (set based spreading and graph based spreading) and concept matching (using bipartite graphs). We evaluate these metrics in the context of matching two user profiles to determine overlapping interests between users. Our similarity computation results show an improvement in accuracy over other approaches, when compared with human-computed similarity. Although the techniques presented here are used to compute similarity between two user profiles, these are applicable to any content matching scenario.

Thursday, July 03, 2008

eXtensible Catalog & Koha

News from LibLime about Koha and the eXtensible Catalog.
LibLime, the leader in open-source solutions for libraries and the eXtensible Catalog (XC) project-- an Andrew W. Mellon Foundation funded project currently underway at the University of Rochester's River Campus Libraries-- have announced a new partnership agreement to ensure future compatibility between the XC project and Koha, the first open-source integrated library system.

The XC/LibLime partnership will ensure that the open-source software being developed as part of the XC project and the Koha open-source integrated library system will be fully compatible with each other, enabling current and future users of Koha to take advantage of the added capabilities for managing and distributing metadata that XC will offer. These benefits include facilitating the ability to combine legacy metadata with emerging schemas, and delivering library content to web content management and learning management systems.

Wednesday, July 02, 2008

Changes to MARC Code List for Languages

As a result of a formal request from the National Libraries of Serbia and Croatia and those countries' national standards bodies to the ISO 639 Joint Advisory Committee, the MARC language codes for Serbian and Croatian will be changed as below from the ISO 639-2 bibliographic codes (ISO 639-2/B) to the ISO 639-2 terminology codes (ISO 639-2/T). This change also supports established usage in bibliographic databases in Croatia. Because the codes are obsolete, rather than deleted, they may still appear in bibliographic records created before the implementation of this change.


New Code    Language Name    Previously Coded
srp         Serbian          scc
hrv         Croatian         scr

Subscribers can anticipate receiving MARC records reflecting these changes in all distribution services not earlier than September 1, 2008.
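
Since the obsolete codes will linger in older records, a system that indexes or limits by language may want to treat old and new codes alike. A minimal Python sketch, assuming a hypothetical helper named `normalize_language_code`; the two code pairs are the ones listed above.

```python
# Obsolete MARC language codes mapped to their replacements.
OBSOLETE_TO_CURRENT = {
    "scc": "srp",  # Serbian
    "scr": "hrv",  # Croatian
}


def normalize_language_code(code: str) -> str:
    """Return the current code for an obsolete one; pass other codes through."""
    code = code.strip().lower()
    return OBSOLETE_TO_CURRENT.get(code, code)


for old in ("scc", "scr", "eng"):
    print(old, "->", normalize_language_code(old))
```

Normalizing at index time rather than rewriting the records leaves the older records untouched.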

Martha Yee Articles

Some more articles by Martha Yee are now available.

Integration of Nonbook Materials in AACR2. Cataloging & Classification Quarterly 1983; 3:1-18.

Attempts to Deal With the Crisis in Cataloging at the Library of Congress in the 1940's. Library Quarterly 1987 Jan; 57:1-31.

What is a Work? In: The Principles and Future of AACR: Proceedings of the International Conference on the Principles and Future Development of AACR, Toronto, Ontario, Canada, October 23-25, 1997. Ed., Jean Weihs. Ottawa: Canadian Library Association; Chicago: American Library Association, 1998: 62-104.

Editions: Brainstorming for AACR2000. In: The Future of the Descriptive Cataloging Rules: Papers from the ALCTS Preconference, AACR2000, American Library Association Annual Conference, Chicago, June 22, 1995. Ed., Brian E.C. Schottlaender. (ALCTS Papers on Library Technical Services and Collections, no. 6) Chicago: American Library Association, 1998: 40-65.

Viewpoints: One Catalog or No Catalog? ALCTS Newsletter 1999; 10:4:13-17.

Lubetzky's Work Principle. In: The Future of Cataloging: Insights from the Lubetzky Symposium, April 18, 1998, University of California, Los Angeles. Ed., Tschera Harkness Connell, Robert L. Maxwell. Chicago: American Library Association, 2000.

Tuesday, July 01, 2008

RDA News

News from RDA.
The Co-Publishers of RDA Online (the American Library Association, the Canadian Library Association, and the Chartered Institute of Library and Information Professionals) have reached the conclusion that further time is required to complete the development of the new software that will be used for distributing the full draft of RDA for constituency review.

The full draft was originally scheduled for release on August 4, 2008. Instead, it will now be issued in October 2008. The three month time period allocated for comments on the full draft is unchanged, and in this new schedule will extend from October into January 2009. More specific dates for RDA's final release will be forthcoming shortly.

Members of the Committee of Principals (CoP) and the Joint Steering Committee for Development of RDA (JSC) agree that the importance of distributing RDA content in a well-developed and tested version of the new software is such that a two-month delay is justified. They concluded that this extension is worthwhile given the ultimate value of the exceptional effort that is going into RDA and feel that the review by constituencies will be enhanced as a result.

OCLC Terminology Services

Terminology Services, experimental services for controlled vocabularies from OCLC Research, is now available.

Highlights

  • Search descriptions of controlled vocabularies
  • Search for concepts/headings in a controlled vocabulary
  • Retrieve a single concept/heading by its identifier
  • View relationships for a concept/heading including equivalence, hierarchical, and associative
  • Retrieve concepts/headings in multiple representations including HTML, MARC XML, SKOS, and Zthes.
  • Search using SRU CQL syntax
Vocabulary Resources include:
  • FAST subject headings
  • GSAFD Form and genre terms
  • Library of Congress AC Subject Headings
  • Library of Congress Subject Headings
  • Medical Subject Headings
  • Thesaurus for graphic materials: TGM I
  • Thesaurus for graphic materials: TGM II
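
For anyone who has not used SRU, here is a minimal sketch of what a CQL search request looks like, relating to the "Search using SRU CQL syntax" item above. The base URL is a placeholder, not the real Terminology Services address, and the `dc.title` index is just a common CQL example, not necessarily one this service supports. The parameter names (`operation`, `version`, `query`, `maximumRecords`) are the standard SRU 1.1 ones.

```python
from urllib.parse import urlencode

# Hypothetical endpoint, for illustration only.
BASE_URL = "http://example.org/sru/vocabulary"

def build_sru_url(cql_query, max_records=10):
    """Return an SRU 1.1 searchRetrieve URL for a CQL query string."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": cql_query,  # urlencode takes care of quotes and spaces
        "maximumRecords": str(max_records),
    }
    return BASE_URL + "?" + urlencode(params)

print(build_sru_url('dc.title = "space flight"'))
```

The point is only that an SRU search is an ordinary HTTP GET with the CQL query carried in one parameter, so it is easy to call from a script or a catalog interface.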

New Union Catalog

The Avi Chai Foundation has announced a new tool for Judaica librarians: the Avi Chai Bookshelf Union Catalog. The union catalog contains the MARC bibliographic holdings of 31 Jewish high school libraries in the United States and Canada that have been recipients of Avi Chai's Bookshelf grant. The Avi Chai Union Catalog runs on the OPALS (open source) library automation system.

Monday, June 30, 2008

Discovery at Safari Books

Jeff Patterson, CEO of Safari Books Online LLC, spoke at the O'Reilly Tools of Change Conference on Valuing Content in a Web-enabled World.
To effectively market their wares, publishers need to understand how their content is valued by the audience. With the web turning traditional distribution models on their head, easy searchability and access to a variety of free and paid resources must be considered. Jeff Patterson shares research on the information seeking habits of his client base of IT professionals. As users weigh the worth of information in exchange for their time, money and attention, publishers must grasp not just what is sold, but what is read, used and reused....

Money is one part of the equation, but time, and willingness to share personal details, are also important forms of currency. Patterson's studies posed a number of scenarios which revealed different behaviors depending on the urgency of the information seeking. Subscribers researching a long term question tended to start with paid resources such as online subscriptions or print books. Those with urgent business questions were more likely to use search engines as their first tool. These different behaviors bring home the point that products must be discoverable within a sea of available options. Information consumers will place a value on different resources depending on their context. The burden is now on the publishers to understand how their information is being used.

Friday, June 27, 2008

2008 Midwinter MARBI Meeting Minutes

The 2008 Midwinter MARBI Meeting minutes are now available online.

Cataloging Principles and RDA

Cataloging Principles and RDA by Barbara Tillett is the newly available webcast from LC.
The second in a series on RDA: Resource Description and Access, the next generation cataloging code designed for the digital environment. This presentation deals with the cataloging principles that have influenced the development of RDA; the challenges they present to the international sharing of bibliographic and authority data; and the challenges they present to the developers of RDA.

Wednesday, June 25, 2008

Metadata for Resource Discovery

Metadata to Support Next-Generation Library Resource Discovery: Lessons from the eXtensible Catalog, Phase 1 by Jennifer Bowen has been published in the June 2008 issue of Information Technology and Libraries (p. 6-19).

The slides for her upcoming talk at ALA as part of the ALCTS Program, Creating the Future of the Catalog and Cataloging (Sunday morning, June 29, 8 AM-12 PM, Anaheim Convention Center, Room 204B) are on the XC Shared Results Page.

The next time nominations roll around for Movers and Shakers someone should nominate Jennifer. Her work on RDA and the eXtensible Catalog more than qualifies her.

Delay in Publication of 31st Edition of Library of Congress Subject Headings

News from LC.
Delay in publication of 31st edition of Library of Congress Subject Headings

Due to production problems, the 31st edition of the five-volume printed edition of the Library of Congress Subject Headings, commonly referred to as the Red Books, will not be available until the spring of 2009. The data cutoff date for the 31st edition will now be December 31, 2008.

Open Source OPAC

Rapi is yet another open-source OPAC project. It uses Lucene and Ruby like most of the projects do.
Rapi is an open-source project of the WING group in the School of Computing, National University of Singapore licensed under the MIT license. Rapi provides an OPAC package that allows you to:
  1. Build a Lucene index from your MARC files
  2. Screen scrape live circulation data from your own iii OPAC
  3. Wrap your OPAC with a customizable user interface
The user interface packaged with Rapi has been tested with Firefox 2 and 3 as well as Internet Explorer 7. The user interface supports a variety of features including tabs, an overview+details view, and a suggestion bar among many others. Note that although the user interface supports query suggestions, the package currently does not provide any suggestion modules. With that said, if you do have query suggestion modules, they can be easily integrated with the package. As an example, our live demo incorporates a spelling suggestion module.
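
Step 1 in the list above, the index, is the heart of any of these projects. As a rough illustration of what such an index does, here is a toy inverted index in plain Python. This is not Lucene and not Rapi's code, and the records are made-up dictionaries keyed by MARC tag rather than parsed MARC files.

```python
from collections import defaultdict

# Made-up records: record id -> {MARC tag: text}. Real MARC parsing is skipped.
RECORDS = {
    1: {"245": "Geology of the Moon", "650": "Lunar geology"},
    2: {"245": "Mars and its satellites", "650": "Mars (Planet)"},
    3: {"245": "The Moon and Mars compared", "650": "Planets"},
}

def tokenize(text):
    """Lowercase the text and split on anything that is not a letter or digit."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in text)
    return cleaned.split()

def build_index(records):
    """Map each term to the set of record ids whose fields contain it."""
    index = defaultdict(set)
    for rec_id, fields in records.items():
        for text in fields.values():
            for term in tokenize(text):
                index[term].add(rec_id)
    return index

def search(index, query):
    """Return ids of records containing every term in the query (AND search)."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result

index = build_index(RECORDS)
print(sorted(search(index, "moon")))       # [1, 3]
print(sorted(search(index, "moon mars")))  # [3]
```

A real search engine adds ranking, stemming, field weighting and much more, but the term-to-records lookup is the same basic idea.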

Distributed Metadata Control Systems

Distributed Version Control and Library Metadata by Galen M. Charlton.
Distributed version control systems (DVCSs) are effective tools for managing source code and other artifacts produced by software projects with multiple contributors. This article describes DVCSs and compares them with traditional centralized version control systems, then describes extending the DVCS model to improve the exchange of library metadata.
Interesting suggestion. Network theory applies here: a single node would be useless, two or three nodes could be interesting depending on the institutions, something like the old Linked Systems Project. More widespread adoption would make it much more useful.

Approved Books

The Open Library folks are considering adding information about banning to their bibliographic records. Other than MPAA ratings, does anyone add approval by some body to their bibliographic records? I can remember seeing Nihil obstat and Imprimi potest on some books growing up. Is this still useful to some patrons for selecting an item?

Cross-concordances

Mayr, Philipp and Petras, Vivien (2008) Cross-concordances: terminology mapping and its effectiveness for information retrieval. World Library and Information Congress: 74th IFLA General Conference and Council, Québec, Canada.
The German Federal Ministry for Education and Research funded a major terminology mapping initiative, which found its conclusion in 2007. The task of this terminology mapping initiative was to organize, create and manage ‘cross-concordances’ between controlled vocabularies (thesauri, classification systems, subject heading lists) centred around the social sciences but quickly extending to other subject areas. 64 crosswalks with more than 500,000 relations were established. In the final phase of the project, a major evaluation effort to test and measure the effectiveness of the vocabulary mappings in an information system environment was conducted. The paper reports on the cross-concordance work and evaluation results.

Script Codes

One of the issues being considered by MARBI, Discussion Paper No. 2008-DP05, is how to indicate the script used in the bibliographic record. There is strong support for using the ISO 15924 Code List, Codes for the representation of names of scripts or Codes pour la représentation des noms d’écritures.
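
For those who have not looked at ISO 15924: the alphabetic codes are four letters, the first one capitalized. A small sketch of checking and looking up a code follows. The table holds only a handful of codes from the list with shortened names, and the function name is my own; where the code would be carried in the MARC record is exactly what the discussion paper is about and is not shown here.

```python
import re

# A few ISO 15924 codes with shortened names; the full list is much longer.
SCRIPT_NAMES = {
    "Arab": "Arabic",
    "Cyrl": "Cyrillic",
    "Grek": "Greek",
    "Hani": "Han",
    "Hebr": "Hebrew",
    "Latn": "Latin",
}

# Four letters: one uppercase followed by three lowercase.
CODE_SHAPE = re.compile(r"[A-Z][a-z]{3}")

def script_name(code):
    """Return the script name for a code, or None if it is ill-formed or not in the table."""
    if not CODE_SHAPE.fullmatch(code):
        return None
    return SCRIPT_NAMES.get(code)

print(script_name("Cyrl"))  # Cyrillic
print(script_name("cyrl"))  # None (wrong capitalization)
```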

Thursday, June 19, 2008

FireFox Problems

I got the new improved FireFox, version 3, yesterday and now I'm using MS Explorer. FF3 is SLOW. I can't get into Blogger. Several add-ons I liked, TinyURL Creator, Link Evaluator, Persistent URL Bookmarker, and Map+ (opens a map for any address) don't work. I'm going to have to investigate whether it is possible to roll back to the old version. I sure hope so. My advice, FWIW, wait.

It is the portable version of FireFox, maybe the regular version would not be so slow. It still wouldn't have the add-ons.

Operator+, an add-on that allows working with microformats, is not working properly. I can't seem to export hCal events to Outlook.

June 24, I've reverted to an older version of FF Portable. All my tools are working again. At home I plan on moving to FF3. It will not be the portable version and the add-on tools are much less important.

Wednesday, June 18, 2008

OCLC Group Services

I've just heard of OCLC Group Services, a way for small libraries to participate in OCLC. Anyone have any experience with a group? Any group willing to have the Lunar and Planetary Institute Library become a member?

Tuesday, June 17, 2008

The Future of Cataloging: A PALINET Symposium

MP3s and slides from The Future of Cataloging: A PALINET Symposium are now available. The talks were:
  • Keynote Address, Karen Calhoun "Traveling Through Transitions in Technical Services: From Surviving to Thriving"
  • Response to Keynote, Panel Discussion / Beth Picknally Camden
  • Functional Requirements for Bibliographic Records (FRBR) and Current Development and Implementation Plans for Resource Description and Access (RDA) / John Attig
  • On the Record, One View of the Future – Library of Congress Report on the Future of Bibliographic Control / Nancy Fallgren
  • Making Special Collections Not So Special? The Implications for Archives and Special Collections of the Report of the Library of Congress Working Group on the Future of Bibliographic Control / Christine Di Bella
  • High Quality Discovery in a Web 2.0 World: Architectures for Next Generation Catalogs / John Mark Ockerbloom
  • Summary & Closing Remarks / Dina Giambi

Monday, June 16, 2008

Tagging

@toread and Cool : Subjective, Affective and Associative Factors in Tagging. In Proceedings Canadian Association for Information Science/L'Association canadienne des sciences de l'information (CAIS/ACSI), Vancouver, British Columbia (Canada).
This paper examines the use of non subject related tags in social bookmarking tools. Previous studies of tagging determined that many common tags are not directly subject related but are in fact affective tags dwelling on a user's emotional response to a document or are time and task related tags related to a user's current projects or activities. These tags have been analysed to examine their role in the tagging process.
While not an academic study, the experience of LibraryThing in cleaning up tags for sale to libraries might be an interesting comparison. The study compares Del.icio.us, Connotea and CiteULike. It would be interesting to see how other tagging sites compare. What is the difference between tagging books, articles, websites and toasters? Is tagging different in different cultures? Do people in Japan tag differently than those in France? How about folk in Economics and Astrophysics? Lots of room for more research here. The next step would be to use the findings to inform our construction of subject headings. The FRBR group working on subjects might have a new body of knowledge to use in their work.

Friday, June 13, 2008

MARBI @ ALA

The remainder of the June 2008 MARC Advisory Group proposals have been posted and linked to the agenda for the meeting.

Chopac.org

Chopac.org has some interesting cataloging tools. There is an Amazon to MARC converter, DDC22 summaries, Amazon review server, and some others. They also have an ILS to download. Runs in the LAMP environment. They seem to have it up and running on their site. It gets additional info from Amazon and Google Books to enrich the records.

Thursday, June 12, 2008

On Descript

When I started this weblog back in 2002 nobody was covering cataloging. There was AUTOCAT, a great place for discussion. But no one place was acting as a news source. Now there are plenty of other places to keep current in cataloging; check Planet Cataloging for a good list of weblogs in this space. Now another voice joins the chorus, On Descript, and we are richer for it.
On Descript is a forum dedicated to all things description in Library and Information Science (LIS). Here, you'll find information on subjects like cataloging, indexing, abstracting and the foundations of description practices in LIS. Please share your ideas!
Not yet covered by Planet Cataloging, so visit his site.

Tuesday, June 10, 2008

Functional Requirements for Bibliographic Records

A German translation of the text of Functional Requirements for Bibliographic Records (FRBR) as amended and Japanese translations of the recently published errata and the amendment to the expression entity have been made available through IFLANET.

Monday, June 09, 2008

DCMI Registry Task Group

From the DCMI page.

DCMI Registry Task Group: call for participation.

A DCMI Registry Task Group has been set up with the primary aims of developing shared functional requirements and inter-registry interoperability issues. This group is currently recruiting participants. Those with an interest in metadata schema registries, terminology registries, ontology registries and metadata vocabulary management are invited to visit the Task Group's Wiki for further information, news, upcoming events and opportunities to contribute.

OLAC-MOUG 2008 Conference

Registration for the OLAC-MOUG 2008 Conference is open.

The joint conference of OLAC (Online Audiovisual Catalogers) and MOUG (Music OCLC Users Group) will take place in Cleveland, Ohio, between Friday, September 26 and Sunday, September 28, 2008. Attendees will enjoy four workshops on cataloging various non-book materials, keynote speech by Lynne Howarth (former Dean of the Faculty of Information Studies at the University of Toronto); closing address by Janet Swan Hill (Associate Director for Technical Services, University of Colorado); and a session on RDA, to name just a few highlights.

Preconference: space is limited for Thursday September 25th's Map Cataloging preconference, given by Paige Andrew.

Please see the conference website for more information and the registration form.

Posted to many distribution lists.

OAI-ORE Resource Maps

Posted to several lists.

The Foresite project is pleased to announce the initial code of two software libraries for constructing, parsing, manipulating and serialising OAI-ORE Resource Maps. These libraries are being written in Java and Python, and can be used generically to provide advanced functionality to OAI-ORE aware applications, and are compliant with the latest release (0.9) of the specification. The software is open source, released under a BSD licence, and is available from a Google Code repository.

You will find that the implementations are not absolutely complete yet, and are lacking good documentation for this early release, but we will be continuing to develop this software throughout the project and hope that it will be of use to the community immediately and beyond the end of the project.

Both libraries support parsing and serialising in: ATOM, RDF/XML, N3, N-Triples, Turtle and RDFa

Foresite is a JISC funded project which aims to produce a demonstrator and test of the OAI-ORE standard by creating Resource Maps of journals and their contents held in JSTOR, and delivering them as ATOM documents via the SWORD interface to DSpace. DSpace will ingest these resource maps, and convert them into repository items which reference content which continues to reside in JSTOR. The Python library is being used to generate the resource maps from JSTOR and the Java library is being used to provide all the ingest, transformation and dissemination support required in DSpace.

Please feel free to download and play with the source code, and let us have your feedback via the Google group:

foresite@googlegroups.com

Friday, June 06, 2008

More MARBI News

Some more MARBI news.

The following papers are available for review by the MARC community:
  • Proposal No. 2008-04: Changes to Nature of entire work and nature of content codes in field 008 of the MARC 21 bibliographic format
  • Proposal No. 2008-09: Definition of Videorecording format codes in field 007/04 of the MARC 21 Bibliographic format
  • Proposal No. 2008-10: Definition of a subfield for Other standard number in field 534 of the MARC 21 bibliographic format
Additional proposals and discussion papers will be posted shortly.

The draft agenda for the 2008 ALA Annual MARBI meetings is available online.

Please note that there is a strong possibility that MARBI may meet during its Monday afternoon time slot of 1:30-3:30 for continuation of the discussion.

Skype News

Skype now lets you set your mobile number as your caller-id on outgoing calls. Very nice. I'm set up.

ALA Annual MARBI Meeting

Posted to many e-mail distribution lists.

The following papers are available for review by the MARC community:

  • Proposal No. 2008-06: Adding information associated with the Series Added Entry fields (800-830)
  • Proposal No. 2008-07: Making field 440 (Series Statement/Added Entry--Title) obsolete in the MARC 21 Bibliographic Format
  • Proposal No. 2008-08: Definition of subfield $z in field 017 of the MARC 21 Bibliographic and addition of the field to the MARC 21 Holdings formats
  • Discussion Paper 2008-DP06: Coding deposit programs as methods of acquisitions in field 008/07 of the MARC 21 holdings format
Additional proposals and discussion papers will be posted shortly.

The draft agenda for the 2008 ALA Annual MARBI meetings will be made available soon.

Wednesday, June 04, 2008

Yahoo Search Monkey

Another step towards the Semantic Web, Yahoo SearchMonkey.
SearchMonkey is fundamentally about transforming the way search results are compiled and displayed by leveraging the same structured data that powers the millions of pages indexed by Yahoo! Search. By sharing structured data with Yahoo!, site owners and content publishers can build more useful, relevant and visually appealing search results, which can increase the quantity and quality of traffic from Yahoo! Search....

You can share data by embedding microformats, using semantic web standards such as RDF, sharing an XML data feed directly with Yahoo! Search, or using the SearchMonkey developer tool to build custom data services that extract structured data from your pages.

LibriVox

LibriVox is becoming a valuable resource for free audio books. They just reached 1500 titles in the collection.
We’ve had a pretty extraordinary May. We cataloged our 1,500th book, James Baldwin’s children’s history book, Four Great Americans, which was a great accomplishment. (Considering seven months ago we were at 1,000).

But we also had an impressively productive month: we released 115 (!) audiobooks into the public domain, almost four per day. Our previous record for monthly production was 77, reached in July 2007.
Is anyone cataloging these and adding them to their collection? Burning them to CDs and adding those to the collection? A few months back the Nebraska Library Commission made news by adding a few books licensed under Creative Commons to their catalog. Anyone doing the same for the LibriVox materials? Adding the records to OCLC for sharing or making them available via OAI-PMH?

Code4Lib Conference

The video from the Code4Lib Conference is now on Archive.org. Note that you can get the MPEG2 high def format there. Some talks include:
  • MARCThing Casey Durfee discusses MARCThing, a self-contained web service which aims to do for MARC and Z39.50 what Solr did for searching.
  • OpenURL Ross Singer and Jonathan Rochkind describe Ümlaut, an open source OpenURL middleware layer intended to improve the link resolving chain by analyzing incoming citations and intelligently querying resources to better enable access to them.
  • Blacklight Bess Sadler describes Blacklight, a Solr based OPAC replacement being developed by University of Virginia Library.
  • Scriblio Casey Bisson describes Scriblio, the OPAC replacement based on the WordPress authoring system.
  • A Metadata Registry Jon Phipps gives an introduction to the Metadata Registry, an open source vocabulary, metadata schema, and DC application profile manager and registry.
And plenty more.

Tuesday, June 03, 2008

Object Reuse and Exchange (ORE) Specifications

The Open Archives Initiative has announced the public beta release of Object Reuse and Exchange Specifications.
Over the past eighteen months the Open Archives Initiative (OAI), in a project called Object Reuse and Exchange (OAI-ORE), has gathered international experts from the publishing, web, library, and eScience community to develop standards for the identification and description of aggregations of online information resources. These aggregations, sometimes called compound digital objects, may combine distributed resources with multiple media types including text, images, data, and video. The goal of these standards is to expose the rich content in these aggregations to applications that support authoring, deposit, exchange, visualization, reuse, and preservation. Although a motivating use case for the work is the changing nature of scholarship and scholarly communication, and the need for cyberinfrastructure to support that scholarship, the intent of the effort is to develop standards that generalize across all web-based information including the increasingly popular social networks of “web 2.0”.

Monday, June 02, 2008

FGDC Digital Cartographic Standard for Geologic Map Symbolization

Found this sitting in the draft folder for quite some time. Here it is at last. The PostScript version of the FGDC Digital Cartographic Standard for Geologic Map Symbolization is now available as a USGS Techniques and Methods publication.

Improving Subject Searching

Improving subject searching in databases through a combination of descriptors and UDC by Granados, Mariangels and Nicolau, Anna (2008) In Proceedings BOBCATSSS'08: Providing access for everyone, Zadar (Croatia).
Problems with subject access to online catalogues and databases are not new. Studies on the use of OPACs have revealed two apparently endemic problems: on the one hand, the large number of searches with zero hits (failed searches) and on the other, the retrieval of an excessive amount of bibliographic records (information overload).

In this paper we describe a new information retrieval technique based on the combination of descriptor weighting and the use of the Universal Decimal Classification (UDC) call numbers.

The use of classification call numbers in order to search the catalogue has traditionally been very restricted. In most catalogues, call numbers are used only as topographical indicators and are not searchable. The new system described here makes much fuller use of them.

The system is based on the hypothesis that a set of descriptors correspond to a UDC call number. Through the analysis of the frequency of distribution of descriptors and call numbers, we create a set of clusters that allow increasing precision and recall. At the same time, these clusters offer alternative search modes, making it possible to systematize the indexing process and increase its consistency. Here we present a case study of the use of the system with the ERIC database.
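
The abstract does not spell out the algorithm, so the following is only my guess at the flavour of the first step: count how often each descriptor turns up with each class number, and keep the descriptors that appear in most of the records for that number as its cluster. The records, the class numbers, the threshold and the function name are all invented for illustration.

```python
from collections import Counter, defaultdict

# Invented sample records: (class number, descriptors assigned to the record).
RECORDS = [
    ("37", ["educational theory", "philosophy"]),
    ("37", ["educational theory", "curriculum"]),
    ("37", ["educational theory", "philosophy"]),
    ("02", ["libraries", "cataloging"]),
    ("02", ["libraries", "classification"]),
]

def build_clusters(records, min_share=0.6):
    """For each class number, keep descriptors found in at least min_share of its records."""
    counts = defaultdict(Counter)   # class number -> descriptor frequencies
    totals = Counter()              # class number -> number of records
    for number, descriptors in records:
        totals[number] += 1
        counts[number].update(set(descriptors))  # count each descriptor once per record
    clusters = {}
    for number, counter in counts.items():
        clusters[number] = sorted(
            d for d, n in counter.items() if n / totals[number] >= min_share
        )
    return clusters

print(build_clusters(RECORDS))
# {'37': ['educational theory', 'philosophy'], '02': ['libraries']}
```

With clusters like these, a search on a descriptor could be widened to the class number it usually travels with, which is one way classification data might become searchable.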

Friday, May 30, 2008

Tag Cleaner

Bring some consistency to your tagging with Delicious Tag Cleaner
What would a "Delicious Tag Cleaner" be? It is a tool for removing unnecessary tags from your del.icio.us account....

If you're like me, you probably have thousands of bookmarks collected over years and years of web surfing and hundreds of tags used to describe them. But the thing is that over these months/years you haven't been able to come up with a consistent taxonomy for your tags.

I have, for example, dozens of different tags for expressing links related to software development: "dev", "devel", "development" etc.

So this tool can suggest you tags to be merged together, so you can choose one by one and have this tool to merge the chosen tags on your delicious account.
As you clean up tags, doesn't that remove them from the stream-of-consciousness thing? Aren't they losing their value and becoming subject headings? Poor ones at that.
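
I don't know how the tool decides which tags belong together. One crude way to produce suggestions like the "dev", "devel", "development" example in the quote is to group tags that start with the same shorter tag. A sketch, with my own function name and sample tags:

```python
def suggest_merges(tags, min_prefix=3):
    """Group tags where the shortest tag in the group is a prefix of the others."""
    groups = []
    # Shortest tags first, so each group is anchored on its shortest member.
    for tag in sorted(set(tags), key=lambda t: (len(t), t)):
        for group in groups:
            if len(group[0]) >= min_prefix and tag.startswith(group[0]):
                group.append(tag)
                break
        else:
            groups.append([tag])
    # Only groups with more than one tag are merge candidates.
    return [g for g in groups if len(g) > 1]

tags = ["dev", "devel", "development", "marc", "marcxml", "frbr", "rda"]
print(suggest_merges(tags))
# [['dev', 'devel', 'development'], ['marc', 'marcxml']]
```

Note the second suggestion: "marc" and "marcxml" share a prefix but are not obviously the same thing, which is why a person still has to approve each merge.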

Statement of International Cataloguing Principles

A reminder from IFLA about the Statement of International Cataloguing Principles.
This is a reminder announcement that the Statement of International Cataloguing Principles developed by the five IFLA Meetings of Experts on an International Cataloguing Code is now available for worldwide review and comment.

A vote form is also available there and can be used by anyone to indicate whether they approve the statement or not and to make comments. The form can be printed out, filled in, and faxed, or it can be filled in electronically and sent as an e-mail attachment.

Wednesday, May 28, 2008

2.0 Speaking Opportunities

Any folks who want to represent the library community in an education 2.0 setting should check out CR 2.0. They are having a series of 20 workshops around the U.S. and are using an unconference format. Go to their website and suggest a topic and the folks attending vote on what they want to hear. Even if you don't become a facilitator for the discussion, at least they have seen that libraries are part of education 2.0. Just participating in the discussion might open some eyes to the role of libraries in education.

Tuesday, May 27, 2008

Tagging @ NASA

NASA is sporting a tag cloud on their home page. It is generated from words used to search the site. Look to the right a bit down. It sports a nice star field background.

Friday, May 23, 2008

Web Ontology Language (OWL)

Some papers from HP Labs concerning the Web Ontology Language (OWL)
  • An OWL Full Interpretation by Jeremy Carroll HPL-2008-60

    This report is an appendix to report HPL-2008-59. It gives a worked example of the construction used in the proof from that report. For finiteness, a reduced datatype map consisting of only xsd:boolean is used. Each of the graphs in the construction is listed explicitly, with some redundancy eliminated. The final Herbrand graph contains about 15,000 triples.

  • The Consistency of OWL Full (with proofs) by Jeremy Carroll and Dave Turner HPL-2008-59

    We show that OWL1 Full without the comprehension principles is consistent, and does not break most RDF graphs that do not use the OWL vocabulary. We discuss the role of the comprehension principles in OWL semantics, and how to maintain the relationship between OWL Full and OWL DL by reinterpreting the comprehension principles as permitted steps when checking an entailment, rather than as model theoretic principles constraining the universe of interpretation. Starting with such a graph we build a Herbrand model, using, amongst other things, an RDFS ruleset, and syntactic analogs of the semantic "if and only if" conditions on the RDFS and OWL vocabulary. The ordering of these steps is carefully chosen, along with some initialization data, to break the cyclic dependencies between the various conditions. The normal Herbrand interpretation of this graph as its own model then suffices. The main result follows by using an empty graph in this construction. We discuss the relevance of our results, both to OWL2, and more generally to a future revision of the Semantic Web recommendations. This longer version contains the proofs.

  • The Consistency of OWL Full by Jeremy Carroll and Dave Turner HPL-2008-58

    We show that OWL1 Full without the comprehension principles is consistent, and does not break most RDF graphs that do not use the OWL vocabulary. We discuss the role of the comprehension principles in OWL semantics, and how to maintain the relationship between OWL Full and OWL DL by reinterpreting the comprehension principles as permitted steps when checking an entailment, rather than as model theoretic principles constraining the universe of interpretation. Starting with such a graph we build a Herbrand model, using, amongst other things, an RDFS ruleset, and syntactic analogs of the semantic "if and only if" conditions on the RDFS and OWL vocabulary. The ordering of these steps is carefully chosen, along with some initialization data, to break the cyclic dependencies between the various conditions. The normal Herbrand interpretation of this graph as its own model then suffices. The main result follows by using an empty graph in this construction. We discuss the relevance of our results, both to OWL2, and more generally to a future revision of the Semantic Web recommendations. Publication Info: Submitted to ISWC 2008, 7th International Semantic Web Conference, Karlsruhe

MARC 2 MODS Tool

The Digital Library Federation announces a revision to their MARCXML to MODS tool.
The DLF Aquifer Metadata Working Group announces an update to the XML stylesheet they have developed for the Aquifer project, for conversion of MARCXML records to MODS. The current stylesheet, DLF_MARC2MODS_1.34.xsl, can be found from a link on our MARC to Aquifer MODS XSLT Stylesheet page. Changes are briefly documented in the comments at the beginning of the stylesheet. We have also updated the Introduction pages that give more detail about some of the changes.

The changes include re-added mapping for tag 510 citations to the note element for monographs only; added subject:hierarchicalGeographic element mapping of tag 662 Subject - Hierarchical Place Name; added mapping of tags 561 (ownership) and 581 (publications) to the note element, removed mapping of 007 specific material designation to the genre element when the value is "remote", and a correction to no longer repeat mapping of dates from the Leader to originInfo:date when the date type is "questionable".

Tuesday, May 20, 2008

MARC Update

Update No. 8 (October 2007) was recently released in multiple document formats. It includes changes made to the MARC 21 formats resulting from proposals which were considered by the ALA ALCTS/LITA/RUSA Machine-Readable Bibliographic Information Committee (MARBI), the Canadian Committee on MARC (CCM) and the BIC Bibliographic Standards Group in 2007.

The printed update is available through the Cataloging Distribution Service.
It includes pages for fields that have been changed, with changes marked with side lining. PDFs of those printed update pages are also available online.

D-Lib Magazine

The May/June 2008 issue of D-Lib Magazine is now available.

Some articles of interest include:
  • PREMIS With a Fresh Coat of Paint: Highlights from the Revision of the PREMIS Data Dictionary for Preservation Metadata Brian F. Lavoie, OCLC Online Computer Library Center
  • Adding Value to the Library Catalog by Implementing a Recommendation System Michael Moennich and Marcus Spiering, Karlsruhe University Library
I found the one on the recommendation system interesting. They are selling the service as an add-on to the OPAC. LibraryThing for Libraries is doing the same with their data. Syndetics has been doing this for quite some time with cover images and reviews. Seems to be a trend here, third-party additions to the OPAC supplying services based on data collected elsewhere. In the article world, there was some research done collecting OpenURL data to rate papers.

Monday, May 19, 2008

xOCLCnum

A new service from OCLC.
I'd like to announce and invite you to try xOCLCnum, the latest in the xIdentifier family of Web services from OCLC.

Just as xISBN allows you to find all related editions of a book by entering its ISBN, xOCLCnum does the same thing using OCLC number.

xOCLCnum is queried using a simple URL format, and returns an XML response with both related OCLCnums and related ISBNs (if any). It is designed to be easily built in to your library application, so you can expand queries, find all related editions, or do whatever creative thing you want to do.

Background:
ISBNs have been assigned since 1970, to most but not all books published.

OCLC numbers are assigned whenever a record is added to WorldCat, OCLC's global union catalog. These records cover a large portion of all books, old and new, held by any library in North America and, increasingly, other regions worldwide (most recently, the National Library of China).

So the coverage range of OCLC numbers is, not surprisingly, far greater than that of ISBNs: in WorldCat, for example, around 100 million OCLCnums compared to about 20 million ISBNs.

More Information on xOCLCnum
xOCLCnum API description

1:30 Ratio for Information

The post at Librarian.net about the book containing thirty tables-of-contents reminded me of the 1:30 rule for information.
Dolby and Resnikoff found these relationships:
  • A book title is 1/30 the length of a table of contents in characters, on average
  • A table of contents is 1/30 the length of a back of the book index, on average
  • A back of the book index is 1/30 the length of the text of a book, on average
  • An abstract is 1/30 the length of the technical paper it represents, on average
Is this a result of living in the material world, one that won't hold true online? Or is it a function of the brain and how it deals with information, and so likely to hold true wherever we function?
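As a rough worked example of what the ratios imply when chained together, here is a small sketch. The 500,000-character book length is a made-up figure for illustration, not a number from Dolby and Resnikoff, and the function name is my own.

```python
# Chain the 1:30 rule downward from the full text of a book.
# ASSUMPTION: the 500,000-character text length is invented for illustration.
RATIO = 30


def surrogate_lengths(text_chars):
    """Return the expected length, in characters, of each surrogate level."""
    index = text_chars / RATIO   # back-of-the-book index
    toc = index / RATIO          # table of contents
    title = toc / RATIO          # book title
    return {"text": text_chars, "index": index, "toc": toc, "title": title}


lengths = surrogate_lengths(500_000)
for level, chars in lengths.items():
    print(f"{level:>5}: {chars:,.0f} characters")

# Three steps of 1:30 compound to 1:27,000 between title and text.
title_to_text = RATIO ** 3
print(f"title : text = 1 : {title_to_text:,}")
```

Under that assumption the title comes out at fewer than 20 characters, which feels about right for a real book.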

XML Workshop

A couple of years ago I had the pleasure of taking the XML workshop offered by Eric Lease Morgan. One of the best workshops I've experienced. Now the notes have been revised and are available online.
XML is about distributing data and information unambiguously. Through this hands-on workshop you will learn: 1) what XML is, and 2) how it can be used to build library collections and facilitate library services in our globally networked environment.
  • An introduction to XML
  • Activity - Beyond MARC
  • Indexes make search easier
  • Activity - Indexing/searching MODS
  • Activity - Writing XML
  • Flavors of XML
  • Activity - Writing XML, redux
  • Activity - Full-text indexes
  • Client/server computing
  • Databases for data storage and maintenance
  • OAI-PMH - a de-centralized OCLC
  • Activity - Being an OAI service provider
  • Activity - Being an OAI data repository
  • Web Services
  • Activity - Creating a "mash-up"
  • Workshop summary
  • External links

Friday, May 16, 2008

MARC Online

More news from LOC.
The Network Development and MARC Standards Office is pleased to announce that the Full versions of all five MARC 21 formats are now available online, along with the Online Concise.
The "full" version of a format contains detailed descriptions of every data element, along with examples, input conventions, and history sections - all of the information from the printed formats. There are no textual differences between the Online Full and the printed documentation. The Concise still contains all of the elements and enough description to serve many lookup needs. Changes from the most recent update of the formats are indicated in the text of both the Online Concise and the Online Full.

Links in LC Records

News about 856 links from LOC.
I've received a couple of questions recently about the 856 links in LC records for the TOCs, descriptions, bios, sample texts, etc. and wanted to spread the word about what we did.

Every month, around the first of the month, folks run their link checkers to validate the links in their copies of LC records. The volume of traffic against our web server was tremendous. A couple of times it nearly brought the server down. We tried several things to minimize the impact if it looked like a link checker was running against the web server, but this didn't seem to help the problem. In the end, we moved all of the files that are in the 856 fields to a different, larger, more robust server. Apparently this is causing link checkers to report that there is a redirect and people are asking if they need to change the URL for the links. I would say that there is no need to change the 856 links from http://www.loc.gov... to http://catdir.loc.gov.... In fact, I am still adding the URLs as http://www.loc.gov...

LC is committed to maintaining these URLs; you should not be experiencing access problems with them except when running link checkers or maybe harvesters. I appreciate any reports of wrong connections or other serious problems with the files. By my count, we have over 710,000 links in the LC catalog now, so you can see this is a major commitment for LC.
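For anyone maintaining a link checker, the advice above amounts to treating this particular redirect as expected rather than as a reason to edit the record. A minimal sketch of that rule follows; only the two hostnames come from the LC message, while the function name and the sample paths are hypothetical.

```python
from urllib.parse import urlsplit

# Host pairs where a redirect is expected and the stored URL should be kept.
# Only these two hostnames come from the LC message; the function name and
# the sample paths below are hypothetical.
EXPECTED_REDIRECTS = {("www.loc.gov", "catdir.loc.gov")}


def needs_url_update(stored_url, final_url):
    """Return True if a link checker should flag stored_url for editing."""
    src = urlsplit(stored_url).hostname
    dst = urlsplit(final_url).hostname
    if src == dst:
        return False  # no cross-host redirect at all
    return (src, dst) not in EXPECTED_REDIRECTS


print(needs_url_update("http://www.loc.gov/catdir/toc/example.html",
                       "http://catdir.loc.gov/catdir/toc/example.html"))
```

A checker built this way would still report redirects to any other host, so genuinely moved files would not be missed.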

Wednesday, May 14, 2008

Manifestations and Near-Equivalents

Martha M. Yee continues to make her work readily available.
The two articles about 'manifestation' (the word everyone used to mean 'expression' until FRBR came along) that I published in 1994 are now available at the University of California eScholarship Repository, as follows:

Manifestations and Near-Equivalents: Theory, with Special Attention to Moving-Image Materials. Library Resources & Technical Services 1994; 38:227-256.

Manifestations and Near-Equivalents of Moving Image Works: a Research Project. Library Resources & Technical Services 1994; 38:355-372.

Re: Recommendation and Ranganathan

I hope everybody here is also reading Lorcan Dempsey's weblog. However, just in case there are some who don't, begin with the excellent post Recommendation and Ranganathan. I thought the description of the four types of metadata was a very good place to start thinking and discussing.

Tuesday, May 13, 2008

eXtensible Text Framework (XTF)

The California Digital Library (CDL) is pleased to announce a new release of its search and display technology, the eXtensible Text Framework (XTF) version 2.1. XTF is an open source, highly flexible software application that supports the search, browse and display of heterogeneous digital content. XTF offers efficient and practical methods for creating customized end-user interfaces for distinct digital content collections.

Highlights from the 2.1 release include:
  • Extensive interface improvements, including new search forms, built-in faceted browsing, and a new look and feel.
  • Increased support for document and information exchange formats.
    • XHTML and OAI-PMH output
    • NLM article format indexing and output
    • Microsoft Word indexing
  • Streamlined XSLT stylesheets for simpler deployment and adaptation.
  • Updated documentation that has been moved to the XTF project wiki, allowing XTF implementers to share solutions with the entire user community.
  • "Freeform" Boolean query language, offered as an experimental feature.
  • Backward compatibility with existing XTF implementations.
A complete list of changes is available on the XTF Project page on SourceForge, where the distribution (including documentation) can also be downloaded.

Since the first deployment of XTF in 2005, the development strategy has been to build and maintain an indexing and display technology that is not only customizable, but also draws upon tested components already in use by the digital library and search communities - in particular the Lucene text search engine, Java, XML, and XSLT. By coordinating these pieces in a single platform that can be used to create multiple unique applications, CDL has succeeded in dramatically reducing the investment in infrastructure, staff training and development for new digital content projects.

XTF offers a suite of customizable features that support diverse intellectual access to content. Interfaces can be designed to support the distinct tools and presentations that are useful and meaningful to specific audiences. In addition, XTF offers the following core features:
  • Easy to deploy: Drops directly in to a Java application server such as Tomcat or Resin; has been tested on Solaris, Mac, Linux, and Windows operating systems.
  • Easy to configure: Can create indexes on any XML element or attribute; entire presentation layer is customizable via XSLT.
  • Robust: Optimized to perform well on large documents (e.g., a single text that exceeds 10MB of encoded text); scales to perform well on collections of millions of documents; provides full Unicode support.
  • Extensible:
    • Works well with a variety of authentication systems (e.g., IP address lists, LDAP, Shibboleth).
    • Provides an interface for external data lookups to support thesaurus-based term expansion, recommender systems, etc.
    • Can power other digital library services (e.g., XTF contains an OAI-PMH data provider that allows others to harvest metadata, and an SRU interface that exposes searches to federated search engines).
    • Can be deployed as separate, modular pieces of a third-party system (e.g., the module that displays snippets of matching text).
  • Powerful for the end user:
    • Spell checking of queries
    • Faceted displays for browsing
    • Dynamically updated browse lists
    • Session-based bookbags
These basic features can be tuned and modified. For instance, the same bookbag feature that allows users to store links to entire books, can also store links to citable elements of an object, such as a note or other reference.

XTF was actually used at the CDL as an OPAC technology in an experiment with ranking and recommendation features on our catalog data.

Posted to many e-mail distribution lists.

Non-Latin Data in Name Authority Records

From LC:
As previously announced, MDS-Name Authority records will be enhanced with non-Latin script data in 4XX fields and selected notes beginning June 1, 2008 (see earlier announcements at http://www.loc.gov/catdir/cpso/nonroman_announce.pdf and http://www.loc.gov/catdir/cpso/nonlatin_whitepaper.html for additional information). An additional FAQ related to the project will be posted at http://www.loc.gov/aba/ shortly.

An effort to automatically pre-populate existing authority records with non-Latin references by OCLC, Inc. will also begin in early June 2008. The initial rate of pre-population will be limited to several hundred records per week, and will grow to a rate of approximately 25,000 records per week. Note that other clean-up projects that have recently increased the volume of name authority records (http://www.loc.gov/cds/notices/2008-02-14.pdf ) will be suspended during this pre-population effort. It is estimated that approximately 400,000 pre-population records will be distributed over a number of months.

CDS is making available a file of name authority test records containing non-Latin script data. The file of 110 test records can be found on the Library of Congress rs7 server under the /emds/test subdirectory with file names of names.nonlatintest.records for the MARC 8 version and names.nonlatintest.records.utf8 for the UTF8 version.
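A quick back-of-the-envelope check on "a number of months": the sketch below works out the shortest possible run from the two figures in the announcement. The ramp-up schedule is not given, so this is only a lower bound, and the weeks-per-month average is my own assumption.

```python
import math

TOTAL_RECORDS = 400_000   # estimated pre-population records (from the announcement)
PEAK_PER_WEEK = 25_000    # eventual weekly rate (from the announcement)

# Lower bound: assumes the peak rate from week one, which the announcement
# says will NOT be the case (it starts at several hundred records per week).
min_weeks = math.ceil(TOTAL_RECORDS / PEAK_PER_WEEK)
min_months = min_weeks / (52 / 12)  # ASSUMPTION: average weeks per month

print(f"At least {min_weeks} weeks (about {min_months:.1f} months)")
```

The real distribution will take longer than this, since the early weeks run well below the peak rate.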

Spam

I've been blasted with comment spam. So I've had to turn on the comment moderation function.

It is a shame how these few folks can ruin things for all. A few years back an e-card was a fun thing to receive and send. Now so many are spam that I've stopped sending and opening them. Open comments seem ready to go the same way.

Friday, May 09, 2008

Metadata for Learning Resources

Metadata for Learning Resources: An Update on Standards Activity for 2008 by Sarah Currier appears in the latest issue of Ariadne.
The major areas of development covered in this article are:
  1. LOM Next: plans for the next version of the IEEE LOM
  2. The Joint DCMI/IEEE LTSC (Learning Technology Standards Committee) Taskforce: bringing together the two major metadata standards used for learning resources, and providing an RDF translation for the LOM
  3. DC-Education Application Profile (DC-Ed AP): a modular application profile purely looking at educational aspects of resources, based on community requirements
  4. The United Kingdom’s Joint Information Systems Committee Learning Materials Application Profile (JISC LMAP) scoping study: working alongside a number of similar projects looking at application profiles for repositories in other areas, e.g. images.
  5. International Standards Organisation Metadata for Learning Resources (ISO MLR): based primarily in Canada, this international standards body is devising a new international standard for educational metadata, in response to perceived limitations of the IEEE LOM
  6. The European Commission’s PROLEARN Harmonisation of Metadata project: a study into the issues and challenges of achieving harmonisation in metadata, given the heterogeneous landscape

Thursday, May 08, 2008

Metadata Advocates

I had an Ah-Ha moment while listening to Jon Udell's show Interviews with Innovators. The episode was Working with Data Sources with Raymond Yee.
Raymond Yee is a lecturer at the UC Berkeley School of Information and the author of Pro Web 2.0 Mashups: Remixing Data and Web Services. In this conversation he talks about teaching students how to work with existing data sources, and speculates with Jon Udell on ways to expand the supply of available sources.
What struck me was that we should be advocates for metadata standards. If the local genealogy society puts up a calendar on their website, help them get it into iCal or hCal format. Then we could drop their info into a pathfinder. Or geocode the local bird-watchers' sightings, or the school district's lunch menu, or .... We could offer our understanding of the importance of standards and data reuse to our community. The library benefits by becoming the go-to place for information management. The community benefits because they get the word out more effectively. It would be a very different job description for a cataloger to become the community data standard outreach person. But not a bad place to be.
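To give a sense of how little is involved in the calendar case, here is a minimal sketch that writes one event in iCalendar format. The event details, the UID, the PRODID string, and the function name are all made up for illustration. The sketch assumes UTC times and does not handle the escaping or line folding that the iCalendar specification requires for longer or punctuated text.

```python
from datetime import datetime, timezone


def make_ical_event(uid, summary, start, end, stamp):
    """Build a one-event iCalendar document as a string.

    ASSUMPTION: all datetimes are already in UTC, and summary contains no
    commas, semicolons, backslashes, or newlines (no escaping is done).
    """
    fmt = "%Y%m%dT%H%M%SZ"  # UTC date-time form
    lines = [
        "BEGIN:VCALENDAR",
        "VERSION:2.0",
        "PRODID:-//Example Library//Community Calendar//EN",  # hypothetical
        "BEGIN:VEVENT",
        f"UID:{uid}",
        f"DTSTAMP:{stamp.strftime(fmt)}",
        f"DTSTART:{start.strftime(fmt)}",
        f"DTEND:{end.strftime(fmt)}",
        f"SUMMARY:{summary}",
        "END:VEVENT",
        "END:VCALENDAR",
    ]
    # iCalendar content lines are terminated with CRLF.
    return "\r\n".join(lines) + "\r\n"


# Hypothetical event for a local society's calendar.
ical = make_ical_event(
    uid="meeting-2008-06@example.org",
    summary="Genealogy Society monthly meeting",
    start=datetime(2008, 6, 10, 23, 0, tzinfo=timezone.utc),
    end=datetime(2008, 6, 11, 0, 30, tzinfo=timezone.utc),
    stamp=datetime(2008, 5, 8, 12, 0, tzinfo=timezone.utc),
)
print(ical)
```

Saved with an .ics extension, a file like this can be imported by most calendar programs, which is exactly the kind of reuse a pathfinder could build on.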

Resource Description and Access

Now available, Outcomes of the Meeting of the Joint Steering Committee Held in Chicago, USA, 13-22 April 2008.

Wednesday, May 07, 2008

Using Wikipedia

Two new reports from HP Labs show interesting uses of Wikipedia in information management.

Boosting Inductive Transfer for Text Classification using Wikipedia by Somnath Banerjee. HPL-2008-42
Inductive transfer is applying knowledge learned on one set of tasks to improve the performance of learning a new task. Inductive transfer is being applied in improving the generalization performance on a classification task using the models learned on some related tasks. In this paper, we show a method of making inductive transfer for text classification more effective using Wikipedia. We map the text documents of the different tasks to a feature space created using Wikipedia, thereby providing some background knowledge of the contents of the documents. It has been observed here that when the classifiers are built using the features generated from Wikipedia they become more effective in transferring knowledge. An evaluation on the daily classification task on the Reuters RCV1 corpus shows that our method can significantly improve the performance of inductive transfer. Our method was also able to successfully overcome a major obstacle observed in a recent work on a similar setting. Publication Info: Published and presented at ICMLA 2007, the Sixth International Conference on Machine Learning and Applications (ICMLA'07), 13-15 Dec. 2007 Cincinnati, Ohio, USA
Clustering Short Texts using Wikipedia by Somnath Banerjee, Krishnan Ramanathan, and Ajay Gupta. HPL-2008-41
Subscribers to the popular news or blog feeds (RSS/Atom) often face the problem of information overload as these feed sources usually deliver large number of items periodically. One solution to this problem could be clustering similar items in the feed reader to make the information more manageable for a user. Clustering items at the feed reader end is a challenging task as usually only a small part of the actual article is received through the feed. In this paper, we propose a method of improving the accuracy of clustering short texts by enriching their representation with additional features from Wikipedia. Empirical results indicate that this enriched representation of text items can substantially improve the clustering accuracy when compared to the conventional bag of words representation. Publication Info: Published and presented at SIGIR 2007, the 30th Annual International ACM SIGIR Conference, 23-27 July 2007, Amsterdam, Netherlands

Monday, May 05, 2008

Slick Deal

Here is a bargain offered by Amazon, OCLC - MARC Record. It has free shipping too! This was seen on Slick Deals.

Don't they know they can get all the free MARC records they want from their local library?

Thanks Walter.

Thursday, May 01, 2008

myLOC

I may have missed this news, maybe while I was at TxLA, but I've not seen it elsewhere; the Library of Congress now has a "my" portal, myLOC.

Statement of International Cataloging Principles

The Statement of International Cataloging Principles is available for worldwide review.
As Chair of the IFLA Meeting of Experts on an International Cataloging Code (IME ICC) I am pleased to invite comments from the worldwide library community on the final draft of the Statement of International Cataloguing Principles and its accompanying Glossary.

In order to provide the appropriate review period and to schedule adequate time to cumulate, analyze, and incorporate comments before the General Meeting of IFLA in August, the Statement is being posted today on a public Wiki. The IFLA Headquarters Office is closed for holiday April 30-May 5th, but as soon as they return we will move the files there and redirect from the Wiki. In the meantime please link to: http://catprinciples.pbwiki.com/ and view and/or download the Statement for your review; and please use the accompanying voting document for your response.

MARC Records

Ed Summers has "created a bittorrent of the concatenated MARC files donated to the Internet Archive by Scriblio (7,030,372 records)":

http://inkdroid.org/torrents/lc-bib.torrent

Wednesday, April 30, 2008

Library of Congress Subject Heading Suggestion Blog-a-Thon

The results for the Library of Congress Subject Heading Suggestion Blog-a-Thon are in. The effort resulted in 24 subject heading, 6 cross-reference, and 2 subdivision suggestions.

Tuesday, April 29, 2008

Transparency

Get Satisfaction looks like a unique 2.0 tool for making an organization transparent.
Get Satisfaction is a direct connection between people and companies that fosters problem-solving, promotes sharing, and builds up relationships. Thousands of companies use this neutral space to support customers, exchange ideas, and get feedback about their products and services. Get Satisfaction is open, transparent, and free. You’re free to ask, free to answer, and free to start a new conversation. Everyone is invited and encouraged to participate: companies, employees, customers — anyone with an opinion, an answer, or something to say.
A few libraries are represented. Michael Stephens needs to see this.

Monday, April 28, 2008

Free Comic Book Day

Free Comic Book Day is this weekend, May 3.

Additions to the MARC Code Lists for Relators, Sources, Description Conventions

The codes listed below have been recently approved for use in MARC 21 records. The codes will be added to the online MARC Code Lists for Relators, Sources, Description Conventions.

The codes should not be used in exchange records until after June 25, 2008. This 60-day waiting period is required to provide MARC 21 implementers time to include newly defined codes in any validation tables they may apply to the MARC fields where the codes are used.

Category Code Sources
The following codes are for use in subfield $2 in field 072 in Authority and Bibliographic records (Subject Category Code) and in subfield $z in field 073 (Subdivision Usage) in Authority records.

Additions:

bisacsh
BISAC Subject Headings
(http://www.bisg.org/standards/bisac_subject/index.html) [use only after June 25, 2008]
bisacmt
BISAC Merchandising Themes
(http://www.bisg.org/standards/merchandising.html) [use only after June 25, 2008]
bisacrt
BISAC Regional Themes
(http://www.bisg.org/standards/region_codes.html) [use only after June 25, 2008]

Classification Sources
The following code is for use in subfield $2 in field 084 in Bibliographic and Community Information records (Other Classification Number), in subfield $2 in field 084 in Classification records (Classification Scheme and Edition) and in subfield $2 in field 065 in Authority records (Other Classification Number).

Addition:
blissc
British Library Inside service subject classification. (London: British Library) [use only after June 25, 2008]

Term, Name, Title Sources
The following codes are for use in subfield $2 in fields 600-657 and 662 in Bibliographic and Community Information records, and in subfield $f in field 040 (Cataloging Source) in Authority records.

Additions:
bisacsh
BISAC Subject Headings
(http://www.bisg.org/standards/bisac_subject/index.html) [use only after June 25, 2008]
bisacmt
BISAC Merchandising Themes
(http://www.bisg.org/standards/merchandising.html) [use only after June 25, 2008]
bisacrt
BISAC Regional Themes
(http://www.bisg.org/standards/region_codes.html) [use only after June 25, 2008]
quiding
Quiding, Nils Herman. Svenskt allmant forfattningsregister for tiden fran ar 1522 till och med ar 1862. (Stockholm: Norstedt) [use only after June 25, 2008]
skon
Att indexera skonlitteratur: Amnesordslista, vuxenlitteratur.
(Stockholm: Svensk biblioteksforening) [use only after June 25, 2008]
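
The validation tables mentioned above could handle the waiting period by storing a date alongside each code. A minimal sketch, using the Term, Name, Title source codes from this announcement; the table layout and function name are my own assumptions, not anything LC prescribes.

```python
from datetime import date

# Codes from this announcement; none may be used in exchange records until
# after June 25, 2008.
# ASSUMPTION: the table layout and function name are illustrative only.
EMBARGO_END = date(2008, 6, 25)
SUBJECT_SOURCE_CODES = {
    "bisacsh": EMBARGO_END,
    "bisacmt": EMBARGO_END,
    "bisacrt": EMBARGO_END,
    "quiding": EMBARGO_END,
    "skon": EMBARGO_END,
}


def code_is_usable(code, on_date):
    """True if code is in the table and its waiting period has passed."""
    embargo = SUBJECT_SOURCE_CODES.get(code)
    return embargo is not None and on_date > embargo


print(code_is_usable("bisacsh", date(2008, 6, 25)))  # still inside the wait
print(code_is_usable("bisacsh", date(2008, 6, 26)))  # first usable day
```

Loading the codes ahead of time with a date attached means the table does not need to be touched again when the waiting period ends.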