Friday, February 24, 2006

Open WorldCat and FRBR

OCLC has applied a FRBR tool to the results of Open WorldCat searches.
The display of WorldCat records in the Open WorldCat program has been simplified to help people locate specific versions of a title more quickly. Web users who reach the "Find in a Library" interface from partner search engines now see consolidated results for different formats of a source work.

OCLC has fully applied the FRBR conceptual model to the 3-million-record set currently exposed through Yahoo! Search and Google, providing a deeper view of WorldCat and greatly flattened search results.


LibraryThing

Some very interesting things going on at LibraryThing, the home cataloging tool. They now have MARC records for many of their titles. I didn't see a way to download them, but they might be useful for very small libraries.

But even more interesting is the use of the idea of the work to combine different titles. The idea of work they seem to be using is based on social agreement rather than on the creator. What people think is the same, is the same. Folks have been busy combining titles into works. The weblog post and comments give some idea of the work being done.

Seems to me this is a goldmine for cataloging researchers. Here we can see just how our users view works and titles, how they struggle over the tricky stuff, how they are moving towards a kind of folk FRBR. All this user data should be useful in informing our decisions.

And what about the thoughts of the person behind LibraryThing? Has he been invited to speak at any conferences? I'm not saying this will replace our catalogs, nor that we should toss out ISBD, FRBR, RDA, and MARC. However, this is a place to see what could improve our catalogs and provide data on our users. Something to think about.

PhpMyLibrary Release

PhpMyLibrary version 2.1.1 has been released to the public. PhpMyLibrary is a PHP/MySQL library automation application. The program consists of cataloging, circulation, and webpac modules. It also has an import/export feature. The program strictly follows the USMARC standard for adding materials.

Features:

  1. Web-based Cataloging
  2. USMARC Import/Export
  3. Reports Printing
  4. Indexing Feature
  5. Data Retrieval System
  6. User Management
  7. Borrowing/Loan Management
  8. CDS/ISIS .iso files importing module
  9. USMARC compliant
Free and built on standard tools like PHP, MySQL, and Apache. Seen on oss4lib.

Thursday, February 23, 2006

Additions to MARC Code List for Languages

The following codes have been approved for use in the international language code standard, ISO 639-2 (Codes for the Representation of Names of Languages--Part 2: alpha-3 code) and are also being added to the MARC Code List for Languages.

New code  Language name          Previously coded
  • anp   Angika                 bih (Bihari)
  • frr   Northern Frisian       n/a
  • frs   Eastern Frisian        n/a
  • gsw   Swiss German           ger (German)
  • krl   Karelian               fiu (Finno-Ugrian (Other))
  • srn   Sranan                 cpe (Creoles and Pidgins, English-based (Other))
  • zxx   No linguistic content  3 blanks in 008/35-37
LC Implementation Plans
Subscribers can anticipate receiving MARC records reflecting these changes in all distribution services not earlier than June 1, 2006.
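
For anyone planning local cleanup, the code list above fits in a small lookup. What follows is a minimal Python sketch, assuming the 008 field is available as a 40-character string. The names NEW_CODES, language_from_008, and recode_candidates are made up for illustration and do not come from any MARC toolkit. It only flags which new codes a record with a given old code might now deserve; deciding whether a particular ger record is really Swiss German still takes a cataloger.

```python
# New ISO 639-2 / MARC language codes and the code each was previously filed under.
# None means there was no earlier code (n/a in the announcement).
NEW_CODES = {
    "anp": ("Angika", "bih"),
    "frr": ("Northern Frisian", None),
    "frs": ("Eastern Frisian", None),
    "gsw": ("Swiss German", "ger"),
    "krl": ("Karelian", "fiu"),
    "srn": ("Sranan", "cpe"),
    "zxx": ("No linguistic content", "   "),  # three blanks in 008/35-37
}


def language_from_008(field_008):
    """Return the three-character language code at 008/35-37."""
    if len(field_008) < 38:
        raise ValueError("008 field is too short to hold positions 35-37")
    return field_008[35:38]


def recode_candidates(old_code):
    """List the new codes whose material used to be coded as old_code."""
    return sorted(new for new, (_name, old) in NEW_CODES.items() if old == old_code)


# Hypothetical 008 field: 35 filler characters, the language code, 2 more characters.
sample_008 = "x" * 35 + "ger" + "  "
old = language_from_008(sample_008)
print(old, "could now be one of", recode_candidates(old))
```

A review queue could be built by running recode_candidates over every record and pulling only those that return a non-empty list.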

unAPI

APIs are a great way to share metadata, something we have lots of to share. However, each API is a custom job. Using one requires adhering to the particular conditions of each site. Creating one requires starting from scratch. More places would have one if they did not require so much work to construct. More places would find uses for the information if they did not have to write code for each place. Welcome unAPI.
unAPI is a simple website API convention. There are many wonderful APIs and protocols for syndicating, searching, and harvesting content from diverse services on the web. They're all great, and they're all already widely used, but they're all different. We want one API for the most basic operations necessary to perform simple clipboard-copy functions across all sites. We also want this API to be able to be easily layered on top of other well-known APIs.

The objective of unAPI is to enable web sites with HTML interfaces to information-rich objects to simultaneously publish richly structured metadata for those objects, or those objects themselves, in a predictable and consistent way for machine processing.
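
To give a feel for how small such a convention can be, here is a minimal Python sketch of a client-side URL builder. It assumes the id and format query parameters used in the published unAPI version 1 specification, which is later than the draft discussed in this post, so the details may not match what was current at the time. The server URL and record identifier are hypothetical, and no request is actually sent.

```python
from urllib.parse import urlencode


def unapi_url(base_url, identifier=None, fmt=None):
    """Build an unAPI-style request URL (parameter names are an assumption).

    base_url alone:      ask which formats the site supports.
    identifier only:     ask which formats exist for that object.
    identifier and fmt:  ask for the object itself in that format.
    """
    if fmt is not None and identifier is None:
        raise ValueError("a format only makes sense together with an identifier")
    params = {}
    if identifier is not None:
        params["id"] = identifier
    if fmt is not None:
        params["format"] = fmt
    return base_url if not params else base_url + "?" + urlencode(params)


# Hypothetical server and record identifier, for illustration only.
BASE = "http://example.org/unapi"
print(unapi_url(BASE))
print(unapi_url(BASE, "record:123"))
print(unapi_url(BASE, "record:123", "mods"))
```

The point of the sketch is that the same three calls would work against any site that follows the convention, which is exactly what the per-site custom APIs cannot offer.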

Wednesday, February 22, 2006

2006 OLAC Conference

The 2006 OLAC Conference website is now available. They have an opening keynote address by Jennifer Bowen and closing address by Barbara Tillett as well as workshop presentations covering varying aspects of audiovisual cataloging. Registration will be available soon as well as other updates to the site.


Tuesday, February 21, 2006

Metadata and Data Quality Problems

Metadata and Data Quality Problems in the Digital Library by Jeffrey Beall appears in the latest Journal of Digital Information.
This paper describes the main types of data quality errors that occur in digital libraries, both in full-text objects and in metadata. Studying these errors is important because they can block access to online documents and because digital libraries should eliminate errors where possible. Some types of common errors include typographical errors, scanning and data conversion errors, and find and replace errors. Errors in metadata can also hinder access in digital libraries. The paper also discusses the responsibility for errors in digital documents and offers suggestions for managing digital library data quality.

ONIX for Books

Some minor changes in ONIX.
Special supplementary release: Release 2.1 revision 03. To meet the short-term needs of ONIX implementations in Australia and Spain, ahead of any new general Release, a further minor revision, 03, has been completed in January 2006, and can be downloaded from a separate page. All other users are recommended to continue to work with Release 2.1 revision 02. However, even if you are staying with Revision 02, you may wish to download the Revision 03 documentation package, which includes some more general improvements. Within the documentation, new elements that are applicable only to Revision 03 are clearly marked as such.

Federated Searching

Patel, Yatrik and Vijayakumar, J. K. and Ramesh, Badnapuri (2002) Ontological metadata approach for accessing distributed web content : a proposed model for bibliographic databases. In Homayoun, Behrous, Eds. Proceedings First Euro-Asian Conference on Advances in Information and Communication Technology (EURASIA ICT 2002), pp. 487-491, Shiraz (Iran).
Content in Distributed servers, located at various locations, mostly non homogenous in nature with different platforms, different OS, different format becoming a big challenge for information professionals. This paper proposes a model for how to map relational bibliographic databases such as library databases/catalogues and bring these content in to a single homogenous platform irrespective of its multiple platforms and provides access through a single (gateway or portal) and what are the tools and techniques to be used. A detailed discussion on tool for data mining from various RDBMS, a common pool for data mapping, interfacing non homogenous data mines at distributed locations through ontological Metadata approach and display through XML etc are presented in this paper. The implementation of the proposed model by data mining, mapping the bibliographical relations, pool for interfacing, and mechanism of accessing the content through are also described.

Metadata Object Description Schema

Some proposed changes in MODS:
We would like to make some minor changes to MODS for a version 3.2. This will be considered an incremental version in that it will not invalidate existing records. We plan to work on more extensive changes in a version 4.0 in the near future.

Proposed changes for MODS v. 3.2
  1. Add an attribute linkInfo to <location><url> to allow for notes associated with the link (MARC 856$z). (For DLF Registry of Digital Masters)

    We first considered adding an element like <urlNote>, but decided on an attribute because this note is associated with the URL. You could have a <physicalLocation> (for the repository that is responsible for the resource) and <url> (for the URL of the resource) that are bound together in a <location> container. So just adding a note under <location> wouldn't tell you whether it belongs with <physicalLocation> or <url>. It is associated with the link given in <url>, so an attribute seemed appropriate.

  2. Add enumerated values under digitalOrigin to accommodate more of those in MARC ER 007/11 (Antecedent/source) (also for DLF Registry of Digital Masters)

    Now the MODS values are:
    born digital
    reformatted digital (which means reformatted to digital from the original)

    We want to add:
    digitized microfilm (MARC code b)
    digitized other analog (MARC code d)

    In the next full version of MODS (4.0) we could change "reformatted digital" to something more precise (to bring out that it means digitized from analog). Since this version should not invalidate existing instances, we would not want to make that change now.

  3. Add ID attribute to part element. This will allow the element to be referred by an IDREF attribute from within the same document. When MODS is used in METS documents, we need to be able to reference MODS relatedItem and part elements using the METS div DMDID attribute (which is an IDREF type). This makes it possible to associate particular entities represented by divs in the structMap with particular descriptive information.

  4. Restore ID attribute to relatedItem element. (It disappeared inadvertently from Version 3.1-- was in 3.0).

  5. Allow part element to be optionally empty (for when the ID and type attribute values contain all the information needed about the part).

    Example:

    <mods:relatedItem ID="DMD_article01" type="constituent">
    <mods:titleInfo>
    <mods:title>Wien, 10. mai.</mods:title>
    </mods:titleInfo>
    <mods:genre>article</mods:genre>
    <mods:part ID="DMD_article01_para01" type="paragraph"/>
    <mods:part ID="DMD_article01_para02" type="paragraph"/>
    </mods:relatedItem>

    The idea is that it is possible to use the part element plus type attribute to note the existence of paragraphs (as suggested in the MODS Outline of Elements and Attributes). Given that the above construct expresses all I needed to express (i.e. the existence of a sequence of paragraphs in document order) it renders the text subelement superfluous. Currently the text child element (or one of the other child elements: detail, extent, or date) is required:

    <mods:relatedItem ID="DMD_article01" type="constituent">
    <mods:titleInfo>
    <mods:title>Wien, 10. mai.</mods:title>
    </mods:titleInfo>
    <mods:genre>article</mods:genre>
    <mods:part ID="DMD_article01_para01" type="paragraph">
    <mods:text/>
    </mods:part>
    <mods:part ID="DMD_article01_para02" type="paragraph">
    <mods:text/>
    </mods:part>
    </mods:relatedItem>

  6. Make a few elements global that are referenced in the DCMI Library Application Profile (these are subelements under other elements):

    dateCaptured (under <originInfo>)
    edition (under <originInfo>)
    physicalLocation (under <location>)

  7. Addition of collection description elements. Most of the elements in the DCMI Collection Description Profile map already to MODS. We are thinking we should provide for them all, since in many cases people will be exposing their metadata in OAI as MODS, but there may be collection description information they want along with their items that are exposed as MODS. So the following are the elements that would need to be added:

    Accrual periodicity: defined as "The frequency with which items are added to a collection" MODS has frequency under <originInfo>, although this is uncontrolled text and is under <originInfo> (and used for MARC 310 information). That is used for the frequency of a publication as opposed to the frequency of adding something to a collection. But then if what you are describing is a collection you probably could use this frequency and add an authority for a controlled list of values.

    Accrual method: in MARC holdings this is Method of acquisition (008/07)

    Accrual policy: in MARC holdings similar to Receipt or acquisition status (008/06).

If this will require a lot of discussion I'd rather wait until the next full version of MODS, since we want to get 3.2 out. If people think we should pursue it now because they think it's important to have it right away, I'll suggest some elements.

Comments welcome. We'd like to be able to put out a new schema in a few weeks.
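
Proposed changes 3 through 5 are the ones that let a METS div point at an individual MODS part. Here is a minimal Python sketch of that lookup, using only the standard library. It wraps the relatedItem example from the proposal in a root element so the mods: prefix is declared, and treats a DMDID value as a whitespace-separated list of IDs. The function names index_by_id and resolve_dmdid are invented for illustration, and the empty part elements are valid only if change 5 is adopted.

```python
import xml.etree.ElementTree as ET

# The relatedItem example from the proposal, wrapped in a root element so that
# the mods: prefix is declared. Empty <mods:part/> relies on proposed change 5.
RECORD = """
<mods:mods xmlns:mods="http://www.loc.gov/mods/v3">
  <mods:relatedItem ID="DMD_article01" type="constituent">
    <mods:titleInfo>
      <mods:title>Wien, 10. mai.</mods:title>
    </mods:titleInfo>
    <mods:genre>article</mods:genre>
    <mods:part ID="DMD_article01_para01" type="paragraph"/>
    <mods:part ID="DMD_article01_para02" type="paragraph"/>
  </mods:relatedItem>
</mods:mods>
""".strip()


def index_by_id(root):
    """Map every ID attribute value in the tree to its element."""
    return {el.get("ID"): el for el in root.iter() if el.get("ID") is not None}


def resolve_dmdid(root, dmdid):
    """Return (local element name, type attribute) for each ID in a DMDID value."""
    ids = index_by_id(root)
    found = []
    for ref in dmdid.split():
        el = ids[ref]  # a KeyError here means the reference does not resolve
        local_name = el.tag.split("}", 1)[1]  # drop the "{namespace}" part
        found.append((local_name, el.get("type")))
    return found


root = ET.fromstring(RECORD)
print(resolve_dmdid(root, "DMD_article01_para02"))
print(resolve_dmdid(root, "DMD_article01 DMD_article01_para01"))
```

Without the ID attribute on part (change 3) and on relatedItem (change 4), neither reference would have anything to resolve to.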

ETD Metadata

The Metadata Working Group of the Texas Digital Library (TDL) has developed a descriptive application profile for electronic theses and dissertations (ETDs) in the Metadata Object Description Schema (MODS). Their goal was to develop a standard that provides semantically rich bibliographic description in a flexible, web-friendly syntax, and they found that MODS fit their needs. They have posted version 1 on the TDL web site. Besides continuing to develop MODS for ETDs, they are planning to investigate metadata for compound/complex objects, rights/access management, and preservation.

IFLA Statement of International Cataloguing Principles

The updated Latvian translation of the IFLA Statement of International Cataloguing Principles (based on the Sept. 2005 draft) has been posted.