Wednesday, December 29, 2004

Intranet for Public Libraries

Some libraries might find Brushtail a useful tool to organize the information on their intranet. It provides:
  • Content Management
  • Calendars
  • Casual staff list
  • Email forms
  • IT Jobsheet
  • Hours available noticeboard
  • Keyword Search
  • Events Bookings
  • PC Bookings
  • Room Bookings
  • Reference Database
  • Social Noticeboard
Brushtail requires a PHP enabled webserver and the MySQL database server. It is free and released under the GPL open source license.

Rights Metadata

ccPublisher 1.0 is now available for Mac OS X and Windows. Both Windows (.msi) and Mac OS X (.dmg) files are available for download.

ccPublisher is a tool which allows users to select a Creative Commons license and upload their work to the Internet Archive for free hosting and cataloging. It also supports Internet Archive keywords and description information for resource discovery.

Monday, December 27, 2004

TEKSLink Project

The TEKSLink Project is not new but deserving of mention.
The purpose of the TEKSLink Project is to provide a tangible link between the materials located in the library media center and the TEKS (Texas Essential Knowledge and Skills) standards used in the classroom.


TEKSLink provides the link between the standards and subject headings. By amending the authority record of the library catalog, a teacher can search by TEKS in the catalog, and the resulting materials are available in his or her campus library.

School libraries in Texas should consider downloading these records into their systems, if not contributing in a more substantial way. Currently, the Elementary Science and Social Studies TEKS (except 5th grade Social Studies) are finished.


GNU Eprints Version 2.3.0 is now ready. Improvements include:
  • New (and improved) default configuration with detailed help
  • Full text searching
  • Improved searching for names and dates
  • RSS Output
  • Improvements to the user interface
  • Improved views. You can make sub-views, e.g. /view/type_and_year/article/2000/
  • Support for Apache 2.0 and Perl 5.8.1
  • Automatic content negotiation for multi-lingual sites

Wednesday, December 22, 2004


The January - February selection for the Librarian's Book Club is The Intellectual Foundation of Information Organization (Digital Libraries and Electronic Publishing)

Authority Records

Andrew Houghton commented on my remark about making the GSAFD authority records available for Web services in XML. Since not everybody sees the comments I'm reproducing it here.
The MARC-XML records for GSAFD are already available from OCLC's Terminology Services Project.

Monday, December 20, 2004


The latest issue of the newsletter of the IFLA Bibliography Section contains:
  • National Bibliography in a Globalized World: The Latin American Case
  • Survey on the state of national bibliographies in Latin America
  • The Argentine national bibliography: a continuing obligation
  • Electronic Consortium of Libraries: a bibliographical cooperation scheme
  • The national bibliography as a system of bibliographic directories for the study of Cuban culture. Abstract
  • Challenges of the national bibliographies and the national bibliographic agencies in Latin America

Catalog Error Checking

Dorothea Salo has made available the paper Toward better automated error-checking in cataloguing.
Errors in library catalogs damage patron access as well as patron confidence in library offerings. They embarrass librarians of all sorts and frustrate cataloguers. Many of them, however, are eminently correctable and even preventable with relatively simple programmed validation. Databases everywhere employ input checks and similar validators to keep their data clean; library catalogues should also.
She also argues for spell checkers and has a good word for the ability of patrons to easily report errors.


The MARBI Meeting Minutes from the ALA Annual Meeting Orlando, FL -- June 26-27 2004 are now available.

MARC Authority Records

Over on AUTOCAT there has been much discussion about the Guidelines on Subject Access to individual works of Fiction, Drama, etc. lately. As I've mentioned before, the MARC authority records for these terms are freely available. Would encoding them in XML make them a more useful resource on the Web? The essay by Lorcan Dempsey just got me thinking. Thanks to all those who are making them freely available to all.

Metadata Authority Description Schema

A new (preliminary) draft of MADS, an XML schema based on the MARC Authorities Format, is available.

Tech Trends

Stitching services into user environments - intrastructure is an interesting essay appearing in Lorcan Dempsey's weblog.

Friday, December 17, 2004


The following document is available for review by the MARC 21 community: This paper will be discussed in a meeting of the MARC Advisory Committee on Sunday, January 16, 2005 in Boston. I had never thought much about it before, I guess because I deal mostly with text, but there is a difference in the way we discuss pictures and texts. A document is about something; a picture is of something. Should this distinction be made in the MARC record?

A draft agenda for the meeting is available.

Problems in the Catalog

The Murky Bucket Syndrome by Roy Tennant appears in the latest Library Journal. He describes the problems of standardization in large historical datasets, like our catalogs. "As we try to do things programmatically, the structure and content practices really matter in ways they might not have before (FRBRization, data mining, etc.)…." However, greater uniformity in content practices means more rules in AACR, and greater granularity in structure means a more complex MARC. He has in the past argued that things are too complex already. The XML version of a MARC record is much larger and more complex than the MARC version, for example. The experience of Dublin Core shows the problems in trying to make something simple and easy to use. Those records are even more of a problem than our MARC records, and they have been created over only a few years, not decades.
I've been hitting on metadata issues hard in this column, especially in recent months. I am increasingly disturbed by our inability to get this right, at least given today's needs. The library profession seems fond of assuming that its bibliographic infrastructure is the best ever devised, worthy of respect and admiration. There is some truth to that but also some self-delusion. If this is the best bibliographic infrastructure ever devised, then we (and, more importantly, our users) are in trouble. We must fix it, and soon.

Thursday, December 16, 2004

Professional Reading

The December issue of D-Lib Magazine has these articles that may be of interest:
  • A Repository of Metadata Crosswalks by Carol Jean Godby, Jeffrey A. Young, and Eric Childress
  • Metadata Development in China : Research and Practice by Jia Liu
  • Resource Harvesting within the OAI-PMH Framework by Herbert Van de Sompel; Michael L. Nelson; Carl Lagoze and Simeon Warner
  • The Role of RSS in Science Publishing: Syndication and Annotation on the Web by Tony Hammond, Timo Hannay and Ben Lund
The RLG DigiNews has the articles:
  • X Marks the Spot: The Role of Geographic Location in Metadata Schemas and Digital Collections by Stephanie C. Haas
  • PREMIS -- Preservation Metadata Implementation Strategies Update 2: Core Elements for Metadata to Support Digital Preservation by Rebecca Guenther

Wednesday, December 15, 2004

New Blog

I have to mention a new 'blog here in Texas, Library Technology in Texas by Christine Peterson, of Amigos, our OCLC service provider. Most of the items apply to libraries outside Texas too, so give it a read. Today she has some low-cost, easy-to-do tips on safe computing.


A metadata schema, new to me, is the Data Documentation Initiative (DDI).
The Data Documentation Initiative (DDI) is an effort to establish an international XML-based standard for the content, presentation, transport, and preservation of documentation for datasets in the social and behavioral sciences. Documentation, sometimes called metadata (data about data), constitutes the information that enables the effective, efficient, and accurate use of those datasets.

Tuesday, December 14, 2004

OLAC Conference

The conference reports, handouts and PowerPoint presentations from OLAC
are now available on the OLAC website. Papers include:
  • Expanding Access: FRBR and the Challenges of Non-print Materials
  • Expanding Access, Expanding the Challenges
  • Descriptive Cataloging of Music Scores
  • Cataloging Cartographic Materials on CD-ROMs
  • Cataloging and Indexing of Still and Moving Images
  • Cataloging Unpublished Oral History Interviews and Collections
  • Improving Access to Audio-Visual Materials by Using Genre/Form Terms
  • Future of the GMD: What Can Be Done to Improve It or to Find Alternate Ways to Fulfill Its Function?
  • Videorecordings Cataloging Workshop
  • Cataloging Electronic Resources: The Long and the Short of It
  • Introduction au Catalogage des Ressources Intégratrices = Introduction to Integrating Resources Cataloging
  • Descriptive Cataloging of Sound Recordings = Le Catalogage Descriptif des Enregistrements Sonores Musicaux

Monday, December 13, 2004

Midwinter 2005 MARBI papers available

The following documents are available for review by the MARC 21 community:
  • Proposal No. 2005-01: Definition of Field 766 in the MARC 21 Classification Format.
  • Proposal No. 2005-02: Definition of Subfield $y in Field 020 (International Standard Book Number) and Field 010 (Library of Congress Control Number) in the MARC 21 Formats.
  • Proposal No. 2005-03: Definition of Subfield $2 and Second Indicator value 7 in Fields 866-868 (Textual Holdings) of the MARC 21 Holdings Format.
  • Proposal No. 2005-04: Hierarchical Geographic Names in the MARC 21 Bibliographic Format.
  • Proposal No. 2005-05: Change of Unicode mapping for the Extended Roman alif character.

These papers will be discussed in a meeting of the MARC Advisory Committee on Saturday, January 15, 2005 and Sunday, January 16, 2005 in Boston.


Today is the 1 year anniversary of my auto accident. A good deal of my time and effort this past year has been to get well. Doctor visits, shots, tests, pills and physical therapy still continue. My life is now back to normal in many ways. I can walk without a limp, no one would know I was in an accident by looking at my walk. I go swimming. Currently I can do 1/4 mile, not bad but not as good as before the accident. Cora and I have been dancing. Yesterday we tried English country for the first time in a year. I could do most of it but had some problems with the skip step. We have been contra dancing several times and I've lasted 2 hours at one of the dances. Again, not too bad but not up to pre-accident levels. I can climb a ladder, tie my shoes, drive, reshelve books (even on the bottom shelf), I am not limited in many ways. I do have time and repetition limits that are lower than before.

Over the past year many people have contributed to my recovery. Thanks to y'all. And thanks to everybody who sent "get well soon" messages and cards. Those also have helped me get through this difficult time.

Open Source ILS

Out of Finland comes Emilda, another open source library system.
Emilda is a complete Integrated Library System that features amongst others an OPAC, circulation and administration functions, Z39.50 capabilities and 100% MARC compatibility. MARC compatibility is achieved using Zebra in conjunction with MySQL.
On December 7 Emilda 1.2-rc1 was released.
The first release candidate Emilda 1.2-rc1 for the upcoming Emilda 1.2 stable release is now available for testing. We encourage all Emilda users with available testing resources to evaluate and stress-test this release, and report all anomalies to Emilda Developers or the Emilda Bug Database. For Emilda 1.1 users, this release brings tons of improvements, such as new search capabilities, LDAP support, web-based setup, item types, subjects, etc. Translators are called upon to begin translations of Emilda, as the language has been frozen for this release.

Friday, December 10, 2004


Knowing where they're going: statistics for online government document access through the OPAC by Christopher Brown appears in Online Information Review (2004) v. 28, no. 6 pp. 396-409
While documents librarians are generally familiar with document usage through their circulation statistics, they have no idea of how publications are being accessed online. The University of Denver has developed a system for tracking online document access. By redirecting every URL in their OPAC for federal documents first to a ColdFusion database, recording the URL, and then sending the user to the online document, they were able to track each access to online documents. Then, importing these statistics into an Access database, they were able to provide an analysis by agency, date of document, and other features. This article presents the results of one year of tracking access through the University of Denver library OPAC.
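The redirect-and-log approach described above is simple enough to sketch. The University of Denver used ColdFusion; this is a hypothetical Python equivalent, with the log file name and recorded fields as my own illustration:

```python
# Sketch of an OPAC redirect logger: record each access, then hand back
# the real document URL so the web layer can issue an HTTP 302 redirect.
# LOG_FILE and the logged fields are illustrative, not Denver's schema.
import csv
import datetime
from urllib.parse import urlparse

LOG_FILE = "access_log.csv"  # hypothetical log destination

def log_and_redirect(target_url: str) -> str:
    """Append a timestamped record of this access, then return the
    target URL for the caller to redirect the patron to."""
    with open(LOG_FILE, "a", newline="") as f:
        csv.writer(f).writerow([
            datetime.datetime.now().isoformat(timespec="seconds"),
            urlparse(target_url).netloc,  # host, e.g. the agency's server
            target_url,
        ])
    return target_url
```

Importing the resulting CSV into a desktop database, as Denver did with Access, then allows analysis by agency, date, and so on.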

Authority Tool

a.k.a. (also known as) lists author pseudonyms, aliases, nicknames, working names, legalized names, pen names, noms de plume, maiden names... etc. As of 06/05/04 it included 11,516 entries (4,142 'real' + 7,374 'pseudo').

Thursday, December 09, 2004


The metadataLibrarians e-mail list may be of interest to some folks.
This listserv is intended for Metadata Librarians, Digital Librarians, Metadata Architects, Information Architects, and other professionals working in cultural heritage institutions and the information sciences.

This list is geared toward discussing qualitative issues central to the metadata and digital library world. Discussions on popular metadata standards such as EAD, METS, MODS, MARCXML, DC, OAI, etc., will be prominent. The list also serves as a forum for discussing workflow issues in our respective institutions, as well as a meeting place for collaborating.

Some technical discussion is certainly welcomed. Because metadata professionals come from all disciplines and technical backgrounds, though, we prefer to keep the discussion fairly non-technical. Heavy technical discussions regarding metadata and digital libraries can be found on XML4LIB, CODE4LIB, diglib, SYSLIB, etc.

Not very much traffic, yet.

Wednesday, December 08, 2004


The draft schedule 741.5 Cartoons, caricatures, comics, graphic novels, fotonovelas is now available on the Dewey web site. Interested libraries are invited to test all or parts of the draft schedule and send comments before the schedule is finally approved for implementation. Please send comments and suggestions by March 31, 2005, to Julianne Beall, assistant editor, DDC,

Tag of the Month

The MARC Tag of the Month at Follett for December is a sample record for Continuing Resource -- Serial (Periodical).

Monday, December 06, 2004

Citations

Amazon seems to be getting into the citation linking game. Citations is a program that helps customers discover books related to the ones they're interested in. Amazon scans every book in the Search Inside the Book program looking for phrases that match the names of books in our catalog. We make a note of these "citations" and display them to you in one of two ways.

If a book cites two other books, we show you which two books it cites, and link to the pages in the book where the citations appear. If a book is cited by two other books, we show you which two books cite it, and link to the pages in those books where the citations appear. Please note that Citations is not a comprehensive list of all existing citations. For example, an author may cite a book using a slightly different form of its name from that which appears in our catalog, or a title may be mentioned in a book not yet part of the Search Inside the Book program. In such cases, we will not find a match.

Digitization Blog

A new 'blog that may be of interest to some readers, digitizationblog
digitizationblog focuses on digitization and related activities in libraries, archives, and museums, and is a source of news relevant to people who manage and implement digitization projects. Postings about new technologies and tools (particularly software), developments in metadata, and government or consortial initiatives are welcome, as are pointers to new and innovative collections of digitized and born-digital material. Even though there are several excellent sources of digitization news such as the DigiCULT Newsletter and RLG DigiNews (and this blog certainly isn't intended to replace them), there is a lack of space on the web where implementors can share ideas and useful pointers. digitizationblog is intended to fill part of this gap.

Title Notes

The OLAC Cataloging Policy Committee Subcommittee on Source of Title Note for Internet Resources has been working on a substantial revision of the online document of the same name. The original Version 1 document is located on the OLAC Web site. Please send comments to Susan Leister at


Using FRBR by Knut Hegna appears in HEP Libraries Webzine Issue 10 (2004).
This article presents a possible user interface based on bibliographic data entered according to the FRBR conceptual model.

The main ideas are inspired by the old card catalogue which included a structure in the filing system which was lost in the process of computerization of the catalogues. When pulling out a drawer in the card catalogue you were made aware of the structure by the guide cards, the filing logic and the relations represented by see and see also references.

Friday, December 03, 2004


The NISO Newsline for December 2004 is now available.


  • NISO Hosts International Standards Meeting
  • The ARK Identifier Scheme: A New NISO Registration
  • Next Generation Resource Sharing
  • IETF Approval of NISO's INFO URI Expected
  • NISO Metasearch Survey Sets Direction for Standards
  • Reminder: Midwinter Meeting 2005
  • Getting a Handle on Data
  • Moving Beyond Metasearching: Are Wrappers the Next Big Thing?
  • Breaking Down Information Silos: Integrating Online Information
  • Binary XML Proponents Stir the Waters
  • EPA Builds a Better Search
  • Digital Memories, Piling Up, May Prove Fleeting
  • Data-Mining
  • Sony Lab Tips 'Emergent Semantics' to Make Sense of Web
Just yesterday I saw mention of the ARK Identifier Scheme for the first time. Now here it appears as a NISO registration. Maybe something to keep an eye on.

Topic Maps

Still struggling to understand topic maps. A collection of papers on them is at techquila; another is at ontopia.


Typographical and Factual Errors in Library Databases by Jeffrey Beall appears in the December issue of the Informed Librarian Online.

Thursday, December 02, 2004


As part of two projects funded by the U.S. Federal Institute of Museum and Library Services (IMLS), we have been creating a MySQL database of the MARC 21 Concise Format for Bibliographic Data documentation. We have created a public mirror site to our application to let people see what we have done so far.

This is a work in progress and should be considered a draft version of what the final product will look like. As a result of this structuring of MARC documentation into a database, we will be able to generate different views of the data and documentation. We will be able to generate, for example, tab or comma delimited output of all the tags and their associated indicators, and subfields. We are working on an XML output of this as well.

Posted to the MARC e-mail distribution list. This work is being done at the University of North Texas, my alma mater. Anyone looking for an MLS should investigate UNT; some very interesting work is being done there.

Persistent Digital Object Identifiers

A tool for generating unique, persistent digital object names and other identifiers has been released as open source. The tool, called "noid" (nice opaque identifier), can be used as a major piece of an overall identifier strategy no matter which naming scheme you choose (e.g., ARK, DOI, Handle, LSID, PURL, or URN). The documentation, updated Nov 30, is technical on the whole, but it starts with a general overview and a brief tutorial section.

A paper describing the motivation for persistent identifiers and for ARKs (Archival Resource Key) is also available.

The "noid" (rhymes with void) tool creates minters, or identifier generators, and accepts commands that operate them. Once created, a minter can produce persistent, globally unique names for documents, databases, images, vocabulary terms, etc.

Properly managed, minted identifiers can be used as long term durable information object references within naming schemes such as ARK, PURL, URN, DOI, and LSID. At the same time, alternative minters can be set up to produce short term names; for example, transaction identifiers and compact web server session keys. These are some of the ways in which the California Digital Library is using noids.

A minter can bind arbitrary metadata to identifiers with individual stored values or rule-based values. Included are instructions for setting up a URL interface and a name resolver. Based on open source Berkeley DB databases, minters are extremely fast, scalable, reliable, and easy to create, and they have a small technical footprint.
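To make the minter idea concrete, here is a toy Python sketch. It borrows only the general flavor described above: a vowel-free "extended digit" alphabet so minted names never spell words, plus a trailing check character. The alphabet, check formula, and function name are my own illustration, not noid's actual algorithm or API.

```python
# Toy minter: map a sequence number to a fixed-width opaque name with a
# check character. Illustrative only -- not the real noid implementation.
ALPHABET = "0123456789bcdfghjkmnpqrstvwxz"  # 29 vowel-free "extended digits"

def mint(counter: int, width: int = 4) -> str:
    """Turn a sequence number into an opaque name plus one check char."""
    digits = []
    n = counter
    for _ in range(width):
        n, r = divmod(n, len(ALPHABET))
        digits.append(ALPHABET[r])
    body = "".join(reversed(digits))
    # Check character: position-weighted sum of digit values, mod 29.
    # Because 29 is prime, any single-character substitution changes
    # the sum and is therefore caught.
    check = sum(i * ALPHABET.index(c) for i, c in enumerate(body, 1))
    return body + ALPHABET[check % len(ALPHABET)]
```

Each counter value below 29^4 maps to a distinct name, which is the "globally unique within the minter" property the real tool provides at much larger scale.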

Naming and identifying involve decisions about what constitutes a work, expression, or manifestation, and so have implications for cataloging.


RSS... What Is It and Why Should I Care? appears in the latest Amigos Agenda & OCLC Connection. A decent introduction including some screen shots of various readers.

Lorcan Dempsey's weblog

Lorcan Dempsey, VP of Research for OCLC, has a weblog.

Tuesday, November 30, 2004

My TEI publisher

Eric Lease Morgan describes a set of tools he uses to publish, index and distribute his writings, My TEI publisher. Pieces, or the whole, could be useful in other settings.
As a librarian, it is important for me to publish my things in standard formats complete with rich meta data. Additionally, I desire to create collections of documents that are easily readable, searchable, and browsable via the Web or print.

In order to accomplish these goals I decided to write for myself a rudimentary TEI publishing system, and this text describes that system.

N.B. This is not a TEI/XML editor, but rather a publishing system.

Monday, November 29, 2004

Subject Searching

Subject searching in the OPAC of a special library: problems and issues by M.S. Sridhar in OCLC Systems & Services (2004) vol. 20, no. 4 pp.183-191
This paper draws on data from a comparative study of use of the online public access catalogue (OPAC) and the card catalogue of the ISRO Satellite Centre (ISAC) library, and examines the steady decline in the use of subject searching by end-users and the associated problems and issues. It presents data to highlight the negligible use of Boolean operators and combination searches, variations in descriptors assigned to books of the same class numbers, and too many records tagged to very broad descriptors. The article concludes that moving from a traditional card catalogue to a modern OPAC has not made subject searching more attractive or effective.


Antelman, Kristin (2004) Identifying the Serial Work as a Bibliographic Entity. Library Resources & Technical Services 48(4):pp. 238-255.
A solid theoretical foundation has been built over the years exploring the bibliographic work and in developing cataloging rules and practices to describe the work in the traditional catalog. With the increasing prevalence of multiple manifestations of serial titles, as well as tools that automate discovery and retrieval, bibliographic control of serials at a higher level of abstraction is more necessary than ever before. At the same time, models such as IFLA’s Functional Requirements for Bibliographic Records offer new opportunities to control all bibliographic entities at this higher level and build more useful catalog displays. The bibliographic mechanisms that control the work for monographs--author, title, and uniform title--are weak identifiers for serials. New identifiers being adopted by the content industry are built on models and practices that are fundamentally different from those underlying the new bibliographic models. What is needed is a work identifier for serials that is both congruent with the new models and can enable us to meet the objective of providing work-level access to all resources in our catalogs.


ZOpenArchives, an OAI implementation that transforms Zope or Plone into an OAI Data Provider and Aggregator, has been released. It is written in Python, the same as Zope.

This product is an OAI implementation for the Zope server; it provides the following components:

  • Zope OAI Server: contains ZCatalog Harvesters
  • Open Archives Aggregator: contains OAI Harvesters
  • OAI Harvester: harvests records from external OAI servers
  • ZCatalog Harvester: provides ZCatalog records as OAI records
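The OAI-PMH mechanics behind components like these are simple enough to sketch. Here is a minimal harvester-side fragment in Python: build a ListRecords request and pull dc:title values out of the XML response. The base URL is a placeholder, and real harvesters also handle resumption tokens, error codes, and incremental (datestamp-based) harvesting, all omitted here.

```python
# Minimal OAI-PMH harvester sketch: request construction plus response
# parsing. The namespaces are the standard OAI-PMH 2.0 and Dublin Core
# URIs; everything else (URL, function names) is illustrative.
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"
DC_NS = "{http://purl.org/dc/elements/1.1/}"

def list_records_url(base_url: str, metadata_prefix: str = "oai_dc") -> str:
    """Build the ListRecords request a harvester would send."""
    return base_url + "?" + urlencode(
        {"verb": "ListRecords", "metadataPrefix": metadata_prefix})

def titles_from_response(xml_text: str) -> list[str]:
    """Extract dc:title values from a ListRecords response."""
    root = ET.fromstring(xml_text)
    return [t.text for t in root.iter(DC_NS + "title")]
```

A data provider like the Zope OAI Server is simply the other side of this exchange: it answers such requests with records drawn from its catalog.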

NISO Standards

Three standards are out for review and ballot; the ballots are due in December.

Z39.86-200x Specifications for the Digital Talking Book
This standard defines the format and content of the electronic file set that comprises a digital talking book (DTB) and establishes a limited set of requirements for DTB playback devices. It uses established and new specifications to delineate the structure of DTBs whose content can range from XML text only, to text with corresponding spoken audio, to audio with little or no text. DTBs are designed to make print material accessible and navigable for blind or otherwise print-disabled persons. Ballot closes December 1, 2004

NISO Z39.18-200x Scientific and Technical Reports - Preparation, Presentation and Preservation
This standard outlines the elements, organization and design of scientific and technical reports, including guidance for uniform presentation of front and back matter, text, and visual and tabular matter in print and digital formats, as well as recommendations for multi-media reports. Ballot closes December 18, 2004

ANSI/NISO Z39.78-200x Library Binding
Binding is the first line of defense in library preservation and can be a major part of a library's preservation budget. Developed jointly by NISO and the Library Binding Institute, this ANSI/NISO/LBI standard describes the technical specifications and materials to use for first-time hardcover binding of serials and paperbound books intended for the rigors of library use. It also covers rebinding of hardcover books and serials. Following this standard will give you volumes that are sturdy, durable and flexible. Ballot closes December 16, 2004

Wednesday, November 24, 2004


Thursday, here in the U. S. of A., we will be celebrating Thanksgiving. My library is also closed Friday, a four day weekend. Happy Thanksgiving to all in the states. I'll be back Monday.

Color Access

Recently the bookstore that allowed some artists to arrange the stock by color received a good bit of press. Ian pointed me to an older effort by the New England School of Law to provide color as a limit on searching. I've posted this before, but it is an interesting concept. Using the binding company's colors provides a limited palette and removes ambiguity such as purple, plum, wine, ox-blood, and all the other terms that are available to describe color.


Moving Beyond Metasearching: Are Wrappers the Next Big Thing? by Brian Kenney appears in the latest Library Journal.
Bieber, who is also codirector of NJIT's Collaborative Hypermedia Research Laboratory, told LJ that project participants are using a "wrapper" specific to each system that analyzes the display screens, such as where the author and title are located. Most vendors today produce pages in XML, which makes the process easy: the information is embedded. All results from the same vendor are returned in the same layout wherever you search.


The WEB search interface of the ISIS FRBR Prototype Application is now on-line.

A brief WEB page about the software, with some links, is available.

They say that the server performance is very poor, but I found it just fine.

Metadata Authority Description Schema

In preparation for a new MADS draft, there is a "preview" draft.

For reference see Rebecca's message of August 27 (subject: MADS review) accessible from the archive.

There are three changes from the previous version:

  1. Descriptors now allow empty elements, so you can use an xlink attribute (URI) in lieu of a value.
  2. <ref> is replaced with two elements:
    • <related type="earlier | later | parentorg | broader | narrower | other"> and
    • <variant type="acronym | abbreviation | translation | expansion | other">.
    The high-level structure now is:
    • One <authority>
    • Zero or more <related>
    • Zero or more <variant>
    • Zero or one <otherElements>, which is a wrapper for one or more "other elements".
  3. All of the MODS elements in the previous MADS draft (e.g., modsNamePart -- namePart from MODS was copied verbatim into MADS and renamed modsNamePart) have been removed; instead there are references to MODS (e.g., mods:namePart). This requires some changes to MODS, and a draft is available. This is not a proposal for a new MODS version, just an illustration of what needs to be done to MODS to support these MADS references; this MODS draft is fully compatible with the current version, simply restructuring some of the definitions.

An outline according to the new version is available.
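As a quick illustration of that high-level shape (one authority element, then optional related and variant elements), here is a skeleton built with Python's ElementTree. Only the element and attribute names follow the outline above; the name content, the lack of a namespace, and the flat <name> children are simplifications of mine, not taken from the draft schema.

```python
# Skeleton MADS-style record following the draft's high-level outline.
# Illustrative only: real MADS records use namespaced, richer elements.
import xml.etree.ElementTree as ET

mads = ET.Element("mads")

authority = ET.SubElement(mads, "authority")             # exactly one
ET.SubElement(authority, "name").text = "Twain, Mark"    # illustrative

related = ET.SubElement(mads, "related", type="other")   # zero or more
ET.SubElement(related, "name").text = "Clemens family"

variant = ET.SubElement(mads, "variant", type="expansion")  # zero or more
ET.SubElement(variant, "name").text = "Clemens, Samuel Langhorne"

xml = ET.tostring(mads, encoding="unicode")
```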

Monday, November 22, 2004


The Outcomes of the Meeting of the Joint Steering Committee Held in Cambridge, England 18-21 October 2004 are now available.

Librarian's Book Club

So Many Books, So Little Time by Sara Nelson is the December selection for the Librarian's Book Club.

Friday, November 19, 2004

Just for Fun

Just for fun, and because it is so simple to do, I've created a Catalogablog tool bar. Download the Catalogablog toolbar. Hope everyone has an easy trip back from Internet Librarian. And I hope the folks who took our Explore! Fun With Science training this past week in South Carolina found it worthwhile.

Open Source

The Open Source Software and Libraries Bibliography compiled by Brenda Chawner has been updated.
It includes announcements, journal articles, and web documents that are about open source software development in libraries. It also includes articles that describe specific open source applications used in libraries, in particular dSpace, Koha, Greenstone, and MyLibrary.
The bibliography is also available in an EndNote format. Nice. Thanks. Seen on LISNews.

MARC to Dublin Core Converter

MARC::Crosswalk::DublinCore is now available on CPAN.

It includes methods to convert MARC records to both simple and qualified Dublin Core. Future plans include doing a reverse crosswalk (DC->MARC) and having the sample program (marc2dc) return DC/XML output rather than an ad-hoc text format.

There are other tools for this task, MarcEdit for example. It is nice to have a choice and find the best one for the work needed.
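Independent of any particular tool, the core of a MARC-to-DC crosswalk is just a tag mapping. A stripped-down Python sketch, using a toy dict in place of a real MARC record object (the tag meanings are standard MARC 21; everything else here is illustrative):

```python
# Toy MARC-to-Dublin-Core crosswalk: map a few well-known MARC tags to
# unqualified DC elements. Real crosswalks also handle subfields,
# indicators, and repeated fields, all omitted here.
MARC_TO_DC = {
    "100": "creator",    # main entry, personal name
    "245": "title",      # title statement
    "260": "publisher",  # publication info (imprint)
    "650": "subject",    # topical subject heading
}

def crosswalk(marc_fields: dict[str, str]) -> dict[str, str]:
    """Keep only mapped tags, renaming them to DC element names."""
    return {MARC_TO_DC[tag]: value
            for tag, value in marc_fields.items()
            if tag in MARC_TO_DC}
```

Unmapped tags (local 9XX fields, for instance) simply drop out, which is also why crosswalked records are lossy compared to the original MARC.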

Thursday, November 18, 2004

LC Strategic Plan

The Bibliographic Access Divisions at the Library of Congress have issued a new strategic plan for the fiscal years 2005 and 2006 (October 1, 2004 through September 30, 2006). The plan is available on the Bibliographic Access Divisions public Web page.

Wednesday, November 17, 2004


Looks interesting: Spam Kings by Brian McWilliams.
The book sheds light on the technical sleight-of-hand--forged headers, open relays, harvesting tools, and bulletproof hosting--and other sleazy business practices that spammers use; the work of top anti-spam attorneys; the surprising new partnership developing between spammers and computer hackers; and the rise of a new breed of computer viruses designed to turn the PCs of innocent bystanders into secret spam factories.

LC Cataloging Newsline

The latest issue of LCCN includes:
  • Library Services Realignment
  • IFLA Meeting of Experts on an International Cataloging Code
  • Carolyn R. Sturtevant Appointed BIBCO Coordinator
  • Eve Dickey Appointed Team Leader in Decimal Classification Division
  • Basic Subject Cataloging Using LCSH: New Workshop
  • Reser and Greenberg Address LC Conference Forum in Orlando
  • A Treasured Legacy
  • LC Plan to Accommodate 13-Digit ISBN


Medeiros, Norm (2004) Repurposed metadata : ONIX and the Library of Congress' BEAT Program. OCLC Systems & Services 20(3):pp. 93-95.
This article reviews the ONIX-based efforts of the Library of Congress' Bibliographic Enrichment Advisory Team (BEAT). The article describes BEAT's table of contents, publisher description, and sample text initiatives, and the ways libraries and their patrons can benefit from these efforts.
OK, so LC is using ONIX metadata. Does anyone else have access to it and use it? Seems to me the publishers are viewing it as proprietary information and not making it widely available.

Tuesday, November 16, 2004

MARC Code Lists

Additions to the MARC Code Lists for Relators, Sources, Description Conventions

The code listed below has been recently approved for use in MARC 21 records. The new code will be added to the online MARC Code Lists for Relators, Sources, Description Conventions. The new code should not be used in exchange records until after January 15, 2005. This 60-day waiting period is required to provide MARC 21 implementers with time to include the newly defined code in any validation tables they may apply to the MARC fields where the code is used.


    mar - Merenkulun asiasanasto = MariTerm (Maritime technology) (subfield $2 in Bibliographic and Community Information records in fields 600-651, 655-658 and field 040, subfield $f (Cataloging Source / Subject heading/thesaurus conventions) in Authority records)


Toward a Metadata Generation Framework: A Case Study at Johns Hopkins University by Mark Patton, David Reynolds, G. Sayeed Choudhury, and Tim DiLauro appears in D-Lib Magazine (November 2004) v. 10, no. 11
In an effort to explore the potential for automating metadata generation, the Digital Knowledge Center (DKC) of the Sheridan Libraries at The Johns Hopkins University developed and tested an automated name authority control (ANAC) tool. ANAC represents a component of a digital workflow management system developed in connection with the digital Lester S. Levy Collection of Sheet Music.

Open Source ILS

PMB, now at version 1.3, is a French ILS project based on PHP/MySQL that runs under Windows, Linux, and Mac OS X. Developed in French, the project has a very active user community. Although PMB is designed for easy translation, parts of it remain to be translated. A demo is available. Adapted from a notice on OSS4lib.

Koha 2.2.0RC1 is out. The RC means release candidate: this version is in the stable tree of Koha but is still considered a "Release Candidate". A few bugs are still known or remain to be found, maybe by you, but it's stable enough to be used in production.

Monday, November 15, 2004

Online Lunar Maps

Out of scope, but I have to mention the work done by my co-workers. The Lunar Topographic Orthophotomap (LTO) Series is now online. There will be changes to the website over the next several weeks. Currently, the last image option available is "click here to request a high-res image (Requires JPEG 2000 viewer)". We will be replacing this line with a direct link to the JPEG 2000 map image. The Web team will be replacing the links at a rate of 5-10 per day, depending on internet traffic.

The Lunar Photo of the Day gives the work a glowing review.

Once again, quietly and without fanfare, the Lunar & Planetary Institute has delivered a magnificent collection of rare lunar maps for our daily use. The Lunar Topographic Orthophotomaps - the absolutely best lunar maps ever made - are 1:250,000 scale Apollo Metric photos overprinted with accurate topographic contours at 100 m vertical intervals. LPI has released digitized versions of 215 LTOs at 4 resolutions: a browse image; a 72 dpi, 2.7 MB image; a 150 dpi, 10 MB image; and a high-res image that is so big you have to request that LPI put it on an ftp server for you. I browsed through many images (and already downloaded 3 of the 150 dpi ones - I need that res to read all the contours) to remind myself of what they show. At the top left corner of the index map shown above is a small piece of the 41-A4 map of the Beer and Feuille area southwest of Archimedes that gives a degraded view of the large scale and contour density. As the coverage map shows, only the areas under the Apollo 15, 16 and 17 flight paths had stereo coverage allowing topography to be derived by photogrammetry. But this area includes many interesting features, including the rugged dome Cauchy Tau, which is seen to be 342 m high, quite a bit taller than the previous best estimate of 149 m derived from the less accurate photometric method. Most of these LTO maps were published in 1974, so 30 years later, enjoy them and thank LPI for their tremendous effort in digitizing and placing these treasures online!

Government Documents

The GODORT Notable Documents Panel needs your help in publicizing and acknowledging outstanding government information through the annual article in Library Journal.

Just take a few minutes between now and December 31, 2004 to nominate publications from any level of government---state and local, federal, international, intergovernmental organization, foreign government, or material published on behalf of a governmental agency by a commercial publisher---for an award. Please consider only materials published in 2002 and 2003; these can be in any format---websites, CDs, books, maps, audio-visuals, or microfiche. Agencies and publishers are encouraged to nominate their best titles.

The nomination form is available.

A short history of the program is available.

Each year the panel selects approximately 30 notable documents from the list of nominations. These are then featured in the May 15, 2005 issue of Library Journal. Our purpose is to publicize government documents to the broader library community, to honor the agencies and staff responsible for these wonderful documents or web sites, to create a selection tool available to all types of libraries, and to publicize the work of GODORT. The wide and timely production and distribution of government information remains of critical importance to us all.

Please contact the Panel Chair, Linda Johnson, if you need additional information or have questions. Thank you in advance for your participation in this important effort.

Linda B. Johnson
Head Government Documents Department
Dimond Library
18 Library Way
University of New Hampshire
Durham, New Hampshire 03824
phone: 603-862-2453
fax: 603-862-0247

NACO Normalization

New at CPAN, Text::Normalize::NACO - Normalize text based on the NACO rules.
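For the curious, NACO normalization roughly means uppercasing, stripping diacritics, and flattening most punctuation to spaces, so that headings can be compared reliably. A much-simplified Python sketch of the idea; the real NACO rules, and the CPAN module, have many more special cases (retaining a first comma, for instance), so treat this only as an illustration:

```python
import unicodedata

def naco_normalize(text):
    """A much-simplified sketch of NACO-style normalization:
    uppercase, strip diacritics, convert punctuation to spaces,
    and collapse whitespace. The full NACO specification has
    many more special cases (e.g. retaining the first comma)."""
    # Decompose accented characters, then drop the combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(c for c in decomposed
                       if not unicodedata.combining(c))
    # Keep letters and digits; everything else becomes a space.
    flattened = "".join(c if c.isalnum() else " "
                        for c in stripped.upper())
    return " ".join(flattened.split())

print(naco_normalize("Gödel, Kurt"))   # GODEL KURT
```

Two headings that normalize to the same string would be candidates for conflict checking.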

Dublin Core

A new version, 0.2 (beta), of Dublin Core Services/Describethis has been published. Its main new feature is an automatic keyword generator: DCS now incorporates a dictionary of 5,300 words in 11 different languages, including Catalan, Portuguese, Russian, Arabic, and Italian, which lets it recognize and generate keywords automatically. The system applies analytic algorithms to find the terms that best describe a given resource. Newly generated terms are added to those already included in the document, though they are marked visually to avoid confusion with the terms proposed by the authors themselves. For HTML documents that do not include this type of metadata, the list of generated keywords can be used as a guide and a valid proposal for the publishers and authors of the content.


Texadata discusses cataloging, metadata, and resource description issues for scholarly digital resources. Other topics of interest include academic libraries, digital libraries, epistemology, semantics, and trends in scholarly communication.

An interesting blog. Most recent post was about standards for digitizing still images. A neighbor too, from here in Texas, at Texas A&M.


Some interesting themes at the upcoming XTech Conference.
The second of the new tracks is called Open Data. Increasingly, information owners are choosing to be an active part of the web, rather than just hosting HTML pages. Some of the highest-profile commercial examples of this have been Amazon's and Google's web services.

Opening up data encourages its creative re-use, empowers citizens and creates new commercial opportunities. Governments, not-for-profit institutions, academia as well as commercial organisations are all experimenting with open data. Open Data is embodied by movements such as Creative Commons, OAI and Open Access.

At an individual level, exciting open data developments are happening through technologies such as blogging and social networking applications that choose not to lock in their data. Think FOAF, RSS, the semantic web, and Flickr. The Open Data track will look at the technology, policy and commercial issues involved in opening up data on the web.

I hope some European librarians are able to attend and present papers. We have a lot to share with the computer folks.

Friday, November 12, 2004


Technical Bulletin 251: Connexion WorldCat Searching is now available in HTML and PDF.

TB 251 covers changes in search syntax, stopwords, and indexes for WorldCat searching in Connexion.

The changes described in the technical bulletin will be implemented in the Connexion interface on November 21, 2004.


This is what I love about open source: how quickly things can happen. The other day someone dropped a note to the Koha mail-list asking about a link to Amazon from Koha. I then mentioned that a link to Open WorldCat would be a nice option if a search returned no results. Just thinking aloud. Well, today both are available, and a link to Project Gutenberg is next. For the WorldCat page, search "Once and Future Moon" or another unlikely title and see the results. For the Amazon link, check out the bottom of this page. How long has Jenny been trying to get the idea of RSS into the major vendors' heads? With Koha this enhancement has taken a few days. Thanks to Chris for these features.

RSS in the Catalog

More serial publishers are making TOC and other information available via RSS. Nature, for example, has a huge selection of feeds. Has anyone started to place these links in their catalog records?
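For anyone experimenting with these feeds, pulling the entries out of an RSS 2.0 TOC feed takes only a few lines of standard-library Python; the feed fragment below is made up for illustration:

```python
import xml.etree.ElementTree as ET

# A made-up RSS 2.0 fragment standing in for a journal TOC feed.
feed = """<rss version="2.0"><channel>
  <title>Journal TOC</title>
  <item><title>Article One</title><link>http://example.org/1</link></item>
  <item><title>Article Two</title><link>http://example.org/2</link></item>
</channel></rss>"""

def toc_entries(rss_text):
    """Return (title, link) pairs for each item in an RSS feed."""
    root = ET.fromstring(rss_text)
    return [(item.findtext("title"), item.findtext("link"))
            for item in root.iter("item")]

for title, link in toc_entries(feed):
    print(title, link)
```

From there it is a short step to dropping the feed URL itself into an 856 field, or checking the feed for new issues.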

Dublin Core

Report of the meeting of the Dublin Core Libraries Working Group DC-2004, Shanghai, Tuesday 12 October 2004.

30 people attended this working group meeting.

The agenda of the WG was:

  1. Presentation of the revised DC-Lib - Robina Clayphan
  2. Feedback from the Usage Board - Rebecca Guenther
  3. Presentations from the floor - speakers
  4. Discussion and work plan for the next year - all and RC
  1. Presentation of the revised version of DC-Lib
     The main task of the WG over the past year was the production of a revised version of the library application profile, DC-Lib. This has been updated with the decisions taken at the meeting in 2003 and reformatted according to the CEN Guidelines for DCAPs. Robina Clayphan presented the revised version and explained the changes made to conform to the Guidelines. There were several issues still awaiting the outcome of Usage Board meetings, and one or two being handled by other working groups, notably the Date working group. The main outstanding issues were:
    • How to include terms from the MARC list of relators as refinements of DC.Contributor
    • How to declare or register needed encoding schemes
    • The issue of re-use of terms from other data models
    • Remaining Date requirements and a few open questions in the profile
  2. Feedback from the Usage Board
     Roles. To progress the desire to provide more specific roles for an agent in relation to a resource, LoC has created an RDF version of the MARC Relators list. The entire list had been analysed to identify the subset of terms which refine the semantics of contributor (i.e. making contributions to the content of the resource). For each term in that subset, the RDF document contains an assertion that it is a subproperty of dc:contributor. These assertions have been endorsed by the DCMI Usage Board. Such terms will dumb down to contributor if used as a DC term, and the remainder of the Relator terms may be used if desired with the marcrel prefix, e.g. marcrel:OWN for the term Owner. Guidelines will be issued for the use of the terms and to clarify the dumb-down situation. For future maintenance, LoC will contact the UB when a term is added to determine whether or not it is a subproperty of contributor, and the assertion will be made accordingly. The agreed subset of terms can be seen.

    Additionally, the term 'distributor' has been declared a subproperty of dc:publisher and 'depicted' subproperty of dc:subject.

    This solution marks a whole new departure in using terms from another namespace: allowing DC descriptions to be enriched using terms from controlled vocabularies maintained by other trusted agencies. Moreover, the RDF document not only defines the terms but provides an identifier for each term in the form of an http URI. This means that these terms are uniquely, persistently and machine readably identified in a way that will enable use in semantic web developments.

    A remaining issue for DC-Lib is to specify a way in which to refer to the subset of role terms as a whole rather than have to list each one separately in DC-Lib. This issue will be taken back to the UB by Rebecca.

    Encoding Schemes. Where an encoding scheme has been approved by the UB, it has been defined and uniquely identified in the DCMI namespace and can easily be referred to in an application profile. It was suggested that a formal registration process should be established to enable further encoding schemes to be declared and used in DC application profiles. The UB feels the DCMI namespace cannot realistically be expanded indefinitely this way and that agencies should be encouraged to use their own URIs rather than go through a formal registration process. This is an experimental time, and relationships such as the one now established for the relators list should be set up. The UB will develop a policy, process and documentation for endorsing non-DCMI encoding scheme URIs. Two schemes that were already in process will be completed, however, these being ISO8601Basic (without hyphens) and NLM Classification.

    Use of terms from different data models. Although they use different data models, DC-Lib incorporates three terms taken from MODS. It has been argued that XML elements cannot be included in DC APs unless the assertion is made that they are RDF properties. It is as yet unclear whether to make these assertions or to propose new elements, so this issue remains unresolved.

  3. Presentations
     Mary Woodley of the California State University discussed the San Fernando Valley History Digital Library and the use made of Dublin Core in describing the digital objects. Corey Harper, of the University of Oregon, discussed the development of digital collections with CONTENTdm as well as the UO's institutional repository, Scholars' Bank. Thanks to both speakers for putting DCMES in the context of real developments.
  4. Workplan
    • Update DC-Lib in line with UB decisions.
      Who: Robina Clayphan.
      When: Spring 2005
    • Submit DC-Lib to UB for "Registered" status.
      Who: Robina Clayphan.
      When: for next UB meeting in 2005
    • Suggest encoding schemes to include in DC-Lib.
      Who: all.
      When: on-going.
    • Write an XML schema for DC-Lib.
      Who: Ann Apps.
      When: Start with current version and complete when new version available.
    • Produce guidance documentation together with User Documentation WG.
      Who: task to be sub-divided between WG members. RC to find volunteers.
      When: Autumn 2005
Respectfully submitted
Robina Clayphan
Chair, DC Libraries Working Group

Wednesday, November 10, 2004


Typos in the catalog can be a problem for our users. More Typographical Errors in Library Databases, a listing of typos found by a group out to correct these errors has been updated. This list contains more subtle errors. "The words in this list are correctly spelled for some uses, but not others. Some of the words are personal names, some are archaic and some are not English. Some are filing indicator errors." Check your catalog for some of these when you have a few odd minutes to fill. After a while you will have worked your way through the list. Thanks to all those involved in this effort to clean up our catalogs.


Version 1.20 of the Connexion client is now available for download from the OCLC web site. This release includes NACO support for authorities functionality, local files, batch processing, and more. See the client recent enhancements page for more information about the changes, to download the software, and to access updated tutorials and documentation.

If upgrading from client 1.10 to 1.20, be sure to review Section 5 of the Getting Started document. OCLC will discontinue version 1.10 on March 1, 2005.


A good introductory article on RSS is What is RSS and how can it serve libraries? by Zeki Celikbas, in Yalvac, Mesut and Gulsecen, Sevinc, Eds., Proceedings of the First International Conference on Innovations in Learning for the Future: e-Learning, pp. 277-292, Istanbul (Turkey).

Tuesday, November 09, 2004


The Nov. issue of the NISO Newsline has the article The 13-Digit ISBN: There's Time, But Act Fast!
Worldwide, the industry will move from a 10-digit to a 13-digit ISBN on January 1, 2007. Experts from NISO and counterpart organizations, working within the ISO, have taken steps to make the transition as easy as possible for publishers and librarians. The NISO web site has answers to a range of basic questions as well as links to further information.

January 1, 2005 is the first important date in the move to 13 digits. That's what the Uniform Code Council has established as the "sunrise date" for U.S. retailers to join the rest of the world in using a full 13-digit European Article Number (EAN) in place of the current US 12-digit Universal Product Code (UPC), or bar code. The change supports the sale of US books through all channels using a single identifier.

NISO urges publishers to submit title information to Books In Print right now. Every ISBN registered will automatically be converted to 13 digits.

Just an idea: would it make sense to convert the 10-digit numbers into the 13-digit form and populate the catalog with those in field 024? Every valid 020 could have 978 added to the start of the number and the check digit recalculated. Surely a script like that could not be too difficult. Would it be worth the effort?
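The conversion really is mechanical. A sketch in Python: prefix the first nine digits with 978, then recompute the EAN-13 check digit using alternating weights of 1 and 3:

```python
def isbn10_to_isbn13(isbn10):
    """Convert a 10-digit ISBN to its 13-digit (EAN) form:
    prefix with 978, drop the old check digit, and recompute
    the check digit with the EAN-13 weighting."""
    digits = isbn10.replace("-", "")
    if len(digits) != 10:
        raise ValueError("expected a 10-character ISBN")
    core = "978" + digits[:9]          # drop the ISBN-10 check digit
    # EAN-13 check: weights alternate 1, 3 across the 12 digits.
    total = sum(int(d) * (1 if i % 2 == 0 else 3)
                for i, d in enumerate(core))
    check = (10 - total % 10) % 10
    return core + str(check)

print(isbn10_to_isbn13("0-596-00058-8"))   # 9780596000585
```

A batch job over every 020 in the catalog would just be this function in a loop, with validation of the ISBN-10 check digit first to skip bad data.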

Topic Maps

Published Subjects for practitioners of RDF, Semantic Web, Topic Maps, Ontologies and Business Vocabularies! The OASIS Published Subjects TC is holding a nocturne, Monday, November 15th, 7:30 PM - 9:00 PM, Taft Room in the Marriott Wardman Park Hotel at the start of XML 2004.

Join us to find out how the Published Subjects TC is ramping up to complete work already underway and to take on new projects relevant to your subject area. Sorry, no free drinks but lots of discussion and planning of new activities for the Published Subjects TC.

Published Subjects is an open, distributed mechanism for defining unique global identifiers. Based on URIs, the Published Subjects mechanism has two unique characteristics: It works from the bottom up, and it works for humans AND computers. For more information see Published Subjects: Introduction and Basic Requirements.

The goal of the OASIS Topic Maps Published Subjects Technical Committee is to promote Topic Maps interoperability through the use of Published Subjects. A further goal is to promote interoperability between Topic Maps and other technologies that make explicit use of abstract representations of subjects, such as the Resource Description Framework (RDF) and the Web Ontology Language (OWL).

Published Subjects as defined in this Specification provide an open, scalable, URI-based method of identifying subjects of discourse. They cater for the needs of both humans and applications, and they provide mechanisms for ensuring confidence and trust on the part of users. Published Subjects are therefore expected to be of particular interest to publishers and users of ontologies, taxonomies, classifications, thesauri, registries, catalogues, and directories, and for applications (including agents) that capture, collate or aggregate information and knowledge.


The monthly DDC updates this month include a message requesting comment:
The following new and changed entries are effective on November 1.

We have updated Options A and B in the 340 Law schedule in DDC 22. Each entry that includes information on Options A and B has been updated; all affected entries are included below. Changes are underlined; 349.9, 349(.91), 349(.92) and 349.93-349.99 are new entries.

We would like to know how many libraries use Option A, B, or C instead of the standard arrangement, and the reason the option is preferred over the standard arrangement. Please send us [] the name of your library, the edition of Dewey in use in your library, the option you use, and the reason you prefer the option. We would appreciate a response by December 10, 2004. We will report back our findings to Dewey users in early 2005.

Thanks Ian.

MARC Tag of the Month

At Follett the MARC Tag of the Month for November is MARC Record Sample -- Sound Recording (cassette) -- Non-musical Audiobook. Seems to be missing field 306. Do people use that field? I don't catalog much in the way of audiobooks, so it may be a non-issue.

Monday, November 08, 2004


The latest CONSERline Newsletter is a special issue devoted to the CONSER Summit on the Digital Environment. Contents:
  • Introduction
  • Personal Impressions:
    • Cindy Hepfer
    • Wen-ying Lu
    • Roxanne Sellberg
  • Action on Recommendations
  • Articles and Presentations

Library ELF

I have to mention again the great tool Library ELF.
ELF is an Internet-based tool for anyone who uses the library and would like to keep track of what they've borrowed. Subscribers are able to check one or more library accounts in one place -- a way to simultaneously keep tabs on the status of all of one's checked-out items.

Specify when you'd like to be notified and ELF will send email reminder notices of upcoming due dates (sometimes called pre-overdue or early notification) and overdue reminders. And by visiting the ELF website, you can view the current status of your library borrowings.

Whether by email or by website, ELF checks accounts (in real time) and consolidates the information into one report. Instead of going to each library account for updates, ELF automatically does the checking and sends courtesy email reminders.

Ideal for interlibrary library users (students or individuals with multiple library accounts) and intralibrary library users (families, clubs, etc.).

Pretty slick. I've been using it for 2 of the local libraries I frequent and appreciate having all the information on just one page. The developers seem to be very responsive.

ID3 Metadata

MP3 files have metadata, ID3 tags, associated with the audio file. I'm in need of an ID3 editor, a free one since I have only a very few files to edit. Any suggestions?
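For what it's worth, the older ID3v1 flavor of the tag is simple enough to inspect without any special software: it is the final 128 bytes of the file, beginning with "TAG". A Python sketch; note that ID3v2, which most current editors write, is a different and more involved format, and the sample bytes here are fabricated:

```python
def read_id3v1(data):
    """Parse an ID3v1 tag from the raw bytes of an MP3 file.
    ID3v1 is the last 128 bytes: 'TAG', then a 30-byte title,
    30-byte artist, 30-byte album, 4-byte year, 30-byte
    comment, and a 1-byte genre code."""
    tag = data[-128:]
    if not tag.startswith(b"TAG"):
        return None                     # no ID3v1 tag present

    def field(start, length):
        # Fields are padded with NULs or spaces.
        return tag[start:start + length].rstrip(b"\x00 ").decode("latin-1")

    return {
        "title": field(3, 30),
        "artist": field(33, 30),
        "album": field(63, 30),
        "year": field(93, 4),
        "genre": tag[127],              # numeric genre code
    }

# A fabricated file ending in an ID3v1 tag, for demonstration.
fake = (b"\x00" * 10 + b"TAG"
        + b"My Song".ljust(30, b"\x00")
        + b"Someone".ljust(30, b"\x00")
        + b"An Album".ljust(30, b"\x00")
        + b"2004" + b"\x00" * 30 + bytes([12]))
print(read_id3v1(fake))
```

Writing a tag is just the reverse: pack the fields back into a 128-byte block and append or overwrite it.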

Friday, November 05, 2004

This 'Blog

I've made some changes to the layout and search of the 'blog. Improvements, I hope. I've changed the search tool; it is now in the bar across the very top. The old tool no longer searched the older pages; it had a page limit that Catalogablog exceeded. Please let me know if there are any problems.


Just a notice that now is the time to renew or join NASIG.
Established in 1985, the North American Serials Interest Group, Inc. (NASIG) is an independent organization that promotes communication, information, and continuing education about serials and the broader issues of scholarly communication. NASIG welcomes anyone interested in the serials information chain. Inspired by the United Kingdom Serials Group (UKSG), NASIG held its first conference at Bryn Mawr College in June 1986. The annual conference, usually held in late May or June, offers a premier opportunity to meet others representing the diverse interests of the serials community and to hear speakers who are on the cutting edge of scholarly communication.
Annual dues are a mere $25.00. Students only $5.00.

Thursday, November 04, 2004

Rights Metadata

Creative Commons announces new drag-and-drop software for rights metadata.
Leveraging the Internet Archive's generous offer to host Creative Commons licensed (audio and video) files for free, we recently completed the 0.96 beta version of The Publisher, a desktop, drag-and-drop application that licenses audio and video files, and sends them to the Internet Archive for free hosting. When you're done uploading, the application gives you a URL where others can download the file. It also is able to tag MP3 files with Creative Commons metadata and publish verification metadata to the Web.


Implementation Guidelines for the Open Archives Initiative Protocol for Metadata Harvesting Conveying rights expressions about metadata in the OAI-PMH framework has reached a beta release.
The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) provides a mechanism for data providers to expose metadata for harvesting over the Web. This metadata is disseminated in OAI-PMH records. Metadata harvested from one or more data providers using the OAI-PMH can be used by service providers for the creation of services (e.g. search, browse) based on the harvested data.

Data providers might want to associate rights expressions with the metadata to indicate how it may be used, shared, and modified after it has been harvested. This specification defines how rights information pertaining to the metadata should be included in responses to OAI-PMH requests.
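Mechanically, OAI-PMH harvesting is just HTTP GET requests with a "verb" parameter plus verb-specific arguments. A minimal Python sketch of building such request URLs; the repository base URL is hypothetical:

```python
from urllib.parse import urlencode

def oai_request(base_url, verb, **args):
    """Build an OAI-PMH request URL. Harvesting is plain HTTP
    GET: a 'verb' parameter (Identify, ListRecords, etc.) plus
    any verb-specific arguments such as metadataPrefix."""
    params = {"verb": verb}
    params.update(args)
    return base_url + "?" + urlencode(params)

# Hypothetical repository base URL, for illustration only.
url = oai_request("http://example.org/oai", "ListRecords",
                  metadataPrefix="oai_dc")
print(url)
```

The rights expressions the guidelines describe would then travel inside the XML records returned by such a request.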

Metadata for Digital Still Images

The Library of Congress is involved in creating metadata for digital still images.
The Library of Congress' Network Development and MARC Standards Office, in partnership with the NISO Technical Metadata for Digital Still Images Standards Committee and other interested experts, is developing an XML schema for a set of technical data elements required to manage digital image collections. The schema provides a format for interchange and/or storage of the data specified in the NISO Draft Standard Data Dictionary: Technical Metadata for Digital Still Images (Version 1.2). This schema is currently in draft status and is being referred to as "NISO Metadata for Images in XML (NISO MIX)". MIX is expressed using the XML schema language of the World Wide Web Consortium. MIX is maintained for NISO by the Network Development and MARC Standards Office of the Library of Congress with input from users.

New Books

Here are a couple of new books from O'Reilly that may be of interest.

XML in a Nutshell, 3rd ed.

Steal This File Sharing Book

I think we should be more aware of file sharing and how it could be used in a library setting. As part of ILL it makes sense. Maybe to share patron information between institutions in a reciprocal borrowing consortium. There are privacy and security issues, but it may be worth considering. Could it be used for e-reserves?

Wednesday, November 03, 2004

Reference Tools

How cool is this:
Reference by SMS is an exciting new service designed specifically to allow libraries to expand their reference delivery methods to include SMS, the mobile phone text messaging system so popular with the youth of today.


The "Reference by SMS" service provides a mobile phone number, unique to your library, that you can advertise as the number for SMSing your library. SMS's received by that number are automatically delivered to an email address that your library specifies. The librarian monitoring that email address creates responses in email using their usual "Reply" facility and an integrated "Send by SMS" tool designed to assist with the short replies used by SMS. Your responses are automatically delivered to the client's mobile phone by SMS.

Seems like an open source tool could be constructed, maybe not with all the "other powerful features are provided to assist with implementation, usage, management and marketing of this service" but maybe more. An RSS feed for Steven Cohen? Seen on Peter Scott's Library Blog.


Recently I cataloged some scale models, something I've not often done. I thought the scale was important so I placed it in a 500 field. Today, while looking for something else, I happened to notice there is a specific field, 507, for scale. I've now fixed the records.

How often does this happen? How much information is in field 500 that belongs in a field dedicated to that information? I haven't noticed it much, except in very bad records where all notes are in 500. Still this gets me wondering. The calls for granularity (see Roy Tennant) require placing certain types of information in a particular field. However, the existence of such fields is moot if they are not used. How does a lone cataloger even know to look for the existence of a field?
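One way a lone cataloger might catch such notes is to scan the 500 fields for phrases that signal a more specific field. A Python sketch; the pattern-to-field table here is only a starter list of my own, not any official mapping:

```python
import re

# Starter patterns mapping tell-tale 500-note text to the
# dedicated field it probably belongs in; illustrative only.
PATTERNS = {
    "507": re.compile(r"\bscale\b", re.IGNORECASE),
    "504": re.compile(r"\bbibliographical references\b", re.IGNORECASE),
    "538": re.compile(r"\bsystem requirements\b", re.IGNORECASE),
}

def suspect_notes(notes_500):
    """Return (note, suggested_field) pairs for general 500
    notes that look like they belong in a more specific field."""
    hits = []
    for note in notes_500:
        for field, pat in PATTERNS.items():
            if pat.search(note):
                hits.append((note, field))
    return hits

print(suspect_notes(["Scale 1:48.", "Includes index."]))
```

Run over an exported batch of records, a report like this would at least flag candidates for review.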

Tuesday, November 02, 2004


Cybertesis, a portal developed by the University of Chile (Information Services & Library System), provides easy access to full-text electronic theses published at different universities around the world.

Cybertesis.NET is a powerful consultation tool that allows simultaneous searches through a single Web interface, retrieving more than 27,000 full-text theses stored on 27 different servers and university repositories, by means of the OAI (Open Archives Initiative) protocol as a service provider (metadata harvesting).

Each university is responsible for the production, preservation and diffusion of its theses. If you want your institution to participate, simply make your ETD collection available for OAI harvesting and drop them a line.

The interface is available in English, French, Portuguese, and Spanish.

Professional Reading

The latest issue of Ariadne contains:
  • ISBN-13: New Number on the Block
    Ann Chapman outlines the planned changes to the ISBN standard and its impact on the information community and the book trade
  • The Tapir: Adding E-Theses Functionality to DSpace
    Richard Jones demonstrates how the Theses Alive Plugin for Institutional Repositories (Tapir) has provided E-Theses functionality for DSpace.
  • What Do Application Profiles Reveal about the Learning Object Metadata Standard?
    Jean Godby assesses the customised subsets of metadata elements that have been defined by 35 projects using the LOM standard to describe e-learning resources.


Bringing to the OPAC: One library’s mission to enrich its catalog and improve search capabilities by Patricia R. Monk appears in AALL Spectrum November 2004. Seen on Library Stuff

Open WorldCat

Corey Murata has posted 3 Open WorldCat tools for use:
  • Open WorldCat Lookup Using Either Google or Yahoo
  • Mozilla Searchbar Open WorldCat script
  • Open WorldCat search using the Google Deskbar
Seen on Library Stuff.


SLA has taken the rather short-sighted step of dropping its membership in NISO.

Scout Portal Toolkit

Version 1.3.1 of SPT has been released and is now available for download on the Scout web site.

Highlights of the release include:

  • new accessibility "mini-wizard"
  • support for OAI-SQ (new OAI-PMH extension for searching)
  • ability to page through user list and to remove forum posting privileges for a user from within forums
  • ability to delete Tree (classification) field values that have resources assigned to them and to delete entire sub-trees
  • support for searching controlled name variants
  • support for an expanded number of date entry formats
This is a stable production release, not a beta version. Capabilities that have been added since the last stable release include:
  • support for OAI sets
  • support for customizable templates for saved search e-mails
  • ability to export Tree, Controlled Name, and Option values
  • increased field size for adding/editing classifications
  • importing of data now translates \t and \n (tab and newline) to literal values
  • new optional unique field for data import

Sunday, October 31, 2004

Timex Data Link Watch

An appeal for help. My Timex Data Link watch can no longer receive info from the computer. The bar codes begin, but then the screen is minimized, breaking the transmission. The watch is an older technology, but it still works for me. If anyone has a suggestion on how to get it working again, please let me know. Last time the problem was the date separator: XP was using the slash and the watch required the dash. I've checked that, and it does not seem to be the problem this time. Would the notebook adaptor, if I could find one, be a good alternative? I have tried running it in 95 and 98 emulation modes. Thanks.

Monday I have jury duty, so there may not be any posts for a few days.

Friday, October 29, 2004


Here is a publisher who gets it.

Nature Publishing Group (NPG) is pleased to announce the completion of the current phase of its RSS newsfeed collection which delivers tables of content for its journals and other timely information to scientists' desktops. This crop of newsfeeds covers the non-Nature-branded titles and complements our existing RSS offerings for the Nature-branded titles. Specifically the newsfeeds deliver information about NPG's Advanced Online Publication (AOP) series and announce articles published online ahead of being published in an archival issue. As with NPG's existing newsfeeds, this new batch of RSS newsfeeds comes with rich metadata. The titles covered by the new release are:

A complete listing of all Nature newsfeeds, as well as a brief introduction to the advantages of using RSS, is available. Additionally, the listing of newsfeeds is accessible as an OPML file to facilitate the ready import of NPG newsfeeds into RSS newsreaders. A master RSS newsfeed of all NPG newsfeeds is also available for alerting subscribers to new NPG newsfeeds.

These newsfeeds are all based on the RSS 1.0 format which builds on the W3C Resource Description Framework and allows rich metadata (both PRISM and Dublin Core) to be included at both the channel and item level. Linking to the article full text is effected using industry-standard mechanisms for persistent linking: DOI and CrossRef.
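To see what that structure buys you, here is a small Python sketch that reads Dublin Core fields out of an RSS 1.0 item. The feed below is a made-up sample in the RSS 1.0/RDF style described above, not an actual NPG newsfeed.

```python
# Sketch: pulling Dublin Core metadata (title, identifier) out of an
# RSS 1.0 (RDF-based) feed using only the standard library.
import xml.etree.ElementTree as ET

RSS = "{http://purl.org/rss/1.0/}"
DC = "{http://purl.org/dc/elements/1.1/}"

# Invented sample item; real NPG feeds carry richer PRISM/DC metadata.
FEED = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <item rdf:about="http://dx.doi.org/10.1038/example">
    <title>An Example Article</title>
    <dc:identifier>doi:10.1038/example</dc:identifier>
    <dc:date>2004-10-29</dc:date>
  </item>
</rdf:RDF>"""

def items_with_dois(xml_text):
    """Return (title, dc:identifier) pairs for each item in the feed."""
    root = ET.fromstring(xml_text)
    out = []
    for item in root.iter(RSS + "item"):
        out.append((item.findtext(RSS + "title"),
                    item.findtext(DC + "identifier")))
    return out

print(items_with_dois(FEED))
```

Because RSS 1.0 is RDF, every element lives in a declared namespace, which is exactly what makes this kind of machine processing reliable.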

That's a lot of structured metadata. As metadata specialists maybe we should be doing something useful with it. Maybe the jake folks could use it? Just musing?

Electronic Theses and Dissertations

On my recent trip to Virginia I had the pleasure of a quick look into the VA Tech library. Nothing surprising, a large collection, well used, people waiting to get in when the library opened, an important part of the education and research mission of the university.

When I think of the VA Tech library I think electronic theses and dissertations. They were important in the early development of tools and standards in that area. Now they host the Electronic Thesis/Dissertation OAI Union Catalog as part of The Networked Digital Library of Theses and Dissertations (NDLTD).

Thursday, October 28, 2004

Metadata Object Description Schema

The Library of Congress MODS Team would like to assemble and make publicly available a list of both active and prospective MODS implementations.

The MODS Implementation Registry will contain descriptions of MODS projects planned, in progress, and fully implemented. It will provide the MODS community with important information about how MODS is being used in various projects throughout the world. The registry will be available online.

The LC MODS Team invites institutions and organizations who are implementing or planning to implement MODS to submit the following information to either the MODS list or to the Network Development and MARC Standards Office at the Library of Congress.

  1. Name of the institution or organization implementing MODS
  2. The MODS project name
  3. A short description of the MODS project
  4. Projected dates of implementation
  5. A URL to the MODS project web site (if available)
  6. A URL to any available documentation or specifications developed for the MODS project
  7. A list of any MODS tools developed and/or used as part of the MODS project
  8. The MODS version used in the project
  9. Contact name and e-mail address
Please provide this information by November 15, 2004.

Metadata Crosswalks

"Metadata Crosswalks" by Carol Jean Godby appears in the latest issue (July/August/September 2004) of the OCLC Newsletter. It has yet to appear on the Web site; look for it soon. Or, if you are an OCLC member, look in your mailbox. Progress on the Metadata Schema Transformations project can be seen on-line.

Wednesday, October 27, 2004

BioMed Central

BioMed Central provides MARC records to facilitate the cataloging of its large collection of Open Access journals. A delimited spreadsheet containing titles, URLs, ISSNs, journal abbreviations, and dates of initial publication is also available. The MARC records are very bare bones, less than minimal level, but they are something.
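A delimited file like that is trivial to work with programmatically. A hedged sketch, with a guessed column layout based on the fields mentioned above (the real file's columns and order may differ):

```python
# Sketch: reading a tab-delimited journal list with the csv module.
# The column names and sample row here are invented for illustration.
import csv
import io

data = (
    "title\turl\tissn\tabbrev\tfirst_published\n"
    "BMC Bioinformatics\thttp://www.biomedcentral.com/bmcbioinformatics"
    "\t1471-2105\tBMC Bioinform\t2000\n"
)

rows = list(csv.DictReader(io.StringIO(data), delimiter="\t"))
print(rows[0]["issn"])  # -> 1471-2105
```

From here it is a short step to generating brief catalog records or link-checker input from the spreadsheet.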

Open WorldCat

OCLC has decided to turn the Open WorldCat pilot into a program that will be an ongoing benefit for OCLC member libraries. The pilot has clearly demonstrated the value of making WorldCat records and library holdings available to the general public on the open Web. During the pilot phase, Open WorldCat brought millions of click-throughs from Web search engines such as Google and Yahoo! Search to the Open WorldCat interface, where users could find and link to library catalogs and other resources. This is consistent with OCLC's chartered objective of increasing the availability of library resources and goal of weaving libraries into the Web.

OCLC has released an updated interface for the Open WorldCat Program. Interface enhancements include:

  • A library distance indicator
  • An expandable view of holdings beyond local libraries (when available)
  • A more intuitive display of IP-authenticated library services links
  • Book cover images, when available (for IP-authenticated users)
  • An easier-to-use layout to help web users find materials in library collections
OCLC plans the following enhancements to Open WorldCat by January 2005:
  • Expand the set of libraries exposed through Open WorldCat to all libraries with ownership information in the WorldCat database. This is scheduled to occur on November 7. Libraries that do not wish to participate in the Program may request removal using the "Update your links" feedback form. OCLC will continue to honor previous removal requests.
  • Make all records with holdings in WorldCat available for harvesting by Google and Yahoo! Search. Harvesting by partners will occur gradually over a period of time.
  • Release a usage statistics site for libraries. With these statistics, library staff may monitor the number of times someone reaches their catalog, Web site or various IP-based FirstSearch fulfillment links from the Open WorldCat interface.
A fact sheet with additional details about the Open WorldCat pilot and plans for the ongoing Open WorldCat Program is available on the Open WorldCat Web site.

Shifted Wiki

The Shifted Librarian is back and using a Wiki. A Wiki is a Web page that anyone can edit and change. If you have been curious or this is totally new, check this out and add links and suggestions. The Shifted Wiki is
a place to put your ideas for RSS feeds you think different types of librarians could benefit from reading in an aggregator. If we can pull this off, it will be a great way to help those who are new to RSS get a jumpstart.

So I'd like to ask that you think about which library feeds are most useful to you and ask that you add them to the appropriate library type on the wiki. There's also the start of the never-got-off-the-ground Honor Roll of Shifted Libraries, so feel free to add suggestions there, too.

Tuesday, October 26, 2004

Metadata for Digital Still Images

Capturing Technical Metadata for Digital Still Images by Robin L. Dale and Gunter Waibel appears in RLG DigiNews.
the community cannot afford to wait for a "magic bullet" file format because it is unlikely to come. Instead, institutions should become familiar with existing tools that will assist with metadata exposure and extraction. The Spotlight below contains a list of metadata harvesting tools that are available for use now. In addition, RLG will be releasing two new tools as part of the Automatic Exposure initiative. The first tool, the "Automatic Exposure Scorecards," will profile and review the available technologies for capturing technical metadata. Scorecards will be available on the Automatic Exposure web site in the coming months. A second tool under development is a Z39.87-Adobe Extensible Metadata Platform (XMP) panel to allow the extension of the metadata handling capabilities of Adobe Photoshop, a commonly used software package in the cultural heritage digitization process. When completed, this tool will be announced in a future issue of RLG DigiNews and will be freely available on the RLG web site.

FRBR Institute, San Jose, California

Registration is still open for the California Library Association Institute "The FRBR Model: Influencing Future Development of Information Standards." Sponsored by the CLA Access, Collections, and Technical Services Section, the one-day institute will be held on Friday, November 12, 2004, 9 AM-5 PM (with a break from noon to 1:30) at the San Jose Convention Center in San Jose CA. All library staff are welcome to attend. Cost is $75. Pre-registration closes Monday, November 1.

Register online.
Printable registration form

The institute will feature presenters:

  • Barbara B. Tillett, Chief, Cataloging Policy and Support Office, Library of Congress
  • Thomas B. Hickey, Chief Scientist, OCLC
  • Merrilee Proffitt, Program Officer, RLG.
The Functional Requirements for Bibliographic Records (FRBR) model provides a new way of defining relationships between bibliographic items, their creators, and their subjects. While it embodies the fundamental laws of cataloging and librarianship, it also offers ways to further develop and enrich existing catalogs. This full day workshop will provide participants with an introduction to FRBR concepts, descriptions of FRBR research, and a project implementing FRBR concepts, as well as the opportunity to hear from some of the leading experts in the field. Informative handouts will be provided.

Registry of Digital Masters

New Registry of Digital Masters service available. This new service is a joint venture between OCLC and the Digital Library Federation. Its purpose is to provide a means to find digitized materials (or soon-to-be-digitized materials) that have been digitized according to established standards and best practices. The registry is a subset of WorldCat and is available to library users via FirstSearch.

For guidelines on using the registry--including information on how to contribute registry records, a glossary, future enhancements and more--visit the Registry of Digital Masters Record Creation Guidelines.

Background on the registry.

Monday, October 25, 2004

Weekend in Virginia

Pictures from my long weekend in Virginia. Off topic. Virginia is so pretty.

Thursday, October 21, 2004

Web Standards

I'm all for standards and this may help a bit.
The Web Standards Awards aims to promote web site design using W3C standards by seeking out and highlighting the finest standards-compliant sites on the Internet. By showing you standards-compliant sites that make your jaw drop, we hope to show you that web standards aren't a constraint, they are a liberation.

About sixty winners of the Web Standards Awards are listed with links to their web sites.
Seen at Infomine.


The Library of Congress' Cataloging Policy and Support Office announces the availability of the basic presentation of the FRBR model (Functional Requirements for Bibliographic Records) in both English and Spanish language versions. CPSO acknowledges the excellent translation into Spanish by Ana Maria Martinez (Universidad de la Plata, Argentina) and the collaborative review efforts of Elena Escolano (Biblioteca Nacional de España) and Ageo Garcia (Tulane University), which have combined to make this document the official authorized translation.

Dublin Core

This announces the publication of the first version (1.0 beta) of DescribeThis, a service designed for the automatic extraction of metadata from online resources. The site offers an easy-to-use interface where you can indicate the resource to analyze and download the results as XML, XHTML, or RDF files.

In the current version (1.0 beta), the site's engine can find the resources to process using keywords, full URLs, or more complex queries with operators, such as "ISBN", used to collect the bibliographic data for published documents. In the first case it works as a metasearch engine, using other search engines to locate the best sites where the resource can be found. The results returned contain all the recognized and generated Dublin Core elements for the requested resource and can be downloaded as RDF, XML, or XHTML collections.

DescribeThis's main fields of application:

  • To support and extend the application and development of the Dublin Core format as one of the most appropriate metadata standards for describing and cataloging resources, digital or not.
  • To use the site as a tool to support the cataloging of online resources, oriented to information specialists and Internet users in general.
  • To deliver services of automatic metadata management, designed for managers of bibliographic and content databases.
  • To create an efficient way for administrators and website authors to dynamically provide metadata information about their sites to page crawlers, bots, spiders, agents, worms and other automatic indexing and site classification systems, with the aim of contributing to the improvement of the whole Internet content organization.
In the front page you can find several samples to illustrate the normal operation of the service.
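As a rough illustration of one piece of what such a service does, here is a toy Python parser that pulls Dublin Core elements out of HTML META tags. This is not DescribeThis's actual engine, just the general idea behind DC/HTML extraction.

```python
# Toy extraction of Dublin Core elements embedded in HTML <meta> tags
# (name="DC.Title" etc.), using only the standard library.
from html.parser import HTMLParser

class DCMetaParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.elements = {}  # e.g. {'title': '...', 'creator': '...'}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        a = dict(attrs)
        name = a.get("name", "")
        if name.lower().startswith("dc."):
            # Store "DC.Title" under the key "title", and so on.
            self.elements[name[3:].lower()] = a.get("content", "")

# Invented sample page for illustration.
page = """<html><head>
<meta name="DC.Title" content="Metadata for Everyone">
<meta name="DC.Creator" content="A. Librarian">
<meta name="keywords" content="ignored">
</head><body></body></html>"""

p = DCMetaParser()
p.feed(page)
print(p.elements)  # -> {'title': 'Metadata for Everyone', 'creator': 'A. Librarian'}
```

A production service would also have to generate metadata when none is embedded, which is the harder and more interesting part.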

About Dublin Core Services

DescribeThis is a gateway to functions of analysis, automatic conversion, and filtering of digital resources and formats, included as part of a group of web services and tools called Sand Dublin Core Services (DCS). DCS provides support and software infrastructure to develop metadata management applications and services.

In this version, DCS can automatically analyze resources and generate metadata records for the following formats:

  • HTML and XHTML Documents
  • Dublin Core/RDF
  • Dublin Core/XML
  • Dublin Core/HTML (META tags)
  • GIF, JPG (EXIF) and other image formats
  • RSS
  • bibTex
  • proprietary XML formats (e.g., Amazon XML Web Services)
Support for other well-known formats like PDF, MARC, stream formats (MP3, MPEG, etc.), OAI directories, FOAF networks, and others will be added in the near future. More complete information about the available features of DescribeThis and DCS will be added to the Sand corporate website as soon as possible.

Tuesday, October 19, 2004

Out of Town

This Friday through Monday I'll be in Roanoke VA. My wife, Cora, is giving a workshop to the Virginia Highlands Chapter of the American Orff-Schulwerk Association. I'll be there to demonstrate that anyone can do these dances and run errands. We are also taking some time to visit family in the area and do some sightseeing. If anyone has some suggestions for the Roanoke/Blacksburg area I'd love to hear them. Needless to say, there will be no postings those days.

Monday, October 18, 2004


I have to mention that my co-workers will be conducting a workshop in Illinois.

EXPLORE! Fun with Science Program

Wednesday & Thursday, December 1 & 2

8:30 a.m.– 5:00 p.m.

Target Audience: Public Librarians and School Librarians working with middle school youth:

  • Are you interested in strengthening your library’s role in providing space science information for patrons?
  • Are you interested in doing more than just surviving science fair time?
  • Are you interested in the potential to develop partnerships and collaborate with science teachers and space scientists?
Then this workshop is for you! Thirty participants will be selected to attend.

Mission: The Explore Fun with Science Program, funded by the National Science Foundation, is designed to bring space science resources and activities to the library environment. Libraries have long provided essential learning resources that strengthen and perpetuate formal and informal education. They have the potential to play an important role in helping bring space science information to everyone through printed matter and online activities. The Lunar and Planetary Institute will endeavor to cultivate and facilitate the development of strong and lasting partnerships between the space science community and libraries.

Overview: The program includes eight topics:

  1. Rockets - Getting Into Space
  2. Space Stations - Living and Working in Space
  3. Space Colonies - Living and Working in Space and on Other Planets
  4. Egg-stronauts - Returning from Space
  5. Solar System - How Did It Form and What is Included?
  6. Shaping the Planets - Impacts, Volcanoes, and Other Planetary Activity
  7. Comets - Dirty Snow Balls in Space
  8. Our Place in Space - How is Earth Unusual? What Influences Earth?
These eight topics are investigated through videos, facilitator PowerPoint presentations, hands-on activities, demonstrations, supporting resources, and a Web site. The Lunar and Planetary Institute Education and Public Outreach staff will help workshop participants become familiar with the content and materials so that librarians can share space science through their children's, family, and community programs.

Objectives: During the workshop participants will:

  • Meet space science researchers
  • Become acquainted with the Explore themes
  • Undertake Explore demos, activities, and resources designed for the library setting
  • Work with Explore developers and presenters to learn about their methods of presenting Explore to different audiences
Registration: The workshop is free, but attendance is limited to 30 librarians. Therefore, you must apply to attend. Applications to Attend are due November 1, 2004. All factors being equal, date and time of arrival will determine selection. Applications will be selected with consideration for 1) geographic distribution across Illinois, 2) priority for those who regularly work with middle school youth, and 3) applicants expressing interest in exploring or strengthening the library’s role in providing space science information. Participants will be notified of acceptance starting November 8.

Each participant will receive the Explore video / DVD set that introduces the eight Fun with Science topics, the Explore CD with presentations, activities, demos, and resources, additional resources, and a $300 stipend for completion of the workshop. These materials are ready to be incorporated into existing library programs. Participants are expected to share the Explore Program through their library’s programs, and with other librarians following the workshop. The two-day workshop will be held Wednesday and Thursday, December 1 & 2, 8:30 am – 5:00 pm. Light morning and afternoon snacks will be provided.

The workshop will be held at:
Illinois State Library, Room 403, 300 South Second Street, Springfield, IL 62701-1796

For more information, contact Karen Egan at 217-782-7749.

Local hotel options and other logistical information will be supplied shortly after participants register; participants are responsible for travel, housing, and dinner costs and arrangements. A $300 stipend to cover expenses will be awarded to every attendee at the end of the workshop.

Make my co-workers happy and make this a full workshop.


Infrae has released extensions for Python, Zope and the Silva CMS for harvesting web-based repositories exposed using the OAI-PMH standard (Open Archives Initiative Protocol for Metadata Harvesting). In addition they are announcing an extension for the Railroad content repository software for exposing existing Railroad systems as OAI-PMH harvestable repositories. The individual components enable organizations to harvest, index and present data from any OAI-PMH repository, and also allow the setting up of a new repository with Railroad.

Visualizing Bibliographic Metadata

Visualizing Bibliographic Metadata - A Virtual (Book) Spine Viewer by Naomi Dushay appears in D-Lib Magazine (October 2004) vol. 10, no. 10.
User interfaces for digital information discovery often require users to click around and read a lot of text in order to find the text they want to read--a process that is often frustrating and tedious. This is exacerbated because of the limited amount of text that can be displayed on a computer screen. To improve the user experience of computer mediated information discovery, information visualization techniques are applied to the digital library context, while retaining traditional information organization concepts. In this article, the "virtual (book) spine" and the virtual spine viewer are introduced. The virtual spine viewer is an application which allows users to visually explore large information spaces or collections while also allowing users to hone in on individual resources of interest. The virtual spine viewer introduced here is an alpha prototype, presented to promote discussion and further work.


The Institute of Museum and Library Services (IMLS) has awarded a National Leadership Grant of $233,115 to the Texas Center for Digital Knowledge (TxCDK) at the University of North Texas for a project investigating the coding of information and metadata utilization in one million MARC records from the OCLC WorldCat database. TxCDK Fellows Dr. William E. Moen and Dr. Shawne D. Miksa, both from the UNT School of Library and Information Sciences (SLIS), are the Principal Investigators of the project entitled "Examining Present Practices to Inform Future Metadata Use: An Empirical Analysis of MARC Content Designation Utilization".

During the 2-year project the extent of catalogers' use of MARC 21, the mark-up language used by catalogers worldwide to create electronic catalog records, will be investigated. OCLC (Online Computer Library Center) will provide a sample of 1 million MARC bibliographic records. The records will be pulled from OCLC's WorldCat database.

Current MARC 21 specifications define nearly 2,000 fields and subfields available to library catalogers. A previous project, the Z-Interoperability Testbed Project, found strong indications that only 36 of the available MARC subfields accounted for 80% of all subfield utilization.
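The kind of tally behind a finding like that is easy to sketch. The record structure below is a deliberate simplification invented for illustration; a real analysis would parse actual MARC 21 records rather than these toy dictionaries.

```python
# Sketch of subfield-utilization counting: tally how often each
# (field tag, subfield code) pair appears across a set of records.
from collections import Counter

# Each toy record maps a MARC field tag to the subfield codes it uses.
sample_records = [
    {"100": ["a"], "245": ["a", "b", "c"], "260": ["a", "b", "c"]},
    {"100": ["a", "d"], "245": ["a", "c"], "650": ["a"]},
]

def subfield_usage(records):
    """Count occurrences of each (tag, subfield) pair across records."""
    counts = Counter()
    for rec in records:
        for tag, codes in rec.items():
            for code in codes:
                counts[(tag, code)] += 1
    return counts

usage = subfield_usage(sample_records)
print(usage.most_common(3))
```

Scaled up to a million WorldCat records, the same tally (plus a cumulative-percentage cutoff) is what lets you say that a handful of subfields account for most of the utilization.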

More information about the project is available on-line.

Interesting research at my alma mater. If only 36 subfields account for 80% of usage, how much of MARC is really necessary? Should Dublin Core be expanded to these 36 fields? When was the last time fields were tossed out of MARC because they were not being used? Should some fields be valid only for a community of users, not everybody? How are the fields chosen for the Core record? Should they be changed? Of the fields not being used are any critical for user access? How could we populate those missing fields? So many interesting questions. I'm proud my school is investigating these questions.