Monday, March 02, 2009

Institutional Repository Software

Yet another IR system.
The University of Rochester is pleased to announce the alpha version of its new institutional repository software platform named irplus. It contains the following features:
  • Personal authoring work space
  • Collaborative authoring and versioning
  • Publishing
  • Publication Versioning
  • Faceted searching
  • Researcher pages
  • Statistics
A further explanation of the highlights can be found here.

MarcEdit 5.1 Released

A new version of MarcEdit has been released. (Terry Reese is following the Hobbit habit and giving presents on his birthday.) Terry is presenting at TLA this year; I'm looking forward to that. He has some tutorials on YouTube. Thanks Terry and Happy Birthday.

Poll Results

Very unscientific poll results:

Do Library of Congress authority records from the mid-1980s that are still preliminary bother you?

Yes.
[ 52% (14 votes) ]
Not so much.
[ 26% (7 votes) ]
No.
[ 22% (6 votes) ]

Carbon Footprint

Reducing junk mail is a painless way to reduce your carbon footprint. YellowPagesGoesGreen allows you to stop delivery of telephone books. We almost never use them, and Reduce comes before Recycle. Catalog Choice provides the same service for catalogs.

It seems the Wii is the most environmentally friendly of all the gaming consoles. However, if you turn off the others when not in use, they are not too bad.

Thursday, February 26, 2009

Preliminary Authority Records

I've set up a poll on old preliminary records. Just for fun or to vent.

Off Topic - Gaming

Well, those gaming librarians have recruited me. I've always thought libraries should have games. Why not? They have romance novels, DVDs, CDs, all sorts of popular materials. But I never had any interest in games. I'd tried Pong and Space Invaders and a couple of others over the years but they didn't appeal to me. Then we gave a home to two kids and last Christmas got a Wii for the family. These games weren't your parents' games. Super Mario Galaxy and Lego Star Wars: The Complete Saga were both imaginative and engaging. My TV time, never too much, disappeared for months.

Recently for my birthday I received a Nintendo DS Lite. Nice graphics, fast enough, small enough to enjoy. I started playing Brain Age: Train Your Brain in Minutes a Day! But soon I also picked up Hotel Dusk: Room 215 and Lego Indiana Jones: The Original Adventures. If you haven't tried video games in a dozen years, you will be surprised at just how much more enjoyable they have become. Since I'm new to this I'm open to suggestions from any more knowledgeable sources.

Wednesday, February 25, 2009

Preliminary Authority Records

I'm bothered by authority records remaining preliminary for 25 years or so. Seems like we should be wary of using preliminary things. If a piece of software is marketed as preliminary or beta, I'd only use it in controlled situations. Yet some of these preliminary headings have been used to create name/title authority records. Just how many are there? Well, in our small library here are the headings marked as preliminary in our catalog.
  • n 83825067 Spector, William S.
  • n 83825547 Library of Congress. Classification Division. Classification. Class R: Medicine.
  • n 83825719 Colloquium on the Optical Properties and Electronic Structure of Metals and Alloys (Paris, France)
  • n 83825864 Federation of American Societies for Experimental Biology. Committee on Biological Works.
  • n 83825968 Hyman, Charles J.
  • n 83827345 Linhart, J. G.
  • n 83827385 Solar Spectrum Symposium (1963 : Rijksuniversiteit te Utrecht)
  • n 83827671 Schwarzschild, Martin.
  • n 83827701 Space Age Astronomy Symposium (1961 : Pasadena, Calif.)
  • n 83827876 King, Gerald W.
  • n 83828167 Brandstatter, Julius J.
  • n 83828312 Heide, Fritz, 1891-
  • n 83828638 Watanabe, Hiroshi, 1927-
  • n 83828871 Jong, Wieger Fokke de, 1896-
  • n 83829154 International Symposium on the Origin of Life on the Earth (1957 : Moscow, Russia)
  • n 83829438 International Symposium on Basic Environmental Problems of Man in Space.
  • n 84800287 Conference on the Nature of the Surface of the Moon (1965 : Greenbelt, Md.)
  • n 84802687 I.A.U. Symposium on the Moon.
  • n 84803439 Zylka, Romuald.
  • n 84803672 Mineur, Henri, 1899-1954.
  • n 84806607 Sandner, Werner.
  • n 84806671 Lunar Surface Materials Conference (1963 : Boston, Mass.)
  • n 84806677 International Geophysics Committee.
  • n 85800342 Bonola, Roberto, 1874-1911.
  • n 85800347 Small, Robert, 1732-1808.
  • n 85800679 Kunkel, Wulf B.
  • n 85801170 Becvar, Antonin, 1901-
  • n 85801875 White, John Francis, 1921-
  • n 85802026 United States. National Committee for the International Geophysical Year.
  • n 85802469 Jubilee of Relativity Theory (1955 : Bern, Switzerland)
  • n 85802520 Maisak, Lawrence.
  • n 85803615 Chayes, Felix, 1916-
  • n 85803890 Hagihara, Yusuke, 1897-
  • n 85803956 McKinley, Donald William Robert, 1912-
  • n 85804249 Rankama, Kalervo, 1913-
  • n 85804653 Raimes, Stanley.
  • n 85804715 Jet Propulsion Laboratory Conference on the Solar Wind (1964 : Pasadena, Calif.)
  • n 85806886 American Association of Petroleum Geologists. Committee on Structural Nomenclature.
  • n 85814887 National Research Council (U.S.). Panel on Solid Earth Problems.
Will some of these no longer be preliminary when death dates are added? Not necessarily. Many of these have been updated over the years as a look at the 005 will show, yet they are still preliminary. /rant
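For anyone who wants to pull a similar list from their own authority file, here is a minimal Python sketch. It assumes the level of establishment sits in 008 position 33 of the authority format, with 'd' meaning preliminary, and that the 005 holds the latest transaction as yyyymmddhhmmss.f. The record dictionaries, the helper names, and the sample control numbers are all made up for illustration; they are not the LC records listed above.

```python
from datetime import datetime

# Assumption: authority 008/33 codes the level of establishment.
LEVELS = {
    "a": "fully established",
    "b": "memorandum",
    "c": "provisional",
    "d": "preliminary",
    "n": "not applicable",
}

def level_of_establishment(field_008: str) -> str:
    """Return the level of establishment named by 008 position 33."""
    if len(field_008) < 34:
        raise ValueError("008 too short to contain position 33")
    return LEVELS.get(field_008[33], "unknown")

def last_transaction(field_005: str) -> datetime:
    """Parse the 005 timestamp (yyyymmddhhmmss.f), ignoring the fraction."""
    return datetime.strptime(field_005[:14], "%Y%m%d%H%M%S")

def preliminary_records(records):
    """Yield (control number, last update) for records coded preliminary."""
    for rec in records:
        if level_of_establishment(rec["008"]) == "preliminary":
            yield rec["lccn"], last_transaction(rec["005"])

def make_008(level: str) -> str:
    """Build a placeholder 40-character 008 with only position 33 set."""
    return " " * 33 + level + " " * 6

# Hypothetical sample data, not real authority records.
sample = [
    {"lccn": "n 00000001", "005": "20080115093000.0", "008": make_008("d")},
    {"lccn": "n 00000002", "005": "19990302120000.0", "008": make_008("a")},
]

for lccn, updated in preliminary_records(sample):
    print(lccn, "still preliminary, last touched", updated.date())
```

A record that shows a recent 005 date but is still coded 'd' is exactly the case that prompted the rant: someone touched it and left it preliminary.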

MARBI

The proposals and discussion papers discussed at the Midwinter 2009 MARBI meetings have been updated to include brief summaries of discussions and decisions made.

Document Summarization using Wikipedia

Document Summarization using Wikipedia by Krishnan Ramanathan, Yogesh Sankarasubramaniam, Nidhi Mathur, and Ajay Gupta is a recent HP Technical Report. It seems the small screens used by mobile devices are creating a demand for document summarization.
Although most of the developing world is likely to first access the Internet through mobile phones, mobile devices are constrained by screen space, bandwidth and limited attention span. Single document summarization techniques have the potential to simplify information consumption on mobile phones by presenting only the most relevant information contained in the document. In this paper we present a language independent single-document summarization method. We map document sentences to semantic concepts in Wikipedia and select sentences for the summary based on the frequency of the mapped-to concepts. Our evaluation on English documents using the ROUGE package indicates our summarization method is competitive with the state of the art in single document summarization.
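To get a feel for the idea in the abstract, here is a toy Python sketch of frequency-based sentence selection. It is not the authors' implementation: the CONCEPTS dictionary is a hand-made stand-in for the Wikipedia mapping step, the sentence splitter is naive, and all function names are my own invention.

```python
import re
from collections import Counter

# Hypothetical stand-in for mapping text to Wikipedia concepts: word -> concept.
CONCEPTS = {
    "phone": "Mobile phone",
    "phones": "Mobile phone",
    "mobile": "Mobile phone",
    "screen": "Display device",
    "screens": "Display device",
    "summary": "Automatic summarization",
    "summarization": "Automatic summarization",
}

def split_sentences(text):
    """Naive splitter: break after ., ! or ? when followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def concepts_in(sentence):
    """Return the set of concepts whose trigger words occur in the sentence."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return {CONCEPTS[w] for w in words if w in CONCEPTS}

def summarize(text, max_sentences=2):
    """Pick the sentences whose concepts are most frequent in the document."""
    sentences = split_sentences(text)
    mapped = [concepts_in(s) for s in sentences]
    # Frequency of a concept = number of sentences that map to it.
    freq = Counter(c for concepts in mapped for c in concepts)
    scores = [sum(freq[c] for c in concepts) for concepts in mapped]
    # Rank by score (ties keep document order), then restore document order.
    ranked = sorted(range(len(sentences)), key=lambda i: (-scores[i], i))
    chosen = sorted(ranked[:max_sentences])
    return [sentences[i] for i in chosen]

text = (
    "Mobile phones have small screens. The weather was nice. "
    "Summarization helps on a small screen. Lunch was good."
)
print(summarize(text))
```

Sentences that map to no concept score zero and drop out first, which is the part of the approach that makes it language independent in principle: only the mapping step needs to know the language.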

Monday, February 23, 2009

Google, Yahoo and MSN Agree on the Canonical Link Tag

Google, Yahoo and MSN Agree on the Canonical Link Tag. Nice and simple.
The latest news coming from the three major search engines is a major improvement to how Websites are indexed by search engines. The idea of the Canonical Link Tag is that a website owner can specify a preferred version of a particular URL. What does that mean? If your site has identical or similar content (accessible through several different URLs), the Canonical link tag helps search engines calculate the most preferred URL. How Does it Operate? The tag is part of the HTML header on a web page, the same section you’d find the Title attribute and Meta Description tag. In fact, this tag isn’t new, but like nofollow, simply uses a new rel parameter. For example:

<link rel="canonical" href="http://www.yoursite.org/yourpage.php?5473893993" />

This would tell Yahoo!, MSN or Google that this page, where you place the tag will be treated as www.yoursite.com/yourpage.php. Therefore all links, as well as content metrics a search engine would apply should tie back to that URL as though it were one and the same.
Seen on Mark8t via Weibel Lines.
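If you want to check which canonical URL a page declares, the tag is easy to read back out. Here is a minimal Python sketch using only the standard library's html.parser; the class and function names are my own, and the example.org page is invented for the demonstration.

```python
from html.parser import HTMLParser

class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> encountered."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        # rel may hold several space-separated values.
        rel_values = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attr.get("href")

def canonical_url(html):
    """Return the declared canonical URL, or None if the page has none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical

# Invented page for illustration.
page = (
    "<html><head><title>Sample</title>"
    '<link rel="canonical" href="http://www.example.org/page.php" />'
    "</head><body></body></html>"
)
print(canonical_url(page))
```

This only reports what the page claims; whether a search engine honors the hint is up to the search engine.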

Metadata Remediation Tools

Future Directions in Metadata Remediation for Metadata Aggregators by Greta de Groat describes tools used by digital libraries on metadata aggregations.
This report will detail the current state of the art of remediation efforts, describe the additional services that aggregators could offer if the metadata were there to support them, and identify the types of tools that are needed to remediate the metadata in order to achieve the desired level of service. The report is aimed toward designers of metadata aggregations, including programmers, project planners, and metadata specialists. Knowledge domains such as computer science, informatics, information retrieval, information science, and library science are within the scope of the report. It is assumed that remediation efforts will be focused on working with the metadata itself, as many aggregators do not have access to the raw digital item.
Seen on Current Cites.

Friday, February 20, 2009

Additions to the MARC Code Lists for Relators, Sources, Description Conventions

The code listed below has been recently approved for use in MARC 21 records. The code will be added to the MARC Code Lists for Relators, Sources, Description Conventions.

The code should not be used in exchange records until after April 19, 2009. This 60-day waiting period is required to provide MARC 21 implementers time to include newly-defined codes in any validation tables they may apply to the MARC fields where the codes are used.

Term, Name, Title Sources

The following code is for use in subfield $2 in fields 600-651, 655-657 (Subject Added Entries/Index Terms) in Bibliographic and Community Information records; field 662 (Subject Added Entry) in Bibliographic records; subfield $2 in fields 700-754 (Index Terms) in Classification records; subfield $2 in fields 700-788 (Heading Linking Entries) in Authority records; and subfield $f in field 040 (Cataloging Source) in Authority records.

Addition:

embne
Encabezamientos de Materia de la Biblioteca Nacional de España [use after April 19, 2009]
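The validation tables mentioned in the announcement could look something like this minimal Python sketch. It covers only the bibliographic tag ranges quoted above and only the one new code; the table layout and function names are assumptions of mine, not anything prescribed by the MARC documentation.

```python
from datetime import date

# Bibliographic tag ranges quoted in the announcement.
BIB_TAG_RANGES = [(600, 651), (655, 657), (662, 662)]

# Source code -> last day of the waiting period (use only after this date).
# A real table would list every valid code, not just the new one.
WAITING_PERIOD_ENDS = {"embne": date(2009, 4, 19)}

def tag_takes_source_code(tag: str) -> bool:
    """True if a bibliographic tag falls in one of the quoted ranges."""
    if not tag.isdigit():
        return False
    number = int(tag)
    return any(low <= number <= high for low, high in BIB_TAG_RANGES)

def source_code_ok(tag: str, code: str, exchange_date: date) -> bool:
    """Check tag, code, and date against the (hypothetical) table."""
    ends = WAITING_PERIOD_ENDS.get(code)
    if ends is None or not tag_takes_source_code(tag):
        return False
    # "Until after April 19" means the first usable day is April 20.
    return exchange_date > ends

print(source_code_ok("650", "embne", date(2009, 4, 19)))
print(source_code_ok("650", "embne", date(2009, 4, 20)))
```

The strict comparison is deliberate: the announcement says not to use the code until after the date, so the date itself is still inside the waiting period.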

Wednesday, February 18, 2009

Additions to the MARC Code List for Relators

The codes listed below have been recently approved for use in MARC 21 records. The codes will be added to MARC Code List for Relators. They were submitted by the Arizona State University Libraries, prompted by current work with resource descriptions in the areas of stage performance, archeology, and data and grant administration.

The codes should not be used in exchange records until after April 18, 2009. This 60-day waiting period is required to provide MARC 21 implementers time to include newly defined codes in any validation tables they may apply to the MARC fields where the codes are used.

anl
Analyst
Use for a person or organization that reviews, examines and interprets data or information in a specific area.
ard
Artistic director
Use for a person responsible for controlling the development of the artistic style of an entire production, including the choice of works to be presented and selection of senior production staff.
dtc
Data contributor
Use for a person or organization that submits data for inclusion in a database or other collection of data.
dtm
Data manager
Use for a person or organization responsible for managing databases or other data sources.
elg
Electrician
Use for a person responsible for setting up a lighting rig and focusing the lights for a production, and running the lighting at a performance.
UF
  • Chief electrician
  • House electrician
  • Master electrician
fld
Field director
Use for a person or organization that manages or supervises the work done to collect raw data or do research in an actual setting or environment (typically applies to the natural and social sciences).
gis
Geographic information specialist
Use for a person responsible for geographic information system (GIS) development and integration with global positioning system data.
UF
  • Geospatial information specialist
lbr
Laboratory
Use for an institution that provides scientific analyses of material samples.
ldr
Laboratory director
Use for a person or organization that manages or supervises work done in a controlled setting or environment.
UF
  • Lab director
led
Lead
Use to indicate that a person or organization takes primary responsibility for a particular activity or endeavor. Use with another relator term or code to show the greater importance this person or organization has regarding that particular role. If more than one relator is assigned to a heading, use the Lead relator only if it applies to all the relators.
msd
Musical director
Use for a person responsible for basic music decisions about a production, including coordinating the work of the composer, the sound editor, and sound mixers, selecting musicians, and organizing and/or conducting sound for rehearsals and performances.
pma
Permitting agency
Use for an authority (usually a government agency) that issues permits under which work is accomplished.
pmn
Production manager
Use for a person responsible for all technical and business matters in a production.
pdr
Project director
Use for a person or organization with primary responsibility for all essential aspects of a project, or that manages a very large project that demands senior level responsibility, or that has overall responsibility for managing projects, or provides overall direction to a project manager.
rps
Repository
Use for an agency that hosts data or material culture objects and provides services to promote long term, consistent and shared use of those data or objects.
sds
Sound designer
Use for a person who produces and reproduces the sound score (both live and recorded), the installation of microphones, the setting of sound levels, and the coordination of sources of sound for a production.
sht
Supporting host
Use for a person or organization that supports (by allocating facilities, staff, or other resources) a project, program, meeting, event, data objects, material culture objects, or other entities capable of support.
UF
  • Host, Supporting
stm
Stage manager
Use for a person who is in charge of everything that occurs on a performance stage, and who acts as chief of all crews and assistant to a director during rehearsals.
tcd
Technical director
Use for a person who is ultimately in charge of scenery, props, lights and sound for a production.
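The UF (used for) references in the list work as cross-references from a variant term to the authorized one. A minimal Python sketch of that lookup, using a handful of the codes from the announcement; the dictionary layout and the resolve_relator function are my own illustration, not part of any MARC tool.

```python
# Authorized terms keyed by code (a subset of the announcement).
RELATORS = {
    "elg": "Electrician",
    "gis": "Geographic information specialist",
    "ldr": "Laboratory director",
    "stm": "Stage manager",
}

# UF references: variant term -> code of the authorized term.
USED_FOR = {
    "Chief electrician": "elg",
    "House electrician": "elg",
    "Master electrician": "elg",
    "Geospatial information specialist": "gis",
    "Lab director": "ldr",
}

def resolve_relator(term: str):
    """Return (code, authorized term) for an authorized or UF term, else None."""
    wanted = term.strip().casefold()
    for code, label in RELATORS.items():
        if label.casefold() == wanted:
            return code, label
    for variant, code in USED_FOR.items():
        if variant.casefold() == wanted:
            return code, RELATORS[code]
    return None

print(resolve_relator("master electrician"))
print(resolve_relator("Lab director"))
```

Matching is case-insensitive here because relator terms turn up in records with all sorts of capitalization; a stricter table could drop that.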

Tuesday, February 17, 2009

Displaying Search Results

Starting from Zero: Winning Strategies for No Search Results Pages by Greg Nudelman gives advice for those creating commercial search engines. However, there are plenty of ideas we can also use.

Expert Community Experiment

News from OCLC.
Software changes needed for the Expert Community Experiment, which enables cataloging members to make more changes to WorldCat master records, were successfully installed on February 15th. During the experiment, members with full level cataloging authorizations have the ability to improve and upgrade more WorldCat master records than previously possible. The experiment is expected to last six months.

Introductory web information sessions will be held on Tuesday, February 17, 2009, 8:00 – 9:00 AM Eastern Time; Thursday, February 19, 2009, 1:00 – 2:00 PM Eastern Time; Tuesday, February 24, 2009, 4:00 – 5:00 PM Eastern Time; Wednesday, February 25, 2009, 8:30 – 9:30 AM Eastern Time.

For more information, including Guidelines for use during the experiment and an FAQ, and to register to attend a web session, go to the Expert Community Experiment page.

Cataloging Sessions @ TLA

Cataloging sessions at the Texas Library Association Annual Conference look excellent.

PRECONFERENCE – Tuesday – March 31
The Nuts & Bolts of RDA
9:00 AM - 5:00 PM
Explore a new approach to cataloging rules. Barbara Tillett of the Library of Congress and the Joint Steering Committee for Development of RDA discusses RDA implementation to prepare you for the future. Preregistration required.
Barbara Tillett, Library of Congress

Thursday – April 2
Cataloging 101 for School and Public Librarians
10:00 - 11:50 AM
What are the most important components of a good MARC record? This session will review basic concepts and present essential and inexpensive cataloging tools.
Joanna Fountain

Looking beyond Shelf Location: The Benefits of the Dewey Decimal Classification System in Libraries
2:00 - 3:50 PM
Take a look beyond Dewey as a shelf location device and expose the power of the underlying DDC data file, the interoperable translations, the associated terminologies, and the exciting research efforts that contribute to the ongoing benefits and relevance of the DDC in libraries. A business meeting follows the program.
Joan S. Mitchell and Renee Patzer

Friday – April 3
MarcEdit as a Cataloging Tool
10:00 - 11:50 AM
The creator of this popular open source MARC record editing tool presents how to fully utilize the program’s capabilities for database maintenance. Learn to streamline your cataloging processes by making global edits to large numbers of MARC records.
Terry Reese

Hope to see you there.

Electronic Resources and Libraries

Slides and audio for most presentations at ER&L 2009 will be available.

UCLA is a beautiful campus. A wide variety of trees, hills, and open spaces. Some of the rooms were a bit small for the meetings. Once I had to miss a talk I wanted to hear because there was standing room only. However, since all the talks were good, my fallback option was excellent. There were not enough wall outlets for those blogging and Twittering. Laptop carriers should just bring a power strip and share. Maybe the conference could provide a couple of strips for each room. Fewer people were Twittering than I expected. About six out of a group of over three hundred. 2% or so I'd guess. Thanks to all involved in the Conference; it was a great experience.

Thursday, February 12, 2009

Electronic Resources and Libraries 2009

The Conference is over. Some excellent content and good presentations. The conference committee should be proud. Also a good group of attendees. Plenty to learn just talking to others between sessions. Some takeaways.
  1. Check into having our catalog hosted, Evergreen perhaps.
  2. OLE Project. SOA. It will be the backend, use Blacklight, whatever for the OPAC. Meant to integrate with other components, Shibboleth, LDAP, EDP, etc. Modular.
  3. Worldcat Local. Attempt to answer the question "What data should be at the network level and what at the local level?"
  4. Vicky Reich gave the most thought provoking talk. For many years we have been modeling our behavior on the business model. "Just in time, not just in case." Now looking at Circuit City, the auto industry, financial industry, realtor business it might not have been such a good idea. We could be next.
  5. Muse Single Search is working for someone. Only federated search that even had one happy person.
  6. LibGuides were mentioned several times. Have to investigate.
  7. Thought experiments: Tag clouds as a display. Give users the option to open some of their circ history for finding other similar readers. Enhancing digital texts.
  8. Use more than one pass of XSLT to get data in correct format. Share XSLT.
  9. What is it like to be a librarian? It's all about money and power.
  10. Link evaluator for FF from OCLC.
  11. Kill zombie budget items, those that fund the dead programs but continue on. One library cut entire paper reference collection.
  12. K. G. Schneider makes me proud to be a librarian.
Thanks to all who put this conference together. Wonderful trees on the campus.

Tuesday, February 10, 2009

Electronic Resources and Libraries

Got a couple of things to think about at the welcome party last night.

Oracle comes with a built in link checker. Have to check just how good it is.

One library is no longer checking in journals. Time is best spent elsewhere.

Friday, February 06, 2009

OCLC's Expert Community Experiment

Good news from OCLC.
In response to requests from the cataloging community, OCLC is introducing the Expert Community Experiment which enables cataloging members to make more changes to WorldCat records.

During the Experiment, members with full level cataloging authorizations have the ability to improve and upgrade more WorldCat master records than has been previously possible. The Experiment begins in mid-February 2009, and is expected to last six months.

Introductory web information sessions will be held throughout February for those interested in participating in the Experiment.

Please see the Expert Community Experiment page to register to attend a web session. More information will be added to this page over the next few days.

Wednesday, February 04, 2009

Stimulus Package

Ask and you shall receive or at least find. It seems ALA was on the ball and has been active in asking for funding for libraries. Include Public Libraries in Recovery Funding! is a page where you can send an email to your senators. I've used it to send mine a message.

Tuesday, February 03, 2009

Los Angeles

Next weekend I'm heading out to Los Angeles for ER&L. I'm going in a couple days early and staying with my brother in the desert. Right now I plan to visit the La Brea Tar Pits and Griffith Observatory. Any other suggestions? How about a good place to eat near either of those places? Any English Country, Contra or International Folk dancing in the desert next weekend? At UCLA next week? Thanks.

Stimulus Package

Where are the libraries in the stimulus package? I've been swamped with petitions and write your rep forms from lots of medical, green, social services groups but not one has mentioned libraries. I know we could use the cash. Has LC, OCLC, or ALA or some other big player made a play for the funding? If so let me know, I'd gladly support and spread the word about the effort.

We know libraries are a good investment, they are underfunded, and the salaries are comparatively low. Sounds like a good place to invest some funds. Libraries are also counter-cyclical, business goes up in bad times. So our need and importance to our communities are greater now than a few years back.

Electronic Resources and Libraries 2009

Next week I'll be at ER&L. Having some fun picking out sessions. Here are the presentations I currently plan to be at:
  • Metadata Crosswalking
  • Let's Stop Talking About Repositories
  • Holistic Budgeting
  • Sharing the Buck (aside, I'd love to find some libraries to share resources with)
  • Open Source ILS Panel
  • The Seismology of Google Scholar
  • Electronic Resources in the Next Gen Cat
  • Partnerships
  • Managing Freely Available E-Resource Collections
I'll be at some other sessions too, but I've yet to choose between two I equally want to see. I hope to see some folks who follow Catalogablog there.

Monday, February 02, 2009

Cataloging Info by the Crowd

The LibraryThing weblog has a post about their users adding author information.
On Thursday we introduced a silly new "meme" page called "Dead or Alive?" which listed your authors by their mortal status--alive, dead, unknown or "not a person." (See the blog post or check out yours.) The feature drew on the birth and death dates of the authors in our Common Knowledge system, a free (Creative Commons) "fielded wiki" for miscellaneous "cataloging" information (think "Wikipedia for book info"). To move an author from the "unknown" column, members had to find their dates and enter them into Common Knowledge.
What are the implications? Would this be useful in disambiguation? If the links are stable (COOL, PURLs, or something like that), would a link here be a useful result to return on an author search in our catalog? Just thinking out loud.

Very few of the folks writing on library topics have had their birth dates entered. Hint to Meredith, Terry, Andrew, ....

Thursday, January 29, 2009

Omeka Element Sets

Omeka now comes with the Dublin Core element set. CDWA-Lite is in the works.
Omeka is a free and open source collections based web-based publishing platform for scholars, librarians, archivists, museum professionals, educators, and cultural enthusiasts. Its “five-minute setup” makes launching an online exhibition as easy as launching a blog. Omeka is designed with non-IT specialists in mind, allowing users to focus on content and interpretation rather than programming. It brings Web 2.0 technologies and approaches to academic and cultural websites to foster user interaction and participation. It makes top-shelf design easy with a simple and flexible templating system. Its robust open-source developer and user communities underwrite Omeka’s stability and sustainability.

MODS XML Schema Tool

Here is a tool to validate records against the MODS XML Schema.
The Digital Library Federation's Aquifer is pleased to announce a new online service, the "MODS and Asset Action Explorer." This is an experimental service developed at the University of Illinois Grainger Engineering Library as part of the DLF Aquifer American Social History Online project with support from the Andrew W. Mellon Foundation.

The service allows anyone to upload MODS XML files, including modsCollection files, and verify that those records comply to the MODS XML Schema and also to check the uploaded records against the Aquifer project's MODS Levels of Adoption Guidelines. In addition to MODS records, the service also allows the upload of Asset Action Packages which is another experimental format being developed by the DLF Aquifer project. An Asset Action Package is an XML file containing a defined set of actionable URIs for a digital resource that delivers named, typed actions for that resource.

Anyone is welcome to get an account and upload their MODS records for validation and checking. However, note that the system is still in the research/development stages, so expect that any posted records could get mangled or disappear for unknown reasons.
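Before uploading, a quick local sanity check can save a round trip. The Python sketch below is not schema validation and has nothing to do with how the Aquifer service works internally; it only confirms the file is well-formed XML, that the root is mods or modsCollection in the MODS version 3 namespace, and counts the records. The function name is my own.

```python
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"

def count_mods_records(xml_text: str) -> int:
    """Return how many MODS records a document holds.

    Raises ET.ParseError if the XML is not well formed, and ValueError if
    the root is neither mods nor modsCollection in the MODS namespace.
    """
    root = ET.fromstring(xml_text)
    if root.tag == f"{{{MODS_NS}}}mods":
        return 1
    if root.tag == f"{{{MODS_NS}}}modsCollection":
        return len(root.findall(f"{{{MODS_NS}}}mods"))
    raise ValueError(f"unexpected root element: {root.tag}")

# Invented two-record collection for illustration.
collection = (
    '<modsCollection xmlns="http://www.loc.gov/mods/v3">'
    "<mods><titleInfo><title>First</title></titleInfo></mods>"
    "<mods><titleInfo><title>Second</title></titleInfo></mods>"
    "</modsCollection>"
)
print(count_mods_records(collection))
```

Anything that passes this can still fail the real schema check, so treat it as a pre-flight step only.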

Tuesday, January 27, 2009

Tagging Study

Do Tags Work? by Cathy Marshall is an interesting study comparing tags, titles and descriptions of photos in Flickr.
Have I convinced you that tags aren't all they've cracked up to be? I hope I have, but nonetheless there's a lingering fascination. Surely there's something to be done about tags: we don't want to just turn up our noses at Mr. Weinberger's argument. They could be a compact and efficient way of describing pictures. After all, picture archiving is difficult. Witness Art Spiegelman's fine graphical account in the New Yorker more than a dozen years ago; he described the difficult work of senior librarian Arthur Williams who curated the New York Public Library's extensive picture collection for over 30 years. Just how do you turn a library patron's question, “I want a picture that conveys rough times ahead” into a photo of a three-masted schooner sailing into a storm?

Friday, January 23, 2009

Additions to the MARC Code Lists for Relators, Sources, Description

The codes listed below have been recently approved for use in MARC 21 records. The codes will be added to MARC Code Lists for Relators, Sources, Description Conventions.

The codes should not be used in exchange records until after March 23, 2009. This 60-day waiting period is required to provide MARC 21 implementers time to include newly defined codes in any validation tables they may apply to the MARC fields where the codes are used.

Other Sources
Field 015 (National Bibliography Number) The following code is for use in subfield $2 in field 015 in the Bibliographic format.
dnb
Deutsche Nationalbibliografie [use only after March 23, 2009]
Field 887 (Non-MARC Information Field) The following code is for use in subfield $2 in field 887 in the Bibliographic format.
onix
ONIX (Online Information Exchange) [use only after March 23, 2009]
Term, Name, Title Sources

The following codes are for use in subfield $2 in fields 600-657 (Subject Added Entries/Index Terms) in Bibliographic and Community Information records; field 662 (Subject Added Entry) in Bibliographic records; subfield $2 in fields 700-754 (Index Terms) in Classification records; subfield $2 in fields 700-788 (Heading Linking Entries) in Authority records; and subfield $f in field 040 (Cataloging Source) in Authority records.
ept
Evropski pedagoški tezaver = European education thesaurus (EET) : slovenska različica - izdelana po angleški različici (Ljubljana: Zavod Republike Slovenije za šolstvo) [use only after March 23, 2009]

eurovocen
Eurovoc thesaurus (English) [use only after March 23, 2009]

eurovocsl
Eurovoc thesaurus (Slovenian) [use only after March 23, 2009]

mech
Iskanje po zbirki MECH [use only after March 23, 2009]

pmt
Project management terminology. Newtown Square, PA: Project Management Institute. [use only after March 23, 2009]

Serial Cataloging Guide

NASIGuide: MARC Coding for Serials by Elizabeth McDonald and Beverly Geckle is now available.
Aimed at helping in the creation and interpretation of MARC bibliographic records for serials, this guide focuses on how serial MARC records differ from records for other formats. While continuing resources include both serials and integrating resources such as looseleafs or websites, this guide discusses serials only. Cataloging Serials involves an understanding of both the MARC codes and cataloging rules and practices. Although cataloging rules and practices are referred to, the main focus of this guide is on MARC coding, and not all subfields are always covered.

Typographical Errors in Library Databases

Typographical Errors in Library Databases has a new home. There is also an email group and a weblog on the topic.

MARC Tool

yaz-marcdump is a free tool from Index Data to manipulate MARC records. Perhaps it is just the tool you need to convert those MARC21 records encoded in MARC-8 into UTF-8.
yaz-marcdump reads MARC records from one or more files. It parses each record and supports output in line-format, ISO2709, MARCXML, MarcXchange as well as Hex output.

This utility parses records ISO2709(raw MARC) as well as XML if that is structured as MARCXML/MarcXchange....

The following command converts MARC21/USMARC in MARC-8 encoding to MARC21/USMARC in UTF-8 encoding. Leader offset 9 is set to 'a'. Both input and output records are ISO2709 encoded.

yaz-marcdump -f MARC-8 -t UTF-8 -o marc -l 9=97 marc21.raw >marc21.utf8.raw

The same records may be converted to MARCXML instead in UTF-8:

yaz-marcdump -f MARC-8 -t UTF-8 -o marcxml marc21.raw >marcxml.xml
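The 97 in -l 9=97 is just the decimal character code for 'a'. In the MARC 21 leader, position 09 names the character coding scheme: blank for MARC-8 and 'a' for UCS/Unicode. Here is a minimal Python sketch that reads that flag from a raw ISO2709 record and shows what setting it amounts to. The function names and the sample leader are made up for illustration, and flipping the flag does not transcode anything; yaz-marcdump does the real conversion.

```python
LEADER_LENGTH = 24

def coding_scheme(record: bytes) -> str:
    """Report the character coding scheme named in leader position 09."""
    if len(record) < LEADER_LENGTH:
        raise ValueError("record shorter than a MARC leader")
    flag = chr(record[9])
    if flag == "a":
        return "UCS/Unicode"
    if flag == " ":
        return "MARC-8"
    return f"unknown ({flag!r})"

def mark_as_unicode(record: bytes) -> bytes:
    """Return a copy with leader/09 set to 'a' (decimal 97).

    Only the flag changes; the record's data is not transcoded.
    """
    return record[:9] + bytes([97]) + record[10:]

# Made-up leader: the zeroed lengths would not occur in a real record.
marc8_leader = b"00000nam  2200000 a 4500"

print(coding_scheme(marc8_leader))
print(coding_scheme(mark_as_unicode(marc8_leader)))
```

A check like this is handy after a batch conversion, to spot records where the data and the leader flag disagree.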

Wednesday, January 21, 2009

Moving Image Work-Level Records

The Moving Image Work-Level Records Task Force is looking for comments about their work by Friday, February 13, 2009.
The Moving Image Work-Level Records Task Force of CAPC (OLAC's Cataloging Policy Committee) has posted a finalized version of parts 1-2 of our recommendations. This document covers definitions, boundaries, attributes, and relationships.

We have also posted a draft of part 4, which covers our proof-of-concept attempt to extract work/primary expression-level information from existing MARC manifestation bibliographic records. It also gives some recommendations for cataloging practice and changes to the MARC format based on our experience. We are particularly interested in feedback on readability of the report and on the recommendations that we're making.
The OLAC website has a nice new look. Have you renewed your membership in OLAC? It is a Best Buy.

OAIster Moving

OAIster, the union catalog for OAI records, is to move from the University of Michigan to OCLC.
The University of Michigan approached OCLC about managing future operations for OAIster, which has grown to over 19 million records contributed by over 1,000 institutions and organizations worldwide since the service launched in 2002.

OCLC welcomed the proposal because OAIster complements the types of resources already cataloged in WorldCat, broadens the scope of collections to include open archives, and reaches millions of information seekers every month through OCLC services including WorldCat.org and FirstSearch.

Tuesday, January 20, 2009

‡biblios.net

Nicole C. Engard has made this announcement.
I have been spending a lot of time these last few months working on getting a new web-based cataloging tool ready for you all. It's finally time! I'd like to invite you to sign up for free and try out ‡biblios.net a community cataloging tool from LibLime.

So, what the heck is it? ‡biblios.net is a web-based original and copy cataloging tool with built in federated search of any Z39.50 target (via an integrated search registry with over 2000 targets - or by adding your own) and a large (30 million strong) shared database of catalog records. This means that you can visit ‡biblios.net and benefit from the work of other catalogers who have gone before you. You can also edit and contribute to the database without any restrictions.

I have also worked on creating some macros (others can be written by users) to help streamline some of our cataloging processes and templates for common item types to make original cataloging a little bit easier :) Best of all, you can set ‡biblios.net to automatically add records to your Koha system with the click of a button!

I'm looking for both novice and professional catalogers to give me their opinions of the tools, services and overall user friendliness of ‡biblios.net. I am of course also looking for people to join the community so that this tool can grow and help us all with our cataloging work.

I have worked very closely with the development crew on this new tool and believe very strongly both in it and the ideas behind it. The fact that we all work so very hard on our cataloging makes the fact that the records in ‡biblios.net are freely-licensed under the Open Data Commons all that more appealing.

If you want to learn more you can read through the documentation on the site and/or take a peek at this great write up by Jonathan Rochkind.

I'll have to look up how to get those double daggers. Including a non-keyboard symbol in a product name might not be the best idea.

Let Your Fingers Do the Walking Through WorldCat

OCLC has announced WorldCat Mobile.
  • Search for library materials—Enter search terms such as keywords, author or title
  • Find a WorldCat library near you—Enter your ZIP, postal code or location in the Libraries Locator
  • Call a library—Highlight and click the phone number in a library listing to place a call
  • Map a route—Find the fastest way to a WorldCat library using the mapping software already on your device
Type this URL into your phone's Web browser:

www.worldcat.org/m/

Thursday, January 15, 2009

MARBI at Midwinter

The following papers are now available for review by the MARC community:

Proposal No. 2009-01/3: Identifying work, expression and manifestation records in the MARC 21 Bibliographic and Authority Formats.

Discussion Paper No. 2009-DP01/2: Relationship Designators for RDA Appendix J and K.

Wednesday, January 14, 2009

OCLC Record Sharing News

By now everyone must have seen something about the recent OCLC move to update their position on sharing records. It has been covered and discussed in many weblogs, podcasts and magazine articles. Now all that discussion has led OCLC to reconsider their position.
Members Council and the OCLC Board of Trustees will jointly convene a Review Board of Shared Data Creation and Stewardship to represent the membership and inform OCLC on the principles and best practices for sharing library data. The group will discuss the Policy for Use and Transfer of WorldCat Records with the OCLC membership and library community.
Seems a good idea. I think much of the heat was generated by the policy appearing out of nowhere and taking effect in a very short time. For a member institution, there was no membership involvement.

Tuesday, January 13, 2009

TechKNOW

The latest issue of TechKNOW is now available.
  • One Cataloger's NACO Participation: Comparing Funnel Participation and PCC NACO Classroom Training / Peter Lisius, Music and Media Catalog Librarian, Kent State University Libraries and Media Services
  • Coordinator's Corner / Ian Fairclough, George Mason University (Fairfax Virginia)
  • Eight Blogs Catalogers Should Know About / Michael Monaco, Senior Catalog Librarian, Cleveland Public Library
  • Book Review: Cataloging of Audiovisual Materials and Other Special Materials, Manual Based on AACR2 and MARC 21
  • IMHO: OCLC Policy for the Use and Transfer of WorldCat Records: What are the Implications? / Roman Panchyshyn, Catalog Librarian, Kent State University Libraries and Media Services
  • Fiction Cataloging for Better Access / Michael Christian Budd, Cataloger, Public Library of Cincinnati and Hamilton County
  • Book Review: KidzCat: A How-to-do-it Manual for Cataloging Children's Materials and Instructional Resources

Provider-Neutral E-Monograph Report

The Provider-Neutral E-Monograph Record Task Group Report has been issued.
Introduction The Provider-Neutral E-Monograph Record Task Group was formed shortly after the 2008 Annual Meeting of the American Library Association. The group's charge was to develop a monographic cataloging policy that would provide for a single electronic MARC bibliographic record to represent an online resource that is available from one or more providers. This proposal is only concerned with separate MARC records for the electronic resource-- it does not address the addition of electronic fields to the print record, otherwise known as the "Single Record Approach."

Electronic Resources and Libraries

The program schedule is now available for ER&L 2009
February 10-12, 2009
Pre-Conferences February 9, 2009
UCLA - Covel Commons
Los Angeles, CA

I'll be at this meeting, thanks to the scholarship. I'll be arriving a couple days early and staying with my brother in the Desert. Thinking about visiting the Griffith Observatory and the La Brea Tar Pits. Any other suggestions? Any contra/English country/folk dancing the weekend before the conference?

This software won't allow an ampersand in the labels or title. So I can't use ER&L in either of those places.

Monday, January 12, 2009

IFLA Cataloguing Section

The IFLA Cataloguing Section's annual report for 2008 is available on IFLANET.

A Spanish translation of the ISBD is also available.

COinS News

Swignition looks for Z3988 in a variety of places, not just the standard ContextObjects in Spans span tag. It looks for a rel="Z3988", blockquote class="Z3988", q class="Z3988" and cite class="Z3988".

Swignition is:
  • a Perl library for parsing files (what files?) into an RDF triple structure, and for outputting that data in a variety of serialisations and other formats (which formats?).
  • a TCP service that listens on port 26464 and uses the library to parse any URIs it's asked to.
  • a command-line client that acts as a simple interface for the TCP service (but calls the library directly if it detects that the service is not running).
  • a web interface (try it!) for the TCP service.
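
To make the standard case concrete, here is a minimal Python sketch that pulls ContextObjects out of the usual span class="Z3988" form. It is not how Swignition works internally, and it ignores the other places Swignition looks. The class name and the sample markup are invented for the example.

```python
from html.parser import HTMLParser
from urllib.parse import parse_qs


class CoinsParser(HTMLParser):
    """Collect ContextObjects from <span class="Z3988" title="..."> elements."""

    def __init__(self):
        super().__init__()
        self.context_objects = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        classes = (attr.get("class") or "").split()
        if tag == "span" and "Z3988" in classes:
            # HTMLParser has already turned &amp; back into & in the title.
            self.context_objects.append(parse_qs(attr.get("title") or ""))


# Invented sample markup, not taken from any real page.
SAMPLE = (
    '<span class="Z3988" title="ctx_ver=Z39.88-2004'
    '&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook'
    '&amp;rft.btitle=An+Example+Book"></span>'
)

parser = CoinsParser()
parser.feed(SAMPLE)
print(parser.context_objects[0]["rft.btitle"])
```

Each value comes back as a list because a key may repeat in a ContextObject, for example several authors.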

MARBI at Midwinter

MARBI news.

The following papers are available for review by the MARC community:

Discussion Paper No. 2009-DP03: Changing field 257 (Country of producing entity for archival films) of the MARC 21 Bibliographic Format to include non-archival materials

RDA Papers:
Proposal No. 2009-01/1: New data elements in the MARC 21 Authority Format

Proposal No. 2009-01/2: New content designation for RDA elements: Content type, Media Type, Carrier Type in the MARC 21 Formats

Discussion Paper No. 2009-DP01/1: Encoding URIs for controlled values in MARC records

A few more papers will be announced early next week.

Friday, January 09, 2009

eXtensible Catalog Paper

Supporting the eXtensible Catalog through Metadata Design and Services by Jennifer Bowen is now available.
The eXtensible Catalog (XC) is a unique set of software toolkits for libraries that is not directly comparable to either a traditional Integrated Library System (ILS) or a “next-generation” discovery interface. XC will go well beyond providing a discovery layer to also provide a metadata infrastructure for enriching and transforming metadata to make it usable in a variety of web environments. The XC Metadata Services Toolkit can also be used for experimentation and testing of new metadata standards and schemas and can be an invaluable tool for libraries as they become accustomed to these new standards, especially RDA. The library metadata environment is entering a period that will be characterized by significant change and uncertainty, and the eXtensible Catalog Project will provide a variety of useful tools to help the library community make informed decisions about the future.

IFLA Classification and Indexing Section Newsletter

The December 2008 issue of the IFLA Classification and Indexing Section Newsletter is now available.

Thursday, January 08, 2009

NISO Newsline

The Jan. issue of the NISO Newsline is now available to all Z39.n heads. They have been busy.

Changes to MARC Code List for Languages

The following change has been approved for use in the international language code standard, ISO 639-2 (Codes for the Representation of Names of Languages--Part 2: alpha-3 code) and is consequently also changed in the MARC Code List for Languages.

Language name Moldavian
New name Moldovan
Previous code mol
Now coded rum

The language code "mol" will no longer be used for the variant of the Romanian language that is spoken in the Republic of Moldova known as Moldovan or Moldavian. In the MARC Code List for Languages, Moldovan will be listed as follows:

Moldovan
Assigned collective code [rum]
(Romanian)
UF Moldavian
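
If you keep a local validation table or clean-up script, the change amounts to a one-entry lookup. A minimal Python sketch, assuming you want to rewrite the withdrawn code wherever it turns up. The names are mine, and only the change announced above is in the table.

```python
# Withdrawn language codes mapped to their replacements.
# Only the change announced above is included here.
SUPERSEDED_LANGUAGE_CODES = {"mol": "rum"}


def normalize_language_code(code: str) -> str:
    """Return the current code for a possibly withdrawn language code."""
    code = code.strip().lower()
    return SUPERSEDED_LANGUAGE_CODES.get(code, code)


print(normalize_language_code("mol"))
```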

Semantics in HTML 5

Semantics in HTML 5 by John Allsopp has been published on A List Apart (the other ALA).
There is a very real problem that needs to be solved here. We need mechanisms in HTML that clearly and unambiguously enable developers to add richer, more meaningful semantics—not pseudo semantics—to their markup. This is perhaps the single most pressing goal for the HTML 5 project.

Friday, January 02, 2009

Interesting Collocation

Cutter's functions for the catalog are something we all learned in Cataloging 101. The FRBR functions seem pretty familiar. However, if you let others loose on bibliographic data they come up with some interesting ways to collocate works, say by a person's library. Over at LibraryThing the crowd is entering and tagging the personal collections of famous people. Want to see what Lawrence of Arabia had on his bookshelves? Or maybe Aaron Copland? Just more proof that everything is miscellaneous.

Monday, December 29, 2008

Testing RDA

RDA news.
The Library of Congress, the National Agricultural Library and the National Library of Medicine have jointly decided to test Resource Description and Access (RDA), the proposed new cataloging code, before making a decision on whether or not to implement this new standard. See the joint statement and accompanying letter from Deanna Marcum, Associate Librarian for Library Services, Library of Congress for more details.

GoodSearch

I just heard of the search engine, GoodSearch. It makes a small donation to a charity of my choice each time I search.
GoodSearch is a search engine which donates 50-percent of its revenue to the charities and schools designated by its users. It's a simple and compelling concept. You use GoodSearch exactly as you would any other search engine. Because it's powered by Yahoo!, you get proven search results. The money GoodSearch donates to your cause comes from its advertisers — the users and the organizations do not spend a dime!

GoodSearch: You Search...We Give!
I'll be using this whenever I used to use Yahoo.

Tuesday, December 23, 2008

More MARBI News

The draft agenda for the 2009 ALA Midwinter MARBI meeting is now available.

Also, the minutes for the 2008 Annual MARBI meeting are now available.

News from MARBI

The following papers are available for review by the MARC community:
  • Proposal No. 2009-02: Definition of new codes for legal deposits in 008/07 (Method of Acquisition) in the MARC 21 Holdings Format
  • Proposal No. 2009-03: Definition of field 080 in the MARC 21 Authority Format
  • Proposal No. 2009-04: Addition of Codes for Map Projections in 008/22-23 (Maps) in the MARC 21 Bibliographic Format
  • Proposal No. 2009-05: Adding subfield $u for Uniform Resource Identifier to field 510 (Citation/References note) of the MARC 21 Bibliographic Format
  • Discussion Paper No. 2009-DP02: Definition of field 588 for metadata control note in the MARC 21 Bibliographic Format
A few additional proposals and a discussion paper will be posted shortly. MARBI proposals and discussion papers related to RDA will be posted in early January.

The draft agenda for the 2008 ALA Annual MARBI meetings and the Annual 2008 MARBI minutes will be made available soon.

Monday, December 22, 2008

2009 Electronic Resources & Libraries

I'll be going to the 2009 Electronic Resources & Libraries. Hope to see some people there I've only "met" online. This is made possible by the scholarship I received from the conference. My sincere thanks to the conference organizers and the scholarship committee.

lcsh.info Gone

Some sad news. "On December 18th I was asked to shut off lcsh.info by the Library of Congress. As an LC employee I really did not have much choice other than to comply." This has been posted everywhere else, but deserves the widest exposure, so I'm posting here as well.

uClassify Contest

The folks at LibraryThing are interested in what could be done with the uClassify tool. They are offering a $100.00 prize for the best tool.
Our dream is to share hardcore classifier technology with everyone. We recognized that classifiers are mostly present at universities research departments and expensive commercial companies. We want to change that. We want everyone to have the possibility to use a top notch classifier - completely free. We find it enormously exciting to see what happens when a tool for creativity is given to the community. We hope to see all kinds of beyond-our-imagination classifiers and incredible web applications being built around the API.

Friday, December 19, 2008

PURLs

PURLZ Server version 1.2 has been released.
PURLs are Web addresses or Uniform Resource Locators (URLs) that act as permanent identifiers in the face of a dynamic and changing Web infrastructure. Instead of resolving directly to Web resources, PURLs provide a level of indirection that allows the underlying Web addresses of resources to change over time without negatively affecting systems that depend on them. This capability provides continuity of references to network resources that may migrate from machine to machine for business, social or technical reasons.
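
The "level of indirection" is easy to picture as a lookup table that answers with a redirect. Here is a toy Python sketch of the idea. It is not the PURLZ server's code, and the table, paths and hosts are all made up. I've used a 302 redirect for simplicity; a real resolver may use other redirect codes.

```python
from typing import Optional, Tuple

# Hypothetical PURL table: stable path -> current location of the resource.
PURL_TABLE = {"/net/example/report": "http://old-host.example.org/report.html"}


def resolve(path: str) -> Tuple[int, Optional[str]]:
    """Return an HTTP status and redirect target for a PURL path."""
    target = PURL_TABLE.get(path)
    if target is None:
        return 404, None
    return 302, target


print(resolve("/net/example/report"))

# The resource migrates to another machine: only the table entry changes,
# while the PURL that people cite stays the same.
PURL_TABLE["/net/example/report"] = "http://new-host.example.org/docs/report.html"
print(resolve("/net/example/report"))
```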

Cataloging Video Discs

Another from the draft folder. The DVD Guide Update Task Force of the Cataloging Policy Committee (CAPC) of OLAC has completed the document, Guide to Cataloging DVD and Blu-ray Discs Using AACR2r and MARC 21 (2008 update).

Thanks to all involved.



Additions to the MARC Code Lists for Relators, Sources, Descriptions

The code listed below has been recently approved for use in MARC 21 records. The code will be added to MARC Code Lists for Relators, Sources, Description Conventions.

The code should not be used in exchange records until after February 16, 2009. This 60-day waiting period is required to provide MARC 21 implementers time to include newly-defined codes in any validation tables they may apply to the MARC fields where the codes are used.

Category Code Sources

The following code is for use in subfield $2 in field 072 in Authority and Bibliographic records (Subject Category Code) and in subfield $z in field 073 (Subdivision Usage) in Authority records.

Addition:
eflch
E4Libraries Category Headings
[use only after February 16, 2009]

Term, Name, Title Sources

The following code is for use in subfield $2 in fields 600-657 (Subject Added Entries/Index Terms) in Bibliographic and Community Information records; subfield $2 in field 662 (Subject Added Entry) in Bibliographic records; subfield $2 in fields 700-754 (Index Terms) in Classification records; subfield $2 in fields 700-788 (Heading Linking Entries) in Authority records; and subfield $f in field 040 (Cataloging Source) in Authority records.

Addition:
eflch
E4Libraries Category Headings
[use only after February 16, 2009]

Nature Now has XMP

Nature now includes XMP semantic data in the PDF version of their articles.
We now have a complete bibliographic record (including DOI) embedded in the PDF using structured markup. And, moreover, we also have a solid bedrock for adding in any additional metadata should the need arise. This semantic labelling is available on all new issues of Nature and will be added to other NPG titles over the coming months.

XMP as a labelling technology could well go a long way towards addressing concerns raised by Olivia Judson in an op-ed piece earlier this week in the New York Times: Defeating Bedlam. The author decries that "downloading papers from journal Web sites" means that "access to information is easier and faster than ever before ... but there’s been no obvious way to manage it once you’ve got it." Those days may soon be over.

Now with XMP all manner of scholarly content - documents, images and other media types - can be properly labelled and many programs (not just Zotero and Papers which she reviews) can directly profit from the richness of semantic web descriptions.

Indexing 2.0

Unshelved is going to use the crowd to index their strips. Ohnorobot.com is the tool they selected to do the indexing. It is also a tool for searching across over 91,000 Web comics.

Monday, December 15, 2008

Timeline and Plan for the Next Five Library of Congress Genre/Form Projects

News from LC.
In July, 2008, the Library of Congress Acquisitions and Bibliographic Access (ABA) management team approved five new genre/form projects to be undertaken by the Cataloging Policy and Support Office (now the Policy and Standards Division): cartography, law, literature, music, and religion. On October 31st, 2008, the Division presented its timeline and plan for those projects to the ABA management team, and it was approved on November 17th.

The plan follows the principles and recommendations for the management of the genre/form projects, as outlined in the moving image project report; provides opportunities for involvement by other libraries and organizations with an interest in genre/form headings; requests input from the broader library community at various points in each project; and, furnishes a timeline that will allow for the orderly roll-out of genre/form headings in each of the five disciplines under development during the next four years.

Tuesday, December 09, 2008

Bay Area Youth Singers Holiday Concert

Bay Area Youth Singers (BAYS) Holiday Concert
December 14 4:00 p.m.
University Baptist Church
Tickets $10 Adults $3 Students

Contact me for tickets.

IFLA GMD Paper

The IFLA Cataloguing Section, ISBD Review Group has the document Proposed Area 0 for ISBD up for review.
The Working Group on General Material Designations of the IFLA Meeting of Experts on an International Cataloguing Code (IME ICC) held in Frankfurt in 2003 suggested that the GMD seemed unsatisfactory because the presence of the content of the resource and of the presentation of the resource were mixed, confusing more than clarifying. Other comments were on its present location, interrupting the logical order of the title information. It was also thought that the GMD was important enough to be at the beginning of the record, and that it should not be optional as it currently is.
Comment by 30 January 2009.

GPO Separate Record Cataloging Policy

The Government Printing Office (GPO) has adopted a separate record cataloging policy.
At the request of the Federal Depository Library community, the Government Printing Office, Library Services & Content Services, Library Technical Information Services (LTIS) staff has formulated a policy for creating separate records for every manifestation of a document. This policy follows an internal review of the current approach of single record cataloging.

Monday, December 08, 2008

Preliminary Authority Records

Here is another preliminary authority record:

American Association of Petroleum Geologists. |b Committee on Structural Nomenclature

It was created in 1985 as shown by the ID n 85806886. It was updated in 2008 as shown by 005. But it is still PRELIMINARY. How long can these stay preliminary?
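
Reading those two dates off a record is simple enough to script if you want to hunt for more of these. A rough Python sketch, assuming the digits after the prefix start with a year, as I read them above, and that the 005 field follows the usual yyyymmddhhmmss.f pattern. The function names and the sample 005 value are invented; the real value is in the record.

```python
import re
from datetime import datetime


def control_number_year(control_number: str) -> int:
    """Year embedded in an LC control number such as 'n 85806886'."""
    digits = re.sub(r"\D", "", control_number)
    if len(digits) == 8:
        # Two-digit year plus six-digit serial. Assumes a 20th-century
        # number; an 'n 00...' number would need special handling.
        return 1900 + int(digits[:2])
    if len(digits) == 10:
        # Four-digit year plus six-digit serial.
        return int(digits[:4])
    raise ValueError("unrecognized control number: %r" % control_number)


def parse_005(value: str) -> datetime:
    """Parse a MARC 005 field (yyyymmddhhmmss.f), ignoring the fraction."""
    return datetime.strptime(value[:14], "%Y%m%d%H%M%S")


created = control_number_year("n 85806886")
updated = parse_005("20081203074512.0")  # hypothetical 005 value
print(created, updated.year, updated.year - created)
```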

COinS in WordPress

The OpenBook Book Data plug-in for WordPress by John Miedema now supports COinS.
OpenBook is for book reviewers, book bloggers, library webmasters, anyone who wants to put book covers and data on their WordPress blog or website.

OpenBook gets its covers and book data from Open Library (http://openlibrary.org), the only source of bibliographic data that is both open source and open data, hence the OpenBook label.

About COinS
The goal is to embed citation metadata into html in such a way that processing agents can discover, process and make use of the metadata. Since an important use of this metadata will be to allow processing agents to make OpenURL hyperlinks for users in libraries (latent OpenURL), the method must allow the metadata to be placed any where in HTML that a link might appear. In the absence of some metadata-aware agent, the embedded metadata must be invisible to the user and innocuous with respect to HTML markup. To meet these requirements, the span element was selected. The NISO OpenURL ContextObject is selected as the specific metadata package. The resulting specification is named "ContextObject in SPAN" or COinS for short.
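
Going the other way, building a COinS span for a book is mostly URL-encoding key/value pairs and tucking them into the title attribute of an empty span. A minimal Python sketch follows. This is not the OpenBook plug-in's code (that is a WordPress plug-in), and the function name and sample book are invented.

```python
from html import escape
from urllib.parse import urlencode


def coins_span(title: str, author: str) -> str:
    """Build an empty span carrying a book ContextObject in its title."""
    fields = [
        ("ctx_ver", "Z39.88-2004"),
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:book"),
        ("rft.btitle", title),
        ("rft.au", author),
    ]
    # URL-encode the pairs, then HTML-escape so that & becomes &amp;.
    context_object = escape(urlencode(fields), quote=True)
    return '<span class="Z3988" title="{}"></span>'.format(context_object)


span = coins_span("An Example Book", "Doe, Jane")  # made-up book
print(span)
```

The span has no content, so a browser shows nothing, while a tool that knows the convention can read the title attribute.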

Thursday, December 04, 2008

Problems with Microformats

This is old news, but new to me and maybe someone else. There are some basic problems with many of the microformats, including the hCalendar. The BBC has stopped using microformats.
Since /programmes first went live we've been working to ensure that programme data was accessible to people and machines alike. The API design was baked in at the application design stage. Similarly we've worked on adding microformats to HTML pages as a lightweight API. All broadcasts use the hCalendar microformat to add start times, end times, broadcast channels etc.

Unfortunately there have been a number of concerns over hCalendar's use of the abbreviation design pattern.

They were considering RDFa as an alternative.

So, does anyone know of any tools to easily create RDFa? Something to just plug in the info and have it pop out?

Wednesday, December 03, 2008

Name Authority Records

News from LC concerning Name Authority Records.
The Library of Congress is pleased to announce that OCLC has completed the pre-population of the NACO authority file with non-Latin references (authority 4XX fields) derived from non-Latin bibliographic heading fields in WorldCat, a use of data-mining techniques originally developed for the WorldCat Identities project. The pre-population project, which began in mid-July, added non-Latin script references to 497,576 name authority records for personal names and corporate bodies.

**For NACO catalogers, this means that the moratorium on updating 100/110 authority records that existed prior to July 2008 to add non-Latin script references is now lifted. All name authority records are now candidates for the addition of non-Latin script references. Thanks for your patience during this period.**

LC hopes to announce soon a process by which catalogers that have been examining the non-Latin script references added by this project can contribute to the development of policies and practices for the future, such as the issues raised in the white paper on non-Latin script references in name authority records.

Special thanks to Robert Bremer, and colleagues at OCLC, for all the efforts to make this pre-population a reality.

Tuesday, December 02, 2008

RDFa

Elias Torres, Ben Adida discuss RDFa on Technometria with Phil Windley
While the web is primarily for human consumption, more sites are including machine readable data. However, this information is usually included separately. As the RDFa Primer states, RDFa provides a set of XHTML attributes to augment visual data with machine-readable hints. RDFa helps bloggers and website authors make their web pages smarter by adding computer-readable information to a site. Elias Torres and Ben Adida talk about it, including its history and what problems RDFa is attempting to solve.

Torres and Adida also discuss the technical details of RDFa and give a detailed technical description of how RDFa works. They review the mechanics of RDFa and give examples of its usage.

Wednesday, November 26, 2008

Metadata Extraction

Effective Metadata Extraction from Irregularly Structured Web Content by Baoyao Zhou, Wei Liu, Yu Yang, Weichun Wang, and Ming Zhang (HPL-2008-203)
Metadata extraction is one crucial module for domain specific Web content discovery and management, because the accuracy and completeness of the extracted metadata would directly affect the quality of subsequent domain information services. Our Online Course Organization project aims to build an online course portal to serve the course information obtained from the Web. Since most course pages are irregularly structured, most existing approaches are not effective for extracting course metadata. In this paper, we proposed a novel hierarchical clustering approach to generate a web page semantic structure model from the DOM tree, called Logical Structure Model, such that the hidden patterns and knowledge can be revealed and used to facilitate identifying course metadata. The experimental results have shown that our solution can achieve effective metadata extraction.

Library Weblogs

Now available, The Liblog Landscape 2007-2008 by Walt Crawford.
Liblogs--blogs written by library people, as opposed to official library blogs--provide some of today's most interesting and useful library literature. This book offers a broad look at English-language liblogs as they are and as they've changed between 2007 and 2008. The book includes more than 600 blogs with detailed analysis of 27 metrics for 2007 and 2008 and changes from 2007 to 2008--and, for 143 of them, 2006 as well. Through tables, charts and text, we explore the liblog landscape.

MODS Tools

The MODS users are collecting examples of tools using MODS. One example is Tellico.
Tellico is a KDE application for organizing your collections. It provides default templates for books, bibliographies, videos, music, video games, coins, stamps, trading cards, comic books, and wines.

Tellico allows you to enter your collection in a catalogue database, saving many different properties like title, author, etc. Two different views of your collection are shown. On the left, your entries are grouped together by any field you like, allowing you to see how many are in each group. On the right, selected fields are shown in column format, allowing you to sort by any field. On the bottom is a customizable HTML view of the current entry. The entry editor is a dialog box where you enter the data.

Tuesday, November 25, 2008

Cataloging Tools

LibLime has announced the beta test of a suite of cataloging tools, ‡biblios.net.
‡biblios.net is a subscription-based, hosted version of the open-source ‡biblios metadata editor that we released earlier this year. In addition to the editor, ‡biblios.net includes some extended community features such as integrated real-time chat, forums, and private messaging.

‡biblios.net also provides access to the world's largest database of freely-licensed library records. The database will be freely available to ‡biblios.net subscribers and non-subscribers alike via Z39.50, OAI, and direct download.

Furthermore, the database itself will be maintained by ‡biblios.net users similar to the way that Wikipedia's database is maintained by users.

We're now looking for enthusiastic participants to help shape the final production release of ‡biblios.net.

Ways you can help:

  • Become a beta tester for the ‡biblios.net platform by filling out the beta tester application form.
  • Donate your records to ‡biblios.net. Upload records to http://archive.org, and drop us an email at 'info AT liblime DOT com'
  • Get involved in the ‡biblios open-source community: get your copy of ‡biblios and join the development team at http://biblios.org
An aside, wouldn't it make more sense in that first paragraph to link "‡biblios metadata editor" rather than "released earlier this year?" Links are a form of mark-up and clean mark-up matters.

Monday, November 24, 2008

Metadata Matters

Diane Hillmann and Jon Phipps have started a new weblog, Metadata Matters. Based on the names, I'd call this a must read. Subscribed.

Friday, November 21, 2008

Better 404 Pages

Good idea from Dean Frickey writing in A List Apart (the other ALA), A More Useful 404.
Encountering 404 errors is not new. Often, developers provide custom 404 pages to make the experience a little less frustrating. However, for a custom 404 page to be truly useful, it should not only provide relevant information to the user, but should also provide immediate feedback to the developer so that, when possible, the problem can be fixed.

To accomplish this, I developed a custom 404 page that can be adapted to the look and feel of the website it’s used on and uses server-side includes (SSI) to execute a Perl script that determines the cause of the 404 error and takes appropriate action.
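
One plausible way to "determine the cause" is to look at the referring page. The sketch below is my own guess at that kind of logic, written in Python rather than the article's Perl. The categories, the function name and the short search-engine list are assumptions, not taken from the article.

```python
from urllib.parse import urlparse

# Hypothetical and deliberately incomplete list of search-engine host fragments.
SEARCH_ENGINES = ("google.", "yahoo.")


def classify_404(referer: str, own_host: str) -> str:
    """Guess why a visitor reached a missing page, based on the referer."""
    if not referer:
        return "typed-or-bookmarked"    # nothing to fix on our side
    host = urlparse(referer).hostname or ""
    if host == own_host:
        return "broken-internal-link"   # our own page is wrong: tell the developer
    if any(name in host for name in SEARCH_ENGINES):
        return "search-engine"          # stale search result
    return "external-link"              # someone else's page is out of date


print(classify_404("http://www.example.org/index.html", "www.example.org"))
```

A real page would then pick its message and decide whether to notify the developer based on the category.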

Thursday, November 20, 2008

Authority Record Access

Why doesn't LC offer Z39.50 access to the authority files? How about their other thesauri, like the Thesaurus For Graphic Materials? Easy access to these files would be useful. Maybe Z39.50 is "so yesterday" and SRU/SRW or an API is the answer. These are rich resources and access would be useful in ways we can't yet imagine. How about other institutions? AAT or the NASA Thesaurus, or... would be useful. This is not only about bibliographic access, but has wider issues in a Semantic Web environment.

[Later] OCLC does provide access via their Terminologies Project, see the comment for full details.

[21 Nov. 2008] Someone sent me a note saying that the Voyager software used does not support Z39.50 access to the authority records. That they are not a separate database and have very little indexing. Do check out the comments for some useful information.
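
To show why SRU is attractive, a search is nothing more than a URL. Here is a minimal Python sketch that builds an SRU 1.1 searchRetrieve request. The endpoint is made up (LC offers no such service for the authority file, which is the point of this post), and the index name in the query is only an example, since the indexes available depend on the server.

```python
from urllib.parse import urlencode


def sru_search_url(base_url: str, cql_query: str, max_records: int = 10) -> str:
    """Build an SRU 1.1 searchRetrieve URL for a CQL query."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": cql_query,
        "maximumRecords": str(max_records),
    }
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint, for illustration only.
url = sru_search_url(
    "http://sru.example.org/authorities",
    'dc.title = "petroleum geologists"',
)
print(url)
```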

Wednesday, November 19, 2008

RSS for TOCs

RSS and Scholarly Journal Tables of Contents: the ticTOCs Project, and Good Practice Guidelines for Publishers by Lisa Rogers provides some advice based on experience.
Publishers are using various versions of feeds such as RSS 1.0, RSS 2.0, RSS 0.91 and Atom. RSS 0.91 and RSS 2.0 are very simple XML formats, and typically only contain the fields for title, description and link. However, RSS 1.0 can easily be extended by the use of modules so as to not only deliver the content, but also provide structured metadata. One such module for extended RSS 1.0 is the Publishing Requirements for Industry Standard Metadata (PRISM) module. A variety of publishers such as Nature Publishing Group (6), Inderscience (7) and SAGE (8) are already using PRISM along with Dublin Core Metadata to provide rich metadata in their RSS feeds.
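
Here is what that structured metadata buys you. The Python sketch below reads one item from an RSS 1.0 feed that uses the Dublin Core and PRISM modules. The feed is invented, and the PRISM namespace URI is an assumption on my part, since publishers use different PRISM versions. Check the namespace declared in the feed you are reading.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rss": "http://purl.org/rss/1.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
    # Assumed PRISM namespace; it varies with the PRISM version in use.
    "prism": "http://prismstandard.org/namespaces/1.2/basic/",
}

# Invented feed fragment, for illustration only.
FEED = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns="http://purl.org/rss/1.0/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:prism="http://prismstandard.org/namespaces/1.2/basic/">
  <item rdf:about="http://journal.example.org/article/1">
    <title>An Example Article</title>
    <link>http://journal.example.org/article/1</link>
    <dc:creator>Doe, Jane</dc:creator>
    <prism:publicationName>Journal of Examples</prism:publicationName>
    <prism:volume>12</prism:volume>
    <prism:startingPage>34</prism:startingPage>
  </item>
</rdf:RDF>"""

root = ET.fromstring(FEED)
records = []
for item in root.findall("rss:item", NS):
    records.append({
        "title": item.findtext("rss:title", namespaces=NS),
        "creator": item.findtext("dc:creator", namespaces=NS),
        "journal": item.findtext("prism:publicationName", namespaces=NS),
        "volume": item.findtext("prism:volume", namespaces=NS),
        "start_page": item.findtext("prism:startingPage", namespaces=NS),
    })

print(records[0])
```

With plain RSS 2.0 the volume and page would have to be scraped out of the description text, if they were there at all.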

Tuesday, November 18, 2008

Comment on RDA

Form for comments from the US about RDA.

Koha User Group Meeting

On April 16-17th there will be a Koha innovations and sharing group in Plano, Texas (suburb of Dallas/Fort Worth). The 2-day workshop would have lab access and presentation space. There would be a charge to cover lunch both days and other expenses. Any leftover money would be given to the KUDOS users group as seed money. Anticipated cost $100.

Algorithms for Clustering Tags

Clustering Tags in Enterprise and Web Folksonomies by Simpson, Edwin will be published and presented at the International Conference on Weblogs & Social Media, Seattle, March 31st, 2008 (HPL-2008-18)
Tags lack organizational structure limiting their utility for navigation. We present two clustering algorithms that improve this by organizing tags automatically. We apply the algorithms to two very different datasets, visualize the results and propose future improvements.

Monday, November 17, 2008

RDA Draft

The full draft of RDA is now available for comment. Comments needed.

Indexing Tool

Library catalogs these days are mostly relational databases and related indexes. LuSql is a tool to create an index from a relational database.
LuSql is a simple but powerful tool for building Lucene indexes from relational databases. It is a command-line Java application for the construction of a Lucene index from an arbitrary SQL query of a JDBC-accessible SQL database. It allows a user to control a number of parameters, including the SQL query to use, individual indexing/storage/term-vector nature of fields, analyzer, stop word list, and other tuning parameters. In its default mode it uses threading to take advantage of multiple cores.

LuSql can handle complex queries, allows for additional per record sub-queries, and has a plug-in architecture for arbitrary Lucene document manipulation. Its only dependencies are three Apache Commons libraries, the Lucene core itself, and a JDBC driver.

LuSql has been extensively tested, including a large 6+ million full-text & article metadata document collection, producing an 86GB Lucene index.
Lots of the Code4Lib folks are working with Lucene indexes.
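The general idea of turning the rows of a SQL query into a searchable index can be sketched in a few lines. This is not LuSql and not Lucene: it is a toy inverted index built with Python's standard library, and the table, column names, and records are invented for illustration.

```python
import sqlite3
from collections import defaultdict


def build_index(conn, query):
    """Run `query` and map each lowercase word to the ids of the rows containing it.

    Assumes the first column of the query is the record id and the
    remaining columns are text to be indexed.
    """
    index = defaultdict(set)
    for row in conn.execute(query):
        record_id, *fields = row
        for field in fields:
            for word in str(field).lower().split():
                word = word.strip(".,;:")
                if word:
                    index[word].add(record_id)
    return index


# An invented bibliographic table held in memory.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bib (id INTEGER PRIMARY KEY, title TEXT, author TEXT)")
conn.executemany(
    "INSERT INTO bib VALUES (?, ?, ?)",
    [
        (1, "Cataloging with MARC", "Smith"),
        (2, "Indexing relational data", "Jones"),
        (3, "MARC and relational databases", "Lee"),
    ],
)

index = build_index(conn, "SELECT id, title, author FROM bib")
print(sorted(index["marc"]))
print(sorted(index["relational"]))
```

A real Lucene index adds analyzers, stop words, stored fields, and relevance ranking, which are the parameters the LuSql description says it lets you control.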

Lemon8-XML

Adding semantic mark-up to text is something the cataloger in me always finds good. Microformats, XML, or RDF all make searches more precise. Lemon8-XML is a tool to change scholarly papers in MS Word or Open Office formats into XML. Sweet idea.
Lemon8-XML is a web-based application designed to make it easier for non-technical editors and authors to convert scholarly papers from typical word-processor editing formats such as MS-Word .DOC and OpenOffice .ODT, into publishing layout formats such as the open, industry-standard NLM Journal Publishing XML format.

To use Lemon8-XML, you don't need to understand XML, all you need is a little time and a general understanding of how scholarly articles are structured. In general, this means a document with:

  1. some information about the article and authors at the top
  2. usually an abstract
  3. several sections, often titled "introduction", "methods", "results", etc.
  4. optional figures or tables, either in-text or as appendices
  5. a list of references or citations in a standardized format (eg. MLA, APA, etc.)
It is from the Public Knowledge Project.

Thursday, November 13, 2008

Preliminary Authority Records

Just what does it take to upgrade a preliminary authority record? I ask because there are some about 25 years old that are still preliminary.

n 83827701
Space Age Astronomy Symposium (1961 : Pasadena, Calif.)

or

n 83827385
Solar Spectrum Symposium (1963 : Rijksuniversiteit te Utrecht)

OpenSearch and unAPI Enrich the Catalogs

SeeAlso: A Simple Linkserver Protocol by Jakob Voss appears in Ariadne no. 57 (October 2008)
In recent years the principle of Service-oriented Architecture (SOA) has grown increasingly important in digital library systems. More and more core functionalities are becoming available in the form of Web-based, standardised services which can be combined dynamically to operate across a broader environment [1]. Standard APIs for searching (SRU [2] [3], OpenSearch [4]), harvesting and syndication (OAI-PMH [5], ATOM [6]), copying (unAPI [7] [8]), publishing, editing (AtomPub [9], Jangle [10], SRU Update [11]), and more basic library operations, either already exist or are being developed.

The creation of the SeeAlso linkserver protocol was occasioned by the need to enrich title views in library catalogues of the German Common Library Network (GBV) with links to additional information. However, instead of integrating those links into title records and tailoring the presentation to our specific OPAC software, we decided to create a general linkserver Web service.
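As I understand the article, a SeeAlso server answers a lookup with a small JSON array in the OpenSearch Suggestions style: the query, then parallel lists of labels, descriptions, and URLs. Here is a minimal sketch of turning such a response into links a catalog page could display. The sample payload and its example.org links are invented, and no request is made to any real server.

```python
import json
from itertools import zip_longest


def parse_suggestions(payload):
    """Parse an OpenSearch Suggestions style array into (query, list of links).

    The descriptions and URLs lists are treated as optional; missing
    values are filled with empty strings.
    """
    data = json.loads(payload)
    query = data[0]
    labels = data[1] if len(data) > 1 else []
    descriptions = data[2] if len(data) > 2 else []
    urls = data[3] if len(data) > 3 else []
    links = [
        {"label": label, "description": description, "url": url}
        for label, description, url in zip_longest(labels, descriptions, urls, fillvalue="")
    ]
    return query, links


# An invented response for illustration only.
SAMPLE_RESPONSE = json.dumps([
    "3-447-03706-7",
    ["Encyclopedia article", "Review"],
    ["", "Invented review of the title"],
    ["http://example.org/article", "http://example.org/review"],
])

query, links = parse_suggestions(SAMPLE_RESPONSE)
for link in links:
    print(query, link["label"], link["url"])
```

The appeal of the design is that the OPAC only has to render a list of labelled links; where the links come from is the linkserver's business.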

Wednesday, November 12, 2008

Omeka 0.10

Omeka 0.10 was released yesterday.
Omeka 0.10b incorporates many of the changes you asked for: an unqualified Dublin Core metadata schema and fully extensible element sets to accommodate interoperability with digital repository software and collections management systems; elegant reworkings of our theme API and plugin API to make add-on development more intuitive and more powerful; a new, even more user friendly look for the administrative interface; and a new and improved Exhibit Builder. While the changes are extensive and represent a next-to-last step forward toward a 1.0 release in early 2009, existing users of Omeka should have little trouble switching to 0.10b. New users should have even less trouble getting started. Meanwhile, visitors to Omeka.org will find a new look, a more intuitive information architecture, easily browsable themes and plugins directories, improved documentation and user support, and new ways to get involved in the Omeka community.

Monday, November 10, 2008

OPML

How (and Why) to Create an OPML File by Marshall Kirkpatrick is only new to me. A PR person looks at the Outline Processor Markup Language.
There’s a billion other reasons to use OPML - just ask yourself in what circumstances you can imagine sending someone else one link or file that contains a collection of dynamic sources on any topic. I know these are the sorts of questions that keep me up at night.
I'm not seeing OPML icons as often as I'd expect. Is this another PICS, a good idea that just never gets adopted?
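For anyone who wants to try it, here is a minimal sketch of building an OPML subscription list with Python's standard library. The feed names and URLs are invented, and the structure shown (a head with a title, then one outline element per feed with type, text, and xmlUrl attributes) is the common subscription-list form, not the only thing OPML can express.

```python
import xml.etree.ElementTree as ET


def build_opml(title, feeds):
    """Return an OPML document (as a string) listing the given feeds.

    `feeds` is a list of (name, feed_url) pairs.
    """
    opml = ET.Element("opml", version="2.0")
    head = ET.SubElement(opml, "head")
    ET.SubElement(head, "title").text = title
    body = ET.SubElement(opml, "body")
    for name, feed_url in feeds:
        ET.SubElement(body, "outline", type="rss", text=name, xmlUrl=feed_url)
    return ET.tostring(opml, encoding="unicode")


# Invented feeds for illustration only.
document = build_opml(
    "Cataloging feeds",
    [
        ("Example Cataloging Blog", "http://example.org/cataloging/rss.xml"),
        ("Example Metadata News", "http://example.org/metadata/atom.xml"),
    ],
)
print(document)
```

Save the output with an .opml extension and most feed readers should be able to import the whole list at once.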

Thursday, November 06, 2008

WorldCat Hackathon

WorldCat Hackathon is the impetus for some tool development. From OCLC comes this notice:
We added a few more features in this month's xID deployment, hopefully it could be useful in upcoming WorldCat Hackathon.
  • support LCCN query such as: http://xisbn.worldcat.org/webservices/xid/lccn/2004273129?fl=isbn,lccn
  • support deleted OCLCNUM (marc 019 field) http://xisbn.worldcat.org/webservices/xid/oclcnum/47139964?method=getMetadata In this case OCLCNUM 47139964 was merged into 33100112, and we use a flag "presentOclcnum" to mark present OCLC numbers.
  • xISSN project now supports tab-delimited and CSV dissemination http://xissn.worldcat.org/webservices/xid/issn/0036-8075?method=getEditions&format=csv&fl=issn,form,title http://xissn.worldcat.org/webservices/xid/issn/0036-8075?method=getEditions&format=txt&fl=issn,form,title
  • start to support php dissemination format in all XID projects http://xisbn.worldcat.org/webservices/xid/isbn/0596002815?method=getEditions&fl=*&format=php
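The URLs in the list above all follow one pattern: host, identifier type, identifier, then query parameters. A minimal sketch of a helper that builds them, using only the hosts, paths, and parameters shown in the notice. The function name is my own, and the code only constructs the URL; it makes no request, since the service may not be reachable.

```python
from urllib.parse import urlencode


def xid_url(host, id_type, value, **params):
    """Build an xID style URL from its parts.

    Commas and asterisks are left unescaped so the result matches
    the form of the URLs quoted in the notice.
    """
    base = f"http://{host}/webservices/xid/{id_type}/{value}"
    if not params:
        return base
    return f"{base}?{urlencode(params, safe=',*')}"


print(xid_url("xisbn.worldcat.org", "lccn", "2004273129", fl="isbn,lccn"))
print(xid_url("xisbn.worldcat.org", "oclcnum", "47139964", method="getMetadata"))
print(xid_url("xissn.worldcat.org", "issn", "0036-8075",
              method="getEditions", format="csv", fl="issn,form,title"))
```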
Matienzo, Mark has announced that Python WorldCat Module v0.1.0 is now available.
In preparation for the upcoming WorldCat Hackathon starting this Friday, I've made a few changes to worldcat, my Python module for interacting with OCLC's APIs. Most notably, I've added iterators for SRU and OpenSearch requests, which (like the rest of the module) painfully need documentation.

isbn2marc

William Denton has written a program, isbn2marc, that takes an ISBN and returns a MARC record. It uses Z39.50 and is written in Ruby. Mr. Denton is the person responsible for the FRBR Blog, good stuff.

Wednesday, November 05, 2008

Changes to Dewey

973.931 Administration of George W. Bush, 2001–2009
973.932 Administration of Barack Obama, 2009–

Conference Presentations

Have you done a conference presentation lately? If so, let all that work continue to inform the library community by submitting it to the WebJunction conference page. They already have several presentations, both slides and audio, from several conferences. Well worth a look and listen. Great idea WebJunction, thanks.

Tuesday, November 04, 2008

Ten Do’s and Don’ts for Conference, Workshop, and Program Organizers

In the wake of Internet Librarian lots of folks have been posting tips for presenters. Conference organizers also have a nice list of hints, Ten Do’s and Don’ts for Conference, Workshop, and Program Organizers. Many of our conferences are arranged by volunteers who change every year or two. A look at this list and the comments should make for happier speakers.

Thanks to Rachel Singer Gordon for pointing me to this again, I'd lost the link.

New DCMI Documents

Two new documents from the Dublin Core Metadata Initiative. The first involves concepts that relate to RDA. (Although why we are still working on the intellectual foundation when it is nearly ready....) The second provides a model for interoperability on the Semantic Web. The DCMI folks are looking for comments on both.
Guidelines for Dublin Core Application Profiles describes the key components of an application profile and walks the reader through the process of designing a profile. Addressed primarily to a non-technical audience, the guidelines also provide a technical appendix about modeling the metadata interoperably for use in linked data environments. This draft will be revised in response to feedback from readers.

Interoperability Levels for Dublin Core Metadata, published today as a DCMI Working Draft, discusses the modeling choices involved in designing metadata applications for different types of interoperability. At Level 1, applications use data components with shared natural-language definitions. At Level 2, data is based on the formal-semantic model of the W3C Resource Description Framework. At Level 3, data is structured as Description Sets (i.e., as records). At Level 4, data content is subject to a shared set of constraints (as described in a Description Set Profile). Conformance tests and examples are provided for each level. The Working Draft represents work in progress for which the authors seek feedback.

Monday, November 03, 2008

OCLC News

OCLC has a new policy on sharing records. We have until Feb. to consider this policy and all the implications. There was lots of speculation about this before it was released.

Searching with Tags

Searching with Tags: Do Tags Help Users Find Things? by Margaret E.I. Kipp appears in Proceedings 10th International Conference of the International Society for Knowledge Organization, Montreal, Quebec, Canada.
This study examines the question of whether tags can be useful in the process of information retrieval. Participants were asked to search a social bookmarking tool specialising in academic articles (CiteULike) and an online journal database (Pubmed) in order to determine if users found tags were useful in their search process. The actions of each participants were captured using screen capture software and they were asked to describe their search process. The preliminary study showed that users did indeed make use of tags in their search process, as a guide to searching and as hyperlinks to potentially useful articles. However, users also made use of controlled vocabularies in the journal database.

Thursday, October 30, 2008

Wednesday, October 29, 2008

Web 2.0 Concepts to Enhance Digital Collections

The ‘Long Tale’: Using Web 2.0 Concepts to Enhance Digital Collections by Andrew Bullen appeared in the October 2008 issue of Computers in Libraries.
The wonderful Web 2.0 is a famously slippery concept to define. The very ambiguity of the term is Escheresque, self-referential to its ever-changing meaning. As Tim O’Reilly, CEO of O’Reilly Media, described it, “Like many important concepts, Web 2.0 doesn’t have a hard boundary, but rather, a gravitational core.” As Illinois State Library’s information technology coordinator, I have come to realize that embracing this essential Web 2.0 philosophy is a useful tool in unlocking the true potential of digital collections. In fact, the central premise behind this article is that until we embrace Web 2.0 concepts, digital repositories cannot evolve beyond very useful cataloging tools.