Friday, June 06, 2008

More MARBI News

Some more MARBI news.

The following papers are available for review by the MARC community:
  • Proposal No. 2008-04: Changes to Nature of entire work and nature of content codes in field 008 of the MARC 21 Bibliographic Format
  • Proposal No. 2008-09: Definition of Videorecording format codes in field 007/04 of the MARC 21 Bibliographic Format
  • Proposal No. 2008-10: Definition of a subfield for Other standard number in field 534 of the MARC 21 Bibliographic Format
Additional proposals and discussion papers will be posted shortly.

The draft agenda for the 2008 ALA Annual MARBI meetings is available online.

Please note that MARBI will likely also meet during its Monday afternoon time slot, 1:30-3:30, to continue the discussion.

Skype News

Skype now lets you set your mobile number as your caller-id on outgoing calls. Very nice. I'm set up.

ALA Annual MARBI Meeting


The following papers are available for review by the MARC community:

  • Proposal No. 2008-06: Adding information associated with the Series Added Entry fields (800-830)
  • Proposal No. 2008-07: Making field 440 (Series Statement/Added Entry--Title) obsolete in the MARC 21 Bibliographic Format
  • Proposal No. 2008-08: Definition of subfield $z in field 017 of the MARC 21 Bibliographic and addition of the field to the MARC 21 Holdings formats
  • Discussion Paper 2008-DP06: Coding deposit programs as methods of acquisitions in field 008/07 of the MARC 21 holdings format
Additional proposals and discussion papers will be posted shortly.

The draft agenda for the 2008 ALA Annual MARBI meetings will be made available soon.

Wednesday, June 04, 2008

Yahoo Search Monkey

Yahoo's SearchMonkey is another step towards the Semantic Web.
SearchMonkey is fundamentally about transforming the way search results are compiled and displayed by leveraging the same structured data that powers the millions of pages indexed by Yahoo! Search. By sharing structured data with Yahoo!, site owners and content publishers can build more useful, relevant and visually appealing search results, which can increase the quantity and quality of traffic from Yahoo! Search....

You can share data by embedding microformats, using semantic web standards such as RDF, sharing an XML data feed directly with Yahoo! Search, or using the SearchMonkey developer tool to build custom data services that extract structured data from your pages.
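As a rough sketch of the microformat route: a page embeds labeled markup, and a consumer pulls the fields back out. The hCard-style class names and the page snippet below are illustrative assumptions, not SearchMonkey's actual schema.

```python
from html.parser import HTMLParser

# A page snippet carrying hCard-style microformat markup (the class
# names follow the hCard convention; SearchMonkey's own schemas differ).
PAGE = """
<div class="vcard">
  <span class="fn">Ada Lovelace</span>
  <span class="org">Analytical Engine Society</span>
</div>
"""

class MicroformatExtractor(HTMLParser):
    """Collect text inside elements whose class matches a known property."""
    PROPS = {"fn", "org"}

    def __init__(self):
        super().__init__()
        self.current = None
        self.data = {}

    def handle_starttag(self, tag, attrs):
        classes = dict(attrs).get("class", "").split()
        hit = self.PROPS.intersection(classes)
        if hit:
            self.current = hit.pop()

    def handle_data(self, data):
        if self.current and data.strip():
            self.data[self.current] = data.strip()
            self.current = None

extractor = MicroformatExtractor()
extractor.feed(PAGE)
print(extractor.data)
```

The same structured fields could equally be exposed as RDF or as a direct XML feed; the point is that once they are machine-readable, a search engine can reassemble them into a richer result display.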

LibriVox

LibriVox is becoming a valuable resource for free audio books. They just reached 1,500 titles in the collection.
We’ve had a pretty extraordinary May. We cataloged our 1,500th book, James Baldwin’s children’s history book, Four Great Americans, which was a great accomplishment. (Considering seven months ago we were at 1,000).

But we also had an impressively productive month: we released 115 (!) audiobooks into the public domain, almost four per day. Our previous record for monthly production was 77, reached in July 2007.
Is anyone cataloging these and adding them to their collection? Burning them to CDs and adding those to the collection? A few months back the Nebraska Library Commission made news by adding a few books licensed under Creative Commons to their catalog. Anyone doing the same for the LibriVox materials? Adding the records to OCLC for sharing or making them available via OAI-PMH?
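For anyone exploring the OAI-PMH route, a harvest request is just an HTTP GET with a few query parameters. A minimal sketch, assuming a hypothetical repository base URL (a real harvester would fetch the returned XML and follow resumptionToken until the list is exhausted):

```python
from urllib.parse import urlencode

def oai_request(base_url, verb, **kwargs):
    """Build an OAI-PMH request URL (protocol v2.0 query parameters)."""
    params = {"verb": verb, **kwargs}
    return base_url + "?" + urlencode(params)

# Hypothetical repository and set name, purely for illustration.
url = oai_request("http://example.org/oai", "ListRecords",
                  metadataPrefix="oai_dc", set="audiobooks")
print(url)
```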

Code4Lib Conference

The video from the Code4Lib Conference is now on Archive.org. Note that you can get the high-definition MPEG2 versions there. Some talks include:
  • MARCThing: Casey Durfee discusses MARCThing, a self-contained web service that aims to do for MARC and Z39.50 what Solr did for searching.
  • OpenURL: Ross Singer and Jonathan Rochkind describe Ümlaut, an open-source OpenURL middleware layer intended to improve the link-resolving chain by analyzing incoming citations and intelligently querying resources to better enable access to them.
  • Blacklight: Bess Sadler describes Blacklight, a Solr-based OPAC replacement being developed by the University of Virginia Library.
  • Scriblio: Casey Bisson describes Scriblio, an OPAC replacement based on the WordPress authoring system.
  • A Metadata Registry: Jon Phipps gives an introduction to the Metadata Registry, an open-source vocabulary, metadata schema, and DC application profile manager and registry.
And plenty more.

Tuesday, June 03, 2008

Object Reuse and Exchange (ORE) Specifications

The Open Archives Initiative has announced the public beta release of Object Reuse and Exchange Specifications.
Over the past eighteen months the Open Archives Initiative (OAI), in a project called Object Reuse and Exchange (OAI-ORE), has gathered international experts from the publishing, web, library, and eScience community to develop standards for the identification and description of aggregations of online information resources. These aggregations, sometimes called compound digital objects, may combine distributed resources with multiple media types including text, images, data, and video. The goal of these standards is to expose the rich content in these aggregations to applications that support authoring, deposit, exchange, visualization, reuse, and preservation. Although a motivating use case for the work is the changing nature of scholarship and scholarly communication, and the need for cyberinfrastructure to support that scholarship, the intent of the effort is to develop standards that generalize across all web-based information including the increasingly popular social networks of “web 2.0”.

Monday, June 02, 2008

FGDC Digital Cartographic Standard for Geologic Map Symbolization

Found this sitting in the draft folder for quite some time. Here it is at last. The PostScript version of the FGDC Digital Cartographic Standard for Geologic Map Symbolization is now available as a USGS Techniques and Methods publication.


Improving Subject Searching

Improving subject searching in databases through a combination of descriptors and UDC, by Mariangels Granados and Anna Nicolau (2008). In Proceedings of BOBCATSSS'08: Providing Access for Everyone, Zadar, Croatia.
Problems with subject access to online catalogues and databases are not new. Studies on the use of OPACs have revealed two apparently endemic problems: on the one hand, the large number of searches with zero hits (failed searches) and on the other, the retrieval of an excessive amount of bibliographic records (information overload).

In this paper we describe a new information retrieval technique based on the combination of descriptor weighting and the use of the Universal Decimal Classification (UDC) call numbers.

The use of classification call numbers in order to search the catalogue has traditionally been very restricted. In most catalogues, call numbers are used only as topographical indicators and are not searchable. The new system described here makes much fuller use of them.

The system is based on the hypothesis that a set of descriptors corresponds to a UDC call number. Through the analysis of the frequency of distribution of descriptors and call numbers, we create a set of clusters that allow increasing precision and recall. At the same time, these clusters offer alternative search modes, making it possible to systematize the indexing process and increase its consistency. Here we present a case study of the use of the system with the ERIC database.
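A toy sketch of the descriptor-to-UDC clustering idea, using invented records and simple co-occurrence counts (the paper's actual weighting scheme is more sophisticated than this):

```python
from collections import Counter, defaultdict

# Toy records: (descriptor set, UDC call number). In the paper the
# pairs come from an indexed database such as ERIC; these are invented.
records = [
    ({"teaching", "mathematics"}, "51:37"),
    ({"teaching", "mathematics", "curriculum"}, "51:37"),
    ({"teaching", "reading"}, "372.4"),
]

# Count how often each descriptor co-occurs with each call number.
cooccur = defaultdict(Counter)
for descriptors, udc in records:
    for d in descriptors:
        cooccur[d][udc] += 1

def suggest_udc(descriptor):
    """Most frequent UDC number for a descriptor, i.e. its cluster anchor."""
    return cooccur[descriptor].most_common(1)[0][0]

print(suggest_udc("mathematics"))  # counts point at "51:37"
```

Clustering descriptors around the call numbers they most often accompany is what lets the system offer call numbers as a searchable access point rather than a mere shelf location.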

Friday, May 30, 2008

Tag Cleaner

Bring some consistency to your tagging with Delicious Tag Cleaner.
What would a "Delicious Tag Cleaner" be? It is a tool for removing unnecessary tags from your del.icio.us account....

If you're like me, you probably have thousands of bookmarks collected over years and years of web surfing and hundreds of tags used to describe them. But the thing is that over these months/years you haven't been able to come up with a consistent taxonomy for your tags.

I have, for example, dozens of different tags for expressing links related to software development: "dev", "devel", "development" etc.

So this tool suggests tags that could be merged; you can choose them one by one and have it merge the chosen tags on your Delicious account.
As you clean up tags, doesn't that remove them from the stream-of-consciousness thing? Don't they lose their value and become subject headings? Poor ones, at that.
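A crude merge suggester along these lines can be built on plain string similarity; difflib here is a stand-in for whatever the actual tool uses:

```python
from difflib import SequenceMatcher
from itertools import combinations

def suggest_merges(tags, threshold=0.6):
    """Pair up tags whose similarity ratio meets the threshold."""
    pairs = []
    for a, b in combinations(sorted(tags), 2):
        if SequenceMatcher(None, a, b).ratio() >= threshold:
            pairs.append((a, b))
    return pairs

tags = ["dev", "devel", "development", "recipes", "cooking"]
suggestions = suggest_merges(tags)
print(suggestions)
```

The "dev"/"devel"/"development" cluster from the quote above surfaces as candidate pairs, while unrelated tags stay apart; the user still makes the final call on each merge.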

Statement of International Cataloguing Principles

A reminder from IFLA about the Statement of International Cataloguing Principles.
This is a reminder announcement that the Statement of International Cataloguing Principles developed by the five IFLA Meetings of Experts on an International Cataloguing Code is now available for worldwide review and comment.

A vote form is also available there and can be used by anyone to indicate whether they approve the statement or not and to make comments. The form can be printed out, filled in, and faxed, or it can be filled in electronically and sent as an e-mail attachment.

Wednesday, May 28, 2008

2.0 Speaking Opportunities

Any folks who want to represent the library community in an education 2.0 setting should check out CR 2.0. They are holding a series of 20 workshops around the U.S., using an unconference format. Go to their website and suggest a topic, and the folks attending vote on what they want to hear. Even if you don't become a facilitator for a discussion, at least they will have seen that libraries are part of education 2.0. Just participating in the discussion might open some eyes to the role of libraries in education.

Tuesday, May 27, 2008

Tagging @ NASA

NASA has a tag cloud on their home page, generated from the words used to search the site. Look to the right, a bit down the page. It sports a nice star field background.
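Generating such a cloud is straightforward: count the search terms and scale each term's display size by its relative frequency. A minimal sketch:

```python
from collections import Counter

def tag_cloud(terms, min_size=10, max_size=32):
    """Map each term to a font size proportional to its frequency."""
    counts = Counter(terms)
    lo, hi = min(counts.values()), max(counts.values())
    span = (hi - lo) or 1  # avoid division by zero when all counts match
    return {t: min_size + (n - lo) * (max_size - min_size) // span
            for t, n in counts.items()}

# Invented search log: the most-searched term gets the largest type.
searches = ["mars"] * 9 + ["shuttle"] * 5 + ["apollo"]
print(tag_cloud(searches))
```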

Friday, May 23, 2008

Web Ontology Language (OWL)

Some papers from HP Labs concerning the Web Ontology Language (OWL)
  • An OWL Full Interpretation by Jeremy Carroll HPL-2008-60

    This report is an appendix to report HPL-2008-59. It gives a worked example of the construction used in the proof from that report. For finiteness, a reduced datatype map consisting of only xsd:boolean is used. Each of the graphs in the construction is listed explicitly, with some redundancy eliminated. The final Herbrand graph contains about 15,000 triples.

  • The Consistency of OWL Full (with proofs) by Jeremy Carroll and Dave Turner HPL-2008-59

    We show that OWL1 Full without the comprehension principles is consistent, and does not break most RDF graphs that do not use the OWL vocabulary. We discuss the role of the comprehension principles in OWL semantics, and how to maintain the relationship between OWL Full and OWL DL by reinterpreting the comprehension principles as permitted steps when checking an entailment, rather than as model theoretic principles constraining the universe of interpretation. Starting with such a graph we build a Herbrand model, using, amongst other things, an RDFS ruleset, and syntactic analogs of the semantic "if and only if" conditions on the RDFS and OWL vocabulary. The ordering of these steps is carefully chosen, along with some initialization data, to break the cyclic dependencies between the various conditions. The normal Herbrand interpretation of this graph as its own model then suffices. The main result follows by using an empty graph in this construction. We discuss the relevance of our results, both to OWL2, and more generally to a future revision of the Semantic Web recommendations. This longer version contains the proofs.

  • The Consistency of OWL Full by Jeremy Carroll and Dave Turner HPL-2008-58

    We show that OWL1 Full without the comprehension principles is consistent, and does not break most RDF graphs that do not use the OWL vocabulary. We discuss the role of the comprehension principles in OWL semantics, and how to maintain the relationship between OWL Full and OWL DL by reinterpreting the comprehension principles as permitted steps when checking an entailment, rather than as model theoretic principles constraining the universe of interpretation. Starting with such a graph we build a Herbrand model, using, amongst other things, an RDFS ruleset, and syntactic analogs of the semantic "if and only if" conditions on the RDFS and OWL vocabulary. The ordering of these steps is carefully chosen, along with some initialization data, to break the cyclic dependencies between the various conditions. The normal Herbrand interpretation of this graph as its own model then suffices. The main result follows by using an empty graph in this construction. We discuss the relevance of our results, both to OWL2, and more generally to a future revision of the Semantic Web recommendations. Publication info: submitted to ISWC 2008, the 7th International Semantic Web Conference, Karlsruhe.

MARC 2 MODS Tool

The Digital Library Federation announces a revision to their MARCXML to MODS tool.
The DLF Aquifer Metadata Working Group announces an update to the XML stylesheet they have developed for the Aquifer project, for conversion of MARCXML records to MODS. The current stylesheet, DLF_MARC2MODS_1.34.xsl, can be found from a link on our MARC to Aquifer MODS XSLT Stylesheet page. Changes are briefly documented in the comments at the beginning of the stylesheet. We have also updated the Introduction pages that give more detail about some of the changes.

The changes include:
  • re-added mapping of tag 510 citations to the note element, for monographs only
  • added subject:hierarchicalGeographic element mapping of tag 662 (Subject - Hierarchical Place Name)
  • added mapping of tags 561 (ownership) and 581 (publications) to the note element
  • removed mapping of the 007 specific material designation to the genre element when the value is "remote"
  • a correction to no longer repeat mapping of dates from the Leader to originInfo:date when the date type is "questionable"
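As an illustration of one of those mappings, here is a sketch of copying field 510 into a MODS-style note element. Namespaces are omitted, the sample record is invented, and the real stylesheet is of course XSLT, not Python:

```python
import xml.etree.ElementTree as ET

# A stripped-down MARCXML record (namespaces and indicators omitted).
MARCXML = """<record>
  <datafield tag="510"><subfield code="a">Indexed by Chem. Abstr.</subfield></datafield>
</record>"""

def marc510_to_mods_note(marc_xml):
    """Copy field 510 (citation) into a MODS-style <note> element,
    mirroring what the stylesheet now does for monographs."""
    record = ET.fromstring(marc_xml)
    mods = ET.Element("mods")
    for df in record.iter("datafield"):
        if df.get("tag") == "510":
            note = ET.SubElement(mods, "note")
            note.text = "".join(sf.text or "" for sf in df.iter("subfield"))
    return ET.tostring(mods, encoding="unicode")

mods_xml = marc510_to_mods_note(MARCXML)
print(mods_xml)
```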

Tuesday, May 20, 2008

MARC Update

Update No. 8 (October 2007) was recently released in multiple document formats. It includes changes made to the MARC 21 formats resulting from proposals which were considered by the ALA ALCTS/LITA/RUSA Machine-Readable Bibliographic Information Committee (MARBI), the Canadian Committee on MARC (CCM) and the BIC Bibliographic Standards Group in 2007.

The printed update is available through the Cataloging Distribution Service. It includes pages for fields that have been changed, with changes marked by side lining. PDFs of those printed update pages are also available online.

D-Lib Magazine

The May/June 2008 issue of D-Lib Magazine is now available.

Some articles of interest include:
  • PREMIS With a Fresh Coat of Paint: Highlights from the Revision of the PREMIS Data Dictionary for Preservation Metadata, by Brian F. Lavoie, OCLC Online Computer Library Center
  • Adding Value to the Library Catalog by Implementing a Recommendation System, by Michael Moennich and Marcus Spiering, Karlsruhe University Library
I found the one on the recommendation system interesting. They are selling the service as an add-on to the OPAC. LibraryThing for Libraries is doing the same with their data. Syndetics has been doing this for quite some time with cover images and reviews. There seems to be a trend here: third-party additions to the OPAC supplying services based on data collected elsewhere. In the article world, there has been some research collecting OpenURL data to rate papers.

Monday, May 19, 2008

xOCLCnum

A new service from OCLC.
I'd like to announce and invite you to try xOCLCnum, the latest in the xIdentifier family of Web services from OCLC.

Just as xISBN allows you to find all related editions of a book by entering its ISBN, xOCLCnum does the same thing using OCLC number.

xOCLCnum is queried using a simple URL format, and returns an XML response with both related OCLCnums and related ISBNs (if any). It is designed to be easily built into your library application, so you can expand queries, find all related editions, or do whatever creative thing you want to do.

Background:
ISBNs have been assigned since 1970 to most, but not all, books published.

OCLC numbers are assigned whenever a record is added to WorldCat, OCLC's global union catalog. These records cover a large portion of all books, old and new, held by any library in North America and, increasingly, other regions worldwide (most recently, the National Library of China).

So the coverage range of OCLC numbers is, not surprisingly, far greater than that of ISBNs: in WorldCat, for example, around 100 million OCLCnums compared to about 20 million ISBNs.

More Information on xOCLCnum
xOCLCnum API description
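A sketch of how a client might use the service; the endpoint URL, query parameters, and response element names below are assumptions based on the announcement, not the documented API:

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote

# Assumed endpoint, patterned after the xISBN service family.
BASE = "http://xisbn.worldcat.org/webservices/xid/oclcnum/"

def xoclcnum_url(oclcnum, fmt="xml"):
    """Build a query URL in the style of the xID services (shape guessed)."""
    return f"{BASE}{quote(str(oclcnum))}?method=getEditions&format={fmt}"

# A response of the shape the announcement describes: related OCLC
# numbers plus any related ISBNs. The element names are assumptions.
SAMPLE = """<rsp stat="ok">
  <oclcnum>1234567</oclcnum>
  <oclcnum>7654321</oclcnum>
  <isbn>0596000278</isbn>
</rsp>"""

def parse_editions(xml_text):
    """Split a response into related OCLC numbers and related ISBNs."""
    root = ET.fromstring(xml_text)
    return ([e.text for e in root.iter("oclcnum")],
            [e.text for e in root.iter("isbn")])

print(xoclcnum_url(1234567))
print(parse_editions(SAMPLE))
```

With something this small in the middleware layer, a catalog could expand any OCLC number into the full edition cluster before searching holdings.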

1:30 Ratio for Information

The post at Librarian.net about the book containing thirty tables of contents reminded me of the 1:30 rule for information.
Dolby and Resnikoff found these relationships:
  • A book title is 1/30 the length of a table of contents in characters, on average
  • A table of contents is 1/30 the length of a back of the book index, on average
  • A back of the book index is 1/30 the length of the text of a book, on average
  • An abstract is 1/30 the length of the technical paper it represents, on average
Is this a result of living in the material world, one that won't hold true online? Or is it a function of the brain and how it deals with information, and thus likely to hold true wherever we function?
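Chaining the ratios gives a quick back-of-the-envelope check; starting from a hypothetical 1,350,000-character text:

```python
RATIO = 30

def cascade(text_chars):
    """Walk the 1:30 chain: text -> back-of-book index -> table of
    contents -> title, each level 1/30 the length of the one before."""
    index = text_chars // RATIO
    toc = index // RATIO
    title = toc // RATIO
    return index, toc, title

index, toc, title = cascade(1_350_000)
print(index, toc, title)  # 45000 1500 50
```

A 50-character title over a 1.35-million-character book is at least plausible, which is part of what made Dolby and Resnikoff's observation so striking.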