Friday, February 03, 2012

Code{4}lib Journal

The latest issue of the Code{4}lib Journal has some articles of interest.
HTML5 Microdata and Schema.org
Jason Ronallo

On June 2, 2011, Bing, Google, and Yahoo! announced the joint effort Schema.org. When the big search engines talk, Web site authors listen. This article is an introduction to Microdata and Schema.org. The first section describes what HTML5, Microdata and Schema.org are, and the problems they have been designed to solve. With this foundation in place section 2 provides a practical tutorial of how to use Microdata and Schema.org using a real life example from the cultural heritage sector. Along the way some tools for implementers will also be introduced. Issues with applying these technologies to cultural heritage materials will crop up along with opportunities to improve the situation.
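To make the idea concrete, here is a minimal sketch of reading Microdata back out of a page, using only Python's standard library. The sample markup, the class name MicrodataSketch and the function extract_microdata are illustrative assumptions and do not come from the article. The sketch handles a single flat item, not the full Microdata algorithm (no nested items, no itemref).

```python
from html.parser import HTMLParser

# Hypothetical sample markup, not taken from the article.
SAMPLE_HTML = """
<div itemscope itemtype="http://schema.org/Photograph">
  <h1 itemprop="name">Students on the Brickyard</h1>
  <span itemprop="dateCreated">1975</span>
  <a itemprop="url" href="http://example.org/photo/1">permalink</a>
</div>
"""


class MicrodataSketch(HTMLParser):
    """Collects itemprop name/value pairs from one flat (non-nested) item."""

    def __init__(self):
        super().__init__()
        self.item_type = None
        self.properties = {}
        self._pending = None  # itemprop whose text content is still expected

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "itemscope" in attrs:
            self.item_type = attrs.get("itemtype")
        prop = attrs.get("itemprop")
        if prop is None:
            return
        if tag == "a" and "href" in attrs:
            # For links, the property value is the href, not the link text.
            self.properties[prop] = attrs["href"]
        else:
            self._pending = prop

    def handle_data(self, data):
        # The first non-blank text after an itemprop tag becomes its value.
        if self._pending and data.strip():
            self.properties[self._pending] = data.strip()
            self._pending = None


def extract_microdata(html):
    """Return (itemtype, {property: value}) for the sample's single item."""
    parser = MicrodataSketch()
    parser.feed(html)
    return parser.item_type, parser.properties
```

Calling extract_microdata(SAMPLE_HTML) yields the Schema.org type URL and a dictionary of the three properties, which is roughly the structured view a search engine's crawler would build from the same markup.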

Using VuFind, XAMPP, and Flash Drives to Build an Offline Library Catalog for Use in a Liberal Arts in Prison Program
Julia Bauder

When Grinnell College expanded its Liberal Arts in Prison Program to include the First Year of College Program in the Newton Correctional Facility, the Grinnell College Libraries needed to find a way to support the research needs of inmates who had no access to the Internet. The library used VuFind running on XAMPP installed on flash drives to provide access to the Libraries’ catalog. Once the student identified a book, it would be delivered from the Libraries to students on request. This article describes the process of getting VuFind operating in an environment with no Internet access and limited control of the computing environment.

Improving the presentation of library data using FRBR and Linked data
Anne-Lena Westrum, Asgeir Rekkavik, Kim Tallerås

When a library end-user searches the online catalogue for works by a particular author, he will typically get a long list that contains different translations and editions of all the books by that author, sorted by title or date of issue. As an attempt to make some order in this chaos, the Pode project has applied a method of automated FRBRizing based on the information contained in MARC records. The project has also experimented with RDF representation to demonstrate how an author’s complete production can be presented as a short and lucid list of unique works, which can easily be browsed by their different expressions and manifestations. Furthermore, by linking instances in the dataset to matching or corresponding instances in external sets, the presentation has been enriched with additional information about authors and works.
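The grouping step can be sketched roughly as follows. This is not the Pode project's method, only an illustration that assumes records have already been reduced to dictionaries keyed by simplified MARC tags (100$a author, 240$a uniform title, 245$a title). The sample records, the "lang" and "year" keys, and all function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical, simplified records; the keys loosely mimic MARC tags.
RECORDS = [
    {"100a": "Hamsun, Knut", "240a": "Sult", "245a": "Hunger", "lang": "eng", "year": 1967},
    {"100a": "Hamsun, Knut", "245a": "Sult", "lang": "nor", "year": 1890},
    {"100a": "Hamsun, Knut", "240a": "Sult", "245a": "Hunger", "lang": "ger", "year": 1921},
    {"100a": "Hamsun, Knut", "245a": "Pan", "lang": "nor", "year": 1894},
]


def normalize(text):
    """Lowercase and collapse whitespace so near-identical strings match."""
    return " ".join(text.lower().split())


def work_key(record):
    """Identify the work by author plus uniform title, falling back to title."""
    title = record.get("240a") or record["245a"]
    return (normalize(record["100a"]), normalize(title))


def frbrize(records):
    """Group records as work -> expression (here: language) -> manifestations."""
    works = defaultdict(lambda: defaultdict(list))
    for record in records:
        works[work_key(record)][record["lang"]].append(record)
    return works
```

With the sample data, four catalogue records collapse into two works, and the work "Sult" can be browsed by its three language expressions instead of appearing as three unrelated hits.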

Presenting results as dynamically generated co-authorship subgraphs in semantic digital library collections
James Powell, Tamara M. McMahon, Ketan Mane, Laniece Miller, Linn Collins

Semantic web representations of data are by definition graphs, and these graphs can be explored using concepts from graph theory. This paper demonstrates how semantically mapped bibliographic metadata, combined with a lightweight software architecture and Web-based graph visualization tools, can be used to generate dynamic authorship graphs in response to typical user queries, as an alternative to more common text-based results presentations. It also shows how centrality measures and path analysis techniques from social network analysis can be used to enhance the visualization of query results. The resulting graphs require modestly more cognitive engagement from the user but offer insights not available from text.
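As a rough illustration of the graph concepts mentioned, the sketch below builds a co-authorship graph from author lists and computes degree centrality and a shortest path between two authors. It uses plain Python data structures, not the authors' semantic stack or visualization tools. The sample papers and function names are hypothetical.

```python
from collections import defaultdict, deque
from itertools import combinations

# Hypothetical records: each entry is the author list of one paper.
PAPERS = [
    ["Ada", "Ben", "Cy"],
    ["Ada", "Dee"],
    ["Dee", "Eli"],
]


def build_coauthor_graph(papers):
    """Undirected graph: authors are nodes, a shared paper creates an edge."""
    graph = defaultdict(set)
    for authors in papers:
        for a, b in combinations(sorted(set(authors)), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def degree_centrality(graph):
    """Fraction of the other authors that each author has written with."""
    n = len(graph)
    if n < 2:
        return {node: 0.0 for node in graph}
    return {node: len(neighbors) / (n - 1) for node, neighbors in graph.items()}


def shortest_path(graph, start, goal):
    """Breadth-first search; returns a list of authors, or None if unconnected."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbor in sorted(graph.get(path[-1], ())):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None
```

In a results display, centrality could drive node size and the path could be highlighted to show how two authors are connected through collaborators. Authors who only appear on single-author papers are not added as nodes in this sketch.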

On Dentographs, A New Method of Visualizing Library Collections
William Denton

A dentograph is a visualization of a library’s collection built on the idea that a classification scheme is a mathematical function mapping one set of things (books or the universe of knowledge) onto another (a set of numbers and letters). Dentographs can visualize aspects of just one collection or can be used to compare two or more collections. This article describes how to build them, with examples and code using Ruby and R, and discusses some problems and future directions.
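The article's own examples are in Ruby and R. The Python sketch below is not Denton's code and only illustrates the underlying idea: treat classification as a function from call numbers to class letters, then count how many items land in each class. It assumes Library of Congress style call numbers. The sample call numbers and function names are hypothetical.

```python
import re
from collections import Counter

# Hypothetical call numbers standing in for a small collection.
CALL_NUMBERS = ["QA76.73 .R83", "QA9 .H63", "Z699 .D46", "PS3545 .I5", "Z666.5 .T39"]


def lc_class(call_number):
    """Map a call number to its leading class letters, e.g. 'QA76.73' -> 'QA'."""
    match = re.match(r"\s*([A-Z]{1,3})", call_number.upper())
    return match.group(1) if match else None


def class_counts(call_numbers):
    """Count items per class; these counts are what a plot would shade."""
    counts = Counter()
    for call_number in call_numbers:
        cls = lc_class(call_number)
        if cls:
            counts[cls] += 1
    return counts


def compare(collection_a, collection_b):
    """Per-class difference in holdings (a minus b) for two collections."""
    a, b = class_counts(collection_a), class_counts(collection_b)
    return {cls: a[cls] - b[cls] for cls in sorted(set(a) | set(b))}
```

The counts from class_counts could be fed to any plotting library to shade a grid of classes, and compare gives the raw numbers for a side-by-side view of two collections.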

Tuesday, January 31, 2012

Metadata Harvested

Jason Ronallo at Preliminary Inventory of Digital Collections writes about Common Crawl, Web Data Commons, and Microdata.
The other day I discovered the Web Data Commons, which is building on top of the Common Crawl to extract Microformat, Microdata, and RDFa data and make it available for free download. This means that there is starting to be free structured data from a big portion of the Web available for anyone to play with at very low cost. Common Crawl takes care of the crawling and then Web Data Commons will do data extraction. This opens up new possibilities for services, specialized search, and aggregations of content. Big web data is being opened up for small startups and individuals.
Is your library being crawled? Does it have metadata able to be harvested? Should it? Just asking.

MARCXML to MODS 3.4 XSLT (Revision 1.75)

The revised version of MARCXML to MODS 3.4 XSLT has been announced.
The Library of Congress' MARCXML to MODS 3.4 XSLT stylesheet (Revision 1.75) http://www.loc.gov/standards/mods/v3/MARC21slim2MODS3-4.xsl is now available--it incorporates edits made in response to comments received since the release of Revision 1.74.

The MODS 3.4 XSLT is based on the MARC to MODS 3.4 mapping made available by the Library of Congress in July of 2010 http://www.loc.gov/standards/mods/mods-mapping.html. The mapping and the XSLT are also available via the Library of Congress' MODS Web site. They will be revised periodically as users' comments are received and as subsequent MODS Editorial Committee analysis and decisions evolve.

Monday, January 30, 2012

VuFind 1.3 Released

VuFind, the library portal software, has a new version.
The latest version of the VuFind Open Source discovery software has just been released.

The new release includes several significant enhancements:
  • A new "book bag" feature has been added for shopping-cart-style bulk actions (save, email, export multiple records).
  • VuFind is now driven by Apache Solr 3.5, the latest version of the powerful index engine.
  • New optional search plug-ins have been added for visual timelines, Google Maps integration and Europeana searches.
  • Enhanced RSS feeds allow VuFind results to be easily shared with external services such as Elsevier's SciVerse platform.
  • Syndetics integration has been improved.
  • VuFind's default theme now uses jQuery and Blueprint for a more dynamic, polished interface.
Additionally, several bug fixes and minor improvements have been incorporated.