Friday, February 03, 2012

Code{4}lib Journal

The latest issue of the Code{4}lib Journal has some articles of interest.
HTML5 Microdata and Schema.org
Jason Ronallo

On June 2, 2011, Bing, Google, and Yahoo! announced the joint effort Schema.org. When the big search engines talk, Web site authors listen. This article is an introduction to Microdata and Schema.org. The first section describes what HTML5, Microdata and Schema.org are, and the problems they have been designed to solve. With this foundation in place section 2 provides a practical tutorial of how to use Microdata and Schema.org using a real life example from the cultural heritage sector. Along the way some tools for implementers will also be introduced. Issues with applying these technologies to cultural heritage materials will crop up along with opportunities to improve the situation.
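For readers who have not seen Microdata before, here is a minimal sketch of what the markup looks like and how a program might read it back out. It is not taken from the article: the sample HTML and the MicrodataParser class are hypothetical, and the parser handles only flat, non-nested items. The itemscope, itemtype, and itemprop attributes are the ones defined by HTML5 Microdata, and Photograph, name, and dateCreated are Schema.org terms.

```python
from html.parser import HTMLParser

# Hypothetical sample markup for a digitized photograph.
SAMPLE = """
<div itemscope itemtype="http://schema.org/Photograph">
  <h1 itemprop="name">Students on the quad</h1>
  <span itemprop="dateCreated">1968</span>
</div>
"""


class MicrodataParser(HTMLParser):
    """Collect itemprop text values from a flat (non-nested) Microdata item."""

    def __init__(self):
        super().__init__()
        self.item_type = None
        self.properties = {}
        self._current_prop = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # itemscope marks the start of an item; itemtype names its vocabulary type.
        if "itemscope" in attrs:
            self.item_type = attrs.get("itemtype")
        # itemprop names the property carried by this element's text.
        if "itemprop" in attrs:
            self._current_prop = attrs["itemprop"]

    def handle_data(self, data):
        # The first non-blank text after an itemprop tag is taken as its value.
        if self._current_prop and data.strip():
            self.properties[self._current_prop] = data.strip()
            self._current_prop = None


parser = MicrodataParser()
parser.feed(SAMPLE)
print(parser.item_type)
print(parser.properties)
```

A real extractor would also need to handle nested items, itemref, and values carried in attributes such as href or content, which this sketch ignores.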

Using VuFind, XAMPP, and Flash Drives to Build an Offline Library Catalog for Use in a Liberal Arts in Prison Program
Julia Bauder

When Grinnell College expanded its Liberal Arts in Prison Program to include the First Year of College Program in the Newton Correctional Facility, the Grinnell College Libraries needed to find a way to support the research needs of inmates who had no access to the Internet. The library used VuFind running on XAMPP installed on flash drives to provide access to the Libraries’ catalog. Once a student identified a book, it would be delivered from the Libraries on request. This article describes the process of getting VuFind operating in an environment with no Internet access and limited control of the computing environment.

Improving the presentation of library data using FRBR and Linked data
Anne-Lena Westrum, Asgeir Rekkavik, Kim Tallerås

When a library end-user searches the online catalogue for works by a particular author, he will typically get a long list that contains different translations and editions of all the books by that author, sorted by title or date of issue. As an attempt to make some order in this chaos, the Pode project has applied a method of automated FRBRizing based on the information contained in MARC records. The project has also experimented with RDF representation to demonstrate how an author’s complete production can be presented as a short and lucid list of unique works, which can easily be browsed by their different expressions and manifestations. Furthermore, by linking instances in the dataset to matching or corresponding instances in external sets, the presentation has been enriched with additional information about authors and works.

Presenting results as dynamically generated co-authorship subgraphs in semantic digital library collections
James Powell, Tamara M. McMahon, Ketan Mane, Laniece Miller, Linn Collins

Semantic web representations of data are by definition graphs, and these graphs can be explored using concepts from graph theory. This paper demonstrates how semantically mapped bibliographic metadata, combined with a lightweight software architecture and Web-based graph visualization tools, can be used to generate dynamic authorship graphs in response to typical user queries, as an alternative to more common text-based results presentations. It also shows how centrality measures and path analysis techniques from social network analysis can be used to enhance the visualization of query results. The resulting graphs require modestly more cognitive engagement from the user but offer insights not available from text.
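To make the idea concrete, here is a rough sketch of how a co-authorship graph and one simple centrality measure could be computed. It is not the authors' architecture: the paper list, author names, and function names are hypothetical, and degree centrality stands in for whichever measures the paper actually uses.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical records: each entry is the author list of one paper.
papers = [
    ["Author A", "Author B", "Author C"],
    ["Author A", "Author D"],
    ["Author B", "Author C"],
]


def coauthor_graph(papers):
    """Return an adjacency map linking every pair of authors who share a paper."""
    graph = defaultdict(set)
    for authors in papers:
        for a, b in combinations(sorted(set(authors)), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def degree_centrality(graph):
    """Degree centrality: number of co-authors divided by the maximum possible (n - 1)."""
    n = len(graph)
    if n < 2:
        return {node: 0.0 for node in graph}
    return {node: len(neighbours) / (n - 1) for node, neighbours in graph.items()}


graph = coauthor_graph(papers)
centrality = degree_centrality(graph)
for author, score in sorted(centrality.items(), key=lambda kv: -kv[1]):
    print(f"{author}: {score:.2f}")
```

In this sketch an author who only ever publishes alone never enters the graph, since nodes are created only when a co-author pair is found.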

On Dentographs, A New Method of Visualizing Library Collections
William Denton

A dentograph is a visualization of a library’s collection built on the idea that a classification scheme is a mathematical function mapping one set of things (books or the universe of knowledge) onto another (a set of numbers and letters). Dentographs can visualize aspects of just one collection or can be used to compare two or more collections. This article describes how to build them, with examples and code using Ruby and R, and discusses some problems and future directions.
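The "classification scheme as a function" idea can be sketched in a few lines. The article's own code is in Ruby and R; the Python below is only an illustration under stated assumptions: the call numbers are made up, the names classify and tally are hypothetical, and the mapping (class letters plus whole class number) is a simplification of Library of Congress call numbers rather than the author's method.

```python
import re
from collections import Counter

# Hypothetical call numbers standing in for a collection.
call_numbers = [
    "QA76.73 .R83",
    "QA76.9 .D3",
    "Z699.35 .M28",
    "PS3545 .I5365",
    "Z699 .A1",
]

# One to three class letters followed by the whole-number part of the class.
LCC_PATTERN = re.compile(r"^([A-Z]{1,3})\s*(\d+)")


def classify(call_number):
    """Map a call number to a (class letters, class number) point, or None."""
    match = LCC_PATTERN.match(call_number.strip().upper())
    if match is None:
        return None
    return match.group(1), int(match.group(2))


def tally(call_numbers):
    """Count how many items fall on each point; unparseable numbers are skipped."""
    points = (classify(cn) for cn in call_numbers)
    return Counter(p for p in points if p is not None)


counts = tally(call_numbers)
for point, count in sorted(counts.items()):
    print(point, count)
```

The resulting counts per point are the kind of data that could then be plotted as a grid or heat map, and two collections could be compared by tallying each one separately.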

Tuesday, January 31, 2012

Metadata Harvested

Jason Ronallo at Preliminary Inventory of Digital Collections writes about Common Crawl, Web Data Commons, and Microdata.
The other day I discovered the Web Data Commons, which is building on top of the Common Crawl to extract Microformat, Microdata, and RDFa data and make it available for free download. This means that there is starting to be free structured data from a big portion of the Web available for anyone to play with at very low cost. Common Crawl takes care of the crawling and then Web Data Commons will do data extraction. This opens up new possibilities for services, specialized search, and aggregations of content. Big web data is being opened up for small startups and individuals.
Is your library being crawled? Does it have metadata able to be harvested? Should it? Just asking.

MARCXML to MODS 3.4 XSLT (Revision 1.75)

The revised version of MARCXML to MODS 3.4 XSLT has been announced.
The Library of Congress' MARCXML to MODS 3.4 XSLT stylesheet (Revision 1.75) http://www.loc.gov/standards/mods/v3/MARC21slim2MODS3-4.xsl is now available--it incorporates edits made in response to comments received since the release of Revision 1.74.

The MODS 3.4 XSLT is based on the MARC to MODS 3.4 mapping made available by the Library of Congress in July of 2010 http://www.loc.gov/standards/mods/mods-mapping.html. The mapping and the XSLT are also available via the Library of Congress' MODS Web site. They will be revised periodically as users' comments are received and as subsequent MODS Editorial Committee analysis and decisions evolve.
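As a rough illustration of what a single line of such a mapping does, the sketch below copies a MARCXML 245 $a into a MODS titleInfo/title element. It is not the Library of Congress stylesheet and covers only this one field; the sample record and the function name title_to_mods are hypothetical, and trimming trailing punctuation is a simplifying assumption made here.

```python
import xml.etree.ElementTree as ET

MARC_NS = "http://www.loc.gov/MARC21/slim"
MODS_NS = "http://www.loc.gov/mods/v3"

# Hypothetical one-field MARCXML record.
MARCXML = f"""
<record xmlns="{MARC_NS}">
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">Sample title /</subfield>
    <subfield code="c">A. Author.</subfield>
  </datafield>
</record>
"""


def title_to_mods(marcxml):
    """Build a MODS element holding the title found in MARC 245 $a, if any."""
    record = ET.fromstring(marcxml)
    path = f"{{{MARC_NS}}}datafield[@tag='245']/{{{MARC_NS}}}subfield[@code='a']"
    subfield = record.find(path)
    mods = ET.Element(f"{{{MODS_NS}}}mods")
    if subfield is not None and subfield.text:
        title_info = ET.SubElement(mods, f"{{{MODS_NS}}}titleInfo")
        title = ET.SubElement(title_info, f"{{{MODS_NS}}}title")
        # Assumption: drop trailing separator punctuation such as " /".
        title.text = subfield.text.rstrip(" /:;,")
    return mods


ET.register_namespace("mods", MODS_NS)
mods = title_to_mods(MARCXML)
print(ET.tostring(mods, encoding="unicode"))
```

For real conversions the published stylesheet should be run with an XSLT processor, since it handles the full mapping rather than one subfield.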

Monday, January 30, 2012

VuFind 1.3 Released

VuFind, the library portal software, has a new version.
The latest version of the VuFind Open Source discovery software has just been released.

The new release includes several significant enhancements:
  • A new "book bag" feature has been added for shopping-cart-style bulk actions (save, email, export multiple records).
  • VuFind is now driven by Apache Solr 3.5, the latest version of the powerful index engine.
  • New optional search plug-ins have been added for visual timelines, Google Maps integration and Europeana searches.
  • Enhanced RSS feeds allow VuFind results to be easily shared with external services such as Elsevier's SciVerse platform.
  • Syndetics integration has been improved.
  • VuFind's default theme now uses jQuery and Blueprint for a more dynamic, polished interface.
Additionally, several bug fixes and minor improvements have been incorporated.

Friday, January 27, 2012

Additions to Source Codes for Vocabularies, Rules, and Schemes

The source code listed below has been recently approved. The code will be added to the applicable Source Codes for Vocabularies, Rules, and Schemes list. See the specific source code list for current usage in MARC fields and MODS/MADS elements.

The code should not be used in exchange records until 60 days after the date of this notice to provide implementers time to include the newly-defined code in any validation tables.

Subject Heading and Term Source Codes

The following source code has been added to the Subject Heading and Term Source Codes list for usage in appropriate fields and elements.

Addition:
collett
Collett-bibliografi: litteratur av og om Camilla Collett (Oslo: Nasjonalbiblioteket)

Publication of RDA terms for Content, Carrier, Media type Vocabularies

News about RDA vocabularies.
The Joint Steering Committee for Development of RDA (JSC), the DCMI Bibliographic Metadata Task Group (formerly DCMI/RDA Task Group), and ALA Publishing (on behalf of the co-publishers of RDA) are pleased to announce the publication of a second set of vocabulary terms as linked open data. The RDA Carrier Type, Content Type and Media Type vocabularies have been reviewed, approved, and their status in the Open Metadata Registry (OMR) changed to ‘published.’ The finished vocabularies can be viewed following the links from the terms above. (The links lead to the description of the vocabulary itself; the specific terms can be viewed under the tab for ‘concepts’.)

Terms in the Content Type vocabulary refer to the intellectual or artistic content of a resource, such as text or notated music; terms in the Carrier Type vocabulary refer to the means and methods by which content is conveyed including volume, sheet, computer disk; terms in the Media Type vocabulary specify the general type of intermediation device (if any) required to view, play or run the content of a resource. These vocabularies are derived from the RDA/ONIX framework for resource categorization which established an extensible methodology for categorization of resources according to content and carrier.

Wednesday, January 25, 2012

Cute Catalog

The first operational eXtensible Catalog is Cute.Catalog at Kyushu University Library.
Cute.Catalog completely covers the bibliographic information of academic resources in Kyushu University which contain not only library holdings but also research output produced by Kyushu University researchers.

Cute.Catalog http://catalog.lib.kyushu-u.ac.jp/en

Cute.Catalog includes:
  • Research Outputs by Kyushu University Researchers: 250 thousand
  • Library Holdings of Printed Materials in Kyushu University: Bibliographies 1.6 million, Holdings 4 million
  • Accessible e-Journals: 51 thousand, e-Books: 53 thousand
  • Institutional Repository records: 17 thousand
  • Digital Collection: 10 thousand
Key enhanced features are:
  1. advanced search
  2. online link with 360 Link XML API
  3. labels marking institutional output
  4. social links and exporting features and more...