Friday, August 21, 2009

Book Covers

I'd be interested in getting info about a good source for book covers. I've tried LibraryThing. It was simple, but the hit rate for covers in our collection was about 1%. How about Google Books or Open Library? Are there any legal problems with use? I know Amazon just changed the use agreement to make it much less open.

I'd like to see an example of the HTML link for any you use and recommend.
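To show the kind of link I have in mind, here is a minimal sketch that builds an HTML image tag from an ISBN. It assumes Open Library serves covers at a URL of the form covers.openlibrary.org/b/isbn/ISBN-SIZE.jpg with sizes S, M, and L; the function name cover_img_tag is my own, and the sketch says nothing about terms of use or hit rate.

```python
from html import escape

# Assumption: Open Library serves cover images at this URL pattern,
# with sizes S, M and L. Check the service's own documentation first.
COVER_URL = "https://covers.openlibrary.org/b/isbn/{isbn}-{size}.jpg"


def cover_img_tag(isbn, title, size="M"):
    """Return an HTML <img> tag pointing at a cover image for the given ISBN.

    cover_img_tag is a hypothetical helper, not part of any library.
    """
    if size not in {"S", "M", "L"}:
        raise ValueError("size must be 'S', 'M' or 'L'")
    # Keep digits and the X check character; drop hyphens and spaces.
    clean = "".join(ch for ch in isbn.upper() if ch.isdigit() or ch == "X")
    src = COVER_URL.format(isbn=clean, size=size)
    return '<img src="{}" alt="{}">'.format(
        escape(src, quote=True), escape("Cover of " + title, quote=True)
    )
```

Calling cover_img_tag("0-385-47257-9", "Zen Speaks") returns a tag whose src ends in 0385472579-M.jpg, which could be dropped into an OPAC record display template.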

Thursday, August 20, 2009

OCLC News

News from OCLC.
On Sunday, August 16, 2009, OCLC implemented the changes related to the OCLC-MARC Bibliographic, Authority, and Holdings Formats Update 2009. This includes MARC 21 Updates No. 8 (October 2007) and No. 9 (October 2008), MARC Code List changes since July 2008, and user and OCLC staff suggestions. OCLC Technical Bulletin 257, which presents the details, is available... If you have read or printed a version of TB 257 before the implementation, you may want to do so again, as there have been several substantive corrections and changes. Among the points of interest:
  • Defining Videorecording 007/04 (subfield $e) code "s" for Blu-ray Discs.
  • Linking ISSNs (ISSN-L) in bibliographic, authority, and holdings fields 022.
  • Changing field 041 to separately subfield subtitles/captions for moving images (subfield $j).
  • Validating codes for subfield $2 in field 047 (musical form) and 048 (medium of performance) for the respective code lists maintained by IAML (International Association of Music Libraries).
  • Implementing two new Dewey fields: 083 (Additional Dewey Decimal Classification Number) and 085 (Synthesized Classification Number Components).
  • Implementing the repeatable 260 field.
  • Making field 440 obsolete and converting appropriate 4XX/8XX combinations.
  • Defining new subfields in field 502 for dissertation details (degree, school, date, etc.).
  • Implementing new field 542 for Information Relating to Copyright Status.
  • Implementing subfield $0 (zero) for the Authority Record Control Number in 28 bibliographic fields and three authority fields.
Appropriate data conversions and re-indexing of WorldCat will begin following the August installation.
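
For anyone with local 440 fields to clean up, the general idea of the 440 change can be sketched as below: the series statement moves to a 490 with first indicator 1 (series traced), and the traced form goes in an 830. This is a simplified illustration, not OCLC's conversion logic. The dictionary layout and the function name convert_440 are my own assumptions, only subfields $a, $v, and $x are handled, and TB 257 remains the place to look for the real rules.

```python
def convert_440(field):
    """Turn one 440 field into a 490 (series statement) plus an 830 (series added entry).

    Hypothetical helper: fields are plain dicts with 'tag', 'ind1', 'ind2'
    and 'subfields' (a list of (code, value) pairs).
    """
    if field["tag"] != "440":
        raise ValueError("expected a 440 field")
    subfields = field["subfields"]
    # 490: first indicator 1 means the series is traced; second indicator is blank.
    f490 = {
        "tag": "490",
        "ind1": "1",
        "ind2": " ",
        "subfields": [(c, v) for c, v in subfields if c in ("a", "v", "x")],
    }
    # 830: first indicator is blank; the second indicator counts nonfiling
    # characters, so the 440 second indicator is carried over (an assumption
    # of this sketch; a real conversion may treat initial articles differently).
    f830 = {
        "tag": "830",
        "ind1": " ",
        "ind2": field["ind2"],
        "subfields": [(c, v) for c, v in subfields if c in ("a", "v")],
    }
    return [f490, f830]
```

Given a 440 with second indicator 4 and subfields $a and $v, this returns a 490 1_ and an 830 _4 carrying the same two subfields.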

All new searching and indexing capabilities; new fields, subfields, and indicators; and new codes can now be used in both Connexion browser and Connexion client.

Connexion browser:
  • Connexion browser users have changes to the dropdown lists for fixed field elements available for use immediately.
  • The National Bibliography Number index ("nn:") will not appear in the Connexion browser dropdown list until November 2009. In the meantime, it is available via the command line in Connexion browser. All other new indexes and new or changed language qualifiers appear in search screen dropdown lists.
Connexion client:
  • Beginning on Tuesday, August 18, 2009, Connexion client users will be prompted when they start the Connexion client software to download a new file with changes to the dropdown lists for fixed field elements. The window with the prompt is labeled "New Components Available." You do not need Administrative Privileges on your workstation to download the file. Say "Yes" to download the file immediately. Say "Remind me later" if you do not wish to download right away. You will be prompted to download the file each time you open Connexion client until you complete the download.
  • In Connexion client, all new indexes and new or changed language qualifiers do not appear in search screen dropdown lists, but may be input manually via the command line. The search screen dropdown lists of indexes and language qualifiers will be updated in the next version of the Connexion client.

Tuesday, August 18, 2009

Genre/Form Headings Presentation

Janis Young's presentation at the Library of Congress, "Expanding the Power of the Library's Family of Vocabularies: Genre/Form Headings," is now available.
In 2007, the Library of Congress embarked upon a project to create a system of genre/form headings, which describe what a work is rather than what it is about, as subject headings do. This presentation will explain the motivations for undertaking the project, including the need to anticipate the linked data requirements of the new generation of search engines and user interfaces, and also enumerate the authority record distribution channels, which furnish data for both human use and for data mining and computer manipulation. In addition, the presentation will address the practical impacts of this project on LC staff and users alike.
It requires Real Player (or Real Alternative) for playback.

Dewey Classification as Linked Data

News from the Dewey office.
For a long time, we wanted to do something with Linked Data. That is, apply Linked Data principles to parts of the Dewey Decimal Classification and present the data as a small “terminology service.” The service should respond to regular HTTP requests with either a machine- or a human-readable presentation of Dewey classes. There should be a URI (and, even better, a web page that delivers a useful description) for every Dewey concept, not just single classes. The data should be presented in a format that is capable of handling rich semantic information and in a way that allows users or user agents to just follow their nose to explore the data. For more complex stuff, the service should offer an API-like query access. Finally, the data that are presented should be reusable by anyone for non-commercial purposes.
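
The "machine- or human-readable" part is ordinary HTTP content negotiation: the same URI is requested with a different Accept header. A minimal sketch of preparing such a request follows. It assumes class URIs of the form http://dewey.info/class/NOTATION/ and that the service honors an Accept value such as application/rdf+xml; both are my assumptions, dewey_request is a hypothetical name, and nothing is actually sent over the network here.

```python
from urllib.request import Request

# Assumption: Dewey class URIs follow this pattern.
DEWEY_CLASS_URI = "http://dewey.info/class/{notation}/"


def dewey_request(notation, accept="text/html"):
    """Build (but do not send) an HTTP request for one Dewey class.

    accept picks the representation, e.g. 'text/html' for a person
    or 'application/rdf+xml' for a program.
    """
    url = DEWEY_CLASS_URI.format(notation=notation)
    return Request(url, headers={"Accept": accept})
```

Passing the returned object to urllib.request.urlopen would perform the request; whether the service still answers at that address is something to check before relying on it.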

Some results of these efforts are now available as dewey.info. We had to come up with a URI pattern for the DDC that would generate persistent identifiers for DDC concepts in a distributed environment. Secondly, we wanted to test out the RDF vocabulary SKOS for creating a representational model to express some of the best nuggets of DDC data (language-independent identifiers, multilingual terminology, and semantic relationships). And finally, because Linked Open Data is not really open when you have to ask someone before you can use it, we wanted to test out a Creative Commons license for easier reuse of DDC data for non-commercial purposes.

We chose the DDC Summaries as a first data set to publish according to Linked Data principles. The latest version of the Summaries, i.e., the top three levels of DDC 22, has been available as a web document for some time. To broaden the possible applications of what now essentially is just tag soup (in only one language), every class had to be identified with a URI and the data had to be presented in a reusable way. Please give it a try at dewey.info. An extended overview of the service can be found here; a slightly more technical description is available on the OCLC Developer Network wiki.

What you see now is only the first step. The intention of dewey.info is to be a platform for Dewey data on the web; more is to come in terms of languages, deeper data, and links to other datasets. The DDC has been widely used as a knowledge organization tool, and the way the URIs have been set up should allow the construction of links based on existing metadata in resource descriptions like bibliographic records.

An example: As the World Digital Library is getting ready to deploy RDF views of its data, instead of just including the Dewey number as a literal (or pointing to some other data source), they could include URIs to dewey.info to tap into the SKOS relationships pointing to broader and narrower classes for retrieval interactions, etc. In turn, this establishes a link from a Dewey class to other vocabularies used concurrently for that resource, to Wikipedia, etc. Links can go both ways and benefit all participating data sets, establishing a graph of Linked Data.
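
The difference between a literal Dewey number and a link can be sketched in a few lines. The example below writes both forms as N-Triples, using the Dublin Core terms subject property. The resource URI under example.org is made up, the dewey.info URI pattern is an assumption, and the function names are hypothetical; this is not how the World Digital Library actually models its data.

```python
DCTERMS_SUBJECT = "http://purl.org/dc/terms/subject"
# Assumption: Dewey class URIs follow this pattern.
DEWEY_CLASS_URI = "http://dewey.info/class/{}/"


def subject_as_literal(resource_uri, notation):
    """N-Triples line carrying the Dewey number as a plain string (a dead end)."""
    return '<{}> <{}> "{}" .'.format(resource_uri, DCTERMS_SUBJECT, notation)


def subject_as_link(resource_uri, notation):
    """N-Triples line pointing at a Dewey class URI that a client can follow."""
    return "<{}> <{}> <{}> .".format(
        resource_uri, DCTERMS_SUBJECT, DEWEY_CLASS_URI.format(notation)
    )
```

With the linked form, a client that dereferences the class URI can pick up the SKOS broader and narrower relationships from the Dewey side instead of having to interpret the number itself.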

Monday, August 17, 2009

Semantic Web @ SXSW 2010

Here are some Semantic Web sessions proposed for South by Southwest 2010. Talis seems to represent the library community pretty well. If you want to see any of these on the final program, vote for it.

  • Set your data free
    Ian Davis, CTO - Talis
    Data isn't like content: it's infinitely remixable, machines churn through it by the bucketload and it isn't covered by copyright. But there are other rights that get in the way of reuse. This panel will tackle how we can free our data more effectively.
  • Semantic Tagging and Blogging
    Andraz Tori, CTO - Zemanta
    How can bloggers and social media websites benefit from the rise of the Semantic Web? Efforts such as CommonTag and Rich Snippets are offering bloggers new options to add semantics to their blogs. This panel will discuss how bloggers and social media sites can leverage semantic tagging for their benefit.
  • What the hell is the Semantic Web?
    Juan Sequeda, Co-Founder - Semantic Web Austin
    In the past year, the Semantic Web has gained a lot of publicity. However, many may still not understand what the Semantic Web is. This panel of experts will address the myths, realities and all the open issues that the public may have about the Semantic Web.
  • The Semantic City
    John De Oliveira, Co-Founder - Semantic Web Austin
    Imagine a metropolitan area with highly coordinated residents, where rich online and real world experiences amplified each other. Economic and social improvement would dramatically outpace other cities. This is the vision of Semantic Web Austin, the most active and well-funded Semantic Web organization in the United States.
  • Bin the Browser? Interacting with Linked Data
    Tom Heath, Researcher - Talis
    In among the Web of documents we've built a Web of Linked Data. It's huge, it's heterogeneous and it's here. So what are we going to do with it? Is the search/browse paradigm the right basis for Linked Data applications, or are we selling ourselves short?
  • Big Data, Big Dream
    Juan Sequeda, PhD Student - University of Texas at Austin
    How can we have applications that can scale with large amounts of data? Are relational databases sufficient? What other technologies are out there that can scale? This panel will talk about existing technologies that manage large amounts of data.
  • I Have Never Believed in the Semantic Web
    Leigh Dodds, Program Manager - Talis
    It turns out a six-year-old can understand the basic idea of the Semantic Web. So why do so many developers think it's so complicated? If you're a skeptic then come and have your assumptions challenged. Find out how the web of data is being built today.
  • Metadata Wars: Untangling Microformats, RDFa and Microdata
    John de Oliveira, Co-Founder - Semantic Web Austin
    Microformats, RDFa and microdata are largely incompatible ways of annotating HTML documents with metadata. What is the difference and why do we need them all? Organizations such as Google, The Associated Press and Yahoo all have their opinions about metadata. Where is this all going?
  • Semantic Search: Life Beyond Ten Blue Links
    Peter Mika, Yahoo!
    Ten blue links with a title and an abstract have dominated the lives of search users for over a decade now. Semantic technologies have the potential to change the face of search through a deeper understanding of the needs of users and the content on the Web. Will it be a revolution in search?
  • Semantic Search: Off to a Good Start
    Peter Mika, Yahoo!
    Pursued by a number of search companies both large and small, semantic search turned into one of the hottest trends in search innovation. What's the benefit for publishers, end-users and developers? This presentation examines the case for semantic search.
  • Semantic Music
    Yves Raimond, BBC
    By publishing music information on the web as Linked Data, artists ensure that their material can be reused and discovered in new ways. Sites such as BBC Music and Myspace have been publishing structured web data enabling a wide range of innovative third party applications and mashups.
  • Making Dollars And Sense Out Of The Semantic Web
    Nik Daftary, CEO - Turn2Live
    With the advent of the Semantic Web, powerful new ways to consume and disseminate information will emerge. Information that once proved difficult to contextualize will now become commonly easy. So, what does that mean for consumers? In this panel discussion, we will cover what the Semantic Web means to you as well as how it will change online advertising as we know it today.
Voting closes September 4.