What would a "Delicious Tag Cleaner" be? It is a tool for removing unnecessary tags from your del.icio.us account.... As you clean up tags, doesn't that remove them from the stream-of-consciousness thing? Aren't they losing their value and becoming subject headings? Poor ones at that.
If you're like me, you probably have thousands of bookmarks collected over years and years of web surfing and hundreds of tags used to describe them. But the thing is that over these months/years you haven't been able to come up with a consistent taxonomy for your tags.
I have, for example, dozens of different tags for expressing links related to software development: "dev", "devel", "development" etc.
So this tool can suggest tags to be merged, so you can choose them one by one and have the tool merge the chosen tags on your delicious account.
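To make the idea concrete, here is a minimal sketch of how merge suggestions might be generated. It is not how Delicious Tag Cleaner actually works (the post doesn't say); it simply assumes that a tag which is a prefix of another tag, such as "dev" and "development", is a candidate for merging. The function name and the `min_prefix` cutoff are my own.

```python
def suggest_merges(tags, min_prefix=3):
    """Group tags where a shorter tag is a prefix of longer ones.

    Assumption for illustration only: a shared prefix hints at a
    shared meaning. Short roots (under min_prefix characters) are
    never used as a prefix, to limit accidental matches.
    """
    # Normalize case, drop duplicates, and visit shorter tags first
    # so that the shortest variant becomes the root of its group.
    normalized = sorted({t.lower() for t in tags}, key=lambda t: (len(t), t))
    groups = {}
    for tag in normalized:
        root = next(
            (r for r in groups if len(r) >= min_prefix and tag.startswith(r)),
            None,
        )
        if root is None:
            groups[tag] = [tag]
        else:
            groups[root].append(tag)
    # Only groups with more than one member are worth suggesting.
    return {root: members for root, members in groups.items() if len(members) > 1}


suggestions = suggest_merges(["dev", "devel", "development", "python", "Python", "xml"])
print(suggestions)
```

A prefix test will also produce false matches ("car" and "cartography"), which is presumably why a tool like this asks you to confirm each merge one by one.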
Friday, May 30, 2008
Tag Cleaner
Bring some consistency to your tagging with Delicious Tag Cleaner
Statement of International Cataloguing Principles
A reminder from IFLA about the Statement of International Cataloguing Principles.
This is a reminder announcement that the Statement of International Cataloguing Principles developed by the five IFLA Meetings of Experts on an International Cataloguing Code is now available for worldwide review and comment.
A vote form is also available there and can be used by anyone to indicate whether they approve the statement or not and to make comments. The form can be printed out, filled in, and faxed, or it can be filled in electronically and sent as an e-mail attachment.
Labels:
Cataloging,
IFLA
Wednesday, May 28, 2008
2.0 Speaking Opportunities
Any folks who want to represent the library community in an education 2.0 setting should check out CR 2.0. They are having a series of 20 workshops around the U.S. and are using an unconference format. Go to their website and suggest a topic; the folks attending vote on what they want to hear. Even if you don't become a facilitator for the discussion, at least they have seen that libraries are part of education 2.0. Just participating in the discussion might open some eyes to the role of libraries in education.
Tuesday, May 27, 2008
Tagging @ NASA
NASA is sporting a tag cloud on their home page. It is generated from words used to search the site. Look to the right a bit down. It sports a nice star field background.
Labels:
Tagging
Friday, May 23, 2008
Web Ontology Language (OWL)
Some papers from HP Labs concerning the Web Ontology Language (OWL)
- An OWL Full Interpretation by Jeremy Carroll HPL-2008-60
This report is an appendix to report HPL-2008-59. It gives a worked example of the construction used in the proof from that report. For finiteness, a reduced datatype map consisting of only xsd:boolean is used. Each of the graphs in the construction is listed explicitly, with some redundancy eliminated. The final Herbrand graph contains about 15,000 triples.
The Consistency of OWL Full (with proofs) by Jeremy Carroll and Dave Turner HPL-2008-59
We show that OWL1 Full without the comprehension principles is consistent, and does not break most RDF graphs that do not use the OWL vocabulary. We discuss the role of the comprehension principles in OWL semantics, and how to maintain the relationship between OWL Full and OWL DL by reinterpreting the comprehension principles as permitted steps when checking an entailment, rather than as model theoretic principles constraining the universe of interpretation. Starting with such a graph we build a Herbrand model, using, amongst other things, an RDFS ruleset, and syntactic analogs of the semantic "if and only if" conditions on the RDFS and OWL vocabulary. The ordering of these steps is carefully chosen, along with some initialization data, to break the cyclic dependencies between the various conditions. The normal Herbrand interpretation of this graph as its own model then suffices. The main result follows by using an empty graph in this construction. We discuss the relevance of our results, both to OWL2, and more generally to a future revision of the Semantic Web recommendations. This longer version contains the proofs.
The Consistency of OWL Full by Jeremy Carroll and Dave Turner HPL-2008-58
We show that OWL1 Full without the comprehension principles is consistent, and does not break most RDF graphs that do not use the OWL vocabulary. We discuss the role of the comprehension principles in OWL semantics, and how to maintain the relationship between OWL Full and OWL DL by reinterpreting the comprehension principles as permitted steps when checking an entailment, rather than as model theoretic principles constraining the universe of interpretation. Starting with such a graph we build a Herbrand model, using, amongst other things, an RDFS ruleset, and syntactic analogs of the semantic "if and only if" conditions on the RDFS and OWL vocabulary. The ordering of these steps is carefully chosen, along with some initialization data, to break the cyclic dependencies between the various conditions. The normal Herbrand interpretation of this graph as its own model then suffices. The main result follows by using an empty graph in this construction. We discuss the relevance of our results, both to OWL2, and more generally to a future revision of the Semantic Web recommendations. Publication Info: Submitted to ISWC 2008, 7th International Semantic Web Conference, Karlsruhe
Labels:
Ontologies,
OWL
MARC 2 MODS Tool
The Digital Library Federation announces a revision to their MARCXML to MODS tool.
The DLF Aquifer Metadata Working Group announces an update to the XML stylesheet they have developed for the Aquifer project, for conversion of MARCXML records to MODS. The current stylesheet, DLF_MARC2MODS_1.34.xsl, can be found from a link on our MARC to Aquifer MODS XSLT Stylesheet page. Changes are briefly documented in the comments at the beginning of the stylesheet. We have also updated the Introduction pages that give more detail about some of the changes.
The changes include re-added mapping for tag 510 citations to the note element for monographs only; added subject:hierarchicalGeographic element mapping of tag 662 Subject - Hierarchical Place Name; added mapping of tags 561 (ownership) and 581 (publications) to the note element, removed mapping of 007 specific material designation to the genre element when the value is "remote", and a correction to no longer repeat mapping of dates from the Leader to originInfo:date when the date type is "questionable".
Tuesday, May 20, 2008
MARC Update
Update No. 8 (October 2007) was recently released in multiple document formats. It includes changes made to the MARC 21 formats resulting from proposals which were considered by the ALA ALCTS/LITA/RUSA Machine-Readable Bibliographic Information Committee (MARBI), the Canadian Committee on MARC (CCM) and the BIC Bibliographic Standards Group in 2007.
The printed update is available through the Cataloging Distribution Service.
It includes pages for fields that have been changed, with changes marked with side lining. PDFs of those printed update pages are also available online.
Labels:
MARC
D-Lib Magazine
The May/June 2008 issue of D-Lib Magazine is now available.
Some articles of interest include:
- PREMIS With a Fresh Coat of Paint: Highlights from the Revision of the PREMIS Data Dictionary for Preservation Metadata by Brian F. Lavoie, OCLC Online Computer Library Center
- Adding Value to the Library Catalog by Implementing a Recommendation System by Michael Moennich and Marcus Spiering, Karlsruhe University Library
Monday, May 19, 2008
xOCLCnum
A new service from OCLC.
I'd like to announce and invite you to try xOCLCnum, the latest in the xIdentifier family of Web services from OCLC.
Just as xISBN allows you to find all related editions of a book by entering its ISBN, xOCLCnum does the same thing using OCLC number.
xOCLCnum is queried using a simple URL format, and returns an XML response with both related OCLCnums and related ISBNs (if any). It is designed to be easily built in to your library application, so you can expand queries, find all related editions, or do whatever creative thing you want to do.
Background:
ISBNs have been assigned since 1970, to most but not all books published.
OCLC numbers are assigned whenever a record is added to WorldCat, OCLC's global union catalog. These records cover a large portion of all books, old and new, held by any library in North America and, increasingly other regions worldwide (most recently, National Library of China).
So the coverage range of OCLC numbers is, not surprisingly, far greater than that of ISBNs: in WorldCat, for example, around 100 million OCLCnums compared to about 20 million ISBNs.
More Information on xOCLCnum
xOCLCnum API description
Labels:
Identifiers
1:30 Ratio for Information
The post at Librarian.net about the book containing thirty tables-of-contents reminded me of the 1:30 rule for information.
Dolby and Resnikoff found these relationships:
- A book title is 1/30 the length of a table of contents in characters, on average
- A table of contents is 1/30 the length of a back of the book index, on average
- A back of the book index is 1/30 the length of the text of a book, on average
- An abstract is 1/30 the length of the technical paper it represents, on average
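To see what the ratios imply, here is a small worked example. The starting figure of 810,000 characters for the text of a book is purely an assumption chosen to make the arithmetic come out evenly; it is not from Dolby and Resnikoff, and the function name is mine.

```python
RATIO = 30  # each layer is, on average, 1/30 the length of the next

def estimated_lengths(text_chars):
    """Apply the 1:30 rule repeatedly, starting from the text length."""
    index = text_chars / RATIO   # back of the book index
    toc = index / RATIO          # table of contents
    title = toc / RATIO          # book title
    return {"text": text_chars, "index": index, "toc": toc, "title": title}


# Hypothetical book of 810,000 characters.
for layer, chars in estimated_lengths(810_000).items():
    print(f"{layer:>5}: {chars:,.0f} characters")
```

Under that assumption the index comes to 27,000 characters, the table of contents to 900, and the title to 30. These are averages, so any particular book can land well off them.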
Labels:
TOC
XML Workshop
A couple of years ago I had the pleasure of taking the XML workshop offered by Eric Lease Morgan. One of the best workshops I've experienced. Now the notes have been revised and are available online.
XML is about distributing data and information unambiguously. Through this hands-on workshop you will learn: 1) what XML is, and 2) how it can be used to build library collections and facilitate library services in our globally networked environment.
- An introduction to XML
- Activity - Beyond MARC
- Indexes make search easier
- Activity - Indexing/searching MODS
- Activity - Writing XML
- Flavors of XML
- Activity - Writing XML, redux
- Activity - Full-text indexes
- Client/server computing
- Databases for data storage and maintenance
- OAI-PMH - a de-centralized OCLC
- Activity - Being an OAI service provider
- Activity - Being an OAI data repository
- Web Services
- Activity - Creating a "mash-up"
- Workshop summary
- External links
Labels:
Congresses,
XML
Friday, May 16, 2008
MARC Online
More news from LOC.
The Network Development and MARC Standards Office is pleased to announce that the Full versions of all five MARC 21 formats are now available online, along with the Online Concise.
The "full" version of a format contains detailed descriptions of every data element, along with examples, input conventions, and history sections - all of the information from the printed formats. There are no textual differences between the Online Full and the printed documentation. The Concise still contains all of the elements and enough description to serve many lookup needs. Changes from the most recent update of the formats are indicated in the text of both the Online Concise and the Online Full.
Labels:
MARC
Links in LC Records
News about 856 links from LOC.
I've received a couple of questions recently about the 856 links in LC records for the TOCs, descriptions, bios, sample texts, etc. and wanted to spread the word about what we did.
Every month, around the first of the month, folks run their link checkers to validate the links in their copies of LC records. The volume of traffic against our web server was tremendous. A couple of times it nearly brought the server down. We tried several things to minimize the impact if it looked like a link checker was running against the web server, but this didn't seem to help the problem. In the end, we moved all of the files that are in the 856 fields to a different, larger, more robust server. Apparently this is causing link checkers to report that there is a redirect and people are asking if they need to change the URL for the links. I would say that there is no need to change the 856 links from http://www.loc.gov... to http://catdir.loc.gov.... In fact, I am still adding the URLs as http://www.loc.gov...
LC is committed to maintaining these URLs; you should not be experiencing access problems with them except when running link checkers or maybe harvesters. I appreciate any reports of wrong connections or other serious problems with the files. By my count, we have over 710,000 links in the LC catalog now, so you can see this is a major commitment for LC.
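If your link checker lets you post-process its report, something like the following sketch could separate these harmless redirects from real problems. It assumes the redirect changes only the host, from www.loc.gov to catdir.loc.gov, and leaves the path alone; the announcement doesn't spell that out, so treat it as a guess. The function name and the sample path are made up for illustration.

```python
from urllib.parse import urlsplit

OLD_HOST = "www.loc.gov"
NEW_HOST = "catdir.loc.gov"

def is_expected_loc_redirect(original_url, final_url):
    """Return True when a redirect only swaps the LC host names.

    Assumption: path and query stay the same after the move, so a
    redirect that also changes the path is still worth a look.
    """
    before, after = urlsplit(original_url), urlsplit(final_url)
    return (
        before.hostname == OLD_HOST
        and after.hostname == NEW_HOST
        and before.path == after.path
        and before.query == after.query
    )


# Hypothetical 856 link and the location a checker reported for it.
print(is_expected_loc_redirect(
    "http://www.loc.gov/catdir/toc/example.html",
    "http://catdir.loc.gov/catdir/toc/example.html",
))
```

Links that pass this test can stay as they are in the 856 field, which matches the advice above.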
Wednesday, May 14, 2008
Manifestations and Near-Equivalents
Martha M. Yee continues to make her work readily available.
The two articles about 'manifestation' (the word everyone used to mean 'expression' until FRBR came along) that I published in 1994 are now available at the University of California eScholarship Repository, as follows:
Manifestations and Near-Equivalents: Theory, with Special Attention to Moving-Image Materials. Library Resources & Technical Services 1994; 38:227-256.
Manifestations and Near-Equivalents of Moving Image Works: a Research Project. Library Resources & Technical Services 1994; 38:355-372.
Labels:
FRBR
Re: Recommendation and Ranganathan
I hope everybody here is also reading Lorcan Dempsey's weblog. However, just in case there are some who don't, begin with the excellent post Recommendation and Ranganathan. I thought the description of the four types of metadata a very good place to start thinking and discussing.
Labels:
Metadata
Tuesday, May 13, 2008
eXtensible Text Framework (XTF)
The California Digital Library (CDL) is pleased to announce a new release of its search and display technology, the eXtensible Text Framework (XTF) version 2.1. XTF is an open source, highly flexible software application that supports the search, browse and display of heterogeneous digital content. XTF offers efficient and practical methods for creating customized end-user interfaces for distinct digital content collections.
Highlights from the 2.1 release include:
- Extensive interface improvements, including new search forms, built-in faceted browsing, and a new look and feel.
- Increased support for document and information exchange formats.
  - XHTML and OAI-PMH output
  - NLM article format indexing and output
  - Microsoft Word indexing
adaptation.
- Updated documentation that has been moved to the XTF project wiki, allowing XTF implementers to share solutions with the entire user community.
- "Freeform" Boolean query language, offered as an experimental feature.
- Backward compatibility with existing XTF implementations.
Since the first deployment of XTF in 2005, the development strategy has been to build and maintain an indexing and display technology that is not only customizable, but also draws upon tested components already in use by the digital library and search communities - in particular the Lucene text search engine, Java, XML, and XSLT. By coordinating these pieces in a single platform that can be used to create multiple unique applications, CDL has succeeded in dramatically reducing the investment in infrastructure, staff training and development for new digital content projects.
XTF offers a suite of customizable features that support diverse intellectual access to content. Interfaces can be designed to support the distinct tools and presentations that are useful and meaningful to specific audiences. In addition, XTF offers the following core features:
- Easy to deploy: Drops directly in to a Java application server such as Tomcat or Resin; has been tested on Solaris, Mac, Linux, and Windows operating systems.
- Easy to configure: Can create indexes on any XML element or attribute; entire presentation layer is customizable via XSLT.
- Robust: Optimized to perform well on large documents (e.g., a single text that exceeds 10MB of encoded text); scales to perform well on collections of millions of documents; provides full Unicode support.
- Extensible:
  - Works well with a variety of authentication systems (e.g., IP address lists, LDAP, Shibboleth).
  - Provides an interface for external data lookups to support thesaurus-based term expansion, recommender systems, etc.
  - Can power other digital library services (e.g., XTF contains an OAI-PMH data provider that allows others to harvest metadata, and an SRU interface that exposes searches to federated search engines).
  - Can be deployed as separate, modular pieces of a third-party system (e.g., the module that displays snippets of matching text).
- Spell checking of queries
- Faceted displays for browsing
- Dynamically updated browse lists
- Session-based bookbags
Posted to many e-mail distribution lists.
Labels:
Open Source,
XTF
Non-Latin Data in Name Authority Records
From LC:
As previously announced, MDS- Name Authority records will be enhanced with non-Latin script data in 4XX fields and selected notes beginning June 1, 2008, (see earlier announcements at http://www.loc.gov/catdir/cpso/nonroman_announce.pdf and http://www.loc.gov/catdir/cpso/nonlatin_whitepaper.html for additional information.) An additional FAQ related to the project will be posted at http://www.loc.gov/aba/ shortly.
An effort to automatically pre-populate existing authority records with non-Latin references by OCLC, Inc. will also begin in early June 2008. The initial rate of pre-population will be limited to several hundred records per week, and will grow to a rate of approximately 25,000 records per week. Note that other clean-up projects that have recently increased the volume of name authority records (http://www.loc.gov/cds/notices/2008-02-14.pdf ) will be suspended during this pre-population effort. It is estimated that approximately 400,000 pre-population records will be distributed over a number of months.
CDS is making available a file of name authority test records containing non-Latin script data. The file of 110 test records can be found on the Library of Congress rs7 server under the /emds/test subdirectory with file names of names.nonlatintest.records for the MARC 8 version and names.nonlatintest.records.utf8 for the UTF8 version.
Labels:
Name authority records,
Unicode
Spam
I've been blasted with comment spam. So I've had to turn on the comment moderation function.
It is a shame how these few folks can ruin things for all. A few years back an e-card was a fun thing to receive and send. Now so many are spam, I've stopped sending and opening them. Open comments seem ready to go the same way.
Labels:
Spam
Friday, May 09, 2008
Metadata for Learning Resources
Metadata for Learning Resources: An Update on Standards Activity for 2008 by Sarah Currier appears in the latest issue of Ariadne.
The major areas of development covered in this article are:
- LOM Next: plans for the next version of the IEEE LOM
- The Joint DCMI/IEEE LTSC (Learning Technology Standards Committee) Taskforce: bringing together the two major metadata standards used for learning resources, and providing an RDF translation for the LOM
- DC-Education Application Profile (DC-Ed AP): a modular application profile purely looking at educational aspects of resources, based on community requirements
- The United Kingdom’s Joint Information Systems Committee Learning Materials Application Profile (JISC LMAP) scoping study: working alongside a number of similar projects looking at application profiles for repositories in other areas, e.g. images.
- International Standards Organisation Metadata for Learning Resources (ISO MLR): based primarily in Canada, this international standards body is devising a new international standard for educational metadata, in response to perceived limitations of the IEEE LOM
- The European Commission’s PROLEARN Harmonisation of Metadata project: a study into the issues and challenges of achieving harmonisation in metadata, given the heterogeneous landscape
Labels:
Metadata
Thursday, May 08, 2008
Metadata Advocates
I had an Ah-Ha moment while listening to Jon Udell's show Interviews with Innovators. The episode was Working with Data Sources with Raymond Yee.
Raymond Yee is a lecturer at the UC Berkeley School of Information and the author of Pro Web 2.0 Mashups: Remixing Data and Web Services. In this conversation he talks about teaching students how to work with existing data sources, and speculates with Jon Udell on ways to expand the supply of available sources.
What struck me was that we should be advocates for metadata standards. If the local genealogy society puts up a calendar on their website, help them get it into iCal or hCal format. Then we could drop their info into a pathfinder. Or geocoding the local bird-watchers' sightings, or school district's lunch menu, or .... We could offer our understanding of the importance of standards and data reuse to our community. The library benefits by becoming the go-to-place for information management. The community benefits because they get the word out more effectively. It would be a very different job description for a cataloger to become the community data standard outreach person. But, not a bad place to be.
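As a rough idea of how little it takes, here is a sketch that turns one meeting into an iCalendar (iCal) file. The society, the event details, the UID, and the PRODID string are all invented for the example, and the function names are mine. It handles only a single simple event and skips refinements such as folding long lines.

```python
from datetime import datetime, timezone

def escape_text(value):
    """Escape the characters that iCalendar text values reserve."""
    return (
        value.replace("\\", "\\\\")
        .replace(";", "\\;")
        .replace(",", "\\,")
        .replace("\n", "\\n")
    )

def make_ical_event(uid, summary, start, end, stamp):
    """Build a one-event calendar; expects timezone-aware datetimes."""
    fmt = "%Y%m%dT%H%M%SZ"  # UTC date-time form

    def utc(moment):
        return moment.astimezone(timezone.utc).strftime(fmt)

    lines = [
        "BEGIN:VCALENDAR",
        "VERSION:2.0",
        "PRODID:-//Example Library//Community Calendar//EN",  # hypothetical
        "BEGIN:VEVENT",
        f"UID:{uid}",
        f"DTSTAMP:{utc(stamp)}",
        f"DTSTART:{utc(start)}",
        f"DTEND:{utc(end)}",
        f"SUMMARY:{escape_text(summary)}",
        "END:VEVENT",
        "END:VCALENDAR",
    ]
    # iCalendar lines end with CRLF.
    return "\r\n".join(lines) + "\r\n"


# Hypothetical genealogy society meeting.
event = make_ical_event(
    uid="meeting-2008-06@example.org",
    summary="Genealogy Society monthly meeting",
    start=datetime(2008, 6, 10, 23, 0, tzinfo=timezone.utc),
    end=datetime(2008, 6, 11, 1, 0, tzinfo=timezone.utc),
    stamp=datetime(2008, 5, 8, 12, 0, tzinfo=timezone.utc),
)
print(event)
```

Saved with an .ics extension, output like this should open in most calendar programs, which is the kind of reuse that makes the standard worth advocating.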