Thursday, November 20, 2008

Authority Record Access

Why doesn't LC offer Z39.50 access to the authority files? How about their other thesauri, like the Thesaurus for Graphic Materials? Easy access to these files would be useful. Maybe Z39.50 is "so yesterday" and SRU/SRW or an API is the answer. These are rich resources, and access would be useful in ways we can't yet imagine. How about other institutions? Access to AAT, the NASA Thesaurus, and others would be useful too. This is not only about bibliographic access; it has wider implications in a Semantic Web environment.

[Later] OCLC does provide access via their Terminologies Project; see the comment for full details.

[21 Nov. 2008] Someone sent me a note saying that the Voyager software used does not support Z39.50 access to the authority records, and that they are not a separate database and have very little indexing. Do check out the comments for some useful information.


Andrew Houghton said...

Access to TGM I, TGM II and other controlled vocabularies can be accomplished with SRU/W through OCLC Research's Terminology Service Project. The service provides SRU/SRW access in one of two mechanisms:

{code}/
{code}.sru?{sru-params}
{code}.srw

The {code} is based on the MARC thesaurus code list.

The first two URIs must be requested with the HTTP GET method and the third URI with the HTTP POST method. If your browser supports client-side XSL transforms, then the first two URIs will generate an HTML page that you can test the service with.
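As a rough illustration of the GET form, here is a minimal Python sketch that assembles a URL matching the {code}.sru?{sru-params} pattern. The host name is a made-up placeholder, not the real service address, and "lctgm" is only assumed to be the MARC thesaurus code for TGM I. The parameter names (operation, version, query, maximumRecords) are the standard SRU searchRetrieve parameters; whether the service expects exactly this set is also an assumption.

```python
from urllib.parse import urlencode

# Hypothetical placeholder host; substitute the real service address.
BASE_URL = "https://terminology.example.org/"


def sru_search_url(code, cql_query, max_records=10, base_url=BASE_URL):
    """Build an SRU searchRetrieve URL of the form {code}.sru?{sru-params}."""
    params = {
        "operation": "searchRetrieve",  # standard SRU operation name
        "version": "1.1",               # SRU protocol version (assumed)
        "query": cql_query,             # CQL query string
        "maximumRecords": max_records,  # cap on returned records
    }
    # urlencode percent-escapes the values so the CQL query is URL-safe.
    return f"{base_url}{code}.sru?{urlencode(params)}"


# "lctgm" is assumed here to be the MARC thesaurus code for TGM I.
url = sru_search_url("lctgm", '"acoustical engineering"')
print(url)
```

The resulting URL could then be fetched with any HTTP client using a plain GET request; the SRW form would instead need a SOAP message sent by POST.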

So if you wanted to access TGM I, then you would use the URIs:

And if you wanted to access TGM II, then you would use the URIs:

A specific representation can be generated for each concept in the vocabulary using SRU/W or by using the URI pattern with the concept's identifier:

{code}/{id}.{format}


Acoustical engineering (in MARC-XML)
Acoustical engineering (in SKOS)
Acoustical engineering (in Zthes)
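To make the {code}/{id}.{format} pattern concrete, here is a small Python sketch that fills it in for the three representations listed above. Everything specific in it is an assumption for illustration: the host is a placeholder, the concept identifier "000123" is invented, "lctgm" is assumed to be the MARC thesaurus code for TGM I, and the format tokens "marcxml", "skos" and "zthes" are guesses at what the service accepts.

```python
from urllib.parse import quote

# Hypothetical placeholder host; substitute the real service address.
BASE_URL = "https://terminology.example.org/"


def concept_url(code, concept_id, fmt, base_url=BASE_URL):
    """Build a concept URL of the form {code}/{id}.{format}."""
    # Escape the identifier in case it contains characters unsafe in a path.
    return f"{base_url}{code}/{quote(str(concept_id), safe='')}.{fmt}"


# Invented identifier and assumed format tokens, for illustration only.
for fmt in ("marcxml", "skos", "zthes"):
    print(concept_url("lctgm", "000123", fmt))
```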

Other controlled vocabularies are also available, such as FAST, GSAFD, MeSH, LCSH, and LCSHac. Discussions continue with other vocabulary maintainers about making their vocabularies available.

Jonathan Rochkind said...

I was under the impression that LC specifically chose NOT to make the authorities available, because they offered them for sale instead. I've been curious how many sales they actually make. Maybe I had the wrong idea all along though, or maybe they've changed their approach.

I believe that some of the authorities are available in SKOS format, as a result of some work Ed Summers did. See:

So they are in fact available, and someone could take them and make them available through other methods. But the format they are in there is in fact quite useful. I'm not positive whether it is kept up to date, though.

The one significant authority corpus that, so far as I know, is not available by any convenient machine access method, at a reasonable price, is the Name/National Authority File (NAF). We could really use access to that too.

Jonathan Rochkind said...

Oh, and access to the LCC and Dewey authorities would also be greatly appreciated.

Conveniently, with regard to the OCLC Terminologies Service, I believe it is OCLC itself that owns Dewey, so one would think that would make it fairly easy for OCLC to talk to itself about making Dewey available.

But LCSH is indeed available through several sources. If we had the NAF, LCC, and Dewey, we'd have all of the most widespread controlled vocabularies used in our aggregate corpus. MeSH, I believe, is actually available for free in machine accessible form from the National Library of Medicine, and possibly from other sources as well.

Mike Kreyche said...

The Terminology Service is nice, but the terms of use seem pretty restrictive to me. Or at least vague on the side of restrictive.

LC is considering LCCN permalinks for authority records, which would be a step in the right direction.