Tuesday, February 13, 2007

MARC to Solr

There is a preconference on Lucene and Solr before the Code4lib conference. Because of that, Andrew Nagy has made his MARCXML2SOLR XSLT document available. If you have some MARC records in XML and want to get them into a format Solr understands, now you can. MarcEdit, among other tools, will get your MARC into MARCXML format. If you can't make it to the preconference, use this tool to play along at home.
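For anyone who wants to see the shape of the transformation without opening the stylesheet, here is a rough Python sketch of the same idea: read MARCXML, write Solr's `<add><doc><field name="...">` update format. This is not Andrew Nagy's XSLT. The Solr field names (`id`, `title`), the choice of MARC fields (001 and 245$a), and the sample record are all made up for illustration, and a real Solr schema would need matching field definitions.

```python
import xml.etree.ElementTree as ET

# Namespace used by MARCXML (MARC21 "slim" schema).
MARC_NS = "http://www.loc.gov/MARC21/slim"

# Made-up sample record, for illustration only.
SAMPLE = """<collection xmlns="http://www.loc.gov/MARC21/slim">
  <record>
    <controlfield tag="001">12345</controlfield>
    <datafield tag="245" ind1="1" ind2="0">
      <subfield code="a">An example title</subfield>
    </datafield>
  </record>
</collection>"""


def marcxml_to_solr(marcxml):
    """Turn namespaced MARCXML into a Solr <add> update document (string)."""
    ns = {"marc": MARC_NS}
    root = ET.fromstring(marcxml)
    add = ET.Element("add")
    for record in root.iter("{%s}record" % MARC_NS):
        doc = ET.SubElement(add, "doc")

        # Hypothetical mapping: MARC 001 -> Solr "id".
        control = record.find("marc:controlfield[@tag='001']", ns)
        if control is not None and control.text:
            field = ET.SubElement(doc, "field", {"name": "id"})
            field.text = control.text.strip()

        # Hypothetical mapping: MARC 245$a -> Solr "title".
        title = record.find(
            "marc:datafield[@tag='245']/marc:subfield[@code='a']", ns
        )
        if title is not None and title.text:
            field = ET.SubElement(doc, "field", {"name": "title"})
            field.text = title.text.strip()
    return ET.tostring(add, encoding="unicode")


if __name__ == "__main__":
    print(marcxml_to_solr(SAMPLE))
```

The resulting XML is what you would POST to Solr's update handler. The XSLT route does the same job declaratively, which is why it plugs into tools like MarcEdit so easily.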

1 comment:

Terry Reese said...

Actually, you could use MarcEdit to go straight from MARC to the Solr syntax -- though you'd want to modify the posted stylesheet to include the marc: namespace. This way, the tool could process files with or without that namespace.

To make it work, simply register the crosswalk with MarcEdit. Since some folks aren't sure how this works, I've recorded a quick AVI file showing how to register and use the MARC=>Solr crosswalk. See: Adding and Using MARC=>Solr crosswalk. BTW, the AVI file is ~29 MB.

I also modified the crosswalk that you'd linked to so that it works better in MarcEdit. Since MarcEdit uses the marc namespace by default, XSLT stylesheets work best in MarcEdit if they include the namespace. This way, MarcEdit can process items both with and without the namespace. Here's the stylesheet with the revisions made (BTW, this is the stylesheet I used in my example): Modified MARC21XML=>Solr XSLT
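To illustrate the "with namespace and without" point outside of XSLT, here is a small Python sketch that matches MARCXML elements by local name, so it does not care whether the input carries the marc: namespace. It is only an analogue of the idea, not how the modified stylesheet does it; the helper names (`local_name`, `titles`) and the sample records are hypothetical.

```python
import xml.etree.ElementTree as ET

# Made-up sample records: one with the marc: prefix, one with no namespace.
WITH_NS = (
    '<marc:record xmlns:marc="http://www.loc.gov/MARC21/slim">'
    '<marc:datafield tag="245" ind1="1" ind2="0">'
    '<marc:subfield code="a">An example title</marc:subfield>'
    "</marc:datafield></marc:record>"
)
WITHOUT_NS = (
    "<record>"
    '<datafield tag="245" ind1="1" ind2="0">'
    '<subfield code="a">An example title</subfield>'
    "</datafield></record>"
)


def local_name(tag):
    """Strip ElementTree's '{namespace}' prefix from a tag, if present."""
    return tag.rsplit("}", 1)[-1]


def titles(marcxml):
    """Collect 245$a values, ignoring whichever namespace the input uses."""
    root = ET.fromstring(marcxml)
    found = []
    for element in root.iter():
        if local_name(element.tag) == "datafield" and element.get("tag") == "245":
            for sub in element:
                if local_name(sub.tag) == "subfield" and sub.get("code") == "a":
                    found.append((sub.text or "").strip())
    return found


if __name__ == "__main__":
    print(titles(WITH_NS))
    print(titles(WITHOUT_NS))
```

Both inputs yield the same list of titles, which is the behaviour you want from a crosswalk that has to cope with files from different sources.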