The revised character set specifications are now posted on the MARC site. They take into account the use of the full Unicode repertoire, as opposed to only the MARC-8 subset of Unicode, and also include the lossless and lossy techniques for converting full Unicode to the MARC-8 repertoire that were approved this year.

The MARC-8 specifications are still part of the document, and the MARC-8 character code tables and mappings have some improved formatting, but no changes have been made to the MARC-8 to Unicode character set mappings.

The XML (all MARC-8 repertoire) and comma-delimited (East Asian MARC-8 only) files are still downloadable, but we plan to improve the XML file in the near future. We are interested to know whether the comma-delimited file is used, as we may only need to offer the XML for download.
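As a rough illustration of the difference between the two conversion techniques, the sketch below replaces each character outside the target repertoire either with a hexadecimal numeric character reference (lossless, so the original can be recovered) or with a single fill character (lossy). Several things here are assumptions rather than details from this announcement: the `&#xXXXX;` reference form and the `|` fill character reflect our reading of the specification and should be checked against it, the function names are hypothetical, and `in_marc8_repertoire` is a stand-in that treats only ASCII as in-repertoire, whereas the real MARC-8 repertoire is much larger and is defined by the code tables.

```python
def in_marc8_repertoire(ch: str) -> bool:
    """Hypothetical stand-in: treats only ASCII as part of the repertoire.

    The real MARC-8 repertoire is defined by the MARC-8 code tables.
    """
    return ord(ch) < 0x80


def to_marc8_repertoire(text: str, lossless: bool = True, fill: str = "|") -> str:
    """Reduce text to the (stand-in) MARC-8 repertoire.

    lossless=True  -> out-of-repertoire characters become &#xXXXX; references
    lossless=False -> out-of-repertoire characters become the fill character
    """
    out = []
    for ch in text:
        if in_marc8_repertoire(ch):
            out.append(ch)
        elif lossless:
            # Uppercase hex, padded to at least four digits (assumed format).
            out.append(f"&#x{ord(ch):04X};")
        else:
            out.append(fill)
    return "".join(out)
```

For example, `to_marc8_repertoire("A\u0E01B")` keeps the code point of the Thai letter as a reference, while `to_marc8_repertoire("A\u0E01B", lossless=False)` discards it and leaves only a placeholder.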
Thursday, December 06, 2007
News from LC.