Nowadays, XML is usually thought of as a markup technique used by programmers to encode computer-oriented data. Even DocBook and similar document-oriented DTDs focus on the preparation of technical documentation. However, the real roots of XML are in the SGML community, which is largely composed of publishers, archivists, librarians, and scholars. In this installment, David looks at the Text Encoding Initiative (TEI), an XML schema devoted to the markup of literary and linguistic texts. TEI abstracts the typographic features of source documents in a manner that enables effective searching, indexing, comparison, and print publication -- something not possible with publications archived as mere photographic images.
Monday, November 24, 2003
XML Matters: TEI -- the Text Encoding Initiative: An XML dialect for archival and complex documents, by David Mertz.