Thursday, July 28, 2011

TemaTres 1.4

We are glad to invite you to test the beta version of TemaTres 1.4.

TemaTres is a web tool to manage, publish and exploit controlled vocabularies and other formal representations of knowledge (thesauri, taxonomies, glossaries, etc.).

This release includes the following fixes and improvements:
  • Quality indicators for controlled vocabularies
    Quality assurance was improved with reports on the following quality indicators:
      • Free terms
      • Terms without hierarchical relationships
      • Average number of words per term
      • Terms per N broader terms
      • Terms per N narrower terms and depth
      • Term words with unsupported prefixes or suffixes
For more details: MARTÍNEZ, A.M.a et al. Concepto, forma y longitud de los términos preferentes del tesauro: una propuesta de indicadores de calidad. Anales de Documentación, 2010, vol. 13, p. 185-195. http://revistas.um.es/analesdoc/article/view/107151

MARTÍNEZ, A.M.a et al. Indicadores para evaluar el vocabulario y la estructura sistemática de un tesauro. I Jornada de Intercambio y Reflexión acerca de la Investigación en Bibliotecología, La Plata, 6-7 de diciembre de 2010. Facultad de Humanidades y Ciencias de la Educación de la Universidad Nacional de La Plata. http://www.jornadabibliotecologia.fahce.unlp.edu.ar/jornada-2010/martine
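To give a feel for what a few of these indicators measure, here is a minimal sketch on a toy vocabulary. It is not TemaTres code: the dictionary layout, the function name `quality_indicators` and the sample terms are all hypothetical, and the sketch assumes that a "free term" is a term with no relationships of any kind.

```python
# Toy vocabulary (hypothetical data): each term lists its broader and related terms.
vocab = {
    "animals": {"broader": [], "related": []},
    "fish": {"broader": ["animals"], "related": ["fishing"]},
    "freshwater fish": {"broader": ["fish"], "related": []},
    "fishing": {"broader": [], "related": ["fish"]},
    "water quality": {"broader": [], "related": []},
}


def quality_indicators(vocab):
    """Compute three simple indicators for a vocabulary given as a dict."""
    # Terms that appear as someone's broader term have at least one narrower term.
    has_narrower = {b for rel in vocab.values() for b in rel["broader"]}
    # Terms that are the target of someone else's related-term link.
    related_targets = {r for rel in vocab.values() for r in rel["related"]}

    # No broader term and no narrower term.
    no_hierarchy = sorted(
        t for t, rel in vocab.items()
        if not rel["broader"] and t not in has_narrower
    )
    # Assumption: a free term has no hierarchical and no related links at all.
    free_terms = sorted(
        t for t in no_hierarchy
        if not vocab[t]["related"] and t not in related_targets
    )
    avg_words = sum(len(t.split()) for t in vocab) / len(vocab)

    return {
        "free_terms": free_terms,
        "terms_without_hierarchy": no_hierarchy,
        "average_words_per_term": avg_words,
    }


if __name__ == "__main__":
    for name, value in quality_indicators(vocab).items():
        print(name, value)
```

In this toy data, "fishing" has a related term but no place in the hierarchy, while "water quality" is not linked to anything.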
  • User-defined notes
    Capabilities to create and manage user-defined notes have been added. (Thanks to Observatorio Estatal de la Discapacidad, Spain: http://www.observatoriodeladiscapacidad.es/)
  • Advanced configuration options
    Detailed configuration options are now available to the administrator of the controlled vocabulary.
  • Import controlled vocabularies
    TemaTres can now import controlled vocabularies from plain tagged text.
For example:
  • IMS VDEX scheme (Vocabulary Definition and Exchange)
    TemaTres can now display, export and publish terms and controlled vocabularies through the IMS VDEX XML schema. http://www.imsglobal.org/vdex/
  • RESTful services for controlled vocabularies
    TemaTres supports web services accessible through a clear and simple syntax. The service supports a wide variety of queries, and data can be viewed in XML, JSON or SKOS-Core.
Examples
Status:
http://www.vocabularyserver.com/gemet/en/api/

Terms beginning with the letter B:
http://www.vocabularyserver.com/gemet/en/api/letter/b
http://www.vocabularyserver.com/gemet/en/api/letter/b/skos
http://www.vocabularyserver.com/gemet/en/api/letter/b/json

Search Terms:
http://www.vocabularyserver.com/gemet/en/api/search/fish/
http://www.vocabularyserver.com/gemet/en/api/search/fish/json
http://www.vocabularyserver.com/gemet/en/api/search/fish/skos

Vocabulary data:
http://www.vocabularyserver.com/gemet/pt/api/fetchVocabularyData

Top terms of the vocabulary:
http://www.vocabularyserver.com/gemet/pt/api/fetchTopTerm
  • Minor bugs were fixed and some minor features were added.
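The example service URLs listed above appear to follow the pattern base/task/argument/format. The small helper below is a sketch of that pattern only: the function name `build_api_url` is hypothetical, the list of format suffixes is inferred from the examples (no suffix seems to return XML), and nothing here is taken from the TemaTres documentation.

```python
from urllib.parse import quote

# Base URL taken from the examples in the post (GEMET vocabulary, English).
DEFAULT_BASE = "http://www.vocabularyserver.com/gemet/en/api"

# Format suffixes seen in the examples; the default (no suffix) is assumed to be XML.
KNOWN_FORMATS = ("json", "skos")


def build_api_url(task="", argument="", fmt=None, base=DEFAULT_BASE):
    """Build a service URL of the form base/task/argument/format."""
    parts = [base.rstrip("/")]
    if task:
        parts.append(quote(task, safe=""))
    if argument:
        # Percent-encode the argument so spaces and slashes stay inside one segment.
        parts.append(quote(argument, safe=""))
    if fmt is not None:
        if fmt not in KNOWN_FORMATS:
            raise ValueError("unknown format: %r" % (fmt,))
        parts.append(fmt)
    return "/".join(parts)


if __name__ == "__main__":
    print(build_api_url("letter", "b"))
    print(build_api_url("search", "fish", "json"))
    print(build_api_url("fetchTopTerm",
                        base="http://www.vocabularyserver.com/gemet/pt/api"))
```

The resulting string can be passed to `urllib.request.urlopen` to fetch the data, provided the server is still reachable at that address.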
Seen on the Code4Lib mailing list.

Monday, July 25, 2011

Corporate Names

A new paper from HP discusses the problem of distinguishing corporate names and an automated solution to it: Company Names Matching in the Large Patents Dataset by Timofey Medvedev and Alexander Ulanov, HP Laboratories, HPL-2011-90R1.
This paper addresses the name matching (duplicate detection) problem in the US patent dataset. It contains more than 400K unique company name spellings. In order to solve the matching problem we choose an appropriate string similarity measure and clustering approach and estimate their parameters. Finally we apply them to the whole dataset and estimate the positives and negatives rates.
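The general idea of pairing a string similarity measure with clustering can be sketched in a few lines. This is not the method from the paper: the abstract does not say which measure or parameters the authors chose, so the sketch swaps in Python's `difflib.SequenceMatcher` ratio, a simple normalization step, a threshold of 0.85 picked arbitrarily, and single-link clustering via union-find. The function names and sample company names are made up for illustration.

```python
import re
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase, turn punctuation into spaces, collapse whitespace."""
    return " ".join(re.sub(r"[^a-z0-9]+", " ", name.lower()).split())


def similarity(a, b):
    """Similarity in [0, 1] between two normalized names (stand-in measure)."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def cluster_names(names, threshold=0.85):
    """Single-link clustering: names joined by any pair above the threshold."""
    parent = list(range(len(names)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if similarity(names[i], names[j]) >= threshold:
                parent[find(j)] = find(i)

    clusters = {}
    for i, name in enumerate(names):
        clusters.setdefault(find(i), []).append(name)
    return list(clusters.values())


if __name__ == "__main__":
    sample = [
        "Hewlett-Packard Co",
        "HEWLETT PACKARD CO.",
        "Hewlett Packard Company",
        "Siemens AG",
        "Siemens A.G.",
    ]
    for group in cluster_names(sample):
        print(group)
```

Comparing every pair is quadratic in the number of names, so a dataset of 400K spellings would need some way of limiting the comparisons first; the sketch leaves that out.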