Article

Chemical Name to Structure: OPSIN, an Open Source Solution

Journal of Chemical Information and Modeling, 51 (3): 739–753 (Mar 28, 2011)
DOI: 10.1021/ci100384d

Abstract

We have produced an open source, freely available, algorithm (Open Parser for Systematic IUPAC Nomenclature, OPSIN) that interprets the majority of organic chemical nomenclature in a fast and precise manner. This has been achieved using an approach based on a regular grammar. This grammar is used to guide tokenization, a potentially difficult problem in chemical names. From the parsed chemical name, an XML parse tree is constructed that is operated on in a stepwise manner until the structure has been reconstructed from the name. Results from OPSIN on various computer generated name/structure pair sets are presented. These show exceptionally high precision (99.8%+) and, when using general organic chemical nomenclature, high recall (98.7-99.2%). This software can serve as the basis for future open source developments of chemical name interpretation.
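
To make the idea of grammar-guided tokenization concrete, the following is a minimal sketch in Python. It is not OPSIN's code or vocabulary: the lexicon, the state names, and the helper functions `tokenize` and `to_xml` are all hypothetical, and the toy grammar covers only a handful of simple alkane-style names. It illustrates why a grammar helps: "ethane" and "ethyl..." both begin with "eth", and only the set of token classes allowed in the current state (plus backtracking) decides which reading is valid. The resulting XML is flat, whereas the abstract describes a parse tree that is further transformed step by step.

```python
import xml.etree.ElementTree as ET

# Toy lexicon (assumption: far smaller than any real nomenclature vocabulary).
LEXICON = {
    "locant": ["1", "2", "3", "4"],
    "comma": [","],
    "hyphen": ["-"],
    "multiplier": ["di", "tri"],
    "substituent": ["methyl", "ethyl", "chloro"],
    "stem": ["meth", "eth", "prop", "but", "pent", "hex"],
    "suffix": ["ane", "ene", "yne"],
}

# Regular grammar written as a finite-state machine:
# state -> {token class allowed here -> next state}.
TRANSITIONS = {
    "start": {"locant": "loc", "substituent": "after_sub", "stem": "stem"},
    "loc": {"comma": "need_locant", "hyphen": "prefix"},
    "need_locant": {"locant": "loc"},
    "prefix": {"multiplier": "mult", "substituent": "after_sub"},
    "mult": {"substituent": "after_sub"},
    "after_sub": {"hyphen": "need_locant", "substituent": "after_sub", "stem": "stem"},
    "stem": {"suffix": "end"},
}
ACCEPTING = {"end"}


def tokenize(name, pos=0, state="start"):
    """Return a list of (token_class, text) pairs, or None if no valid split exists."""
    if pos == len(name):
        return [] if state in ACCEPTING else None
    # Only token classes permitted by the grammar in this state are tried.
    for token_class, next_state in TRANSITIONS.get(state, {}).items():
        for text in LEXICON[token_class]:
            if name.startswith(text, pos):
                rest = tokenize(name, pos + len(text), next_state)
                if rest is not None:  # otherwise backtrack and try another token
                    return [(token_class, text)] + rest
    return None


def to_xml(name):
    """Wrap the token sequence in a flat XML element (a stand-in for a parse tree)."""
    tokens = tokenize(name.lower())
    if tokens is None:
        raise ValueError(f"name not covered by the toy grammar: {name!r}")
    root = ET.Element("molecule", name=name)
    for token_class, text in tokens:
        ET.SubElement(root, token_class).text = text
    return root


if __name__ == "__main__":
    print(ET.tostring(to_xml("2,3-dimethylbutane"), encoding="unicode"))
```

In this sketch a name such as "methylane" is rejected rather than guessed at, since neither the substituent reading ("methyl" + "ane") nor the stem reading ("meth" + "ylane") reaches an accepting state. Rejecting names that do not fit the grammar is one plausible way a parser could favour precision over recall.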
