Monday, September 11, 2006

InChIs in CML

I've seen some incorrect uses of <identifier> for adding InChIs in CML. The bad code I have seen in the wild looks like this:

<molecule>
<identifier convention="iupac:inchi">InChI=1/CH4/h1H4</identifier>
</molecule>

However, <identifier> does not allow content other than elements. In XML terms: it does not allow mixed content, so text placed directly inside the element is not valid. You might wonder why it allows child elements at all. This is because the first InChI betas had an XML syntax (and InChI still has an XML syntax too).

If you want the one-line InChI format most of us know, it needs to be put into the @value attribute. Thus:

<molecule>
<identifier convention="iupac:inchi" value="InChI=1/CH4/h1H4"/>
</molecule>
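
If you already have files that use the incorrect form, the repair can be scripted. What follows is only a minimal sketch in Python using the standard xml.etree.ElementTree module: the function name fix_inchi_identifiers is made up for this example, and it assumes the CML elements are not namespace-qualified, as in the snippets above. It moves the text of each InChI <identifier> into the @value attribute and leaves identifiers that already have a @value alone.

```python
import xml.etree.ElementTree as ET


def fix_inchi_identifiers(cml_text):
    """Move InChI text content of <identifier> into its value attribute.

    Hypothetical helper for illustration; assumes un-namespaced CML.
    """
    root = ET.fromstring(cml_text)
    for ident in root.iter("identifier"):
        text = (ident.text or "").strip()
        is_inchi = ident.get("convention") == "iupac:inchi"
        # Only touch InChI identifiers that carry text and have no value yet.
        if is_inchi and text and ident.get("value") is None:
            ident.set("value", text)
            ident.text = None
    return ET.tostring(root, encoding="unicode")


bad = (
    "<molecule>"
    '<identifier convention="iupac:inchi">InChI=1/CH4/h1H4</identifier>'
    "</molecule>"
)
print(fix_inchi_identifiers(bad))
```

Real CML documents usually declare a namespace, in which case the tag passed to iter() would have to include it; that is left out here to keep the sketch short.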

