Wednesday, June 27, 2007

There can be only one (namespace)

Today I had a discussion about indexing CML files, such as extracting the molecular formula, author, title, compound names, etc. That brought us to the topic of CML namespaces. CML has seen a few in the past, listed below. However, only the latest should be used: http://www.xml-cml.org/schema.

http://www.xml-cml.org/schema
This is the namespace of the latest CML schema. This one must be used for creating any new CML file.

http://www.xml-cml.org/schema/cml2/core
This was the first namespace used in the XML Schema definition for CML. It has been used by at least OpenBabel and the CDK, but is outdated and should no longer be used.

http://www.xml-cml.org/dtd/cml1_0_1.dtd
I have no idea if this namespace has actually been used, but it is mentioned in this official FAQ. I think it might actually be a remnant from the early days of XML Schema: CML 1.0 (DOI:10.1021/ci990052b) was DTD-based, and DTDs have no concept like XML Namespaces. But namespaces are oh so useful, and they are used in CML 2.0 (DOI:10.1021/ci0256541) applications like CMLRSS (DOI:10.1021/ci034244p). I am not aware of any applications using this cml1_0_1.dtd CML namespace.
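
For the indexing use case, here is a minimal sketch in Python of how an indexer might cope with these namespaces. It is not taken from any existing tool: the function names (split_tag, classify_namespace, index_molecules) are made up for illustration, and reading only the id and title attributes of molecule elements is an assumption about what one would want to index. It uses only the standard library's xml.etree.ElementTree, which reports element names as "{namespace}local".

```python
import xml.etree.ElementTree as ET

# The namespace that should be used for new CML files.
CURRENT_NS = "http://www.xml-cml.org/schema"

# Older namespaces listed in this post; accepted on input, never written.
LEGACY_NS = {
    "http://www.xml-cml.org/schema/cml2/core",
    "http://www.xml-cml.org/dtd/cml1_0_1.dtd",
}


def split_tag(tag):
    """Split ElementTree's '{namespace}local' form into (namespace, local)."""
    if tag.startswith("{"):
        ns, _, local = tag[1:].partition("}")
        return ns, local
    return "", tag


def classify_namespace(ns):
    """Return 'current', 'legacy', 'none' or 'unknown' for a namespace URI."""
    if ns == CURRENT_NS:
        return "current"
    if ns in LEGACY_NS:
        return "legacy"
    if ns == "":
        return "none"
    return "unknown"


def index_molecules(cml_text):
    """Hypothetical indexer: report the namespace status of the root element
    and collect id/title of every molecule element in that same namespace."""
    root = ET.fromstring(cml_text)
    root_ns, _ = split_tag(root.tag)
    entries = []
    for elem in root.iter():  # iter() also visits the root itself
        elem_ns, local = split_tag(elem.tag)
        if local == "molecule" and elem_ns == root_ns:
            entries.append({"id": elem.get("id"), "title": elem.get("title")})
    return classify_namespace(root_ns), entries


if __name__ == "__main__":
    doc = (
        '<cml xmlns="http://www.xml-cml.org/schema">'
        '<molecule id="m1" title="methane"/>'
        "</cml>"
    )
    print(index_molecules(doc))
```

Matching on the local name and then classifying the namespace separately lets the indexer keep reading older files while still flagging them as outdated, so they can be rewritten with http://www.xml-cml.org/schema.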
