CML is a rather flexible format, but it allows its specifics to be pinned down in the form of conventions, named in the @convention attribute. Such conventions can put extra restrictions on the element hierarchy and the expected content. For example, we may define a 'simpleMolecule' convention which only allows <atomArray> and <bondArray> inside a <molecule>, <atom> inside <atomArray>, and <bond> inside <bondArray>:
<molecule convention="simpleMolecule">
  <atomArray>
    <atom id="c1" elementType="C" hydrogenCount="3"/>
    <atom id="n1" elementType="N" hydrogenCount="2"/>
  </atomArray>
  <bondArray>
    <bond atomRefs="c1 n1" order="s"/>
  </bondArray>
</molecule>
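The containment rules of such a convention are simple enough to check by hand. The following is a minimal Python sketch, not part of the setup described here: the `ALLOWED_CHILDREN` table and the `convention_violations` function are hypothetical names, and the example is assumed to be un-namespaced, as in the snippet above (real CML documents normally carry a namespace).

```python
import xml.etree.ElementTree as ET

# Allowed child elements per parent, following the 'simpleMolecule'
# convention described in the text (hypothetical table, for illustration).
ALLOWED_CHILDREN = {
    "molecule": {"atomArray", "bondArray"},
    "atomArray": {"atom"},
    "bondArray": {"bond"},
    "atom": set(),
    "bond": set(),
}


def convention_violations(element):
    """Return a message for every child element the convention forbids."""
    problems = []
    allowed = ALLOWED_CHILDREN.get(element.tag, set())
    for child in element:
        if child.tag not in allowed:
            problems.append(f"<{child.tag}> is not allowed inside <{element.tag}>")
        # Recurse so that nested violations are reported as well.
        problems.extend(convention_violations(child))
    return problems


EXAMPLE = """
<molecule convention="simpleMolecule">
  <atomArray>
    <atom id="c1" elementType="C" hydrogenCount="3"/>
    <atom id="n1" elementType="N" hydrogenCount="2"/>
  </atomArray>
  <bondArray>
    <bond atomRefs="c1 n1" order="s"/>
  </bondArray>
</molecule>
"""

if __name__ == "__main__":
    root = ET.fromstring(EXAMPLE)
    print(convention_violations(root))
```

An empty list means the document respects the hierarchy. This only covers element nesting; restrictions on attributes or content would need a schema or further checks.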
Now, such conventions need to be specified. One way of doing this is to write them up in a LaTeX document, which would include explanations and likely code examples. To ensure that these code examples are actually valid CML, the following setup may be used, which separates the code examples from the LaTeX source.
Including CML examples in the LaTeX source
Because the framework uses the LaTeX package listings, it cannot use the \include command to pull the code examples into the PDF. Instead, a preprocessor inserts the examples. This is the directory layout I have:
.
|-- Makefile
|-- spec.tex.in
|-- examples
|   |-- Makefile
|   |-- schema.xsd
|   `-- simple1d.valid.cml
`-- preproces.pl
The Makefile creates the PDF from the LaTeX source by first running preproces.pl on the .tex.in file, and then running pdflatex on the created .tex file:
all: spec.pdf

spec.pdf: spec.tex
	pdflatex spec.tex
	pdflatex spec.tex
	pdflatex spec.tex

spec.tex: spec.tex.in
	perl preproces.pl < spec.tex.in > spec.tex
The LaTeX source in the .tex.in file looks like:
\begin{lstlisting}[language=XML,
caption={Simple 1D $^{13}$C NMR spectrum.},
label={list:simple1d}]
% INPUT: simple1d.valid.cml
\end{lstlisting}
The string "% INPUT:" is picked up by the Perl script to include that file. The full script looks like:
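#!/usr/bin/perl
use diagnostics;
use strict;

# Copy STDIN to STDOUT, replacing each "% INPUT: <file>" line
# with the contents of examples/<file>.
while (my $line = <STDIN>) {
  if ($line =~ /^\%\sINPUT:\s(.*)/) {
    my $file = $1;
    $file =~ s/\s+$//;  # drop trailing whitespace (e.g. a stray CR)
    die "Cannot find file 'examples/$file' to insert!\n" if (!(-e "examples/$file"));
    open(my $input, "<", "examples/$file")
      or die "Cannot open file 'examples/$file': $!\n";
    while (<$input>) { print STDOUT $_; }
    close($input);
  } else {
    print STDOUT $line;
  }
}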
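To make the substitution rule concrete, here is the same logic as a small Python sketch that works on in-memory lines instead of files. It is an illustration only, not a replacement for the Perl script: the `expand` function and its `load` callback are hypothetical names, and the example data is made up.

```python
import re

# Same pattern as the Perl script: a line starting with "% INPUT: <name>".
INPUT_LINE = re.compile(r"^%\sINPUT:\s(.*)")


def expand(lines, load):
    """Yield lines, replacing each '% INPUT: name' line with load(name)."""
    for line in lines:
        match = INPUT_LINE.match(line)
        if match:
            # load() returns the lines of the named example.
            yield from load(match.group(1).strip())
        else:
            yield line


if __name__ == "__main__":
    # In-memory stand-in for the examples/ directory (illustrative data only).
    examples = {"simple1d.valid.cml": ["<spectrum/>\n"]}
    source = [
        "\\begin{lstlisting}[language=XML]\n",
        "% INPUT: simple1d.valid.cml\n",
        "\\end{lstlisting}\n",
    ]
    print("".join(expand(source, lambda name: examples[name])), end="")
```

Note that the pattern is anchored at the start of the line, so an indented "% INPUT:" line is passed through unchanged, just as in the Perl version.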
CML Validation
Now that the examples are split out, yet still tightly integrated in the LaTeX source, validating the CML examples is easy. I use a simple Makefile for that, which makes use of xmllint:
all: validate

# Stop at the first example that fails validation, so that make reports an error.
validate: *.valid.cml
	@for f in *.valid.cml; do \
		echo "** Validating $${f} against XML Schema..."; \
		xmllint --noout --schema schema.xsd "$${f}" || exit 1; \
	done
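The Makefile validates every *.valid.cml file, but it does not check that the examples named in the LaTeX source actually exist. A complementary check could look like the following Python sketch. It is an assumption on my part rather than part of the setup above: the function names are hypothetical, and it only tests for well-formed XML with the standard library, so schema validation is still left to xmllint.

```python
import re
import xml.etree.ElementTree as ET
from pathlib import Path

# Same marker the preprocessor looks for in spec.tex.in.
INPUT_LINE = re.compile(r"^%\sINPUT:\s(.*)")


def referenced_examples(tex_source):
    """Return the file names given on '% INPUT:' lines, in order."""
    names = []
    for line in tex_source.splitlines():
        match = INPUT_LINE.match(line)
        if match:
            names.append(match.group(1).strip())
    return names


def check_examples(tex_source, examples_dir):
    """Return problems: referenced files that are missing or not well-formed."""
    problems = []
    for name in referenced_examples(tex_source):
        path = Path(examples_dir) / name
        if not path.is_file():
            problems.append(f"{name}: missing")
            continue
        try:
            ET.parse(path)  # raises ParseError if the XML is not well-formed
        except ET.ParseError as error:
            problems.append(f"{name}: not well-formed ({error})")
    return problems
```

Run from the top-level directory, `check_examples(Path("spec.tex.in").read_text(), "examples")` would return an empty list when every referenced example is present and parses.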