In this last part of writing chemistry with LaTeX, we will look at the chemscheme portion of chemstyle.  It offers automatic tracking for schemes and compounds, and is a major time-saver for synthetic chemists.  Download example packages here: version 1, version 2.

Before we look at how to work with chemscheme, let me show you what it can do.  Imagine you are writing the following report: Fischer Esterification.

If you’re writing this in a word processor, your workflow would likely be to draw and number the structures in ChemDraw.  Then you would either drag the drawing directly into the word processor, or prepare a .tiff of it and link/embed that into the document.  You would be careful to make sure that duplicated compounds (e.g., 1 here) get the same number.  You would be very careful to make sure that the numbers in the main text are the right numbers throughout, and you’re very proud that it all works out just right.

And then your supervisor makes the thoughtful comment that you should include what the real target and its starting material are, and that those should go in front of all your structures.  Now you’re in a not-so-great place: the two new compounds will displace all your numbering, and you have to renumber all your figures, and then track down all the stray numbers.  Our example is trivial, but imagine doing that with a paper that has over 100 compounds spread across 200 pages!

chemstyle provides a behind-the-scenes mechanism (by way of Stephan Schenk’s chemcompounds, or Bjørn Pedersen’s bpchem) to automatically keep track of the numbering of compounds and schemes, so the numbers are always right after you hit typeset.  To make the change the supervisor suggested, all I did here was add the additional scheme, and LaTeX-chemstyle did the hard work:

In the rest of this post, we’ll look at the syntax and workflow that make this happen.  WordPress.com doesn’t color LaTeX syntax, so I’ll settle for screenshots.  You may want to follow along with the examples, which can be downloaded here: version 1, version 2.  Version 1 is the initial report; version 2 is revised with the supervisor’s suggestion.

Structure Drawing

Draw the structures in your drawing program as usual (I’m using ChemDraw here).  However, instead of manually numbering the molecules, place a tag where their number would be; the default behaviour in chemstyle is to search for “TMP”.  Your structure file would thus look like:

You would probably want to save a copy as a ChemDraw file as you usually do (in case you need to edit it), and you will need to save an additional copy as .eps (Encapsulated PostScript).  Unlike .jpg or .png, .eps is a vector format whose text LaTeX (through the psfrag package, which chemscheme relies on) can find and replace.

In ChemDraw, choose Save As, and select .eps. On OSX, you’re given a choice of Mac or text format, and it does not matter which one you choose to use.  I like to organize my files in a /Figures/ directory, instead of saving directly with the paper.


Repeat for each scheme or structure.  When there is more than one structure you want to label, use TMP1, TMP2… to denote them from left to right.

Annotating in LaTeX

The chemstyle package provides many different options to format the numbers.  For example: numbering with 1a, 1b, 1c; custom formatting for the labels (I use magenta here to make them stand out better); see the manual for more details.  We will only be concerned with getting the basics working here.

We first need to insert two lines in the preamble of our file (i.e., before \begin{document}):
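
The screenshot of the preamble is not reproduced here, so the following is only a rough sketch of what such a preamble might look like.  The A and B markers correspond to the notes that follow.  The runs=2 option and the exact formatting line are inferred from the description rather than copied from the original file, and the graphicx and xcolor lines are added on the assumption that they are needed for \includegraphics and the colour name.

```latex
\documentclass{article}
\usepackage{graphicx}  % assumed: provides \includegraphics
\usepackage{xcolor}    % assumed: provides \color and the name "magenta"

% A: chemstyle (which brings in chemscheme) and auto-pst-pdf
\usepackage{chemstyle}
% If \compound turns out to be undefined, the tracking package may need
% to be selected through a chemstyle option; check the manual.
\usepackage[runs=2]{auto-pst-pdf}  % converts .eps to .pdf on the fly

% B: make compound numbers in schemes magenta and italic
% (remove this line to get the default formatting back)
\renewcommand*{\schemerefformat}{\color{magenta}\itshape}

\begin{document}
% ... main text ...
\end{document}
```

auto-pst-pdf runs an auxiliary compilation in the background, so pdfLaTeX generally has to be allowed to call external programs (typically by typesetting with the -shell-escape flag).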

A – you’re already familiar with the chemstyle line.  The auto-pst-pdf package allows you to use .eps files with pdfLaTeX by converting them into .pdf on the fly.  Setting it to 2 runs is required for the text substitution (I think – if someone can clarify, that’d be great).  auto-pst-pdf is included in major LaTeX distributions.

B – these lines modify the \schemerefformat command with the instructions to make the numbers magenta and italic.  I tried this as part of the tutorial in Wright’s documentation, and I like it.  It stands out better, I think.  (When you are ready to submit your manuscript, just take out these lines to make the compound numbers bold, black, and boring again.)

The preamble is thus set up.  In your main text, whenever you need a scheme, all you need to do is:

  • \begin{scheme} and \end{scheme} mark the scope of the scheme, and tell LaTeX that it is a scheme in the first place – LaTeX counts quietly in the background to make sure the numbering in the caption and the text comes out right.
  • \begin{center} and \end{center} tell LaTeX you want the scheme to be in the middle (as opposed to being on the left or right side of the page).
  • \schemeref[TMP]{cmpd:PhCOOiPr} indicates that the text “TMP” in the .eps file is to be substituted with a number, and that the number is for a compound called “cmpd:PhCOOiPr”.  You could have named it anything you like – “isopropylbenzoate”, “1methyl2ethylbenzoate” – but it should be unique for that compound.  This is the handle you will use to refer to it in the main text.  (I like to prepend cmpd: to tell myself it’s a compound; sch: for schemes, and fig: for figures, etc.)
  • The \includegraphics line simply tells LaTeX which graphic to use here.

If you have multiple tags to substitute (TMP1, TMP2), then you will need one \schemeref[]{} line for each tag.
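
Since the screenshot of the scheme code is missing, here is a minimal sketch of what the pieces described above might look like when put together.  The file names, captions, the sch: labels and the cmpd:PhCOOH label are made up for illustration; only cmpd:PhCOOiPr comes from the text.

```latex
% One tag (TMP) in the .eps file
\begin{scheme}
  \begin{center}
    \schemeref[TMP]{cmpd:PhCOOiPr}               % TMP -> number of this compound
    \includegraphics{Figures/ester.eps}          % hypothetical file name
  \end{center}
  \caption{The target ester.}                    % hypothetical caption
  \label{sch:ester}                              % hypothetical label
\end{scheme}

% Two tags (TMP1, TMP2) in the .eps file: one \schemeref line per tag
\begin{scheme}
  \begin{center}
    \schemeref[TMP1]{cmpd:PhCOOH}                % hypothetical compound label
    \schemeref[TMP2]{cmpd:PhCOOiPr}
    \includegraphics{Figures/esterification.eps} % hypothetical file name
  \end{center}
  \caption{Fischer esterification.}              % hypothetical caption
  \label{sch:esterification}                     % hypothetical label
\end{scheme}
```

Note that the \schemeref lines come before the \includegraphics line, so the substitutions are already set up when the graphic is read.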

In the main text, wherever you want the number for a compound to appear, you will use the command \compound{name}; for example, for the ester above, it would be \compound{cmpd:PhCOOiPr}.
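
As a small illustration, a sentence in the main text might be written like this (the sentence itself is invented; only the label is taken from the \schemeref example):

```latex
% \compound prints the number assigned to the label,
% so the text always matches the scheme.
The ester \compound{cmpd:PhCOOiPr} was isolated as a colourless oil.
```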

Note that the first time you typeset this, a bunch of ??s appear in the manuscript where I promised you numbers, even though the compounds in the scheme received the right numbering:

This is because LaTeX needs to go through the manuscript once to generate the numbers, and as it does so, it keeps a running tally of the name:number pairs in an auxiliary file.  The second time you typeset, it sees that there is an auxiliary file available, and uses that to fill in the ??s.

There is a bug related to this.  If the manuscript has been typeset before and new compounds are then added, the main text does not update correctly.  To avoid that, you’ll need to trash the .aux file so that the whole numbering is redone from scratch.  (Be careful about when you trash .aux files, especially when you’re generating a bibliography as well.)

And that will get your numbering right all the time.  A very acceptable trade-off for a short learning curve… thank you to Joseph Wright and others whose work he built upon.

Addendum April 6th 2010: I heard back from Joseph about the misnumbering of compounds.  This may be of interest to everyone using chemscheme/chemcompounds:

You’re the second person who’s mentioned this to me recently. The way that chemstyle works is just by loading the chemcompounds package. Due to the way floats are processed it’s safest to include \compound* instructions before the floats to keep the numbering “in order”. I tend to have a block of \compound* lines at the start of each chapter/file so that everything stays in order.

Joseph
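
A block of the kind Joseph describes might look something like the sketch below.  The labels other than cmpd:PhCOOiPr are hypothetical; the starred form assigns the numbers in the order listed without printing anything.

```latex
% At the start of the chapter/file, before any scheme floats:
% fix the numbering order up front.
\compound*{cmpd:PhCOOH}    % hypothetical label, becomes the first number
\compound*{cmpd:iPrOH}     % hypothetical label, becomes the second number
\compound*{cmpd:PhCOOiPr}  % becomes the third number
```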

Addendum May 29 2011: To get the numbering right for the ~250 compounds in the 1k page thesis, I built a tool to automatically generate the \compound* lines from a TeX file. It’s entirely browser-based and you can find it here.  It’s not fully functional in that you need to delete the comma from the last entry; the fully functional version got mucked up in the CoffeeScript->JS->WordPress integration.  It should be a simple fix, but I’m not sure when I’ll have time to get back into programming mode.