Rhaptos Software Development

Collection Printing Documentation

This first cut of collection printing developer instructions was taken from an email from Chuck Bearden to partners working on Vietnamese printing. No formatting was attempted so some things might look funny.

We have a command-line system that we use for building collection/course PDFs of Connexions content. It's not part of the Zope/Plone system, so it is not available as a product. Do you have access to our SVN repository? In any case, you can download a tarball from here:

http://www.owlnet.rice.edu/~cbearden/vef/collection_printing.tar.gz

In these instructions, I will use the word collection to refer to what we sometimes call a course and sometimes a collection. Since these objects aren't always really a course, we are trying to use the more general term now.

Dependencies:

- GNU make
- pdfetex/pdflatex (in my distro it's part of the tetex-bin package)
- ImageMagick (for the convert program)
- gif2png (if your Linux distro doesn't have it, you can also get it at http://catb.org/~esr/gif2png/)
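Before going further, it may help to confirm that the dependencies are actually on your path. The following is only a minimal sketch: the function name missing_tools is made up for this example, and the program names are my reading of the dependency list (make, pdflatex, convert from ImageMagick, gif2png).

```shell
#!/bin/sh
# missing_tools TOOL...
# Print the name of each TOOL that cannot be found on the PATH.
# (Hypothetical helper, not part of the printing tarball.)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Program names assumed from the dependency list above.
missing_tools make pdflatex convert gif2png
```

If the last line prints nothing, all four programs were found. Any name it prints still needs to be installed.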

The tarball will contain two directories: printing and scripts. From scripts you will need only imagefix (not imagefixer.py) and replace.py. Put them somewhere in your executable path. These are two helper programs used by the main PDF generation system.

The PDF generation system is in the printing directory. The makefile is course_print.mak. You'll need to edit the PRINT_DIR path at the top of the file to point to the final location of printing.
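If you would rather script the PRINT_DIR edit than do it by hand, something like the sketch below might work. It rests on an assumption I have not checked against the real makefile: that the setting is a single line of the form "PRINT_DIR = /some/path". The function name set_print_dir is invented for this example. If the line in your copy looks different, just edit the file in a text editor.

```shell
#!/bin/sh
# set_print_dir MAKEFILE NEW_DIR
# Rewrite the PRINT_DIR line of MAKEFILE to point at NEW_DIR.
# ASSUMPTION: the line looks like "PRINT_DIR = /some/path".
# NEW_DIR must not contain "|", "&" or "\", which are special to sed here.
set_print_dir() {
    makefile=$1
    new_dir=$2
    # Write to a temporary file first, then replace the original.
    sed "s|^PRINT_DIR[[:space:]]*=.*|PRINT_DIR = $new_dir|" "$makefile" > "$makefile.tmp" &&
        mv "$makefile.tmp" "$makefile"
}
```

For the layout used in these instructions, the call would be:

    set_print_dir course_print.mak /home/cbearden/collection_printing/printing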

You will also need one directory for each collection for which you want to build a PDF, to contain the workfiles generated in that process. Typically what I do is have a directory like

/home/cbearden/collection_printing

and make a printing subdir (and later a subdir for each collection)

/home/cbearden/collection_printing/printing

The printing subdir contains what's in the printing subdir of the tarball.

Once you have the directory structure in place, do the following to generate a PDF of a collection:

  1. Create a directory for the collection you want to print. I always name them by the collection ID, so for Rich's big book I'd call it col10064. So the directory structure would look like this:

    /home/cbearden/collection_printing/printing
    /home/cbearden/collection_printing/col10064

  2. Either make a symlink from the collection PDF directory pointing to the makefile in the printing directory, or copy the makefile into the collection subdirectory. This is a convenience so that you don't have to give the full path to the makefile when you build the PDF.

    /home/cbearden/collection_printing/printing/course_print.mak
    /home/cbearden/collection_printing/col10064/course_print.mak -> ../printing/course_print.mak

  3. Download the RDF description of the desired collection into the directory for that collection. The RDF file is the first input into the printing pipeline. For any URL pointing to a collection, you can append the argument ?format=rdf to retrieve the RDF description of it. So for Rich's course at

    http://<your repository address>/content/col10064/latest/

    you can get the RDF description at

    http://<your repository address>/content/col10064/latest/?format=rdf

    I usually use wget to retrieve the RDF:

    wget -O col10064.rdf http://<your repository address>/content/col10064/latest/?format=rdf

    but you could also pull it up in a web browser and save it to the filesystem.

  4. Run the make command with the target as the PDF file with the same basename as the RDF file:

    make -f course_print.mak col10064.pdf
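Steps 1 to 3 can be sketched as a small shell script. This is an illustration only: the function names prepare_collection and rdf_url are invented here, and the base directory and repository host are placeholders you would replace with your own.

```shell
#!/bin/sh
# prepare_collection BASE COL_ID
# Create BASE/COL_ID next to BASE/printing and symlink the makefile into it.
# (Hypothetical helper; in real use BASE/printing already holds the tarball contents.)
prepare_collection() {
    base=$1
    col_id=$2
    mkdir -p "$base/printing" "$base/$col_id"
    # A relative link keeps working if the whole tree is moved.
    ln -sf ../printing/course_print.mak "$base/$col_id/course_print.mak"
}

# rdf_url HOST COL_ID
# Print the URL of the RDF description for the latest version of a collection.
rdf_url() {
    printf 'http://%s/content/%s/latest/?format=rdf\n' "$1" "$2"
}
```

With those two helpers, the whole run for the example collection would look roughly like this:

    prepare_collection /home/cbearden/collection_printing col10064
    cd /home/cbearden/collection_printing/col10064
    wget -O col10064.rdf "$(rdf_url '<your repository address>' col10064)"
    make -f course_print.mak col10064.pdf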

You can also build intermediate targets. One of the most useful is the final LaTeX stage before PDF generation, which in our example would be col10064.tex. Build this target if you need to make final edits to correct problems in our pipeline, or to handle things like page breaks, before building the PDF. The make process always checks whether any input files are newer than the target file, and if so, it starts with them. So if you build a PDF that looks funny and want to correct it by editing the LaTeX (for instance, because no alteration to the CNXML would fix things), edit and save the LaTeX file and run the make command again; make will start with the LaTeX file, since it is the only input file newer than the target file.
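The timestamp rule that make applies can be illustrated in a few lines of shell. This is only a sketch of the idea, not how make is implemented, and the function name needs_rebuild is made up for the example.

```shell
#!/bin/sh
# needs_rebuild TARGET INPUT...
# Succeed (exit status 0) if TARGET is missing or any INPUT is newer than it,
# which is the condition under which make would rebuild TARGET.
needs_rebuild() {
    target=$1
    shift
    [ -e "$target" ] || return 0
    for input in "$@"; do
        # -nt means "has a newer modification time than".
        [ "$input" -nt "$target" ] && return 0
    done
    return 1
}
```

In practice the edit cycle for the example collection is:

    make -f course_print.mak col10064.tex
    (edit and save col10064.tex)
    make -f course_print.mak col10064.pdf

After the edit, col10064.tex is newer than col10064.pdf, so only the last stage is run again.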

Sometimes the PDF build will fail catastrophically, leaving either a broken or an empty PDF file. In this case you need to examine the log file (col10064.log in our example) to work out what pdflatex didn't like about the LaTeX input file. I'm not a LaTeX expert, but I can often figure out something about the problem from this file.
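One quick way into the log is to look for lines that begin with "!", since that is how TeX marks an error message. The helper below is a minimal sketch; the name show_latex_errors is invented for this example.

```shell
#!/bin/sh
# show_latex_errors LOGFILE
# Print every line of LOGFILE that starts with "!" (a TeX error message),
# with its line number and the two lines that follow it for context.
show_latex_errors() {
    grep -n -A 2 '^!' "$1"
}
```

For the example collection you would run:

    show_latex_errors col10064.log

The lines after each error usually show where in the LaTeX input the problem was noticed, which gives you a place to start looking in col10064.tex.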

As you look through the resulting PDF, you will sometimes see images that are too large or too small. We have a way of dealing with this problem, but it's rather ugly. In the directory containing the printing workfiles, create a file with the collection ID as the basename and an extension of .width. Each line of this file should contain the basename of an image file, a space, and a width for that image expressed in some measure that pdflatex understands (I use inches; you would probably use cm). The entries would look like this:

m10790_fig1a 6.5in
m10790_fig1b 6.5in

If you omit a unit, the number is treated as a scaling factor:

m10757_mfilt_3 .75
m10764_fig3 .75

These numbers are simply stuffed into the LaTeX file by the imagefix program, into the width argument of the \includegraphics commands, e.g.:

\includegraphics[width=6.5in]{col10064/m10790_fig1a.png}

If there is no entry present in the width file, then the \includegraphics is printed without the width argument, e.g.:

\includegraphics{col10064/m10790_fig1a.png}

So you can use any width specification in the width file that you can use in the LaTeX \includegraphics command.
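To make the lookup concrete, here is a small shell sketch of the behaviour described above for entries that carry a unit. It is not the real imagefix program, and the function name includegraphics_line is made up. It also does not model the unitless scaling-factor case, because these notes don't say what imagefix writes for it.

```shell
#!/bin/sh
# includegraphics_line WIDTH_FILE COL_ID IMAGE
# Print the \includegraphics command for IMAGE, using the width found in
# WIDTH_FILE if there is an entry for it, and no width argument otherwise.
# (Illustration only; imagefix itself may work differently.)
includegraphics_line() {
    width_file=$1
    col_id=$2
    image=$3
    # Take the second field of the first line whose first field is IMAGE.
    width=$(awk -v img="$image" '$1 == img { print $2; exit }' "$width_file")
    if [ -n "$width" ]; then
        printf '\\includegraphics[width=%s]{%s/%s.png}\n' "$width" "$col_id" "$image"
    else
        printf '\\includegraphics{%s/%s.png}\n' "$col_id" "$image"
    fi
}
```

With the first example width file above saved as col10064.width, the call

    includegraphics_line col10064.width col10064 m10790_fig1a

prints the first \includegraphics line shown earlier, and an image with no entry in the file gets the form without a width argument.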

These instructions should be enough to get you started. I know that there are many problem situations that they won't cover, so let me know when you do have problems. I'm actively working on this code as I also work on the PDFs we are generating, so if you can use our svn, you can svn up to get any new changes. You could also create a branch of the code in which you make your changes to handle Vietnamese correctly, and merge those changes into our code after you svn up.

The whole build process is rather ugly and ad hoc, and I know that some of the LaTeX constructions are bad, but we will improve this bit by bit. LaTeX and TeX handle math so well that it's hard to give up this process even if XSL-FO would be much easier in other respects.

By the way, you may find it helpful to look at the workfiles and PDFs of our collections that I have committed to our svn:

https://trac.rhaptos.org/trac/rhaptos/browser/printing_workfiles

You can see examples of width files, and how I use patches to capture manual edits to the LaTeX to re-apply after code and content updates.

Created by kef
Last modified 2007-10-18 15:08