The objective is to design a package management system for TeX that is directly connected to CTAN. This document describes the structure of the system and the two file formats it uses: CTPD files, which should be kept on CTAN, and TPM files, which should be kept in the user’s TDS tree.
(CTAN stands for Comprehensive TeX Archive Network. CTAN is a worldwide system of internet-based repositories which contain various TeX-related materials such as complete TeX distributions, TeX macro packages, fonts, various utilities, documents, etc. In fact, it contains almost all publicly available TeX-related materials.
TDS stands for TeX Directory Structure. A “TDS tree” (also known as a “texmf tree”) is a directory tree of files to be used directly by a TeX installation. See “A Directory Structure for TeX Files” in the reference list.)
In what follows PM is an abbreviation for “package manager” or “package management”.
CTAN files are not directly usable. One has to unpack, compile, and install them at the right location. There exist TeX distributions (TeXLive, MiKTeX) which connect the end user to CTAN. A distribution builds archived packages, and these packages can then be installed or uninstalled by the PM software of the distribution. There is always some lag between a package update on CTAN and its update in the distributions. If a user wants a newer version of a TeX package, they have to go to CTAN, fetch the files and install them manually in their TDS tree. (Preferably, this should be done after uninstalling the previous version of the same package.)
Another problem which a PM system can solve is that the rules of a distribution (TeXLive is an example) can forbid the inclusion of some categories of packages because their licensing terms are not sufficiently free. Of course, it is up to the user of such a distribution to add any such packages if they are needed.
The objective of the current document is to outline a solution for the problem of upgrading directly from CTAN without unnecessary manual operations which need technical knowledge, are time-consuming and prone to errors. Also, the document provides for some unification of package management of TeX distributions and a possibility of building some of the distributions’ packages automatically.
Structure of the System
To describe an enhancement to the current PM, let us first describe the structure of the current PM system, as exemplified by TeXLive.
?? The following is incorrect and needs editing!
When the TeXLive distribution is built, ctan2tds (the TeXLive “sausage machine”) takes a package from CTAN and puts it in a temporary TDS tree for packaging (it also creates some derivative files). A TPM file is created as a guide for the distribution installer/uninstaller. The order is as follows:
CTAN directory → (ctan2tds, etc.) → archived package and TPM → (TPM-based PM engine) → user
The TeXLive way of building distribution files is based on heuristics.
From the point of view of package management, current TPM files are distribution-oriented. The plan is to leave TPM files their current function of providing information to the user’s PM engine and to introduce a new complementary metadata format, the “CTAN Package Description” (CTPD).
The idea of CTPD files is that they should be CTAN-oriented and provide a direct projection of CTAN onto TDS. Such a CTPD file can be held on CTAN together with the package it describes, or in a separate repository of CTPD files (possibly under some version control system).
The new structure is as follows:
CTAN directory (with CTPD) → (CTPD-based distribution-building engine) → archived package and TPM → (TPM-based PM engine) → user
CTAN directory (with CTPD) → (CTPD-based PM engine) → user
The function of a “CTAN Package Description” (CTPD) is to describe a package located on CTAN and to provide guidance on how to build a TDS representation of this package. In particular, a CTPD file dictates where files should be installed in a TDS tree. It also contains instructions for obtaining derivative files (such as running a TeX program to unpack a .dtx file, produce documentation, etc.).
One takes the CTAN directory (or several directories) of a package and the corresponding CTPD file, and then produces a TDS representation (with all files ready to use) and a TPM file for the package. The general package information (like ‘Description’, ‘Author’, ‘Version’) is simply copied from the CTPD to the TPM.
The TPM file is meant for the user’s TDS tree and for inclusion in distributions. It contains the information needed for uninstallation (and also installation, for distributions): a list of the files which belong to the package, as they are located in a TDS tree.
Suppose there are a.sty, b.sty, c.sty and zzz.tex in /macros/latex/contrib/zzz. CTPD file for zzz can, for example, specify the following process for building TDS representation of zzz:
- “take all *.sty files from /macros/latex/contrib/zzz and put them into /texmf/tex/latex/zzz”
- “take zzz.tex, run it through latex twice in a temporary directory and put zzz.tex and zzz.dvi into /texmf/doc/latex/zzz”
Then the engine used should take these files and put them in the TDS tree. Also it should write the corresponding list into TPM file:
/texmf/tex/latex/zzz/a.sty; /texmf/tex/latex/zzz/b.sty; /texmf/tex/latex/zzz/c.sty
(for the “run files” element of the TPM file) and
/texmf/doc/latex/zzz/zzz.tex; /texmf/doc/latex/zzz/zzz.dvi
(for the “document files” element of the TPM file).
While doing this, the engine can check the validity of file names. Thus the TPM file describes which files in the user’s TDS tree belong to the zzz package. The generated zzz.tpm file should then be put into the /texmf/tpm directory (or an appropriate subdirectory). To uninstall the package, the uninstaller should take this zzz.tpm file, read the file list and remove the listed files. Then the uninstaller should remove zzz.tpm itself.
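The zzz example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the file list is held as a plain Python list of paths relative to the TDS root (the real TPM file is XML), and the helper names `plan_run_files` and `uninstall` are hypothetical, not part of any existing tool.

```python
import fnmatch
import tempfile
from pathlib import Path


def plan_run_files(ctan_files, pattern, tds_subdir):
    """Map CTAN file names matching `pattern` to paths inside the TDS tree."""
    return [f"{tds_subdir}/{name}" for name in sorted(ctan_files)
            if fnmatch.fnmatch(name, pattern)]


def uninstall(tds_root, file_list, tpm_path):
    """Remove every listed file, then the TPM file itself."""
    for rel in file_list:
        (Path(tds_root) / rel).unlink(missing_ok=True)
    Path(tpm_path).unlink(missing_ok=True)


# "take all *.sty files and put them into tex/latex/zzz"
ctan_files = ["a.sty", "b.sty", "c.sty", "zzz.tex"]
run_files = plan_run_files(ctan_files, "*.sty", "tex/latex/zzz")

# Simulate an installed package in a throwaway TDS tree, then remove it.
with tempfile.TemporaryDirectory() as root:
    for rel in run_files:
        target = Path(root) / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text("% dummy\n")
    tpm = Path(root) / "tpm" / "zzz.tpm"
    tpm.parent.mkdir(parents=True)
    tpm.write_text("\n".join(run_files))  # stand-in for the real XML TPM file
    uninstall(root, run_files, tpm)
    leftovers = [p for p in Path(root).rglob("*") if p.is_file()]
```

The uninstaller never needs to know how the files were built; the recorded list is enough, which is the division of labour between CTPD and TPM files described here.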
There is an important difference between CTPD and TPM files. For a description of a CTAN package (that is, CTPD file) one does not care about the full list of files which would be placed in a TDS tree. For a description of an installed package (that is, TPM file) one does not care about the niceties of package building; there are some files on the system and one needs information on how to remove them when needed.
How should installation based on a CTPD file work from the user’s perspective?
- The user requests installation of a CTAN package.
- The CTPD-based PM engine checks whether a CTPD file exists and whether it is up to date, and complains if it is missing or outdated.
- The PM engine fetches all corresponding files from the repository (and unpacks them if the package comes as a single archive file).
- If the CTPD file contains a file list, the PM engine checks the available files against the list.
- The PM engine carries out the instructions from the CTPD file and thereby builds a TDS representation of the package in the user’s TDS tree.
- The PM engine produces the corresponding TPM file and stores it in the TDS tree.
Specification of CTPD File Format
The format of CTPD files is based on XML.
Like any XML file, a CTPD file can contain comments. A comment is human-readable text that is ignored by PM engines. Comments are written as follows:
<!-- text of comment -->
The XML elements to be used inside a CTPD file and their attributes are listed below. The root element is ‘CTPD’.
The ‘Header’ XML element contains general information about the CTAN package (description of the package, version number, and so on). The information is stored in the corresponding attributes (key=value pairs).
<Header ‹key=value pairs› />
- Creator: Creators of the CTPD file. The name(s) of the person(s) who created and/or modified the file.
- Name: The name of the package (most frequently the name of the CTAN directory).
- Example: Name='zzz'
- Version: The version of the package. The common version format is ‹major version›.‹minor version›. (See “Dependencies” below for details.) If there is no version, use ‘Date’ and leave ‘Version’ blank or just drop the ‘Version’ attribute. If there is no version and ‘Date’ cannot be used for version control for some reason, put a date in the ‘Version’ attribute (like Version='2002-07-11').
- Example: Version='1.23'
- Date: The date of the package release, of its entry into CTAN, or of the most recent file in the package’s CTAN directory. The only requirement is that the date is not before the date of the most recent file and is before the date of any subsequent release. The date format is ISO YYYY-MM-DD with an optional hh:mm:ss part.
- Examples: Date='2007-12-31', Date='2007-12-31 23:59:59'
- Author: The name(s) of the package author(s), maintainer(s) or the organization that distributes the package.
- Examples: Author='John Smith', Author='ZZZ foundation'
- Title: One-line description (summary) of the package.
- Example: Title='Macros for writing Z-Z-Z files'
- Description: A more detailed description of the package than the ‘Title’. (In the example CTPD file below it appears as a separate ‘Description’ element.)
- Location: Location of the package on CTAN. Can include several directory names separated by commas.
- Examples: Location='/macros/latex/contrib/zzz', Location='/macros/latex/contrib/zzz, /fonts/zzz'
- Homepage: URL of the package homepage outside CTAN where additional information can be found.
- Example: Homepage='http://www.zzz-foundation.org/index.html'
- Copyright: Package copyright information.
- Examples: Copyright='GPL', Copyright='LPPL'
Most of the information in the header is meant to be read by humans. Only the ‘Name’, ‘Version’, ‘Date’ and ‘Location’ attributes can have a direct bearing on the PM process.
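As an illustration of how a PM engine might pick out the attributes that matter to it, here is a minimal sketch using Python’s standard `xml.etree.ElementTree` module. The function name `read_header` and the shape of the returned dictionary are assumptions made for this example; the proposal does not prescribe an implementation.

```python
import xml.etree.ElementTree as ET

CTPD_TEXT = """
<CTPD version='0'>
  <Header Name='zzz' Version='1.23' Date='2007-12-31'
          Location='/macros/latex/contrib/zzz, /fonts/zzz'
          Title='Macros for writing Z-Z-Z files'/>
</CTPD>
"""


def read_header(ctpd_text):
    """Return the PM-relevant Header attributes of a CTPD document."""
    header = ET.fromstring(ctpd_text.strip()).find("Header")
    if header is None:
        raise ValueError("CTPD file has no Header element")
    # 'Location' may name several CTAN directories separated by commas.
    locations = [p.strip() for p in header.get("Location", "").split(",")
                 if p.strip()]
    return {
        "name": header.get("Name"),
        "version": header.get("Version") or None,  # blank and absent are equivalent
        "date": header.get("Date"),
        "locations": locations,
    }


info = read_header(CTPD_TEXT)
print(info["name"], info["version"], info["locations"])
```

Treating a blank ‘Version’ the same as a missing one follows the rule given for that attribute; the remaining attributes are simply ignored by this sketch.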
Dependency is a general feature of package management. TeX is not an exception. It is not uncommon that the use of one TeX package depends on the existence of some other package in TDS tree. The dependency between packages can be complicated by backward compatibility and version requirements. Thus, CTPD file should include dependency information. In general, it is rather hard to design a PM system which is fully consistent and efficient. However, providing to PM engine some basic dependency information can help to solve the most common problems.
The CTPD file can have several ‘Requires’ and ‘Conflicts’ XML elements, each naming a package and optionally a version number.
- Requires: Names another package which is needed for the functioning of the package being described. The name is given in the ‘package’ attribute. Additionally, the ‘version’ attribute states specific version requirements. It consists of a comparison sign (ge, gt, eq, lt, or le) and a version number. Example:
<Requires package='yyy' version='ge 4.32.1'/>
- Conflicts: Names another package which is known to be incompatible with the package being described. This element has the same format as the ‘Requires’ element. Example:
<Conflicts package='aaa' version='lt 0.2'/>
A version is a sequence of integers separated by dots. Letters and other signs should be converted to numbers. For instance, “1.-1.2” can stand for “ver. 1 beta, build 2”. The exception is a single letter at the end, as in “1.2h”, which is treated as equivalent to “1.2.8” (“h” being the eighth letter of the alphabet).
Versions are compared by their first positions, then by their second positions, and so on. An empty position should be treated as zero. For packages without version information a relevant date can be used instead of a version number (like "ge 2001-09-14"). The format of the date is YYYY-MM-DD (ISO standard). ?? Is hh:mm:ss part necessary?
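The comparison rules can be sketched as follows. This is a minimal sketch, not a reference implementation: the function names are hypothetical, dates are handled by treating YYYY-MM-DD as three numeric positions (an assumption, since the text leaves date handling open), and the hh:mm:ss part is not supported.

```python
import operator
import re
from itertools import zip_longest

OPS = {"ge": operator.ge, "gt": operator.gt, "eq": operator.eq,
       "lt": operator.lt, "le": operator.le}


def parse_version(text):
    """Turn '1.2h' into [1, 2, 8] and '1.-1.2' into [1, -1, 2]."""
    text = text.strip()
    if re.fullmatch(r"\d{4}-\d{2}-\d{2}", text):  # date used as a version
        return [int(part) for part in text.split("-")]
    m = re.fullmatch(r"(.*\d)([A-Za-z])", text)
    if m:  # single trailing letter: a=1, b=2, ..., h=8
        text = f"{m.group(1)}.{ord(m.group(2).lower()) - ord('a') + 1}"
    # An empty position counts as zero.
    return [int(part) if part else 0 for part in text.split(".")]


def compare(a, b):
    """Return -1, 0 or 1; missing positions count as zero."""
    for x, y in zip_longest(parse_version(a), parse_version(b), fillvalue=0):
        if x != y:
            return -1 if x < y else 1
    return 0


def satisfies(installed, constraint):
    """Check an installed version against a constraint such as 'ge 4.32.1'."""
    op, wanted = constraint.split()
    return OPS[op](compare(installed, wanted), 0)


print(parse_version("1.2h"))             # [1, 2, 8]
print(satisfies("4.33", "ge 4.32.1"))    # True
print(satisfies("0.2", "lt 0.2"))        # False
```

Comparing position by position, rather than comparing strings, is what makes “4.33” newer than “4.32.1” and “1.-1.2” (a beta) older than “1.0”.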
‘Actions’ is a container XML element which describes the sequence of actions needed to produce a TDS representation for the package.
<Actions> ‹List of actions› </Actions>
- Specifies an ID for an additional directory for “run files”
- Specifies an ID for an additional directory for “document files”
- Specifies an ID for an additional directory for “source files”
- These elements must have an ‘id’ attribute to specify an ID and a ‘path’ attribute to specify the directory name in TDS (a path relative to the TDS tree root). An ID starts with “@”. Example:
<run_dir id='@run2' path='tex/zzz/zzz'/>
- a_copy: Copy files from the CTAN directory. Must have a ‘d’ attribute to specify the destination directory.
<a_copy d='@lsrc'> zzz.dtx, zzz.ins </a_copy>
- copy: Copy files between directories. Must have a ‘d’ attribute to specify the destination directory.
<copy d='@lrun'> @tmp/zzz.sty </copy>
- move: Similar to ‘copy’. Move files between directories.
- unpack: Unpack an archive. Can have an ‘f’ attribute to specify the format; otherwise the archive format is inferred from the extension. Can have a ‘d’ attribute to specify the destination directory.
<unpack> @tmp/zzz.zip </unpack>
- pack: Make an archive. Must have an ‘f’ attribute to specify the format. For more than one file it must have a ‘d’ attribute to specify the destination directory.
<pack f='gzip'> @tmp/zzz.ps </pack>
- Clean directory. The ‘d’ attribute specifies the directory to clean.
- tex, pdftex, latex, pdflatex, bibtex, makeindex, dvipdfm, dvips: Run the corresponding program. The attributes are ‘runs’ (number of times to run) and ‘options’ (options string to add to the command line). Example:
<latex> sample.tex </latex>
<makeindex options='-s gind.ist'> sample.idx </makeindex>
<bibtex> sample </bibtex>
<latex runs='2'> sample.tex </latex>
- man: Text describing additional manual operations.
<man> Add “Hello world!” line to the end of zzz.cfg after installation. </man>
- Actions can have attributes describing them as being specific to distribution, operating system or target documentation format. E.g. DVI/PDF/PS, Win/Linux, TeXLive/MiKTeX.
- Predefined directories
- ?? Needs to be completed
- Temporary directory
- Used with ‘move’ to delete files ?? is this really needed?
- “Run” directories
- “Documentation” directories
- “Source” directories
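To make the role of directory IDs concrete, the following sketch reads an ‘Actions’ element into a list of steps and replaces each “@id” reference by a directory. Several things here are assumptions made purely for illustration: the mapping of “@tmp” to a directory named `build`, the rule that any element carrying both ‘id’ and ‘path’ declares a directory, and the helper names `resolve` and `read_actions`. The sketch only plans the steps; it does not copy files or run TeX.

```python
import xml.etree.ElementTree as ET
from pathlib import PurePosixPath

ACTIONS_TEXT = """
<Actions>
  <run_dir id='@run2' path='tex/zzz/zzz'/>
  <a_copy d='@tmp'> zzz.dtx, zzz.ins </a_copy>
  <latex runs='2'> @tmp/zzz.dtx </latex>
  <copy d='@run2'> @tmp/zzz.sty </copy>
</Actions>
"""


def resolve(ref, dirs):
    """Replace a leading '@id' in `ref` by the directory registered for it."""
    if not ref.startswith("@"):
        return ref
    dir_id, _, rest = ref.partition("/")
    base = PurePosixPath(dirs[dir_id])
    return str(base / rest) if rest else str(base)


def read_actions(text, dirs):
    """Return a list of (action, attributes, files) steps in document order."""
    dirs = dict(dirs)  # do not modify the caller's mapping
    steps = []
    for el in ET.fromstring(text.strip()):
        attrs = dict(el.attrib)
        if "id" in attrs and "path" in attrs:  # directory declaration
            dirs[attrs["id"]] = attrs["path"]
            continue
        if "d" in attrs:
            attrs["d"] = resolve(attrs["d"], dirs)
        files = [resolve(f.strip(), dirs)
                 for f in (el.text or "").split(",") if f.strip()]
        steps.append((el.tag, attrs, files))
    return steps


# '@tmp' -> 'build' is an assumption made for this example only.
steps = read_actions(ACTIONS_TEXT, {"@tmp": "build"})
for step in steps:
    print(step)
```

Keeping planning separate from execution means an engine could show the user what would happen, or apply distribution-specific filtering, before touching the TDS tree.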
Package author might want to provide a list of files which are included in the package. Such a list can be used to check the integrity of package distribution. It is the responsibility of PM engine to check the availability of files according to the list and warn the user that some files are missing or, possibly, there are superfluous items. File and subdirectory names can be complex paths like “documentation/mydoc.dvi” or “tex/latex/zzz”.
- Files: This XML element is a container for the file specification. First comes a list of file names, separated by commas. Second, there can be ‘subdir’ XML elements.
<Files> ‹Items› </Files>
- subdir: XML element to be used inside ‘Files’ or another ‘subdir’. It must have a ‘path’ attribute. First comes a list of file names, separated by commas. Second, there can be ‘subdir’ XML elements.
<subdir path='‹relative path for subdirectory›'> ‹Items› </subdir>
- Example:
<Files>
  INSTALL, README, texinput/diagnose.sty
  <subdir path='doc'>
    diagnose.dvi, diagnose.pdf, diagnose.ps, diagnose.tex, mls-diag.tex
  </subdir>
</Files>
- Size ??
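The integrity check described for the ‘Files’ element could look roughly like this. It is a sketch under assumptions: the nested `examples` subdirectory and the file `doc/old.log` are invented for the demonstration, and the function names `listed_files` and `check_files` are hypothetical.

```python
import xml.etree.ElementTree as ET

FILES_TEXT = """
<Files>
  INSTALL, README, texinput/diagnose.sty
  <subdir path='doc'>
    diagnose.dvi, diagnose.pdf
    <subdir path='examples'> demo.tex </subdir>
  </subdir>
</Files>
"""


def listed_files(element, prefix=""):
    """Collect the paths named in a 'Files' or 'subdir' element, recursively."""
    # Names may appear before the first child (element.text)
    # or after any child element (child.tail).
    chunks = [element.text or ""] + [child.tail or "" for child in element]
    names = [n.strip() for chunk in chunks for n in chunk.split(",") if n.strip()]
    paths = {prefix + n for n in names}
    for child in element:
        if child.tag == "subdir":
            sub_prefix = prefix + child.get("path").strip("/") + "/"
            paths |= listed_files(child, sub_prefix)
    return paths


def check_files(files_xml, available):
    """Return (missing, superfluous) as sorted lists of relative paths."""
    expected = listed_files(ET.fromstring(files_xml.strip()))
    available = set(available)
    return sorted(expected - available), sorted(available - expected)


missing, superfluous = check_files(
    FILES_TEXT,
    ["INSTALL", "README", "doc/diagnose.dvi", "doc/diagnose.pdf", "doc/old.log"],
)
print("missing:", missing)
print("superfluous:", superfluous)
```

A PM engine would turn the two lists into warnings rather than hard errors, in line with the “warn the user” wording of the proposal.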
Example CTPD File
<CTPD version = '0'>
  <Header
    Creator = 'tsy'
    Name = 'bigfoot'
    Version = ''
    Date = '2006-07-16'
    Author = 'David Kastrup'
    Title = 'The bigfoot package'
    Location = '/macros/latex/contrib/bigfoot'
    Copyright = 'GPL version 2 (or later)'
    Homepage = 'http://sarovar.org/projects/bigfoot'
  />
  <Description>
    This is the bigfoot bundle for critical edition typesetting and other concerns.
    ..........snip.......
  </Description>
  <Requires package='etex'/>
  <Requires package='ncctools'/>
  <Actions>
    <a_copy d='@ldoc'> README </a_copy>
    <a_copy d='@tmp'> bigfoot.ins, bigfoot.dtx, perpage.dtx, suffix.dtx </a_copy>
    <a_copy d='@lsrc'> bigfoot.ins, bigfoot.dtx, perpage.dtx, suffix.dtx </a_copy>
    <latex> @tmp/bigfoot.ins </latex>
    <copy d='@lsrc'> @tmp/bigfoot.drv, @tmp/perpage.drv, @tmp/suffix.drv </copy>
    <copy d='@lrun'> @tmp/bigfoot.sty, @tmp/perpage.sty, @tmp/suffix.sty </copy>
    <latex runs='2'> @tmp/bigfoot.drv, @tmp/perpage.drv, @tmp/suffix.drv </latex>
    <copy d='@ldoc'> @tmp/bigfoot.dvi, @tmp/perpage.dvi, @tmp/suffix.dvi </copy>
  </Actions>
  <Files>
    README, bigfoot.dtx, bigfoot.ins, perpage.dtx, suffix.dtx
  </Files>
</CTPD>
Structure of TPM File
The TPM format is based on XML and has a corresponding document type definition (DTD). The namespace used is TPM, but the TPM structure is contained inside a document in RDF format with the rdf namespace. That is, the TPM content itself is nested in an rdf:Description element.
The RDF format has been chosen for describing the components. RDF is an XML application that allows one to express sentences of the OAV (Object – Attribute – Value) kind. One can use it to claim that a given URL on the Internet has a given attribute with a given value. This model is powerful enough to express complex sentences, but we don’t need the advanced features for the moment. However, relying on this language is the best bet for the future, when we might need them. We have chosen a dedicated namespace for the so-called TPM (TeX Package Management) description files.
The general structure of a TPM file is as follows. The nesting of XML elements is shown by indentation.
<?xml version="1.0"?>
<!DOCTYPE rdf:RDF SYSTEM "(dtdpath)/tpm.dtd">
<rdf:RDF>
  <rdf:Description>
    ‹General description section: <TPM:Name>, etc.›
    <TPM:Build> ‹Build section› </TPM:Build>
    <TPM:Requires> ‹Requires section› </TPM:Requires>
    <TPM:Installation> ‹Installation section› </TPM:Installation>
    ‹Files section›
  </rdf:Description>
</rdf:RDF>
Topmost XML Tags
- Shows XML version. Redundant
- MiKTeX only
- This gives a DTD for the TPM format. A DTD defines constraints on the logical structure of the document and can be used to check a TPM file for validity, as with any other XML document.
- “(dtdpath)/tpm.dtd” is “../../Tools/tpm.dtd” in TeXLive and “../tpm.dtd” in fptex
- This is the first element in the TPM file. Its attributes fix XML namespaces using URIs. There are two namespaces, rdf and TPM:
- Can have ‘about’ attribute
- Examples: about="http://texlive.dante.de/texlive/package/seminar.zip",
- The name of the package
- binary, package, collection, scheme, support or TLCore
- (MiKTeX does not use this)
- The format is yyyy/mm/dd hh:mm:ss. (Is it entry into CTAN date or something else?)
- Example: “2007/12/31 23:59:59”
- (MiKTeX does not use this)
- Package version
- Example: “2.1”
- Sometimes can be both version number and date or just date
- Example: “0.221 2004-04-14”
- Creator of TPM file
- Example: “rahtz”
- Short description of the package
- Example: “Typeset C and Pascal programs.”
- Size of package in bytes (??Which size is this?)
- Example: 417566
- A longer description
- The name(s) of package author(s). (Absent in MiKTeX files.)
- Flags are declared as attributes
- Examples: default=yes
- treeroot=yes (for texlive.tpm)
This section goes inside TPM:Build container element.
This section goes inside TPM:Requires container element.
- TPM-TLCore (TeXLive)
This section goes inside TPM:Installation container element.
- Example of attributes: function="BuildFormat" parameter="getafm"
- function="addDvipsMap" parameter="slantcm.map"
Entries from this section specify exactly which files in TDS tree are a part of the package. This section goes without a special container element.
- Example of attribute:
- Example of contents:
- Contents is a file list
- Example of attribute:
- Contents is a file list
- Contents is a file list
Other XML elements
This section goes without a special container element.
- Examples: package/seminar
- MD5 checksum used by MiKTeX
- Example: 67d4ba33bf4a2f22adc339335f940a8a
- Used by MiKTeX
- Example: 1113158024
Any package installation system is potentially vulnerable to security risks. Arguably, this should not be a serious concern for TeX packages. (It is much easier to write a malicious Perl script and put it on CPAN than to write malicious code for TeX and put it on CTAN.) To ensure some level of security, a TeX package manager can restrict its operations to the TDS tree. Then only the TDS tree would be at direct risk.
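One way a PM engine could enforce such a restriction is to reject every destination path that does not resolve to a location inside the TDS root. The sketch below shows the idea; the function name `inside_tds` and the example root `/texmf` are assumptions, and a real engine would also have to consider what the TeX programs it runs are allowed to write.

```python
from pathlib import Path


def inside_tds(tds_root, candidate):
    """True if `candidate`, taken relative to `tds_root`, stays inside it."""
    root = Path(tds_root).resolve()
    # resolve() normalises '..' components and follows symbolic links;
    # an absolute `candidate` replaces the root entirely and is rejected below.
    target = (root / candidate).resolve()
    return target == root or root in target.parents


print(inside_tds("/texmf", "tex/latex/zzz/a.sty"))  # True
print(inside_tds("/texmf", "../etc/passwd"))        # False
```

Checking the resolved path, rather than looking for “..” in the string, also catches absolute paths and symbolic links that point outside the tree.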
Development and Dissemination
CTAN already has upload instructions, and a non-negligible share of package authors read them and provide a README and documentation in PDF as required. A recommendation to write a CTPD file can be added as well. Possibly, the CTAN upload form can be modified to simplify providing CTPD files. If an author fails to provide a CTPD file, CTAN can hold a third-party CTPD suggestion. Then at least the most popular packages will have CTPD files.
A quote by Fabrice Popineau:
“It is true that a merge between the Catalogue and the description files of the TEXLive system is highly desirable. But this is a huge work to undertake, and it could be done only by setting up a new procedure to submit files to CTAN. In this procedure package authors would be asked to build the XML description file for their package and also probably to follow very strict rules to package their contribution. Maybe this procedure can be automated by providing people with the right set of scripts, but this needs to be investigated.”
One alternative to keeping author-provided CTPD files on CTAN is to keep CTPD files in a separate repository and use some public version control system. Then the development pace would be faster. (Think of Wikipedia as a model.)
The approach to package description can be minimalist: describe what can be reliably done, otherwise write nothing.
One can start by providing the facility and then brave users can start to experiment with a local TDS tree. TeX distributions can accommodate this later.
Possibly, the TeXLive scripts can be modified to write CTPD-like instructions instead of actually building a TDS tree.
- Fabrice Popineau. Directions for the TeXLive system.
- Sebastian Rahtz and Fabrice Popineau. “The Occult Secrets of TeXLive”, April 2003.
- Edward C. Bailey. Maximum RPM: Taking the Red Hat Package Manager to the Limit.
- A Directory Structure for TeX Files.
- Benjamin Bayart. The description language chosen for FDNTEX. TUGboat, Volume 21 (2000), No. 3, Proceedings of the 2000 Annual Meeting.
- Jim Hefferon. The CTAN Package System.
- Peter Flynn. Formatting information: A beginner’s introduction to typesetting with LaTeX.