Table of Contents

Name

codex - the Code Extractor for OpenOffice.org

Synopsis

codex [-dDqvVsSwW?] [-c string] [-C string] [-e ext] [-f file] [-n number] [-p prefix] [-r prefix] [-t string] [-x ext] [--chapter-number=number] [--comment-end=string] [--comment-start=string] [--doxygen] [--nodoxygen] [--configfile=filename] [--extension=string] [--file-tag=string] [--prefix=string] [--quiet] [--snippets] [--nosnippets] [--snippet-extension=string] [--snippet-prefix=string] [--verbose] [--noverbose] [--warning] [--nowarning] [--help] [--usage] DOCUMENTS...

Description

Each file named on the command line is opened as an OpenOffice.org 2.0 document, that is, as a ZIP file that contains XML streams. The main content stream (content.xml) is extracted according to the OASIS OpenDocument format standard. Sections that use code style names are extracted to text files. A code listing starts with a caption style, followed by code styles. A snippet lacks a caption. Snippets are usually short.
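
The following Python sketch illustrates the first step described above: opening a document as a ZIP archive and walking the paragraphs of content.xml. It is not the codex implementation. The function name paragraphs_with_styles is hypothetical, and the sketch assumes that each paragraph carries its style in the text:style-name attribute; a real extractor would probably also need to resolve automatic styles to their parent styles.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace for text elements in the OASIS OpenDocument format.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def paragraphs_with_styles(document_path):
    """Yield (style_name, text) for each text:p element in content.xml.

    Hypothetical helper: it only illustrates the ZIP-plus-XML layout.
    """
    with zipfile.ZipFile(document_path) as archive:
        with archive.open("content.xml") as stream:
            root = ET.parse(stream).getroot()
    for para in root.iter(f"{{{TEXT_NS}}}p"):
        style = para.get(f"{{{TEXT_NS}}}style-name", "")
        # itertext() joins text from nested spans inside the paragraph.
        yield style, "".join(para.itertext())
```

A caller could then compare each style name against the caption and code styles listed in the configuration file.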

Each listing or snippet is extracted to a separate text file in the current directory. The file name is formed from a prefix, the listing number, and an extension. Snippets and listings have different prefixes and extensions. Listing captions must have the form tag cc-nn. caption, where tag is a word such as Listing, Example, or Figure; cc is the chapter number; nn is the listing number; and caption is arbitrary text. The separator between cc and nn can be any single character. The separator between nn and caption can be anything. The listing filename is <prefix><cc><nn>.<ext>, where cc and nn are taken from the caption and zero-padded to 2 places. If you supply a comment string, the first line of the file starts with the comment string, followed by the caption.
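
As an illustration of the naming rule, the Python sketch below derives a file name from a caption. It is a minimal sketch, not the codex source. The name listing_filename and the regular expression are hypothetical. Two assumptions are made: the separator between cc and nn is taken to be any single non-digit character, and the extension is taken to include its leading dot, as the default .txt does.

```python
import re

# tag, chapter number, one separator character, listing number, the rest.
# Assumption: the separator is a single non-digit character.
CAPTION_RE = re.compile(
    r"^(?P<tag>\w+)\s+(?P<cc>\d+)(?P<sep>\D)(?P<nn>\d+)(?P<rest>.*)$",
    re.DOTALL,
)


def listing_filename(caption, prefix="list", extension=".txt"):
    """Return the file name for a caption, or None if it does not match."""
    match = CAPTION_RE.match(caption)
    if match is None:
        return None
    chapter = int(match.group("cc"))
    number = int(match.group("nn"))
    # Both numbers are zero-padded to two places.
    return f"{prefix}{chapter:02d}{number:02d}{extension}"


print(listing_filename("Listing 3-7. Reading a file"))  # list0307.txt
```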

Snippets lack a caption, so they are numbered sequentially, starting at 1. You can specify a chapter number, in which case that number is used to name snippet files. The chapter number can also be used to verify the correctness of listings.

The style names come from a configuration file. The configuration file can also provide default values for extension, prefix, and other options. The command line overrides values from the configuration file. The file format is XML. The schema is provided in a separate file that accompanies this program. If the file cannot be found in the current directory, codex prepends a standard prefix to the name and looks for it in a system directory.
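
The lookup order can be sketched in Python as follows. This is an illustration under stated assumptions, not the actual codex logic: the function name find_config is hypothetical, and the system directory is assumed to be /usr/local/lib/codex, the directory named under Files.

```python
from pathlib import Path

# Assumption: the system directory is the one listed under Files.
SYSTEM_DIR = Path("/usr/local/lib/codex")


def find_config(name, system_dir=SYSTEM_DIR):
    """Return the path of the configuration file to open, or None."""
    candidate = Path(name)
    if candidate.is_file():
        return candidate  # found as given (absolute or relative)
    if not candidate.is_absolute():
        # Fall back to the system directory for relative names.
        fallback = Path(system_dir) / candidate
        if fallback.is_file():
            return fallback
    return None
```

With this order, a name such as apress-cpp.xml resolves to a local file if one exists, and otherwise to the standard configuration of the same name.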

Specify a configuration file name on the command line or in the document. To specify information in the document, define document properties. Set the info field name to one of the names in the list below, and set the field value to the desired option value:

codex-configfile filename
Open filename as the configuration file, just as though you specified -f on the command line. If you use this property, and have not specified -q, the name of the configuration file is echoed to the standard output, to remind you that the document sets this property.
codex-chapter-number number
Set the chapter number for snippets and for checking listing captions to number, just as though you had used the -n command line option. If you use this property, and have not specified -q, the chapter number is echoed to the standard output, to remind you that the document sets this property.
codex-doxygen
Enable doxygen-style @file tags. The command line options -d and -D override the document property.
codex-snippets
Enable extraction of code snippets. This property is just like the -s command line option.
codex-nosnippets
Disable extraction of code snippets. This property is just like the -S command line option.

In all cases, command line options override document properties. If a property name is not one of the above, but starts with codex-, you get a warning message. Otherwise, unknown property names are ignored.
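
The Python sketch below shows one way the property rules above could be applied. It is a hedged illustration, not the codex implementation. It assumes that the info fields are stored as meta:user-defined elements in meta.xml, as the OpenDocument format does for user-defined document properties. The function name codex_properties and the warning text are hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

META_NS = "urn:oasis:names:tc:opendocument:xmlns:meta:1.0"

# Property names taken from the list above.
KNOWN = {
    "codex-configfile",
    "codex-chapter-number",
    "codex-doxygen",
    "codex-snippets",
    "codex-nosnippets",
}


def codex_properties(document_path):
    """Return (properties, warnings) from the user-defined info fields."""
    with zipfile.ZipFile(document_path) as archive:
        with archive.open("meta.xml") as stream:
            root = ET.parse(stream).getroot()
    properties, warnings = {}, []
    for field in root.iter(f"{{{META_NS}}}user-defined"):
        name = field.get(f"{{{META_NS}}}name", "")
        if name in KNOWN:
            properties[name] = (field.text or "").strip()
        elif name.startswith("codex-"):
            # Unknown codex- names produce a warning.
            warnings.append(f"unknown codex property: {name}")
        # Any other property name is silently ignored.
    return properties, warnings
```

Values given on the command line would then replace the corresponding entries in the returned dictionary.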

Options

Here are detailed descriptions of all the command line options.
-c, --comment-start=string
Characters that begin a comment. If a comment start is specified (in the configuration file or on the command line), the first line of a listing file is a comment that contains the listing caption. The command line overrides the configuration file.
-C, --comment-end=string
Characters that end a comment. At the end of every comment, codex always prints a newline. Thus, comments that have start characters but no end characters do not need the -C option. If you specify the --doxygen option, and the comment end string is empty, codex prints two newlines after the doxygen @file comment, to ensure doxygen keeps it separate from the caption comment.
-e, --extension=string
Extension for generated file names. Typically, the extension starts with a dot (.), but codex does not enforce any limitations or restrictions on the extension. The command line overrides the configuration file.
-f, --configfile=filename
Specify a configuration file. The file can be an absolute or relative path. If it is a relative path and the file cannot be found, codex looks up the file in the system directory. Thus, you can specify any standard configuration by its file name alone.
-n, --chapter-number=number
Chapter number for snippet files and to verify listings. If you do not specify a chapter number, snippet files are named with just the prefix and snippet number. With a chapter number, codex can verify that listings are numbered correctly in the document. (See -w.)
-p, --prefix=string
Prefix for generated listing file names. The command line overrides the configuration file.
-q, --quiet, --noverbose
Turn off verbosity, that is, do not echo anything except error messages.
-r, --snippet-prefix=string
Prefix for snippet filenames. The command line overrides the configuration file.
-s, --snippets
Enable extraction of snippets. Default is enabled.
-S, --nosnippets
Disable extraction of snippets. Only listings are extracted.
-t, --file-tag=string
Specifies an alternate tag to use when you also use --doxygen. The default tag for doxygen file comments is @file, and I cannot imagine any reason ever to change it. Nonetheless, I included this option in case you have a better imagination than I.
-v, --verbose
Echo file names as they are processed. Repeat this option to echo generated file names as well. Repeat it again to print the caption for each listing.
-V, --version
Print the program name and version, and then exit with a successful exit status.
-w, --warning
Enable checking for valid listing numbers. Default is enabled.
-W, --nowarning
Disable checking for valid listing numbers.
-x, --snippet-extension=string
Extension for snippet filenames. The command line overrides the configuration file.
-?, --help
Type out help for the program and options, and then exit with a successful exit status.
--usage
Give a short usage message, and then exit with a successful exit status.
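
The interaction of -c, -C, --doxygen, and -t can be sketched in Python as follows. This is only an illustration of the rules stated above, with several assumptions that the descriptions do not settle: that a single space separates the comment characters from the text, that the @file comment holds the generated file name, and that the @file comment precedes the caption comment. The function name listing_header is hypothetical.

```python
def listing_header(caption, comment_start, comment_end="",
                   doxygen_file=None, file_tag="@file"):
    """Build the comment lines that precede the code in a listing file."""
    if not comment_start:
        return ""  # no comment start means no caption comment

    def comment(text):
        # A newline always follows a comment, with or without an end string.
        closing = f" {comment_end}" if comment_end else ""
        return f"{comment_start} {text}{closing}\n"

    header = ""
    if doxygen_file is not None:
        header += comment(f"{file_tag} {doxygen_file}")
        if not comment_end:
            # Second newline keeps the @file comment apart from the caption.
            header += "\n"
    return header + comment(caption)
```

For example, listing_header("Listing 3-7. Reading a file", "//") yields the single line "// Listing 3-7. Reading a file" followed by a newline.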

Exit Status

Zero for success, non-zero for failures such as a document file that cannot be opened or a malformed configuration file.

Files

The following directories and files are significant:
/usr/local/lib/codex
Directory where the default configuration and some other configuration files are kept. If a configuration file cannot be opened, codex looks in this directory before giving up. Thus, you can specify just the file name, e.g., apress-cpp.xml, to use one of the standard configurations.
default.xml
The default configuration file. It lists "caption" as the sole caption style. Both "code" and "preformatted text" are code styles. The default extension is .txt, and the default prefix is list for listings and snip for snippets.
apress-cpp.xml
Standard configuration to use the Apress style sheet, with C++ extensions and comments.
apress-cpp-dox.xml
Standard configuration to use the Apress style sheet, with C++ extensions and comments, with a doxygen @file tag.
apress-java.xml
Standard configuration to use the Apress style sheet, with Java extensions and comments.
apress-javadoc.xml
Standard configuration to use the Apress style sheet, with Java extensions and javadoc comments.
apress-perl.xml
Standard configuration to use the Apress style sheet, with Perl extensions and comments.
oreilly-cpp.xml
Standard configuration to use the O'Reilly style sheet, with C++ extensions and comments.
oreilly-cpp-dox.xml
Standard configuration to use the O'Reilly style sheet, with C++ extensions and comments, with a doxygen @file tag.
oreilly-java.xml
Standard configuration to use the O'Reilly style sheet, with Java extensions and comments.
oreilly-javadoc.xml
Standard configuration to use the O'Reilly style sheet, with Java extensions and javadoc comments.
oreilly-perl.xml
Standard configuration to use the O'Reilly style sheet, with Perl extensions and comments.

Notes

OpenOffice.org stores documents as zip archives, albeit with different file name extensions. Within each archive are several files, such as meta.xml, which contains document metadata, and content.xml, which contains the document contents. Each file is an XML document. The schema is available from OpenOffice.org.

Author

Ray Lischner (codex@tempest-sw.com)

See Also

codex(5), doxygen(1), javadoc(1)

