Introduction

This document describes how to use pydocgen 0.3, a Python script that generates Microsoft Word (TM) documents and HTMLHelp files from HTML source.

Installation

To install pydocgen, you need to first make sure your system meets the following requirements:

Requirements

If you want to generate Microsoft Word (TM) documents, the following additional requirements must be met:

Installing pydocgen

Copy the files

pydocgen.py
htmlhelpgen.py

to a directory of your choice.

If you want to generate Word documents, you also must copy the default doc template

pydocgen.dot

to your Templates folder, which is usually located at

c:\Documents and Settings\<Username>\Application Data\Microsoft\Templates

Usage

The following sections explain how to create an HTML source document and how to generate both Microsoft Word (TM) documents and HTMLHelp files from it.

Creating the source document

The source document is a plain-text HTML document. Make sure that you use only the supported HTML tags.

Please note: All tags have to be closed properly. In many cases, the parser interprets the closing tag rather than the opening tag. You can check your document using something like the HTML Validation Service.
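If you want a quick local check before running pydocgen, the following is a minimal sketch using Python's standard html.parser module. It is not part of pydocgen; the names TagBalanceChecker and check_tags are hypothetical, and treating <img>, <br> and <hr> as tags without a closing tag is an assumption.

```python
from html.parser import HTMLParser

# Tags that are not expected to have a closing tag (an assumption for this sketch).
VOID_TAGS = {"img", "br", "hr"}


class TagBalanceChecker(HTMLParser):
    """Records tags that are closed out of order or never closed."""

    def __init__(self):
        super().__init__()
        self.open_tags = []  # stack of (tag, line number) for tags still open
        self.problems = []   # human-readable problem descriptions

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.open_tags.append((tag, self.getpos()[0]))

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.open_tags and self.open_tags[-1][0] == tag:
            self.open_tags.pop()
        else:
            self.problems.append(f"line {self.getpos()[0]}: unexpected </{tag}>")


def check_tags(html_text):
    """Return a list of problems; an empty list means all tags are balanced."""
    checker = TagBalanceChecker()
    checker.feed(html_text)
    checker.close()
    for tag, line in checker.open_tags:
        checker.problems.append(f"line {line}: <{tag}> is never closed")
    return checker.problems
```

An empty result only means that the tags are balanced. It does not guarantee that pydocgen accepts the document.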

Creating the pydocgen project

Use a text editor of your choice (SciTE is my personal recommendation) and create a new file that looks like this:

TARGET_DIRECTORY = "c:\\foo\\bar"
PROJECT_NAME = "test help"
PROJECT_SOURCE = "c:\\foo\\bar\\test.htm"
GENERATE_DOC = 0
ADD_LINKS_TO_INDEX = 1
PAGE_FORMAT_LANDSCAPE = 0
USE_TOPLEVEL_PROJECT = 0
DEFAULT_TOPIC = "default.htm"
USE_DOC_TEMPLATE = "pydocgen.dot"

As you can see, it is a list of project options.

TARGET_DIRECTORY specifies the directory where the files are to be generated. This directory must already exist. Please note that a single \ has to be encoded as \\.

PROJECT_NAME is the name of the project.

PROJECT_SOURCE is the full path and filename of the source HTML document. This file does not have to reside in the TARGET_DIRECTORY.

GENERATE_DOC specifies whether to create Word documents (1) or not (0).

ADD_LINKS_TO_INDEX specifies whether to add links to the index tab (1) or not (0).

PAGE_FORMAT_LANDSCAPE specifies whether the Word document should have landscape layout (1) or not (0).

USE_TOPLEVEL_PROJECT specifies whether the help file has a top-level project (1) or not (0).

DEFAULT_TOPIC is the name of the default topic file.

USE_DOC_TEMPLATE is used to specify the template file used when generating the Word document.
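Because every backslash in a path has to be doubled, it can be convenient to generate the project file from a script. The following is a minimal sketch that reproduces the format of the sample file above. The function name render_project_file is hypothetical and not part of pydocgen, and how pydocgen itself reads the file is not covered here.

```python
def render_project_file(options):
    """Return project-file text with one NAME = value line per option."""
    lines = []
    for name, value in options.items():
        if isinstance(value, str):
            # Double each backslash and wrap the string in double quotes,
            # as in the sample project file.
            value = '"' + value.replace("\\", "\\\\") + '"'
        lines.append(f"{name} = {value}")
    return "\n".join(lines) + "\n"


# Example values taken from the sample project file.
options = {
    "TARGET_DIRECTORY": r"c:\foo\bar",
    "PROJECT_NAME": "test help",
    "PROJECT_SOURCE": r"c:\foo\bar\test.htm",
    "GENERATE_DOC": 0,
}
project_text = render_project_file(options)
```

Write project_text to a file of your choice and pass that filename to pydocgen.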

Running pydocgen

On Windows: From a command line prompt, type:

c:\> python pydocgen.py <name-of-project-file>

On other operating systems: From a shell prompt, type:

~ # pydocgen.py <name-of-project-file>

Note: To create Word documents, you must first start Microsoft Word 2000 in the background; otherwise the OLE operations will fail.

Supported HTML-Tags

The following sections describe the HTML tags supported by pydocgen.

General Notes

Warning: All tags have to be closed properly. In many cases, the parser interprets the closing tag rather than the opening tag. You can check your document using something like the HTML Validation Service.

Headings

The heading tags (H1, H2, H3 etc.) are used to represent the nested structure of the document (and help file). For example, this topic "Headings" is a subtopic of "Supported HTML-Tags", because it is enclosed in <h2> tags. A nesting depth of up to eight levels is supported.

Example:

<h1>Theme</h1>
blablabla
    <h2>Chapter 1</h2>
    blablabla
    <h2>Chapter 2</h2>
    blablabla
        <h3>Subchapter a</h3>
        blablabla
    <h2>Chapter 3</h2>
    blablabla
    ...
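
To preview the topic structure that your headings describe, you can list them with a short script. The following is a minimal sketch using Python's standard html.parser module; OutlineParser and outline_of are hypothetical names, and pydocgen's own parser may treat edge cases differently.

```python
import re
from html.parser import HTMLParser


class OutlineParser(HTMLParser):
    """Collects (level, title) pairs for the heading tags h1 to h8."""

    def __init__(self):
        super().__init__()
        self.outline = []   # list of (level, title) in document order
        self._level = None  # level of the heading currently open, if any
        self._text = []     # text pieces of the heading currently open

    def handle_starttag(self, tag, attrs):
        match = re.fullmatch(r"h([1-8])", tag)
        if match:
            self._level = int(match.group(1))
            self._text = []

    def handle_data(self, data):
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == f"h{self._level}":
            self.outline.append((self._level, "".join(self._text).strip()))
            self._level = None


def outline_of(html_text):
    """Return the headings of html_text as a list of (level, title)."""
    parser = OutlineParser()
    parser.feed(html_text)
    parser.close()
    return parser.outline


def print_outline(html_text):
    """Print each heading indented by its nesting level."""
    for level, title in outline_of(html_text):
        print("    " * (level - 1) + title)
```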

Paragraphs

Paragraphs are started by <p> and ended by </p>. In a paragraph, only Bold font, Italic font and Links are valid.

Example:

<p>bla bla bla bla bla bla bla bla bla 
bla bla bla bla bla bla bla bla bla bla bla bla 
bla bla bla bla bla bla bla bla bla bla bla bla 
bla bla bla bla bla</p>

Bullet lists

Bullet lists are started by <ul> and ended by </ul>. Each list item is started by <li> and ended by </li>.

In a bullet list, only Bold font, Italic font and Links are valid.

Example:

<ul>
<li>bla bla bla</li>
<li>bla bla bla</li>
</ul>

Links

To use Links, you must first define link targets. This is done by using the tags <a name="#unique-name"> and </a>.

Example: For the chapter "Supported HTML Tags" you can define a link target like this:

<h1><a name="#supported-tags">Supported HTML Tags</a></h1>

You can link to the link target by using the tags <a href="#unique-name"> and </a>.

Example: To reference the link target described above, use:

<p>bla bla bla <a href="#supported-tags">Supported Tags</a> bla bla bla</p>
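
A link whose target is never defined is easy to miss in a long document. The following is a minimal sketch that lists such links, using Python's standard html.parser module. LinkCollector and dangling_links are hypothetical names and not part of pydocgen. The sketch compares href and name values literally, which matches the convention shown above, where the name attribute includes the leading #.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects link targets (name=...) and internal references (href="#...")."""

    def __init__(self):
        super().__init__()
        self.targets = set()  # values of name attributes on <a> tags
        self.references = []  # href values that start with "#"

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        if attributes.get("name"):
            self.targets.add(attributes["name"])
        href = attributes.get("href")
        if href and href.startswith("#"):
            self.references.append(href)


def dangling_links(html_text):
    """Return every internal href that has no matching link target."""
    collector = LinkCollector()
    collector.feed(html_text)
    collector.close()
    return [ref for ref in collector.references if ref not in collector.targets]
```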

Bold font

Bold text is started by <b> and ended by </b>. Please do not use <strong> tags; use <b> instead.

Example:

<p>bla bla bla <b>IMPORTANT</b>
bla bla bla bla bla bla bla bla bla bla bla bla 
bla bla bla bla bla bla bla bla bla bla bla bla 
bla bla bla bla bla</p>
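
If your source document already contains <strong> tags, you can convert them before running pydocgen. The following is a minimal sketch using a regular expression; the function name strong_to_bold is hypothetical and not part of pydocgen. Note that any attributes on the <strong> tag are dropped.

```python
import re


def strong_to_bold(html_text):
    """Replace <strong ...> and </strong> with <b> and </b>."""
    # Group 1 captures the optional slash of a closing tag.
    return re.sub(r"<(/?)strong\b[^>]*>", r"<\1b>", html_text, flags=re.IGNORECASE)
```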

Italic font

Italics are started by <i> and ended by </i>.

Example:

<p>bla bla bla <i>Important</i>
bla bla bla bla bla bla bla bla bla bla bla bla 
bla bla bla bla bla bla bla bla bla bla bla bla 
bla bla bla bla bla</p>

Images

You can insert pictures using <img src="filename">.

Example:

<img src="some.jpg">

Preformatted text

You can add preformatted text (most commonly used for source code or other text that needs to be formatted with a fixed-width font) using the tags <pre> and </pre>.

Example:

<pre> what you ASCII-see is what you get </pre>

Tables

There is preliminary support for tables in this version of pydocgen. Please do not use nested tables.