This document describes how to use pydocgen 0.3, a Python script that generates Microsoft Word (TM) documents and HTMLHelp files from HTML source.
To install pydocgen, you need to first make sure your system meets the following requirements:
If you want to generate Microsoft Word (TM) documents, the following additional requirements must be met:
Copy the files
pydocgen.py htmlhelpgen.py
to a directory of your choice.
If you want to generate Word documents, you also must copy the default doc template
pydocgen.dot
to your Templates folder, which is usually located at
c:\Documents and Settings\<Username>\Application Data\Microsoft\Templates
The following sections explain how to create an HTML source document and generate both Microsoft Word (TM) documents and HTMLHelp files from it.
The source document is just a plaintext HTML document. You should make sure that you use only the supported HTML-Tags.
Please note: All tags have to be closed properly. In many cases, the parser interprets the closing tag rather than the opening tag. You can check your document using something like the HTML Validation Service.
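If you want a quick local check before running pydocgen, the following is a minimal sketch of a tag balance checker. It is not part of pydocgen and does not use pydocgen's own parser; it relies only on the standard html.parser module of a current Python 3. The names TagBalanceChecker and find_unbalanced_tags are made up for this illustration.

```python
from html.parser import HTMLParser

# Tags that never take a closing tag in HTML (for example <img>).
VOID_TAGS = {"img", "br", "hr", "meta", "link"}


class TagBalanceChecker(HTMLParser):
    """Hypothetical helper: records tags that are not closed properly."""

    def __init__(self):
        super().__init__()
        self.open_tags = []  # stack of (tag name, line number) still open
        self.problems = []   # human-readable problem descriptions

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.open_tags.append((tag, self.getpos()[0]))

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if tag not in [name for name, _ in self.open_tags]:
            line = self.getpos()[0]
            self.problems.append(f"line {line}: </{tag}> has no opening tag")
            return
        # Pop back to the matching opening tag; everything above it was left open.
        while self.open_tags:
            name, line = self.open_tags.pop()
            if name == tag:
                break
            self.problems.append(f"line {line}: <{name}> is never closed")


def find_unbalanced_tags(html_text):
    """Return a list of problems; an empty list means all tags are balanced."""
    checker = TagBalanceChecker()
    checker.feed(html_text)
    checker.close()
    problems = list(checker.problems)
    # Whatever is still on the stack at the end was never closed.
    for name, line in checker.open_tags:
        problems.append(f"line {line}: <{name}> is never closed")
    return problems
```

Calling find_unbalanced_tags with the text of your source document returns one message per problem, so you can fix the file before pydocgen sees it.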
Use a text editor of your choice (SciTE is my personal recommendation) and create a new file that looks like this:
TARGET_DIRECTORY = "c:\\foo\\bar"
PROJECT_NAME = "test help"
PROJECT_SOURCE = "c:\\foo\\bar\\test.htm"
GENERATE_DOC = 0
ADD_LINKS_TO_INDEX = 1
PAGE_FORMAT_LANDSCAPE = 0
USE_TOPLEVEL_PROJECT = 0
DEFAULT_TOPIC = "default.htm"
USE_DOC_TEMPLATE = "pydocgen.dot"
As you can see, it is a list of project options.
TARGET_DIRECTORY specifies the directory in which the files are generated. This directory must already exist. Please note that a single \ has to be written as \\.
PROJECT_NAME is the name of the project.
PROJECT_SOURCE is the full path and file name of the source HTML document. This file does not have to reside in the TARGET_DIRECTORY.
GENERATE_DOC specifies whether to create Word documents (1) or not (0).
ADD_LINKS_TO_INDEX specifies whether to add links to the index tab (1) or not (0).
PAGE_FORMAT_LANDSCAPE specifies whether the Word document should have landscape layout (1) or not (0).
USE_TOPLEVEL_PROJECT specifies whether the helpfile has a toplevel project (1) or not (0).
DEFAULT_TOPIC is the name of the default topic file.
USE_DOC_TEMPLATE is used to specify the template file used when generating the Word document.
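The options described above can be sanity-checked with a small script before you run pydocgen. The following is only a sketch under one assumption that the text above suggests but does not state: that a project file consists of Python-style assignments of string and number literals (this would explain why \ has to be written as \\). The function names read_project_options and check_project_options are hypothetical and not part of pydocgen.

```python
import ast
import os

# Options documented above as switches that are either 1 or 0.
FLAG_OPTIONS = [
    "GENERATE_DOC",
    "ADD_LINKS_TO_INDEX",
    "PAGE_FORMAT_LANDSCAPE",
    "USE_TOPLEVEL_PROJECT",
]


def read_project_options(text):
    """Parse lines of the form NAME = literal into a dict.

    Assumption: the project file uses Python literal syntax. The text is
    parsed but never executed.
    """
    options = {}
    for node in ast.parse(text).body:
        if (isinstance(node, ast.Assign)
                and len(node.targets) == 1
                and isinstance(node.targets[0], ast.Name)):
            options[node.targets[0].id] = ast.literal_eval(node.value)
    return options


def check_project_options(options):
    """Return a list of problems found in the parsed options."""
    problems = []
    for name in FLAG_OPTIONS:
        if name in options and options[name] not in (0, 1):
            problems.append(f"{name} must be 0 or 1, got {options[name]!r}")
    target = options.get("TARGET_DIRECTORY")
    # The target directory must already exist before generation starts.
    if isinstance(target, str) and not os.path.isdir(target):
        problems.append(f"TARGET_DIRECTORY does not exist: {target}")
    return problems
```

Under this assumption, the value "c:\\foo\\bar" written in the project file is read as the path c:\foo\bar with single backslashes, which is why the doubled form is needed in the file.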
On Windows: From a command line prompt, type:
c:\> python pydocgen.py <name-of-project-file>
On other operating systems: From a shell prompt, type:
~ # pydocgen.py <name-of-project-file>
Note: If you want to create Word documents, you must first start Microsoft Word 2000 in the background; otherwise the OLE operations will fail.
The following sections describe the HTML tags supported by pydocgen.
Warning: All tags have to be closed properly. In many cases, the parser interprets the closing tag rather than the opening tag. You can check your document using something like the HTML Validation Service.
The Header-Tags (H1, H2, H3 etc.) are used for representing the nested structure of the document (and helpfile). For example, this topic "Headings" is a subtopic of "Supported HTML-Tags", because it is enclosed in <h2> tags. A nesting depth of up to eight levels is supported.
Example:
<h1>Theme</h1>
blablabla
<h2>Chapter 1</h2>
blablabla
<h2>Chapter 2</h2>
blablabla
<h3>Subchapter a</h3>
blablabla
<h2>Chapter 3</h2>
blablabla
...
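To see which topic structure a source document describes, you can list its headings together with their nesting level. The following sketch does this with the standard html.parser module of a current Python 3. It only illustrates the idea of mapping heading levels to nesting; it is not pydocgen's own parser, and the names OutlineParser and heading_outline are invented for this example.

```python
import re
from html.parser import HTMLParser


class OutlineParser(HTMLParser):
    """Hypothetical helper: collects (level, title) for h1 to h8 headings."""

    def __init__(self):
        super().__init__()
        self.outline = []    # list of (level, title) in document order
        self._level = None   # level of the heading currently being read
        self._text = []      # text pieces of the current heading

    def handle_starttag(self, tag, attrs):
        match = re.fullmatch(r"h([1-8])", tag)
        if match:
            self._level = int(match.group(1))
            self._text = []

    def handle_data(self, data):
        # Only text inside a heading contributes to its title.
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == f"h{self._level}":
            self.outline.append((self._level, "".join(self._text).strip()))
            self._level = None


def heading_outline(html_text):
    """Return the headings of html_text as a list of (level, title)."""
    parser = OutlineParser()
    parser.feed(html_text)
    parser.close()
    return parser.outline


def format_outline(outline):
    """Indent each title by two spaces per nesting level below h1."""
    return "\n".join("  " * (level - 1) + title for level, title in outline)
```

For the example above, heading_outline returns "Theme" at level 1, the three chapters at level 2, and "Subchapter a" at level 3 below "Chapter 2".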
Paragraphs are started by <p> and ended by </p>. In a paragraph, only Bold font, Italic font and Links are valid.
Example:
<p>bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla</p>
Bullet lists can be created using the tags <ul> and </ul>, with each list item enclosed in <li> and </li>.
In a bullet list, only Bold font, Italic font and Links are valid.
Example:
<ul>
<li>bla bla bla</li>
<li>bla bla bla</li>
</ul>
To use Links, you must first define link targets. This is done by using the tags <a name="#unique-name"> and </a>.
Example: For the chapter "Supported HTML Tags" you can define a link target like this
<h1><a name="#supported-tags">Supported HTML Tags</a></h1>
You can link to the link target by using the tags <a href="#unique-name"> and </a>.
Example: To reference the link target described above, use
<p>bla bla bla <a href="#supported-tags">Supported Tags</a> bla bla bla</p>
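Because every link needs a matching link target, it can be useful to look for links that point nowhere. The following is a minimal sketch that does so with the standard html.parser module of a current Python 3. It is not part of pydocgen, and the names LinkCollector and find_dangling_links are made up for this illustration. The leading # is stripped from both the name and the href values before comparing, so the check follows the convention shown above.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Hypothetical helper: collects link targets and link references."""

    def __init__(self):
        super().__init__()
        self.targets = set()   # names defined with <a name="...">
        self.references = []   # names used in <a href="#...">

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for key, value in attrs:
            if value is None:
                continue
            if key == "name":
                self.targets.add(value.lstrip("#"))
            elif key == "href" and value.startswith("#"):
                # Only links inside the document are checked.
                self.references.append(value.lstrip("#"))


def find_dangling_links(html_text):
    """Return the names of all links that have no matching link target."""
    collector = LinkCollector()
    collector.feed(html_text)
    collector.close()
    return [ref for ref in collector.references if ref not in collector.targets]
```

An empty result means that every link inside the document has a link target.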
Bold text is started by <b> and ended by </b>. Please do not use <strong> tags; use <b> instead.
Example:
<p>bla bla bla <b>IMPORTANT</b> bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla</p>
Italics are started by <i> and ended by </i>.
Example:
<p>bla bla bla <i>Important</i> bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla</p>
You can insert pictures using <img src="filename">.
Example:
<img src="some.jpg">
You can add preformatted text (most commonly used for source code or other text that needs to be formatted with a fixed-width font) using the tags <pre> and </pre>.
Example:
<pre>
what you ASCII-see is what you get
</pre>
There is preliminary support for tables in this version of pydocgen. Please do not use nested tables.