11.6. sdfget - Documentation Extraction Utility


sdfget extracts documentation embedded in source code.


usage  : sdfget [-h[help]] [-o[out_ext]]
         [-l[log_ext]] [-O[out_dir]]
         [-f formatting_filename] [-g[get_rule]]
         [-r[rpt_file]] [-s scope] [-i]
         [-v[verbose]] file ...
purpose: extract documentation embedded in source code
version: 2.000    (SDF 2.001)

The options are:

Option  Description
-h      display help on options
-o      output file extension
-l      log file extension
-O      output to input file's (or explicit) directory
-f      filename to use when formatting the output
-g      rule to use to get documentation
-r      report file
-s      scope of documentation to be extracted
-i      only output lines not extracted
-v      verbose mode


The -h option provides help. If it is specified without a parameter, a brief description of each option is displayed. To display the attributes for an option, specify the option letter as a parameter.

By default, generated output goes to standard output. To direct output to a file per input file, use the -o option to specify an extension for output files. If the -o option is specified without a parameter, an extension of out is assumed.

Likewise, error messages go to standard error by default. Use the -l option to create a log file per input file. If the -l option is specified without a parameter, an extension of log is assumed.

By default, generated output and log files are created in the current directory. Use the -O option to specify an explicit output directory. If the -O option is specified without a parameter, the input file's directory is used.
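The interaction between -o, -l and -O can be sketched in Python as follows. This is only an illustration, not sdfget's actual code: the helper name output_path is hypothetical, and the assumption that the output name is the input file's base name with its extension replaced is inferred from the xyz.cpp to xyz.sdf example in this section.

```python
from pathlib import Path
from typing import Optional


def output_path(input_file: str, ext: str = "out",
                out_dir: Optional[str] = None) -> Path:
    """Hypothetical helper: guess where the file for input_file is written.

    out_dir=None -> current directory (no -O option given)
    out_dir=""   -> the input file's directory (-O without a parameter)
    out_dir="x"  -> the explicit directory x
    """
    src = Path(input_file)
    if out_dir is None:
        target = Path(".")
    elif out_dir == "":
        target = src.parent
    else:
        target = Path(out_dir)
    # Assumption: the base name is kept and only the extension changes.
    return target / (src.stem + "." + ext)
```

Passing ext="log" models the -l option in the same way; the defaults of out and log come from the descriptions above.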

The -f option can be used to specify a filename to use when formatting the output. This is useful when the text is coming from the standard input stream.

The get-rule nominates the formatting of the embedded documentation to be extracted. All currently defined get-rules assume the documentation is in comment blocks in one of the following formats:

 >>section_title1::
 text of section 1, line 1
 text of section 1, line ..

 >>section_title2::
 text of section 2, line 1
 text of section 2, line ..
 >>END::

 >>section_title3:: text of section 3

The first form is most commonly used. In this format, the text in a section extends until the end of the current "comment block" or the start of the next section, whichever comes first. The second form (i.e. explicitly specifying where the section ends) is useful if you wish to add some normal comments (i.e. non-documentation) which you do not want extracted. If the text is short, the third form can be used. Regardless of the format, if a section is found which is already defined, the text of the section is concatenated onto the existing text. This permits the documentation for each entity to be specified immediately above where it is defined in the source code.
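The concatenation behaviour can be illustrated with a short Python sketch. It is a simplified model rather than sdfget's implementation: the function name collect_sections is hypothetical, the input is assumed to be comment blocks whose line prefixes have already been removed, section titles are assumed to consist of word characters, and the explicit end-of-section form is not handled.

```python
import re
from typing import Dict, List, Optional

# A section header: ">>title::" optionally followed by text on the same line.
HEADER = re.compile(r"^>>(\w+)::\s*(.*)$")


def collect_sections(blocks: List[List[str]]) -> Dict[str, str]:
    """Group documentation lines by section title across comment blocks."""
    sections: Dict[str, List[str]] = {}
    for block in blocks:
        current: Optional[str] = None  # a section never outlives its block
        for line in block:
            text = line.strip()
            match = HEADER.match(text)
            if match:
                current = match.group(1)
                # Reuse the existing entry so repeated titles concatenate.
                sections.setdefault(current, [])
                if match.group(2):
                    sections[current].append(match.group(2))
            elif current is not None:
                sections[current].append(text)
    return {title: "\n".join(lines) for title, lines in sections.items()}
```

For example, two blocks that both start a Purpose section yield a single Purpose entry containing the text of both, in source order.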

The -g option specifies the get-rule to use. The available get-rules differ on the prefix expected at the front of each line as shown below.

Rule     Prefix
perl     #
cpp      //
c        * or /*
fortran  c (with 5 preceding spaces)
eiffel   --
bat      rem

Within C code, a trailing space is required after the characters above. For other languages, a trailing space is optional. Within FORTRAN code, the "c" character must be preceded by exactly 5 spaces. For other languages, zero or more whitespace characters are permitted before the characters above.

For example, embedded documentation within C code looks like:

 /* >>Purpose::
  * This library provides a high level interface
  * to commonly used network services.
  */
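
The prefix rules above can be written down as regular expressions. The Python sketch below is one reading of those rules, not sdfget's own patterns: the names RULES and strip_prefix are hypothetical, and matching is assumed to be case-sensitive because the text does not say otherwise.

```python
import re
from typing import Optional

# One pattern per get-rule; group 1 captures the text after the prefix.
RULES = {
    "perl":    re.compile(r"^\s*# ?(.*)$"),
    "cpp":     re.compile(r"^\s*// ?(.*)$"),
    "c":       re.compile(r"^\s*/?\* (.*)$"),  # trailing space is required
    "fortran": re.compile(r"^ {5}c ?(.*)$"),   # exactly 5 leading spaces
    "eiffel":  re.compile(r"^\s*-- ?(.*)$"),
    "bat":     re.compile(r"^\s*rem ?(.*)$"),
}


def strip_prefix(rule: str, line: str) -> Optional[str]:
    """Return the line without its comment prefix, or None if no match."""
    match = RULES[rule].match(line)
    return match.group(1) if match else None
```

Under this reading, a C line consisting only of a closing comment delimiter is not matched, because the asterisk is not followed by a space.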

If the -g option is not specified, perl is the default get-rule. If the -g option is specified without a parameter, the get-rule is guessed from the lowercased extension of the filename (or of the formatting filename if the text is coming from standard input), as shown below.

Rule     Extensions
cpp      cpp, c++, cc, hpp, h, java, idl
c        c
fortran  fortran, for, f77, f
eiffel   eiffel, ada
bat      bat, cmd
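
The guessing step amounts to a table lookup on the lowercased extension. The Python sketch below mirrors the table; the function name guess_rule is hypothetical, and falling back to perl for an unlisted extension is an assumption, since the text does not say what happens in that case.

```python
import os

# Extension lists copied from the table above.
EXTENSION_RULES = {
    "cpp": ["cpp", "c++", "cc", "hpp", "h", "java", "idl"],
    "c": ["c"],
    "fortran": ["fortran", "for", "f77", "f"],
    "eiffel": ["eiffel", "ada"],
    "bat": ["bat", "cmd"],
}

# Invert the table once so each lookup is a single dictionary access.
RULE_BY_EXTENSION = {
    ext: rule for rule, exts in EXTENSION_RULES.items() for ext in exts
}


def guess_rule(filename: str) -> str:
    """Guess the get-rule from a filename's extension."""
    ext = os.path.splitext(filename)[1].lstrip(".").lower()
    # Assumption: unknown extensions fall back to the default rule, perl.
    return RULE_BY_EXTENSION.get(ext, "perl")
```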

A report filename can be specified using the -r option. If the name doesn't include an extension, sdg is assumed. Reports provide a mechanism for:

  • selectively extracting sections, and
  • rudimentary reformatting (e.g. to SDF)

If no report is specified, all sections are output in the following format:



If -r is specified on its own, default.sdg is assumed. This report selects the set of sections (within the SDF documentation standards) which form the user documentation and formats them into SDF. Details on the report format are specified below. Reports are searched for in the current directory, then in the stdlib directory within SDF's library directory.

The -s option can be used to specify the scope of the documentation to be extracted. (This is an experimental feature and may change, so most users should avoid it.)

The -i option outputs only those lines which the get-rule did not match. This option is useful for extracting non-documentation from a file to give just the code.

Note: The -r option is ignored if -i is specified.

The -v option enables verbose mode. This is useful for seeing which rule is being used for each file.


To extract the user documentation from an SDF application written in C++ (xyz, say) and save it into xyz.sdf:

      sdfget -gcpp -r -osdf xyz.cpp

Limitations and future directions

It would be nicer if the get-rule were always guessed from the filename extension, but changing the default from perl could break existing scripts. Therefore, get-rule guessing must be explicitly enabled by specifying the -g option without a parameter.