Google

NAME="GENERATOR" CONTENT="Modular DocBook HTML Stylesheet Version 1.76b+ ">

Typedefs and Constants

Name

Typedefs and Constants -- Enca library typedefs, enums and constants.

Synopsis



struct      EncaEncoding;
#define     ENCA_CS_UNKNOWN
enum        EncaSurface;
enum        EncaCharsetFlags;
enum        EncaNameStyle;
enum        EncaErrno;
#define     ENCA_NOT_A_CHAR

Description

Details

struct EncaEncoding

struct EncaEncoding { int charset; EncaSurface surface; };

Encoding, i.e. charset and surface.

This is what enca_analyse() and enca_analyse_const() return.

The charset field is an opaque numerical charset identifier, which has no meaning outside Enca library. You will probably want to use it only as enca_charset_name() argument. It is only guaranteed not to change meaning during program execution time; change of its interpretation (e.g. due to addition of new charsets) is not considered API change.

The surface field is a combination of EncaSurface flags. You may want to ignore it completely; you should use enca_set_interpreted_surfaces() to disable weird surfaces then.

int charset Numeric charset identifier.
EncaSurface surface Surface flags.


ENCA_CS_UNKNOWN

#define ENCA_CS_UNKNOWN (-1)

Unknown character set id.

Use enca_charset_is_known() to check for unknown charset instead of direct comparsion.


enum EncaSurface

typedef enum { /*< flags >*/
  ENCA_SURFACE_EOL_CR    = 1 << 0,
  ENCA_SURFACE_EOL_LF    = 1 << 1,
  ENCA_SURFACE_EOL_CRLF  = 1 << 2,
  ENCA_SURFACE_EOL_MIX   = 1 << 3,
  ENCA_SURFACE_EOL_BIN   = 1 << 4,
  ENCA_SURFACE_MASK_EOL  = (ENCA_SURFACE_EOL_CR
                            | ENCA_SURFACE_EOL_LF
                            | ENCA_SURFACE_EOL_CRLF
                            | ENCA_SURFACE_EOL_MIX
                            | ENCA_SURFACE_EOL_BIN),
  ENCA_SURFACE_PERM_21    = 1 << 5,
  ENCA_SURFACE_PERM_4321  = 1 << 6,
  ENCA_SURFACE_PERM_MIX   = 1 << 7,
  ENCA_SURFACE_MASK_PERM  = (ENCA_SURFACE_PERM_21
                             | ENCA_SURFACE_PERM_4321
                             | ENCA_SURFACE_PERM_MIX),
  ENCA_SURFACE_QP        = 1 << 8,
  ENCA_SURFACE_REMOVE    = 1 << 13,
  ENCA_SURFACE_UNKNOWN   = 1 << 14,
  ENCA_SURFACE_MASK_ALL  = (ENCA_SURFACE_MASK_EOL
                            | ENCA_SURFACE_MASK_PERM
                            | ENCA_SURFACE_QP
                            | ENCA_SURFACE_REMOVE)
} EncaSurface;

Surface flags.

ENCA_SURFACE_EOL_CR End-of-lines are represented with CR's.
ENCA_SURFACE_EOL_LF End-of-lines are represented with LF's.
ENCA_SURFACE_EOL_CRLF End-of-lines are represented with CRLF's.
ENCA_SURFACE_EOL_MIX Several end-of-line types, mixed.
ENCA_SURFACE_EOL_BIN End-of-line concept not applicable (binary data).
ENCA_SURFACE_MASK_EOL Mask for end-of-line surfaces.
ENCA_SURFACE_PERM_21 Odd and even bytes swapped.
ENCA_SURFACE_PERM_4321 Reversed byte sequence in 4byte words.
ENCA_SURFACE_PERM_MIX Chunks with both endianess, concatenated.
ENCA_SURFACE_MASK_PERM Mask for permutation surfaces.
ENCA_SURFACE_QP Quoted printables.
ENCA_SURFACE_REMOVE Recode `remove' surface.
ENCA_SURFACE_UNKNOWN Unknown surface.
ENCA_SURFACE_MASK_ALL Mask for all bits, withnout ENCA_SURFACE_UNKNOWN.


enum EncaCharsetFlags

typedef enum { /*< flags >*/
  ENCA_CHARSET_7BIT      = 1 << 0,
  ENCA_CHARSET_8BIT      = 1 << 1,
  ENCA_CHARSET_16BIT     = 1 << 2,
  ENCA_CHARSET_32BIT     = 1 << 3,
  ENCA_CHARSET_FIXED     = 1 << 4,
  ENCA_CHARSET_VARIABLE  = 1 << 5,
  ENCA_CHARSET_BINARY    = 1 << 6,
  ENCA_CHARSET_REGULAR   = 1 << 7,
  ENCA_CHARSET_MULTIBYTE = 1 << 8
} EncaCharsetFlags;

Charset properties.

Flags ENCA_CHARSET_7BIT, ENCA_CHARSET_8BIT, ENCA_CHARSET_16BIT, ENCA_CHARSET_32BIT tell how many bits a `fundamental piece' consists of. This is different from bits per character; r.g. UTF-8 consists of 8bit pieces (bytes), but character can be composed from 1 to 6 of them.

ENCA_CHARSET_7BIT Characters are represented with 7bit characters.
ENCA_CHARSET_8BIT Characters are represented with bytes.
ENCA_CHARSET_16BIT Characters are represented with 2byte words.
ENCA_CHARSET_32BIT Characters are represented with 4byte words.
ENCA_CHARSET_FIXED One characters consists of one fundamental piece.
ENCA_CHARSET_VARIABLE One character consists of variable number of fundamental pieces.
ENCA_CHARSET_BINARY Charset is binary from ASCII viewpoint.
ENCA_CHARSET_REGULAR Language dependent (8bit) charset.
ENCA_CHARSET_MULTIBYTE Multibyte charset.


enum EncaNameStyle

typedef enum {
  ENCA_NAME_STYLE_ENCA,
  ENCA_NAME_STYLE_RFC1345,
  ENCA_NAME_STYLE_CSTOCS,
  ENCA_NAME_STYLE_ICONV,
  ENCA_NAME_STYLE_HUMAN
} EncaNameStyle;

Charset naming styles and conventions.

ENCA_NAME_STYLE_ENCA Default, implicit charset name in Enca.
ENCA_NAME_STYLE_RFC1345 RFC 1345 charset name.
ENCA_NAME_STYLE_CSTOCS Cstocs charset name.
ENCA_NAME_STYLE_ICONV Iconv charset name.
ENCA_NAME_STYLE_HUMAN Human comprehensible description.


enum EncaErrno

typedef enum {
  ENCA_EOK = 0,
  ENCA_EINVALUE,
  ENCA_EEMPTY,
  ENCA_EFILTERED,
  ENCA_ENOCS8,
  ENCA_ESIGNIF,
  ENCA_EWINNER,
  ENCA_EGARBAGE
} EncaErrno;

Error codes.

ENCA_EOK OK.
ENCA_EINVALUE Invalid value (usually of an option).
ENCA_EEMPTY Sample is empty.
ENCA_EFILTERED After filtering, (almost) nothing remained.
ENCA_ENOCS8 Mulitibyte tests failed and language contains no 8bit charsets.
ENCA_ESIGNIF Too few significant characters.
ENCA_EWINNER No clear winner.
ENCA_EGARBAGE Sample is garbage.


ENCA_NOT_A_CHAR

#define ENCA_NOT_A_CHAR 0xffff

Not-a-character in unicode tables.