MON_LIB (CCP4: Formats)

INTRODUCTION TO MON_LIB

What is a Monomer Description?

The refinement program REFMAC requires a complete chemical description of all monomers (i.e. any molecular entity, e.g. 'protein residue' or 'ligand') which are referred to in the input coordinates.

Complete description

The complete description of a monomer includes lists of:

  • all atoms with
    • identifier (i.e. name)
    • element symbol
    • chemical type - Libcheck/Refmac has a set of defined atom chemical types which have properties such as VdW and ionic radii. The chemical type assigned to an atom depends on its chemical environment, e.g. an oxygen atom in an alcohol has a different type from an oxygen atom in a carboxy group.
    • partial_charge - in the current version this is used as a whole unit of charge (e.g. PO4 as P=O1 P-O2 P-O3 P-O4).
  • covalent bonds, target bond lengths and standard deviations (SDs)
  • angles, target bond angles and SDs
  • torsion angles, target values and SDs
  • chiral centres, with sign
  • planes, with definitions of which atoms lie in a plane
  • the tree structure of the monomer - an alternative representation of connectivity
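
The list above can be read as a data structure. The following is a minimal sketch in Python, not the layout Libcheck or Refmac uses internally: all class and field names are hypothetical, the tree is assumed to be a child-to-parent map, and the numeric values are placeholders rather than library values.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Atom:
    name: str                  # identifier, e.g. "CA"
    element: str               # element symbol, e.g. "C"
    chem_type: str             # atom chemical type label
    partial_charge: float = 0.0


@dataclass
class Restraint:
    """A bond (2 atoms), angle (3 atoms) or torsion (4 atoms) with target and SD."""
    atoms: Tuple[str, ...]
    target: float
    sd: float


@dataclass
class Chirality:
    centre: str
    neighbours: Tuple[str, str, str]
    sign: str                  # e.g. "positive" or "negative"


@dataclass
class Monomer:
    code: str
    atoms: List[Atom] = field(default_factory=list)
    bonds: List[Restraint] = field(default_factory=list)
    angles: List[Restraint] = field(default_factory=list)
    torsions: List[Restraint] = field(default_factory=list)
    chiralities: List[Chirality] = field(default_factory=list)
    planes: List[List[str]] = field(default_factory=list)
    tree: Dict[str, str] = field(default_factory=dict)  # child -> parent (assumed form)

    def unknown_atom_names(self) -> List[str]:
        """Return atom names used in any restraint but missing from the atom list."""
        known = {a.name for a in self.atoms}
        used = set()
        for r in self.bonds + self.angles + self.torsions:
            used.update(r.atoms)
        for c in self.chiralities:
            used.add(c.centre)
            used.update(c.neighbours)
        for plane in self.planes:
            used.update(plane)
        used.update(self.tree.keys())
        used.update(self.tree.values())
        return sorted(used - known)


# Placeholder fragment: names, types and numbers are invented for illustration.
frag = Monomer(
    code="XXX",
    atoms=[Atom("C1", "C", "TYPE_A"), Atom("O1", "O", "TYPE_B")],
    bonds=[
        Restraint(("C1", "O1"), target=1.4, sd=0.02),
        Restraint(("C1", "C2"), target=1.5, sd=0.02),
    ],
)
print(frag.unknown_atom_names())  # C2 is referenced in a bond but never declared
```

A check like unknown_atom_names() is one simple way to spot an inconsistent description before it is used.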

Note that values of VDW and ionic radii and definitions of inter-monomer restraints (e.g. to maintain a peptide bond) are not in the monomer description but in alternative files described below.

Descriptions of commonly occurring residues and ligands are held in the library files in the directory $CLIBD_MON (i.e. $CCP4/lib/data/monomer). For any novel monomer, the crystallographer must provide the program Libcheck with a minimal description of the monomer, from which the program can derive a complete description.

Minimal description

A minimal description must include a list of all non-hydrogen atoms (with the atom identifiers and element names), all bonds, and some extra information, which can take any one of three forms:

  • the bond order (i.e. single, double etc.); from this information the 'missing' hydrogen atoms can be deduced
  • a list of all atoms, including hydrogens, and their connectivity; from this the bond orders can be deduced
  • the atom chemical types which effectively define the hydrogen atoms and the bond order of bonds around an atom
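
To illustrate the first form, the sketch below counts the 'missing' hydrogens on each heavy atom as a typical valence minus the sum of its bond orders. This is a simplified textbook rule, not Libcheck's actual procedure: the valence table, the function name and the input layout are assumptions, and charged atoms and unusual valence states are ignored.

```python
# Typical valences for a few neutral elements (simplifying assumption).
TYPICAL_VALENCE = {"C": 4, "N": 3, "O": 2, "S": 2}


def implicit_hydrogens(atoms, bonds):
    """Estimate the hydrogen count on each heavy atom.

    atoms: dict mapping atom name -> element symbol
    bonds: list of (name1, name2, bond_order) tuples
    """
    used = {name: 0 for name in atoms}
    for first, second, order in bonds:
        used[first] += order
        used[second] += order
    # Never report a negative count if an atom is over-bonded.
    return {
        name: max(TYPICAL_VALENCE[element] - used[name], 0)
        for name, element in atoms.items()
    }


# Heavy atoms of ethanol: CH3-CH2-OH
atoms = {"C1": "C", "C2": "C", "O": "O"}
bonds = [("C1", "C2", 1), ("C2", "O", 1)]
print(implicit_hydrogens(atoms, bonds))  # {'C1': 3, 'C2': 2, 'O': 1}
```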

Practically, the user can provide this information:

  • Using the CCP4i Sketcher to draw the monomer from scratch or by editing a similar monomer extracted from the library. The Sketcher acts as an interface to Libcheck.
  • As a PDB file which includes all hydrogen atoms.
  • As a PDB file without hydrogen atoms, read into the CCP4i Sketcher for the user to provide the bond order which is not defined in the PDB file.
  • As a CIF file which contains one of the required combinations of information.
  • As a SMILES string which is converted to a minimal description in CIF format by the program SMILES2DICT.

The easiest way to interact with Libcheck is using the CCP4i Sketcher even if you already have either a PDB or CIF coordinate file.

Please note that after generating a complete description from a minimal one it is advisable to check the complete description carefully. Also note that some of the entries in the library files are only minimal descriptions from which the complete description is derived. The minimal descriptions are derived from the PDB dictionary of ligands, which may contain errors for which the author(s) cannot take responsibility.

Library Files

The format of monomer descriptions and all library files is an extension of mmCIF. All attribute values in a CIF file are preceded by the name for that attribute. The recognised types of attribute are defined in a dictionary file which gives a definition for each attribute, and which should make the library files self-explanatory.
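
As a rough illustration of the name-then-value layout, the sketch below reads a single loop_ table, in which the attribute names are listed first and each following row supplies one value per name. It is a deliberately simplified reader, not a full CIF parser: it assumes exactly one loop, whitespace-separated values, and no quoted or multi-line values. The attribute names and rows in SAMPLE are illustrative and should be checked against the dictionary file.

```python
SAMPLE = """
loop_
_chem_comp_atom.comp_id
_chem_comp_atom.atom_id
_chem_comp_atom.type_symbol
_chem_comp_atom.type_energy
ALA N  N NH1
ALA CA C CH1
ALA O  O O
"""


def parse_loop(text):
    """Return the rows of a single loop_ table as dicts keyed by attribute name."""
    tags, rows = [], []
    in_loop = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue                      # skip blank lines and comments
        if line == "loop_":
            in_loop = True
            continue
        if not in_loop:
            continue
        if line.startswith("_") and not rows:
            tags.append(line)             # header: one attribute name per line
        else:
            values = line.split()
            if len(values) != len(tags):
                raise ValueError(f"expected {len(tags)} values, got: {line!r}")
            rows.append(dict(zip(tags, values)))
    return rows


rows = parse_loop(SAMPLE)
for row in rows:
    print(row["_chem_comp_atom.atom_id"], row["_chem_comp_atom.type_energy"])
```

For real library files a dedicated CIF reader is the safer choice, since they contain several data blocks and loops.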

The monomer library files describe the internal geometry of a monomer and may contain complete or minimal descriptions. The library files distributed with CCP4 are in the directory $CLIBD_MON ($CCP4/lib/data/monomer). Do not alter these files in any way, as this would corrupt the running of Refmac! If you want to change a description, supply your own additional library containing the correct description. In this case the last correct description will be used instead of the distributed one.
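
The "last description wins" behaviour can be pictured as a dictionary merge in which later libraries override earlier ones. The sketch below models only that idea; it does not show how Refmac actually reads or validates libraries, and the function name, monomer codes and description strings are hypothetical.

```python
def merge_libraries(*libraries):
    """Merge monomer libraries in order; a later entry replaces an earlier one."""
    merged = {}
    for library in libraries:
        merged.update(library)
    return merged


# Hypothetical contents: monomer code -> description
distributed = {"ALA": "distributed description", "LIG": "distributed description"}
user = {"LIG": "user description"}

merged = merge_libraries(distributed, user)
print(merged["ALA"])  # distributed description
print(merged["LIG"])  # user description
```

The distributed dictionary is left unmodified, which mirrors the advice to keep the files in $CLIBD_MON untouched.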

See an example of a complete monomer description.

Distributed library files containing complete descriptions:

  • mon_lib_prot.cif (peptides)
  • mon_lib_sug.cif (sugars)
  • mon_lib_na.cif (nucleic acids)
  • mon_lib_met.cif (metals)
  • mon_lib_1.cif (small molecules)
  • mon_lib_2.cif (small molecules)
  • mon_lib_3.cif (small molecules)
  • mon_lib_4.cif (small molecules)
  • mon_lib_5.cif (small molecules)
  • mon_lib_com.cif
  • mon_lib_ind.cif

Distributed library files containing minimal descriptions:

  • mon_lib_6.cif
  • mon_lib_7.cif
  • mon_lib_8.cif
  • mon_lib_9.cif
  • mon_lib_10.cif
  • mon_lib_11.cif
  • mon_lib_12.cif
  • mon_lib_13.cif

There are two additional files in the $CLIBD_MON directory:

ener_lib.cif
contains a complete list of VdW and ionic radii and target values for bond distances, angles and torsions for the different atom chemical types. When Libcheck generates a complete monomer description from a minimal one, target values are usually taken from ener_lib.cif and associated with the bond, angle or torsion in the monomer description file. Alternatively, Libcheck can extract target values from an input coordinate file (see COOR keyword). The user can edit the values in the monomer description file or change the values in ener_lib.cif.
When applying the monomer descriptions, the refinement procedure uses the chemical type in the monomer description to cross-reference the VdW and ionic radii in ener_lib.cif.

mon_lib_com.cif
contains:
  • The chemical structure of links between monomers in polymers (e.g. cis- and trans-peptide bonds, disulfide bridges, glycosidic bonds for sugars, phosphate bridges for DNA)
  • Chemical details of common modifications of monomers (e.g. termini of polypeptide chains, sugar modifications, termini of DNA/RNA)

See details in library of monomers.