MATTHEWS_COEF (CCP4: Supported Program)

NAME

matthews_coef - Misha Isupov's Jiffy to calculate Matthews coefficient.

SYNOPSIS

matthews_coef
[Keyworded input]

DESCRIPTION

The Matthews Coefficient and solvent content are calculated from the unit cell and the molecular weight of the molecules in the unit cell. A description of the Matthews coefficient Vm and how it relates to solvent content is given below.

Input:

The program requires the information below which is input via keywords. No input files are required.

Output:

No output files are generated; below is a sample of the log output.

  THE MATTHEWS COEF. IS :  1.74
  SOL % IS : 28.96

or, if used with the AUTO keyword:

For estimated protein molecular weight    6000.
Nmol/asym  Matthews Coeff   %solvent
  1          6.6             81.2
  2          3.3             62.5
  3          2.2             43.7
  4          1.7             24.9
  5          1.3              6.1

KEYWORDED INPUT

Available keywords are:

AUTO, CELL, MOLWEIGHT, NMOL, NRES, SYMMETRY, XMLOUTPUT

Compulsory keywords

CELL a b c [alpha beta gamma]

You must give the unit cell parameters. The angles default to 90.0 if omitted.

SYMMETRY

Either the spacegroup number or name can be given. Alternatively, the symmetry operators can be input explicitly, each separated with a '*'. However, the program only requires the total number of operators.

Extra keywords

NRES <number_of_residues>

This is used to estimate the molecular weight of one molecule in Daltons. It is assumed that, on average, each residue contains 5 carbon, 1.35 nitrogen, 1.5 oxygen, 8 hydrogen and 0.05 sulphur atoms and has a molecular weight of about 110.
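As a rough check on the figure of 110, the average composition above can be summed with standard atomic masses. This is only a sketch: the function names and the rounded atomic masses are assumptions made for illustration, not values taken from the program source.

```python
# Average residue composition quoted in the text (atoms per residue).
AVERAGE_RESIDUE = {"C": 5.0, "N": 1.35, "O": 1.5, "H": 8.0, "S": 0.05}

# Rounded standard atomic masses in Daltons (assumed, not from the program).
ATOMIC_MASS = {"C": 12.011, "N": 14.007, "O": 15.999, "H": 1.008, "S": 32.06}


def average_residue_mass():
    """Mass in Daltons of the 'average' residue described in the text."""
    return sum(count * ATOMIC_MASS[element]
               for element, count in AVERAGE_RESIDUE.items())


def estimate_molweight(nres, mass_per_residue=110.0):
    """Rough molecular weight of one molecule from its residue count."""
    return nres * mass_per_residue


if __name__ == "__main__":
    print("average residue mass:", round(average_residue_mass(), 1))
    print("estimated MW for 60 residues:", estimate_molweight(60))
```

The summed composition comes out slightly above 110, which is consistent with "about 110" being a deliberately round estimate.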

MOLWEIGHT <molecular_weight>

The molecular weight of a molecule in Daltons. What is important is the total molecular weight of the molecules in the asymmetric unit. This keyword is used in conjunction with NMOL. If this is not given, the program calculates a tentative molecular weight of the molecule, assuming the unit cell is 50% protein.

NMOL <number>

The <number> of molecules per asymmetric unit (default 1). This keyword is not compulsory; it is used in conjunction with MOLWEIGHT.

AUTO

This keyword is not compulsory and can be used in conjunction with NMOL and MOLWEIGHT. It lists the Matthews coefficient and %solvent for an increasing number of molecules in the asymmetric unit, starting from NMOL (default 1) and continuing while the %solvent is > 0.0.
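The behaviour described for AUTO can be sketched as a simple loop. This is not the program's code: the function name auto_table is hypothetical, and the constant 1.23 is the rounded value from the relation Vp = 1.23 / Vm given under PROGRAM FUNCTION, so the percentages may differ slightly from the program's own log.

```python
def auto_table(cell_volume, n_symops, molweight, nmol_start=1, k=1.23):
    """Return (nmol, Vm, %solvent) rows while the solvent content stays > 0.

    cell_volume is in A**3 and molweight in Daltons. k is the rounded
    constant from Vp = 1.23 / Vm (an assumption; the program may use a
    slightly different value).
    """
    rows = []
    nmol = nmol_start
    while True:
        vm = cell_volume / (molweight * n_symops * nmol)
        solvent = 100.0 * (1.0 - k / vm)
        if solvent <= 0.0:
            break  # more protein than the cell can hold: stop listing
        rows.append((nmol, vm, solvent))
        nmol += 1
    return rows


if __name__ == "__main__":
    # Cell volume 66085.78 A**3, 4 symmetry operators, MW 6600 Da.
    print("Nmol/asym  Matthews Coeff   %solvent")
    for nmol, vm, solvent in auto_table(66085.78, 4, 6600.0):
        print(f"{nmol:3d} {vm:14.2f} {solvent:14.1f}")
```

Because Vm falls as the number of molecules rises, the loop always terminates for positive inputs.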

XMLOUTPUT

This keyword is of little use to the ordinary user. When it is specified, matthews_coef writes a small XML file of the results. The name and location of the XML file can be specified on the command line with XMLFILE; otherwise the file is called MATTHEWS_COEF.xml.

Example of input

CELL 73.58 38.73 23.19
SYMM 19
MOLW 6600.0
AUTO

Example of output file

<?xml version="1.0"?>
 <matthews_run>
  <MATTHEWS_COEF
    ccp4_version="4.1" 
    date=" 1/25/02" 
   />
  <keyword
  >
  
  </keyword>
  <cell
    volume="   66085.78" 
   />
  <result
    nmol_in_asu="           1" 
    matth_coef="   2.503249" 
    percent_solvent="   50.47841" 
   />
  <result
    nmol_in_asu="           2" 
    matth_coef="   1.251625" 
    percent_solvent="  0.9568155" 
   />
 </matthews_run>

PROGRAM FUNCTION

Matthews Number

Vm =   cell volume (A**3)           V
       -----------------------   = ---
           M*nasymu*nmols_asu      M*Z

         M         =   molecular weight of protein in Daltons
         V         =   volume of unit cell (A**3)
         Z         =   number of molecules in unit cell = nasymu*nmols_asu
         nasymu    =   number of asymmetric units in unit cell
         nmols_asu =   number of molecules in asymmetric unit
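The definition above can be worked through numerically. The sketch below uses hypothetical helper names (cell_volume, matthews_coefficient) and the standard general formula for the volume of a unit cell; it is an illustration, not the program's implementation.

```python
import math


def cell_volume(a, b, c, alpha=90.0, beta=90.0, gamma=90.0):
    """Unit cell volume in A**3 from cell lengths (A) and angles (degrees)."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(
        1.0 - ca * ca - cb * cb - cg * cg + 2.0 * ca * cb * cg)


def matthews_coefficient(volume, molweight, nasymu, nmols_asu=1):
    """Vm = V / (M * Z), with Z = nasymu * nmols_asu."""
    return volume / (molweight * nasymu * nmols_asu)


if __name__ == "__main__":
    # Cell from the examples on this page; space group 19 has 4 operators.
    v = cell_volume(73.58, 38.73, 23.19)
    vm = matthews_coefficient(v, molweight=6600.0, nasymu=4, nmols_asu=1)
    print(f"volume = {v:.2f} A**3, Vm = {vm:.4f} A**3/Dalton")
```

For this cell the result agrees with the volume and matth_coef values shown in the XML example.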


Molecular weight

          = number of protein residues in molecule * 110
                                              - very roughly!!!
          = number of non hydrogen protein atoms in molecule *14 
                                              - roughly!!!!

Use RWCONTENTS to read your PDB file if you have one; it will count number of atoms of each type.

Matthews found Vm to lie between about 1.66 and 4.0 A**3/Dalton, corresponding to protein contents of about 75% to 30%. Crystals with higher solvent content give higher values of Vm; for example, a solvent content of 90% corresponds to a Vm of about 12.

Using this you can calculate Vm assuming nmols_asu = 1/4, 1/2, 1, 2, 3, etc. You MAY be able to narrow down the number of possibilities for nmols_asu. If Vm falls outside the range above, then the assumed number of molecules per asymmetric unit is likely to be incorrect.
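The screening described here can be sketched as follows. The function name plausible_nmols and the exact cut-offs (1.66 and 4.0, taken from the range quoted above) are illustrative assumptions; real crystals can fall outside this range.

```python
def plausible_nmols(volume, molweight, nasymu,
                    candidates=(0.25, 0.5, 1, 2, 3),
                    vm_min=1.66, vm_max=4.0):
    """Return (nmols_asu, Vm) pairs whose Vm lies within [vm_min, vm_max].

    volume is the unit cell volume in A**3, molweight is in Daltons and
    nasymu is the number of asymmetric units in the unit cell.
    """
    hits = []
    for nmol in candidates:
        vm = volume / (molweight * nasymu * nmol)
        if vm_min <= vm <= vm_max:
            hits.append((nmol, vm))
    return hits


if __name__ == "__main__":
    # Cell volume 66085.78 A**3, 4 asymmetric units, MW 6600 Da.
    for nmol, vm in plausible_nmols(66085.78, 6600.0, 4):
        print(f"nmols_asu = {nmol}: Vm = {vm:.2f}")
```

For the example cell on this page only one molecule per asymmetric unit gives a Vm inside the quoted range.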

Protein fraction

Turning this into the fraction of the asymmetric unit volume occupied by protein:

            Total mass of Protein in unit cell
Vp  =    ---------------------------------------
          Protein density   *  Unit cell volume


Vp  = M*Z*u/(V*Dp)    = 1/(N*Dp*Vm) 
 
where   Vp = fraction of protein volume in asymmetric unit.
        Vm = Matthews Number        (A**3/Dalton)
        Dp = density of protein = 1.35  (g/cc)   (ref 1)
        N  = Avogadro constant  = 6.022*10**23  mole**(-1)
        u  = atomic mass unit   = 1.66*10**-24  g
             (approximately the mass of a hydrogen atom)

(It is sufficient to approximate u as 1/N because the mass of
1 mole of hydrogen atoms is approximately 1 g. V must be converted
from A**3 to cc, 1 A**3 = 10**-24 cc, before dividing.)

==> From this it is easy to obtain the formula derived in Matthews, i.e.

                       Vp = 1.66*v / Vm
                          = 1.23 / Vm
                          ( 1/Dp is Matthews' v = 0.74 cc/g )
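The unit bookkeeping behind this result can be checked with a short sketch. The names below are hypothetical, and the constants are the rounded values quoted in this section. The XML example on this page reports 50.48% solvent for Vm = 2.503, so the program evidently uses slightly different constants and its output will not match this sketch to the last digit.

```python
ATOMIC_MASS_UNIT_G = 1.66e-24     # grams per Dalton, as quoted in the text
CC_PER_CUBIC_ANGSTROM = 1.0e-24   # 1 A**3 = 1e-24 cc
PROTEIN_DENSITY = 1.35            # g/cc, as quoted in the text


def protein_fraction(vm, dp=PROTEIN_DENSITY):
    """Vp = u / (Dp * Vm), with Vm converted from A**3/Dalton to cc/Dalton."""
    return ATOMIC_MASS_UNIT_G / (CC_PER_CUBIC_ANGSTROM * dp * vm)


def solvent_percent(vm, dp=PROTEIN_DENSITY):
    """Percentage of the cell volume not occupied by protein."""
    return 100.0 * (1.0 - protein_fraction(vm, dp))


if __name__ == "__main__":
    for vm in (1.66, 2.5, 4.0):
        print(f"Vm = {vm:4.2f}  Vp = {protein_fraction(vm):.3f}  "
              f"solvent = {solvent_percent(vm):.1f}%")
```

The two powers of ten cancel, leaving Vp = 1.66 / (1.35 * Vm), which is the 1.23 / Vm relation above.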

Alternatively:

 
Vp  = Np* AV/V

where   Np  = number of protein atoms in unit cell 
             (including hydrogens)
        AV  = average atomic volume = 10 A**3 approximately.

             (There are about the same number of hydrogens 
              as C N O etc.)


If Vp is the fraction of protein volume in the asymmetric unit, the crystal density (g/cc) is

Density  =    Dp *Vp  +   Ds* (1-Vp)
         =   1.35*Vp  + 1.0 * (1-Vp)
         =   0.35*Vp  + 1.0

where   Ds = density of solvent = 1.0 g/cc for H2O

therefore    Vp  =   (Density - 1.0)/0.35

If you know the density you can work backwards and find the number of molecules in the asymmetric unit exactly.
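Working backwards from a measured crystal density can be sketched as below. The function name nmol_from_density is hypothetical, and the sketch assumes the rounded values used in this section (Dp = 1.35, Ds = 1.0, Vp = 1.23 / Vm). A real solvent is rarely pure water, so the result should be read as approximate and rounded to the nearest sensible number of molecules.

```python
def nmol_from_density(density, volume, molweight, nasymu,
                      dp=1.35, ds=1.0, k=1.23):
    """Estimate molecules per asymmetric unit from crystal density (g/cc).

    Steps: Vp = (density - Ds) / (Dp - Ds), Vm = k / Vp,
    nmols_asu = V / (Vm * M * nasymu).
    """
    if density <= ds:
        raise ValueError("density must exceed the solvent density")
    vp = (density - ds) / (dp - ds)
    vm = k / vp
    return volume / (vm * molweight * nasymu)


if __name__ == "__main__":
    # Hypothetical measured density for the example cell on this page.
    n = nmol_from_density(1.172, 66085.78, 6600.0, 4)
    print(f"estimated molecules per asymmetric unit: {n:.2f}")
```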

EXAMPLES

  1. matthews_coef << eof
    CELL 73.58 38.73 23.19
    symm 19
    molweight 6600.0
    nmol 1
    eof
    
  2. With keyword 'AUTO'
    matthews_coef << eof
    CELL 73.58 38.73 23.19
    SYMM 19
    MOLW 6600.0
    AUTO
    eof
    

AUTHORS

Originator: Misha Isupov
Additions by: Alun Ashton a.w.ashton@ccp4.ac.uk, Eleanor Dodson ccp4@ysbl.york.ac.uk

REFERENCES

  1. Matthews, B.W., J. Mol. Biol. 33, 491-497 (1968).

SEE ALSO

rwcontents (1)