At the end of refinement it is useful to rationalise the H2O naming. You may, for example, have more than one molecule in the asymmetric unit, or two isomorphous structures, and want to compare their H2O structures.
This program has two purposes.
The distance search is done with the program DISTANG, which must be run first. WATERTIDY then reads in the DISTANG output ("log file") which lists all close contacts, and does some preliminary analysis of H2O contacts (e.g. contact too close, C involved in close contact, number of contacts per chain).
This raises another problem: what to do about H2Os which are bonded to more than one host atom? The solution used here is to list such H2Os more than once, giving the site closest to a host atom the input occupancy, and all secondary sites the occupancy <occw> (default value 0.01, see keyword OCCW).
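The occupancy rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not WATERTIDY's own code: the `WaterSite` record, its `host` string and the `expand_sites` function are hypothetical names introduced here.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class WaterSite:
    """One contact of a water to a host atom (hypothetical record layout)."""
    serial: int        # atom serial number of the water in the input file
    host: str          # free-form host description, e.g. "GLU A 4 OE1"
    distance: float    # contact distance to that host atom
    occupancy: float   # occupancy read from the input file


def expand_sites(contacts, occw=0.01):
    """Return one record per contact of a single water.

    The contact closest to a host atom keeps the input occupancy; every
    other contact becomes a secondary site with occupancy `occw`.
    With occw == 0.0 the secondary sites are dropped entirely.
    """
    if not contacts:
        return []
    ordered = sorted(contacts, key=lambda c: c.distance)
    primary, secondary = ordered[0], ordered[1:]
    if occw == 0.0:
        return [primary]
    return [primary] + [replace(c, occupancy=occw) for c in secondary]
```

For a water with contacts to both an OE1 and an N atom, the closer contact is listed with the input occupancy and the other with 0.01. The distances passed in are whatever the caller supplies; the sketch does not compute them.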
The program can be run first to find the H2Os linked to the protein molecule; a second or third pass then applies the same rules to rename H2Os in the second or third solvent shell, which will not have been renamed in the previous pass.
All atoms that are not relabelled are output exactly as input.
WATERTIDY names the waters with the appropriate output ID and a label containing information about which residue and atom type the water is H-bonded to. An H2O is labelled in the output PDB file as
    O<i><j> WAT <chnid> <nres>

where <nres> is the host residue number and <chnid> is the assigned output ID. <i> and <j> are defined as follows:

    0 for N
    1 for O
    2 for OG OG1
    3 for OD1 ND1
    4 for OD2 ND2
    5 for OE OE1 NE1
    6 for OE2 NE2
    7 for NZ
    8 for OH OH1 NH1
    9 for OH2 NH2

Additional assignments for <i> are made as follows:

    0 for OW
    <n> for O<n> or OW<n> where n=0-9
    <n> for O<n><m> where n,m=0-9

The number <j> (range 0-3) numbers the contact of the H2O to the protein atom; up to <hbond> H2Os can be bonded to one atom. An extension to allow other acceptor atoms (e.g. C S etc.) means that the numbering has to be modified slightly.

    0 for CA as well
    1 for C as well
    2 for CG CG1 as well
    3 for CD CD1 as well
    4 for CD2 CD3 as well
    5 for CE CE1 as well
    6 for CE2 CE3.. as well
    7 for CZ as well
    8 for CH CH1 as well
    9 for CH2 CH3.. as well
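The labelling scheme above can be sketched as a lookup from host atom name to <i>, combined with a contact number <j>. This is a minimal illustration, not the program's implementation: `HOST_CODE`, `host_code` and `water_label` are hypothetical names, only the atom names listed explicitly in the tables are included (the ".." entries are not expanded), and the function only checks that <j> is a single digit, leaving its assignment to the caller.

```python
import re

# <i> codes transcribed from the tables above (explicitly listed names only).
HOST_CODE = {
    "N": 0, "O": 1,
    "OG": 2, "OG1": 2,
    "OD1": 3, "ND1": 3,
    "OD2": 4, "ND2": 4,
    "OE": 5, "OE1": 5, "NE1": 5,
    "OE2": 6, "NE2": 6,
    "NZ": 7,
    "OH": 8, "OH1": 8, "NH1": 8,
    "OH2": 9, "NH2": 9,
    # water host with no digit
    "OW": 0,
    # extended acceptors
    "CA": 0, "C": 1,
    "CG": 2, "CG1": 2,
    "CD": 3, "CD1": 3,
    "CD2": 4, "CD3": 4,
    "CE": 5, "CE1": 5,
    "CE2": 6, "CE3": 6,
    "CZ": 7,
    "CH": 8, "CH1": 8,
    "CH2": 9, "CH3": 9,
}

# O<n>, OW<n> and O<n><m>: the first digit gives <i>.
_WATER_HOST = re.compile(r"O(?:W(\d)|(\d)\d?)")


def host_code(atom_name):
    """Return the <i> digit for a host atom name."""
    name = atom_name.strip().upper()
    if name in HOST_CODE:
        return HOST_CODE[name]
    match = _WATER_HOST.fullmatch(name)
    if match:
        return int(match.group(1) or match.group(2))
    raise ValueError(f"no <i> code listed for atom name {atom_name!r}")


def water_label(atom_name, contact_number):
    """Build the O<i><j> label for a water bonded to `atom_name`."""
    if not 0 <= contact_number <= 9:
        raise ValueError("<j> must be a single digit")
    return f"O{host_code(atom_name)}{contact_number}"
```

For instance, the first water bonded to a main-chain N would be labelled `O00`, and the first one bonded to an OE1 would be labelled `O50`.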
When you have assigned as many shells as you feel are needed, re-sort the output water atoms of the PDB file on <chnid>, residue number, etc., using the system sort utility. On Unix, the following sorts on <chnid> first, then residue number, then atom number:
    sort +4 -5 +5 -6 +3.1 -3.3 wat.pdb > wat_sorted.pdb

A VMS example is

    $SORT/KEY=(POS:21,SIZE:6)/key=(pos:15,size:1)/key=(pos:16,size:1) -
    /key=(pos:7,size:6) DSCR:DPI047R2.pdb DSCR:DPI047R2.pdb

BEWARE: Your CRYSTAL and SCALE cards will be scrambled by the sorting.
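One way to avoid scrambling the header cards is to sort only the atom records and leave everything else in place. The Python sketch below is an assumption-laden alternative to the system sort, not part of WATERTIDY: `sort_atom_records` is a hypothetical helper, it assumes fixed-column PDB records, and its keys simply mirror the character positions of the VMS example (columns 21-26, 15, 16, then 7-12).

```python
def sort_atom_records(lines):
    """Sort ATOM/HETATM records; leave all other records where they belong.

    Records before the first atom (REMARK, SCALE, ...) stay at the top in
    input order; any non-atom records found later (e.g. END) go to the bottom.
    """
    def is_atom(line):
        return line.startswith(("ATOM", "HETATM"))

    def key(line):
        padded = line.ljust(26)          # guard against short lines
        return (
            padded[20:26],               # chain ID and residue number
            padded[14],                  # <i> digit of the atom name
            padded[15],                  # <j> digit of the atom name
            padded[6:12],                # atom serial number
        )

    atom_positions = [i for i, line in enumerate(lines) if is_atom(line)]
    if not atom_positions:
        return list(lines)

    first = atom_positions[0]
    head = list(lines[:first])
    tail = [line for line in lines[first:] if not is_atom(line)]
    atoms = sorted((line for line in lines if is_atom(line)), key=key)
    return head + atoms + tail
```

The keys are compared as strings, which orders right-justified numbers of equal width correctly. The function takes a list of lines with the newlines already stripped, for example the result of `splitlines()` on the contents of wat.pdb.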
Available keywords are:
ACCEPT, CHNID, END, HBOND, OCCW, SHELL, SYMMETRY, TITLE, WATID.
ACCEPT: Specify extra acceptor atom types as single characters. The default acceptors are O and N.
HBOND: Maximum number of waters bonded to one atom (<hbond>, default 4).
OCCW: Occupancy for secondary sites (<occw>, default 0.01). If <occw> is set to 0.0 then secondary sites are not written to XYZOUT.
SHELL: Shell number (up to 3), default 1.
SYMMETRY: Standard symmetry specification. This must be the same as used for DISTANG.
TITLE: <title> is written to the output PDB file as a REMARK.
WATID: Water chain ID: the chain identifier, as it appears in XYZIN, of the unassigned H2Os to be assigned in this pass.
REMARK
REMARK
SCALE2       0.00000   0.03820   0.00000        0.00000
SCALE3       0.00000   0.00000   0.01937        0.00000
SCALE1       0.01897   0.00000   0.00099        0.00000
ATOM      1  N   GLY A   1      -8.094   0.714  38.861  1.00 19.52
...
ATOM     18  C   VAL A   3     -10.635   2.653  34.037  1.00 15.79
ATOM     13  N   VAL A   3      -8.153   2.210  33.953  1.00 16.23
...
ATOM     25  N   GLU A   4     -10.661   2.145  35.262  1.00 13.58
ATOM     28  O   GLU A   4     -12.831   4.702  36.359  1.00 15.64
....
ATOM     21  OE1 GLU A   4      -9.572   0.074  36.837  1.00 30.05
ATOM     20  OE2 GLU A   4     -11.042  -1.224  35.968  1.00 32.63
....
ATOM    769  O00 WAT P   1      -8.453  -1.913  39.350  1.00 45.10   A H2O bonded to the N of GLY A 1...
ATOM    772  O00 WAT P   3      -7.612  -0.514  34.997  0.01 22.90   A H2O bonded to the N of VAL A 3...
ATOM    750  O10 WAT P   4     -14.304   4.121  38.925  1.00 25.25
ATOM    772  O50 WAT P   4      -7.612  -0.514  34.997  1.00 22.90
...
ATOM    795  O04 WAT T   3      -5.847  -2.930  35.432  0.01 30.04
ATOM    749  O14 WAT T   4     -11.391   4.228  40.350  1.00 32.06
ATOM    811  O15 WAT T   4     -14.681   2.966  41.308  1.00 56.74
ATOM    795  O54 WAT T   4      -5.847  -2.930  35.432  0.01 30.04
...
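When reading output like the listing above, it can help to pick out the waters that appear more than once, since these are the ones with secondary sites. The following Python sketch is an illustration only: `repeated_waters` is a hypothetical helper, and it assumes the records can be split on whitespace (as the Unix sort example does), which will fail on files where adjacent columns run together.

```python
from collections import defaultdict


def repeated_waters(pdb_lines, water_resname="WAT"):
    """Return waters that are listed more than once, keyed by serial number.

    Each value is a list of (label, chain, residue_number, occupancy)
    tuples, with the highest-occupancy listing first.
    """
    by_serial = defaultdict(list)
    for line in pdb_lines:
        fields = line.split()
        # ATOM serial name resname chain resnum x y z occ B [annotation...]
        if len(fields) < 11 or fields[0] != "ATOM" or fields[3] != water_resname:
            continue
        serial = int(fields[1])
        label, chain, resnum = fields[2], fields[4], int(fields[5])
        occupancy = float(fields[9])
        by_serial[serial].append((label, chain, resnum, occupancy))

    return {
        serial: sorted(listings, key=lambda rec: -rec[3])
        for serial, listings in by_serial.items()
        if len(listings) > 1
    }
```

Applied to the P-chain waters in the listing, this would report atom 772 twice: once as O50 on residue 4 with the input occupancy, and once as O00 on residue 3 with occupancy 0.01.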
distang, pdbset, sort (1)