DETWIN (CCP4: Supported Program)
NAME
detwin - tests for merohedral twinning, and detwins data
detwin HKLIN foo_in.mtz HKLOUT foo_out.mtz
Twinned data is measured when two or more copies of the reciprocal lattice overlap. Hence we need to deconvolute the two twinned components to obtain useable data. The equations for two overlapping reflections, Itrue(h1) and Itrue(h2), are:
ITw(h1) = (1-tf)*Itrue(h1) + tf*Itrue(h2)
ITw(h2) = tf*Itrue(h1) + (1-tf)*Itrue(h2)

where tf is the twin fraction, often denoted as 'alpha'

thus Itrue(h1) = ((1-tf)*ITw(h1) - tf*ITw(h2)) / (1-2*tf)
thus Itrue(h2) = ((1-tf)*ITw(h2) - tf*ITw(h1)) / (1-2*tf)

also var(h1) = ((1-tf)/(1-2*tf))**2 * sdTw(h1)**2 + (tf/(1-2*tf))**2 * sdTw(h2)**2
     var(h2) = ((1-tf)/(1-2*tf))**2 * sdTw(h2)**2 + (tf/(1-2*tf))**2 * sdTw(h1)**2
This deconvolution is only possible if tf is not exactly 0.5. As it approaches this value the variances become extremely large.
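The deconvolution equations above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the program's own code: the function name detwin_pair and the numbers in the round trip are made up for the example.

```python
def detwin_pair(i_tw1, i_tw2, sd_tw1, sd_tw2, tf):
    """Deconvolute one pair of twin-related intensities.

    Returns (Itrue(h1), Itrue(h2), var(h1), var(h2)) using the
    equations given above. tf is the twin fraction and must not be 0.5.
    """
    denom = 1.0 - 2.0 * tf
    if abs(denom) < 1e-12:
        raise ValueError("tf = 0.5: the two components cannot be separated")
    a = (1.0 - tf) / denom   # weight on the reflection's own measurement
    b = tf / denom           # weight on the twin-related measurement
    i_true1 = a * i_tw1 - b * i_tw2
    i_true2 = a * i_tw2 - b * i_tw1
    var1 = a**2 * sd_tw1**2 + b**2 * sd_tw2**2
    var2 = a**2 * sd_tw2**2 + b**2 * sd_tw1**2
    return i_true1, i_true2, var1, var2


# Round trip with made-up numbers: twin two intensities, then detwin them.
tf = 0.3
i_true = (100.0, 40.0)
i_tw1 = (1 - tf) * i_true[0] + tf * i_true[1]
i_tw2 = tf * i_true[0] + (1 - tf) * i_true[1]
recovered = detwin_pair(i_tw1, i_tw2, 5.0, 5.0, tf)

# The same input errors give much larger variances close to tf = 0.5.
near_half = detwin_pair(i_tw1, i_tw2, 5.0, 5.0, 0.49)
```

The weights a and b grow as 1/(1-2*tf), which is why the variances blow up as tf approaches 0.5.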
The occurrence of twinning can be recognised from the intensity statistics of the data set. This program carries out several tests for twinning as a function of both twinning fraction, and resolution. It can also detwin merohedrally twinned data for a given twinning fraction. It reads and writes either intensities or amplitudes from an MTZ file, with the output information corrected for a given twinning fraction.
The twin operator is required (see TABLE OF POSSIBILITIES). If an MTZ file is to be output, the chosen twin fraction must also be given.
The twinning tests performed are:
The partial twin test (see reference ) <H> = <|Itw1 -Itw2|/(Itw1 +Itw2)> plotted against theoretical expectations.
The estimate of the twinning factor as (0.5 -<H>), plotted against resolution.
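As a rough illustration of the two tests above, the following Python sketch computes <H> over twin-related pairs and the estimate 0.5 - <H>. The function names and the simulated data are assumptions made for the example: the simulation draws untwinned acentric intensities from an exponential distribution and twins them with a known fraction.

```python
import random


def h_statistic(pairs):
    """Mean of H = |Itw1 - Itw2| / (Itw1 + Itw2) over twin-related pairs."""
    values = []
    for i_tw1, i_tw2 in pairs:
        total = i_tw1 + i_tw2
        if total <= 0.0:
            continue  # H is undefined for a non-positive sum
        values.append(abs(i_tw1 - i_tw2) / total)
    if not values:
        raise ValueError("no usable pairs")
    return sum(values) / len(values)


def twin_fraction_estimate(pairs):
    """Estimate of the twinning factor as 0.5 - <H>."""
    return 0.5 - h_statistic(pairs)


# Simulated example (assumption: exponential acentric intensities).
rng = random.Random(1)
true_tf = 0.2
pairs = []
for _ in range(20000):
    i1, i2 = rng.expovariate(1.0), rng.expovariate(1.0)
    pairs.append(((1 - true_tf) * i1 + true_tf * i2,
                  true_tf * i1 + (1 - true_tf) * i2))

estimate = twin_fraction_estimate(pairs)
```

With real data the estimate would normally be computed in resolution shells, since weak high-resolution data make H unreliable.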
Those tabulated against the twinning factor, for values ranging from 0.00 to 0.48, are
If the output MTZ file contains IMEAN, it can be run through TRUNCATE again, and the other moments and the N(z) test examined to see if the intensity statistics now follow the expected distribution for non-twinned data. See the cumulative distribution plot, which becomes sigmoidal for twinned data, and the moments of I (or E or z), which differ between twinned and untwinned data.
The general formulas for expected moments <I^k> /<I>^k for untwinned acentric data are:
Table of moments:

  k-th moment of I = Gamma(k+1) = k!   if k is an integer
  k-th moment of I = Gamma(k+1)        if k equals integer+0.5 (the odd moments
                                       of E), built up from Gamma(0.5) = sqrt(PI),
                                       e.g. <E> = Gamma(1.5) = sqrt(PI)/2
  In general Gamma(k+1) = k Gamma(k)

                    Acentric                        Centric
          Untwinned data  Perfect twin    Untwinned data  Perfect twin
  <E>          0.886         0.94              0.798           ?
  <E^3>        1.329         1.175             1.596           ?
  <I^2>        2.0           1.5               3.0             ?
  <I^3>        6.0           3.0              15.0             ?
  <I^4>       24.0           7.5             105.0             ?
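The acentric entries of the table can be evaluated directly from the Gamma function. The sketch below is illustrative only. The untwinned formula is the one given above. The perfect-twin formula, Gamma(k+2)/2**k, is an assumption of the example: it follows from treating a perfectly twinned acentric intensity as the mean of two independent untwinned acentric intensities.

```python
import math


def untwinned_acentric_moment(k):
    """<I^k>/<I>^k for untwinned acentric data: Gamma(k+1)."""
    return math.gamma(k + 1.0)


def perfect_twin_acentric_moment(k):
    """<I^k>/<I>^k for a perfect acentric twin.

    Assumption: the twinned intensity is the mean of two independent
    untwinned acentric intensities, which gives Gamma(k+2) / 2**k.
    """
    return math.gamma(k + 2.0) / 2.0**k


# Moments of E of order n correspond to moments of I of order n/2.
rows = [("<E>", 0.5), ("<E^3>", 1.5), ("<I^2>", 2), ("<I^3>", 3), ("<I^4>", 4)]
for label, k in rows:
    print(f"{label:6s} {untwinned_acentric_moment(k):8.3f} "
          f"{perfect_twin_acentric_moment(k):8.3f}")
```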
INPUT AND OUTPUT FILES
The following input and output files are used by the program:
KEYWORDED INPUT
The various data control lines are identified by keywords. Only the first 4 letters of each keyword are necessary. The possible keywords are:
DEBUG, LABIN, RESOLUTION, OPERATOR, SIGMACUT, TITLE, TWIN_FRACTION
TITLE <title> [OPTIONAL INPUT]
Title to write to output reflection file.
Default is to keep the title on the input MTZ file. If there is no title on the input MTZ file, then the title is set to: "From Detwin on the <date>"
SIGMACUT <Nsig>
The various tests are carried out with all reflections, and repeated for those reflections with I greater than <Nsig>*sigma(I). Tests such as the Britton plot or those using <H> are unreliable for weak data.
RESOLUTION <Dmin> <Dmax>
Resolution limits - either 4(sin theta/lambda)**2 or d in Angstroms (either order). Reflections outside these limits will be excluded from all analysis and omitted on output. Defaults are taken from the range of data in the input file (i.e. all data included).
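The two ways of expressing a resolution limit are related through Bragg's law, which gives 4(sin theta/lambda)**2 = 1/d**2. The short Python sketch below shows the conversion and a range check that accepts the two d limits in either order. The function names are made up for the example, and the sketch does not attempt to reproduce how the program decides which convention was used.

```python
import math


def d_to_s(d):
    """Convert d in Angstroms to s = 4(sin theta/lambda)**2 = 1/d**2."""
    return 1.0 / (d * d)


def s_to_d(s):
    """Convert s = 4(sin theta/lambda)**2 back to d in Angstroms."""
    return 1.0 / math.sqrt(s)


def within_limits(d, limit_a, limit_b):
    """True if d lies between the two limits, which may be in either order."""
    low, high = sorted((limit_a, limit_b))
    return low <= d <= high


# A 2.0 A limit expressed the other way round, and a simple range check.
s_limit = d_to_s(2.0)
kept = within_limits(3.0, 20.0, 2.0)
dropped = within_limits(1.5, 2.0, 20.0)
```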
OPERATOR <twin operator>
Twinning operator, given as the indices of the reflection that is related to the reflection (h,k,l) by the twin operator. See TABLE for likely operators for each space group. If there is only one possibility for a space group, it will be used and there is no need to input an operator. Otherwise it is COMPULSORY.
For example for P31
either OPERATOR -h,-k, l
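To make the meaning of an operator such as -h,-k,l concrete, here is a small Python sketch that applies it to a set of indices. It is a hypothetical helper, not the program's parser, and it only understands operators built from signed single indices (no sums such as h+k).

```python
def parse_operator(text):
    """Turn an operator string such as '-h,-k, l' into a function on (h, k, l).

    Illustrative only: each of the three terms must be a single index
    h, k or l with an optional sign.
    """
    terms = []
    for part in text.replace(" ", "").lower().split(","):
        sign = -1 if part.startswith("-") else 1
        axis = part.lstrip("+-")
        if axis not in ("h", "k", "l"):
            raise ValueError(f"unsupported term: {part!r}")
        terms.append((sign, "hkl".index(axis)))
    if len(terms) != 3:
        raise ValueError("an operator needs exactly three terms")

    def apply(hkl):
        # Build the twin-related indices from the original (h, k, l).
        return tuple(sign * hkl[index] for sign, index in terms)

    return apply


twin_of = parse_operator("-h,-k, l")
related = twin_of((1, 2, 3))
```

The twin-related index returned here is the h2 that pairs with h1 = (h,k,l) in the detwinning equations.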
TWIN_FRACTION <alpha>
If this is given, an output MTZ file will be written to HKLOUT with Is or Fs corrected assuming the twin fraction <alpha>. It is difficult to estimate the twin fraction accurately unless the data are of very good quality, but look at all the information plotted. To carry out the exhaustive TRUNCATE tests you will need to run the program several times, once for each trial twin fraction, and look at the resulting intensity statistics. The value which gives the best fit to the theoretical distribution for the acentric terms should be used. Alternatively, run the Uppsala program "dataman" (keyword GEminin), which will estimate the twin fraction.
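The trial-and-error search over twin fractions can be imitated on simulated data. The Python sketch below is a stand-in for repeated DETWIN and TRUNCATE runs, not a replacement for them: it detwins with each trial fraction and keeps the one whose second moment <I^2>/<I>^2 is closest to the untwinned acentric value of 2.0. Using only the second moment, the exponential simulation and all function names are assumptions of the example.

```python
import random


def detwin_all(pairs, tf):
    """Detwin every (ITw(h1), ITw(h2)) pair with trial twin fraction tf."""
    denom = 1.0 - 2.0 * tf
    a, b = (1.0 - tf) / denom, tf / denom
    out = []
    for i_tw1, i_tw2 in pairs:
        out.append(a * i_tw1 - b * i_tw2)
        out.append(a * i_tw2 - b * i_tw1)
    return out


def second_moment(intensities):
    """<I^2>/<I>^2 for a list of intensities."""
    n = len(intensities)
    mean = sum(intensities) / n
    return sum(i * i for i in intensities) / n / (mean * mean)


def scan_twin_fraction(pairs, trial_fractions, target=2.0):
    """Trial fraction whose detwinned second moment is closest to target."""
    return min(trial_fractions,
               key=lambda tf: abs(second_moment(detwin_all(pairs, tf)) - target))


# Simulated acentric data twinned with a known fraction of 0.2.
rng = random.Random(0)
true_tf = 0.2
pairs = []
for _ in range(20000):
    i1, i2 = rng.expovariate(1.0), rng.expovariate(1.0)
    pairs.append(((1 - true_tf) * i1 + true_tf * i2,
                  true_tf * i1 + (1 - true_tf) * i2))

trials = [round(0.02 * n, 2) for n in range(25)]  # 0.00 to 0.48
best = scan_twin_fraction(pairs, trials)
```

Real data carry measurement errors and may not follow ideal statistics, so the full set of TRUNCATE moments and the N(z) plot remain the better guide.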
Once a refined model has been obtained, it is possible to use the Fcalc values from this model to obtain a better estimate of the twin fraction (not yet implemented).
LABIN <program label>=<file label>...
Specify input column labels. [OPTIONAL INPUT]
Truncate takes output from SCALA and SCALEPACK2MTZ which generate standard labels. This is the most common usage of the program, in which case LABIN records are not required.
The program labels defined are:
F SIGF IMEAN SIGIMEAN I(+) SIGI(+) I(-) SIGI(-)
NB: ONLY the single data pair (F SIGF) or (IMEAN SIGIMEAN) assigned by LABIN is detwinned.
DEBUG <ndebug> [OPTIONAL INPUT]
Debug output will be printed for <ndebug> reflections.
Modifications by E.J.Dodson (York)
#!/bin/csh -fv
detwin hklin /ss5/hotaylor/andrew/h1_scala.mtz \
       hklout $scr0/detwin.mtz << eof-detwin
title DETWIN WITH TWIN FRAC 0.4
SYM -k,-h,-l
twin 0.4
LABI IMEAN=I SIGIMEAN=SIGI
eof-detwin
# Run TRUNCATE to check all moments and N(z) statistics
trunc:
truncate hklin $scr0/detwin.mtz hklout $scr0/trunc_detwin.mtz <