RSTATS (CCP4: Supported Program)
NAME
rstats - scale together two sets of F's
SYNOPSIS
rstats hklin foo_in.mtz hklout foo_out.mtz rstatsbkr rstatsbkr.dat
DESCRIPTION
The program scales together two sets of F's, calculates statistics and outputs a reflection file. Data can be split into a working set and a set reserved for calculation of a free R factor.
Rejection criteria can be specified as an FC/FO ratio, a sigma multiple, or |FO-FC|.
KEYWORDED INPUT
The various data control lines are identified by keywords, those available being:
CYCLES, END, FREE, LABIN, LABOUT, LIST, NOABS, OUTPUT, PRINT, PROCESS, REJECT, RESOLUTION, RSCB, SCALE, TEMPERATURE_FACTOR, TITLE, WEIGHTING_SCHEME, WIDTH_OF_BINS
TITLE <title>
The title string is written to the output reflection file, replacing the title from the input file.
If TITLE is not specified then:
FREE <num>
The FreeR sub-set is defined, in the program, as those reflections which have a value of <num> in the FreeR_flag column. The default is for FreeR_flag = 0.
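The split into a working set and a free set can be pictured with a short Python sketch. This is not the rstats source: the tuple layout and the function names split_by_free_flag and r_factor are hypothetical, and the R factor uses the conventional definition Sum(||Fo|-|Fc||)/Sum(|Fo|), which may differ in detail from what the program prints.

```python
def split_by_free_flag(reflections, free_value=0):
    """Split (fo, fc, flag) tuples into (working, free) lists.

    A reflection goes to the free set when its flag equals free_value,
    mirroring FREE <num> (default 0). The tuple layout is an assumption.
    """
    working, free = [], []
    for fo, fc, flag in reflections:
        (free if flag == free_value else working).append((fo, fc))
    return working, free


def r_factor(pairs):
    """Conventional R factor: Sum(| |Fo| - |Fc| |) / Sum(|Fo|)."""
    num = sum(abs(abs(fo) - abs(fc)) for fo, fc in pairs)
    den = sum(abs(fo) for fo, _ in pairs)
    return num / den


# Hypothetical data: (Fo, Fc, FreeR_flag)
data = [(100.0, 90.0, 1), (50.0, 55.0, 0), (200.0, 210.0, 2), (80.0, 60.0, 0)]
working, free = split_by_free_flag(data, free_value=0)
print("working R:", r_factor(working))
print("free R:   ", r_factor(free))
```

Keeping the two sets apart before any statistics are accumulated is what allows a free R factor to be reported alongside the working one.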
RESOLUTION <x1> <x2>
If given then only reflections in the resolution range <x1>-<x2> will be used during the final (output) cycle in order to calculate statistics.
Note that this is a change to the functionality of the RESOLUTION keyword, which can no longer be used to exclude reflections from the output mtz file.
If RESOLUTION is not specified then the limits <x1> and <x2> are taken from the input MTZ file, so no data is excluded from the statistics. The maximum and minimum resolution (in Angstroms) can be given in either order, and if only one number is given this is taken as the maximum resolution limit.
RSCB <x1> <x2>
If given then reflections in the resolution range <x1>-<x2> will be used during the scaling cycles, in order to generate the scale and temperature factors. The maximum and minimum resolution (in Angstroms) can be given in either order, and if only one number is given this is taken as the maximum resolution limit.
If RSCB is not given, the limits are taken from the RESOLUTION keyword; if RESOLUTION has not been specified the default is to use all the data, i.e. the resolution limits are read from the input MTZ file header.
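The order of defaults for the scaling limits (RSCB, then RESOLUTION, then the MTZ header) can be sketched in Python as below. The function names are hypothetical. The sketch also assumes that "maximum resolution limit" means the smallest d-spacing, and that when only one number is given the other limit comes from the file; the text above does not state the latter.

```python
def order_limits(values, file_low, file_high):
    """Return (low_res, high_res) in Angstroms, with low_res >= high_res.

    values: zero, one or two numbers as typed after the keyword.
    file_low, file_high: d-spacing limits read from the MTZ header.
    """
    if len(values) == 0:
        return file_low, file_high
    if len(values) == 1:
        # Assumption: a single number is the high-resolution limit
        # (smallest d-spacing); the other limit comes from the file.
        return file_low, values[0]
    # Two numbers may be given in either order.
    return max(values), min(values)


def scaling_limits(rscb, resolution, file_low, file_high):
    """Limits for the scaling cycles: RSCB, else RESOLUTION, else the file."""
    chosen = rscb if rscb else resolution
    return order_limits(chosen or [], file_low, file_high)


# Hypothetical file limits of 50.0 and 1.9 Angstroms.
print(scaling_limits([], [2.7, 8.0], 50.0, 1.9))     # RESOLUTION is used
print(scaling_limits([3.0], [8.0, 2.7], 50.0, 1.9))  # RSCB is used
```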
NOABS
If the NOABS keyword is present, the program will take the differences between the signed values of Fo and Fc, rather than using the moduli (i.e. use Fo and Fc rather than |Fo| and |Fc|). The default is to use the moduli.
SCALE <scale>
Sets the initial scale factor for Fc. If zero cycles are selected on the CYCLES card, this scale factor is used for the calculation of R-factors and for scaling the output data. Default: 1.0.
TEMPERATURE_FACTOR <factor>
Sets the initial value for the temperature factor. If zero refinement cycles are selected using the CYCLES card, this temperature factor is used for the calculation of R-factors and for scaling the output data. Default: 0.0.
WIDTH_OF_BINS [ RTHETA <x1> ] | [ FBINR <x2> ]
[Optional]
Controls the width of the bins used in the analysis.
RTHETA = <x1> sets the width of ranges of 4(sintheta/lambda)**2; default: 0.01.
FBINR = <x2> sets the width of ranges on Fobs. If <x2> is not specified or the card is absent then the Fobs range will be set by the program. The width is altered accordingly if the scale is applied to Fobs.
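To get a feel for the RTHETA width, note that Bragg's law gives sin(theta)/lambda = 1/(2d), so 4(sintheta/lambda)**2 equals 1/d**2. The Python sketch below assigns a reflection to a bin from its d-spacing. The function names are hypothetical, and starting the bins at zero is an assumption about how the program lays out its ranges.

```python
import math


def four_ssq(d_spacing):
    """4*(sin(theta)/lambda)**2 for a reflection with spacing d in Angstroms.

    From Bragg's law, sin(theta)/lambda = 1/(2d), so the value is 1/d**2.
    """
    return 1.0 / d_spacing ** 2


def rtheta_bin(d_spacing, width=0.01):
    """Zero-based bin index for ranges of the given width (assumed to start at 0)."""
    return int(math.floor(four_ssq(d_spacing) / width))


# With the default width of 0.01, a 3.0 Angstrom reflection (1/d**2 = 0.111)
# falls in a higher-numbered bin than an 8.0 Angstrom one (1/d**2 = 0.0156).
print(rtheta_bin(3.0), rtheta_bin(8.0))
```

Doubling the width, as in RTHETA=0.02, halves the number of bins covering the same resolution range.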
LIST <x>
Sets the value for listing of reflections with |Fo-Fc| > <x>. Default: 4000.0.
CYCLES <ncyc>
<ncyc> is the maximum number of cycles for scaling; default: 6.
The program will always make one additional pass through the reflection file to calculate statistics and write the output file. If zero cycles are specified then the program will simply apply the input scale and temperature factor. If a linear least-squares problem is selected with no rejections, the program will only make two passes through the input file. The program will stop iterating when the magnitude of the fractional shift in the scale factor is less than 0.005 and the magnitude of the shift in the temperature factor is less than 0.01.
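The stopping rule described above can be sketched in Python as follows. This is an illustration only: the function names and the update callback are hypothetical, and taking the fractional shift relative to the previous scale factor is an assumption.

```python
def converged(k_old, k_new, b_old, b_new, k_tol=0.005, b_tol=0.01):
    """True when both shifts fall below the thresholds quoted above.

    Assumption: the fractional scale shift is measured against k_old.
    """
    frac_shift_k = abs(k_new - k_old) / abs(k_old)
    shift_b = abs(b_new - b_old)
    return frac_shift_k < k_tol and shift_b < b_tol


def run_cycles(update, k=1.0, b=0.0, max_cycles=6):
    """Apply update(k, b) -> (k, b) until converged or max_cycles is reached.

    Returns (k, b, cycles_used). With max_cycles=0 the input values are
    returned unchanged, as when zero cycles are requested.
    """
    cycles = 0
    for cycles in range(1, max_cycles + 1):
        k_new, b_new = update(k, b)
        done = converged(k, k_new, b, b_new)
        k, b = k_new, b_new
        if done:
            break
    return k, b, cycles


# Hypothetical update that moves halfway towards K=2, B=5 on each cycle.
def halfway(k, b):
    return k + 0.5 * (2.0 - k), b + 0.5 * (5.0 - b)


print(run_cycles(halfway))
print(run_cycles(halfway, max_cycles=0))
```

Both tests must pass together, so a scale factor that has settled does not end the iteration while the temperature factor is still moving.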
PRINT ALL | LAST
ALL sets IPRINT on all cycles.
LAST (default) sets IPRINT only on the final least-squares cycle, so printing occurs on that cycle alone.
REJECT [ SIGMA=<sig> ] [ RATIO=<rat> ] [ DELTA=<delta> ]
This option sets criteria for rejecting reflections from the scaling calculations. The rejected reflections are still written to the output file. More than one of the following options may be specified simultaneously for REJECT:
OUTPUT [ NOHKL | FOFC ] [ BKR ]
The output reflection file contains all the reflections present in the input file. Note that this is different from previous versions of rstats. If OUTPUT is not given, or is not followed by a sub-keyword, then FOFC is assumed. The exception is when LABOUT ALLIN is given (see LABOUT).
PROCESS [ FCAL | FOBS | FOBC | SUMF | SUMC | LGFC | LGFO ]
For the FCAL, FOBS and FOBC options, the scale factor (K) and temperature factor (B) are determined by minimising
This non-linear least squares minimisation takes several cycles to converge.
For the SUMF and SUMC options, the temperature factor is not considered and the scale factor is calculated by minimising
So that K = Sum(wFoFc)/Sum(wFc**2)
Although a linear problem, if reflections are being rejected using the DELTA test (see REJECT), several cycles may be required for convergence.
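The expression K = Sum(wFoFc)/Sum(wFc**2) is the closed-form minimiser of a weighted sum of squared residuals Sum(w(Fo - K*Fc)**2): setting the derivative with respect to K, -2*Sum(w*Fc*(Fo - K*Fc)), to zero gives that ratio. The Python sketch below assumes this is the quantity being minimised; the function names are hypothetical and the data are made up.

```python
def linear_scale(fo, fc, w=None):
    """K = Sum(w*Fo*Fc) / Sum(w*Fc**2); unit weights when w is None."""
    if w is None:
        w = [1.0] * len(fo)
    num = sum(wi * o * c for wi, o, c in zip(w, fo, fc))
    den = sum(wi * c * c for wi, c in zip(w, fc))
    return num / den


def weighted_residual(k, fo, fc, w):
    """Sum(w * (Fo - K*Fc)**2), the quantity assumed to be minimised."""
    return sum(wi * (o - k * c) ** 2 for wi, o, c in zip(w, fo, fc))


# Made-up amplitudes in which Fo is roughly twice Fc.
fo = [21.0, 39.0, 61.0]
fc = [10.0, 20.0, 30.0]
w = [1.0, 1.0, 1.0]
k = linear_scale(fo, fc, w)
print("K =", k)
print("residual at K:", weighted_residual(k, fo, fc, w))
```

Because K follows directly from two sums, one pass over the data is enough unless the set of accepted reflections changes between passes, which is why the DELTA rejection test can require further cycles.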
For the LGFC and LGFO options, the scale and temperature factors are determined by minimising
By considering the logarithms, the least squares minimisation becomes a linear problem but with different relative weighting. This scaling gives greater weight to the weak reflections than the minimisation without taking logs.
A weight of W=(Fo/SigFo)**2 should give similar results to a weight of W=(1/SigFo)**2 in the non-linear case.
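The weighting remark follows from error propagation: an uncertainty SigFo on Fo corresponds to an uncertainty of roughly SigFo/Fo on ln(Fo), so the equivalent of a 1/SigFo**2 weight in the logarithmic problem is (Fo/SigFo)**2. The Python sketch below fits a scale and temperature factor in logarithmic form. The model Fo = K*Fc*exp(-B*s), with s = (sintheta/lambda)**2, is an assumption made for illustration and may not match the exact expression used by the LGFC and LGFO options; log_scale_fit is a hypothetical name.

```python
import math


def log_scale_fit(fo, fc, s, w=None):
    """Fit ln(Fo/Fc) = ln(K) - B*s by weighted linear least squares.

    Returns (K, B). All Fo and Fc values must be positive.
    The model form is an assumption, not taken from the rstats source.
    """
    if w is None:
        w = [1.0] * len(fo)
    y = [math.log(o / c) for o, c in zip(fo, fc)]
    sw = sum(w)
    sx = sum(wi * xi for wi, xi in zip(w, s))
    sy = sum(wi * yi for wi, yi in zip(w, y))
    sxx = sum(wi * xi * xi for wi, xi in zip(w, s))
    sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, s, y))
    # Standard weighted straight-line fit: y = intercept + slope * x.
    slope = (sw * sxy - sx * sy) / (sw * sxx - sx * sx)
    intercept = (sy - slope * sx) / sw
    return math.exp(intercept), -slope


# Synthetic data generated with K = 2 and B = 10.
s = [0.01, 0.02, 0.05, 0.08]
fc = [100.0, 80.0, 60.0, 40.0]
fo = [2.0 * c * math.exp(-10.0 * x) for c, x in zip(fc, s)]
sigfo = [5.0, 4.0, 3.0, 2.0]
weights = [(o / sg) ** 2 for o, sg in zip(fo, sigfo)]
print(log_scale_fit(fo, fc, s, weights))
```

With unit weights the same fit treats every reflection equally on the logarithmic scale, which is how weak reflections come to carry more influence than in the fit without logarithms.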
WEIGHTING_SCHEME [ NONE | DELF=<x1>,<x2>,<x3>,<x4> | DSIG=<x1>,<x2>,<x3>,<x4> | EXP=<x1>,<x2>,<x3> | SIGMA=<x1> ]
Weight reflections according to one of the following schemes [Default is NONE]:
LABIN <program_label>=<file_label> ...
Input reflection file column assignments.
Assigns the program labels to the columns on the input file. The program labels are:
Data must always be present for H K L FP and FC. SIGFP must also be present when using the SIGMA weighting scheme. FREE flags reflections to be considered separately, to give the statistics needed for Free R factors.
LABOUT [ALLIN] <program_label>=<file_label> ...
Output reflection file column assignments.
For OUTPUT FOFC the output program labels are
Where SIGFP, SIGFC and PHIC are only written if they are present on the input file. The weight WT is only written if a WEIGHTING_SCHEME option is specified. By default the output columns will have the same column labels as used on the input file.
If ALLIN is given as a sub-keyword then all columns in the input file will be written to the output MTZ file. This option takes precedence over the other options for MTZ files.
END
Terminate input (equivalent to end-of-file). Must be last keyword.
#
# Produce file containing h,k,l,s,Fp,Sigfp,Fc,Phic with Fc scaled
# to Fo for input to the FFT program. No reflections rejected.
#
#
rstats hklin sample_file hklout fuo_map <<eof-rstats
LABIN FP=FNAT2 SIGFP=SIGFNAT2 FC=FCCYC7 PHIC=PHI FREE=FreeR_flag
RESOLUTION 8.0 2.7 ! If omitted then all data used
eof-rstats
#
#
# A more complicated example:
# All input columns output with an additional weight column.
# Contents of the output FNAT2 and SIGFNAT2 columns will have
# a scale and temperature factor applied.
#
rstats hklin sample_file hklout fuo_map <<eof-rstats
LABIN FP=FNAT2 SIGFP=SIGFNAT2 FC=FNAT1 FREE=FreeR_flag
LABOUT ALLIN WT=SIGMAWT
TITLE FNAT2 column scaled to FNAT1 using sigma weights
RESOLUTION 10.0 2.3 ! default is 1 to 100 Ang
PRINT ALL ! default is LAST
CYCLES 3 ! default is 6
LIST 3000
SCALE 2.3 ! default is 1.0
TEMPERATURE_FACTOR 6.2 ! default is 0.0
OUTPUT FOFC ! this is OVERRIDDEN by LABOUT
REJECT DELTA 4000 ! default is no rejections
WEIGHTING_SCHEME SIGMA ! default is NONE
WIDTH_OF_BINS RTHETA=0.02 FBINR=500 ! defaults are .01 and 1000
PROCESS FOBS ! default is FCAL
eof-rstats
There is also a simple runnable unix script in $CEXAM/unix/runnable:
AUTHORS
Written by: S.E.V. Phillips
modified: Dec.1985 G.Fermi (2-6-88)
modified: Nov.1986 A.C.Bloomer
This keyworded version 24/jan/1990: Peter Brick