**fhscal hklin** *foo_in.mtz* **hklout** *foo_out.mtz*

[Keyworded input]

Derivative to native scale factors are calculated in equi-volume shells in reciprocal space using Kraut's formula (ref 1), generalised to use both centric and acentric data, and applied to the derivative data. This formula takes account of the degree of heavy-atom substitution, but does not require the presence of anomalous differences.
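To illustrate what "equi-volume shells" means, the sketch below assigns a reflection to a shell by dividing the range of s^3 (with s = 1/d) into equal steps, since the volume enclosed by a sphere in reciprocal space grows as s^3. The function name `shell_index`, its arguments and the exact binning rule are assumptions made for illustration; they are not taken from the FHSCAL source.

```python
def shell_index(d_spacing, d_min, d_max, nshell=20):
    """Return the 0-based equi-volume shell for a reflection of spacing d (Angstrom)."""
    # Volume in reciprocal space scales as s**3, with s = 1/d
    s3 = (1.0 / d_spacing) ** 3
    s3_lo = (1.0 / d_max) ** 3   # low-resolution limit (small s)
    s3_hi = (1.0 / d_min) ** 3   # high-resolution limit (large s)
    frac = (s3 - s3_lo) / (s3_hi - s3_lo)
    # Clamp so a reflection exactly on the high-resolution limit lands in the last shell
    return min(nshell - 1, max(0, int(frac * nshell)))
```

With equal steps in s^3, each shell holds roughly the same number of reciprocal lattice points, which is why a single SHELLS value can target a given number of reflections per shell.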

The program also computes a scale factor to put the isomorphous difference Patterson on the correct scale for the vector-space refinement program VECREF.

It is also possible to apply the scales to all "scaleable" columns in a dataset (i.e. to F+/- and to the structure intensities; see the LABIN keyword), and this is advisable to avoid mixtures of scaled and unscaled data for a single derivative. For input MTZ files with dataset information, FHSCAL will attempt to check and warn you accordingly if it detects datasets which will be output with such a mixture. In these cases, specifying the AUTO keyword will cause the appropriate scale factor to be applied automatically to all such columns.

Free format using keywords. The following keywords may be used; only the leading 4 characters are significant and the order is immaterial:

AUTO, BIAS, CENT, END, LABIN, LIST, RESO, SHELLS, TITLE

The LABIN keyword is always required; the others are optional and assume
default values if omitted. Use of BIAS 1 is recommended, provided the standard
deviations produced by the data processing program (*e.g.* SCALA) are
reliable. If in doubt, omit BIAS.

- TITLE <title>
- Title (max 100 characters).
- LIST <list>
- Number of reflections to list (for debugging purposes). Default = 0.
- SHELLS <nshell>
- Number of shells to divide data (aim to have at least 200 reflections per shell). If there is more than one derivative to be scaled, this applies to the one with the highest resolution. Default = 20.
- BIAS <bias>
- Bias factor to multiply standard deviations. This is used to correct for the bias effect when averaging squares of differences. Normally this should be 1; however, care should be taken that the standard deviations are valid. For example, some programs set the s.d. of a zero F to 9999, whereas the correct value is sqrt(I+sigma(I))-sqrt(I). Such invalid s.d.'s will cause the program to give incorrect results; either ignore the s.d.'s by setting BIAS = 0, or better, delete or correct the reflection(s). Default = 0.
- RESOLUTION <maxres> [ <minres> ]
- Maxres and minres can be given in any order and in Angstrom units or as 4*sin(theta)**2/lambda**2. If specified then reflections outside this resolution range will be excluded from scaling and the output MTZ file. If the card is absent then all possible reflections are used for scaling and the resolution range of the native determines the output. If only one number is given it will be taken as the high resolution cutoff.
- LABIN <program label>=<file label>
- MTZ assignments, see below.
- AUTO
- Switches on AUTOmatic column selection. This option can only be used if the input file contains dataset information. It is only necessary to specify `FPHn` for each dataset on the LABIN line (except in special cases, see below). Other labels can also be specified if desired. The program will then try to identify all "scaleable" columns in the dataset, automatically read them in, and apply the appropriate scale factor determined from `FPHn`. This option is intended to prevent a mixture of scaled and unscaled columns within a dataset, e.g. `FPHn` is scaled but not `FPHn(+)` and `FPHn(-)`. There are a couple of caveats:
  - It is assumed that each dataset contains the information for one derivative.
  - There may be problems with the automatic scaling if datasets contain both `SIGIMEAN` and `SIGDPHn`. This is because the program cannot distinguish between sigmas for intensities (which need to be scaled by the square of the scale factor) and those for other quantities (which are multiplied by the scale factor). In these cases the automatic selection will make a best guess at which sigma is which; the ambiguity can also be resolved provided that `IMEAN` and `SIGIMEAN` are explicitly set by the user on the LABIN line (which is safer).

- CENT
- Use only centric reflections to compute the scale factors. All reflections will be scaled and output.
- END
- Terminate input. Equivalent to end-of-file.
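The two units accepted by the RESOLUTION keyword are related through Bragg's law (lambda = 2.d.sin(theta)), which gives 4*sin(theta)**2/lambda**2 = 1/d**2. The helpers below are a small sketch of that conversion and of sorting the two limits when they are given in either order; the function names are hypothetical and do not correspond to routines in FHSCAL.

```python
def angstrom_to_s(d):
    """Convert a d-spacing in Angstrom to 4*sin(theta)**2/lambda**2 (= 1/d**2)."""
    return 1.0 / (d * d)


def ordered_limits(res_a, res_b):
    """Return (low_res, high_res) in Angstrom, whichever order they were given in."""
    # The larger d-spacing is the low-resolution limit
    return max(res_a, res_b), min(res_a, res_b)
```

For example, a 2 Angstrom cutoff corresponds to 4*sin(theta)**2/lambda**2 = 0.25.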

Standard MTZ reflection files are used for input (HKLIN) and output (HKLOUT). The following column labels are used:

- H, K, L
- Standard meaning.
- FP, SIGFP
- Native amplitude and sigma.

If only 1 derivative is being scaled:

- FPH, SIGFPH
- Derivative amplitude and sigma.
- DPH, SIGDPH
- Derivative anomalous difference and sigma (optional).
- FPH(+), SIGFPH(+), FPH(-), SIGFPH(-)
- Derivative amplitudes and sigmas for Friedel pair (optional).

If more than 1 derivative is being scaled (up to 20 per run), the column
labels are FPH1, SIGFPH1, [ DPH1, SIGDPH1, FPH1(+), SIGFPH1(+), FPH1(-), SIGFPH1(-), ]
FPH2, SIGFPH2, [ DPH2 ... ] *etc.*

Scales are applied to FPH, SIGFPH and DPH, SIGDPH, FPH(+), SIGFPH(+), FPH(-), SIGFPH(-) if present. All other columns, including those for which no label assignments are given, are output unchanged.
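As a sketch of how a scale factor K propagates to the different column types, the function below multiplies amplitude-like columns (and their sigmas) by K and intensity-like columns (and their sigmas) by K squared, leaving everything else unchanged. Representing a reflection as a dictionary keyed by column label, and the function name `scale_columns`, are assumptions made for illustration only.

```python
def scale_columns(row, k, amplitude_labels, intensity_labels):
    """Return a copy of one reflection's columns with the derivative scale k applied."""
    out = dict(row)                      # the input row is not modified
    for label in amplitude_labels:       # e.g. FPH, SIGFPH, DPH, SIGDPH
        if out.get(label) is not None:   # missing values are left alone
            out[label] = k * out[label]
    for label in intensity_labels:       # e.g. IMEAN, SIGIMEAN
        if out.get(label) is not None:
            out[label] = k * k * out[label]
    return out                           # unlisted columns pass through unchanged
```

The square for intensities follows from I being proportional to F^2, which is also why an intensity sigma must not be confused with an amplitude sigma.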

WARNING : Reflections for which there is a derivative measurement but no native and which have a greater value of S than any reflection for which both are measured, will be rejected (because no valid scale can be applied). The rejections must be re-incorporated later when higher resolution native data becomes available.

In order to avoid losing reflections in the scaling procedure, it is worth
considering using the dataset with the highest resolution limit as the reference
(*i.e.* 'native') dataset in FHSCAL.
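The rejection rule in the warning above can be sketched as follows. Each reflection is represented here as a tuple `(s, fp, fph)` with `None` for a missing measurement; this layout and the function name `split_rejected` are hypothetical, chosen only to make the rule concrete.

```python
def split_rejected(reflections):
    """Split (s, fp, fph) tuples into (kept, rejected); None marks a missing value."""
    common = [s for s, fp, fph in reflections if fp is not None and fph is not None]
    if not common:
        raise ValueError("no reflections with both native and derivative measured")
    s_limit = max(common)  # highest-resolution reflection with both measured
    kept, rejected = [], []
    for refl in reflections:
        s, fp, fph = refl
        # Derivative present, native absent, and beyond the last point where a scale exists
        if fp is None and fph is not None and s > s_limit:
            rejected.append(refl)
        else:
            kept.append(refl)
    return kept, rejected
```

A derivative-only reflection inside the common resolution range is kept, because a scale can still be interpolated for it.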

After echoing the input data, a table with the following columns is produced for each derivative:

- Shell number
- Maximum resolution for shell
- Number of reflections in shell
- RMS FP
- RMS FPH
- RMS (K.FPH - FP) for centrics
- RMS (K.FPH - FP) for acentrics
- Smoothed scale factor for shell

Overall scale and temperature factors are determined from a Wilson plot and printed with their estimated standard deviations; however the scale factors actually applied to the derivative data are obtained by interpolating the shell scale factors.
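The text does not say which interpolation scheme is used between shells. Purely as an illustration, the sketch below assumes linear interpolation between shell centres along some monotonic resolution coordinate, holding the end values constant outside the range; the function name and both of these choices are assumptions, not a description of FHSCAL's internals.

```python
from bisect import bisect_right


def interpolate_scale(x, shell_centres, shell_scales):
    """Linearly interpolate shell scale factors at coordinate x; clamp outside the range."""
    if x <= shell_centres[0]:
        return shell_scales[0]
    if x >= shell_centres[-1]:
        return shell_scales[-1]
    i = bisect_right(shell_centres, x)   # shell_centres[i-1] <= x < shell_centres[i]
    x0, x1 = shell_centres[i - 1], shell_centres[i]
    k0, k1 = shell_scales[i - 1], shell_scales[i]
    return k0 + (k1 - k0) * (x - x0) / (x1 - x0)
```

Interpolating rather than applying one constant per shell avoids steps in the applied scale at shell boundaries.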

At the end, the V factor (pseudo-cell volume) for the FFT program for use in computing a correctly scaled isomorphous difference Patterson is given:

- V' = V * C / Kv

In addition to the usual MTZ file opening errors:

ERROR(S) IN DATA: syntax errors were found in the general equivalent positions. Check for spurious characters, missing commas, extra commas etc.

ERROR - NO REFLECTIONS: no common reflections were found. Check column assignments, check reflection listing.

NO REFLECTIONS IN SHELL n. Try using smaller number of shells. Reflections may be missing in a resolution range.

Kraut's formula can be derived by equating the Patterson origins

K^2 . sum FPH^2 = sum FP^2 + sum FH^2 (1)

where FPH, FP and FH are derivative, native and heavy-atom amplitudes respectively, and K is the derivative scale to be determined.

For acentric reflections :

<FH^2>a ~= 2.<(K.FPH - FP)^2> (2)

Elimination of the unknown FH from (1) and (2) gives a quadratic equation for K, the solution of which is :

K = (2.sum FP.FPH - sqrt(4.(sum FP.FPH)^2 - 3.sum FP^2 . sum FPH^2)) / sum FPH^2 (3)

Note that in the original reference the leading factor given as 1/2 should be 2. This formula is valid only for acentric reflections. However it can easily be generalised to include centrics by noting that

<FH^2> ~= <M.(K.FPH - FP)^2> (4)

where M = 1 for centric and 2 for acentric, so using (4) instead of (2) :

K = (sum M.FP.FPH - sqrt((sum M.FP.FPH)^2 - sum (M+1).FP^2 . sum (M-1).FPH^2)) / sum (M-1).FPH^2 (5)

The numerator and denominator of (5) could be zero if all reflections in a shell were centric; this is unlikely, but just in case the equivalent formula can be used instead :

K = sum (M+1).FP^2 / (sum M.FP.FPH + sqrt((sum M.FP.FPH)^2 - sum (M+1).FP^2 . sum (M-1).FPH^2)) (6)
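Equations (5) and (6) can be evaluated for one shell with a few running sums. The sketch below uses form (6), so that the all-centric case needs no special handling. The function name, the argument layout (parallel lists plus a centric flag per reflection) and the error raised for a negative discriminant are illustrative assumptions.

```python
import math


def kraut_scale(fp, fph, centric):
    """Derivative-to-native scale K from equation (6) for one shell."""
    a = b = c = 0.0
    for p, ph, is_centric in zip(fp, fph, centric):
        m = 1.0 if is_centric else 2.0
        a += m * p * ph              # sum M.FP.FPH
        b += (m + 1.0) * p * p       # sum (M+1).FP^2
        c += (m - 1.0) * ph * ph     # sum (M-1).FPH^2
    disc = a * a - b * c
    if disc < 0.0:
        raise ValueError("negative discriminant: no real solution for K")
    # Same root as (5), but well defined when c == 0 (all reflections centric)
    return b / (a + math.sqrt(disc))
```

As a check, if every derivative amplitude is exactly half the native one, the sums give K = 2 whether the reflections are centric, acentric or mixed.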

This formula is modified slightly to take into account the bias effect
when averaging the squares of differences, *i.e.* the term <M.(K.FPH - FP)^2>
is replaced by : <M.(K.FPH - FP)^2 - M.((K.sigma(FPH))^2+sigma(FP)^2)>
where the sigma's have been multiplied by the BIAS factor.
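One way to see the effect of this correction is to substitute the modified term into (1) and collect powers of K. Under that substitution, which is worked out here and not quoted from the program, the sum (M-1).FPH^2 is reduced by sum M.sigma(FPH)^2 and the sum (M+1).FP^2 by sum M.sigma(FP)^2, while the cross term is unchanged. The sketch below implements that reading; the function name and arguments are hypothetical.

```python
import math


def kraut_scale_biased(fp, sig_fp, fph, sig_fph, centric, bias=1.0):
    """Scale K with the sigma (bias) correction folded into the quadratic for K."""
    a = b = c = 0.0
    for p, sp, ph, sph, is_centric in zip(fp, sig_fp, fph, sig_fph, centric):
        m = 1.0 if is_centric else 2.0
        sp, sph = bias * sp, bias * sph             # sigmas multiplied by BIAS
        a += m * p * ph                             # sum M.FP.FPH
        b += (m + 1.0) * p * p - m * sp * sp        # sum (M+1).FP^2 - M.sigma(FP)^2
        c += (m - 1.0) * ph * ph - m * sph * sph    # sum (M-1).FPH^2 - M.sigma(FPH)^2
    disc = a * a - b * c
    if disc < 0.0:
        raise ValueError("negative discriminant: no real solution for K")
    return b / (a + math.sqrt(disc))
```

With `bias=0.0` the sigma terms vanish and the result reduces to equation (6).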

The program reads the input control data, then makes a first pass through the reflections to get the resolution limits (can be controlled by RESO card), then a second pass to flag reflections as centric or acentric and accumulate the sums in shells for the scale factors. The scale factors are calculated and smoothed, and applied in a third pass. The program also computes a scale factor to apply to the isomorphous difference Patterson for use in the program VECREF:

Kv = ((sum (FPH-FP)^2)c + 2.(sum (FPH-FP)^2)a) / ((sum (FPH-FP)^2)c + (sum (FPH-FP)^2)a)
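Reading the subscripts c and a as sums over centric and acentric reflections respectively, Kv can be computed as in the sketch below. It assumes the derivative amplitudes passed in are the ones the formula refers to as FPH; the function name and argument layout are illustrative.

```python
def vecref_scale(fp, fph, centric):
    """Kv: acentric squared differences are weighted by 2 relative to centric ones."""
    sum_c = sum((ph - p) ** 2 for p, ph, c in zip(fp, fph, centric) if c)
    sum_a = sum((ph - p) ** 2 for p, ph, c in zip(fp, fph, centric) if not c)
    return (sum_c + 2.0 * sum_a) / (sum_c + sum_a)
```

Kv therefore lies between 1 (all centric) and 2 (all acentric), mirroring the factor M in equation (4).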

Ian Tickle, Birkbeck College, London

1. Kraut J, Sieker LC, High DF and Freer ST, *Proc. Nat. Acad. Sci. USA*, **48**, 1417-1424 (1962).

(A VMS version can be found in $CEXAM/vms/fhscal.com)