XDLDATAMAN (CCP4: Supported Program)

NAME

xdldataman - X-windows tool; manipulation, analysis and reformatting of reflection files.

SYNOPSIS

xdldataman [-font1 | -font2 | -font3 | -font4 | -font5]
[menu-driven command selection; interactive parameter input]

DESCRIPTION

xdldataman can be used to read, write, analyse and manipulate ASCII reflection files from most biomacromolecular refinement packages. No binary reflection file types are supported (i.e., MTZ files can *not* be read).

The program runs in interactive mode only. It uses the XDL_VIEW toolkit of J.W. Campbell to provide an easy-to-use interface. Commands are selected by clicking on the desired menu option; some menu options give a pop-up sub-menu with further options (indicated by "-->" in the menu name).

File formats are selected with pop-up menus; all other parameters are set in pop-up dialogue boxes (cut-and-paste is supported). In most cases, default values are given in [square brackets]. To accept these defaults, hit the RETURN key. If several numbers are requested at once (e.g., the six cell constants) and only the first one needs to be changed, type the new value for that number and hit the RETURN key; the other five numbers keep their current values.

There is a command line option (-font?) which determines the size of the menu font. These fonts refer to the xdl fonts, which are defined from 1 to 5. This can be useful if the window is too large for the screen. The default font size is 2. The font definitions can be changed in the .Xdefaults file or with xrdb; however, all xdl programs will then be affected.

Output from the program is written to a separate area of the main window. Output can be scrolled and cut and pasted into other documents.

For lengthy operations a progress bar shows how much of the operation has been completed.

HISTORY

xdldataman is a CCP4 special version of DATAMAN (part of the Uppsala RAVE averaging package). This version is entirely interactive and has a user-friendly interface. However, this version can only handle one dataset at a time, and some of the functionality of the parent program is absent.

DATAMAN was originally written as a simple format-exchange program, to convert MTZDUMP files to X-PLOR reflection files. It has grown quite a bit since then to include other formats and to carry out several everyday manipulations on datasets. It also includes several programs that were previously stand-alone jiffies (such as the Gemini twin analysis command).

Some of the operations are implementations of Gerard Bricogne's algorithms as described in Volume B of the International Tables.

MENU OPTIONS

Commands are issued by moving the pointer over the desired menu option and clicking with the left mouse button.

READ A DATASET

Provide the filename and then select the appropriate file type from the pop-up menu. See FORMATS.

LIST DATASET INFORMATION

This prints some information regarding the dataset that is currently in memory.

STATISTICS OF THE DATASET

This lists the mean, standard deviation, minimum and maximum values of all properties of the dataset that are known.

EFFECTIVE RESOLUTION

This calculates an estimate of the effective resolution of the dataset, defined as "the resolution at which the actual number of reflections in the current dataset would constitute a 100% complete dataset" (B. Hazes, personal communication). This number is listed for all reflections, and for all reflections with F > n * sigma(F), with n=1,...,5. The lattice type and the number of asymmetric units need to be provided (they are used to estimate the volume of reciprocal space covered by the data).
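
The definition above can be illustrated with a short Python sketch. It is not the program's code: how xdldataman turns the lattice type and the number of asymmetric units into a reciprocal-space volume is not described here, so the sketch takes a single assumed factor, n_equivalent (the number of reciprocal-lattice points per unique reflection), and the cell volume as hypothetical inputs.

```python
import math


def effective_resolution(n_reflections, cell_volume, n_equivalent):
    """Resolution (A) at which n_reflections would be a 100% complete dataset.

    cell_volume is the unit-cell volume in cubic Angstrom.
    n_equivalent is an ASSUMED factor: the number of reciprocal-lattice points
    represented by one unique reflection (symmetry and Friedel equivalents,
    multiplied by the lattice-centring factor).
    """
    if n_reflections <= 0:
        raise ValueError("need at least one reflection")
    # A sphere of radius 1/d in reciprocal space contains roughly
    # (4/3) * pi * V / d**3 lattice points for a cell of volume V.
    lattice_points = n_reflections * n_equivalent
    return (4.0 * math.pi * cell_volume / (3.0 * lattice_points)) ** (1.0 / 3.0)
```

Fewer observed reflections give a larger (i.e., worse) effective resolution, which is why the value is also listed for the subsets with F > n * sigma(F).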

RSYM HKL/KHL REFLECTIONS

This calculates Rsym (on Fs and on Is, assuming that I=F*F) for all reflection pairs HKL and KHL. If the spacegroup is, for instance, P3x or I4x and this Rsym is low, there may be a spacegroup error (e.g., the spacegroup is P41212 instead of P41).
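
As an illustration of this test, the Python sketch below computes an R-value over HKL/KHL pairs. The exact Rsym formula used by the program is not given here; the sketch assumes the common pairwise form sum|X1 - X2| / sum(X1 + X2), and the function name and the dictionary layout are hypothetical.

```python
def rsym_hkl_khl(reflections, on_intensities=False):
    """Return (Rsym, number of pairs) for all pairs HKL and KHL.

    reflections maps (h, k, l) -> F.  With on_intensities=True the
    comparison is done on I = F*F instead of F.
    ASSUMED definition: sum|X1 - X2| / sum(X1 + X2) over all pairs.
    """
    numerator = 0.0
    denominator = 0.0
    n_pairs = 0
    for (h, k, l), f1 in reflections.items():
        if h >= k:
            continue  # visit each pair once; h == k is its own mate
        f2 = reflections.get((k, h, l))
        if f2 is None:
            continue  # the KHL mate was not measured
        if on_intensities:
            f1, f2 = f1 * f1, f2 * f2
        numerator += abs(f1 - f2)
        denominator += f1 + f2
        n_pairs += 1
    rsym = numerator / denominator if denominator > 0 else float("nan")
    return rsym, n_pairs
```

A value close to that of the ordinary merging R-value of the dataset would suggest that HKL and KHL are equivalent, i.e., that the true symmetry may be higher than assumed.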

CELL CONSTANTS

Enter the cell constants (needed to calculate the resolution of the reflections).
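
For reference, the resolution of a reflection follows from the cell constants through the standard triclinic d-spacing formula. The Python sketch below is an illustration only (the function name is hypothetical and this is not the program's code).

```python
import math


def d_spacing(hkl, cell):
    """Resolution d (A) of reflection hkl.

    cell is (a, b, c, alpha, beta, gamma) with lengths in Angstrom
    and angles in degrees.
    """
    h, k, l = hkl
    if (h, k, l) == (0, 0, 0):
        raise ValueError("reflection 0 0 0 has no finite resolution")
    a, b, c, alpha, beta, gamma = cell
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    sa, sb, sg = (math.sin(math.radians(x)) for x in (alpha, beta, gamma))
    # (V / abc)**2 for a general (triclinic) cell
    vol_factor = 1.0 - ca * ca - cb * cb - cg * cg + 2.0 * ca * cb * cg
    inv_d2 = (
        (h * sa / a) ** 2
        + (k * sb / b) ** 2
        + (l * sg / c) ** 2
        + 2.0 * h * k * (ca * cb - cg) / (a * b)
        + 2.0 * k * l * (cb * cg - ca) / (b * c)
        + 2.0 * h * l * (cg * ca - cb) / (a * c)
    ) / vol_factor
    return 1.0 / math.sqrt(inv_d2)
```

For a cubic cell with a = 10 A, for instance, d_spacing((1, 0, 0), (10, 10, 10, 90, 90, 90)) gives 10 A.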

SYMMETRY OPERATORS

Provide the name of a symmetry-operator file in O format.

CALCULATE/DEDUCE/CONVERT

This command has several sub-options:
RESOLUTION
Calculate the resolution of all reflections.
CENTRICS/ACENTRICS
Assign centric/acentric flag to each reflection.
ORBITAL MULTIPLICITY
Calculate the orbital multiplicity of each reflection.
F -> I CONVERSION
Convert Fs to Is by using: I ~ F*F.
I -> F CONVERSION
Convert Is to Fs by using: F ~ sqrt(I) (I>=0).
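
The two conversions can be written down in a few lines of Python. This is a minimal sketch; returning None for a negative intensity is an assumption made here, since how the program treats such reflections (and their sigmas) is not stated.

```python
import math


def f_to_i(f):
    """I ~ F*F."""
    return f * f


def i_to_f(i):
    """F ~ sqrt(I), defined for I >= 0 only.

    ASSUMPTION: a negative I yields None (no real square root).
    """
    return math.sqrt(i) if i >= 0 else None
```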

TYPE SOME REFLECTIONS

Use this command to list some reflections. Provide the number of the first and last reflection and the step size (e.g., step size 10 will list every 10-th reflection). Providing a negative number for the last reflection is taken to mean the actual last reflection. Providing a value of zero for the step size will print only the first and the last reflection. Providing a negative step size means "show N reflections equally spread between the first and the last", with N being -1 * the step size.
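
The rules for first, last and step can be sketched in Python as follows. The function name is hypothetical, and two details are assumptions: a last value beyond the end of the dataset is clamped, and "equally spread" is implemented by rounding evenly spaced positions.

```python
def reflections_to_list(first, last, step, n_total):
    """Return the 1-based reflection numbers selected by first/last/step."""
    if last < 0 or last > n_total:
        last = n_total  # negative means "the actual last reflection"
    first = max(1, first)
    if first > last:
        return []
    if step == 0:
        # only the first and the last reflection
        return sorted({first, last})
    if step > 0:
        # every step-th reflection, starting at the first
        return list(range(first, last + 1, step))
    n = -step  # negative step: show n reflections, equally spread
    if n == 1:
        return [first]
    span = last - first
    return sorted({first + round(i * span / (n - 1)) for i in range(n)})
```

With 35 reflections in memory, first=1, last=-1 and step=10 select reflections 1, 11, 21 and 31.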

SHOW SOME REFLECTIONS

This command has several sub-options:
FOBS >
Provide a number; all reflections with F greater than this number will be listed.
FOBS <
Provide a number; all reflections with F smaller than this number will be listed.
SIGMA >
Provide a number; all reflections with sigma(F) greater than this number will be listed.
SIGMA <
Provide a number; all reflections with sigma(F) smaller than this number will be listed.
F/SIGMA >
Provide a number; all reflections with F/sigma(F) greater than this number will be listed.
F/SIGMA <
Provide a number; all reflections with F/sigma(F) smaller than this number will be listed.
RESOLUTION >
Provide a number; all reflections with a resolution greater than this number will be listed (note: greater means "lower resolution" !).
RESOLUTION <
Provide a number; all reflections with a resolution smaller than this number will be listed (note: smaller means "higher resolution" !).
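
All eight sub-options follow the same pattern (a property, a comparison and a cut-off), which the Python sketch below illustrates. The record layout (dictionaries with the keys "f", "sigma" and "resolution") is hypothetical, and skipping reflections with sigma <= 0 in the F/sigma test is an assumption.

```python
import operator


def property_value(reflection, prop):
    """Return the requested property, or None if it cannot be computed."""
    if prop == "f/sigma":
        if reflection["sigma"] <= 0:
            return None  # ASSUMPTION: F/sigma is undefined here, so skip
        return reflection["f"] / reflection["sigma"]
    return reflection[prop]


def select(reflections, prop, comparison, cutoff):
    """List reflections whose property is greater/smaller than cutoff.

    prop is one of "f", "sigma", "f/sigma", "resolution";
    comparison is ">" or "<".
    """
    compare = {">": operator.gt, "<": operator.lt}[comparison]
    selected = []
    for reflection in reflections:
        value = property_value(reflection, prop)
        if value is not None and compare(value, cutoff):
            selected.append(reflection)
    return selected
```

Note that select(data, "resolution", ">", 4.0) returns the low-resolution reflections (d > 4.0 A), in line with the remark above.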

SPECIAL REFLECTIONS

List all reflections of a certain type. Provide a template of the type of reflections to be shown (containing the characters H, K, L and/or 0). For example: HHH, 0K0, KK0, etc.
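
One possible reading of such templates is sketched below in Python. It assumes that a "0" requires that index to be zero and that positions carrying the same letter must have equal indices, while different letters are unconstrained; the program's exact matching rules may differ.

```python
def matches_template(hkl, template):
    """Return True if reflection hkl fits a template such as HHH, 0K0 or KK0.

    ASSUMED rules: "0" means the index is zero; positions sharing a letter
    must hold the same index value; different letters are independent.
    """
    template = template.upper()
    if len(template) != 3 or any(char not in "HKL0" for char in template):
        raise ValueError("template must be 3 characters from H, K, L, 0")
    seen = {}
    for index, char in zip(hkl, template):
        if char == "0":
            if index != 0:
                return False
        elif seen.setdefault(char, index) != index:
            return False
    return True
```

Under these rules, KK0 matches 3 3 0 but not 3 4 0, and 0K0 matches 0 5 0 but not 1 5 0.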

ABSENCES LIST

List reflections that are systematically absent according to the current spacegroup symmetry operators. This can sometimes be used to make educated guesses concerning the nature of certain screw axes (e.g., in P4x, if the only strong 00L reflections are those with L=4N, then x is probably 1 or 3).
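
The standard criterion for a systematic absence is that some symmetry operator (R, t) leaves the indices unchanged (hR = h) while the product h.t is not an integer. The Python sketch below applies this textbook test; it is an illustration, not the program's code, and the operator representation is hypothetical.

```python
from fractions import Fraction


def is_systematically_absent(hkl, operators):
    """Return True if hkl is extinguished by any of the operators.

    operators is a list of (R, t) pairs for x' = R x + t, where R is a
    3x3 integer matrix (list of rows) and t holds three Fractions.
    """
    hkl = tuple(hkl)
    for rot, trans in operators:
        # indices transform as a row vector: h' = h R
        h_rot = tuple(sum(hkl[i] * rot[i][j] for i in range(3)) for j in range(3))
        if h_rot != hkl:
            continue
        phase = sum(Fraction(h) * Fraction(t) for h, t in zip(hkl, trans))
        if phase.denominator != 1:
            return True  # h.t is not an integer: reflection is absent
    return False


# Operators of P41: x,y,z; -y,x,z+1/4; -x,-y,z+1/2; y,-x,z+3/4
P41 = [
    ([[1, 0, 0], [0, 1, 0], [0, 0, 1]], (Fraction(0), Fraction(0), Fraction(0))),
    ([[0, -1, 0], [1, 0, 0], [0, 0, 1]], (Fraction(0), Fraction(0), Fraction(1, 4))),
    ([[-1, 0, 0], [0, -1, 0], [0, 0, 1]], (Fraction(0), Fraction(0), Fraction(1, 2))),
    ([[0, 1, 0], [-1, 0, 0], [0, 0, 1]], (Fraction(0), Fraction(0), Fraction(3, 4))),
]
```

With these operators only the 00L reflections with L=4N survive, which is the pattern referred to in the example above. Centring translations can be handled in the same way by adding operators with R equal to the identity.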

TWIN STATISTICS

Print some intensity statistics that may or may not be able to provide information with respect to possible twinning.

GEMINI TWIN ANALYSIS

This implements intensity analysis options as described by Stanley (1972) and Rees (1980), that may be of help in investigating possible merohedral twinning. The output consists of statistics and an estimate of the most likely twin fraction ("0.0" means no merohedral twinning). In addition, two PostScript files are produced showing 1N(z,alpha) as a function of z for centro- and non-centrosymmetric reflections. See the original papers for more information.

TEMPERATURE FACTOR APPLY

Apply a temperature factor to the Fs.
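
A minimal Python sketch of such a correction, assuming the usual isotropic form F' = F * exp(-B / (4 d*d)) with d the resolution of the reflection in Angstrom. The sign convention (a positive B weakens the high-resolution data) and the function name are assumptions; whether the program also scales the sigmas is not stated.

```python
import math


def apply_temperature_factor(f, d, b):
    """Scale F by an isotropic temperature factor B (A**2).

    ASSUMED convention: F' = F * exp(-B * (sin(theta)/lambda)**2)
    with (sin(theta)/lambda)**2 = 1 / (4 * d * d).
    """
    return f * math.exp(-b / (4.0 * d * d))
```

With B = 16 A**2, for example, a reflection at d = 2 A is scaled by exp(-1).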

CHANGE INDEX

Re-index data. This may be necessary when the data-processing program yields a cell with beta < 90 in a monoclinic spacegroup, or when two datasets cannot be merged due to indexing along equivalent, but different axes (e.g., in P3x, P4x, etc.). Provide expressions for the new H, K and L.
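
Re-indexing amounts to a linear transformation of the indices. In the Python sketch below the three expressions are represented by the rows of an integer matrix, which is a simplification of the free-form expressions the program accepts; the names are hypothetical.

```python
def reindex(hkl, matrix):
    """Return the new (H, K, L); each matrix row gives one new index."""
    h, k, l = hkl
    return tuple(row[0] * h + row[1] * k + row[2] * l for row in matrix)


# Example choice: new H = K, new K = H, new L = -L.
# The determinant of this matrix is +1, so the hand of the axes is kept.
K_H_MINUS_L = [[0, 1, 0], [1, 0, 0], [0, 0, -1]]
```

Applied to reflection 1 2 3, this example transformation gives 2 1 -3.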

PROD/PLUS

This command has several sub-options:
FOBS
Provide two numbers X and Y; all Fs will be replaced by X*F+Y.
SIGMA(FOBS)
Provide two numbers X and Y; all sigmas will be replaced by X*sigma+Y.
BOTH
Provide two numbers X and Y; all Fs and sigmas will be replaced by X*F+Y and X*sigma+Y, respectively.

LAUE GROUP APPLY

Move the reflections into the asymmetric unit appropriate for the Laue group of the dataset. This is sometimes necessary when the data-processing program outputs a non-standard asymmetric unit (for instance, R-AXIS processing software in P4x gives an asymmetric unit which is incompatible with the CCP4 standard). The Laue group is selected from a pop-up menu.

Implemented Laue groups and their asymmetric units are:


  1bar,   hkl:h>=0  0kl:k>=0   00l:l>=0
  1bar,   hkl:k>=0  h0l:l>=0   h00:h>=0
  1bar,   hkl:l>=0  hk0:h>=0   0k0:k>=0
  2/m,    hkl:k>=0, l>=0     hk0:h>=0, k>=0
  2/m,    hkl:h>=0, l>=0     0kl:k>=0, l>=0 (2nd setting)
  mmm,    hkl:h>=0, k>=0, l>=0
  4/m,    hkl:h>=0, k>0, l>=0 with  k>=0 for h=0
  4/mmm,  hkl:h>=0, h>=k>=0, l>=0
  3bar,   hkl:h>=0, k<0, l>=0 including 00l
  3bar,   hkl:h>=0, k>0  including  00l:l>0
  3barm,  hkl:h>=0, k>=0 with k<=h; if h=k l>=0
  6/m,    hkl:h>=0, k>0, l>=0  with  k>=0 for h=0
  6/mmm,  hkl:h>=0, h>=k>=0, l>=0
  m3,     hkl:h>=0, k>=0, l>=0 with l>=h, k>=h for l=h, k>h if l>h
  m3m,    hkl:k>=l>=h>=0
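
For three of the Laue groups in the table, the mapping into the listed asymmetric unit is simple enough to sketch in Python. The sketch assumes that Friedel's law holds (no anomalous differences are kept) and is not the program's implementation; the other Laue groups need more care on the boundary planes.

```python
def to_asu_mmm(hkl):
    """mmm: h>=0, k>=0, l>=0 (all sign changes are equivalent)."""
    return tuple(abs(index) for index in hkl)


def to_asu_4mmm(hkl):
    """4/mmm: h>=0, h>=k>=0, l>=0 (signs and the H/K swap are equivalent)."""
    h, k, l = (abs(index) for index in hkl)
    return (max(h, k), min(h, k), l)


def to_asu_m3m(hkl):
    """m3m: k>=l>=h>=0 (all signs and all permutations are equivalent)."""
    low, middle, high = sorted(abs(index) for index in hkl)
    return (low, high, middle)
```

For example, to_asu_m3m((3, -1, 2)) returns (1, 3, 2), which satisfies k>=l>=h>=0.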

SORT REFLECTIONS

Sort the reflections by their indices H, K and L. The sort order is determined by the user.

KILL SOME REFLECTIONS

This command has the same sub-options as SHOW SOME REFLECTIONS:
FOBS >
Provide a number; all reflections with F greater than this number will be deleted.
FOBS <
Provide a number; all reflections with F smaller than this number will be deleted.
SIGMA >
Provide a number; all reflections with sigma(F) greater than this number will be deleted.
SIGMA <
Provide a number; all reflections with sigma(F) smaller than this number will be deleted.
F/SIGMA >
Provide a number; all reflections with F/sigma(F) greater than this number will be deleted.
F/SIGMA <
Provide a number; all reflections with F/sigma(F) smaller than this number will be deleted.
RESOLUTION >
Provide a number; all reflections with a resolution greater than this number will be deleted (note: greater means "lower resolution" !).
RESOLUTION <
Provide a number; all reflections with a resolution smaller than this number will be deleted (note: smaller means "higher resolution" !).

ERASE OPTIONS

This command has several sub-options:
ROGUES
Delete "rogue" reflections simply by providing their HKL indices. This can be used to remove individual reflections which are suspect for some reason or other.
ODD H/K/L
Delete all reflections for which either H, K or L is odd.
EVEN H/K/L
Delete all reflections for which either H, K or L is even.

RFREE OPTIONS

This command has several sub-options:
INITIALISE
Set the seed for the random-number generator. If a negative seed is provided, the current machine clock value is used as the seed; a positive number is used directly as the seed.
LIST CURRENT STATUS
This lists the current partitioning of the dataset in WORK and TEST reflections. If there are very few or very many TEST reflections, a warning message will be printed. In general, 10% of the data with a minimum of ~500 and a maximum of ~2000 TEST reflections is considered to be reasonable. Note that the error in Rfree is roughly equal to 1/SQRT(number of TEST reflections), so that for 500 TEST reflections the error is ~4.5% and for 2000 TEST reflections ~2.2%.
RESET ALL RFREE FLAGS
This sets all Rfree flags to zero, i.e. all reflections are flagged as WORK reflections. See RFREE FLAGS.
GENERATE RANDOM RFREE FLAGS
Provide either the *percentage* or (roughly) the *number* of TEST reflections. Randomly chosen reflections will be assigned as TEST reflections (the same way X-PLOR does this). This is actually the worst possible way to select TEST reflections, since (through the G-function) every reflection will be related to its neighbours (in reciprocal space) and, in the case of NCS, to its "NCS mates" and their neighbours.
SHELLS OF RFREE REFLECTIONS
Provide the *percentage* or (roughly) the *number* of TEST reflections and the number of resolution bins. The data will be sorted by resolution and divided into bins. From every bin, the appropriate fraction of reflections from its centre will be flagged as being TEST reflections. This is to counter couplings in the case of NCS.
SPHERES OF RFREE REFLECTIONS
Provide the *percentage* or (roughly) the *number* of TEST reflections and the radius of reciprocal-space spheres. Reflections are picked at random, and they and their neighbours inside a small sphere (in reciprocal space) are all assigned as TEST reflections. This is to counter couplings due to bulk solvent in the absence of NCS.
GSHELDRICKS METHOD
This simply assigns every N-th reflection to be a TEST reflection, where the value of N is provided by the user (in SHELX, N=10).
COMPLETE CROSS-VALIDATION SETS
The number N of datasets to be generated is provided. Every reflection will be assigned to be a TEST reflection in exactly one of the N datasets. The N datasets are written in X-PLOR format (but can, of course, be converted into other formats with this program).
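
Three of the ideas above (the error estimate for Rfree, the every N-th selection and the complete cross-validation sets) can be sketched in Python as follows. The function names are hypothetical, flags follow the convention 1 = TEST and 0 = WORK, and the random shuffling used to build the cross-validation sets is an assumption about how the reflections are distributed.

```python
import math
import random


def rfree_error(n_test):
    """Rough relative error in Rfree: 1 / sqrt(number of TEST reflections)."""
    return 1.0 / math.sqrt(n_test)


def every_nth_flags(n_reflections, n):
    """Flag every n-th reflection (in file order) as TEST."""
    return [1 if (i + 1) % n == 0 else 0 for i in range(n_reflections)]


def cross_validation_sets(n_reflections, n_sets, seed):
    """Return n_sets flag lists; each reflection is TEST in exactly one.

    ASSUMPTION: reflections are dealt out to the sets in random order.
    """
    rng = random.Random(seed)
    order = list(range(n_reflections))
    rng.shuffle(order)
    sets = [[0] * n_reflections for _ in range(n_sets)]
    for position, index in enumerate(order):
        sets[position % n_sets][index] = 1
    return sets
```

rfree_error(500) is about 0.045 and rfree_error(2000) about 0.022, matching the ~4.5% and ~2.2% quoted above.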

WRITE DATASET TO FILE

Provide the filename and select the desired file type from the pop-up menu.

DELETE CURRENT DATASET

Remove the current dataset from memory.

HELP

This prints some brief information. Subsequently, click on *any* menu command to get a brief explanation of what that command does.

QUIT

Stop working with the program.

FORMATS

Supported input formats :

* (free format)
MTZDUMP (user or free format)
XPLOR (no format required)
SHELXS (fixed format)
TNT (free format)
PROTEIN (user or free format)
MKLCF (user or free format)
HKLFS (user or free format)
RFREE (user or free format)
ELEANOR (user or free format)

Supported output formats :

* (fixed format)
RXPLOR (no format required)
SHELXS (fixed format)
TNT (fixed format)
PROTEIN (user or fixed format)
CIF (user or fixed format)
MKLCF (user or fixed format)
HKLFS (user or fixed format)
RFREE (user or fixed format)
ELEANOR (user or fixed format)
XPLOR (no format required)

Notes on formats :

*/HKLFS - reads/writes HKL F Sigma
RXPLOR - X-PLOR format with Rfree flags
SHELXS - fixed format, no Rfree flags
TNT - no FOMs, phases or Rfree flags
PROTEIN - no Sigmas
MKLCF - integer F and Sigma
RFREE - HKL F Sigma and integer Rfree flags
ELEANOR - ditto, but real (1.0-Rfree) flags
MTZDUMP - reads unedited MTZDUMP log file
CIF - output only; Rfree flags UNofficial
Use the CALCULATE/DEDUCE/CONVERT command to convert between I and F if needed.

RFREE FLAGS

xdldataman uses the X-PLOR convention, i.e. the Rfree flag is an integer number (0 or 1), and a value of "1" means that the reflection belongs to the TEST set which is *not* used in refinement. CCP4 has a different convention: reflections are divided into a number of equal-sized sets, one of which (usually flagged "0") represents the TEST set, see program FREERFLAG. The CCP4 convention is supported (and converted) through the "ELEANOR" format.
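
The relation between the two conventions can be sketched in Python as follows. This is an illustration only: the function names are hypothetical, the TEST set number is taken to be 0 as described above, and the second function reflects one reading of the "real (1.0-Rfree) flags" note under FORMATS.

```python
def ccp4_to_xplor(flag, test_set=0):
    """Convert a CCP4 set number into an X-PLOR flag (1 = TEST, 0 = WORK).

    ASSUMPTION: the TEST set is the one numbered test_set (usually 0).
    """
    return 1 if int(flag) == test_set else 0


def xplor_to_real_flag(flag):
    """Real-valued flag 1.0 - Rfree: 0.0 for TEST, 1.0 for WORK."""
    return 1.0 - float(flag)
```

The reverse direction (X-PLOR to CCP4 set numbers) is not unique, since all WORK reflections carry the same X-PLOR flag.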

REFERENCES

  1. XDLDATAMAN:
    G.J. Kleywegt & T.A. Jones (1996), Acta Cryst. D52, 826-828.
  2. XDL_VIEW:
    J.W. Campbell (1995). "XDL_VIEW, an X-windows-based toolkit for crystallographic and other applications", J. Appl. Cryst. 28, 236-242.
  3. RAVE:
    G.J. Kleywegt & T.A. Jones (1994). "Halloween ... Masks and Bones", in "From First Map to Final Model" (S. Bailey, R. Hubbard & D. Waller, Eds.), SERC Daresbury Laboratory, pp. 59-66.
  4. O:
    T.A. Jones, J.Y. Zou, S.W. Cowan, & M. Kjeldgaard (1991). "Improved methods for building protein models in electron density maps and the location of errors in these models", Acta Cryst. A47, 110-119.
  5. GEMINI:
    E. Stanley (1972). "The identification of twins from intensity statistics", J. Appl. Cryst. 5, 191-194.
  6. GEMINI:
    D.C. Rees (1980). "The influence of twinning by merohedry on intensity statistics", Acta Cryst. A36, 578-581.
  7. RFREE:
    A.T. Brunger (1992). "Free R value: a novel statistical quantity for assessing the accuracy of crystal structures", Nature 355, 472-475.
  8. CCP4:
    Collaborative Computational Project Number 4 (1994). "The CCP4 suite: programs for protein crystallography", Acta Cryst. D50, 760-763.

KNOWN BUGS

None (at the time of writing).
If you improve the program, please notify GJK of your changes so that they can be implemented in future versions and the entire community may benefit from them (E-mail a brief description and the relevant pieces of code to "gerard@xray.bmc.uu.se").

AUTHORS

Originators: G.J. Kleywegt & T.A. Jones, Uppsala

SEE ALSO

xdlmapman, mtzdump