CCP4i: Graphical User Interface
Data Harvesting



MTZ Project Names and Dataset Names in CCP4i

Creating New MTZs in CCP4i

Data Harvesting in CCP4i


From CCP4 release 4.0, dataset information will be used to write out deposition files. The CCP4 programs affected are SCALA, TRUNCATE, MLPHARE, REFMAC and RESTRAIN. In addition, the following data preparation programs can be used to add or adjust Project and Dataset names in the MTZ file: SCALEPACK2MTZ, DTREK2MTZ, COMBAT, F2MTZ and CAD.

Provided that a Project Name and a Dataset Name are specified (either explicitly or from the MTZ file), and provided that the NOHARVEST keyword is not given, the harvesting programs will automatically produce a deposition file. This file will be written to a subdirectory of $HARVESTHOME.

The environment variable $HARVESTHOME defaults to the user's home directory, but could be changed, for example, to a group project directory.
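
As a minimal sketch of the default described above, the following Python function resolves the harvesting root in the way the text describes: it uses HARVESTHOME if that variable is set, and otherwise falls back to the user's home directory. The function name `harvest_home` and the example path are hypothetical; they are not part of CCP4.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def harvest_home(env: Optional[Mapping[str, str]] = None) -> Path:
    """Return the root directory under which deposit files are collected.

    Hypothetical helper: uses HARVESTHOME when it is set and non-empty,
    otherwise the user's home directory.
    """
    if env is None:
        env = os.environ
    value = env.get("HARVESTHOME", "")
    return Path(value) if value else Path.home()


if __name__ == "__main__":
    # With the real environment
    print(harvest_home())
    # With a group project directory (illustrative path only)
    print(harvest_home({"HARVESTHOME": "/data/group_project"}))
```

Passing the environment as a parameter keeps the function easy to check without changing the real shell environment.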

In summary, the extra keywords associated with harvesting that will be included in the data harvesting programs are:

  • Project Name (PNAME). In most cases, this will be inherited from the MTZ file.
  • Dataset Name (DNAME). In most cases, this will be inherited from the MTZ file.
  • Set the directory permissions to '700', i.e. read/write/execute for the user only (otherwise '755' is used).
  • Write the deposit file to the current directory, rather than to a subdirectory of $HARVESTHOME.
  • Set the maximum width of a row in the deposit file (default 80).
  • NOHARVEST: do not write out a deposit file. The default is to write one, provided Project and Dataset names are available.
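
The rules above can be sketched in a few lines of Python. This is an illustration only, not CCP4 code: the function names are hypothetical, and the one-keyword-per-line form (`PNAME <name>`, `DNAME <name>`, `NOHARVEST`) is an assumption about how the keywords would be passed to a program. Only the PNAME, DNAME and NOHARVEST keywords named in this document are used.

```python
from typing import List, Optional


def effective_name(explicit: Optional[str], from_mtz: Optional[str]) -> Optional[str]:
    """An explicitly given name takes precedence over the one in the MTZ file."""
    return explicit or from_mtz


def will_harvest(pname: Optional[str], dname: Optional[str],
                 noharvest: bool = False) -> bool:
    """A deposit file is written only if both names are available
    and NOHARVEST has not been given."""
    return bool(pname) and bool(dname) and not noharvest


def harvest_keywords(pname: Optional[str] = None, dname: Optional[str] = None,
                     noharvest: bool = False) -> List[str]:
    """Build keyword lines (assumed syntax) for a harvesting program."""
    lines: List[str] = []
    if pname:
        lines.append(f"PNAME {pname}")
    if dname:
        lines.append(f"DNAME {dname}")
    if noharvest:
        lines.append("NOHARVEST")
    return lines


if __name__ == "__main__":
    # Hypothetical names: project given explicitly, dataset taken from the MTZ file
    pname = effective_name("lysozyme", None)
    dname = effective_name(None, "native")
    print(will_harvest(pname, dname))
    print("\n".join(harvest_keywords(pname, dname)))
```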

See also Data Harvesting.

MTZ Project Names and Dataset Names in CCP4i

The concept of project and dataset names in MTZ files serves two purposes:

  1. To identify the origin of the various columns of data in the MTZ file; in particular, the MAD data for each separate wavelength can be grouped together as one dataset. This is essential when certain manipulations, such as changing the spacegroup, are performed on the data.
  2. To carry the default project and dataset names which will be used as identifiers for the automated data harvesting. Data harvesting should help simplify submission of structures to the database.
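
To illustrate the first purpose, here is a minimal Python sketch of how columns tagged with project and dataset names can be grouped by dataset. The `Column` class, the column labels and the project and dataset names are hypothetical illustrations; the sketch does not reflect the actual MTZ header layout or any CCP4 API.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Column:
    """A data column tagged with the project and dataset it came from."""
    label: str
    project: str
    dataset: str


def group_by_dataset(columns: Iterable[Column]) -> Dict[Tuple[str, str], List[str]]:
    """Map each (project, dataset) pair to the labels of its columns."""
    groups: Dict[Tuple[str, str], List[str]] = defaultdict(list)
    for col in columns:
        groups[(col.project, col.dataset)].append(col.label)
    return dict(groups)


if __name__ == "__main__":
    # Hypothetical MAD experiment: one dataset per wavelength
    columns = [
        Column("F_peak", "mad_demo", "peak"),
        Column("SIGF_peak", "mad_demo", "peak"),
        Column("F_remote", "mad_demo", "remote"),
        Column("SIGF_remote", "mad_demo", "remote"),
    ]
    for key, labels in group_by_dataset(columns).items():
        print(key, labels)
```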

Creating New MTZs in CCP4i

There are four tasks in the CCP4 Interface which import reflection data into MTZ format:

  • Import Scaled Denzo Data (running Scalepack2mtz program)
  • Import Scaled D*trek Data (running Dtrek2mtz program)
  • Import Unscaled Data (running Combat program)
  • Convert to MTZ & Standardise (running F2mtz program)

In addition, in the 'Edit MTZ Project&Dataset' task (program CAD), the Project name (PNAME) and Dataset name (DNAME) can be adjusted.

In all of these tasks you are required to enter a Project and a Dataset name. By default the project name will be the same as the CCP4i project name, though this is not essential. This differs from the way the CCP4 programs themselves handle defaults for the project and dataset names (see, for instance, Scalepack2mtz PNAME).

N.B. The CCP4 Interface script does not do any additional data harvesting; all of this activity takes place in the data harvesting programs (i.e. SCALA, TRUNCATE, MLPHARE, REFMAC and RESTRAIN).

Data Harvesting in CCP4i

Data harvesting is performed for four different stages in the structure solution process and might be performed for the following tasks:

Program function                                             Interface Task(s)

Scaling data (Scala)                                         Scale Experimental Intensities

Converting intensities to SFs (Truncate)                     Import Scaled Denzo Data (optional)
                                                             Scale Experimental Intensities (optional)
                                                             Convert Intensities to SFs
                                                             Convert to MTZ and Standardise (optional for some formats)

Final round of refinement of heavy atom positions (Mlphare)  Run Mlphare

Final round of refinement (Refmac)                           Run Refmac

In all of these tasks it is possible to enter an alternative project or dataset name to override the ones in the MTZ file. There are three options for where to put harvest files:

  1. By default, a harvest file is created in the directory defined by the environment variable HARVESTHOME, which is set for your installation in the ccp4.setup file. The recommended value for HARVESTHOME is $HOME, so that the harvest directories are created in the user's home directory (see also Data Harvesting).
  2. If you are working through CCP4i, there is also the option to put them in the CCP4i project directory.
  3. Do not create any harvest files.
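
The three options can be modelled as a simple choice, as in the Python sketch below. The enum, the function and the example paths are hypothetical and are not taken from CCP4i itself. The function returns only the base directory for each option, or `None` when harvesting is switched off.

```python
from enum import Enum
from pathlib import Path
from typing import Optional


class HarvestDestination(Enum):
    HARVESTHOME = "harvesthome"   # option 1: the default
    PROJECT_DIR = "project_dir"   # option 2: the CCP4i project directory
    NONE = "none"                 # option 3: do not create harvest files


def harvest_directory(choice: HarvestDestination,
                      harvest_home: Path,
                      ccp4i_project_dir: Path) -> Optional[Path]:
    """Return the base directory for harvest files, or None if disabled."""
    if choice is HarvestDestination.NONE:
        return None
    if choice is HarvestDestination.PROJECT_DIR:
        return ccp4i_project_dir
    return harvest_home


if __name__ == "__main__":
    home = Path("/home/user")              # illustrative value of HARVESTHOME
    project = Path("/home/user/lysozyme")  # illustrative CCP4i project directory
    for choice in HarvestDestination:
        print(choice.name, harvest_directory(choice, home, project))
```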

You can set a default for these options in the CCP4i Preferences window (accessed from the menu on the right-hand side of the Main Window). Also, for each task which might involve harvesting, there is a folder in the task window, immediately below the Files folder, where you can change the destination for harvest files and change the project or dataset names. In the Preferences window there are also options to change the format of the harvesting file.


CCP4i allows for the automatic setting of certain preferences for data harvesting, e.g. whether to do any harvesting at all, and how to set read/write privileges for harvest files.
