nXDS

was developed for processing X-ray snapshots from single crystals in random orientations (Kabsch, 2014). It is based on the rotation data processing program XDS (Kabsch, 2010). The package consists of two components

For all images in a stream of X-ray snapshots nXDS assumes:

For combining multiple streams by nXSCALE the same type of detector is assumed, perhaps with different internal settings of its segments for each stream.

As described in this chapter, the data images in a stream are processed in 8 steps

which are called in succession by nXDS.

Information between the steps is communicated by files, which allows repetition of selected steps with a different set of input parameters without rerunning the whole program. The files generated by nXDS are either ASCII type files that can be inspected and modified by using a text editor, or binary image files in the CBF format, a byte-offset variant of the CBFlib format. Such images are indicated by the file name extension ".cbf". All files have a fixed name defined by nXDS, which makes it mandatory to process each stream in a newly created directory to avoid name clashes. Clearly, one should not run more than one nXDS-job simultaneously in the same directory. Also, output files generated by rerunning selected steps (see Table) should first be given another name if their original contents are meant to be saved.

Data processing begins by copying an appropriate input file template into the data processing directory. Input file templates are provided with the nXDS package for a number of frequently used data collection facilities. The copied input file must be renamed nXDS.INP and edited to provide the correct parameter values for the actual data collection experiment.

All parameters in nXDS.INP are named by keywords containing an equal sign as the last character, and many of them will be mentioned here in context to clarify their meaning. Execution of nXDS invokes in succession each of the 8 program steps described below - or a subset of the steps named in the parameter JOB=. Results and diagnostics from each step are saved in files with the extension .LP attached to the program step name. These files should always be studied carefully to see whether processing was satisfactory or - in case of failure - to find out what could have gone wrong.
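
The keyword convention described above (every parameter name ends with an equal sign) makes the input file easy to read programmatically. The following Python sketch is only an illustration, not part of nXDS: the function name parse_inp, the numeric values in the example, and the treatment of "!" as a comment character (borrowed from the XDS.INP convention) are assumptions.

```python
import re

# A keyword is taken to be a run of letters, digits and the characters
# _ - ( ) that ends with "=", e.g. JOB= or REFINE(IDXREF)=.
KEYWORD = re.compile(r"([A-Za-z][A-Za-z0-9_()\-]*=)")


def parse_inp(text):
    """Map each keyword (with its trailing '=') to the list of its values."""
    params = {}
    for line in text.splitlines():
        # Assumption: '!' starts a comment, as in XDS.INP.
        line = line.split("!", 1)[0]
        # re.split with a capturing group yields ['', kw1, val1, kw2, val2, ...]
        parts = KEYWORD.split(line)
        for keyword, value in zip(parts[1::2], parts[2::2]):
            params.setdefault(keyword, []).append(value.strip())
    return params


# Hypothetical input fragment; the values are made up for illustration.
example = """\
JOB= COLSPOT POWDER IDXREF   ! rerun only these steps
ORGX= 1024.5 ORGY= 1030.0
POWDER_CENTER_CORRECTION= 0 0 0.001
"""
params = parse_inp(example)
print(params["JOB="])
```

Values are kept as lists because a keyword may in principle occur more than once in the file; the sketch makes no attempt to validate keywords or convert values to numbers.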


XYCORR

calculates lookup tables of spatial corrections for each detector pixel which are stored in the files X-CORRECTIONS.cbf and Y-CORRECTIONS.cbf . In subsequent data processing steps, when the true coordinates of a pixel with respect to the laboratory coordinate system are needed, the correction values for the X- and Y-coordinates are retrieved from the tables and added to the pixel's array coordinates in the data image.
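
The table lookup described above can be sketched in a few lines of Python. This is an illustration only: the tables are represented as nested lists indexed as table[y][x], and the correction values are assumed to be in pixel units; neither detail is taken from the nXDS file format.

```python
def true_pixel_coordinates(ix, iy, x_corr, y_corr):
    """Add the tabulated corrections to the array coordinates (ix, iy)."""
    return ix + x_corr[iy][ix], iy + y_corr[iy][ix]


# Hypothetical 2 x 3 correction tables (rows = y, columns = x).
x_corr = [[0.0, 0.1, 0.2],
          [0.0, 0.1, 0.2]]
y_corr = [[-0.5, -0.5, -0.5],
          [0.25, 0.25, 0.25]]

print(true_pixel_coordinates(2, 1, x_corr, y_corr))
```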

Dependent on the detector, XYCORR computes the spatial corrections in different ways.

Problems:


FILTER

tests each snapshot and removes images whose pixel contents do not obey Poisson statistics and therefore cannot result from diffraction (input parameter GFRACT=). The list of accepted images replaces the original list specified by the parameter IMAGE_LIST=. The presence of a sufficient number of diffraction spots on the saved images is determined in the subsequent step COLSPOT.


INIT

determines three lookup tables, saved as files BLANK.cbf, GAIN.cbf, and BKGINIT.cbf, that are required by the subsequent processing steps for classifying pixels in the data images as background or belonging to a diffraction spot ('strong' pixels).

Problems:
Some detectors with insufficient protection from electromagnetic pulses may generate badly spoiled images whose inclusion leads to a completely wrong X-ray background table. These images can be identified in INIT.LP by their unexpectedly high mean pixel contents, and this step should be repeated with a different set of images.


COLSPOT

locates strong diffraction spots occurring in the data images listed in the file named by the input parameter value IMAGE_LIST= and saves the spot centroids on the file SPOT.nXDS.

COLSPOT first determines the background noise for each image pixel. It is assumed that the background is constant in a local area around each pixel; its size (2*NBX+1,2*NBY+1) is defined by input parameters NBX=, NBY=. The median of the pixel values in the area serves to recognize and exclude outliers; the mean of the included pixel values estimates the background and defines the parameter value of a Poisson distribution describing the observed pixel values in the area.

Each pixel is subsequently classified by its contents as
(a) 'background' if it does not exceed a given number of standard deviations of the background noise (BACKGROUND_PIXEL=)
(b) 'strong' if it exceeds a given number of standard deviations of the background noise (SIGNAL_PIXEL=) and therefore records reflection intensity in addition to background
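
The background estimate and the classification above can be sketched as follows. This is a simplified illustration, not the COLSPOT code: the outlier rule (reject values more than outlier_sigma Poisson standard deviations above the median), the label 'undecided' for pixels between the two thresholds, and all function and argument names are assumptions.

```python
import math
from statistics import median


def classify_pixel(image, ix, iy, nbx, nby, background_pixel, signal_pixel,
                   outlier_sigma=3.0):
    """Classify image[iy][ix] as 'background', 'strong' or 'undecided'."""
    ny, nx = len(image), len(image[0])
    # Local area of size (2*nbx+1, 2*nby+1), clipped at the image border.
    area = [image[y][x]
            for y in range(max(0, iy - nby), min(ny, iy + nby + 1))
            for x in range(max(0, ix - nbx), min(nx, ix + nbx + 1))]
    med = median(area)
    # Assumed outlier rule: drop values far above the median.
    cutoff = med + outlier_sigma * math.sqrt(max(med, 1.0))
    kept = [v for v in area if v <= cutoff]
    # The mean of the remaining values estimates the background; for a
    # Poisson distribution the variance equals the mean.
    mean = sum(kept) / len(kept)
    sigma = math.sqrt(max(mean, 1e-9))
    excess = (image[iy][ix] - mean) / sigma
    if excess > signal_pixel:
        return "strong"
    if excess <= background_pixel:
        return "background"
    return "undecided"


# Hypothetical 5 x 5 image: flat background of 100 counts, one bright pixel.
image = [[100] * 5 for _ in range(5)]
image[2][2] = 400
print(classify_pixel(image, 2, 2, 2, 2, background_pixel=3.0, signal_pixel=6.0))
print(classify_pixel(image, 0, 0, 2, 2, background_pixel=3.0, signal_pixel=6.0))
```

The median is used only to find a robust cutoff, so that a bright spot inside the local area does not inflate the background mean.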

Spots are defined as sets of 'strong' pixels adjacent in two dimensions. A spot is accepted if it contains a minimum number of 'strong' pixels (MINIMUM_NUMBER_OF_PIXELS_IN_A_SPOT=). Images containing fewer than MINIMUM_NUMBER_OF_SPOTS= spots are excluded from further processing. At most MAXIMUM_NUMBER_OF_SPOTS= of the strongest spots are saved on file SPOT.nXDS. The names of the accepted images are included in the list of 'good' images (COLSPOT.LST).
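
Grouping adjacent 'strong' pixels into spots is a connected-component search. The Python sketch below shows one way to do it under stated assumptions: 4-connectivity (edge-sharing neighbours only), intensity-weighted centroids, and ranking by summed pixel value are choices made for this example and are not confirmed details of COLSPOT.

```python
from collections import deque


def find_spots(image, strong, min_pixels, max_spots):
    """Return up to max_spots tuples (total, cx, cy), strongest first."""
    ny, nx = len(strong), len(strong[0])
    seen = [[False] * nx for _ in range(ny)]
    spots = []
    for y0 in range(ny):
        for x0 in range(nx):
            if not strong[y0][x0] or seen[y0][x0]:
                continue
            # Breadth-first flood fill over edge-sharing strong pixels.
            queue = deque([(x0, y0)])
            seen[y0][x0] = True
            pixels = []
            while queue:
                x, y = queue.popleft()
                pixels.append((x, y))
                for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    u, v = x + dx, y + dy
                    if (0 <= u < nx and 0 <= v < ny
                            and strong[v][u] and not seen[v][u]):
                        seen[v][u] = True
                        queue.append((u, v))
            if len(pixels) < min_pixels:
                continue  # too few strong pixels to count as a spot
            total = sum(image[y][x] for x, y in pixels)
            cx = sum(x * image[y][x] for x, y in pixels) / total
            cy = sum(y * image[y][x] for x, y in pixels) / total
            spots.append((total, cx, cy))
    spots.sort(reverse=True)  # strongest spots first
    return spots[:max_spots]


# Hypothetical 4 x 4 image with a two-pixel spot and an isolated strong pixel.
image = [[0, 0, 0, 0],
         [0, 10, 30, 0],
         [0, 0, 0, 0],
         [0, 0, 0, 5]]
strong = [[value > 0 for value in row] for row in image]
print(find_spots(image, strong, min_pixels=2, max_spots=10))
```

The arguments min_pixels and max_spots play the roles of MINIMUM_NUMBER_OF_PIXELS_IN_A_SPOT= and MAXIMUM_NUMBER_OF_SPOTS=. The sketch assumes strong pixels have positive values, so the centroid denominator is never zero.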

Problems:
Sharp edges in the images, like ice rings or borders of the untrusted detector regions, can lead to an excessive number of 'strong' pixels erroneously classified as contributing to diffraction spots. These aliens could waste computing time and even prevent IDXREF from recognizing the crystal lattice. To avoid this problem, the user can set an upper limit for the number of strongest spots kept per image (MAXIMUM_NUMBER_OF_SPOTS=).


POWDER

generates a powder pattern POWDER.cbf from all spots in the file SPOT.nXDS. This control image could help the user to correct input parameter values for the origin of the detector coordinate system which is critical for the correct indexing of the scattering vectors in the subsequent steps of data processing. The generated image from this step should be inspected by the user with the XDS-Viewer program which adopts an interpretation of cursor positions consistent with the nXDS program.

The plane of the control image POWDER.cbf (1024 × 1024 square pixels) is placed at unit distance from the crystal with its normal coinciding with the incident beam direction computed from the input parameter INCIDENT_BEAM_DIRECTION=. The incident beam intersects the image at pixel coordinates x=511, y=511 at the image center which is marked by crosshairs. The scattering vectors from file SPOT.nXDS are marked in this image where they form a set of concentric rings - the powder pattern - in the ideal case.
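
The geometry of the control image can be illustrated by projecting a direction onto a plane at unit distance from the crystal. The sketch below is a hedged illustration, not the POWDER code: it assumes the marked quantity is the direction of a diffracted beam, that the in-plane axes ex and ey are the laboratory x- and y-axes, and that one pixel spans q length units at unit distance (q = 0.001 is the default quoted for POWDER_CENTER_CORRECTION=).

```python
import math


def powder_pixel(direction, beam=(0.0, 0.0, 1.0),
                 ex=(1.0, 0.0, 0.0), ey=(0.0, 1.0, 0.0),
                 q=0.001, center=511.0):
    """Project a direction onto the plane at unit distance along the beam.

    Returns pixel coordinates (x, y), or None if the direction does not
    point towards the plane.
    """
    def dot(a, b):
        return sum(i * j for i, j in zip(a, b))

    norm = math.sqrt(dot(beam, beam))
    b = tuple(c / norm for c in beam)      # unit vector along the beam
    along = dot(direction, b)
    if along <= 0.0:
        return None
    # Scale the direction so that its component along the beam equals 1.
    p = tuple(c / along for c in direction)
    return center + dot(p, ex) / q, center + dot(p, ey) / q


print(powder_pixel((0.0, 0.0, 1.0)))   # the incident beam direction itself
print(powder_pixel((0.1, 0.0, 1.0)))   # a direction tilted towards +x
```

Under these assumptions a direction parallel to the incident beam lands on the image center, and all directions at the same scattering angle land on a circle around it, which is why the spots form concentric rings.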

The center of the rings should be at the tip of the incident beam - which is the image center. Often the center of the powder rings is instead found offset from its ideal place, which can be interpreted as resulting from an incorrect value for the origin of the detector coordinate system (ORGX=, ORGY=). The pixel offset dx,dy (powder center - image center) is specified by the input parameter POWDER_CENTER_CORRECTION=dx dy q where q is the side length (radians) of a square pixel. From this information a corrected origin (ORGX=, ORGY=) is calculated by the POWDER step and reported in POWDER.LP. For further processing with nXDS you should then specify in nXDS.INP the new values for ORGX=, ORGY= and turn off the parameter or set POWDER_CENTER_CORRECTION= 0 0 0.001 which is its default. The POWDER step could be repeated with the new values for visual control, or nXDS could proceed if no further corrections are required.

Problems:
The control image POWDER.cbf does not show a concentric set of circles despite a large number of spots. This could be due to the presence of too many spots or artefacts on file SPOT.nXDS, to varying directions of the incident beam during data collection or to incorrect parameters describing the multi-segment detector.


IDXREF

uses the initial parameters describing the diffraction experiment from the input file nXDS.INP and the centroids of the spots observed on each image from the file SPOT.nXDS to find orientation, metric, and symmetry of the crystal lattice, and saves the refined values on file XPARM.nXDS.

Spots from file SPOT.nXDS are accepted that are

To determine a crystal lattice that explains the observed locations of the diffraction spots, IDXREF proceeds for each image as follows.

  1. The laboratory coordinates of the diffracted beam wave vector (normalized to 1/λ) corresponding to each spot are calculated from the input parameter values specifying the origin and orientation of the multi-segment detector (POWDER_CENTER_CORRECTION=, ORGX=,ORGY=, DETECTOR_DISTANCE=, DIRECTION_OF_DETECTOR_X-AXIS=, DIRECTION_OF_DETECTOR_Y-AXIS=, SEGMENT=, DIRECTION_OF_SEGMENT_X-AXIS=, DIRECTION_OF_SEGMENT_Y-AXIS=, SEGMENT_ORGX=, SEGMENT_ORGY=, SEGMENT_DISTANCE=, X-RAY_WAVELENGTH=, QX=, and QY=).
  2. Subtraction of the incident beam wave vector (given by the parameters INCIDENT_BEAM_DIRECTION= and X-RAY_WAVELENGTH=) from the diffracted beam wave vector leads for each spot to a reciprocal lattice vector when the Laue equations are satisfied.
  3. Differences between any two such reciprocal lattice vectors which are above a specified minimal length (SEPMIN=) are then accumulated in a 3-dimensional histogram with a grid size (RGRID=) defined by the user or determined by the program. These difference vectors will form clusters in the histogram since there are many different pairs of reciprocal lattice vectors of nearly identical vector difference. The clusters are found as maxima in the smoothed histogram (CLUSTER_RADIUS=), and only some (MAXIMUM_NUMBER_OF_DIFFERENCE_VECTOR_CLUSTERS=) of the most densely populated cluster vectors are further used.
  4. If the space-group is unknown a vector triplet is selected that best explains the observed difference vector clusters by integral indices (INTEGER_ERROR=). Vector clusters that can be consistently indexed with respect to this triplet are connected in a tree. Trees using the same vector triplet as a basis - but offset only by a constant vector in reciprocal space - can be joined into a new, common reciprocal basis explaining all members of the trees. The parameter MERGE_TREE= defines a minimum fraction of explained difference vectors required for inclusion in a common basis.
  5. If space-group and cell constants are specified (input parameters SPACE_GROUP_NUMBER=, UNIT_CELL_CONSTANTS=), the observed difference vector clusters are interpreted (indexed) (INTEGER_ERROR=) by a rotated version of the given unit cell using exhaustive search (NUMBER_OF_TESTED_BASIS_ORIENTATIONS=).
  6. The basis vectors and the 30 best short lattice vectors with attached indices are listed in IDXREF.LP. If many of the indices deviate significantly from integral values, the program is unable to find a reasonable lattice basis and all further processing will be meaningless for this image.
  7. Based on the orientation and metric of the reduced cell, up to 3,000 of the strongest spots are indexed by the local indexing method. This method considers each spot as a node of a tree and identifies the largest subtree of nodes which can be assigned reliable indices. The number of reflections in the ten largest subtrees is reported and usually shows a dominant first tree corresponding to a single lattice, whereas alien spots are found in small subtrees. Input parameters that control the local indexing are INDEX_ERROR=, INDEX_MAGNITUDE=, INDEX_QUALITY=.
  8. Reflections in the largest subtree are used for initial refinement of the basis vectors of the reduced cell, the incident beam wave vector, and the detector parameters (controlled by the input parameter REFINE(IDXREF)=). Spots are included in the refinement only if their coordinates can be explained with reasonable accuracy (MAXIMUM_ERROR_OF_SPOT_POSITION=).
  9. The refined metric parameters of the reduced cell are used for testing each of the 44 possible lattice types (Kabsch, 1993). Possible lattice symmetries (decision constants MAX_CELL_AXIS_ERROR=, MAX_CELL_ANGLE_ERROR=) are reported but no automatic decision for the space-group is made. If the crystal symmetry is unknown, data processing will continue with the crystal being described by its reduced cell basis vectors and triclinic symmetry. Otherwise, the observed spots are reindexed in terms of the given cell whose basis vectors are found from the reduced cell vectors.
  10. After initial refinement based on the reflections in the largest subtree, all spots which can now be indexed are included. Spots not belonging to the crystal lattice are given indices 0,0,0. The initial file SPOT.nXDS is replaced by a file of identical name - now with indices attached to each observed spot.
  11. The indexing run is considered successful if some minimum fraction (MINIMUM_FRACTION_OF_INDEXED_SPOTS=, default 40%) of the given spots can be explained; otherwise the image will be excluded from further processing.
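
Steps 1 to 3 of the list above can be sketched in Python for a strongly simplified geometry. The assumptions are: a single flat detector perpendicular to the incident beam, the beam along the laboratory z-axis, and detector axes along laboratory x and y. The function names, the arguments min_length and grid (standing in for the roles of SEPMIN= and RGRID=, with units chosen for this example), and the omission of histogram smoothing are all simplifications, not the IDXREF implementation.

```python
import math
from collections import Counter
from itertools import combinations


def reciprocal_vector(x, y, orgx, orgy, qx, qy, distance, wavelength):
    """Reciprocal lattice vector S - S0 for a spot at detector pixel (x, y)."""
    # Laboratory vector from the crystal to the spot on the detector.
    d = ((x - orgx) * qx, (y - orgy) * qy, distance)
    length = math.sqrt(sum(c * c for c in d))
    # Diffracted beam wave vector S, normalized to length 1/wavelength.
    s = tuple(c / (length * wavelength) for c in d)
    # Incident beam wave vector S0 along z.
    s0 = (0.0, 0.0, 1.0 / wavelength)
    return tuple(a - b for a, b in zip(s, s0))


def difference_histogram(vectors, min_length, grid):
    """Count difference vectors longer than min_length in cubic grid cells."""
    hist = Counter()
    for a, b in combinations(vectors, 2):
        # Only one sign of each difference is stored in this sketch.
        diff = tuple(q - p for p, q in zip(a, b))
        if math.sqrt(sum(c * c for c in diff)) > min_length:
            hist[tuple(round(c / grid) for c in diff)] += 1
    return hist


# Hypothetical reciprocal lattice points lying on a line with spacing 0.1.
points = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.2, 0.0, 0.0)]
hist = difference_histogram(points, min_length=0.05, grid=0.01)
print(hist.most_common(1))
```

In the example the spacing 0.1 occurs twice and therefore gives the most populated cell, which is the effect the histogram clustering relies on: repeated lattice translations pile up in the same place.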

Problems:


INTEGRATE

determines the recorded intensity of reflections of all successfully indexed data images using the parameter values listed in file XPARM.nXDS. The refined parameters together with the recorded intensities are saved on file INTEGRATE.HKL, representing the main result of processing the specified stream of snapshots.

For each image INTEGRATE carries out the following steps.

Problems:


CORRECT

determines scaling and correction factors for the recorded intensities and standard deviations of all reflections in the file INTEGRATE.HKL, reports quality and completeness of the data, and saves the fully corrected integrated intensities on file nXDS_ASCII.HKL. This file can be directly read by XDSCONV (see XDS) which converts the reflection data into various formats required by software packages for crystal structure determination like CCP4, CNS, X-PLOR, or SHELX.

The integrated intensities of the reflections saved on file INTEGRATE.HKL may or may not have been indexed in the correct space group. For the purpose of integration it is only important that all reflections occurring in the data images have been accurately located and indexed with respect to some unit cell basis. Thus, the INTEGRATE step can be carried out even in the case of an unknown space group because the reflection indices with respect to the correct space group are always a linear transformation of the original indices used in the INTEGRATE step and reported in INTEGRATE.HKL.
Presently, CORRECT requires knowledge of the space group and approximate cell constants. Plausible space group candidates and their conventional cell parameters were already reported in IDXREF.LP and can now be tried in the CORRECT step without the need for rerunning previous program steps (specify JOB=CORRECT, SPACE_GROUP_NUMBER=, UNIT_CELL_CONSTANTS= in nXDS.INP and rerun nxds_par).
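
The statement that the indices in the correct space group are a linear transformation of the indices used in INTEGRATE can be made concrete with a small Python sketch. The function name, the row-wise matrix convention, and the example matrix are assumptions for illustration; the matrix actually applied by CORRECT depends on the lattice and is not given here.

```python
def reindex(hkl, matrix):
    """Apply a 3 x 3 integer matrix to the Miller indices (h, k, l)."""
    h, k, l = hkl
    return tuple(row[0] * h + row[1] * k + row[2] * l for row in matrix)


# Hypothetical reindexing operator that maps (h, k, l) to (k, h, -l).
swap_hk = ((0, 1, 0),
           (1, 0, 0),
           (0, 0, -1))

print(reindex((1, 2, 3), swap_hk))   # prints (2, 1, -3)
```

Because the transformation acts on indices only, the integrated intensities themselves do not have to be recomputed when a different cell or setting is tried.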

The possibility to compare the new data with a reference data set (REFERENCE_DATA_SET=) is particularly useful for resolving the issue of alternative settings in case the lattice symmetry is higher than the point group symmetry of the crystal (e.g. P4, P6, R3). NOTE: It is assumed that the reference reflections are indexed with respect to the basis specified in nXDS.INP (SPACE_GROUP_NUMBER=, UNIT_CELL_CONSTANTS=).
Also, reference data are quite useful for recognizing misindexing. CORRECT reindexes the input reflections from file INTEGRATE.HKL thereby resolving possible indexing ambiguities by comparison with a given reference data set or by a selective breeding algorithm that yields, on average, the highest correlation with intensities from all other images (Kabsch, 2014).

Reflections from file INTEGRATE.HKL are accepted that are

Intensity correction factors are calculated for each accepted reflection that are due to

Recognition and removal of outliers among the images is based on medians of

CORRECT improves the final data set by a post-refinement procedure that iteratively changes the crystallographic parameters of all images so that intensities of symmetry-equivalent reflections become as similar as possible to each other while discrepancies between calculated and observed spot positions on the detector segments are minimized at the same time. The post-refinement can be controlled by input parameters like USE_REFERENCE_IN_POSTREFINEMENT=, POSTREFINE=, REFINE_SEGMENT=.

After post-refinement a simple method is chosen to reduce the potentially very large number of reflections to a manageable size - yet with little loss of information. For each unique reflection (considering Friedel pairs as different), in general two statistically independent intensity estimates are generated from reflections taken from randomly selected snapshots. Comparison of these pairs of independent intensity estimates then allows the original intensity error estimates, obtained from counting statistics during the INTEGRATE step, to be corrected by a resolution-dependent factor so that the new error estimates agree with the sample statistics of the intensity pairs. Outliers from the Wilson plot, often arising from ice rings in the data images, are recognized and marked by a negative sign attached to their e.s.d. in the final output file (REJECT_ALIEN=).
The reduced set of statistically independent intensity observations thus obtained is sufficient to assess the quality of the final data set as a function of resolution by various indicators, like I/SIGMA, Rmrgd-F, CC(1/2), etc. (Diederichs and Karplus, 1997). This reduced set of fully corrected data is saved on file nXDS_ASCII.HKL.
A potentially very large output file is generated only if explicitly requested by the user by specifying a file name for the input parameter LONG_OUTPUT_FILE=. The named file contains the corrected intensities of all reflections recorded in the data images. Details are described in nXDS_long.HKL.
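
How pairs of independent intensity estimates can be used to rescale error estimates and to compute CC(1/2) is sketched below in Python. This is a minimal illustration under stated assumptions, not the procedure implemented in CORRECT: a single scale factor is derived per resolution shell by matching the mean squared pair difference to the sum of the variances, and CC(1/2) is taken as the Pearson correlation between the two estimates. The function names and data are hypothetical.

```python
import math


def error_scale_factor(pairs):
    """Factor k such that k*sigma matches the scatter of the intensity pairs.

    pairs is a list of ((I1, sigma1), (I2, sigma2)) for one resolution shell.
    If the sigmas were correct, (I1 - I2)**2 would on average equal
    sigma1**2 + sigma2**2, giving k = 1.
    """
    observed = sum((i1 - i2) ** 2 for (i1, _), (i2, _) in pairs)
    expected = sum(s1 ** 2 + s2 ** 2 for (_, s1), (_, s2) in pairs)
    return math.sqrt(observed / expected)


def cc_half(pairs):
    """Pearson correlation between the two independent intensity estimates."""
    xs = [first[0] for first, _ in pairs]
    ys = [second[0] for _, second in pairs]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical intensity pairs for one resolution shell, all with sigma = 1.
shell = [((10.0, 1.0), (12.0, 1.0)),
         ((20.0, 1.0), (18.0, 1.0)),
         ((30.0, 1.0), (33.0, 1.0))]
print(error_scale_factor(shell), cc_half(shell))
```

Calling error_scale_factor once per resolution shell yields a resolution-dependent factor; a factor above 1 indicates that counting statistics underestimated the true scatter in that shell.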

Problems:


© 2013-2023, MPI for Medical Research, Heidelberg
Wolfgang.Kabsch@mpimf-heidelberg.mpg.de
page last updated: Nov 28, 2023