DM 1.6 Manual
     NAME
          dm - density modification package, release 1.6, 28/7/94

     SYNOPSIS
          dm HKLIN foo.mtz HKLOUT bar.mtz [ SOLIN foo.msk ] [ SOLOUT
          bar.msk ] [ NCSIN1 foo1.msk [ NCSIN2 ... ] ]

     REFERENCE
          K. Cowtan (1994), Joint CCP4 and ESF-EACBM Newsletter on
          Protein Crystallography, 31, p34-38.

     DESCRIPTION
          Dm is a package which applies real space constraints based
          on known features of a protein electron density map in order
          to improve the approximate phasing obtained from
          experimental sources. Various information can be applied,
          including such diverse elements as the following (see the
          MODE keyword):

          SOLV Solvent flattening [8]

          HIST Histogram mapping [9]

          AVER NCS averaging [2,6]

          SKEL Skeletonisation [1,7]

          SAYR Sayre's equation [5,9]

          FLIP Solvent flipping and density truncation [10]

          The program has three phase extension and combination modes,
          which are selected by the appropriate choice of keywords.
          Note that an arbitrary choice of keywords ignoring the
          recommended scheme can lead to a worse map. The combination
          mode is determined by the COMBINE keyword. If this keyword
          is omitted, the program runs in Free-Sim mode, as it is
          very hard to make a map worse in this mode:

          Reflection Omit mode [11]
               This mode gives best results with most combinations of
               density modification constraints. It does however
               drastically increase the computation time, and is
               generally used only with SOLV, HIST methods.

          `Solomon' mode [10]
               This mode is based on J. P. Abrahams' Solomon package,
               and gives a good map very quickly. It can however only
               be used with the FLIP, AVER methods.

          Free-Sim (dm) mode [11]
               This mode does not give as good a final map as the
               other modes, but can be used reliably with any
               combination of density modification constraints. In
               this mode a Free-R like quantity is also generated.

          Calculation of scale and B-factor for the data are
          automatic. This is performed by comparison with an
          empirically derived database of map variance at different
          resolutions, and is more reliable than the conventional
          Wilson plot.

          Non-crystallographic symmetry averaging can be performed for
          both proper and improper symmetries, and different NCS
          averaging operations can be applied to different parts of
          the protein.  (Thanks to Dave Schuller for his help with
          this). Input masks may be on any grid and axis order.

          Skeletonisation is by the core-tracing algorithm of Swanson
          [7]. This is faster than Greer's algorithm and allows
          adjustment of the skeletonisation parameters without
          recalculating the skeleton. As a result the skeletonisation
          calculation is rendered largely automatic.

        Using `dm' in Reflection Omit Mode
          Selected by `COMBINE OMIT'. Implies `SCHEME ALL' and `NCYCLE
          AUTO'.  In this mode a reflection omit calculation is used
          to reduce dependency between initial and modified F's. Phase
          combination (between Fobs and omitted Fmod) is by the SigmaA
          method. The time taken is proportional to the number of free
          sets into which the data is divided (at least 10, default
          20). Any density modification may be used, but AVER, SKEL,
          SAYR will usually be too slow. The real-space-free-residual
          is used to automatically stop the calculation, because after
          a few cycles the map will stop improving and start to
          deteriorate. A typical command file might contain the
          following:
          SOLC <solc>
          MODE SOLV HIST [AVER]
          COMBINE OMIT [SETS <numsets>]
          [AVER...]
          LABI ...

        Using `dm' in Solomon Mode
          Selected by `COMBINE SIGMAA'. Implies `SCHEME ALL' and
          `NCYCLE AUTO'.  This mode combines solvent flipping and
          density truncation, with optional averaging. Phase
          combination is by the SigmaA method. The real-space-free-
          residual is used to automatically stop the calculation,
          because after a few cycles the map will stop improving and
          start to deteriorate. If averaging is used however the
          calculation is much more stable and `NCYC' can be used to
          increase the number of calculation cycles. A typical command
          file might contain the following:
          SOLC <solc>
          MODE FLIP [AVER]
          COMBINE SIGMAA
          [AVER...]

        Using `dm' in Free-Sim Mode
          This mode can be used with any density modification except
          solvent flipping. Phase combination is by the Free-Sim
          method. A Free-R-like quantity is generated. NCYC <ncycle>
          and COMBINE FREE <ncross> should be given. Any other
          keywords except REAL or FLIP may be used. A typical command
          file might contain the following:
          SOLC <solc>
          MODE [SOLV] [HIST] [AVER] [SKEL] [SAYR]
          NCYC 10
          SCHEME RES FROM 3.5
          COMBINE FREE 1 [SETS <numsets>]
          [AVER...]
          LABI ...
           ...

        Free Indicators
          There are two Free indicators that `dm' can use. The first
          is the density modification Free-R (defined in the same way
          as the refinement Free-R). This is calculated in the Free-
          Sim and Omit modes. Unfortunately, while effective for
          refinement, it is a poor indicator of the progress of
          density modification. A better indicator (due to J. P.
          Abrahams) is the real-space-free-residual. This is
          calculated by omitting two small spheres of protein and
          solvent from the density modification. The flatness of the
          solvent sphere and the histogram fit in the protein sphere
          provide a better indication of progress.
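
          The density modification Free-R is described above only as
          being defined in the same way as the refinement Free-R. A
          minimal Python sketch of that definition follows. The
          function names, the single linear scale factor and the flag
          convention (flag 0 marks the free set) are assumptions made
          for illustration, not a description of the code inside `dm'.

```python
def r_factor(f_obs, f_calc):
    """Conventional R-factor using one linear scale factor (assumed form)."""
    total_obs = sum(f_obs)
    total_calc = sum(f_calc)
    if total_obs == 0 or total_calc == 0:
        raise ValueError("amplitude sums must be non-zero")
    k = total_obs / total_calc  # put f_calc on the scale of f_obs
    mismatch = sum(abs(fo - k * fc) for fo, fc in zip(f_obs, f_calc))
    return mismatch / total_obs


def free_r(f_obs, f_calc, free_flags, free_set=0):
    """R-factor over only those reflections whose flag equals free_set."""
    pairs = [(fo, fc) for fo, fc, flag in zip(f_obs, f_calc, free_flags)
             if flag == free_set]
    if not pairs:
        raise ValueError("no reflections carry the requested free flag")
    fo, fc = zip(*pairs)
    return r_factor(fo, fc)
```

          Because the free reflections are left out of the phase
          combination, this value measures how well the modified map
          predicts data it has not been fitted to.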

     INPUT/OUTPUT FILES
        HKLIN
          Input mtz file - This should contain the conventional (CCP4)
          asymmetric unit of data (see CAD).

        HKLOUT
          Output mtz file.

        SOLIN
          Input solvent mask - This overrides the automatic Wang mask
          determination. The input mask can have any grid and axis
          ordering, and may have any extent from the protein region of
          a single asymmetric unit to the whole cell.

        NCSIN<i>
          Input NCS averaging masks - These are used with the AVER
          option. The input masks can have any grid or axis ordering,
          and may cover a single monomer or the whole multimer.

        SOLOUT
          Output solvent mask - This will be on the program grid with
          default axis order, and will cover the whole unit cell.

     MAJOR KEYWORDS
          (SOLC and MODE are compulsory)

        MODE [SOLV] [HIST] [AVER] [SKEL] [SAYR] [FLIP]
          Select the calculation to be performed:

          SOLV = Solvent flattening

          HIST = Histogram mapping

          AVER = Non-crystallographic symmetry averaging

          SKEL = Skeletonisation

          SAYR = Sayre's equation

          FLIP = solvent flipping and protein truncation (Solomon
               mode)

        SOLC <solc> [MASK <solvfrac> <protfrac>] [MEAN  <solvval>
          <protval>]
          <solc>
               = solvent content. ALWAYS INPUT THE CORRECT SOLVENT
               CONTENT HERE TO ENSURE CORRECT SCALING.  0.0=all
               protein, 1.0=all solvent.

          MASK <solvfrac> <protfrac>
               - used to set mask volumes for histogram matching and
               solvent flattening that differ from those implied by
               <solc>.
               <solvfrac> = fraction of cell to be masked as solvent.
               <protfrac> = fraction of cell to be masked as protein.
               If <solvfrac>+<protfrac> < 1.0 then there will be a
               buffer region between solvent and protein which is
               neither histogram matched nor solvent flattened. This
               feature is provided by popular demand, but makes things
               worse in most of my test cases.

          MEAN <solvval> <protval>
               - used to set mean density for solvent and protein
               regions. This affects scaling and density modification.
               <solvval> = mean density in solvent region.
               <protval> = mean density in protein region.
               (defaults 0.32, 0.43 electrons per cubic angstrom)
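
          Since <solc> must be correct for the scaling to work, it is
          usually estimated beforehand from the Matthews coefficient.
          The Python sketch below is not part of `dm'. It assumes the
          usual protein partial specific volume, which gives the
          constant 1.23, and the function and argument names are
          hypothetical.

```python
def solvent_content(cell_volume, n_copies, mol_weight):
    """Estimate the solvent fraction from the Matthews coefficient.

    cell_volume : unit cell volume in cubic Angstrom
    n_copies    : total number of molecules in the unit cell
                  (symmetry operators times copies per asymmetric unit)
    mol_weight  : molecular weight of one copy in Dalton
    """
    if cell_volume <= 0 or n_copies <= 0 or mol_weight <= 0:
        raise ValueError("all inputs must be positive")
    v_m = cell_volume / (n_copies * mol_weight)  # Angstrom^3 per Dalton
    return 1.0 - 1.23 / v_m
```

          For example, a cell of 246000 cubic Angstrom holding four
          copies of a 25 kDa protein gives a Matthews coefficient of
          2.46 and an estimated <solc> of 0.5.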

        RESOLUTION <rmin> <rmax>
          Resolution range of reflections to include in the
          calculation.  By the end of the calculation all the
          reflections in this range will be included, however at the
          start only a subset are used, chosen on the basis of the
          scheme card.
          (default is the whole range of the input mtz file)

        NCYCLE <ncycle> | AUTO
          Number of cycles of phase extension to perform.

          <ncycle>
               = Number of cycles over which to perform phase
               extension. Use 10 cycles for a quick result, try more
               (20-100) but check the free-R factor. (Free-Sim mode).

          AUTO = Run until the real-space-free residual stops
               decreasing, then stop.  This is used in the Reflection
               Omit/Solomon modes, where running the calculation for
               too many cycles can cause the map to get worse.

          (defaults <ncycle>=10)
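
          The AUTO behaviour can be pictured as a simple stopping
          rule. The Python sketch below assumes the calculation stops
          at the first cycle whose real-space-free residual is not
          lower than the previous one; the exact criterion used by
          `dm' is not stated in this document, and the function name
          is hypothetical.

```python
def cycles_before_stall(residuals):
    """Count the cycles kept before the residual first fails to decrease."""
    best = None
    kept = 0
    for value in residuals:
        if best is not None and value >= best:
            break  # the residual has stopped decreasing
        best = value
        kept += 1
    return kept
```

          Given residuals of 0.50, 0.42, 0.38, 0.39 the rule keeps the
          first three cycles and discards the fourth.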

        SCHEME  ALL | AUTO | RES | MAG | FOM [[ FROM <res> ] [ FRAC
          <frac> ]]
          ALL  - Use all reflections for the whole calculation. This
               is used in the Reflection Omit/Solomon modes.

          RES  - perform phase extension in resolution steps, starting
               with the low resolution data.

          MAG  - perform phase extension in magnitude steps, starting
               with the largest reflections.

          FOM  - perform phase extension in FOM steps, starting with
               the best phased data.

          AUTO - perform phase extension using a combination of the
               above chosen on the basis of what the data set looks
               like. This option will also pick a reasonable value for
               <frac>.

          FRAC <frac>
               - fraction of the input data to use as a starting set.

          FROM <res>
               - sets <frac> to the fraction of the data within a
               resolution sphere radius <res>.

          (default: AUTO)
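
          The relation between FROM <res> and <frac> can be estimated
          by hand. The Python sketch below assumes a complete data set
          spread evenly through reciprocal space, so that the number
          of reflections inside a resolution sphere scales as the
          inverse cube of the resolution. `dm' itself counts the
          reflections actually present, and the function name is
          hypothetical.

```python
def start_fraction(res_from, d_min, d_max=float("inf")):
    """Approximate fraction of reflections between d_max and res_from.

    All arguments are resolutions in Angstrom, with
    d_min <= res_from <= d_max.
    """
    if not (0 < d_min <= res_from <= d_max):
        raise ValueError("expected 0 < d_min <= res_from <= d_max")

    def sphere(d):
        # Relative volume of the reciprocal-space sphere of radius 1/d.
        return d ** -3.0

    return (sphere(res_from) - sphere(d_max)) / (sphere(d_min) - sphere(d_max))
```

          For data extending to 2.1 Angstrom, `SCHEME RES FROM 3.5'
          corresponds to a starting fraction of roughly 0.22 under
          these assumptions.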

        COMBINE OMIT | SIGMAA | FREE <ncross> [ SETS <numsets> ]
          OMIT Use reflection omit combination scheme, as part of the
               reflection-omit mode.

          SIGMAA
               Use SigmaA combination, as part of the Solomon mode.

          FREE <ncross>
               Use Free-Sim phase combination.  <ncross> = number of
               times each step is performed to provide statistics for
               the free-R and phase weighting.

               For <ncross>=1 a changing random set of reflections is
               omitted each cycle for the free-R factor.

               For <ncross>=2 a fixed set is chosen (using the free-R
               flag if available) and omitted for the free-R factor,
               then the cycle is run a second time using all the
               reflections.

               For <ncross> > 2, (<ncross>-1) different free-R sets are
               generated, then on the <ncross>-th cycle all
               reflections are included.
               The total time taken is proportional to the product of
               <ncycle> and <ncross>. Use <ncross> = 1 for large
               structures where the time becomes a significant factor,
               otherwise use <ncross> = 2. Only use <ncross> > 2 for
               small structures where the statistics are particularly
               poor (< 5000 reflections).

          SETS <numsets>
               <numsets> is the number of free sets into which the
               data will be divided. These are used both in Free-Sim
               and Reflection Omit modes. In reflection omit mode the
               calculation time increases in proportion to the number
               of free sets.
               (defaults: FREE, <ncross>=1, <numsets>=20)
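
          Dividing the data into free sets can be sketched as follows.
          This Python example assumes the sets are of equal size and
          assigned at random; how `dm' actually chooses them is not
          described here, and the function name and seed argument are
          hypothetical.

```python
import random


def assign_free_sets(n_reflections, num_sets=20, seed=0):
    """Give each reflection a set number 0..num_sets-1, balanced and shuffled."""
    if num_sets < 1:
        raise ValueError("num_sets must be at least 1")
    labels = [i % num_sets for i in range(n_reflections)]
    random.Random(seed).shuffle(labels)  # fixed seed keeps runs repeatable
    return labels
```

          In Reflection Omit mode each set is left out in turn, which
          is why the run time grows in proportion to <numsets>.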

        LABIN FP=.. SIGFP=.. [PHIO=.. FOMO=..] [HLA=.. HLB=.. HLC=..
          HLD=..] [FDM=..] [PHIDM=..] [FOMDM=..] [FREE=..]
          Normally just the first four columns (FP,SIGFP,PHIO,FOMO)
          are input. However if you have Hendrickson-Lattman
          coefficients you may want to input these to the program as
          well (the difference is marginal except for SIR data). If
          you want to start from the end of a previous density
          modification calculation then the PHIDM, FOMDM columns are
          used.

          FP   = F magnitude

          SIGFP
               = standard deviation, 0 for unmeasured

          PHIO = best initial phase estimate

          FOMO = weight attached to PHIO

          If PHIO and FOMO are omitted, no phase recombination is
          performed.

          HLA-HLD
               = Hendrickson Lattman coefficients

          FDM,PHIDM,FOMDM
               = map coefficients of the starting map to which density
               modification is to be applied. e.g. from a previous
               density modification calculation (phase and weight) or
               difference map coefficients from SIGMAA (magnitude and
               phase). FDM must be on the same scale as FP.

          FREE = free-R flag (only used if <ncross> > 1)

        LABOUT PHIDM=.. FOMDM=.. [FCDM=.. PHICDM=..]
          Normally just the first two columns are output. Don't use
          the other two unless you are a very clever person.

          PHIDM
               = modified phase

          FOMDM
               = weight attached to PHIDM

          FCDM = F from final modified map before phase recombination

          PHICDM
               = Phase from final modified map before recombination

     OTHER KEYWORDS
        SKEL [ LENGTH <joinlen> <endlen> ] [ BFAC <bfac> ] [ EVERY
          <nskl> ]
          Perform iterative skeletonisation on the map. Cycles of
          skeletonisation are interspersed with cycles of conventional
          density modification.

          <joinlen>
               = length of skeleton in Angstrom/residue to generate
               between density peaks.

          <endlen>
               = length of skeleton in Angstrom/residue to generate in
               `trailing ends'.

          <bfac>
               = temperature factor to apply to the sharpened map
               before skeletonisation.

          <nskl>
               = apply skeletonisation instead of every <nskl>-th
               density modification cycle.
               (defaults <joinlen>=6.0 <endlen>=6.0 <bfac>=45
               <nskl>=3)
               See also the document `dm_skeletonisation'.

        AVERAGE <nncs> [REF [STEP <dr> <dphi>] [EVERY <nref>]]
          [OVERLAP]
          Set a NCS symmetry averaging operator. This card is followed
          by <nncs> rotation/translation matrices on subsequent lines
          in either CCP4 or O/RAVE format.

          CCP4 Formats (see also the program `lsqkab')
               ROTA EULER <alpha> <beta> <gamma>   (Euler angles)
               TRAN <t1> <t2> <t3>

          or   ROTA POLAR <omega> <phi> <kappa>    (Polar angles)
               TRAN <t1> <t2> <t3>

          or   ROTA MATRIX <r11> <r12> <r13> <r21> <r22> <r23> <r31>
               <r32> <r33>
               TRAN <t1> <t2> <t3>

          O/RAVE Format
               OMAT
               r11  r21  r31
               r12  r22  r32
               r13  r23  r33
               t1   t2   t3
               (note that the rotation matrix is transposed with
               respect to CCP4 matrix format)

          where
               x' = r11 x + r12 y + r13 z + t1
               y' = r21 x + r22 y + r23 z + t2
               z' = r31 x + r32 y + r33 z + t3

          These are the operations which map the density in the region
          covered by the input mask onto the other equivalent regions.
          The first operator must be the identity matrix.  The mask is
          input in CCP4 mask (map mode 0) format on the input file
          label NCSIN1, and should cover just one monomer or averaging
          domain, NOT the whole unit cell. The mask grid need not
          agree with the program grid.
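
          Because the O/RAVE rotation matrix is transposed relative to
          the CCP4 one, it is easy to enter an operator the wrong way
          round. The Python sketch below reads the twelve numbers of
          an OMAT block, rebuilds the matrix in the r11..r33 sense
          used in the equations above, and prints equivalent ROTA
          MATRIX and TRAN cards. The function names and the number
          formatting are illustrative assumptions.

```python
def parse_omat(values):
    """Convert 12 numbers in O/RAVE order into (rot, trans).

    O/RAVE lists the rotation one column per line
    (r11 r21 r31 / r12 r22 r32 / r13 r23 r33), then t1 t2 t3.
    """
    if len(values) != 12:
        raise ValueError("an OMAT block holds exactly 12 numbers")
    columns = [values[0:3], values[3:6], values[6:9]]
    # rot[i][j] holds r(i+1)(j+1), i.e. row i and column j.
    rot = [[columns[j][i] for j in range(3)] for i in range(3)]
    trans = list(values[9:12])
    return rot, trans


def apply_operator(rot, trans, xyz):
    """Return x' = R x + t for one coordinate triple."""
    return [sum(rot[i][j] * xyz[j] for j in range(3)) + trans[i]
            for i in range(3)]


def ccp4_cards(rot, trans):
    """Format the operator as ROTA MATRIX and TRAN cards (row order)."""
    flat = " ".join(f"{rot[i][j]:.8f}" for i in range(3) for j in range(3))
    shift = " ".join(f"{t:.5f}" for t in trans)
    return ["ROTA MATRIX " + flat, "TRAN " + shift]
```

          A quick check is to apply the operator to a point inside the
          mask and confirm that it lands in the expected NCS-related
          copy.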

          If you want to apply different NCS operations to different
          domains of the protein, use multiple AVER cards, and
          multiple input masks. The first AVER card corresponds to the
          mask on NCSIN1, the second to NCSIN2, etc. The masks should
          be defined in the same multimer in the unit cell, or at
          least in close proximity to one another.

          The REF, STEP and EVERY cards will enable refinement of the
          NCS rotation matrices between averaging cycles. The REF card
          enables the refinement of a particular set of NCS
          parameters. Note that the STEP card allows different
          refinement step sizes to be used for different domains;
          however, all but one EVERY card will be ignored. The refined
          matrices will be written out at the end of the log file.

          <dr> = step size for refinement of positional parameters in
               Angstrom.

          <dphi>
               = step size for refinement of rotational parameters in
               degrees.

          <nref>
               = the number of phase extension cycles between each
               parameter refinement.

          The OVERLAP card forces overlap removal for all NCS-masks.
          This was the default mode of operation for old versions of
          `dm' which did not support multimer masks; it must not be
          used if the NCS-mask covers more than one monomer. Note
          that the ncs-correlation statistics may be less reliable
          when using a multimer mask.
          (defaults <dr>=0.5 A, <dphi>=2.5 degrees, <nref>=3)
          See also the document `dm_ncs_averaging'

        GRID <nx> <ny> <nz>
          Set the grid for the calculation. You may want to do this if
          you want to include your own mask or dump a map or mask.
          (defaults: minimum efficient factors above Nyquist spacing)
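
          When choosing a grid by hand, a value in the spirit of the
          default can be found as follows. The Python sketch assumes
          that "efficient factors" means a grid count whose only prime
          factors are 2, 3 and 5, and that the Nyquist spacing is half
          the high resolution limit. It ignores any extra
          divisibility the spacegroup may require, and the function
          names are hypothetical.

```python
import math


def has_small_factors(n, primes=(2, 3, 5)):
    """True if n has no prime factors other than those listed."""
    for p in primes:
        while n % p == 0:
            n //= p
    return n == 1


def suggest_grid(cell_edge, d_min, primes=(2, 3, 5)):
    """Smallest FFT-friendly grid count sampling finer than d_min / 2."""
    if cell_edge <= 0 or d_min <= 0:
        raise ValueError("cell_edge and d_min must be positive")
    n = max(2, math.ceil(2.0 * cell_edge / d_min))  # Nyquist limit
    while not has_small_factors(n, primes):
        n += 1
    return n
```

          For a 100 Angstrom cell edge and data to 2.1 Angstrom this
          gives 96 points along that axis.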

        WANG <radius> <mode> [ LIMITS <rhomin> <rhomax> ]
          Set the averaging radius and mode for calculating the
          solvent mask.

          <radius>
               = radius of averaging sphere (Angstroms)

          <mode>
               = 0:  Use weighting scheme w=constant (Spherical top
               hat)

          <mode>
               = 1:  Use weighting scheme w=1-(r/R) (Wang's method)

          <mode>
               = 2:  Use weighting scheme w=1-(r/R)**2

          Heavy atoms can bias the mask calculation procedure,
          resulting in a mask of spheres around the heavy atom sites.
          The LIMITS card can be used to set the values at which the
          electron density is truncated before smoothing.  To truncate
          heavy atoms set <rhomax> to the maximum electron density due
          to non-heavy atoms at the appropriate resolution.
          (defaults <radius>=8.0 <mode>=1 <rhomin>=0.32 <rhomax>=2.0
          e/A^3)
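
          The three weighting schemes and the LIMITS truncation can be
          written out directly. In the Python sketch below the weight
          is assumed to be zero outside the averaging sphere, and the
          function names are hypothetical.

```python
def wang_weight(r, radius, mode=1):
    """Weight for a grid point at distance r from the sphere centre."""
    if radius <= 0:
        raise ValueError("radius must be positive")
    if r < 0 or r > radius:
        return 0.0  # assumed: no contribution from outside the sphere
    x = r / radius
    if mode == 0:
        return 1.0            # spherical top hat
    if mode == 1:
        return 1.0 - x        # Wang's method
    if mode == 2:
        return 1.0 - x * x
    raise ValueError("mode must be 0, 1 or 2")


def truncate_density(rho, rho_min=0.32, rho_max=2.0):
    """Clamp a density value to the LIMITS range before smoothing."""
    return min(max(rho, rho_min), rho_max)
```

          With the default radius of 8.0 Angstrom, a point 4.0
          Angstrom from the centre has weight 0.5 in mode 1 and 0.75
          in mode 2.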

        FLIP <flipfac> [TRUNC <fraction>]
          <flipfac>
               = amount by which to multiply density shifts with
               respect to solvent flattening. 1.0=flattening.
               2.0=flipping.

          TRUNC <fraction>
               = fraction of the protein region to truncate. The
               truncation level will be set so that this fraction is
               below it.
               (defaults: <flipfac>=2.0 <fraction>=0.3)
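
          The effect of <flipfac> and TRUNC on individual density
          values can be sketched in Python as below. The sketch
          assumes that the shift is measured towards the solvent mean
          and that truncated protein values are raised to the
          truncation level; both are illustrative assumptions, as are
          the function names.

```python
def flip_solvent(rho, solvent_mean, flipfac=2.0):
    """Shift a solvent density value by flipfac times the flattening shift."""
    return rho + flipfac * (solvent_mean - rho)


def truncation_level(protein_density, fraction=0.3):
    """Density level with the given fraction of protein values below it."""
    if not protein_density:
        raise ValueError("no protein density values supplied")
    ordered = sorted(protein_density)
    index = min(int(fraction * len(ordered)), len(ordered) - 1)
    return ordered[index]


def truncate_protein(protein_density, fraction=0.3):
    """Raise every protein value below the truncation level up to it."""
    level = truncation_level(protein_density, fraction)
    return [max(value, level) for value in protein_density]
```

          With <flipfac>=1.0 a solvent value is replaced by the
          solvent mean, and with <flipfac>=2.0 it is reflected to the
          other side of the mean.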

        REAL [SOLV <sx> <sy> <sz> <sr>] [PROT <px> <py> <pz> <pr>]
          Set the coordinates and radii (in Angstrom) of the spherical
          patches of density where the density modification
          constraints will be omitted in order to provide a real-free
          indicator of progress. If <sr> or <pr> is negative the
          Solvent or Protein free indicator will be omitted.
          (defaults: <sr>=4.0 <pr>=4.0, coordinates chosen from
          solvent mask)

        SCALE <scale> <bfac>
          Override internal scaling and scale input data by F^2 =
          <scale> * exp (<bfac> * s / 2.0) * F^2.  Scaling is critical
          to histogram mapping and Sayre's equation. In some cases you
          may want to override the B-factor, but run without this card
          first, and consider long and hard before changing scale.
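
          The scaling expression above can be evaluated for a single
          reflection as in the Python sketch below. It assumes that s
          is the squared reciprocal of the resolution, 1/d**2, which
          this document does not state explicitly, and the function
          name is hypothetical.

```python
import math


def scale_f_squared(f_squared, d_spacing, scale, bfac):
    """Apply F^2 -> scale * exp(bfac * s / 2) * F^2, assuming s = 1/d^2."""
    if d_spacing <= 0:
        raise ValueError("d_spacing must be positive")
    s = 1.0 / (d_spacing * d_spacing)
    return scale * math.exp(bfac * s / 2.0) * f_squared
```

          With the sign as written, a positive <bfac> increases the
          high resolution terms relative to the low resolution ones.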

     LOOKING AT YOUR OUTPUT
          Look at the free-R factor: This is listed both in the course
          of the output, and also at the end in an Xloggraph table.
          Expect some noise from cycle to cycle in the Free-R unless
          you are using COMBINE FREE with <ncross> of 2 or greater.

          The Xloggraph output, as well as showing the free-R factor,
          gives some information on the quality and completeness of
          the input data, and also a plot of the data fit against a
          standard protein data set.

          For NCS-averaging calculations, correlations are calculated
          between related areas of density. These are summarised at
          the end of the log file, and error or warning messages will
          be generated if the initial values are too low: this is a
          good indication of errors in the input matrices or mask.

     COMMON PROBLEMS
          A NCS-averaging mask file may not cover a volume larger than
          the unit cell, otherwise the following error is generated:
                  `ccpmskin - Mask file bigger than unit cell?!'
          This is unlikely to happen except in spacegroups with very
          low symmetry (P1, P2, P21). If it does then it is likely
          that the mask is padded with large borders of zeros, or
          that it covers more than one monomer, or that there is
          significant overlap between symmetry equivalents of the
          mask. Check the volume of the `set' area of the mask: if it
          is much bigger than the volume of a single molecule then the
          mask is certainly at fault.

     AUTHOR
          Kevin D. Cowtan, Department of Chemistry, University of York
          email: cowtan@yorvic.york.ac.uk

     REFERENCES
          1.   Baker D., Bystroff C., Fletterick R., Agard D. (1993)
               Acta Cryst D49 429-439

          2.   Bricogne, G. (1974) Acta Cryst A30 395-405

          3.   Brunger, A. T. (1992) Nature 355, 472-474.

          4.   Cowtan K. D., Main, P. (1993) Acta Cryst D49 148-157

          5.   Sayre, D. (1974) Acta Cryst A30 180-184

          6.   Schuller D. (1995) in preparation

          7.   Swanson, S. (1994) Acta Cryst D50 695-708

          8.   Wang, B. C. (1985) Methods in Enzymology 115, 90-112

          9.   Zhang, K. Y. J., Main P. (1990) Acta Cryst A46 377-381

          10.  Abrahams, J. P. (1995) Acta Cryst D51 (in press)

          11.  Cowtan, K. D., Main, P. (1995) Acta Cryst D51 (in
               press)

     SEE ALSO
          cad(1), lsqkab(1), xloggraph(1), dm_skeletonisation.doc,
          dm_ncs_averaging.doc.

     EXAMPLES
          #
          #[ a simple solvent/histogram calculation  ]
          #

          dm      hklin gmto.mtz  hklout gmtodm.mtz  << my-data
          SOLC 0.35
          MODE SOLV HIST
          NCYCLE 10
          LABIN FP=FP SIGFP=SIGFP PHIO=PHIB FOMO=FOM
          LABOUT PHIDM=PHI1 FOMDM=W1
          my-data

          #
          #[ a better solvent/histogram calculation,  ]
          #[ takes 20x as long, but gives a great map ]
          #[ using reflection omit                    ]
          #

          dm      hklin gmto.mtz  hklout gmtodm.mtz  << my-data
          SOLC 0.35
          MODE SOLV HIST
          NCYCLE AUTO
          SCHEME ALL
          COMBINE OMIT
          LABIN FP=FP SIGFP=SIGFP PHIO=PHIB FOMO=FOM FREE=FreeR_flag
          LABOUT PHIDM=PHI1 FOMDM=W1
          my-data

          #
          #[ a quick solvent flipping calculation,    ]
          #[ very fast and gives a good map using     ]
          #[ Solomon mode                             ]
          #

          dm      hklin gmto.mtz  hklout gmtodm.mtz  << my-data
          SOLC 0.35
          MODE FLIP
          NCYCLE AUTO
          SCHEME ALL
          COMBINE SIGMAA
          LABIN FP=FP SIGFP=SIGFP PHIO=PHIB FOMO=FOM FREE=FreeR_flag
          LABOUT PHIDM=PHI1 FOMDM=W1
          my-data

          #
          # NON-CRYSTALLOGRAPHIC SYMMETRY AVERAGING
          #[ a three fold averaging calculation      ]
          #[ This could also be done in Solomon mode,]
          #[ or Omit mode if you have enough time    ]
          #

          dm   hklin chmimir.mtz hklout dmchm.mtz   \
               ncsin1 chmi.msk                      \
               << MY-DATA
          SOLC 0.52
          RESO 1000.0 2.1
          NCYC 10
          MODE SOLV HIST AVER
          SCHEME AUTO
          AVER 3 REF
          ROTA POLAR  0.0  0.0  0.0
          TRANS  0.0  0.0  0.0
          ROTA POLAR  113.28130 103.41944 120.33858
          TRANS  43.635 38.059 62.726
          ROTA POLAR   66.58067 -76.78019 119.69176
          TRANS  82.989 15.401 -8.928
          LABI FP=F SIGFP=SIGF PHIO=PHIB FOMO=FOM
          LABO PHIDM=PHIDM FOMDM=FOMDM
          END
          MY-DATA

          #
          # MULTI-DOMAIN AVERAGING
          #[ a two fold averaging calculation with   ]
          #[ two domains and refinement of the 2nd   ]
          #[ set of averaging matrices.              ]
          #[ WARNING: IF YOU DON'T KNOW WHAT MULTI-  ]
          #[ DOMAIN AVERAGING IS, YOU DON'T NEED IT  ]
          #

          dm  hklin hpattj.mtz    hklout dm1.mtz      \
              ncsin1 cwnads.mask  ncsin2 cwglobs.mask \
              << EOF-dm
          SOLC 0.57
          MODE SOLV HIST AVER
          NCYCLE 40
          AVERAGE 2
           1.0 0.0 0.0
           0.0 1.0 0.0
           0.0 0.0 1.0
           0.0 0.0 0.0
              -0.71389002    -0.69492584     0.08611962
              -0.69635397     0.69129372    -0.19136506
               0.07357326    -0.19652288    -0.97735721
             115.37364197    54.98566055    67.00005341
          AVERAGE 2 REF
           1.0 0.0 0.0
           0.0 1.0 0.0
           0.0 0.0 1.0
           0.0 0.0 0.0
               0.75830859     0.65183645     0.00883542
               0.65189570    -0.75824565    -0.00975925
               0.00033828     0.01316060    -0.99991322
              17.30371666   -47.10081482    68.99727631
          LABIN FP=FP SIGFP=SIGFP PHIO=PHIml FOMO=FOMml -
          HLA=HLA HLB=HLB HLC=HLC HLD=HLD
          LABOUT PHIDM=PHIDM FOMDM=FOMDM
          EOF-dm

          #
          # NOTE: If you don't know what multi-domain averaging is,
          # you don't need it. Use the ncs averaging example, not
          # the multi-domain example.
          #