
Wedger Elves' guide to their scripts

1) In a rush?  Just type this:

mosflm.com integ >! logs/mosflm.log
merge.com >! logs/merge.log


That should do it.  

The logs from these runs (logs/mosflm.html and logs/merge.log) can be viewed in 
CCP4's xloggraph program or netscape for a graphical view of your data reduction
parameters.

Your scaled and merged data will end up in merged.mtz (in CCP4 format).
To convert it to other formats, read on...



2) Okay, how does it work?

    Wedger Elves do a lot of fancy footwork to get mosflm running properly and
smoothly.  It is HIGHLY recommended that you use Wedger Elves, and let them optimize
your parameters as much as they can before you start messing around with these
scripts yourself.  The scripts are more likely to run properly if you do.  

In this directory (/home/jamesh/projects/workshop_fakedata/final/process/badsignal):

################################################################################
mosflm.com	- mosflm script
    reads: x-ray images (/data2/jamesh/workshop/final/badsignal_1_###.img)
           auto.mat	- the crystal's orientation matrix (mosflm format)
    makes: raw.mtz	- integrated spot intensity data (CCP4 mtz format)
           postref.mat	- the refined crystal orientation matrix 
    usage: mosflm.com [integ]
    
    example: mosflm.com

    The mosflm.com script is meant to be a bare-bones script for running 
    A.G.W. Leslie's ipmosflm program. If you don't have mosflm, you can 
    download it for free from:

    netscape ftp://ftp.mrc-lmb.cam.ac.uk/pub/mosflm/
    
    As long as the Wedger Elves could find the ipmosflm executable, and
    some x-ray data, they should have set up mosflm.com to at least run.
    Remember, a mosflm matrix must be obtained from autoindexing (or from
    another wedge of data collected in the same data collection session)
    before mosflm.com will run (see autoindex.com below).

    NOTE: THIS SCRIPT HAS TWO MODES! 

    1) if this script has a "POSTREF SEGMENTS" line in it, then it is
    set up to only refine your crystal and camera parameters.  It will 
    produce the refined crystal orientation in postref.mat.  This file
    should be copied to auto.mat if you want to re-input this orientation
    into the next mosflm run.

    2) if this script has no "POSTREF SEGMENTS" line, and only one "PROCESS"
    line, then it is set up to integrate.  It will produce a list of measured
    spot intensities in raw.mtz.

    To convert from a refining script to an integrating script, delete the
    "POSTREF SEG" line, and make sure there is only one "PROCESS" line indicating
    all of the data in your wedge.  
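
    The two modes can be told apart by looking for the keyword.  The following
    is a minimal sketch in plain sh, not something the Wedger Elves run
    themselves.  The function name mosflm_mode is hypothetical, and it only
    tests for a line that starts with the "POSTREF SEG" text described above:

```shell
# Report which mode a mosflm script is set up for, based only on whether
# it contains a line starting with "POSTREF SEG" (hypothetical helper).
mosflm_mode() {
    if grep -q "^[[:space:]]*POSTREF SEG" "$1"; then
        echo "refine"
    else
        echo "integrate"
    fi
}
```

    Run it as "mosflm_mode mosflm.com".  Lines that do not start with the
    keyword (such as commented-out ones) are ignored.  After a refining run,
    "cp postref.mat auto.mat" re-inputs the refined orientation, as noted
    under mode 1.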

    See the mosflm manual for more details on how to edit and run mosflm scripts.


################################################################################
merge.com	- semi-intelligent merging script

    reads: raw.mtz	- an mtz of integrated spot intensity data
    makes: merged.mtz	- an mtz of averaged structure factor data 
    usage: merge.com [SG] [raw.mtz] [1.8A] [120aa]
    where: 
    SG	    is the space group to apply in merging  (default: P6122)
    raw.mtz contains the (unmerged) data to merge   (default: raw.mtz)
    1.8A    is the desired outer resolution limit   (default: 2)
    120aa   is the number of amino acids in the asymmetric unit (for truncate)

    Example: merge.com P6122 raw.mtz 3A 100aa
    will scale and merge the data in raw.mtz out to 3.0 A with the symmetry
    operators from P6122, and run truncate with an ASU of 100aa.  
    
    This script is meant to get you started with scaling and merging your data.
    Wedger Elves will examine the results of this script to see if scale smoothing
    might be appropriate, and will try to apply it if it is.
    
    If a space group is provided, the CCP4 program reindex will be applied to
    the raw data before scaling and merging.  The outer resolution limit can
    also be specified on the command-line, as well as the number of amino acid
    residues in the asymmetric unit.  This latter value is used by truncate to
    try to put the final structure factors on an absolute scale (electron units).
    It is not critical, but it is a good habit to get into.
    
    Although merge.com was meant for the raw data produced from the Wedger Elves
    scripts, it should be applicable to almost any unmerged mtz data.

################################################################################
autoscala		- optimizer for SDCORR card
    reads: a scala script
    makes: a better scala script
    usage: autoscala script.com
    where: script.com is the scala script to optimize

    example: autoscala merge.com

    Scala's SDCORRECTION card allows the assigned error (sigma) of the spot 
    intensities to be edited.  Most measurement programs cannot predict the
    effects of absorption and other systematic measurement errors, and therefore
    usually give unrealistically low estimates of the error in the measured
    spot intensities.  You should read the scala documentation to find out 
    exactly how SDCORR works.
    Briefly, "correct" sigmas should be similar to the scatter of observed intensities.
    That is, if the 10 observations of hkl=(5,9,12) deviate from the average value
    of (5,9,12) by 100 units (rms), then the sigma of (5,9,12) should be 100.  So, 
    if the assigned sigma is 50, then the scatter/sigma will be 2.  This analysis, 
    grouped by intensity bins, is the last graph in the scala logfile.  You want 
    all the points on this graph to be as close to 1.0 as possible.  If you see this, 
    then your assigned sigmas are probably realistic.
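
    The arithmetic in this example can be reproduced with a few lines of awk.
    This is only an illustrative sketch, not how scala computes its graph.  The
    function name scatter_over_sigma is hypothetical, and dividing by N (rather
    than N-1) when taking the rms is an assumption:

```shell
# Ratio of rms scatter to assigned sigma for one reflection.
# stdin: one observed intensity per line; $1: the assigned sigma.
scatter_over_sigma() {
    awk -v sigma="$1" '
        { x[NR] = $1; sum += $1 }
        END {
            if (NR == 0 || sigma <= 0) exit 1
            mean = sum / NR
            for (i = 1; i <= NR; i++) ss += (x[i] - mean) * (x[i] - mean)
            # rms deviation from the mean (dividing by N is an assumption)
            printf "%.2f\n", sqrt(ss / NR) / sigma
        }'
}

# Two observations scattering by 100 (rms) with an assigned sigma of 50:
printf '900\n1100\n' | scatter_over_sigma 50    # prints 2.00
```
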
    To save you from hours of diddling with the SDCORR numbers, autoscala uses a 
    "Golden-Section" search (derived from Numerical Recipes), to optimize the three 
    numbers for scala's SDCORRECTION card, using the deviation of the aforementioned
    graph from 1.0 as a target.  In CCP4 3.3 and beyond, the first number on the SDCORR 
    card is optimized internally (and might as well be "1"), but the remaining two can 
    be tuned up by autoscala.
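
    For readers unfamiliar with it, a golden-section search narrows a bracket
    around the minimum of a one-parameter function.  The sketch below shows the
    general technique only, on a toy target f(x) = (x - 2)^2.  It is not
    autoscala's code: autoscala's real target is the deviation of the scala
    graph from 1.0, and the name golden_min is hypothetical.

```shell
# Golden-section search for the minimum of a one-parameter function on
# the interval [$1, $2].  The target f(x) = (x - 2)^2 is a toy stand-in.
golden_min() {
    awk -v a="$1" -v b="$2" '
        function f(x) { return (x - 2) * (x - 2) }
        BEGIN {
            r = (sqrt(5) - 1) / 2              # inverse golden ratio, ~0.618
            c = b - r * (b - a); fc = f(c)
            d = a + r * (b - a); fd = f(d)
            while (b - a > 1e-6) {
                if (fc < fd) {                 # minimum lies in [a, d]
                    b = d; d = c; fd = fc
                    c = b - r * (b - a); fc = f(c)
                } else {                       # minimum lies in [c, b]
                    a = c; c = d; fc = fd
                    d = a + r * (b - a); fd = f(d)
                }
            }
            printf "%.3f\n", (a + b) / 2
        }'
}

golden_min 0 5    # prints 2.000
```

    Each pass reuses one of the two interior points, so only one new function
    evaluation is needed per step.  That matters when every evaluation is a
    full scala run.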

################################################################################
SGsearch.com		- exhaustive space-group search
    reads: a scala script
    makes: a table of merging statistics
    usage: SGsearch.com [script.com] [raw.mtz] [rootSGs]
    where: 
	script.com is the scala script to use        (default: merge.com)
	raw.mtz    is the raw, unscaled data         (default: raw.mtz)
	rootSGs    is/are the "starting" space group (default: SG from raw.mtz)

    example: SGsearch.com merge.com P212121

	will run merge.com with every orthorhombic space group:
	    P222, P2221, (P2122, P2212), P21212, (P21221, P22121), and P212121

    Picking the wrong space group has been known to waste weeks to years of an 
    investigator's time.  SGsearch.com uses the space group provided to get the
    general crystal system your crystal was indexed with, and will then try 
    merging your data in EVERY space group belonging to that crystal system.
    The Rmerge, systematic absences, and asymmetric unit volume will be presented
    in a neat table for your review.  
    
    The actual logs from the individual merge.com runs will be placed in the ./logs/
    directory, named merge.SG.log.  If SGsearch.com finds that these logs already
    exist, it will use the statistics in them to make the table.  This usually saves
    a lot of time when re-generating the table, and you can always delete these logs
    and run SGsearch.com again.
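
    Deleting the cached logs can be sketched as follows.  The helper name
    clear_sg_logs and its optional directory argument are hypothetical; only
    the merge.SG.log naming comes from the description above:

```shell
# Remove the per-space-group logs (merge.SG.log) so that the next
# SGsearch.com run regenerates them.  merge.log itself is left alone,
# because the pattern needs something between "merge." and ".log".
# $1 = log directory (defaults to ./logs).
clear_sg_logs() {
    rm -f "${1:-logs}"/merge.*.log
}
```

    For example: "clear_sg_logs" followed by "SGsearch.com merge.com P212121".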
    
    SGsearch.com is designed to work with the merge.com provided by Wedger Elves, 
    but should work fine with any scala/truncate script that is capable of 
    accepting and applying a space group provided on its command line.

################################################################################
Patt.com		- basic Patterson script
    reads: merged.mtz
    makes: Patt.map
    usage: Patt.com merged.mtz [merged2.mtz]
    where: 
	merged.mtz   is the merged mtz file (containing DANO or F)
	merged2.mtz  is another merged mtz file (containing F)

    examples:
	Patt.com merged.mtz
	  - will calculate a Patterson of DANO in merged.mtz
	Patt.com merged.mtz ../wedge2/merged.mtz
	  - will calculate a Patterson of F-F between the two mtzs

    description:
	Patt.com will have to be edited for most customizations.  It is
	really just a basic framework for calculating Pattersons.  Data
	sets are expected to be named "DANO" or "F", and changing
	the resolution or difference cutoffs must be done by editing the
	top of the script.  Patt.com exists for your convenience in
	calculating preliminary Pattersons as your data are being processed.

################################################################################
mtz2various.com		- basic format-converter script
    reads: merged.mtz
    makes: outfile.EXT
	EXT -> FORMAT
	cif -> CIF
	hkl -> shelx
	tnt -> TNT
	fin -> XtalView
	phs -> XtalView
	fobs-> XPLOR
	cv  -> XPLOR
	cns -> CNS
    usage: mtz2various.com merged.mtz outfile.EXT [format]
    where: 
	merged.mtz   is the merged mtz file (containing Fs)
	outfile.EXT  is the filename you want to use for the exported data
	format	     is the (optional) program you want outfile.EXT formatted for

    examples:
	mtz2various.com merged.mtz merged.cif
	  - will convert merged.mtz to CIF format
	mtz2various.com all.mtz "F1" merged.fobs
	  - will convert "F1" in all.mtz to XPLOR format
	mtz2various.com merged.mtz merged.hkl shelx
	  - will convert merged.mtz to shelx format
	mtz2various.com merged.mtz merged.hkl tnt
	  - will convert merged.mtz to TNT format

    description:
	mtz2various.com is a general-purpose "smart" script for converting
	"F" data from an mtz file (such as merged.mtz) to other file formats
	for other non-CCP4 programs.  The format of the output file can either
	be implied by using a standard file extension in the output file name,
	or declared explicitly and separately on the command line.  Free-R 
	flags are exported automatically, if they are present.  In the case of 
	XtalView files, a suitable CRYSTAL file is also generated.
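
	The way an extension implies a format can be pictured as a simple
	lookup.  This sketch only restates the EXT -> FORMAT table above and
	is not taken from mtz2various.com itself; the name format_for is
	hypothetical:

```shell
# Map an output file name to the format implied by its extension,
# following the EXT -> FORMAT table (hypothetical helper).
format_for() {
    case "${1##*.}" in
        cif)      echo "CIF" ;;
        hkl)      echo "shelx" ;;
        tnt)      echo "TNT" ;;
        fin|phs)  echo "XtalView" ;;
        fobs|cv)  echo "XPLOR" ;;
        cns)      echo "CNS" ;;
        *)        echo "unknown"; return 1 ;;
    esac
}
```

	An explicitly declared format takes the place of this lookup, as in
	the "merged.hkl tnt" example above.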

################################################################################
autoindex.com		- mosflm autoindexing script
    usage: ./autoindex.com
    
    The autoindex.com file was created for your convenience if you want
    to resort to manually tinkering with autoindexing in mosflm.  It can
    also be edited (and renamed) to work as a script for a full mosflm
    run to integrate your data, or refine cell parameters.
    
################################################################################
strategy.com		- mosflm strategy script
    usage: ./strategy.com
    
    The strategy.com file was created for your convenience if you want
    to resort to manually tinkering with STRATEGY in mosflm.  Changing 
    things like mosaicity, distance, and 2theta can all affect the results
    of the strategy calculation.
    You can also run STRATEGY interactively by editing the strategy.com
    file.  Just un-comment the "IMAGE" and "GO" lines, and run the script.
    
    
################################################################################

3) Data files
################################################################################
best.mat	- mosflm's crystal orientation matrix

    This file contains the orientation information (and cell dimensions) of
    your crystal.  It is periodically updated by Wedger Elves, as the crystal
    orientation is refined.  
    As long as you are working with the same crystal setting, you can point
    Wedger Elves to a copy of best.mat from another wedge instead of autoindexing
    again.  best.mat holds regardless of phi, distance, 2theta, and other
    camera parameters.

################################################################################
raw.mtz		- raw, unmerged intensity data

    This is a multirecord (unmerged) mtz file produced by the last "integration"
    run of mosflm.com.  It should contain the raw intensity measurements for
    all the spots in the processed wedge.  
    This file is meant to go into scala, or Scaler Elves.

################################################################################
merged.mtz	- scaled and merged data from the wedge

    This is a "standard" mtz file containing F, SIGF, DANO, SIGDANO for
    the data measured in the wedge just processed by Wedger Elves.  It is
    produced from raw.mtz by merge.com.

################################################################################
rejected_spots.txt  - list of spot observations rejected during scaling

    "Outlier" observations that did not agree with other symmetry-equivalent
    observations are listed here (by merge.com).  However, this list is given
    in the context of the "other" observations.  It is usually a good idea for
    you to visually inspect these rejected spots and make sure there really was
    something wrong with them (e.g. they fell in the beamstop shadow).  The quickest
    way to do this is with Spotter Elves:
    Spotter logs/merge.log /data2/jamesh/workshop/final/badsignal_1_001.img

################################################################################
Patt.map	- anomalous difference Patterson

    By default, Wedger Elves create a "standard" anomalous-difference Patterson
    map in Patt.map by running Patt.com (above).  If you do not have access to
    a graphics program for viewing the map, logs/patt.log should contain a peak-
    pick of this map that you can review.

################################################################################

    


4) What if something goes wrong?

    The Wedger Elves have been trained to handle a number of common problems 
encountered in using mosflm.  However, if something happens that
is beyond their experience, it's up to you to figure it out.  :(  Still, please
email jamesh@ucxray6.berkeley.edu about your problem.

The mosflm manual is available from:

netscape ftp://ftp.mrc-lmb.cam.ac.uk/pub/mosflm/


5) Tips and tricks:


If you search for the words "Ding" and "Dang" in Wedger Elves output, 
you can instantly see the rate at which mosflm runs have been successful or
crashing.
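
One way to do the search, assuming the Wedger Elves output has been saved to a
file, is with grep.  The function name ding_dang is hypothetical, and the sketch
simply reports how many lines mention each word:

```shell
# Count the lines mentioning "Ding" and "Dang" in a saved log.
# $1 = file containing the Wedger Elves output.
ding_dang() {
    printf 'Ding: %s\n' "$(grep -c 'Ding' "$1")"
    printf 'Dang: %s\n' "$(grep -c 'Dang' "$1")"
}
```
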
---

Use the following awk program to average a bunch of numbers:
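
A minimal sketch of such a program, wrapped in a shell function.  It assumes one
number per line (taken from the first field), and the name "average" is
hypothetical:

```shell
# Average the first field of every non-blank input line.
# Reads the files named as arguments, or stdin if there are none.
average() {
    awk 'NF { sum += $1; n++ } END { if (n) print sum / n }' "$@"
}

printf '1\n2\n3\n6\n' | average    # prints 3
```
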
---

Uncommenting the following lines:
#IMAGE blah blah blah
#GO
in mosflm.com will turn it into a fully-interactive mosflm script.  You run it, 
and the mosflm graphics window pops up, giving you interactive control.
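
The edit can also be made with sed instead of by hand.  This is a sketch that
assumes the two lines begin exactly with "#IMAGE" and "#GO" as shown.  The
function name make_interactive and the output file name are hypothetical:

```shell
# Write a copy of a mosflm script with its "#IMAGE" and "#GO" lines
# uncommented.  $1 = input script, $2 = output script.
make_interactive() {
    sed -e 's/^#\(IMAGE\)/\1/' -e 's/^#\(GO\)/\1/' "$1" > "$2"
}
```

For example: "make_interactive mosflm.com interactive.com", then make
interactive.com executable and run it.  Writing to a copy leaves the original
mosflm.com untouched.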
---

