Goodies...

This page gives you "direct access" to some of the tools and scripts that Elves use to work on your structures. They are included here in "free-standing" form, so you can just download and use them for "power" situations where you want to use some, but not all, of Elves' features. Be warned, however, that some editing of these scripts may be required in order to get them to run on your system (since Elves won't be there to help you).

Smart Scripts

The following scripts are unoptimized copies of the "smart" scripts that are written and used by Elves. After an Elves run is finished, there should be a copy of each of these scripts in a directory called "scripts" in the main directory you initially ran Elves from. When Elves write these scripts, they set them up with defaults appropriate for your particular project. However, each of these scripts is designed to be as easy as possible for humans to read, modify and adapt to other projects, without having to rely on Elves to re-generate them for you. For this reason, they are included here for people who are interested in a "medium" level of automation in their projects.

In addition to being runnable in the "traditional" way of rewriting the script for each application, these "smart" scripts have a sophisticated procedure (hidden away down at the bottom of the script) for intelligently reading their command line, and scanning their input data files for commonly-changed parameters. This sounds a little un-customizable, but it turns out to be very handy, and time-saving in the long run. For example, how often have you had to go digging in your fft script to change the name of the "F" dataset you were using? Wouldn't it be nice if you had a script that could just "know" that there was only one F in the mtz file anyway, and use it? These scripts do things like that. In fact, they can sometimes "outsmart" you. For example, if you specify a certain "F" that just isn't there in the mtz file, they will ignore you, and use one that is. This does, admittedly, add an element of unpredictability, but all you have to do is look at the top of the logfile produced by the "smart" script to see which data columns it is using.

fft.com

is a general-purpose phased electron-density map calculator, which is written and used by Phaser Elves. It produces a CCP4 map file called "ffted.map", which covers one standard CCP4 asymmetric unit (and is normalized so that its mean=0 and sigma=1). If mapman is available, a dsn6 (o format) version of the map, extended to cover 170% of the unit cell, is also generated and called "ffted.omap". A standard bones trace is also done, and written to "bones.o". If a pdb file is provided, then the dsn6 map is extended to cover the pdb file. In either case, an o macro called "ffted.omac" is written to explicitly load and view all these files in o.

usage: fft.com mtz/best_phased.mtz pdb/build1.pdb

calculates a map from the "best" F in mtz/best_phased.mtz (the one that is most complete, highest resolution, and best <F>/<SIGF>), and the most recently added Phase and figure-of-merit (usually PHIDM and FOMDM). If any of the data columns in the input mtz file are named on the command line, they will be used instead of the automatically-chosen ones.

dm.com

is a general-purpose solvent-flattening script, which is written and used by Phaser Elves. It looks for an F and Phase in exactly the same way as fft.com (above), except that it also checks for a solvent content on the command line, and for H/L coefficients.

usage: dm.com 40% mtz/mlphared.mtz

performs solvent-flattening on the mlphare results in mtz/mlphared.mtz, using a solvent content of 40%. The highest resolution F in mtz/mlphared.mtz will be automatically chosen for input into dm, as well as any H/L coefficients, or Phase/FOM columns found therein. If any of the data columns in mtz/mlphared.mtz are named on the command line, they will be used instead of the automatically-chosen ones.

merge.com

a general scala/truncate script that scales and merges a single, multirecord mtz raw data file, and then runs truncate and unique to fill in any missing data in the output file: "merged.mtz". Therefore, an mtzdmp of this file will give the true completeness of the data.

usage: merge.com P2221 raw.mtz 2.0A

will merge the raw data in raw.mtz out to 2.0A, using P222₁ symmetry. The output, merged file will be merged.mtz.

System Tools

realnice

Use this when "nice" just isn't nice enough. Realnice is a watchdog program that will "STOP" (not kill) an indicated process when someone is logged in on the computer's console (and has not been idle for more than 10 minutes). The job will be "CONTinued" when the console becomes idle again, and/or the person logs off. Helps prevent irate labmates who are trying to use o, or other interactive programs while you are running big, long jobs in the background. The watched process is specified as a text string on the realnice command line, and the first process matching this string in a "ps -f" command is monitored. Realnice only watches a single process group, so if multiple copies of the same program are running, the text string should just be a process ID. Programs like o, which can make the console "look" idle when it is not, can be explicitly named as "dominant" programs (listed after the watched text string). The watched program will be stopped as long as one of these "dominant" programs is running.

usage: realnice refmac ono

stops your refmac job whenever someone is on the console (and not idle) or running a program whose name contains "ono"
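The stop/continue rule described above can be sketched in a few lines. This is only an illustration of the decision logic, not the realnice source: the function name wanted_signal and its three arguments are made up for this example, and how realnice actually detects console activity is not shown here.

```python
import signal

IDLE_LIMIT_MIN = 10  # "not idle for more than 10 minutes", from the description above


def wanted_signal(console_user, idle_minutes, dominant_running):
    """Return the signal a realnice-like watchdog would send (hypothetical helper).

    console_user     -- True if someone is logged in on the console
    idle_minutes     -- how long the console has been idle
    dominant_running -- True if a "dominant" program (e.g. o) is running
    """
    busy = dominant_running or (console_user and idle_minutes <= IDLE_LIMIT_MIN)
    # STOP suspends the job without killing it; CONT lets it resume.
    return signal.SIGSTOP if busy else signal.SIGCONT
```

A watchdog loop would then do something like os.kill(pid, wanted_signal(...)) every few seconds.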

sendhome

Sendhome is a file transfer program, designed for sending x-ray image data home from a synchrotron site in real time. The sendhome script continuously watches for newly-created images, and then sends them to a specified remote machine when they appear. The transfer is done through an ssh login session, so, as long as you can log on to the remote computer using ssh, the program will work, and the remote password only needs to be given once. Transfers are also compressed, so most x-ray images are sent ~3X faster than FTP. The syntax is similar to scp:

usage: sendhome /data/user/frame_001.img user@remotehost.college.edu:/bigdisk/user/frames

will send frame_001.img and any images in /data/user (on the local machine) newer than frame_001.img to /bigdisk/user/frames on remotehost.college.edu, using "user"'s account. However, sendhome is (at the moment) "upload-only", so you can't use it to transfer remote files to the local machine. To do that, you should ssh to the remote machine, and "sendhome" back to your local host. The ssh login session is conducted in the usual way, except that, instead of a remote command prompt, you see the remote tar job unpacking your files. Once the transfer has started, you can press <Ctrl>-Z ("Ctrl" key and "Z" key), followed by the "bg" unix command to continue transferring files in the background.

For the technically curious, the transfer is done something like this:

ls -1rt /data/user | (cd /data/user ; tar cf - - ) | compress | ssh user@remotehost.college.edu "(cd /bigdisk/user/frames ; uncompress | tar xvf - )"

Unfortunately, sendhome only works on SGIs and Linux machines. OSF1 users are out of luck unless they get a better tar program. Also, you MUST have ssh installed on both ends in order for sendhome to be secure!

Visualization Tools

moviefy

Moviefy is a program intended for summarizing x-ray image graphics as movies, but it can be used to convert just about any sequence of graphics images into a movie. On SGIs, the dmconvert program is used (and it's only available on Irix 6.x). However, moviefy can employ ImageMagick on any unix platform.

usage: moviefy /data/user/frame_???.img

will convert all the frames indicated into an SGI movie. You can also specify a particular region of the detector face using "-box" and you can also control the zoom and normalization factors:

usage: moviefy /data/user/frame_???.img -box 100 200 900 1000 -zoom 0.5 -scale 0.2

will convert the area between the pixel coordinates (100,200) and (900,1000) on all the frames indicated into an SGI movie. The size of the image will be rescaled (zoomed) so that each pixel on the detector becomes 0.5 pixels in the movie. Also, the value of the (usually 16-bit) detector pixel will be multiplied by 0.2 before it is converted to the (8-bit) greyscale movie file. By default, a zoom of 0.25 is used, and the scale is determined automatically for each image. To "auto-scale" for only the first image, and keep that scale for the rest, you would say "-scale same".

The moviefy script is used by the Spotter Elves to make their movies of important spots. To work quickly, moviefy requires the image converter binaries adsc2pgm and osc2pgm (below), but it does contain its own embedded copies of the binaries and source, so it will always work on a computer with an ANSI compatible C compiler installed and licenced.

xplot

Xplot is a short awk program for reformatting a table of numbers to be displayed in the CCP4 program xloggraph.

Files like this:

Rcrys Rfree
33.23 35.25
30.45 33.89
25.30 30.21
22.18 28.67

Become this:

27.79 +/- 4.31107
32.005 +/- 2.66587
$TABLE : - Plots:
$GRAPHS:Values by line:A:1, 2, 3:
:Values vs. 1st column:A:2, 3: $$
line Rcrys Rfree $$
$$
1 33.23 35.25
2 30.45 33.89
3 25.30 30.21
4 22.18 28.67
$$
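For anyone who would rather see the reformatting spelled out, here is a rough Python re-creation of what the example above does. The real xplot is an awk program; this sketch is only inferred from the sample input and output, including the assumption that the "+/-" value is the population standard deviation (which is what reproduces the numbers shown). The function name to_xloggraph is made up.

```python
from statistics import mean, pstdev


def to_xloggraph(text):
    """Turn a whitespace table (header row + numeric rows) into xloggraph text."""
    rows = [line.split() for line in text.splitlines() if line.strip()]
    header, data = rows[0], rows[1:]
    out = []
    # One "mean +/- spread" line per column, as in the sample output.
    for col in range(len(header)):
        values = [float(r[col]) for r in data]
        out.append(f"{mean(values):.6g} +/- {pstdev(values):.6g}")
    ncol = len(header) + 1  # the added "line" column comes first
    all_cols = ", ".join(str(i) for i in range(1, ncol + 1))
    rest_cols = ", ".join(str(i) for i in range(2, ncol + 1))
    out.append("$TABLE : - Plots:")
    out.append(f"$GRAPHS:Values by line:A:{all_cols}:")
    out.append(f":Values vs. 1st column:A:{rest_cols}: $$")
    out.append("line " + " ".join(header) + " $$")
    out.append("$$")
    # Original values are echoed as text, prefixed with the line number.
    for n, r in enumerate(data, start=1):
        out.append(f"{n} " + " ".join(r))
    out.append("$$")
    return "\n".join(out)


sample = """Rcrys Rfree
33.23 35.25
30.45 33.89
25.30 30.21
22.18 28.67
"""
result = to_xloggraph(sample)
```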

usage: xplot data.list >! data.xplot ; xloggraph data.xplot

will produce an xloggraph "version" of each column of numbers in data.list.

Rplot.com

Rplot.com is a jiffy for displaying vital statistics from one or more refmac logs. The quantities R_cryst, R_free, R_free - R_cryst, correlation coefficients, bond deviations, angle deviations, and total number of atoms refined are listed in an xloggraph-readable format. It also works on logs from wARP refinements that use refmac. When multiple logs are provided, they are sorted by creation date and the values from the last refinement step in each file are listed against the number extracted from the log filename. For example:

usage: Rplot.com logs/refmac*.log >! Rplot.xlog

would produce a file called Rplot.xlog that can be displayed graphically by xloggraph.

Drift.com

Drift.com is a jiffy for displaying movement statistics from two or more pdb files, and is usually used to see how much the model is moving in an x-ray refinement run. The rms change in XYZ position and B-factor are listed for C-alpha as well as for all atoms in an xloggraph-readable format. The maximum shifts in XYZ and B are also listed. Provided pdbs are sorted by creation date and separate listings are created for stepwise (i vs i+1) differences, differences from the first file (i vs 1), and differences from the last file (i vs n). The last of these is a good indicator of whether or not the model is stuck, or is still "headed somewhere" in the refinement.

usage: Drift.com pdb/refmac*.pdb >! Drift.xlog

would produce a file called Drift.xlog that can be displayed graphically by xloggraph.
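The two positional statistics Drift.com reports (rms shift and maximum shift) boil down to something like the following. This is a minimal sketch, assuming the atoms of the two models have already been matched up one-to-one and in the same order; the function name rms_and_max_shift is invented for the example, and PDB parsing is left out.

```python
import math


def rms_and_max_shift(coords_a, coords_b):
    """rms and maximum displacement between two matched lists of (x, y, z)."""
    # Per-atom distance moved between model A and model B.
    shifts = [math.dist(p, q) for p, q in zip(coords_a, coords_b)]
    rms = math.sqrt(sum(s * s for s in shifts) / len(shifts))
    return rms, max(shifts)
```

The same function covers all three listings (i vs i+1, i vs 1, i vs n); only the choice of which two files to compare changes.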

Systemization Tools

epmr.com

epmr.com will run the program epmr (assumed to be in the $PATH) on every input model specified, in every combination of resolution limit, number of monomers and space group provided.

usage: epmr.com data.mtz searchforme.pdb [SG] [reso] [n]

will run epmr on all the provided target PDB models. If you specify more than one model, SG, reso, or n, then every combination of these parameters will be tried in turn.

example: epmr.com merged.mtz kinase.pdb P212121 4-15A 1

SGsearch.com

SGsearch.com merges raw data in every possible space group (ones with the same lattice), and prints out R_merge and systematic absences for each one. You must provide SGsearch.com with a functioning merging script that accepts the unmerged mtz filename on its command line. The raw datafile you provide will be reindexed, sorted, and passed to this script. You may also specify a resolution limit.

usage: SGsearch.com merge.com raw.mtz 2.5A

will run the command "merge.com reindexed.mtz 2.5A" where reindexed.mtz is a copy of raw.mtz reindexed to each possible alternative space group to the one found in raw.mtz. For example, if raw.mtz is in P2₁2₁2₁, then merging will be done in P2₁2₁2₁, P222, P222₁, P22₁2, P2₁22, P2₁2₁2, P2₁22₁, and P22₁2₁, and merging statistics and systematic absence data will be displayed for each. Note that some of these are not "real" space groups, but represent different screw axis assignments of P222₁ and P2₁2₁2.
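The eight names in that example are just every way of putting a 2 or a 2₁ on each of the three axes. A small sketch of the enumeration, covering only this primitive orthorhombic case (it is not how SGsearch.com itself builds its list, and the function name is made up):

```python
from itertools import product


def screw_axis_variants(lattice="P"):
    """All 2 / 2(1) assignments on three axes, written CCP4-style (e.g. P212121)."""
    return [lattice + "".join(axes) for axes in product(("2", "21"), repeat=3)]


variants = screw_axis_variants()
```

Note that the plain-text names are not self-delimiting in general, so they are better kept as a list than re-parsed from a string.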

autoscala

Autoscala optimizes the SDCORR card in a given scala script, and displays which values give the best scatter/sigma (chi²).

usage: autoscala merge.com

will run the command "merge.com" but substitute the line beginning with "SDCORR" in merge.com with a series of possible SDCORR x y z command cards. The scatter/sigma table produced by merge.com is checked, and a new SDCORR line is chosen, based on the principle of the Golden Section search. Once the Golden Section search converges, a file called merge.com_best will be created, which is identical to merge.com, but with the SDCORR line edited to the "best" values found.
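For readers who have not met it, the Golden Section search is a way of homing in on the minimum of a one-dimensional function while re-using one of the two interior evaluations at every step (handy when each evaluation is a whole scala run). The sketch below is the textbook one-parameter version only; how autoscala maps it onto the three SDCORR values, and exactly what it minimizes, is not shown here, and the function name golden_section_min is made up.

```python
import math

INV_PHI = (math.sqrt(5) - 1) / 2  # 1/golden ratio, about 0.618


def golden_section_min(f, lo, hi, tol=1e-5):
    """Locate the minimum of a unimodal f on [lo, hi] to within tol."""
    a, b = lo, hi
    c = b - INV_PHI * (b - a)  # lower interior point
    d = a + INV_PHI * (b - a)  # upper interior point
    fc, fd = f(c), f(d)
    while (b - a) > tol:
        if fc < fd:
            # Minimum lies in [a, d]; the old c becomes the new d.
            b, d, fd = d, c, fc
            c = b - INV_PHI * (b - a)
            fc = f(c)
        else:
            # Minimum lies in [c, b]; the old d becomes the new c.
            a, c, fc = c, d, fd
            d = a + INV_PHI * (b - a)
            fd = f(d)
    return (a + b) / 2
```

In an autoscala-like setting, f would be a (hypothetical) function that edits the SDCORR line, runs the script, and returns how far the scatter/sigma is from its target.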

merge.com

example scala/truncate script that works with the above systemization programs.

usage: merge.com P2221 raw.mtz 2.0A

will merge the raw data in raw.mtz out to 2.0A, using P222₁ symmetry. The output, merged file will be merged.mtz.

autoscalepack

Same as autoscala, but for HKL's scalepack program. In this case, the error_scale_factor and estimated_error lines of the provided scalepack script are the ones optimized. The caveats are that the scalepack script must be able to run on its own (that is, it should begin with a:
#! /bin/csh -f
line), and it should not have any run-to-run "memory" (that is, either delete the "reject" file, or don't write it out). Alternatively, if your scalepack.com script self-converges (re-runs scalepack until there are no new rejections), it will also work with autoscalepack. You can download an example of such a self-converging scalepack script here: scalepack.com.

usage: autoscalepack scalepack.com

rrsps.com

Recursive, Real-Space Patterson Search is basically a more comprehensive extension of the CCP4 rsps program. After an initial "Harker scan", possible sites are each, in turn, checked for cross-scoring new sites. For each of these pairs of sites, another cross-score is computed, and a list of candidate third sites is obtained. This process is repeated, recursively, until no significant (default: 3 sigma) new sites are found. Each constellation of sites is then given a score, which is the product of all the peak heights in the constellation. This list is sorted, and provided to the user for subsequent evaluation. rrsps.com is a genuinely recursive shell program: it actually launches a new instance of itself for each new crossvector search! One might think that an exhaustive recursive search like this would take a really long time, but, for modest numbers of sites (<10) it only takes 15-60 minutes on an SGI Octane workstation.

usage: rrsps.com patterson.map P212121 P222

Will search for site constellations consistent with the Patterson in patterson.map, first using P2₁2₁2₁, and then P222 symmetry. Each space group is given a separate output file. If you like, more than one Patterson map can also be given, and each will be considered in separate runs. Remember, however, that because this is a Patterson search, inversion-related constellations get the same score.
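The shape of the recursion can be sketched as follows. This is only an outline of the control flow described above, not the rrsps.com script: find_new_sites stands in for the actual cross-vector search (which rsps does), and both it and the function name grow are hypothetical.

```python
import math


def grow(constellation, heights, find_new_sites, cutoff=3.0, results=None):
    """Recursively extend a constellation until no significant new site appears.

    find_new_sites(constellation) is assumed to return (site, peak_height) pairs.
    Finished constellations are collected as (score, sites), where the score is
    the product of all peak heights in the constellation.
    """
    if results is None:
        results = []
    new = [(s, h) for s, h in find_new_sites(constellation)
           if h >= cutoff and s not in constellation]
    if not new:
        results.append((math.prod(heights), constellation))
    for site, height in new:
        # One new search per candidate site, like one new rrsps.com instance.
        grow(constellation + [site], heights + [height],
             find_new_sites, cutoff, results)
    return results


# Toy stand-in for the cross-vector search, keyed by the current constellation.
toy = {
    ("A",): [("B", 5.0), ("C", 2.0)],   # C is below the 3 sigma cutoff
    ("A", "B"): [("D", 4.0)],
}
found = sorted(grow(["A"], [6.0], lambda c: toy.get(tuple(c), [])), reverse=True)
```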

origins.com

This script checks two PDB files against each other to see if they are the "same" crystal structure with different origin choices. Molecular replacement and heavy-atom finding programs will pick an origin at random, which makes it difficult to compare the results of different runs.

origins.com will take the second PDB file on its command line and shift it around using all possible symmetry, cell, AND origin-shift operations and then check the RMSD to the first PDB file. The shifted model that best agrees with the reference model is output as neworigin.pdb. Alternatively, you can mention the word "correlate" on the command line and use the correlation coefficient of the electron density calculated from the shifted and not-shifted PDB instead of the RMSD. RMSD only works when the two PDBs have the same atom/residue names.

Multi-chain PDB files will have each chain aligned separately and the origin choice giving the best combined score will be used to generate the final neworigin.pdb file.

usage: origins.com right.pdb wrong.pdb P212121

will shift wrong.pdb around to each possible origin and symmetry operation allowed for P2₁2₁2₁. The atoms in the shifted version of wrong.pdb will be compared to the ones in right.pdb and the rmsd reported. Alternately:

usage: origins.com right.pdb wrong.pdb P212121 correlate

will do the same shifts as above, but instead of a rmsd calculation, it will calculate maps for each PDB and score based on the CC between them.
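To give a feel for the origin-shift part of the search, here is a stripped-down sketch for the P2₁2₁2₁ case, where the alternative origins are at 0 or 1/2 along each axis. It is a simplification under stated assumptions, not the origins.com procedure: coordinates are fractional, the cell is treated as orthogonal, symmetry operators are left out entirely, and each atom is wrapped to its nearest cell translation independently. The function name is made up.

```python
import math
from itertools import product


def best_origin_shift(ref, moved, cell=(1.0, 1.0, 1.0)):
    """Try the 8 origin shifts (0 or 1/2 per axis); return (rmsd, shift) of the best.

    ref, moved -- matched lists of fractional (x, y, z) coordinates
    cell       -- orthogonal cell edge lengths, used to convert to distance
    """
    best = None
    for shift in product((0.0, 0.5), repeat=3):
        total = 0.0
        for r, m in zip(ref, moved):
            for k in range(3):
                d = m[k] + shift[k] - r[k]
                d -= round(d)  # fold in whole-cell translations
                total += (d * cell[k]) ** 2
        rmsd = math.sqrt(total / len(ref))
        if best is None or rmsd < best[0]:
            best = (rmsd, shift)
    return best
```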

Conversion Tools

reindex.com

This is a "smart" script for doing simple reindexing between Laue-equivalent space groups. All you need to give it are the mtz file and the new space group. It works on both merged and unmerged data, and the "pseudo" space groups P2₁22, P22₁2, etc. are supported (whether or not you've got CCP4 4.x). The result of reindexing to a "pseudo" space group will be an mtz with a canonical space group name (P222₁ or P2₁2₁2), but with the cell axes permuted appropriately.

usage: reindex.com merged.mtz P222

will change the space group of "merged.mtz" to P222. The new filename will be "reindexed.mtz".

FreeRer.com

This script does the same thing as the CCP4 "uniqueify" program, except that it can "inherit" a free-R set from an existing CCP4 or XPLOR/CNS reflection file. Also, FreeRer.com outputs free-R flags in XPLOR/CNS format, as well as CCP4 mtz.

usage: FreeRer.com not_free_yet.mtz flags.cv

will import the free-R flags in the CNS reflection file: flags.cv into the CCP4 mtz file: not_free_yet.mtz. Unfortunately, users of X-PLOR/CNS will, all-too-often, forget to assign a free-R flag to as-yet unobserved unique HKLs. In these situations, FreeRer.com will use the "complete" feature of the CCP4 freerflag program. FreeRer.com also generates a file called freeR_flag.mtz, which contains just the complete, unique flag assignments out to 1.5A, and can easily be converted to other formats. The output file, called FreeRed.mtz, is identical to the not_free_yet.mtz file, but with the FreeR_flag column added.

bestFH.com

Applies a generalization of the procedure developed by Matthews et al (1965) for estimating the amplitude of the heavy-atom contribution (F_H) at each hkl by combining anomalous and isomorphous (or dispersive) differences. Many people don't realize that there are systematic as well as random errors in their difference Pattersons. The systematic errors arise because standard difference Pattersons are calculated from (|F_PH|-|F_P|)², and not (|F_PH-F_P|)². Therefore, there are cross-terms missing if you're going to think of your difference Patterson as the Patterson of your heavy metal sites. For example, if (for a particular hkl) |F_H| = |F_PH-F_P| is relatively large, but F_H is 90 degrees out-of-phase with F_P, then |F_P| ~ |F_PH| and the (|F_PH|-|F_P|)² value used to calculate the difference Patterson will be near zero! On the other hand, the anomalous difference |F_PH⁺| - |F_PH⁻| will be large. Considering the orthogonal nature of anomalous and dispersive differences, it's quite remarkable that isomorphous and anomalous difference Pattersons look alike at all. The Matthews procedure combines both kinds of differences to estimate what the true value of |F_H| is.
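The 90-degree case is easy to check with a couple of complex numbers. The amplitudes below (100 for F_P, 20 for F_H) are arbitrary values picked for illustration, and the function name is made up; this only demonstrates the missing cross-term, not the Matthews estimate itself.

```python
import cmath
import math


def iso_vs_true(fp_amp, fh_amp, phase_diff_deg):
    """Return (|F_PH| - |F_P|, |F_PH - F_P|) for a given F_H phase offset from F_P."""
    f_p = complex(fp_amp, 0.0)
    f_h = cmath.rect(fh_amp, math.radians(phase_diff_deg))
    f_ph = f_p + f_h
    return abs(f_ph) - abs(f_p), abs(f_ph - f_p)


# F_H in phase with F_P: the isomorphous difference recovers |F_H| in full.
iso_0, true_0 = iso_vs_true(100.0, 20.0, 0.0)
# F_H at 90 degrees to F_P: |F_H| is still 20, but the difference nearly vanishes.
iso_90, true_90 = iso_vs_true(100.0, 20.0, 90.0)
```

With these numbers the squared isomorphous difference at 90 degrees is only about 1% of the true |F_H|², which is the systematic error described above.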

bestFH.com takes a standard, merged CCP4 mtz file, and uses all the differences between all the "F" datasets found therein, along with all the anomalous difference "D" datasets to estimate F_H= |F_PH-F_P|. This is a similar procedure to the CCP4 program revise, except that it requires no keyworded input, and, theoretically, works for all kinds of difference data, not just MAD. The only caveat is that all the difference data provided to bestFH.com should be from metal sites at the same XYZ location, otherwise, you will get an "averaged" F_H for all the site constellations.

usage: bestFH.com alldata.mtz

will calculate the best estimate of F_H from the native and derivative data sets found in alldata.mtz. In my experience, the Pattersons produced by F_H are cleaner than simple isomorphous and anomalous difference Pattersons, and direct-methods programs like shelx also work better with F_H.

adsc2pgm (sgi, linux, c source)

Converts an ADSC Quantum 4 image into a Portable Greymap image file. The Portable Greymap (PGM) format is the most general kind of greyscale image format imaginable. Basically, it's a short text header: "P2 <width> <height> 255", followed by a list of text numbers between 0 and 255, which are the pixel values. Most modern image file converters, such as ImageMagick, can read PGM, and convert it to something that takes up less space.
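Writing a text PGM by hand really is that simple. A minimal sketch, assuming the pixel values arrive as a list of rows; the function name to_pgm is invented, and values are clipped to 0-255 here as a precaution, which may or may not match what adsc2pgm does.

```python
def to_pgm(pixels):
    """Render a list of rows of numbers as an ASCII (P2) PGM string."""
    height = len(pixels)
    width = len(pixels[0])
    lines = [f"P2 {width} {height} 255"]  # header: magic, width, height, maxval
    for row in pixels:
        # Clip each value into the 8-bit range and write it as plain text.
        lines.append(" ".join(str(max(0, min(255, int(v)))) for v in row))
    return "\n".join(lines) + "\n"
```

For example, to_pgm([[0, 128], [255, 300]]) gives a 2x2 image whose last pixel is clipped to 255.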

You can specify a region of the detector face by using "-box x1 y1 x2 y2" on the command line, you can specify the "zoom factor" (size scale) with "-zoom x", and the intensity normalization scale using "-scale x". The program defaults to converting the whole detector face, using a zoom factor of 0.25 (4 detector pixels -> 1 image pixel), and an intensity scale of rms(pixel value)/5, which, empirically, gives a "nice" looking normalization in the 8-bit output image.

usage: adsc2pgm input_1_001.img -box 100 100 300 300 -zoom 0.2 -scale 0.1 output.pgm

will take the region between pixels (100,100) and (300,300) on input_1_001.img, multiply the pixel values by 0.1, and then output every 5th pixel to output.pgm. If you don't specify an output file name here, it would default to "input_1_001.pgm".
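In other words, the three options amount to a crop, a multiply and a stride. The sketch below mirrors the example just given, with several assumptions that are mine rather than the program's: the image is a list of rows indexed image[y][x], the box excludes its far edge, the zoom is done by plain subsampling, and values are clipped at 255. The function name is hypothetical.

```python
def crop_scale_subsample(image, box, zoom, scale):
    """Apply -box, -scale and -zoom style options to a 2-D list of pixel values."""
    x1, y1, x2, y2 = box
    step = round(1 / zoom)  # zoom 0.2 -> keep every 5th pixel
    out = []
    for y in range(y1, y2, step):
        # Scale the raw (e.g. 16-bit) value, then clip to the 8-bit range.
        out.append([min(255, int(image[y][x] * scale))
                    for x in range(x1, x2, step)])
    return out
```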

adsc2sgi (sgi, c source)

same as adsc2pgm, but outputs an SGI *.rgb file.

osc2pgm (sgi, linux, c source)

Same as adsc2pgm, except that it converts an MSC R-axis II or R-axis IV image file into a Portable Greymap image file.

osc2sgi (sgi, c source)

same as osc2pgm, but outputs an SGI *.rgb file.

tuencode.awk

Converts a list of text numbers to a (uuencoded) binary file of the same composition. It is a convenient way to create an arbitrary binary file without having to write and compile a c program to do it. The command pipeline:

usage: echo "10 20 30 255" | awk -f tuencode.awk | uudecode

will create a four-byte binary file called binout.bin containing the byte values: 0x0A, 0x14, 0x1E and 0xFF. This way, pure byte fields are made, regardless of the endianism of the computer hardware. Any numbers > 255 will be converted to the remainder of number/256.
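The number-to-byte step on its own looks like this in Python. It is only a sketch of the mapping described above (including the remainder-of-256 rule); the uuencoding stage and the binout.bin file name are left out, and the function name is made up.

```python
def numbers_to_bytes(text):
    """Convert whitespace-separated integers to raw bytes, wrapping modulo 256."""
    return bytes(int(tok) % 256 for tok in text.split())


payload = numbers_to_bytes("10 20 30 255")
```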

refmac2o.awk

Jiffy for converting a canonical refmac pdb file (with multiple conformers) into something more friendly to o. Multiple conformer residues are converted to "insertion" residues in the sequence, which o understands. You can then rebuild each conformer as independent (but, admittedly, spatially overlapping) residues.

o2refmac.awk

Jiffy for reversing the refmac2o.awk procedure on a sam-atom-out pdb file from o.

This page is not finished. It will never be finished, and neither will yours. Admit it.

James Holton <JMHolton@lbl.gov>