Goodies...
This page allows you "direct access" to some
of the tools and scripts that Elves use to work on your structures. They
are included here in "free-standing" form, so you can just download
and use them for "power" situations where you want to use some,
but not all, of Elves' features. Be warned, however, that some editing of
these scripts may be required in order to get them to run on your system
(since Elves won't be there to help you).
Smart Scripts
The following scripts are unoptimized copies of
the "smart" scripts that are written and used by Elves. After
an Elves run is finished, there should be a copy of each of these scripts
in a directory called "scripts" in the main directory you initially
ran Elves from. When Elves write these scripts, they set them up with defaults
appropriate for your particular project. However, each of these scripts
is designed to be as easy as possible for humans to read, modify and adapt
to other projects, without having to rely on Elves to re-generate them
for you. For this reason, they are included here for people who are interested
in a "medium" level of automation in their projects.
In addition to being runnable in the "traditional" way
of rewriting the script for each application, these "smart" scripts
have a sophisticated procedure (hidden away down at the bottom of the script)
for intelligently reading their command line, and scanning their input
data files for commonly-changed parameters. This sounds a little un-customizable,
but it turns out to be very handy, and time-saving in the long run. For
example, how often have you had to go digging in your fft script to change
the name of the "F" dataset you were using? Wouldn't it be nice
if you had a script that could just "know" that there was only
one F in the mtz file anyway, and use it? These scripts do things like
that. In fact, they can sometimes "outsmart" you. For example,
if you specify a certain "F" in an mtz file that just isn't there,
they will ignore you, and use one that is. This does, admittedly, add an
element of unpredictability, but all you have to do is look at the top
of the logfile produced by the "smart" script to see which data
columns it's using.
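The "outsmarting" behaviour can be pictured with a few lines of Python. This is only a sketch of the idea, not the code at the bottom of the scripts: the function name pick_column is made up, and the fallback rule here (take the first available column) is a stand-in for the real ranking.

```python
def pick_column(requested, available):
    """Return (column label, reason) for the column that will be used.

    requested: label named on the command line, or None.
    available: labels of the candidate columns actually found in the file.
    """
    if not available:
        raise ValueError("no candidate columns in the file")
    if requested is None:
        return available[0], "chosen automatically"
    if requested in available:
        return requested, "as requested"
    # The requested label is not in the file: ignore it and use one that is.
    # Taking the first candidate is an assumption made for this sketch.
    return available[0], "requested column %r not found" % requested
```

Printing the reason at the top of the log is what lets you see which column was really used.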
- fft.com
- is a general-purpose phased electron-density map calculator, which
is written and used by Phaser Elves. It
produces a CCP4 map file called "ffted.map",
which covers one standard CCP4 asymmetric unit (and is normalized so that
its mean=0 and sigma=1). If mapman
is available, a dsn6 (o format)
version of the map, extended to cover 170% of the unit cell is also generated,
and called "ffted.omap".
A standard bones
trace is also done, and written to "bones.o".
If a pdb file is provided, then the dsn6 map is extended to cover the pdb
file. In either case, an o macro called "ffted.omac"
is written to explicitly load and view all these files in o.
usage: fft.com mtz/best_phased.mtz pdb/build1.pdb
calculates a map from the "best" F in mtz/best_phased.mtz
(the one that is most complete, highest resolution, and best <F>/<SIGF>),
and the most recently added Phase and figure-of-merit (usually PHIDM and
FOMDM). If any of the data columns in the input mtz file are named on the
command line, they will be used instead of the automatically-chosen ones.
dm.com
is a general-purpose solvent-flattening script, which is written and
used by Phaser Elves. It looks for an F
and Phase in exactly the same way as fft.com (above), except that it also
checks for a solvent content on the command line, and H/L coefficients.
usage: dm.com 40% mtz/mlphared.mtz
performs solvent-flattening on the mlphare results in mtz/mlphared.mtz,
using a solvent content of 40%. The highest resolution F in mtz/mlphared.mtz
will be automatically chosen for input into dm, as well as any H/L coefficients,
or Phase/FOM columns found therein. If any of the data columns in mtz/mlphared.mtz
are named on the command line, they will be used instead of the automatically-chosen
ones.
merge.com
a general scala/truncate script that scales and merges a single, multirecord
mtz raw data file, and then runs truncate and unique to fill in any missing
data in the output file: "merged.mtz".
Therefore, an mtzdmp of this file will give the true completeness of the
data.
usage: merge.com P2221 raw.mtz 2.0A
will merge the raw data in raw.mtz out to 2.0A, using P2221
symmetry. The output, merged file will be merged.mtz.
System Tools
- realnice
- Use this when "nice" just isn't nice enough. Realnice
is a watchdog program that will "STOP" (not kill) an indicated
process when someone is logged in on the computer's console (and has not
been idle for more than 10 minutes). The job will be "CONTinued"
when the console becomes idle again, and/or the person logs off. Helps
prevent irate labmates who are trying to use o,
or other interactive programs while you are running big, long jobs in the
background. The watched process is specified as a text string on the realnice
command line, and the first process matching this string in a "ps
-f" command is monitored. Realnice only watches a single process group,
so if multiple copies of the same program are running, the text string
should just be a process ID. Programs like o,
which can make the console "look" idle when it is not, can be
explicitly named as "dominant" programs (listed after the watched
text string). The watched program will be stopped as long as one of these
"dominant" programs is running.
usage: realnice refmac ono
stops your refmac job whenever someone is on the console (and not idle)
or running a program whose name contains "ono"
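The stop/continue rule described above can be sketched in Python as follows. This is an illustration under assumptions, not the realnice source: the function names are invented, and how the console idle time and the "dominant" programs are detected is left out entirely.

```python
import os
import signal

IDLE_LIMIT_MIN = 10  # the 10-minute idle limit quoted in the description


def should_pause(console_user_present, console_idle_min, dominant_running):
    """Decide whether the watched job should be stopped right now."""
    if dominant_running:
        # A "dominant" program (e.g. o) keeps the job stopped
        # even when the console looks idle.
        return True
    return console_user_present and console_idle_min <= IDLE_LIMIT_MIN


def apply_state(pid, pause):
    """Send SIGSTOP or SIGCONT to the watched process (Unix only)."""
    os.kill(pid, signal.SIGSTOP if pause else signal.SIGCONT)
```

A watchdog loop would call should_pause every so often and pass the answer to apply_state. SIGSTOP suspends the job without killing it, which is why it can be continued later.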
- sendhome
- Sendhome is a file transfer program, designed for sending x-ray image
data home from a synchrotron site in real time. The sendhome script continuously
watches for newly-created images, and then sends them to a specified remote
machine when they appear. The transfer is done through an ssh
login session, so, as long as you can logon to the remote computer using
ssh, the program will work, and the remote password only needs to be given
once. Transfers are also compressed, so most x-ray images are sent ~3X
faster than FTP. The syntax is similar to scp:
usage: sendhome /data/user/frame_001.img user@remotehost.college.edu:/bigdisk/user/frames
will send frame_001.img and any images in /data/user
(on the local machine) newer than frame_001.img to /bigdisk/user/frames
on remotehost.college.edu, using "user"'s account. However,
sendhome is (at the moment) "upload-only", so you can't use it
to transfer remote files to the local machine. To do that, you should ssh
to the remote machine, and "sendhome" back to your local
host. The ssh
login session is conducted in the usual way, except that, instead of a
remote command prompt, you see the remote tar job unpacking your files.
Once the transfer has started, you can press <Ctrl>-Z ("Ctrl"
key and "Z" key), followed by the "bg" unix command
to continue transferring files in the background.
For the technically curious, the transfer is done something like
this:
ls -1rt /data/user | (cd /data/user ; tar cf - - ) | compress | ssh user@remotehost.college.edu "(cd /bigdisk/user/frames ; uncompress | tar xvf - )"
Unfortunately, sendhome only works on SGIs and Linux machines. OSF1
users are out of luck unless they get a better tar program. Also, you MUST
have ssh installed on both ends in order for sendhome to be secure!
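The "this frame and anything newer" selection can be sketched in Python like this. It is a guess at the logic, not the sendhome code: the function names are invented, and modification time is assumed to be what "newer" means.

```python
import os


def frames_to_send(mtimes, first_frame):
    """Pick the starting frame plus every file modified after it.

    mtimes maps file name -> modification time in seconds.
    The result is ordered oldest first, like "ls -rt".
    """
    start = mtimes[first_frame]
    picked = [(t, name) for name, t in mtimes.items()
              if name == first_frame or t > start]
    return [name for t, name in sorted(picked)]


def scan(directory):
    """Build the name -> mtime table for the regular files in a directory."""
    return {entry.name: entry.stat().st_mtime
            for entry in os.scandir(directory) if entry.is_file()}
```

Calling frames_to_send(scan("/data/user"), "frame_001.img") repeatedly, and remembering what has already gone out, would give the "continuously watches" behaviour.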
Visualization Tools
- moviefy
- Moviefy is a program intended for summarizing x-ray image graphics as
movies, but it can be used to convert just about any sequence of graphics
images into a movie. On SGIs, the dmconvert program is used (and
it's only available on Irix 6.x). However, moviefy can employ ImageMagick
on any unix platform.
usage: moviefy /data/user/frame_???.img
will convert all the frames indicated into an SGI movie. You can also
specify a particular region of the detector face using "-box"
and you can also control the zoom and normalization factors:
usage: moviefy /data/user/frame_???.img -box 100 200 900 1000 -zoom 0.5 -scale 0.2
will convert the area between the pixel coordinates (100,200) and
(900,1000) on all the frames indicated into an SGI movie. The size of the
image will be rescaled (zoomed) so that each pixel on the detector becomes
0.5 pixels in the movie. Also, the value of the (usually 16-bit) detector
pixel will be multiplied by 0.2 before it is converted to the (8-bit) greyscale
movie file. By default, a zoom of 0.25 is used, and the scale is determined
automatically for each image. To "auto-scale" for only the first
image, and keep that scale for the rest, you would say "-scale same".
The moviefy script is used by the Spotter
Elves to make their movies of important spots. To work quickly, moviefy
requires the image converter binaries adsc2pgm
and osc2pgm (below), but it does contain its own
embedded copies of the binaries and source, so it will always work on a
computer with an ANSI compatible C compiler installed and licensed.
- xplot
- Xplot is a short awk program for reformatting a table of numbers to
be displayed in the CCP4 program xloggraph.
- Files like this:
- Rcrys Rfree
- 33.23 35.25
- 30.45 33.89
- 25.30 30.21
- 22.18 28.67
- Become this:
- 27.79 +/- 4.31107
- 32.005 +/- 2.66587
- $TABLE : - Plots:
- $GRAPHS:Values by line:A:1, 2, 3:
- :Values vs. 1st column:A:2, 3: $$
- line Rcrys Rfree $$
- $$
- 1 33.23 35.25
- 2 30.45 33.89
- 3 25.30 30.21
- 4 22.18 28.67
- $$
usage: xplot data.list >! data.xplot ; xloggraph data.xplot
will produce an xloggraph "version" of each column
of numbers in data.list.
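For anyone who wants the same reformatting outside of awk, here is a Python sketch that reproduces the example above. It is a re-creation from the sample output, not a translation of the xplot source. The two assumptions are that the "+/-" value is the population standard deviation (dividing by N, which is what matches the numbers shown) and that numbers are printed with six significant figures.

```python
import math


def xplot(text):
    """Turn a whitespace-separated table (first line = labels) into
    an xloggraph-style table, preceded by mean +/- sd for each column."""
    rows = [line.split() for line in text.strip().splitlines()]
    labels, body = rows[0], rows[1:]
    ncol = len(labels)
    out = []
    for i in range(ncol):
        values = [float(r[i]) for r in body]
        mean = sum(values) / len(values)
        # Population standard deviation (divide by N, not N-1).
        sd = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
        out.append("%.6g +/- %.6g" % (mean, sd))
    every = ", ".join(str(i) for i in range(1, ncol + 2))
    rest = ", ".join(str(i) for i in range(2, ncol + 2))
    out.append("$TABLE : - Plots:")
    out.append("$GRAPHS:Values by line:A:%s:" % every)
    out.append(":Values vs. 1st column:A:%s: $$" % rest)
    out.append("line " + " ".join(labels) + " $$")
    out.append("$$")
    for n, r in enumerate(body, start=1):
        # Keep the original number strings; only a line counter is added.
        out.append("%d %s" % (n, " ".join(r)))
    out.append("$$")
    return "\n".join(out)
```

Feeding it the four-line Rcrys/Rfree table gives "27.79 +/- 4.31107" and "32.005 +/- 2.66587" as the first two lines, as in the example.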
- Rplot.com
- Rplot.com is a jiffy for displaying vital statistics from one or more
refmac
logs. The quantities Rcryst, Rfree, Rfree-Rcryst,
correlation coefficients, bond deviations, angle deviations,
and total number of atoms refined are listed in an xloggraph-readable
format. Also works on logs from wARP
refinements that use refmac.
When multiple logs are provided, they are sorted by creation date and the
values from the last refinement step in each file are listed against the
number extracted from the log filename. For example:
usage: Rplot.com logs/refmac*.log >! Rplot.xlog
would produce a file called Rplot.xlog that can be displayed
graphically by xloggraph.
- Drift.com
- Drift.com is a jiffy for displaying movement statistics from two or
more pdb files, and is usually used to see how much the model is moving
in an x-ray refinement run. The rms change in XYZ position and
B-factor are listed for C-alpha as well as for all atoms in an xloggraph-readable
format. The maximum shifts in XYZ and B are also listed. Provided pdbs
are sorted by creation date and separate listings are created for stepwise
(i vs i+1) differences, differences from the first file (i vs 1), and differences
from the last file (i vs n). The last of these is a good indicator of whether
or not the model is stuck, or is still "headed somewhere" in
the refinement.
usage: Drift.com pdb/refmac*.pdb >! Drift.xlog
would produce a file called Drift.xlog that can be displayed
graphically by xloggraph.
Systemization Tools
- epmr.com
- epmr.com will run the program epmr (assumed to be in the $PATH) on every input model specified
in every combination of resolution limit, number of monomers and space group provided.
usage: epmr.com data.mtz searchforme.pdb [SG] [reso] [n]
will run epmr on all the provided target PDB models. If you specify more than one model, SG,
reso, or n, then every combination of these parameters will be tried in turn.
example: epmr.com merged.mtz kinase.pdb P212121 4-15A 1
Will run epmr on kinase.pdb in P212121 using data between 4 and
15A and looking for 1 model in the ASU.
If you leave out the space group,
then epmr will be run in P212121,
P222, P2221, P2212, P2122, P21212,
P21221, and P22121.
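The "every combination" behaviour is just a Cartesian product, which can be sketched in Python as below. The function name and argument layout are invented for illustration, and the default space group list is simply the one quoted above; what the real script does for other defaults is not covered here.

```python
import itertools

# Space groups tried when none is given (the list quoted in the text).
DEFAULT_SPACE_GROUPS = ["P212121", "P222", "P2221", "P2212",
                        "P2122", "P21212", "P21221", "P22121"]


def epmr_jobs(models, resolutions, copies, space_groups=None):
    """List every (model, space group, resolution, n) combination to run."""
    if not space_groups:
        space_groups = DEFAULT_SPACE_GROUPS
    return list(itertools.product(models, space_groups, resolutions, copies))
```

Two models, two resolution ranges and one copy number with no space group given would therefore mean 2 x 8 x 2 x 1 = 32 epmr runs, which is worth knowing before you hit return.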
SGsearch.com
SGsearch.com merges raw data in every possible space group (ones with
the same lattice), and prints out Rmerge and systematic absences
for each one. You must provide SGsearch.com with a functioning merging
script that accepts the unmerged mtz filename on its command line. The
raw datafile you provide will be reindexed, sorted, and passed to this
script. You may also specify a resolution limit.
usage: SGsearch.com merge.com raw.mtz 2.5A
will run the command "merge.com reindexed.mtz 2.5A"
where reindexed.mtz is a copy of raw.mtz reindexed to
each possible alternative space group to the one found in raw.mtz.
For example, if raw.mtz is in P212121,
then merging will be done in P212121,
P222, P2221, P2212, P2122, P21212,
P21221, and P22121, and
merging statistics and systematic absence data will be displayed for each. Note that some
of these are not "real" space groups, but represent different
screw axis assignments of P2221 and P21212.
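The eight primitive orthorhombic choices are just every way of making each of the three twofold axes a screw axis or not. A short Python sketch (function names invented here) generates them and maps each "pseudo" space group back to its standard name by counting screw axes:

```python
import itertools


def p222_family():
    """All 2 x 2 x 2 screw-axis assignments for a primitive 222 lattice."""
    return ["P" + "".join(axes)
            for axes in itertools.product(("2", "21"), repeat=3)]


def standard_name(axes):
    """Standard space group for a tuple of three axis symbols ('2' or '21').

    Only the number of screw axes matters; which axis carries the screw
    is a matter of how the cell axes are permuted.
    """
    names = {0: "P222", 1: "P2221", 2: "P21212", 3: "P212121"}
    return names[sum(1 for a in axes if a == "21")]
```

So P2212 and P2122 are both P2221 with the axes relabelled, and P21221 and P22121 are both P21212.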
autoscala
Autoscala optimizes the SDCORR card in a given scala script, and displays
which one gives the best scatter/sigma (chi2).
usage: autoscala merge.com
will run the command "merge.com" but substitute the
line beginning with "SDCORR" in merge.com with
a series of possible SDCORR x y z command cards. The scatter/sigma
table produced by merge.com is checked, and a new SDCORR line is chosen,
based on the principle of the Golden Section search. Once the Golden Section
search converges, a file called merge.com_best will be created,
which is identical to merge.com, but with the SDCORR line
edited to the "best" values found.
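For reference, a one-dimensional Golden Section search looks like this in Python. This is the textbook method, not the autoscala code: autoscala adjusts three SDCORR numbers and scores each trial by running the merging script, whereas the sketch minimizes any single-variable function f you hand it.

```python
import math

INV_PHI = (math.sqrt(5) - 1) / 2  # 1/golden ratio, about 0.618


def golden_section_min(f, lo, hi, tol=1e-4):
    """Return x in [lo, hi] that (locally) minimizes f, assuming one minimum."""
    c = hi - INV_PHI * (hi - lo)
    d = lo + INV_PHI * (hi - lo)
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc < fd:
            # Minimum lies in [lo, d]; old c becomes the new d.
            hi, d, fd = d, c, fc
            c = hi - INV_PHI * (hi - lo)
            fc = f(c)
        else:
            # Minimum lies in [c, hi]; old d becomes the new c.
            lo, c, fc = c, d, fd
            d = lo + INV_PHI * (hi - lo)
            fd = f(d)
    return (lo + hi) / 2
```

The attraction for a job like this is that each step costs only one new function evaluation, and here one evaluation is one complete scaling run. A plausible target function (an assumption, not taken from the script) would be the distance of chi2 from 1.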
merge.com
example scala/truncate script that works with the above systemization
programs.
usage: merge.com P2221 raw.mtz 2.0A
will merge the raw data in raw.mtz out to 2.0A, using P2221
symmetry. The output, merged file will be merged.mtz.
autoscalepack
Same as autoscala, but for HKL's scalepack program. In this case, the
error_scale_factor and estimated_error lines of the provided scalepack
script are optimized. The caveats are that the scalepack script must be able to run on
its own (that is, it should begin with a:
#! /bin/csh -f
line), and it should not have any run-to-run "memory" (that is,
either delete the "reject" file, or don't write it out).
Alternately, if your scalepack.com script self-converges (re-runs scalepack
until there are no new rejections), it will also work with autoscalepack.
You can download an example of such a self-converging scalepack script
here: scalepack.com.
usage: autoscalepack scalepack.com
rrsps.com
Recursive, Real-Space Patterson Search is basically a more comprehensive
extension of the CCP4 rsps
program. After an initial "harker scan" possible sites are each,
in turn, checked for cross-scoring new sites. For each of these pairs of
sites, another cross-score is computed, and a list of candidate third sites
is obtained. This process is repeated, recursively, until no significant
(default: 3 sigma) new sites are found. Each constellation of sites
is then given a score, which is the product of all the peak heights in
the constellation. This list is sorted, and provided to the user for subsequent
evaluation. rrsps.com is a genuinely recursive shell program: it actually
launches a new instance of itself for each new crossvector search! One
might think that an exhaustive recursive search like this would take a
really long time, but, for modest numbers of sites (<10) it only takes
15-60 minutes on an SGI Octane workstation.
usage: rrsps.com patterson.map P212121 P222
Will search for site constellations consistent with the Patterson in
patterson.map, first using P212121,
and then P222 symmetry. Each space group is given a separate output file.
If you like, more than one Patterson map can also be given, and each will
be considered in separate runs. Remember, however, that because this is
a Patterson search, inversion-related constellations get the same score.
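The shape of the recursion can be sketched in Python as below. It is a schematic only: the real script launches a new copy of itself and runs rsps for each cross-vector search, while here that step is hidden behind a caller-supplied function (find_cross_sites, a made-up name) that returns candidate (site, peak height) pairs for a given constellation.

```python
import math


def search(constellation, find_cross_sites, threshold=3.0, results=None):
    """Recursively grow a constellation of (site, height) pairs.

    When no new site reaches the threshold (in sigma), the constellation
    is scored as the product of its peak heights and recorded.
    """
    if results is None:
        results = []
    new = [(s, h) for s, h in find_cross_sites(constellation) if h >= threshold]
    if not new:
        score = math.prod(h for _, h in constellation)
        results.append((score, [s for s, _ in constellation]))
        return results
    for site, height in new:
        # One recursive call per candidate site, as in the real script.
        search(constellation + [(site, height)], find_cross_sites,
               threshold, results)
    return results
```

Sorting the returned list in descending order gives the ranked constellations. Because the score is a product, one weak peak pulls down an otherwise good constellation.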
origins.com
This script checks
two PDB files against each other to see if they are the "same" crystal
structure with different origin choices. Molecular replacement and
heavy-atom finding programs will pick an origin at random, which makes
it difficult to compare the results of different runs.
origins.com will take the second PDB file on its command line and shift
it around using all possible symmetry, cell, AND origin-shift operations
and then check the RMSD to the first PDB file. The shifted model that best
agrees with the reference model is output as neworigin.pdb. Alternately, you can
mention the word "correlate" on the command line and use the correlation
coefficient of the electron density calculated from the shifted and not-shifted
PDB instead of the RMSD. RMSD only works when the two PDBs have the same
atom/residue names.
Multi-chain PDB files will have each chain aligned separately
and the origin choice giving the best combined score will be used to generate the
final neworigin.pdb file.
usage: origins.com right.pdb wrong.pdb P212121
will shift wrong.pdb around to each possible origin and symmetry operation allowed
for P212121. The
atoms in the shifted version of wrong.pdb will be compared to the ones in right.pdb
and the rmsd reported.
Alternately:
usage: origins.com right.pdb wrong.pdb P212121 correlate
will do the same shifts as above, but instead of a rmsd calculation, it will calculate
maps for each PDB and score based on the CC between them.
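The origin-shift part of the search can be illustrated in Python. This is a stripped-down sketch, not origins.com: it works in fractional coordinates, tries only the eight origin shifts of 0 or 1/2 along each axis that apply to P212121, ignores the symmetry operators, and reports the RMSD in fractional units rather than Angstroms. All function names are invented.

```python
import itertools
import math


def wrapped(delta):
    """Fold a fractional difference into the nearest-image range."""
    return delta - round(delta)


def rmsd_frac(ref, moved):
    """RMSD between two equally ordered lists of fractional (x, y, z)."""
    total = sum(wrapped(a - b) ** 2
                for p, q in zip(ref, moved) for a, b in zip(p, q))
    return math.sqrt(total / len(ref))


def best_origin_shift(ref, model):
    """Try each origin shift and return (rmsd, shift) for the best one."""
    scored = []
    for shift in itertools.product((0.0, 0.5), repeat=3):
        moved = [tuple(c + s for c, s in zip(atom, shift)) for atom in model]
        scored.append((rmsd_frac(ref, moved), shift))
    return min(scored)
```

Whole-cell translations are taken care of by the wrapping, which is why only the half-cell shifts need to be listed. Like the real RMSD mode, this only makes sense when the two atom lists are in the same order.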
Conversion Tools
- reindex.com
- This is a "smart" script for doing simple reindexing
between Laue-equivalent space groups. All you need to give it are the mtz
file and the new space group. It works on both merged and unmerged data,
and the "pseudo" space groups P2122, P2212,
etc. are supported (whether or not you've got CCP4 4.x). The result
of reindexing to a "pseudo" space group will be an mtz with a
canonical space group name (P2221 or P21212),
but with the cell axes permuted appropriately.
usage: reindex.com merged.mtz P222
will change the space group of "merged.mtz" to P222.
The new filename will be "reindexed.mtz".
FreeRer.com
This script does the same thing as the CCP4 "uniqueify"
program, except that it can "inherit" a free-R set from
an existing CCP4 or XPLOR/CNS reflection file. Also, FreeRer.com outputs
free-R flags in XPLOR/CNS format, as well as CCP4 mtz.
usage: FreeRer.com not_free_yet.mtz flags.cv
will import the free-R flags in the CNS reflection file: flags.cv
into the CCP4 mtz file: not_free_yet.mtz. Unfortunately, users
of X-PLOR/CNS will, all-too-often, forget to assign a free-R flag
to as-yet unobserved unique HKLs. In these situations, FreeRer.com will
use the "complete" feature of the CCP4 freerflag
program. FreeRer.com also generates a file called freeR_flag.mtz,
which contains just the complete, unique flag assignments out to 1.5A,
and can easily be converted to other formats. The output file, called FreeRed.mtz,
is identical to the not_free_yet.mtz file, but with the FreeR_flag
column added.
bestFH.com
Applies a generalization of the procedure developed by Matthews et
al (1965) for estimating the amplitude of the heavy-atom contribution (FH)
at each hkl by combining anomalous and isomorphous (or dispersive) differences.
Many people don't realize that there are systematic as well as random errors
in their difference Pattersons. The systematic errors arise because standard
difference Pattersons are calculated from (|FPH|-|FP|)^2,
and not (|FPH-FP|)^2. Therefore, there
are cross-terms missing if you're going to think of your difference Patterson
as the Patterson of your heavy metal sites. For example, if (for a particular
hkl) |FH| = |FPH-FP| is relatively large,
but FH is 90 degrees out-of-phase with FP, then
|FP| ~ |FPH| and the (|FPH|-|FP|)^2
value used to calculate the difference Patterson will be near zero! On
the other hand, the anomalous difference Δ|FPH| = |FPH+|
- |FPH-| will be large. Considering the orthogonal
nature of anomalous and dispersive differences, it's quite remarkable that
isomorphous and anomalous difference Pattersons look alike at all. The Matthews
procedure combines both kinds of differences to estimate what the true
value of |FH| is.
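The 90-degree example is easy to check with complex numbers. The Python sketch below only demonstrates the missing cross-term; it is not the Matthews procedure or anything from bestFH.com, and the amplitudes used in the note afterwards are made-up round numbers.

```python
import cmath
import math


def iso_difference_sq(fp_amp, fh_amp, phase_deg):
    """Return (|FPH| - |FP|)^2 for a heavy-atom vector FH that is
    phase_deg degrees away from FP (FP is placed on the real axis)."""
    fp = complex(fp_amp, 0.0)
    fh = cmath.rect(fh_amp, math.radians(phase_deg))
    fph = fp + fh
    return (abs(fph) - abs(fp)) ** 2
```

With |FP| = 100 and |FH| = 10, the true |FH|^2 is 100. When FH is in phase with FP, iso_difference_sq(100, 10, 0) recovers that value, but at 90 degrees iso_difference_sq(100, 10, 90) comes out at roughly 0.25. The heavy-atom signal for that reflection has all but vanished from the isomorphous difference.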
bestFH.com takes a standard, merged CCP4 mtz file, and uses all the
differences between all the "F" datasets found therein,
along with all the anomalous difference "D" datasets to estimate
FH = |FPH-FP|. This is a similar procedure
to the CCP4 program revise,
except that it requires no keyworded input, and, theoretically, works for
all kinds of difference data, not just MAD. The only caveat is that all
the difference data provided to bestFH.com should be from metal sites at
the same XYZ location, otherwise, you will get an "averaged" FH for
all the site constellations.
usage: bestFH.com alldata.mtz
will calculate the best estimate of FH from the native and
derivative data sets found in alldata.mtz. In my experience, the
Pattersons produced by FH are cleaner than simple isomorphous
and anomalous difference Pattersons, and direct-methods programs like shelx
also work better with FH.
adsc2pgm
(sgi, linux,
c source)
Converts an ADSC Quantum 4
image into a Portable
Greymap image file. The Portable Greymap (PGM) format is the most general
kind of greyscale image format imaginable. Basically, it's a short text
header: "P2 <width> <height> 255", followed
by a list of text numbers between 0 and 255, which are the pixel values.
Most modern image file converters, such as ImageMagick,
can read PGM, and convert it to something that takes up less space.
You can specify a region of the detector face by using "-box x1
y1 x2 y2" on the command line, you can specify the "zoom factor" (size
scale) with "-zoom x", and the intensity normalization scale
using "-scale x". The program defaults to converting the whole
detector face, using a zoom factor of 0.25 (4 detector pixels -> 1
image pixel), and an intensity scale of rms(pixel value)/5, which, empirically,
gives a "nice" looking normalization in the 8-bit output
image.
usage: adsc2pgm input_1_001.img -box 100 100 300 300 -zoom 0.2 -scale 0.1 output.pgm
will take the region between pixels (100,100) and (300,300) on
input_1_001.img, multiply the pixel values by 0.1, and then output
every 5th pixel to output.pgm. If you don't specify an output
file name here, it would default to "input_1_001.pgm".
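The box/zoom/scale steps can be sketched in Python, leaving out the ADSC file reading altogether. This is not the adsc2pgm C source: the function name is invented, the image is assumed to be already in memory as a list of rows, and clipping to 0-255 is an assumption about how out-of-range values are handled.

```python
def to_pgm(pixels, box=None, zoom=0.25, scale=1.0):
    """Return a plain-text (P2) PGM string for a 2-D list of pixel values.

    box   = (x1, y1, x2, y2), inclusive pixel coordinates; None = whole image
    zoom  = size scale; 0.25 keeps every 4th pixel in each direction
    scale = factor applied to each pixel before clipping to 0..255
    """
    step = max(1, round(1 / zoom))
    if box is None:
        x1, y1, x2, y2 = 0, 0, len(pixels[0]) - 1, len(pixels) - 1
    else:
        x1, y1, x2, y2 = box
    rows = []
    for y in range(y1, y2 + 1, step):
        rows.append([min(255, max(0, int(pixels[y][x] * scale)))
                     for x in range(x1, x2 + 1, step)])
    header = "P2 %d %d 255" % (len(rows[0]), len(rows))
    body = "\n".join(" ".join(str(v) for v in row) for row in rows)
    return header + "\n" + body + "\n"
```

Writing the returned string to a file gives something ImageMagick and most other converters will read directly.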
adsc2sgi (sgi,
c source)
same as adsc2pgm, but outputs an SGI *.rgb file.
osc2pgm
(sgi, linux,
c source)
Same as adsc2pgm, except that it converts an MSC R-axis
II or R-axis IV image file into a Portable
Greymap image file.
osc2sgi (sgi,
c source)
same as osc2pgm, but outputs an SGI *.rgb file.
tuencode.awk
Converts a list of text numbers to a (uuencoded) binary file of
the same composition. It is a convenient way to create an arbitrary binary
file without having to write and compile a c program to do it. The command
pipeline:
usage: echo "10 20 30 255" | awk -f tuencode.awk | uudecode
will create a four-byte binary file called binout.bin
containing the byte values: 0x0A, 0x14, 0x1E and 0xFF. This way, pure
byte fields are made, regardless of the endianism of the computer hardware.
Any numbers > 255 will be converted to the remainder of number/256.
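The number-to-byte rule is simple enough to show in one line of Python. This sketch (function name invented) covers only that step and skips the uuencoding, which the awk script needs because awk cannot write raw bytes directly.

```python
def numbers_to_bytes(text):
    """Convert whitespace-separated integers to raw bytes, one byte each.

    Values above 255 wrap around to the remainder of number/256.
    """
    return bytes(int(token) % 256 for token in text.split())
```

For example, numbers_to_bytes("10 20 30 255") gives the four bytes 0x0A, 0x14, 0x1E and 0xFF, the same as the pipeline above.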
refmac2o.awk
Jiffy for converting a canonical refmac pdb file (with multiple conformers)
into something more friendly to o. Multiple conformer residues are converted
to "insertion" residues in the sequence, which o understands.
You can then rebuild each conformer as independent (but, admittedly, spatially
overlapping) residues.
o2refmac.awk
Jiffy for reversing the refmac2o.awk procedure on a sam-atom-out pdb
file from o.
Back to the Elves Page.
This page is not finished. It will never be finished, and neither will
yours. Admit it.
James Holton <JMHolton@lbl.gov>