P1 Lysozyme example

These are notes I made while processing this data, so it is like a diary
of what I did. The notes are raw, so you will see my "Ooops" every now
and then.

I have downloaded what appear to be 6 scans. There are more scans, but I
will not use them for now.

Scan  SeqStart  SeqEnd  Dist  2theta  Start  End  Inc
a     0001      0060    400   0       -90    90   3.0
b     0001      0060    400   0       -90    90   3.0  (phi 180)
c     0001      0120    200   0       -90    90   1.5
d     0001      0120    200   0       -90    90   1.5  (phi 180)
e     0001      0360    100   0       -90    90   0.5
f     0001      0360    100   0       -90    90   0.5  (phi 180)

By examination, I see that a & b are related and so are c & d. It
appears that the crystal is rotated 180 degrees around phi or omega
before collecting b and d. The direction of rotation is important since
these scans start at -90 degrees. The phi axis vector is either (1 0 0)
or (-1 0 0). Trial and error shows it is (-1 0 0). Modified d*TREK on
2011-Apr-27 to have the correct gonio vectors and to use the
CRYSTAL_GONIO_VALUES keyword in the header.

=======================================================================

The basic problems with this crystal are rather trivial:

P1. The crystal ends up being split at some point, so care must be
    taken to index from the major part of the crystal and not be misled
    by the smaller satellite bits of diffraction. This problem is
    overcome by using higher resolution reflections for the refinement
    and indexing. d*TREK allows one to specify different resolutions
    for the different steps, including during integration. For example,
    we integrate the entire active area of the detector, but use only
    the higher resolution spots to refine the crystal and experimental
    parameters.

P2. The beam center is wrong in every scan. This is easily overcome by
    using dtdisplay to overlay a few images and inspecting the
    diffraction pattern. The dtdisplay "Beam circle" cursor can
    trivially get the beam center.

P3. This is more insidious. The rotation axis is not perpendicular to
    the X-ray beam, and the detector position is not well known.
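To see why the sign of the phi axis vector matters for scans collected from -90 to +90 degrees, note that rotating by +theta about (1 0 0) is the same physical motion as rotating by -theta about (-1 0 0), and at exactly 180 degrees the two axes give the identical matrix. A minimal Python sketch of this (just the standard Rodrigues rotation formula, not d*TREK code):

```python
import math

def rot_matrix(axis, deg):
    """Rodrigues rotation matrix for a unit axis and an angle in degrees."""
    x, y, z = axis
    t = math.radians(deg)
    c, s, C = math.cos(t), math.sin(t), 1.0 - math.cos(t)
    return [
        [c + x*x*C,   x*y*C - z*s, x*z*C + y*s],
        [y*x*C + z*s, c + y*y*C,   y*z*C - x*s],
        [z*x*C - y*s, z*y*C + x*s, c + z*z*C],
    ]

def close(A, B, tol=1e-12):
    return all(abs(a - b) < tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

# Flipping the axis sign reverses the apparent rotation direction,
# which is why only one choice predicts the -90 -> +90 scans correctly:
assert close(rot_matrix((1, 0, 0), 30.0), rot_matrix((-1, 0, 0), -30.0))

# At exactly 180 degrees both axes give the same matrix, so the 180-deg
# phi flip alone cannot distinguish them; the scan direction (trial and
# error) is what settles it:
assert close(rot_matrix((1, 0, 0), 180.0), rot_matrix((-1, 0, 0), 180.0))
```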
    For example, the longer crystal-to-detector distance of scans a & b
    means that the detector sags and has a significant 2-theta angle
    (it is NOT at 2theta 0 degrees). We overcome these kinds of
    problems by doing a global refinement of all parameters before we
    start to integrate. In order to do a global refinement, we find
    spots on many images rather than on one image.

P4. Many reflections are saturated. They are deleted by d*TREK by
    default, but one could change this.

P5. The multiple scans are not a problem by themselves even when
    merged together. However, I suggest that each scan be processed
    separately, then scaled separately at first to see if there are
    any problems. One can reject reflections from any scan, too. Then
    combine all the reflections and scale together.

P6. Mosaicity. It may make sense to fix the mosaicity for some of the
    scans during integration. We will test a few hypotheses.

P7. The default size for spot circles in dtdisplay is 9 pixels, which
    is too small to see them, so change it to 33
    (dtdisplay>Edit>Refln view props...>Size 33).

===========================================================================

Things to watch out for:

W1. Make sure predictions are dead on. Watch during integration with
    dtdisplay, especially in the corners and for
    widely-spaced-in-rotation images.

W2. Be sure to mask out the shadows of the beam stop.

W3. Watch how I created the refinement macro to use during
    integration. It is tricky, but better than the default macro. In
    essence: restrict resolution, use -cycles 30 and multiple -go
    steps:
    ... -reso 2 .5 -cycles 30 -verbose 0 -go -go -go -go -verbose 1 -go

W4. During scaling, set the -sigma higher than normal if the scale
    factors do not vary smoothly. Then if still not smooth, reject
    some Batch IDs (that is, reflections with a given Batch ID).

W5. We will use the d*TREK Prefix to name files and results. The
    prefix will be a_ for scan a, b_ for scan b, ..., and f_ for
    scan f.
    In dtintegrate, we will use Batch Prefix 1, 2, 3, ..., 6 for scans
    a, b, c, ..., f, respectively.

W6. During scaling together of all scans, we may wish to "pre-scale"
    scans by multiplying I and sigI by a scale factor, so that the
    plots look nice. See prescale.csh.

We will process the separate scans in subdirectories A, B, C, D, E, F.

W7. I saved ALL the log files and most of the other files. Since
    d*TREK does automatic versioning of files (see the manual), the
    first version is file_1.log, the 2nd is file_2.log, while the most
    recent file will not have a version number: file.log. This may
    help you understand the path I took to get there.

================================================================

Scan A

dtdisplay>File>New>Overlay ... 1-6, use the BeamCircle to set the beam
center.
Process>dtprocess...
In dtprocess, change the Prefix to "a_", then "Write a_dtprocess.head".
For autoindexing choose the P1 spacegroup.
If we Predict for image 1, things are good, but when we Predict for
image 60, it does not match perfectly. This suggests that the
experimental hardware is not exactly as specified. So to refine all
this (don't forget dtdisplay>Edit>Refln view props...>Size 33):
Set mode to Manual. Go to Find. Find spots on images 1, 15, 30, 45,
and 60.
Now do a better job of refinement, since we suspect the hardware
values. First use the macro above.
Notice that at the corners, predictions are still not perfect, so
suspect lower reso spots have more weight; change reso to
"-reso 2.5 0.5" and allow lower sigma spots with "-sigma 3".
Change the rejection limits to 1 1 1 (larger distances between
predicted and observed get included, but not too far away).
Integrate - Be sure to set BatchPrefix to 1 (we will use 2 for scan b,
etc.). Double check that the refinement macro was used. If it wasn't,
then you forgot to click on (i.e. select) the resultant
a_dtrefine.head earlier on.
Scaling - try the defaults: Run Scale. Click on Utils>PlotStats...
The batch scale factors were not used, so increase Sigma from 5 to 10
and Run Scale.
Looks better, Emul is < 1, so very nice. I restrained the batch scale
factors with 0.001 instead of 0.002. Also be sure to output
a_dtunavg.ref for later use.
Rmerge = 2.9%

=================================================

Scan B

We could use the detector position and beam center from Scan A, but
let's just repeat what we did for Scan A, but with the scan B images.
In dtscaleaverage, use -sigma 10.
Rmerge = 2.7%

=================================================

Scan C

New detector position. And we need to mask out the beamstop shadow.
Index from image 1, predict for image 120 (see c1.xwd): not so good,
so find spots on images 1, 30, 60, 90, 120, 45, 75, 105 and refine
with the resolution restricted to higher resolution.
Predictions are better, but not perfect, so restrict reso to 1.5 to
0.5. OK, that looks good. So integrate (don't forget BatchPrefix 3).
While integrating, I noticed that maybe the mosaicity should be fixed
to a larger value like 0.8 or so. Let's see what scaling gives us,
then perhaps re-integrate.
Scaling 1: Seems OK; set -sigma 10 and repeat. Rmerge = 3.8%, but Emul
is 1.90 and not below 1 as before. Let's re-integrate with fixed
mosaicity.
Re-integrate: Set the d*TREK prefix to c_m0p8_ and in Integrate use
-mosaicitymodel 0 0.8. Since I used the .head file from the END of the
previous integration, I wanted to make sure predictions at the
BEGINNING of integration still match nicely. They do.
Scaling 2: Rmerge = 3.0%, so that's probably all I want to mess with
here.

==============================

Scan D

Proceed like Scan C (use c_beam.mask).
Bad matching predictions for image 120. So try something drastic:
reindex with the c_dtfind.ref from multiple images. Then try
refinement with -reso 1.2 0.5, then with the macro used in this
exercise. Now the predictions look outstanding.
Integrate with a fixed mosaicity of 0.8.
Scaling: Use the defaults. Rmerge = 3.0%

===================================================================

Scan E

New scan, need a new bad pixel mask.
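The per-scan Rmerge values quoted here follow the standard definition, Rmerge = sum_hkl sum_i |I_i - &lt;I&gt;| / sum_hkl sum_i I_i, over symmetry-equivalent measurements. A small illustrative Python sketch of that sum (toy data; this is not how dtscaleaverage computes it internally):

```python
from collections import defaultdict

def rmerge(measurements):
    """measurements: list of (hkl, intensity) pairs, where hkl is the
    symmetry-reduced index.  Returns
    sum_hkl sum_i |I_i - <I>| / sum_hkl sum_i I_i."""
    groups = defaultdict(list)
    for hkl, intensity in measurements:
        groups[hkl].append(intensity)
    num = den = 0.0
    for ints in groups.values():
        mean = sum(ints) / len(ints)
        num += sum(abs(i - mean) for i in ints)
        den += sum(ints)
    return num / den

# Toy example: two unique reflections, each measured a few times.
data = [((1, 2, 3), 100.0), ((1, 2, 3), 104.0), ((1, 2, 3), 96.0),
        ((-2, 0, 1), 50.0), ((-2, 0, 1), 52.0)]
print(round(rmerge(data), 4))   # -> 0.0249
```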
Find spots on a number of images, use them in dtrefine; things look
great. Predictions match for images 1 to 360.
Integrate with Batch prefix 5.
Scaling: With the defaults, a couple of images have bogus scale
factors, so change from -sigma 5 to -sigma 10 to see if that makes
things better... It does not quite work, so exclude batch 50321 from
scaling and scale again. This worked! Rmerge = 2.9%

=====================================================================

Scan F

Like scan E, use e_beam.mask.
Use the same refine macro:
... +All -sigma 2 -reso 1 0.5 -rej 1 1 1 -cycles 30 ... -verbose 0 -go -go -go -go -verbose 1 -go
Scan F proceeds smoothly.
Scaling: Suggest the last batch should be excluded: 60360.
Rmerge = 3.0%
Ooops! All the files in the F subdirectory had d*TREK prefix e_.
Simply re-run dtscaleaverage with the f_ prefix.

=====================================================================

OVERALL SCALING of the 6 scans

There are many possible ways to scale, but we will start with the most
straightforward: we will combine and scale the previously scaled (but
unaveraged) measurements from the 6 scans that have been processed.
Copy [a-f]_dtunavg.ref to the SCALE directory.
Copy e_dtintegrate.head to the SCALE directory.
Now use "dtprocess e_dtintegrate.head -nodisplay" in the SCALE
directory. In the Setup dialog, change the d*TREK output file prefix
to abcdef_ and then click on "Write abcdef_dtprocess.head".
Then click on "[Merge refln files]" in the flow bar. Select the 6
*_dtunavg.ref files and click on "Run merge". (You will need to use
ctrl-click to add to the selection.) This will also create an
abcdef_dtreflnmerge.scom file that you will edit later.
Select "Scale/Average" in the flow bar. You will now scale the merged
file. Since the absorption correction was already applied, it does not
need to be applied again (but see below), so select "Batch only".
Since the error model has already been adjusted, it does not need to
be adjusted again.
Set "Weight multiplier" to 1.0 and "Weight addend" to 0.0001.
Since batches were already rejected from the *_dtunavg.ref files, no
batches need to be rejected at first.
Click on "Run scale". Then click on Utils>PlotStats... (see scale1.xwd)
The incoming intensities are wide-ranging, as shown in the scale
factor table and in the plot. This is OK, but not conducive to
understanding what is going on. Let's get a better plot by pre-scaling
all the scans before running dtscaleaverage. We do this by examining
the log file and estimating a single scale factor to apply separately
to each of the 6 scans. We could also base this on exposure time, but
it is also easy to do this by inspection of the numbers.
Let's try to make everything close to scan f. In the initial scaling,
scan f has scale factors around 0.6, and so does scan e (that makes
sense!), so we will not scale them, but will scale the other scans to
them. It looks like scans a and b need a factor of 100, while scans c
and d need a factor of 15.
To do this, see prescale.scom in the directory. We multiply the
intensities and sigmas by the scale factors with these options of
dtreflnmerge:
  -fIntensity\*=100 -fSigmaI\*=100
where 100 is the multiplying factor and \*= is an "escaped" *=
operator. So edit abcdef_dtreflnmerge.scom, save the changes as
prescale.scom, then run that.
Oops, the output file name was *dtprofit.ref when it should have been
*dtunavg.ref, so simply edit prescale.scom and re-run.
Now back in dtprocess, select the newly merged reflnlist file. It is
not found in the "Reflnlists" list at first, so click on
File>Reflnlist... and select it. This will also refresh the list in
the main dialog.
Click "Run Scale".
The scales now plot better (scale2.xwd), but there are issues with the
scale factors of scans a and b. To try to calm them down, I will use
-sigma 10. That seemed to work (see scale3.xwd). Let's also restrain
the batch scale factors more with -batchrestrain 0.001.
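The pre-scaling done by prescale.scom via -fIntensity\*= and -fSigmaI\*= amounts to multiplying each reflection's intensity and sigma by one constant per scan. A hypothetical Python equivalent of that arithmetic (the dict field names are illustrative; real .ref files are handled by dtreflnmerge, and the factors mirror the ones estimated above):

```python
# Per-scan multipliers estimated above: a,b -> 100; c,d -> 15; e,f -> 1.
# Scans are identified by the leading Batch prefix digit (1..6 for a..f).
PRESCALE = {1: 100.0, 2: 100.0, 3: 15.0, 4: 15.0, 5: 1.0, 6: 1.0}

def prescale(refln):
    """refln: dict with 'batch' (e.g. 50321), 'fIntensity', 'fSigmaI'.
    Returns a copy with I and sigI multiplied by the scan's factor."""
    scan_digit = int(str(refln["batch"])[0])   # leading digit = Batch prefix
    k = PRESCALE[scan_digit]
    out = dict(refln)
    out["fIntensity"] = refln["fIntensity"] * k
    out["fSigmaI"] = refln["fSigmaI"] * k
    return out

# A reflection from scan a (Batch prefix 1) gets multiplied by 100:
r = prescale({"batch": 10001, "fIntensity": 2.5, "fSigmaI": 0.4})
print(r["fIntensity"], r["fSigmaI"])
```

Note that multiplying I and sigI by the same factor leaves I/sigI unchanged, so this only shifts the batch scale factors into a nicer range for plotting.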
That is better, but there are still little unexplained blips for batch
IDs 50001 and 40094, so reject them and scale again.
Looks OK, so we are done.

==============

Another way to Scale?

Instead of using the *_dtunavg.ref files to scale, let's use the
*_dtprofit.ref files instead.
First combine them with the prescale_dtprofit.scom script.
Then in dtprocess, change the d*TREK prefix to abcdef_dtp_ in order to
keep things separated from the previous scaling.
In the Scale/Average dialog, use "Batch+4th 3D", Weight multiplier -2,
Weight addend -0.03, no rejects, and batchrestrain 0.002 (i.e. the
defaults).
Ooops, had the wrong file name (extra "_e" in the name) for the merged
*_dtprofit.ref files. So edit prescale_dtprofit.scom and run again.
Then "Run scale" again. Results are in abcdef_dtp_dtscaleaverage.log.
In theory, it looks OK, but in practice, I do not trust the error
model (with Emul 0.15). I can try to manually adjust this and see the
result. Change the d*TREK prefix to 2abcdef_dtp_ to keep these results
and tests separate from the others. Set Emul to 1 and Eadd to 0.05.
The results look rather good, so I'll reject batch IDs 40096 & 50001
and call it a day.

FINAL results from 2abcdef_dtp_dtscaleaverage.log

Summary of data collection statistics
-------------------------------------------------------------
Spacegroup                           P1
Unit cell dimensions                 27.08 31.26 33.76 87.97 71.99 67.86
Mosaicity                            0.20
Resolution range                     28.83 - 0.60 (0.62 - 0.60)
Total number of reflections          981757
Number of unique reflections         182609
Average redundancy                   5.38 (2.77)
% completeness                       73.8 (8.9)
Rmerge                               0.056 (0.551)
Rmeas                                0.060 (0.665)
RmeasA (I+,I- reflns kept apart)     0.062 (0.647)
Reduced ChiSquared                   1.20 (1.50)
Output <I/sigI>                      16.4 (1.5)
-------------------------------------------------------------
Note: Values in () are for the last resolution shell.
995105 reflections in data set
     2 reflections rejected (|ChiSq| > 50.00)
 13348 reflections total rejected ( 1.34% |Deviation|/sigma > 40.14)

FINAL output file for the next step: 2abcdef_dtp_dtscale.ref

==================================================================

As an alternative, let's cut the resolution to 0.65 Angstrom and check
the statistics. Use 0.65A_2abcdef_dtp_ for the PREFIX. The results end
up in 0.65A_2abcdef_dtp_dtscaleaverage.log and
0.65A_2abcdef_dtp_dtscale.ref.
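As a quick sanity check on the final log, the average redundancy is just total/unique reflections from the summary table, and the quoted rejection percentage matches the counts in the rejection report:

```python
# Numbers copied from the final dtscaleaverage log above.
total, unique = 981757, 182609           # total and unique reflections
print(round(total / unique, 2))          # average redundancy -> 5.38

in_dataset, rejected = 995105, 13348     # from the rejection report
print(round(100.0 * rejected / in_dataset, 2))   # -> 1.34 (%)
```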