I dare anyone who considers themself an expert macromolecular crystallographer to derive the structure in 3dko from this data.
Twinning has long been the kryptonite of anomalous phasing methods. Yes, there are a few examples out there, usually with a small number of sites, where a clever crystallographer (usually with surname "Dauter") was able to figure out the heavy atom partial structure despite the twinning. But, in general, heavy-atom finding programs get very confused by twinning. And a 50:50 "perfect twin" might be the most frustrating of all. You can see the anomalous differences. Even measure them very well. But you still can't make sense of them.
A major reason for this inefficacy in our software is that methods developers seldom get their hands on "interesting" twinned cases that have anomalous differences. Perhaps it is because the people who collected them are too embarrassed to admit they had to find another crystal form? Also, it is difficult to pick and choose what twin fraction the data have. 50:50 is generally considered impossible to solve, but what about 60:40, or 70:30?
Well, here you go:
twin_5050.mtz
twin_5149.mtz
twin_5248.mtz
twin_5347.mtz
twin_5446.mtz <-- impossible?
twin_5545.mtz <-- harder
twin_5644.mtz <-- hard
twin_5743.mtz <-- possible
twin_5842.mtz
twin_5941.mtz <-- Pavol Skubak Crank2 solution
twin_6040.mtz <-- Takanori Nakane SHELX[CDE] solution
twin_7030.mtz
twin_8020.mtz
twin_9010.mtz
twin_9901.mtz
All twin fractions in 1% steps can be downloaded at once in mtz or xds format.
The "right answer" here is the PDB entry 3dko, modified slightly to have SeMet residues.
The original structure is not from twinned data, but I selected 3dko because it belongs to a space group that CAN be twinned.
3dko has 12 Met residues, so there are 12 selenium sites to find. How hard could that be? Well, here is the success rate with shelxd:
So, in this case the anomalous signal is quite strong, and shelxd finds all 12 sites with twin fractions
as high as 0.44, provided it runs for up to 100,000 trials. However, if you "cheat" and use the
phases of the final, refined, correct model to compute a phased anomalous difference Fourier,
then all 12 sites are clearly resolved above the tallest noise peak all the way out
to a twin fraction of 0.5.
They are actually from simulated diffraction patterns created for an educational workshop to demonstrate to novice crystallographers what twinning is. Specifically, there are two datasets: A and B, which are identical in every way except the crystal orientation. You can solve either one of them by SAD, no problem. However, if these two crystals were in the same beam at the same time, the diffraction pattern you'd get would be the pixel-by-pixel sum of the relevant images from the A and B datasets. You can generate this sum using my provided img_mix.com script. Just run it with no options to get online help. In this way, you can get any twin fraction you want. But, for starters, I'd say try your hand at the 80:20 case, and then see if you can get anywhere with 56:44.
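The pixel-by-pixel mixing idea can be sketched in a few lines of Python. To be clear, this is not the img_mix.com script and says nothing about its options. It is a minimal illustration that assumes each image has already been read into a flat list of pixel values that are linear in photon count, with any detector offset already subtracted. The function name `mix_images` and the `twin_fraction` argument are hypothetical.

```python
def mix_images(pixels_a, pixels_b, twin_fraction):
    """Weighted pixel-by-pixel sum of two images of the same frame.

    pixels_a, pixels_b: flat sequences of pixel values from crystals A and B
                        (assumed offset-subtracted and linear in photons).
    twin_fraction:      weight given to crystal B, between 0 and 1;
                        0.2 corresponds to an 80:20 mixture.
    """
    if len(pixels_a) != len(pixels_b):
        raise ValueError("images must have the same number of pixels")
    if not 0.0 <= twin_fraction <= 1.0:
        raise ValueError("twin_fraction must be between 0 and 1")
    weight_a = 1.0 - twin_fraction
    weight_b = twin_fraction
    # Each output pixel is the weighted sum of the two input pixels.
    return [weight_a * a + weight_b * b for a, b in zip(pixels_a, pixels_b)]


if __name__ == "__main__":
    # Two tiny made-up "images" of four pixels each, mixed 80:20.
    image_a = [10.0, 20.0, 0.0, 100.0]
    image_b = [30.0, 40.0, 50.0, 0.0]
    print(mix_images(image_a, image_b, 0.2))
```

Keeping the two weights summing to one holds the total exposure constant, so only the twin fraction changes from one mixture to the next. That choice is an assumption of this sketch.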
Ever wondered how accurate all those "twin fraction estimates"
you get from your favorite programs are? Well, here you go:
Here "pointless" is just running the standard L-test, which is known to underestimate high twin fractions. Interestingly, refmac's
twin refinement, even given the right answer in the first place, tends to estimate a little high for low twin fractions. The
maximum-likelihood twin fraction estimated by phenix.xtriage (the last one reported in the log file) seems to be the most accurate
overall.
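For intuition about what the L-test measures, here is a minimal Python simulation. It is not what pointless actually does. It assumes ideal acentric Wilson statistics (exponentially distributed intensities), no measurement error, and that the two reflections being compared are not related by the twin law. The function names, `alpha`, and the simulated data are all hypothetical. Under those assumptions the mean of |L| = |I1 - I2| / (I1 + I2) should come out near 0.5 for untwinned data and near 0.375 for a perfect twin.

```python
import random


def twinned_intensity(alpha, rng):
    """One observed intensity from a twin with twin fraction alpha.

    The two twin-related true intensities are drawn independently from an
    exponential distribution (ideal acentric Wilson statistics, assumed).
    """
    i_a = rng.expovariate(1.0)
    i_b = rng.expovariate(1.0)
    return (1.0 - alpha) * i_a + alpha * i_b


def mean_abs_l(alpha, n_pairs=50_000, seed=0):
    """Average |L| over simulated pairs of unrelated observed intensities."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_pairs):
        j1 = twinned_intensity(alpha, rng)
        j2 = twinned_intensity(alpha, rng)
        total += abs(j1 - j2) / (j1 + j2)
    return total / n_pairs


if __name__ == "__main__":
    for alpha in (0.0, 0.1, 0.2, 0.3, 0.4, 0.5):
        print(f"twin fraction {alpha:.1f}: <|L|> = {mean_abs_l(alpha):.3f}")
```

Real data add measurement error, anisotropy and other complications on top of this idealized picture, so the printed values are for orientation only.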
Using the right sequence information is not cheating, since that is generally something you will know before you sit down to collect data.
However, this does raise another important question: can MR-SAD let you get away with a more distant homolog than regular MR alone? What about in the presence of twinning?
James Holton <JMHolton@slac.stanford.edu>