MANUAL

Chapter 8: X-ray basics

Glossary of X-ray terms

absorption

The absorption of x-rays as they pass through the protein crystal is probably the largest source of systematic error in macromolecular crystallography. Surprisingly, it is almost always ignored. Like any other kind of light, x-rays are absorbed by protein and water in your crystal (and loop). About 20% of a 1.54A x-ray beam's intensity is absorbed after passing through 200 microns of a protein crystal. If there are metals in the crystal, then the absorption goes up a lot, especially if you are using x-rays at the metal's absorption maximum (like you do for MAD!). These numbers are somewhat alarming, and they probably should be. However, as long as the average linear pathlength experienced by each photon as it passes through the crystal is the same, the effect of absorption can be reduced to a simple scale factor.

Unfortunately, since the photons contributing to each spot emerge from the crystal at different angles, and spots only occur when the crystal is oriented just so, every observed spot has at least a slightly different average pathlength through the crystal. This variability in pathlength is most pronounced in plate and needle crystals, where a diffracted x-ray beam could be traveling through 500 microns of crystal in one direction, but only 50 in another. The major effects of absorption can be corrected by fudging the scale and B factors assigned to the diffraction images, perhaps even using an anisotropic B factor. Localscaling is a popular solution, but, as its name implies, it can only correct for relative absorption effects in a small region of reciprocal space. Perhaps the most successful absorption corrections have involved calculating intensities from a refined molecular model, and then re-scaling the raw diffraction data, using the calculated intensities as a guide to the "true" intensity values. Such a procedure is, obviously, prone to bias, so Scaler Elves have introduced the application of cross-validation (Free-R) to absorption corrections.
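
As a rough illustration of why pathlength matters, the "20% in 200 microns" figure can be turned into an attenuation coefficient. This is only a sketch: it assumes simple exponential (Beer-Lambert) attenuation in a uniform crystal, ignores the loop and any metals, and the function name is made up for this example.

```python
import math

# Assumption: simple exponential (Beer-Lambert) attenuation, with the
# coefficient derived from "about 20% absorbed in 200 microns" at 1.54 A.
MU_PER_MICRON = -math.log(1.0 - 0.20) / 200.0

def transmitted_fraction(path_microns, mu=MU_PER_MICRON):
    """Fraction of the beam intensity that survives a given pathlength."""
    return math.exp(-mu * path_microns)

thin = transmitted_fraction(50.0)    # short way through a plate crystal
thick = transmitted_fraction(500.0)  # long way through the same plate
print(f"50 um: {thin:.2f}  500 um: {thick:.2f}  ratio: {thin / thick:.2f}")
```

Under these assumptions the short path transmits roughly 95% of the intensity and the long path roughly 57%, which is far too large a difference to absorb into a single scale factor.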

anomalous

Anomalous scattering results when some of the atoms in your crystal scatter "out of phase" with the rest of the atoms in the crystal. Normally, the carbon, nitrogen, oxygen, sulphur, and even metal atoms in your crystal all scatter x-ray photons in more-or-less the same way. This is largely due to the fact that x-ray photons have such short wavelengths that all electrons, even ones bound tightly in the core of atoms, aren't perturbed very much by the electric field of the passing photon, and can be treated like "free" electrons. However, if the passing photon has an energy close to the binding energy of one of the atom's electrons, then this electron can be ejected from the atom. This ejection affects both the degree of scattering and the timing of the scattering event. That is, there is a delay in the time it takes to scatter that photon. This delay in the scattering means that the contribution of the anomalously scattering atom to the diffraction pattern as a whole is different, depending on the wavelength of x-rays used to illuminate the crystal.

phi - see angles

chi - see angles

omega - see angles

2-theta - see angles

kappa - see angles

angles

There are several customary names for the various angles involved in the diffraction experiment. The most commonly referred-to angle is phi, which is sometimes called the "spindle axis". Although other rotation axes are available on some, more complex diffractometers, "phi" is almost always the angle varied when collecting data from protein crystals. The second most commonly referred-to angle is 2-theta. It can be a little confusing why 2-theta is used, instead of just calling it theta, but theta is, traditionally, the angle between the Bragg plane in the crystal, and the x-ray beam. Since the x-rays are "reflected" off of this plane, the take-off angle of the diffracted x-ray beam (spot) relative to the main beam is 2*theta. In the old, diffractometer days, the detector 2-theta angle determined the position of the pinhole at the center of the detector (Geiger counter or the like), and was used to easily calculate the resolution of the spot being measured. Upon the introduction of area detectors (which were mounted on the same diffractometer machines) the designation of the 2-theta angle was preserved.
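
The link between 2-theta and resolution can be sketched with Bragg's law (wavelength = 2 * d * sin(theta)). The helper names below are hypothetical, and the spot-position function assumes a flat detector mounted perpendicular to the beam with no 2-theta offset.

```python
import math

def two_theta_from_spot(radius_mm, distance_mm):
    """Take-off angle (degrees) of a spot on a flat detector normal to the beam.

    radius_mm is the distance from the direct-beam position to the spot.
    """
    return math.degrees(math.atan2(radius_mm, distance_mm))

def resolution_from_two_theta(two_theta_deg, wavelength_A):
    """Bragg's law: d = wavelength / (2 * sin(theta)), with theta = 2-theta / 2."""
    theta = math.radians(two_theta_deg) / 2.0
    return wavelength_A / (2.0 * math.sin(theta))

# A spot 100 mm from the beam center, detector at 150 mm, 1.0 A x-rays.
angle = two_theta_from_spot(100.0, 150.0)
print(f"2-theta = {angle:.1f} deg, d = {resolution_from_two_theta(angle, 1.0):.2f} A")
```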

Other diffractometer axes can be used for rotation, but phi is the most common. The omega angle, for example, is around an axis perpendicular to the beam (usually coincident with phi). The chi angle is the rotation around the x-ray beam axis. The kappa angle is a bit more nebulous, and

area detector

Although the term "area detector" now usually refers to multiwire type detectors, an "area detector" is just any kind of x-ray detector that can collect diffraction information on an array of positions at once, instead of just one point in space at a time, as with a diffractometer. Therefore, film-based, image plate, CCD, and multiwire detector types all "count" as area detectors.

bandwidth

Bandwidth is the inverse of spectral dispersion. It relates the "error" in the photon energy to the photon energy itself. For most kinds of monochromators, this ratio is relatively constant over a range of photon energies. A bandwidth of 4000 means that, at 1.00 Angstrom, the x-ray spectrum seen by your crystal is a (roughly Gaussian) distribution about 3 eV wide.
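
The 3 eV figure can be checked with a few lines of arithmetic. This sketch takes "bandwidth" to mean photon energy divided by the width of the energy distribution, and uses the 12398.4245 conversion constant given under "wavelength" in this glossary; the function name is made up for illustration.

```python
HC_EV_ANGSTROM = 12398.4245  # eV * Angstrom, the constant quoted in this glossary

def energy_width_eV(wavelength_A, bandwidth):
    """Width of the x-ray spectrum (eV), taking bandwidth = energy / width."""
    photon_energy = HC_EV_ANGSTROM / wavelength_A
    return photon_energy / bandwidth

print(f"{energy_width_eV(1.00, 4000):.1f} eV")  # bandwidth of 4000 at 1.00 A
```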

Bragg condition

In order for x-ray diffraction to occur from a crystalline lattice, the Bragg condition must be satisfied. The Bragg condition depends on the angle of the incident x-ray beam as it enters the crystal lattice and the direction at which the diffracted beam exits the lattice. It is met only when the scattered waves from all the atoms in the lattice are in phase, and interfere constructively. As you might have guessed, the Bragg condition is rarely met. In fact, for a perfect crystal, it is only met at infinitesimally small points in the 4-dimensional graph of incident and take-off x-ray beam angles in and out of the lattice. (2 angles in, 2 angles out). This is why x-ray diffraction only occurs as discrete spots on the detector face. For real crystals however (that aren't perfect) these infinitesimally small points become smeared out a bit (see mosaicity), and this is why you see more than one spot on the detector at a time.

A geometric construction (the Ewald construction) makes the Bragg condition easier to visualize.

Ewald sphere

The Ewald construction makes the Bragg condition easier to visualize (easier than a 4-dimensional graph, that is). The Ewald construction is done in reciprocal space, where every distance is measured in 1/Angstroms. You can plot the reciprocal-space version of the crystal lattice in reciprocal space as a regular, 3-D array of dots (lattice points). As you rotate the crystal in real space, it also rotates the reciprocal lattice in reciprocal space (by the same angle). Now, imagine a sphere whose radius is the inverse of the x-ray wavelength (pretty large, compared to the lattice spacing). The surface of this sphere passes through the origin of reciprocal space (and the center of the reciprocal lattice). The center of the sphere lies on the line defined by the x-ray beam, passing through the origin. This is the Ewald sphere. The Bragg condition is satisfied only when a reciprocal lattice point touches the surface of the Ewald sphere. As we rotate the reciprocal lattice, we can, eventually, sweep almost all the reciprocal lattice points through the Ewald sphere (and record their intensities on the detector). The only lattice points we can't make pass through the sphere are the ones that are very close to the crystal rotation axis. However, these "cusp" lattice points are usually related to observed lattice points by some kind of symmetry operation. If not, they can be picked up by rotating the crystal about another axis (usually kappa), and sweeping out an appropriate range of phi values again.
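
The construction is easy to try numerically. The following is a minimal sketch under stated assumptions, not how any particular data-processing program does it: the beam is taken to travel along +z, so the sphere center sits at (0, 0, -1/wavelength), and the `tolerance` parameter is a hypothetical stand-in for the finite width that real spots have.

```python
import math

def excitation_error(h, wavelength_A):
    """Distance (in 1/A) of reciprocal lattice point h from the Ewald sphere.

    Assumes the x-ray beam travels along +z, so the sphere (radius
    1/wavelength) is centered at (0, 0, -1/wavelength) and its surface
    passes through the origin of reciprocal space.
    """
    radius = 1.0 / wavelength_A
    center = (0.0, 0.0, -radius)
    return math.dist(h, center) - radius

def is_diffracting(h, wavelength_A, tolerance=1e-4):
    """True if h lies on the sphere surface to within a chosen tolerance."""
    return abs(excitation_error(h, wavelength_A)) < tolerance

# A lattice point 1 reciprocal Angstrom from the origin, placed where a
# 2-theta = 60 degree reflection would sit for 1.0 A x-rays.
on_sphere = (math.sin(math.radians(60.0)), 0.0, math.cos(math.radians(60.0)) - 1.0)
print(is_diffracting(on_sphere, 1.0), is_diffracting((0.0, 0.0, 0.5), 1.0))
```

Rotating the crystal amounts to rotating `h` before calling `excitation_error`; a spot appears on the image whose rotation range carries the point through zero.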

reciprocal space

The diffraction pattern you see from your crystal arises directly from the interaction of the Ewald sphere with your crystal lattice in "reciprocal space". As opposed to "real space", or "direct space", reciprocal space is related to real space in the same way that frequency is related to time. A point in "frequency space" corresponds to a well-defined frequency (measured in Hertz, or 1/seconds). Each point in "frequency space" can be represented as a perfect sine wave in "time space" (where things are measured in seconds). In a similar way, a particular point in reciprocal space corresponds to a repeating pattern of electron density vs position in real space. In real space, distances are measured in Angstroms, and in reciprocal space, distances are measured in 1/Angstroms. An infinitely sharp peak in reciprocal space, therefore, corresponds to a sine wave plotted against position (instead of time) repeating every so many Angstroms. Because we are interested in 3-D protein structures, we should extend this to three dimensions. A point in 3-D reciprocal space has three "spatial frequencies" and corresponds to a kind of 3-D sine wave in real space with three "wavelengths", one for each direction. These 3-D sine waves add up to produce your electron-density map in real space.

real space

Real space, sometimes referred to as "direct space", is the kind of space you are used to dealing with. You look at protein models in real space. In real space, you have three dimensions, and you measure distances in Angstroms.

Patterson space

Or "vector space". The appearance of an object in Patterson space can be calculated by plotting the length and direction of all the atom-to-atom vectors in yet another 3-D space. However, "xyz" in Patterson space is usually referred to as "uvw", where u represents the change in x, v the change in y, and w the change in z. If you have N atoms in your molecule, there will be N² peaks in Patterson space. As you might have guessed, Patterson space is a lot more complicated than direct space. Patterson space is interesting because you can obtain a representation of your protein in Patterson space directly from the diffraction data, without any need of phase information. Most proteins are too complicated to solve in Patterson space, but, if you subtract your protein's Patterson map from the Patterson map of a metal derivative of your protein, you will get a "difference Patterson" map, that (hopefully) corresponds to the representation of your metal sites in Patterson space. The constellation of a few metal sites can usually be deduced from their representation in Patterson space.
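
The N² count is easy to see by listing the vectors for a toy "molecule". This sketch uses made-up coordinates and only enumerates atom-to-atom vectors for an isolated molecule; it ignores crystal symmetry, lattice repeats, and peak heights.

```python
from itertools import product

def patterson_vectors(atoms):
    """All atom-to-atom vectors (u, v, w), including each atom to itself."""
    return [
        (x2 - x1, y2 - y1, z2 - z1)
        for (x1, y1, z1), (x2, y2, z2) in product(atoms, repeat=2)
    ]

# Hypothetical coordinates (in Angstroms) for three atoms.
atoms = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.0, 2.0, 1.0)]
vectors = patterson_vectors(atoms)
origin_peaks = sum(1 for v in vectors if v == (0.0, 0.0, 0.0))
print(len(vectors), "vectors,", origin_peaks, "of them at the origin")
```

Three atoms give nine vectors. Three of them are the zero-length self-vectors that pile up at the origin, and the other six come in +/- pairs, which is why Patterson maps are always centrosymmetric.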

inverse beam

Sometimes referred to as "anomalous" geometry, inverse beam just means rotating the crystal 180 degrees. The "anomalous wedge" is just a wedge collected 180 degrees away from the current wedge. Inverse beam geometry is often used to collect anomalous scattering information because it guarantees that the Friedel mate of each spot recorded in the first wedge will be recorded in the "anomalous wedge". Although it is possible to use symmetry to pick up Friedel (Bijvoet) mates too, the inverse beam geometry minimizes systematic errors incurred from absorption effects, and from measuring Bijvoet mates different numbers of times.

normal beam

Nowadays, almost all protein crystal diffraction experiments are done in what is called "normal beam" geometry. This just means that the rotation axis of the crystal is perpendicular (normal) to the x-ray beam. Most data-processing programs assume normal beam geometry, and can have serious problems processing data collected with a phi axis that is significantly non-perpendicular to the x-ray beam. A few programs (such as d*trek and HKL2000) can handle odd rotation angles, but most synchrotrons have pretty well aligned spindles anyway.

divergence

This is the angle (usually in degrees) at which the x-ray beam spreads out as it moves away from the collimator. For example, a 100 micron beam with a divergence of 0.06 degrees would be about 200 microns wide at 100 mm from the collimator.
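
The example works out as follows, assuming the quoted divergence is the full opening angle of the beam (so each edge moves outward at half that angle). The function name is hypothetical.

```python
import math

def beam_width_microns(initial_width_microns, divergence_deg, distance_mm):
    """Beam width after traveling distance_mm from the collimator.

    Assumes divergence_deg is the full opening angle of the beam.
    """
    half_angle = math.radians(divergence_deg) / 2.0
    spread = 2.0 * (distance_mm * 1000.0) * math.tan(half_angle)
    return initial_width_microns + spread

print(f"{beam_width_microns(100.0, 0.06, 100.0):.0f} microns")
```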

dispersion

This number is the inverse of bandwidth, and represents the relative width of the distribution of wavelengths in the x-ray beam. That is, if you plotted the "emission spectrum" of an x-ray source (say a graphite crystal monochromator) with a spectral dispersion of 1/1000 at 1.54 Angstroms, then you would see a peak centered at about 8051 eV with a width of about 8 eV.

wavelength

A wavelength is the distance, through space, over which a traveling wave begins to repeat itself.

For x-rays (and other kinds of light), each photon can be thought of as a traveling, vibrating electric field, moving through space at the speed of light. The speed of the wave (in this case, the speed of light), when divided by the frequency of this vibration, is the distance traveled by the wave over the course of a single cycle of vibration. This is called the wavelength.

Every photon also carries some energy with it, and this energy is proportional to the frequency of vibration of the photon: E=hv, where v is the frequency, and h is "Planck's constant". You have probably noticed that synchrotron physicists prefer to talk about their x-rays in terms of their photon energy, whereas crystallographers are used to thinking in Angstroms, so they prefer wavelengths. However, near an x-ray edge, it is arguable that photon energies are better units, because changes on the order of 1eV are important, but correspond to a (seemingly insignificant, and all-too-often rounded off) 0.0001 Angstrom change in wavelength.

To convert from a wavelength to a photon energy (or vice-versa), divide 12398.4245 by x, where x is either a wavelength (in A) or an energy (in eV). The constant 12398.4245 is Planck's constant (in eV*s) times the speed of light (in Angstroms/s).
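
Because the same division works in both directions, one small function covers both conversions. This sketch uses the constant quoted above; the function name is made up for illustration.

```python
HC_EV_ANGSTROM = 12398.4245  # Planck's constant * speed of light, in eV * Angstrom

def convert(x):
    """Wavelength (A) -> photon energy (eV), or photon energy (eV) -> wavelength (A)."""
    return HC_EV_ANGSTROM / x

energy = convert(1.54)  # photon energy of 1.54 A x-rays
# Wavelength change caused by a 1 eV step at a photon energy near 1 A.
shift = convert(12398.4245) - convert(12399.4245)
print(f"{energy:.1f} eV; a 1 eV step near 1 A is {shift:.5f} A")
```

The second number shows why wavelengths quoted to only three decimal places are not precise enough near an absorption edge.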

beam

X-rays used for crystal diffraction experiments are collimated into a narrow beam, usually 0.1-0.5mm in diameter. However, like any beam of light, the x-ray beam has some divergence to it. The other important properties of the x-ray beam are the wavelength of the x-ray photons in it, and the relative dispersion of the x-ray spectrum.

oscillation

The oscillation method of x-ray diffraction data collection is done by rocking the crystal over a small angle (around 1 degree) during the exposure. For most data-processing programs the crystal must move at a constant speed, spending an equal amount of time at every value of the angle between the start and end points. For the next exposure, the crystal rotation is advanced by the same small angle, and rocked again. For example, the crystal might be rocked between phi=0 and phi=1 degree for the first exposure, phi=1 and phi=2 degrees for the second, and so on. A collection of abutting oscillation diffraction images is commonly referred to as a wedge.

When collecting x-ray data, you are interested in the total intensity (number of photons) delivered to a particular spot (relative to all the others). However, for any particular orientation of a mosaic crystal (such as a protein crystal) only some of the crystal's unit cells are contributing to the diffraction. The rest of the unit cells are oriented very slightly differently from the diffracting ones, and, therefore do not satisfy the Bragg condition, and don't diffract. To "integrate" over the whole crystal, you need to rock it, slowly and smoothly, over an angle that is larger than the mosaicity so that each unit cell has equal time and opportunity to diffract x-rays. The time-averaged spot intensity can then be taken as the "full" intensity.

You might think it would be simplest just to rotate the crystal over 360 degrees to guarantee that all the spots would be fully-recorded on the detector, but such a photograph would result in many spots that fell too close to each other on the detector to resolve and measure separately (called overlaps).

wedge

The term "wedge" is somewhat informal, but is used to describe a series of contiguous oscillation diffraction images that cover (inclusively) a particular rotation range (usually phi range) of the crystal.

For example, the first image in a typical "wedge" from 0 to 90 degrees would be an x-ray exposure, taken while the crystal was rotated steadily from a phi setting of 0 degrees to a phi setting of 1 degree. An even exposure at every angle during this rotation is assumed by most data-processing software. The second image in the "wedge" would run from 1 to 2 degrees, and so on.
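
The bookkeeping for a wedge can be sketched in a few lines. The function names and the 1-degree default are assumptions for illustration; `inverse_beam` just applies the 180-degree offset described under "inverse beam".

```python
def wedge_images(phi_start, phi_end, osc_width=1.0):
    """List of (start, end) phi ranges, one per abutting oscillation image."""
    n_images = round((phi_end - phi_start) / osc_width)
    return [
        (phi_start + i * osc_width, phi_start + (i + 1) * osc_width)
        for i in range(n_images)
    ]

def inverse_beam(wedge):
    """The matching "anomalous wedge", collected 180 degrees away."""
    return [(start + 180.0, end + 180.0) for start, end in wedge]

wedge = wedge_images(0.0, 90.0)
print(len(wedge), "images; first:", wedge[0], "second:", wedge[1])
print("inverse beam starts at", inverse_beam(wedge)[0])
```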

mosaicity

Mosaicity is the width of the distribution of mis-orientation angles of all the unit cells in a crystal. Not every crystal is perfect, and protein crystals are usually far from it. Imperfections in crystals come in many flavors, which I won't go into here, because protein crystallographers almost never think about them. Instead, imperfect crystals are thought of as a conglomeration of tiny, perfect crystals which all have slightly different orientations. If you plotted the orientation of all these tiny perfect crystals into a big, (possibly 3-D) histogram, then the width of this histogram would be called the "mosaicity" of the crystal. This number has units of degrees, and reflects the relative, average misorientation of the tiny perfect crystals (sometimes called mosaic domains, or mosaic blocks) in your crystal.

High mosaicity has the effect of broadening your spots. That is, spots from a highly mosaic crystal (for example, one that you dropped on the floor) would appear "smeared" tangentially (the direction perpendicular to the line between the spot and the beam center), as though the crystal had been rotated around the beam slightly during the exposure. There would also appear to be more spots than usual, because, for a mosaic crystal, any given rotation angle of the whole crystal (phi) actually represents a range of rotation angles (one for each mosaic block), and each of these blocks diffracts independently of the others. Therefore, spots from highly mosaic crystals will stay in the diffraction condition longer than spots from low-mosaic ones, and each spot will tend to get spread over a wider rotation range, and be more likely to overlap with other spots. Mosaicity can also smear spots radially (along a line drawn from the spot to the beam center).

The problems presented by mosaicity are mainly weak diffraction and spot overlap. Spreading a spot over a broader region of

For a fascinating discussion of cutting-edge investigations into crystal mosaicity, you should look at NASA's space-crystal growth studies.

x-ray

Officially, the x-ray spectrum is defined as electromagnetic radiation (light) with photon energies ranging from 100eV to 100keV (wavelengths of ~124 Angstroms to ~0.12 Angstrom), but x-ray crystallography is usually restricted to wavelengths between 2 and 0.7 or so Angstroms. Because of their short wavelength, x-rays don't interact readily with matter, and tend to pass right through most materials. However, when they do interact with an atom, they deliver a lot of energy to it, usually ionizing the atom. Thus, x-rays are ionizing radiation, and ionizing radiation can do bad things to living matter, such as yourself. So, try not to expose yourself to x-rays.

error

There are two kinds of error: random error, and systematic error. Random error results from random fluctuations in the system you are using to measure something. The nice thing about random error is that it, eventually, "cancels out" if you do a large number of repeated measurements. In x-ray diffraction, random errors result from the fact that you are counting photons (which are true random events), and there is an intrinsic uncertainty in counting random events (error in count = squareroot(counts)). However, if you count enough photons, the ratio of the square root of the count (the expected error) to the count itself will become smaller and smaller.
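
A quick calculation shows how slowly counting error shrinks. This sketch only covers the photon-counting (Poisson) term, error = squareroot(counts), and ignores detector noise and background; the function name is made up.

```python
import math

def relative_error(counts):
    """Expected fractional error from photon counting alone."""
    return math.sqrt(counts) / counts

for counts in (100, 10_000, 1_000_000):
    print(f"{counts:>9} photons: {100 * relative_error(counts):.1f}% error")
```

Each factor of 100 in photons buys only a factor of 10 in precision, so going from a 10% to a 0.1% measurement takes 10,000 times as many photons.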

Systematic errors are different. They result from errors in the model you are using to interpret your measurements. For example, the assumption that the solvent channels of your crystal contain no scattering matter is, well, wrong, and large systematic errors result from protein models that do not try to model solvent. A major source of systematic error in spot measurement is the absorption of x-rays by your crystal.

localscaling

Localscaling was first proposed by Matthews and Czerwinski (1975) as a means of minimizing the systematic errors incurred by absorption effects. There are many ways to do localscaling, but they all involve assigning an effective scale to a region of (unreduced) reciprocal space.

B-factor

A very uninformative name, to be sure, but the B-factor is a measure of the effective diameter of an atom's electron density. Various kinds of disorder (static and thermal) can effectively "spread out" the electron density of a given atom, and this, formally, increases its B-factor. The B-factor is related to the rms error in an atom's position (u) by the equation: B = 79*u^2 (the 79 is 8*pi^2, rounded). For this reason, B-factors are related to resolution.
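
To get a feel for the numbers, the relation can be inverted to turn a B-factor back into a displacement. This sketch uses the exact factor 8*pi^2 (about 79) and treats u as an rms displacement in Angstroms; the function names are hypothetical.

```python
import math

EIGHT_PI_SQUARED = 8.0 * math.pi ** 2  # about 78.96, the "79" in B = 79*u^2

def b_from_u(u_A):
    """B-factor (A^2) for an rms displacement u (A)."""
    return EIGHT_PI_SQUARED * u_A ** 2

def u_from_b(b_factor):
    """rms displacement u (A) implied by a B-factor (A^2)."""
    return math.sqrt(b_factor / EIGHT_PI_SQUARED)

for b in (10.0, 20.0, 80.0):
    print(f"B = {b:5.1f} A^2  ->  u = {u_from_b(b):.2f} A")
```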

resolution

The resolution of a particular crystal's diffraction is often given in Angstroms. This number however, is not the expected error in the positions of the atoms, but rather the shortest reciprocal-space "wavelength" of all the significant 3D sine waves that add up to make the electron density map. As a rule of thumb, the expected error in atomic positions is about 1/3 of the resolution. This is because each peak in a sine wave has a half-maximum width of about 1/3 of a cycle. Therefore, the smallest feature you're going to see in a 3A electron density map is about 1A wide. For a graphical depiction of exactly what maps look like at different resolutions, have a look at my movie.

Free-R

Initially called cross-validation analysis, the free R was introduced into macromolecular crystallography (Brunger et al. 19xx) as an attempt to deal with the very low observations/parameters ratio that crystallographers have to deal with. Any good scientist will tell you that fitting a 10-parameter equation to 10 data points is a bad idea, and fitting 50,000 parameters to 50,000 observations is even worse. However, depending on the resolution and solvent content of your crystal, you may have to do just about that. If one considers a 10,000-atom protein (~70kD), then you are dealing with about 30,000 parameters of x-y-z freedom. If B-factors are considered, this number goes up to 40,000 free parameters. If a crystal of this protein has a solvent content of about 50%, and diffracts to 2.7A, then it should have around 40,000 unique hkls (observations). So, you're going to be fitting 40,000 parameters to 40,000 observations. It's not quite that bad, because of the constraints, but it is still a horrendous fitting problem.

To help identify local minima, Brunger et al. suggested a cross-validation analysis. A statistically significant (at least 1000 spots, usually 5%-10% of the total spots) group of randomly-chosen spots are hidden from the fitting procedure (in this case, macromolecular refinement). A fitting procedure will always improve the agreement of a model with observed data, whether the model is correct or not. But, if the fitting is indeed improving the model, then it should also be predicting more and more correct values for these hidden data as well. Note the catch-22 nature of this cross-validation analysis: although the free R is a useful indicator of whether or not your model is improving, you CANNOT use it as a basis for making decisions in your refinement, otherwise, (via your brain) those hidden data are being used in a fitting procedure, which defeats the purpose of having a free R in the first place.
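
Setting aside the hidden spots amounts to a random draw. The sketch below follows the rule of thumb given here (a fixed fraction, but never fewer than 1000 spots); the function name, the 5% default, and the fixed random seed are assumptions for illustration, not what any particular refinement program does.

```python
import random

def pick_free_set(n_reflections, fraction=0.05, minimum=1000, seed=0):
    """Indices of randomly chosen reflections to hide from refinement."""
    n_free = max(minimum, round(fraction * n_reflections))
    n_free = min(n_free, n_reflections)  # cannot hide more than we have
    rng = random.Random(seed)            # fixed seed: same set every time
    return set(rng.sample(range(n_reflections), n_free))

free = pick_free_set(40_000)
working = [i for i in range(40_000) if i not in free]
print(len(free), "free spots,", len(working), "working spots")
```

The fixed seed is a deliberate choice: the same spots must stay hidden for the whole refinement, since re-drawing the set part-way through would leak the hidden data into the model.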

This page is not finished. It will never be finished, and neither will yours. Admit it.

James Holton <JMHolton@lbl.gov>