Understanding Resolution

Simply put, "Resolution" is a term used to describe the potential for detail in an image. The term is used throughout cinema, from lenses, cameras, sensors and images through to projectors and other displays. However, resolution is just one factor contributing to our perception of detail.


Our overall perception of detail depends not just on the finest image features, but also on how the full spectrum of feature sizes is rendered. In any optical system, the rendering of these sizes is interrelated. Larger features, such as the trunk of a tree, retain more of their original contrast. Smaller features, such as the bark texture on that trunk, retain progressively less contrast. Resolution describes only the smallest features, such as the wood grains, which still retain discernible detail before all contrast has been lost.

A key reason why higher resolution imagery ends up looking better, therefore, isn't just that finer detail becomes visible, but that the benefits cascade to features at coarser sizes. In the above example, a higher resolution could cause the wood grains to be rendered with as much contrast as the bark was previously. Best of all, these benefits apply even if an image isn't displayed at its full resolution, since the higher-contrast, larger-scale features are retained.

However, one can always have too much of a good thing, and resolution is no different in this regard. If the resolving power of a sensor is pushed too near its pixel resolution, then unsightly digital artifacts will become increasingly apparent. Even worse, these can adversely affect detail at all feature sizes, and drastically reduce video compression efficiency by causing false motion.

Achieving a maximally detailed, natural-looking image therefore requires the right balance of high resolution and low digital artifacts. When done correctly, the benefits translate into applications spanning full resolution theater display all the way down to resized videos on YouTube.

Resolution vs. Sharpness

Resolution is not sharpness! Although a high resolution image can appear sharp, it is not necessarily so, and an image that appears sharp is not necessarily high resolution. Our perception of resolution is intrinsically linked to image contrast: a low contrast image will always appear softer than a high contrast version of the same image. We perceive edges and detail through the contrast those edges create.

Sharpness can be very visually appealing in an image, but the overall perception of sharpness is linked to image contrast at coarser levels of detail, whereas resolution is usually interpreted to mean contrast at the finest levels of detail. To encompass all these meanings of resolution and sharpness, we need to see how contrast in an image varies relative to the size of detail being represented, and that leads us to the Modulation Transfer Function, or MTF.

Modulation Transfer Function (MTF)

Unlike resolution, which is often quoted as a single number, MTF is best represented as a graph showing how image contrast varies with respect to spatial frequency (size of detail). High spatial frequencies represent the fine details, and low spatial frequencies represent larger details.

MTF can be thought of as a black box which quantifies how each input detail frequency in a subject gets translated into an output image by the optical system. For example, in the chart below, detail with a frequency/scale of 1K gets reduced to 90% of its original contrast when recorded, whereas detail at 2K gets reduced to about 60% of its original contrast.

Perceived image sharpness is closely related to what we refer to as MTF50, the spatial frequency at which the MTF has dropped to 50% of its low spatial frequency value. Traditional "resolution" numbers relate to the spatial frequency at which the image contrast has dropped far enough that it is just perceptible as detail, which is subjective and usually lies around 5%.
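As a rough illustration of these two landmarks, we can model an MTF curve with a simple Gaussian falloff. The model and its numbers below are purely illustrative assumptions, not measurements of any particular camera; the shape is chosen only so that the MTF50 and 5% points are easy to compute:

```python
import math

def mtf(f, f50):
    # Hypothetical MTF model: a Gaussian falloff in spatial frequency,
    # parameterized so that contrast drops to 50% at f = f50.
    return math.exp(-math.log(2) * (f / f50) ** 2)

f50 = 2000.0  # assumed MTF50 at "2K" detail

print(round(mtf(1000, f50), 2))   # coarse 1K detail keeps ~84% contrast
print(round(mtf(2000, f50), 2))   # 0.5 by construction: the MTF50 point

# The traditional "resolution" number sits near the 5% contrast point:
f5 = f50 * math.sqrt(math.log(20) / math.log(2))
print(round(f5))                  # ~4158: detail is only just perceptible here
```

Note how the 5% point lands at roughly twice the MTF50 frequency in this model: a camera's quoted "resolution" can be far beyond the frequency where the image still looks crisp.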

Sampling Theory

Nyquist-Shannon sampling theory describes the number of discrete samples necessary to quantify a continuously-varying signal. In digital imaging, this can be used to determine the pixel density necessary for capturing detail of a given size. However, the Nyquist-Shannon sampling theorem is often incorrectly applied to pixels as meaning: "If you have X samples across a line you can only resolve X/2 without experiencing aliasing artifacts."

Let's instead start with Shannon's original theorem: "If a function x(t) contains no frequencies higher than B hertz, it is completely determined by giving its ordinates at a series of points spaced 1/(2B) seconds apart."

The confusion begins because the theorem is formulated in terms of frequencies rather than pixels. The equivalent of frequency in hertz is line-pairs, and there are half as many line-pairs as there are lines. When we translate the statement into more familiar language, it tells us: "If you have more than X samples across a line you can resolve X without experiencing aliasing artifacts." The sampling theorem thus sets a minimum condition for avoiding aliasing, but notice that although we "can" resolve X, it doesn't say we "will."
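What happens when that minimum condition is violated can be seen directly: a tone above the Nyquist frequency produces exactly the same samples as a lower-frequency tone. A small sketch, with frequencies chosen arbitrarily for illustration:

```python
import math

fs = 10.0      # sample rate: 10 samples per second, so Nyquist is 5 Hz
f_in = 7.0     # input tone above Nyquist

# Sampling the 7 Hz tone yields precisely the samples of a 3 Hz tone
# (7 - 10 = -3), so the two are indistinguishable after sampling.
high = [math.sin(2 * math.pi * f_in * n / fs) for n in range(20)]
alias = [math.sin(2 * math.pi * (f_in - fs) * n / fs) for n in range(20)]

print(max(abs(a - b) for a, b in zip(high, alias)))  # ~0: identical samples
```

Once the samples are taken, no amount of downstream processing can tell which of the two tones was actually in front of the sensor.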

The oft-forgotten part of the sampling theorem, though, is "If a function x(t) contains no frequencies higher than B hertz": it applies only to a bandwidth limited function. In imaging, we don't have a bandwidth limited input signal, because the amount of detail in the real world is immense; and although lenses won't transmit all of that detail to the sensor, they're generally designed to have a high MTF and transmit as much of it as possible.

Optical Low Pass Filters (OLPFs) attempt to produce a bandwidth limited input signal by putting an upper limit on the amount of detail that can reach a sensor. However, due to the physical nature of light and optical filtration, they cannot be totally effective in limiting the input detail while retaining good image resolution and MTF. This does not mean we must accept gross aliasing artifacts, but it does mean we're unlikely to remove all possibility of aliasing. Careful balancing of the OLPF design is necessary to achieve both a low incidence of aliasing in practical shooting and a high MTF.
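That balance can be sketched with an idealized model. A common birefringent OLPF splits each ray into two displaced spots; under that assumption its MTF is |cos(π·f·d)| for a spot separation d. The separations below are hypothetical, chosen only to show the trade-off between aliasing suppression and passband contrast:

```python
import math

def olpf_mtf(f, d):
    # Idealized two-spot birefringent OLPF: each ray splits into two points
    # separated by d pixels, giving an MTF of |cos(pi * f * d)| with a
    # null at f = 1/(2d) cycles per pixel.
    return abs(math.cos(math.pi * f * d))

nyquist = 0.5  # cycles per pixel

# A full-strength filter (d = 1 pixel) nulls contrast exactly at Nyquist...
print(round(olpf_mtf(nyquist, 1.0), 3))   # 0.0
# ...but also costs contrast at coarser detail:
print(round(olpf_mtf(0.25, 1.0), 3))      # ~0.707
# A weaker filter (d = 0.7) keeps more mid-frequency MTF,
# at the price of letting some contrast through near Nyquist:
print(round(olpf_mtf(0.25, 0.7), 3))      # ~0.853
```

There is no displacement in this model that gives both full contrast below Nyquist and zero contrast above it, which is the article's point: the filter strength must be tuned, not maximized.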

Aliasing

In our quest for high resolution and strong MTF at all spatial frequencies, we could be tempted to go too far. One of the biggest dangers of pushing too hard for resolution is called "aliasing", often described in images as "stair-stepping" or "moiré".

Aliasing is undesirable not just from a visual point of view, but also from a technical perspective. Most movie material is eventually delivered to the viewer through some kind of digital compression scheme, and aliasing interacts badly with compression in two ways: (i) the false detail created by aliasing wastes compression bits, even though it represents no wanted image data, and (ii) in moving images, aliases move in the opposite direction to the motion of the object, which can confuse motion-adaptive compression schemes and waste further precious bits.

Aliasing also adds a false perception of detail, and while this can "work" on some scenes, as soon as that false detail appears on a repeating pattern it manifests as moiré and becomes objectionable.

Sinusoidal circular zone plates are particularly useful for measuring resolution, as they clearly reveal aliasing artifacts from any sampling process as circular patterns not concentric with the main pattern. At the center of the pattern, the rings are widely spaced to represent low spatial frequencies. Toward the outer edges, the rings become progressively closer together to represent higher spatial frequencies. The transition from the central to the outer rings is a smooth, linear change in spatial frequency, so that resolution and MTF are easy to see.
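Such a test pattern is simple to generate. A minimal sketch (the size and chirp-rate constants below are arbitrary choices): the intensity follows cos(k·r²), so the local ring frequency grows linearly with the radius, exactly the sweep described above.

```python
import math

def zone_plate(size, k):
    # Sinusoidal circular zone plate: intensity = 0.5 + 0.5*cos(k * r^2).
    # The local spatial frequency is proportional to the radius r, so the
    # rings sweep linearly from low frequency (center) to high (edges).
    c = (size - 1) / 2
    return [[0.5 + 0.5 * math.cos(k * ((x - c) ** 2 + (y - c) ** 2))
             for x in range(size)]
            for y in range(size)]

plate = zone_plate(257, math.pi / 256)
print(plate[128][128])  # 1.0: full brightness at the pattern's center
```

Photographing a printed or displayed version of this pattern and looking for off-center rings in the capture is the practical test the article describes.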

Aliasing occurs when a sampled system (like a camera sensor) is asked to deal with too much detail (see sidebar on sampling theorem). Once present, it is intrinsically difficult to successfully remove from an image, in part because it occurs at the point of sampling, but also because aliasing effects visually "fold back" as a lower frequency pattern. The higher the frequency causing the aliasing, the lower the frequency it folds back to and the harder it is to remove or disguise, and the more visually annoying it becomes.
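The fold-back relationship is easy to state numerically: the apparent frequency is the input's distance to the nearest multiple of the sample rate. A small sketch with an assumed sensor width of 4000 photosites:

```python
def folded_frequency(f, fs):
    # Apparent frequency after sampling at rate fs: the input f "folds
    # back" to its distance from the nearest multiple of the sample rate.
    return abs(f - fs * round(f / fs))

fs = 4000  # e.g. 4000 photosites across the frame; Nyquist is 2000

print(folded_frequency(1500, fs))  # 1500: below Nyquist, reproduced as-is
print(folded_frequency(2400, fs))  # 1600: just past Nyquist, folds back high
print(folded_frequency(3600, fs))  # 400: far past Nyquist folds back low
```

This shows the behavior described above: the further past Nyquist the offending detail lies, the coarser (and more conspicuous) the resulting alias pattern.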

The RED Approach

The RED approach recognizes that to achieve a high MTF50, you must aim for a high MTF5 with negligible aliasing, through optimum use of an optical low pass filter. RED cameras achieve this by design: with the RED EPIC® and its 5K sensor (5120x2700), for example, high levels of detail are easily visible, superb MTF50 is maintained, and optical filtering ensures very low levels of aliasing. This level of performance is not achievable with HD cameras, which must choose between allowing excessive aliasing and producing soft images.

Although the MTF plot tells the story, the images of a zone plate show these results visually. They show superbly clean detail and high contrast at 2K (from a 5K capture) with no hint of aliasing. This is the kind of real detail you can see in all viewing environments, and it is what gives your RED footage its visual impact on the big screen.