Color conversion matrices in digital cameras: a tutorial

D. Andrew Rowlands

Open Access, 17 November 2020
Abstract

Color conversion matrices and chromatic adaptation transforms (CATs) are of central importance when converting a scene captured by a digital camera in the camera raw space into a color image suitable for display using an output-referred color space. In this article, the nature of a typical camera raw space is investigated, including its gamut and reference white. Various color conversion strategies that are used in practice are subsequently derived and examined. The strategy used by internal image-processing engines of traditional digital cameras is shown to be based upon color rotation matrices accompanied by raw channel multipliers, in contrast to the approach used by smartphones and commercial raw converters, which is typically based upon characterization matrices accompanied by conventional CATs. Several advantages of the approach used by traditional digital cameras are discussed. The connections with the color conversion methods of the DCRaw open-source raw converter and the Adobe digital negative converter are also examined, along with the nature of the Adobe color and forward matrices.

1. Introduction

Consider converting a scene captured by a digital camera in the camera raw space into a digital image suitable for display using an output-referred color space. At the very least, there are two issues of fundamental importance that must be addressed when attempting to correctly reproduce the appearance of color. The first is that the response functions of digital cameras differ from those of the human visual system (HVS). A widely used approach to this issue is to consider color spaces as vector spaces and to account for the differences in response by introducing a color conversion matrix. A type of color conversion matrix that is commonly encountered is the 3×3 characterization matrix T_ that defines the linear relationship between the camera raw space and the CIE XYZ reference color space:

Eq. (1)

$$\begin{bmatrix} X \\ Y \\ Z \end{bmatrix} \approx \underline{T} \begin{bmatrix} R \\ G \\ B \end{bmatrix}.$$
In general, camera raw spaces are not colorimetric, so the above conversion is approximate. The relationship can be optimized for a given illuminant by minimizing the color error. Significantly, this means that the optimum T_ depends upon the nature of the scene illuminant,1,2 including its white point (WP). The characterization methodology for determining the optimum T_ is described in Sec. 2.4, along with an illustration of how T_ should be normalized in practice.

The second issue that must be addressed is the perception of the scene illumination WP. Although the various adaptation mechanisms employed by the HVS are complex and not fully understood, it is thought that the HVS naturally uses a chromatic adaptation mechanism to adjust its perception of the scene illumination WP to achieve color constancy under varying lighting conditions.3,4 Since digital camera sensors do not naturally adapt in this manner, incorrect white balance (WB) will arise when the WP of the scene illumination differs from the reference white of the output-referred color space used to encode the output image produced by the camera. As demonstrated in Sec. 3, digital cameras must attempt to emulate this chromatic adaptation mechanism by utilizing an appropriate chromatic adaptation transform (CAT).

As discussed in Sec. 4, modern smartphones and commercial raw converters typically calculate the optimum characterization matrix T_ by interpolating between two preset characterization matrices according to an estimate of the scene illumination WP, and the CAT is implemented after applying T_. In traditional digital cameras, the color conversion is typically reformulated in terms of raw channel multipliers and color rotation matrices R_. This approach offers several advantages, as discussed in Sec. 5. A similar but computationally simpler approach is used by the DCRaw open-source raw converter, as discussed in Sec. 6. The open-source Adobe® digital negative (DNG) converter offers two color conversion methods, and the nature of the Adobe color matrices and forward matrices is discussed in Sec. 7. Finally, conclusions are drawn in Sec. 8.

2. Camera Raw Space

2.1. Gamut

The camera raw space for a given camera model arises from its set of spectral responsivity functions or camera response functions:

Eq. (2)

$$R_i(\lambda) = QE_i(\lambda)\,\frac{e\lambda}{hc},$$
where e is the elementary charge, λ is the wavelength, h is Planck's constant, and c is the speed of light. The external quantum efficiency for mosaic i is defined by

Eq. (3)

$$QE_i(\lambda) = T_{\mathrm{CFA},i}(\lambda)\,\eta(\lambda)\,T(\lambda)\,\mathrm{FF},$$
where T_CFA,i(λ) is the color filter array (CFA) transmittance function for mosaic i, η(λ) is the charge collection efficiency or internal quantum efficiency of a photoelement, and T(λ) is the SiO2/Si interface transmittance function.5 The fill factor is defined by FF = A_det/A_p, where A_det is the photosensitive detection area at a photosite and A_p is the photosite area. The spectral passband of the camera should ideally correspond to the visible spectrum, so an infrared blocking filter is required.
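To make Eq. (2) concrete, the sketch below evaluates the responsivity at a single wavelength; the 50% quantum efficiency in the example is a hypothetical value chosen purely for illustration.

```python
# Minimal numerical sketch of Eq. (2): R_i(lambda) = QE_i(lambda) * e * lambda / (h c).
e = 1.602176634e-19   # elementary charge (C)
h = 6.62607015e-34    # Planck's constant (J s)
c = 2.99792458e8      # speed of light (m/s)

def responsivity(qe, wavelength_nm):
    """Spectral responsivity in A/W from a dimensionless quantum efficiency."""
    return qe * e * (wavelength_nm * 1e-9) / (h * c)

print(responsivity(0.5, 550.0))  # hypothetical green mosaic: ~0.22 A/W
```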

Analogous to the eye-cone response functions of the HVS, which can be interpreted as specifying the amounts of the eye cone primaries that the eye uses to sense color at a given λ, the camera response functions can be interpreted as specifying amounts of the camera raw space primaries at each λ. For example, the measured Nikon D700 camera response functions are shown in Fig. 1. However, a camera raw space is colorimetric only if the Luther-Ives condition is satisfied,7–9 meaning that the camera response functions must be an exact linear transformation of the eye-cone response functions, which are represented in practice by a linear transformation of the CIE color-matching functions for the standard observer.

Fig. 1

Camera response functions for the Nikon D700 camera. The peak spectral responsivity has been normalized to unity. Data sourced from Ref. 6.


Although the eye-cone response functions are suited to capturing detail using the simple lens of the human eye, digital cameras use compound lenses that have been corrected for chromatic aberration. Consequently, camera response functions are designed with other considerations in mind.10,11 For example, better signal-to-noise performance is achieved by reducing the overlap of the response functions, which corresponds to a characterization matrix with smaller off-diagonal elements.10–12 Indeed, minor color errors can be traded for better signal-to-noise performance.10,13 On the other hand, increased correlation in the wavelength dimension can improve the performance of the color demosaicing procedure.14 Due to such trade-offs along with filter manufacturing constraints, camera response functions are not exact linear transformations of the eye-cone response functions in practice. Consequently, camera raw spaces are not colorimetric, so cameras exhibit metameric error. Metamers are different spectral power distributions (SPDs) that are perceived by the HVS to be the same color when viewed under exactly the same conditions. Cameras that exhibit metameric error produce different color responses to these metamers. Camera metameric error can be determined experimentally and quantified by the digital still camera sensitivity metamerism index (DSC/SMI).8,15

Figure 2 shows the spectral locus of the HVS on the xy chromaticity diagram, which is a 2D projection of the CIE XYZ color space that describes the relative proportions of the tristimulus values. The spectral locus itself is horseshoe-shaped rather than triangular because the overlap of the eye-cone response functions prevents the eye cones from being independently stimulated; chromaticities at (x,y) coordinates positioned outside the spectral locus are invisible or imaginary as they would be more saturated than pure spectrum colors. The gamut of the camera raw space for the Nikon D700 camera is also plotted in Fig. 2 and compared with several standard output-referred color spaces, namely sRGB,16 Adobe® RGB,17 and ProPhoto RGB.18 Due to the positions of the camera raw space primaries on the xy chromaticity diagram, certain regions of the camera raw space gamut do not reach the spectral locus of the HVS as these regions lie outside the triangle accessible to additive linear combinations of the three primaries. Furthermore, a notable consequence of camera metameric error is that the camera raw space gamut is warped away from this triangular shape, and certain regions are even pushed outside of the triangle accessible to the CIE XYZ color space.19 See Ref. 19 for additional examples.

Fig. 2

Gamut of the camera raw space for the Nikon D700 (light-blue shaded area) plotted on the xy chromaticity diagram. The gamut is not a perfect triangle since the Luther-Ives condition is violated, which also explains why certain regions are pushed outside the triangle accessible to the CIE XYZ color space defined by the primaries located at (0,0), (0,1), and (1,0). The boundary of the horseshoe-shaped gray shaded area defines the spectral locus of the HVS. Saturation decreases with inward distance from the spectral locus. For comparison purposes, the (triangular) gamuts of several standard output-referred color spaces are indicated.


To determine the gamut of a camera raw space, the first step is to measure the camera response functions using a monochromator at a discrete set of wavelengths according to method A of the ISO 17321-1 standard.15 For each wavelength, the camera response functions yield raw RGB relative tristimulus values in the camera raw space. The second step is to convert RGB into relative CIE XYZ values by applying a characterization matrix that satisfies Eq. (1). Subsequently, the (x,y) chromaticity coordinates corresponding to the spectral locus of the camera raw space can be calculated using the usual formulas, x=X/(X+Y+Z) and y=Y/(X+Y+Z).
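These two steps reduce to a few lines of linear algebra. The sketch below assumes the monochromator measurements are already available as rows of raw triplets; the characterization matrix values are placeholders rather than measured Nikon D700 data.

```python
import numpy as np

# Placeholder characterization matrix satisfying Eq. (1); substitute a matrix
# optimized for the chosen characterization illuminant.
T = np.array([[0.9, 0.3, 0.2],
              [0.4, 1.0, 0.1],
              [0.1, 0.2, 1.5]])

def spectral_locus_xy(raw_rgb, T):
    """Map per-wavelength raw [R, G, B] triplets (rows) to (x, y) chromaticities."""
    XYZ = raw_rgb @ T.T                    # Eq. (1) applied to each row
    s = XYZ.sum(axis=1, keepdims=True)     # X + Y + Z
    return XYZ[:, :2] / s                  # x = X/(X+Y+Z), y = Y/(X+Y+Z)
```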

Since a given characterization matrix is optimized for use with the characterization illuminant, i.e., the scene illumination used to perform the characterization, another consequence of camera metameric error is that the camera raw space gamut may vary according to the characterization matrix applied. The gamut of the Nikon D700 camera raw space shown in Fig. 2 was obtained using a characterization matrix optimized for CIE illuminant D65. Figure 3 shows how the gamut changes when a characterization matrix optimized for CIE illuminant A is applied instead.

Fig. 3

Same as Fig. 2 except that a characterization matrix optimized for CIE illuminant A was used to obtain the camera raw space gamut rather than a characterization matrix optimized for CIE illuminant D65.


2.2. Raw Values

Color values in a camera raw space are expressed in terms of digital raw values for each raw color channel, which are analogous to tristimulus values in CIE color spaces. For a CFA that uses three types of color filters such as a Bayer CFA,20 the raw values expressed using output-referred units, i.e., digital numbers (DN) or analog-to-digital units, belong to the following set of raw channels denoted here using calligraphic symbols:

Eq. (4)

$$\begin{bmatrix} n_{\mathrm{DN},1} \\ n_{\mathrm{DN},2} \\ n_{\mathrm{DN},3} \\ n_{\mathrm{DN},4} \end{bmatrix} = \begin{bmatrix} \mathcal{R} \\ \mathcal{G}_1 \\ \mathcal{G}_2 \\ \mathcal{B} \end{bmatrix}.$$
Although vector notation has been used here to represent a Bayer block, a true raw pixel vector is obtained only after the color demosaic has been performed, in which case there will be four raw values associated with each photosite. The Bayer CFA uses twice as many green filters as red and blue, which means that two values G1 and G2 associated with different positions in each Bayer block will be obtained in general. This is beneficial in terms of overall signal-to-noise ratio since photosites belonging to the green mosaics are more efficient in terms of photoconversion. Furthermore, the Bayer pattern is optimal in terms of reducing aliasing artifacts when three types of filters are arranged on a square grid.14 Although it is thought that a greater number of green filters provides enhanced resolution for the luminance signal since the standard 1924 CIE luminosity function for photopic vision peaks at 555 nm,20 it has been argued that a Bayer CFA with two times more blue pixels than red and green would in fact be optimal for this purpose.14 When demosaicing raw data corresponding to a standard Bayer CFA, the final output will show false mazes or meshes if the ratio between G1 and G2 varies over the image.21 Software raw converters may average G1 and G2 together to eliminate such artifacts.21

Since there are fundamentally only three camera response functions, R1(λ), R2(λ), and R3(λ), color characterization for a Bayer CFA regards G1 and G2 as a single channel, G. The raw values can be expressed as follows:

Eq. (5)

$$\mathcal{R} = k \int_{\lambda_1}^{\lambda_2} R_1(\lambda)\,\tilde{E}_{e,\lambda}\,\mathrm{d}\lambda, \qquad \mathcal{G} = k \int_{\lambda_1}^{\lambda_2} R_2(\lambda)\,\tilde{E}_{e,\lambda}\,\mathrm{d}\lambda, \qquad \mathcal{B} = k \int_{\lambda_1}^{\lambda_2} R_3(\lambda)\,\tilde{E}_{e,\lambda}\,\mathrm{d}\lambda.$$
The camera response functions are defined by Eq. (2), the integration is over the spectral passband of the camera, Ẽe,λ is the average spectral irradiance at the photosite, and k is a constant. Expressions for Ẽe,λ and k are given in the Appendix.

The actual raw values obtained in practice are quantized values modeled by taking the integer part of Eq. (5). When transforming from the camera raw space, it is useful to normalize the raw values to the range [0,1] by dividing Eq. (5) by the raw clipping point, which is the highest available DN.
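As a numerical sketch, the discretized form of Eq. (5) together with the quantization and normalization just described might look as follows; the 14-bit clipping point is an assumption, as it varies between camera models.

```python
import numpy as np

def normalized_raw_values(responsivities, irradiance, wavelengths, k, clip_dn=2**14 - 1):
    """Discretized Eq. (5): responsivities is a (3, N) array holding R_1..R_3
    sampled at the given wavelengths; irradiance is the average spectral
    irradiance at the photosite sampled on the same grid."""
    dlam = np.gradient(wavelengths)                  # integration weights
    raw = k * (responsivities * irradiance * dlam).sum(axis=1)
    raw_dn = np.clip(np.floor(raw), 0, clip_dn)      # quantization and clipping
    return raw_dn / clip_dn                          # normalize to [0, 1]
```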

2.3. Reference White

Using the above normalization, the reference white of a camera raw space is defined by the unit vector

Eq. (6)

$$\begin{bmatrix} R \\ G \\ B \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}.$$
Expressed in terms of CIE XYZ tristimulus values or (x,y) chromaticity coordinates with Y=1, the reference white of a camera raw space is the WP of the scene illumination that yields maximum equal raw values for a neutral subject. (The WP of an SPD is defined by the CIE XYZ tristimulus values that correspond to a 100% neutral diffuse reflector illuminated by that SPD.)

It follows that the reference white of a camera raw space can in principle be determined experimentally by finding the illuminant that yields equal raw values for a neutral subject. Note that if the DCRaw open-source raw converter is used to decode the raw file, it is essential to disable WB. In terms of CIE colorimetry, the camera raw space reference white is formally defined by

Eq. (7)

$$\begin{bmatrix} X(\mathrm{WP}) \\ Y(\mathrm{WP}) \\ Z(\mathrm{WP}) \end{bmatrix}_{\mathrm{scene}} = \underline{T} \begin{bmatrix} R(\mathrm{WP}) = 1 \\ G(\mathrm{WP}) = 1 \\ B(\mathrm{WP}) = 1 \end{bmatrix}_{\mathrm{scene}},$$
where Y(WP)=1 and the subscripts denote that the WP is that of the scene illumination. The 3×3 characterization matrix T_ converts from the camera raw space to CIE XYZ and should be optimized for the required scene illumination. The optimum T_ is unknown at this stage but can in principle be determined using the optimization procedure to be outlined in Sec. 2.4.

Although CIE color spaces use normalized units such that their reference whites correspond to WPs of CIE standard illuminants, camera raw spaces are not naturally normalized in such a manner. Consequently, the reference white of a camera raw space is not necessarily a neutral color as it is typically located far away from the Planckian locus and so does not necessarily have an associated correlated color temperature (CCT).

Note that a WP can be associated with a CCT provided its (x,y) chromaticity coordinates are sufficiently close to the Planckian locus, but there are many such coordinates that correspond to the same CCT. To distinguish between them, a Duv value, informally referred to as a color tint, can be assigned.22 This is determined by converting (x,y) into (u,v) chromaticity coordinates on the CIE 1960 UCS chromaticity diagram,23,24 where isotherms are normal to the Planckian locus. In this representation, CCT is a valid concept only for (u,v) coordinates positioned a distance from the Planckian locus that is within Duv=±0.05 along an isotherm.25
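The underlying chromaticity conversion is simple to state; a minimal sketch is given below. Computing Duv itself additionally requires locating the nearest point on the Planckian locus, e.g., via Robertson's method, which is omitted here.

```python
def xy_to_uv(x, y):
    """CIE 1931 (x, y) to CIE 1960 UCS (u, v) chromaticity coordinates."""
    d = -2.0 * x + 12.0 * y + 3.0
    return 4.0 * x / d, 6.0 * y / d

# Example: CIE illuminant A at (x, y) = (0.4476, 0.4074)
u, v = xy_to_uv(0.4476, 0.4074)   # approximately (0.2560, 0.3495)
```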

To see that the reference white of a camera raw space is far from the Planckian locus, consider the Nikon D700 raw values for a neutral diffuse reflector illuminated by CIE illuminants A and D65, respectively,

Eq. (8)

$$\begin{bmatrix} R(\mathrm{WP}) = 0.8878 \\ G(\mathrm{WP}) = 1.0000 \\ B(\mathrm{WP}) = 0.4017 \end{bmatrix}_{\mathrm{A}} = \underline{T}_{\mathrm{A}}^{-1} \begin{bmatrix} X(\mathrm{WP}) = 1.0985 \\ Y(\mathrm{WP}) = 1.0000 \\ Z(\mathrm{WP}) = 0.3558 \end{bmatrix}_{\mathrm{A}}, \qquad \begin{bmatrix} R(\mathrm{WP}) = 0.4514 \\ G(\mathrm{WP}) = 1.0000 \\ B(\mathrm{WP}) = 0.8381 \end{bmatrix}_{\mathrm{D65}} = \underline{T}_{\mathrm{D65}}^{-1} \begin{bmatrix} X(\mathrm{WP}) = 0.9504 \\ Y(\mathrm{WP}) = 1.0000 \\ Z(\mathrm{WP}) = 1.0888 \end{bmatrix}_{\mathrm{D65}},$$
where T_A and T_D65 are example characterization matrices optimized for CIE illuminants A and D65, respectively. As shown in Fig. 4, the WPs of these standard illuminants are very close to the Planckian locus. Illuminant A has CCT=2856  K and Duv=0.0, and illuminant D65 has CCT=6504  K and Duv=0.0032. Evidently, the above Nikon D700 raw values are very different from the camera raw space unit vector, and it would be necessary to apply large multipliers to the red and blue raw pixel values in both cases. These multipliers are known as raw channel multipliers as they are typically applied to the red and blue raw channels before the color demosaic as part of the color conversion strategy used by the internal image-processing engines of traditional digital cameras.

Fig. 4

Estimated reference whites of the Nikon D700 and Olympus E-M1 camera raw spaces in relation to the WPs of CIE illuminants A and D65. The Planckian locus is represented by the black curve. Only visible chromaticities contained within the sRGB color space are shown in color.


An estimate of the Nikon D700 reference white can be obtained by approximating Eq. (7) using a readily available characterization matrix in place of T_. Applying T_A yields (x,y)=(0.3849,0.3058), which corresponds to Duv=0.0378. This has an associated CCT=3155  K as the Duv value is just within the allowed limit, but Fig. 4 shows that the color tint is a strong magenta. This is true of typical camera raw spaces in general.21 A similar estimate for the Olympus E-M1 camera yields (x,y)=(0.3599,0.2551), which corresponds to Duv=0.0637. This does not have an associated CCT, and the color tint is a very strong magenta.

Although the fact that camera raw space reference whites are not neutral in terms of CIE colorimetry has no bearing on the final reproduced image, it will be shown in Sec. 5 that the camera raw space reference white is utilized as a useful intermediary step in the color conversion strategy used by traditional digital cameras.

2.4. Camera Color Characterization

Recall the linear transformation from the camera raw space to CIE XYZ defined by Eq. (1):

$$\begin{bmatrix} X \\ Y \\ Z \end{bmatrix} \approx \underline{T} \begin{bmatrix} R \\ G \\ B \end{bmatrix},$$
where T_ is a 3×3 characterization matrix:

Eq. (9)

$$\underline{T} = \begin{bmatrix} T_{11} & T_{12} & T_{13} \\ T_{21} & T_{22} & T_{23} \\ T_{31} & T_{32} & T_{33} \end{bmatrix}.$$
The color conversion is approximate since the Luther-Ives condition is not satisfied exactly. As mentioned in the introduction, T_ can be optimized for the characterization illuminant, i.e., the scene illumination used to perform the characterization.1,2 The optimum matrix T_ depends upon the illuminant SPD itself, but it largely depends upon the characterization illuminant WP, provided that the illuminant is representative of a real-world SPD.

Characterization matrices optimized for known illuminants can be determined by color-error minimization procedures based upon photographs taken of a standard color chart.2 Although various minimization techniques have been developed, including WP-preserving techniques,26 the procedure discussed below is based on the standardized method B of ISO 17321-1.15

Note that ISO 17321-1 uses processed images output by the camera rather than raw data and consequently requires inversion of the camera opto-electronic conversion function (OECF).27 The OECF defines the nonlinear relationship between irradiance at the sensor plane and the digital output levels of a viewable output image such as a JPEG file produced by the camera. To bypass the need to experimentally determine the OECF, a variation of method B from ISO 17321-1 is described below. This method uses the DCRaw open-source raw converter to decode the raw file so that the raw data can be used directly.28,29

  • 1. Take a photograph of a color chart illuminated by a specified illuminant. Since the raw values scale linearly, only their relative values are important. However, the f-number N and exposure duration t should be chosen so as to avoid clipping.

  • 2. Calculate relative XYZ tristimulus values for each patch of the color chart:

    Eq. (10)

    $$X = k \int_{\lambda_1}^{\lambda_2} \bar{x}(\lambda)\,E_{e,\lambda}\,R(\lambda)\,\mathrm{d}\lambda, \qquad Y = k \int_{\lambda_1}^{\lambda_2} \bar{y}(\lambda)\,E_{e,\lambda}\,R(\lambda)\,\mathrm{d}\lambda, \qquad Z = k \int_{\lambda_1}^{\lambda_2} \bar{z}(\lambda)\,E_{e,\lambda}\,R(\lambda)\,\mathrm{d}\lambda,$$
    where Ee,λ is the spectral irradiance incident at the color chart measured using a spectrometer; x̄(λ), ȳ(λ), and z̄(λ) are the color-matching functions of the CIE XYZ color space; and the integration is discretized into a sum with a 10-nm increment and limits λ1=380 nm and λ2=780 nm. Unless a tristimulus colorimeter is used, the calculation requires knowledge of the spectral reflectance of each patch. Spectral reflectance has been denoted by R(λ) in the above equations, and this should not be confused with the camera response functions. The normalization constant k can be chosen so that Y is in the range [0,1] using the white patch as a white reference.

  • 3. Obtain a linear demosaiced output image directly in the camera raw space without converting to any other color space. Gamma encoding, tone curves, and WB must all be disabled. Since the present method bypasses the need to determine and invert the OECF, it is crucial to disable WB; otherwise, raw channel multipliers may be applied to the raw channels. If using the DCRaw open-source raw converter, an appropriate command is

    dcraw -v -r 1 1 1 1 -o 0 -4 -T filename.
    This yields a 16-bit linear demosaiced output TIFF file in the camera raw space. If working with raw channels rather than demosaiced raw pixel vectors, an appropriate command is
    dcraw -v -D -4 -T filename.
    The above DCRaw commands are explained in Table 3.

  • 4. Measure average R, G, and B values over a 64×64 block of pixels at the center of each patch. Each patch can then be associated with an appropriate average raw pixel vector.

  • 5. Build a 3×n matrix A_ containing the CIE XYZ tristimulus vectors for each patch 1, …, n as columns:

    Eq. (11)

    $$\underline{A} = \begin{bmatrix} X_1 & X_2 & \cdots & X_n \\ Y_1 & Y_2 & \cdots & Y_n \\ Z_1 & Z_2 & \cdots & Z_n \end{bmatrix}.$$
    Similarly, build a 3×n matrix B_ containing the corresponding raw pixel vectors as columns:

    Eq. (12)

    $$\underline{B} = \begin{bmatrix} R_1 & R_2 & \cdots & R_n \\ G_1 & G_2 & \cdots & G_n \\ B_1 & B_2 & \cdots & B_n \end{bmatrix}.$$

  • 6. Estimate the 3×3 characterization matrix T_ that transforms B_ to A_:

    Eq. (13)

    $$\underline{A} \approx \underline{T}\,\underline{B}.$$
    A preliminary solution is obtained using linear least-squares minimization2,15 (see the code sketch following this list):

    Eq. (14)

    $$\underline{T} = \underline{A}\,\underline{B}^{\mathsf{T}}\,(\underline{B}\,\underline{B}^{\mathsf{T}})^{-1},$$
    where the T superscript denotes the transpose operator.

  • 7. Use the preliminary T_ estimate to calculate a new set of estimated CIE XYZ tristimulus values A_′ according to Eq. (13). Transform A_ and A_′ to the perceptually uniform CIE LAB reference color space, and calculate the color difference ΔEi between the estimated and real tristimulus values for each patch i. The set {ΔEi} can be used to calculate the DSC/SMI.8,15 Note that if the Luther-Ives condition were satisfied exactly, then A_′=A_ and a DSC/SMI score of 100 would be obtained.

  • 8. Optimize T_ by minimizing {ΔEi} using the nonlinear optimization technique recommended by ISO 17321-1. The final DSC/SMI defines the final potential color error. Ideally, include a constraint that preserves the characterization illuminant WP.

  • 9. Scale the final T_ according to the normalization required for its practical implementation. This is discussed below.

Provided WB was disabled in step 3, the characterization matrix T_ can be used with arbitrary scene illumination. However, optimum results will be obtained for scene illumination with a WP that closely matches that of the characterization illuminant.
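For reference, the preliminary least-squares step amounts to one line of linear algebra; a sketch assuming the patch matrices A_ and B_ have already been assembled:

```python
import numpy as np

def estimate_T(A, B):
    """Preliminary solution of Eq. (14): T = A B^T (B B^T)^(-1), where A (3 x n)
    holds the XYZ patch vectors and B (3 x n) the corresponding raw patch vectors."""
    return A @ B.T @ np.linalg.inv(B @ B.T)

# Numerically equivalent but better conditioned:
# T = np.linalg.lstsq(B.T, A.T, rcond=None)[0].T
```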

Figure 5 shows how the matrix elements of an optimized characterization matrix vary as a function of characterization illuminant CCT for the Olympus E-M1 camera.

Fig. 5

Variation of the matrix elements of a characterization matrix for the Olympus E-M1 camera as a function of characterization illuminant CCT.


For the same camera, Fig. 6(a) shows a photo of a color chart in the camera raw space taken under D65 illumination. When the camera raw space RGB values are interpreted as RGB values in the sRGB color space for display purposes without any color characterization matrix applied, a strong green color tint is revealed, which arises from the greater transmission of the green Bayer filter. Figure 6(b) shows the same photo converted into the sRGB color space by applying an optimized characterization matrix T_, followed by a matrix that converts the colors from the CIE XYZ color space to sRGB. Evidently, the colors are now displayed correctly.

Fig. 6

(a) Photo of a color chart in the camera raw space taken under D65 illumination. (b) The same photo converted to the sRGB color space.


2.5. Characterization Matrix Normalization

Normalization of a characterization matrix refers to scaling of the entire matrix so that all matrix elements are scaled identically. A typical normalization applied in practice is to ensure that the matrix maps between the characterization illuminant WP expressed using the CIE XYZ color space and the camera raw space such that the raw data just saturates when a 100% neutral diffuse reflector is photographed under the characterization illuminant. The green raw channel is typically the first to saturate.

For example, if the characterization illuminant is D65, then T_ can be normalized such that its inverse provides the following mapping:

Eq. (15)

$$\begin{bmatrix} R(\mathrm{WP}) \\ G(\mathrm{WP}) \\ B(\mathrm{WP}) \end{bmatrix}_{\mathrm{D65}} = \underline{T}^{-1} \begin{bmatrix} X(\mathrm{WP}) = 0.9504 \\ Y(\mathrm{WP}) = 1.0000 \\ Z(\mathrm{WP}) = 1.0888 \end{bmatrix}_{\mathrm{D65}},$$
where max{R(WP),G(WP),B(WP)}=1. Since the green raw channel is typically the first to saturate under most types of illumination, it will typically be the case that G(WP)=1, whereas R(WP)<1 and B(WP)<1.
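A sketch of this normalization, given an unscaled characterization matrix and the CIE XYZ WP of the characterization illuminant:

```python
import numpy as np

def normalize_T(T, wp_xyz):
    """Rescale T so that T^(-1) maps the characterization illuminant WP to raw
    values whose largest component (typically G) is exactly 1, as in Eq. (15)."""
    raw_wp = np.linalg.solve(T, wp_xyz)   # T^(-1) applied to the WP
    return T * raw_wp.max()               # scaling T by s divides the raw WP by s

# Example for a D65 characterization illuminant:
# T_normalized = normalize_T(T, np.array([0.9504, 1.0000, 1.0888]))
```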

For example, the Olympus E-M1 characterization matrices plotted in Fig. 5 for the 4200 and 6800 K characterization illuminants are defined by

Eq. (16)

$$\underline{T}_{4200\,\mathrm{K}} = \begin{bmatrix} 0.8680 & 0.3395 & 0.2133 \\ 0.2883 & 0.8286 & -0.0216 \\ 0.0425 & -0.2647 & 1.7637 \end{bmatrix}, \qquad \underline{T}_{6800\,\mathrm{K}} = \begin{bmatrix} 1.2105 & 0.2502 & 0.1882 \\ 0.4586 & 0.8772 & -0.1328 \\ 0.0936 & -0.2788 & 1.9121 \end{bmatrix}.$$
These matrices are normalized such that the WP of the characterization illuminant maps to raw values where the green raw channel just reaches saturation:

Eq. (17)

$$\begin{bmatrix} R(\mathrm{WP}) = 0.6337 \\ G(\mathrm{WP}) = 1.0000 \\ B(\mathrm{WP}) = 0.5267 \end{bmatrix}_{4200\,\mathrm{K}} = \underline{T}_{4200\,\mathrm{K}}^{-1} \begin{bmatrix} X(\mathrm{WP}) = 1.0019 \\ Y(\mathrm{WP}) = 1.0000 \\ Z(\mathrm{WP}) = 0.6911 \end{bmatrix}_{4200\,\mathrm{K}}, \qquad \begin{bmatrix} R(\mathrm{WP}) = 0.4793 \\ G(\mathrm{WP}) = 1.0000 \\ B(\mathrm{WP}) = 0.7312 \end{bmatrix}_{6800\,\mathrm{K}} = \underline{T}_{6800\,\mathrm{K}}^{-1} \begin{bmatrix} X(\mathrm{WP}) = 0.9682 \\ Y(\mathrm{WP}) = 1.0000 \\ Z(\mathrm{WP}) = 1.1642 \end{bmatrix}_{6800\,\mathrm{K}}.$$

3. White Balance

A remarkable property of the HVS is its ability to naturally adjust to the ambient lighting conditions. For example, if a 100% neutral diffuse reflector is placed in a photographic scene illuminated by daylight, the reflector appears to be neutral white. Later in the day when there is a change in the chromaticity or CCT of the scene illumination, the color of the reflector would be expected to change accordingly. However, the reflector will continue to appear neutral white. In other words, the perceived color of objects remains relatively constant under varying types of scene illumination, which is known as color constancy.3,4

The chromatic adaptation mechanism by which the HVS achieves color constancy is complex and not fully understood, but a simplified explanation is that the HVS aims to discount the chromaticity of the illuminant.30 Back in 1902, von Kries postulated that this is achieved by an independent scaling of each eye cone response function.3,4 The color stimulus that an observer adapted to the ambient conditions considers to be neutral white (perfectly achromatic with 100% relative luminance) is defined as the adapted white.31

Since camera response functions do not naturally emulate the HVS by discounting the chromaticity of the scene illumination, an output image will appear too warm or too cold if it is displayed using illumination with a WP that does not match the adapted white for the photographic scene at the time the photograph was taken. This is known as incorrect WB. The issue can be solved by implementing the following computational strategy.

  • 1. Inform the camera of the adapted white before taking a photograph. Due to the complex dependence of the true adapted white on the ambient conditions, this task is replaced by a simpler one in practice, namely to identify the scene illumination WP. For example, a WB preset corresponding to the scene illumination can be manually selected, a scene illumination CCT estimate can be manually entered, or the camera can compute its own estimate by analyzing the raw data using the automatic WB function. In all cases, the camera estimate for the scene illumination WP is known as the camera neutral32 or adopted white (AW).31 (This illuminant estimation step should not be confused with WB. Illuminant estimation refers to the computational approaches used by the automatic WB function to estimate the scene illumination WP. A very simple illuminant estimation approach is the “gray world” method,33 which assumes that the average of all of the scene colors will turn out to be achromatic. Another simple approach is to assume that the brightest white is likely to correspond to the scene illumination WP.34 However, practical illuminant estimation algorithms are much more sophisticated.35,36)

  • 2. Choose a standard reference white that will be used when displaying the output image. If the image will be displayed using a standard output-referred color space such as sRGB, the chosen reference white will be that of the output-referred color space, which is CIE illuminant D65 in the case of sRGB.

  • 3. Chromatically adapt the image colors by adapting the scene illumination WP estimate (the AW) so that it becomes the reference white of the chosen output-referred color space. This white balancing step is achieved by applying a CAT.

The CAT needs to be applied as part of the overall color conversion from the camera raw space to the chosen output-referred color space. Different approaches for combining these components exist. The typical approach used in color science is to convert from the camera raw space to CIE XYZ, apply the CAT, and then convert to the chosen output-referred color space. In the case of sRGB,

Eq. (18)

$$\begin{bmatrix} R_L \\ G_L \\ B_L \end{bmatrix}_{\mathrm{D65}} = \underline{M}_{\mathrm{sRGB}}^{-1}\; \underline{\mathrm{CAT}}_{\mathrm{AW} \to \mathrm{D65}}\; \underline{T} \begin{bmatrix} R \\ G \\ B \end{bmatrix}_{\mathrm{scene}},$$
where T_ is a characterization matrix that converts from the camera raw space to CIE XYZ and is optimized for the scene AW, the matrix CAT_(AW→D65) applied in the CIE XYZ color space is a CAT that adapts the AW to the D65 reference white of the sRGB color space, and finally M_sRGB^−1 is the matrix that converts from CIE XYZ to the linear form of the sRGB color space:

Eq. (19)

$$\underline{M}_{\mathrm{sRGB}}^{-1} = \begin{bmatrix} 3.2410 & -1.5374 & -0.4986 \\ -0.9692 & 1.8760 & 0.0416 \\ 0.0556 & -0.2040 & 1.0570 \end{bmatrix}.$$
In particular, the AW in the camera raw space is mapped to the reference white of the output-referred color space, defined by the unit vector in that color space:

Eq. (20)

$$\begin{bmatrix} R_L = 1 \\ G_L = 1 \\ B_L = 1 \end{bmatrix}_{\mathrm{D65}} = \underline{M}_{\mathrm{sRGB}}^{-1}\; \underline{\mathrm{CAT}}_{\mathrm{AW} \to \mathrm{D65}}\; \underline{T} \begin{bmatrix} R(\mathrm{AW}) \\ G(\mathrm{AW}) \\ B(\mathrm{AW}) \end{bmatrix}_{\mathrm{scene}}.$$
When the encoded output image is viewed on a calibrated display monitor, a scene object that the HVS regarded as being white at the time the photograph was taken will now be displayed using the D65 reference white. Ideally, the ambient viewing conditions should match those defined as appropriate for viewing the sRGB color space.

If the scene illumination WP estimate is far from the true scene illumination WP, then incorrect WB will be evident to the HVS. If the scene illumination CCT estimate is higher than the true CCT, then the photo will appear too warm. Conversely, if the scene illumination CCT estimate is lower than the true CCT, then the photo will appear too cold.

Figure 7(a) shows a photo of a color chart taken under 2700 K CCT tungsten illumination using the Olympus E-M1 camera. A characterization matrix T_ was applied to convert the colors into CIE XYZ, followed by M_sRGB^−1 to convert the colors to sRGB. Evidently, the true color of the scene illumination is revealed since no chromatic adaptation has been performed by the camera. In other words, the photo appears too warm in relation to the D65 reference white of the sRGB color space. Figure 7(b) shows the same photo after white balancing by including a CAT that chromatically adapts the scene illumination WP to the sRGB color space D65 reference white, which has a 6504 K CCT and Duv=0.0032 color tint.

Fig. 7

(a) Photo of a color chart taken under 2700 K CCT tungsten illumination and converted to the sRGB color space for display without any chromatic adaptation. (b) White balanced photo obtained by including a CAT to adapt the scene illumination WP to the D65 reference white of the sRGB color space.


3.1. Chromatic Adaptation Transforms

A CAT is a computational technique for adjusting the WP of a given SPD. It achieves this goal by attempting to mimic the chromatic adaptation mechanism of the HVS. In the context of digital cameras, the most important CATs are the Bradford CAT and raw channel scaling.

In 1902, von Kries postulated that the chromatic adaptation mechanism be modeled as an independent scaling of each eye cone response function,3,4 which is equivalent to scaling the L, M, and S tristimulus values in the LMS color space. To illustrate the von Kries CAT, consider adapting the scene illumination WP estimate (the AW) to the WP of D65 illumination:

Eq. (21)

$$\begin{bmatrix} X \\ Y \\ Z \end{bmatrix}_{\mathrm{D65}} = \underline{\mathrm{CAT}}_{\mathrm{AW} \to \mathrm{D65}} \begin{bmatrix} X \\ Y \\ Z \end{bmatrix}_{\mathrm{scene}}.$$
In this case, the von Kries CAT, which must be applied to every CIE XYZ pixel vector, can be written as

Eq. (22)

$$\underline{\mathrm{CAT}}_{\mathrm{AW} \to \mathrm{D65}} = \underline{M}_{\mathrm{vK}}^{-1} \begin{bmatrix} \dfrac{L(\mathrm{D65})}{L(\mathrm{AW})} & 0 & 0 \\[1ex] 0 & \dfrac{M(\mathrm{D65})}{M(\mathrm{AW})} & 0 \\[1ex] 0 & 0 & \dfrac{S(\mathrm{D65})}{S(\mathrm{AW})} \end{bmatrix} \underline{M}_{\mathrm{vK}}.$$
The matrix M_vK transforms each pixel vector from the CIE XYZ color space into the LMS color space, where the diagonal scaling matrix is applied. Modern forms of M_vK include matrices based on the cone fundamentals defined by the CIE in 200637 and the Hunt–Pointer–Estevez transformation matrix38 defined by

Eq. (23)

$$\underline{M}_{\mathrm{vK}} = \begin{bmatrix} 0.38971 & 0.68898 & -0.07868 \\ -0.22981 & 1.18340 & 0.04641 \\ 0.00000 & 0.00000 & 1.00000 \end{bmatrix}.$$
After applying M_vK, the L, M, and S values are independently scaled according to the von Kries hypothesis. In the present example, the scaling factors arise from the ratio between the AW and D65 WPs. These can be obtained from the following WP vectors:

Eq. (24)

$$\begin{bmatrix} L(\mathrm{AW}) \\ M(\mathrm{AW}) \\ S(\mathrm{AW}) \end{bmatrix} = \underline{M}_{\mathrm{vK}} \begin{bmatrix} X(\mathrm{WP}) \\ Y(\mathrm{WP}) \\ Z(\mathrm{WP}) \end{bmatrix}_{\mathrm{scene}}, \qquad \begin{bmatrix} L(\mathrm{D65}) \\ M(\mathrm{D65}) \\ S(\mathrm{D65}) \end{bmatrix} = \underline{M}_{\mathrm{vK}} \begin{bmatrix} X(\mathrm{WP}) = 0.9504 \\ Y(\mathrm{WP}) = 1.0000 \\ Z(\mathrm{WP}) = 1.0888 \end{bmatrix}_{\mathrm{D65}}.$$
Finally, the inverse of the transformation matrix M_vK is applied to convert each pixel vector back into the CIE XYZ color space.

The Bradford CAT39 can be regarded as an improved version of the von Kries CAT. A simplified linearized version is recommended by the ICC for use in digital imaging.40 The linear Bradford CAT can be implemented in an analogous fashion to the von Kries CAT, the difference being that the L, M, and S tristimulus values are replaced by ρ, γ, and β, which correspond to a "sharpened" artificial eye cone space. The transformation matrix is defined by

Eq. (25)

$$\underline{M}_{\mathrm{BFD}} = \begin{bmatrix} 0.8951 & 0.2664 & -0.1614 \\ -0.7502 & 1.7135 & 0.0367 \\ 0.0389 & -0.0685 & 1.0296 \end{bmatrix}.$$
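The von Kries and linear Bradford CATs therefore share the wrap-scale-unwrap structure of Eq. (22) and differ only in the transformation matrix; a minimal sketch:

```python
import numpy as np

# Linear Bradford matrix of Eq. (25); the Hunt-Pointer-Estevez matrix of
# Eq. (23) could be substituted to obtain a von Kries CAT instead.
M_BFD = np.array([[ 0.8951,  0.2664, -0.1614],
                  [-0.7502,  1.7135,  0.0367],
                  [ 0.0389, -0.0685,  1.0296]])

def cat_matrix(wp_src, wp_dst, M=M_BFD):
    """Build the CAT that adapts CIE XYZ colors from a source WP (e.g., the AW)
    to a destination WP (e.g., D65), following the structure of Eq. (22)."""
    gains = (M @ wp_dst) / (M @ wp_src)    # independent channel scalings
    return np.linalg.inv(M) @ np.diag(gains) @ M

# Example: adapt the illuminant A WP to the D65 WP.
CAT = cat_matrix(np.array([1.0985, 1.0000, 0.3558]),
                 np.array([0.9504, 1.0000, 1.0888]))
```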

Analogous to the independent scaling of the eye cone response functions hypothesized by von Kries, a type of CAT can be applied in the camera raw space by directly scaling the raw channels. Consider a Bayer block for the AW obtained by photographing a 100% neutral diffuse reflector under the scene illumination. The following operation will adapt the AW to the reference white (RW) of the camera raw space:

Eq. (26)

$$\begin{bmatrix} R \\ G \\ B \end{bmatrix}_{\mathrm{RW}} = \underline{\mathrm{CAT}}_{\mathrm{AW} \to \mathrm{RW}} \begin{bmatrix} R \\ G \\ B \end{bmatrix}_{\mathrm{scene}},$$
where

Eq. (27)

$$\underline{\mathrm{CAT}}_{\mathrm{AW} \to \mathrm{RW}} = \underline{D} = \begin{bmatrix} \dfrac{1}{R(\mathrm{AW})} & 0 & 0 \\[1ex] 0 & \dfrac{1}{G(\mathrm{AW})} & 0 \\[1ex] 0 & 0 & \dfrac{1}{B(\mathrm{AW})} \end{bmatrix}_{\mathrm{scene}}.$$
The diagonal scaling factors, known as raw channel multipliers, can be obtained directly from the raw data using the AW calculated by the camera. For example, AW=D65 for D65 scene illumination, in which case

Eq. (28)

$$\underline{\mathrm{CAT}}_{\mathrm{D65} \to \mathrm{RW}} = \underline{D}_{\mathrm{D65}} = \begin{bmatrix} \dfrac{1}{R(\mathrm{D65})} & 0 & 0 \\[1ex] 0 & \dfrac{1}{G(\mathrm{D65})} & 0 \\[1ex] 0 & 0 & \dfrac{1}{B(\mathrm{D65})} \end{bmatrix},$$
where R(D65), G(D65), and B(D65) are extracted from the Bayer block for a 100% neutral diffuse reflector photographed under D65 scene illumination.

In the context of digital cameras, the type of CAT defined by raw channel multipliers has been found to work better in practice, particularly for extreme cases.21,32 A reason for this is that the raw channel multipliers are applied in the camera raw space prior to application of a color conversion matrix. The camera raw space corresponds to a physical capture device, but CATs such as the linear Bradford CAT are applied in the CIE XYZ color space after applying a color conversion matrix that contains error. In particular, color errors that have been minimized in a nonlinear color space such as CIE LAB will be unevenly amplified, so the color conversion will no longer be optimal.41

4. Smartphone Cameras

Smartphone manufacturers, along with commercial raw conversion software developers, typically implement the conventional type of computational color conversion strategy used in color science that was introduced in Sec. 3. Since the camera raw space is transformed into CIE XYZ as the first step, image processing techniques can be applied in the CIE XYZ color space (or following a transformation into some other intermediate color space) before the final transformation to an output-referred RGB color space.

Consider the white-balanced transformation from the camera raw space to an output-referred RGB color space. Unlike in traditional digital cameras, the color demosaic is typically carried out first, so the vector notation used for the camera raw space below refers to raw pixel vectors rather than Bayer blocks. In the case of sRGB, the transformation that must be applied to every raw pixel vector is defined by

Eq. (29)

$$\begin{bmatrix} R_L \\ G_L \\ B_L \end{bmatrix}_{\mathrm{D65}} = \underline{M}_{\mathrm{sRGB}}^{-1}\; \underline{\mathrm{CAT}}_{\mathrm{AW} \to \mathrm{D65}}\; \underline{T} \begin{bmatrix} R \\ G \\ B \end{bmatrix}_{\mathrm{scene}}.$$
The conversion can be decomposed into three steps.

  • 1. After the camera has estimated the scene illumination WP (the AW), a characterization matrix T_ optimized for the AW that converts from the camera raw space to CIE XYZ is applied:

    Eq. (30)

    $$\begin{bmatrix} X \\ Y \\ Z \end{bmatrix}_{\mathrm{scene}} = \underline{T} \begin{bmatrix} R \\ G \\ B \end{bmatrix}_{\mathrm{scene}}.$$
    The optimized matrix T_ is typically normalized such that the AW in the CIE XYZ space is obtained when the raw pixel vector corresponding to a neutral diffuse reflector illuminated by the AW just reaches saturation:

    Eq. (31)

    $$\begin{bmatrix} X(\mathrm{AW}) \\ Y(\mathrm{AW}) \\ Z(\mathrm{AW}) \end{bmatrix}_{\mathrm{scene}} = \underline{T} \begin{bmatrix} R(\mathrm{AW}) \\ G(\mathrm{AW}) \\ B(\mathrm{AW}) \end{bmatrix}_{\mathrm{scene}},$$
    where Y(AW)=1 and max{R(AW),G(AW),B(AW)}=1. As discussed in Sec. 2.5, the green component is typically the first to saturate, so R(AW)<1 and B(AW)<1 in general.

  • 2. Since T_ does not alter the AW, a CAT is applied to achieve WB by adapting the AW to the reference white of the chosen output-referred color space. This is D65 in the case of sRGB:

    Eq. (32)

    $$\begin{bmatrix} X \\ Y \\ Z \end{bmatrix}_{\mathrm{D65}} = \underline{\mathrm{CAT}}_{\mathrm{AW} \to \mathrm{D65}} \begin{bmatrix} X \\ Y \\ Z \end{bmatrix}_{\mathrm{scene}}.$$
    The ICC recommends implementing the CAT using the linear Bradford CAT matrix defined by Eq. (25).

  • 3. A matrix that converts from CIE XYZ to the linear form of the chosen output-referred color space is applied. In the case of sRGB,

    Eq. (33)

    $$\begin{bmatrix} R_L \\ G_L \\ B_L \end{bmatrix}_{\mathrm{D65}} = \underline{M}_{\mathrm{sRGB}}^{-1} \begin{bmatrix} X \\ Y \\ Z \end{bmatrix}_{\mathrm{D65}}.$$

Finally, the digital output levels of the output image are determined by applying the nonlinear gamma encoding curve of the output-referred color space and reducing the bit depth to 8. In modern digital imaging, encoding gamma curves are designed to minimize visible banding artifacts when the bit depth is reduced, and the non-linearity introduced is later reversed by the display gamma.28
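For instance, the final encoding step for sRGB can be sketched using the standard sRGB transfer function:

```python
import numpy as np

def srgb_encode(linear):
    """Apply the standard sRGB gamma encoding curve to linear values in [0, 1]
    and quantize to 8-bit digital output levels."""
    linear = np.clip(linear, 0.0, 1.0)
    v = np.where(linear <= 0.0031308,
                 12.92 * linear,
                 1.055 * np.power(linear, 1.0 / 2.4) - 0.055)
    return np.round(255.0 * v).astype(np.uint8)
```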

To see that WB is correctly achieved, the above steps can be followed for the specific case of the raw pixel vector that corresponds to the AW. As required by Eq. (20), it is found that this maps to the reference white of the output-referred color space defined by the unit vector in that color space:

$$\begin{bmatrix} R_L = 1 \\ G_L = 1 \\ B_L = 1 \end{bmatrix}_{\mathrm{D65}} = \underline{M}_{\mathrm{sRGB}}^{-1}\; \underline{\mathrm{CAT}}_{\mathrm{AW} \to \mathrm{D65}}\; \underline{T} \begin{bmatrix} R(\mathrm{AW}) \\ G(\mathrm{AW}) \\ B(\mathrm{AW}) \end{bmatrix}_{\mathrm{scene}}.$$

Although the matrix transformation defined by Eq. (29) appears to be straightforward, the characterization matrix T_ should in principle be optimized for the AW. However, it is impractical to determine a characterization matrix optimized for each possible scene illumination WP that could occur. For example, if CCTs are specified to the nearest Kelvin and color tint is neglected, then 12,000 matrices would be required to cover scene illumination WPs from 2000 to 14,000 K.

The computationally simplest solution used on some mobile phone cameras is to approximate the optimized characterization matrix T_ using a single fixed matrix optimized for a representative illuminant. For example, this could be D65 illumination, in which case T_ optimized for the AW is approximated as T_D65. The drawback of this very simple approach is that the color conversion loses some accuracy when the scene illumination WP differs significantly from the WP of the representative illuminant.

As described below, an advanced solution to the problem is to adopt the type of approach used by the Adobe DNG converter.32 The idea is to interpolate between two preset characterization matrices that are optimized for use with either a low-CCT or high-CCT illuminant. For a given scene illumination, an interpolated matrix optimized for the CCT of the AW can be determined.

4.1. Interpolation Algorithm

If using the advanced approach mentioned above, the optimized characterization matrix T_ required by Eq. (29) can be calculated by interpolating between two characterization matrices T1_ and T2_ based on the scene illumination CCT estimate denoted by CCT(AW), together with the CCTs of the two characterization illuminants denoted by CCT1 and CCT2, respectively, with CCT1<CCT2. For example, illuminant 1 could be a low-CCT illuminant such as CIE illuminant A, whereas illuminant 2 could be a high-CCT illuminant such as D65.

The first step is to appropriately normalize T1_ and T2_. Although characterization matrices are typically normalized according to their corresponding characterization illuminant WPs as demonstrated in Sec. 2.5, it is more convenient to normalize T1_ and T2_ according to a common WP when implementing an interpolation algorithm. Unfortunately, the AW cannot be expressed using the CIE XYZ color space at this stage since T_ is yet to be determined. Instead, the common WP could be chosen to be the reference white of the output-referred color space, which is D65 for sRGB. In this case, T1_ and T2_ should both be scaled according to Eq. (15):

Eq. (34)

$$\begin{bmatrix} R(\mathrm{WP}) \\ G(\mathrm{WP}) \\ B(\mathrm{WP}) \end{bmatrix}_{\mathrm{D65}} = \underline{T}_1^{-1} \begin{bmatrix} X(\mathrm{WP}) = 0.9504 \\ Y(\mathrm{WP}) = 1.0000 \\ Z(\mathrm{WP}) = 1.0888 \end{bmatrix}_{\mathrm{D65}}, \qquad \begin{bmatrix} R(\mathrm{WP}) \\ G(\mathrm{WP}) \\ B(\mathrm{WP}) \end{bmatrix}_{\mathrm{D65}} = \underline{T}_2^{-1} \begin{bmatrix} X(\mathrm{WP}) = 0.9504 \\ Y(\mathrm{WP}) = 1.0000 \\ Z(\mathrm{WP}) = 1.0888 \end{bmatrix}_{\mathrm{D65}},$$
where Y(WP)=1 and max{R(WP),G(WP),B(WP)}=1.

Unless the smartphone utilizes a color sensor that can directly estimate the scene illumination WP in terms of (x,y) chromaticity coordinates, the AW is calculated by the camera in terms of raw values R(AW), G(AW), and B(AW), so the AW cannot be expressed using the CIE XYZ color space prior to the interpolation. However, the corresponding CCT(AW) requires knowledge of the (x,y) chromaticity coordinates, which means converting to CIE XYZ via a matrix transformation T_ that itself depends upon the unknown CCT(AW). This problem can be solved using a self-consistent iteration procedure.32

  • 1. Make a guess for the AW chromaticity coordinates, (x(AW),y(AW)). For example, the chromaticity coordinates corresponding to one of the characterization illuminants could be used.

  • 2. Find the CCT value CCT(AW) that corresponds to the chromaticity coordinates (x(AW),y(AW)). A widely used approach is to convert (x(AW),y(AW)) into the corresponding (u(AW),v(AW)) chromaticity coordinates on the 1960 UCS chromaticity diagram,23,24 where isotherms are normal to the Planckian locus. This enables CCT(AW) to be determined using Robertson’s method.42 Alternatively, approximate formulas4345 or more recent algorithms46 can be implemented.

  • 3. Perform the interpolation so that

    Eq. (35)

    $$\underline{T}(\mathrm{AW}) = f[\underline{T}_1(\mathrm{CCT}_1), \underline{T}_2(\mathrm{CCT}_2)],$$
    where f is the interpolation function. The interpolation is valid for CCT1 ≤ CCT(AW) ≤ CCT2. If CCT(AW) < CCT1, then T_ should be set equal to T1_, and if CCT(AW) > CCT2, then T_ should be set equal to T2_.

  • 4. Use T_ to transform the AW from the camera raw space to the CIE XYZ color space:

    Eq. (36)

    $$\begin{bmatrix} X(\mathrm{AW}) \\ Y(\mathrm{AW}) \\ Z(\mathrm{AW}) \end{bmatrix}_{\mathrm{scene}} = \underline{T} \begin{bmatrix} R(\mathrm{AW}) \\ G(\mathrm{AW}) \\ B(\mathrm{AW}) \end{bmatrix}_{\mathrm{scene}}.$$
    This yields a new guess for (x(AW),y(AW)).

  • 5. Repeat the procedure starting from step 2 until (x(AW),y(AW)), CCT(AW), and T_ all converge to a stable solution.

After the interpolation has been carried out, T_ inherits the normalization of Eq. (34). However, the AW can now be expressed using the CIE XYZ color space, so T_ can be renormalized to satisfy Eq. (31).

If the smartphone utilizes a color sensor that can directly estimate the scene illumination WP in terms of (x,y) chromaticity coordinates, then only steps 2 and 3 above are required.
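A sketch of the iteration is given below. The interpolation function f is manufacturer specific; linear interpolation in inverse CCT is used here as one plausible choice, and cct_from_xy stands in for a user-supplied implementation of step 2 (e.g., Robertson's method).

```python
import numpy as np

def interpolate_T(cct_aw, T1, cct1, T2, cct2):
    """Eq. (35) with linear interpolation in inverse CCT (an assumed form of f)."""
    if cct_aw <= cct1:
        return T1
    if cct_aw >= cct2:
        return T2
    g = (1.0 / cct_aw - 1.0 / cct2) / (1.0 / cct1 - 1.0 / cct2)
    return g * T1 + (1.0 - g) * T2

def self_consistent_T(raw_aw, T1, cct1, T2, cct2, cct_from_xy, n_iter=20):
    """Steps 1 to 5 of Sec. 4.1 for a camera AW estimate given in raw values."""
    cct_aw = 0.5 * (cct1 + cct2)                         # step 1: initial guess
    for _ in range(n_iter):
        T = interpolate_T(cct_aw, T1, cct1, T2, cct2)    # step 3
        XYZ = T @ raw_aw                                 # step 4
        x, y = XYZ[0] / XYZ.sum(), XYZ[1] / XYZ.sum()
        cct_aw = cct_from_xy(x, y)                       # step 2 on the new guess
    return T
```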

5. Traditional Digital Cameras

Consider again the white-balanced transformation from the camera raw space to an output-referred RGB color space. In the case of sRGB, the transformation is defined by Eq. (29):

$$\begin{bmatrix} R_L \\ G_L \\ B_L \end{bmatrix}_{\mathrm{D65}} = \underline{M}_{\mathrm{sRGB}}^{-1}\; \underline{\mathrm{CAT}}_{\mathrm{AW} \to \mathrm{D65}}\; \underline{T} \begin{bmatrix} R \\ G \\ B \end{bmatrix}_{\mathrm{scene}},$$
where CAT_(AW→D65) adapts the scene illumination WP estimate (the AW) to the sRGB color space D65 reference white. Traditional camera manufacturers typically re-express the above equation in the following manner:

Eq. (37)

$$\begin{bmatrix} R_L \\ G_L \\ B_L \end{bmatrix}_{\mathrm{D65}} = \underline{R}\,\underline{D} \begin{bmatrix} R \\ G \\ B \end{bmatrix}_{\mathrm{scene}}.$$
This equation can be interpreted by decomposing the conversion into two steps.

  • 1. The matrix D_ is a diagonal WB matrix containing raw channel multipliers appropriate for the AW:

    Eq. (38)

    $$\underline{D} = \begin{bmatrix} \dfrac{1}{R(\mathrm{AW})} & 0 & 0 \\[1ex] 0 & \dfrac{1}{G(\mathrm{AW})} & 0 \\[1ex] 0 & 0 & \dfrac{1}{B(\mathrm{AW})} \end{bmatrix}_{\mathrm{scene}}.$$
    These are applied to the raw channels before the color demosaic. As shown by Eq. (27), the raw channel multipliers, in particular, serve to chromatically adapt the AW to the reference white of the camera raw space:

    Eq. (39)

    $$\begin{bmatrix} R = 1 \\ G = 1 \\ B = 1 \end{bmatrix}_{\mathrm{reference}} = \underline{D} \begin{bmatrix} R(\mathrm{AW}) \\ G(\mathrm{AW}) \\ B(\mathrm{AW}) \end{bmatrix}_{\mathrm{scene}}.$$

  • 2. The matrix R_ is a color rotation matrix optimized for the scene illumination. After the color demosaic has been performed, R_ is applied to convert directly from the camera raw space to the linear form of the chosen output-referred color space. By comparison of Eqs. (29) and (37), R_ is algebraically defined as

    Eq. (40)

    $$\underline{R} = \underline{M}_{\mathrm{sRGB}}^{-1}\; \underline{\mathrm{CAT}}_{\mathrm{AW} \to \mathrm{D65}}\; \underline{T}\; \underline{D}^{-1}.$$
    Color rotation matrices have the important property that each of their rows sums to unity:

    Eq. (41)

    $$R(1,1) + R(1,2) + R(1,3) = 1, \qquad R(2,1) + R(2,2) + R(2,3) = 1, \qquad R(3,1) + R(3,2) + R(3,3) = 1.$$
    Consequently, R_ maps the reference white of the camera raw space directly to the reference white of the output-referred color space.21 In the case of sRGB,

    Eq. (42)

    $$\begin{bmatrix} R_L = 1 \\ G_L = 1 \\ B_L = 1 \end{bmatrix}_{\mathrm{D65}} = \underline{R} \begin{bmatrix} R = 1 \\ G = 1 \\ B = 1 \end{bmatrix}_{\mathrm{reference}}.$$

Combining Eqs. (39) and (42) shows that overall WB is achieved since the raw pixel vector corresponding to the AW is mapped to the reference white of the output-referred color space:

Eq. (43)

$$\begin{bmatrix} R_L = 1 \\ G_L = 1 \\ B_L = 1 \end{bmatrix}_{\mathrm{D65}} = \underline{R}\,\underline{D} \begin{bmatrix} R(\mathrm{AW}) \\ G(\mathrm{AW}) \\ B(\mathrm{AW}) \end{bmatrix}_{\mathrm{scene}}.$$

Like the characterization matrix T_, the color rotation matrix R_ should in principle be optimized for the scene illumination. Rather than use an interpolation-based approach, the reformulation in the form of Eq. (37) enables traditional camera manufacturers to adopt an alternative and computationally simple approach that can be straightforwardly implemented on fixed-point number architecture.
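The algebra of Eqs. (38) to (43) can be condensed into a short sketch; the assertion checks the row-sum property of Eq. (41), which holds when T_ is normalized according to Eq. (31).

```python
import numpy as np

def rotation_matrix(M_srgb_inv, CAT, T, raw_aw):
    """Eq. (40): fold the raw channel multipliers out of the color conversion.
    raw_aw holds R(AW), G(AW), B(AW) for the chosen preset illuminant."""
    D = np.diag(1.0 / np.asarray(raw_aw))      # WB matrix of Eq. (38)
    R = M_srgb_inv @ CAT @ T @ np.linalg.inv(D)
    # Each row of the rotation matrix sums to unity (Eq. (41)):
    assert np.allclose(R.sum(axis=1), 1.0)
    return R
```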

5.1. Multiplier and Matrix Decoupling

Although Eq. (37) appears to be a straightforward reformulation of Eq. (29), it has several advantages that arise from the raw channel multipliers contained within the WB matrix D_ having been extracted. As shown in Fig. 8, the variation of the elements of a color rotation matrix with respect to CCT is very small. The stability is greater than that of the elements of a conventional characterization matrix T_, as evident from comparison of Figs. 5 and 8.

Fig. 8

Variation of the matrix elements of the raw-to-sRGB color rotation matrix R_ used by the Olympus E-M1 camera as a function of CCT.


Consequently, it suffices to determine a small set of n preset color rotation matrices that cover a range of WPs or CCTs, with each matrix optimized for a particular preset WP or CCT:

Eq. (44)

$$\underline{R}_i = \underline{M}_{\mathrm{sRGB}}^{-1}\; \underline{\mathrm{CAT}}_{\mathrm{AW} \to \mathrm{D65}}\; \underline{T}_i\; \underline{D}_i^{-1},$$
where i = 1, …, n. When the AW is calculated by the camera, the color rotation matrix R_i optimized for the closest-matching WP or CCT preset can be selected. However, the WB matrix D_ appropriate for the AW is always applied prior to R_i, so the overall color conversion can be expressed as

Eq. (45)

$$\begin{bmatrix} R_L \\ G_L \\ B_L \end{bmatrix}_{\mathrm{D65}} = \left( \underline{M}_{\mathrm{sRGB}}^{-1}\; \underline{\mathrm{CAT}}_{\mathrm{AW} \to \mathrm{D65}}\; \underline{T}_i\; \underline{D}_i^{-1} \right) \underline{D} \begin{bmatrix} R \\ G \\ B \end{bmatrix}_{\mathrm{scene}}.$$
Since D_ is decoupled from the rotation matrices, this approach will achieve correct WB without the need to interpolate the rotation matrices.

It should be noted that the camera raw space correctly represents the scene (albeit via a non-standard color model) and that the raw channel multipliers contained within D_ are not applied to “correct” anything concerning the representation of the true scene white by the camera raw space, as often assumed. The multipliers are applied to chromatically adapt the AW to the reference white of the camera raw space as part of the overall CAT required to achieve WB by emulating the chromatic adaptation mechanism of the HVS. As shown in Fig. 4, the reference white of a camera raw space is typically a magenta color when expressed using CIE colorimetry, but it serves as a useful intermediary stage in the required color transformation as it facilitates the extraction of a channel scaling component that can be decoupled from the matrix operation. Other advantages of the reformulation include the following.

  • The raw channel multipliers contained within D_ can be applied to the raw channels before the color demosaic is performed. This results in a demosaic of better quality.21

  • The method can be efficiently implemented on fixed-point architecture.47

  • If desired, part of the raw channel scaling can be carried out in the analog domain using analog amplification. This is beneficial for image quality if the analog-to-digital converter (ADC) does not have a sufficiently high bit depth. Note that this type of analog amplification will affect the input to output-referred unit conversion factors g_i defined by Eq. (80) in the Appendix.

  • The raw channel multipliers contained within D_ that appear in Eq. (37) are stored in the proprietary raw file metadata and are applied by the internal JPEG image-processing engine of the camera. Since the raw channel multipliers do not affect the raw data, they can be utilized by external raw conversion software provided by the camera manufacturer and can be easily adjusted by the user.

  • Scene illumination presets that include a color tint can be straightforwardly implemented by storing the appropriate preset color rotation matrices and raw channel multipliers, as illustrated in Sec. 5.2.

5.2. Example: Olympus E-M1

Although the color matrices used by the camera manufacturers are generally unknown, certain manufacturers such as Sony and Olympus do reveal information about the color rotation matrices used by their cameras that can be extracted from the raw metadata.

Table 1 lists the data illustrated in Fig. 8 for the preset color rotation matrices used by the Olympus E-M1 digital camera, along with the scene illumination CCT ranges over which each matrix is applied. Figure 9 shows how the raw channel multipliers for the same camera vary as a function of CCT. The data was extracted from raw metadata using the freeware “ExifTool” application.48 The color conversion strategy of the camera can be summarized as follows.

  • 1. The camera determines the scene illumination WP estimate (the AW) using either an auto-WB algorithm, a selected scene illumination preset, or a custom CCT provided by the user. The AW is used to calculate the appropriate raw channel multipliers via Eq. (38) so that the diagonal WB matrix D_ can be applied to the raw channels. In particular, D_ serves to adapt the AW to the reference white of the camera raw space.

  • 2. After the color demosaic is performed, the camera chooses the preset color rotation matrix R_i optimized for illumination with a CCT that provides the closest match to the CCT associated with the AW, or the closest matching scene illumination preset.

  • 3. The camera applies R_i to convert to the output-referred color space selected in-camera by the user, such as sRGB. In particular, the camera raw space reference white is mapped to the reference white of the selected output-referred color space, which is D65 in the case of sRGB.

The Olympus E-M1 camera also includes several scene illumination presets. The color rotation matrices and associated raw channel multipliers for these scene presets are listed in Table 2. For a given CCT, notice that the scene preset matrices and multipliers are not necessarily the same as those listed in Table 1. This is because the scene preset renderings include a color tint away from the Planckian locus, so the chromaticity coordinates are not necessarily the same as those listed in Table 1 for a given CCT. For the same reason, notice that the “fine weather,” “underwater,” and “flash” scene mode presets actually use the same color rotation matrix but use very different raw channel multipliers.
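The fixed-point entries listed in Tables 1 and 2 below are straightforward to decode; a minimal sketch using the first custom CCT preset of Table 1:

```python
import numpy as np

# First preset of Table 1: 8-bit fixed-point integers whose rows sum to 256.
R_fixed = np.array([[ 320,  -36,  -28],
                    [ -68,  308,   16],
                    [  14, -248,  490]])

R = R_fixed / 256.0                       # floating-point rotation matrix
assert np.allclose(R.sum(axis=1), 1.0)    # each row now sums to unity (Eq. (41))
```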

Table 1

Raw-to-sRGB color rotation matrices corresponding to ranges of in-camera custom CCTs for the Olympus E-M1 camera with 12-100/4 lens and v4.1 firmware. The middle column lists the matrices extracted from the raw metadata, which are 8-bit fixed-point numbers with rows that sum to 256. The right column lists the same matrices divided by 256 and given to four decimal places, so that each row sums to unity.

CCT range (K) | Rotation matrix (fixed point) | Rotation matrix
2000 → 3000 | [320 -36 -28; -68 308 16; 14 -248 490] | [1.2500 -0.1406 -0.1094; -0.2656 1.2031 0.0625; 0.0547 -0.9688 1.9141]
3100 → 3400 | [332 -52 -24; -58 320 -6; 12 -192 436] | [1.2969 -0.2031 -0.0938; -0.2266 1.2500 -0.0234; 0.0469 -0.7500 1.7031]
3500 → 3700 | [340 -60 -24; -56 324 -12; 12 -172 416] | [1.3281 -0.2344 -0.0938; -0.2188 1.2656 -0.0469; 0.0469 -0.6719 1.6250]
3800 → 4000 | [346 -68 -22; -52 332 -24; 10 -160 406] | [1.3516 -0.2656 -0.0859; -0.2031 1.2969 -0.0938; 0.0391 -0.6250 1.5859]
4200 → 4400 | [346 -68 -22; -48 332 -28; 12 -160 404] | [1.3516 -0.2656 -0.0859; -0.1875 1.2969 -0.1094; 0.0469 -0.6250 1.5781]
4600 → 5000 | [354 -76 -22; -44 336 -36; 10 -148 394] | [1.3828 -0.2969 -0.0859; -0.1719 1.3125 -0.1406; 0.0391 -0.5781 1.5391]
5200 → 5600 | [366 -88 -22; -42 340 -42; 10 -136 382] | [1.4297 -0.3438 -0.0859; -0.1641 1.3281 -0.1641; 0.0391 -0.5313 1.4922]
5800 → 6600 | [374 -96 -22; -42 348 -50; 8 -124 372] | [1.4609 -0.3750 -0.0859; -0.1641 1.3594 -0.1953; 0.0313 -0.4844 1.4531]
6800 → 14000 | [388 -108 -24; -38 360 -66; 8 -112 360] | [1.5156 -0.4219 -0.0938; -0.1484 1.4063 -0.2578; 0.0313 -0.4375 1.4063]

Fig. 9

Raw channel multipliers used by the Olympus E-M1 camera as a function of CCT. The camera uses the same multipliers for both of the green channels.


Table 2

Raw-to-sRGB color rotation matrices and associated raw channel multipliers corresponding to in-camera scene modes for the Olympus E-M1 camera with 12-100/4 lens and v4.1 firmware. All values are 8-bit fixed-point numbers that can be divided by 256. Since the scene mode presets include a color tint away from the Planckian locus, the multipliers and matrices do not necessarily have the same values as the custom CCT presets with the same CCT listed in Table 1.

Scene mode | CCT (K) | Multipliers | Rotation matrix (fixed point)
Fine weather | 5300 | 474 256 414 | [366 -88 -22; -42 340 -42; 10 -136 382]
Fine weather with shade | 7500 | 552 256 326 | [388 -108 -24; -38 360 -66; 8 -112 360]
Cloudy | 6000 | 510 256 380 | [374 -96 -22; -42 348 -50; 8 -124 372]
Tungsten (incandescent) | 3000 | 276 256 728 | [320 -36 -28; -68 308 16; 14 -248 490]
Cool white fluorescent | 4000 | 470 256 580 | [430 -168 -6; -50 300 6; 12 -132 376]
Underwater | — | 450 256 444 | [366 -88 -22; -42 340 -42; 10 -136 382]
Flash | 5500 | 562 256 366 | [366 -88 -22; -42 340 -42; 10 -136 382]

For any given camera model, all preset color rotation matrices are dependent on factors such as the output-referred color space selected by the user in the camera settings (such as sRGB or Adobe® RGB), the lens model used to take the photograph, and the firmware version. Due to sensor calibration differences between different examples of the same camera model, there can also be a dependence on the individual camera used to take the photograph.

For example, Fig. 10(a) shows a photo of a color chart in the camera raw space taken under D65 illumination. As in Fig. 6(a), the green color tint arises because the camera raw space RGB values are interpreted as sRGB values for display purposes without any color characterization matrix applied. Figure 10(b) shows the same photo after applying the diagonal WB matrix D_ to chromatically adapt the AW to the camera raw space reference white. The raw channel multipliers remove the green tint, but the photo remains in the camera raw space. Remarkably, the colors appear realistic, although desaturated. To illustrate that the camera raw space reference white is actually a magenta color when expressed using CIE colorimetry, Fig. 10(c) converts (b) to the sRGB color space without any further chromatic adaptation by applying a conventional characterization matrix T_ followed by M_sRGB^−1. In contrast, Fig. 10(d) was obtained by applying the appropriate raw channel multipliers followed by the sRGB color rotation matrix R_ in place of T_ and M_sRGB^−1. The color rotation matrix includes a CAT that adapts the camera raw space reference white to the sRGB color space D65 reference white. In this particular case, D_ = D_D65, so the color rotation matrix R_ defined by Eq. (40) becomes

Eq. (46)

R_ = R_D65 = M_sRGB^{-1} T_D65 D_D65^{-1}.
Substituting into Eq. (37) yields

Eq. (47)

[R_L; G_L; B_L]_D65 = M_sRGB^{-1} T_D65 D_D65^{-1} D_D65 [R; G; B]_scene.
Consequently, the rotation matrix reverses the effect of the WB matrix since the scene and display illuminants are the same.

Fig. 10

(a) Photo of a color chart in the camera raw space taken under D65 illumination. (b) After application of the appropriate raw channel multipliers. These remove the green tint, but the photo remains in the camera raw space. (c) After application of the appropriate raw channel multipliers and converting to sRGB without any further chromatic adaptation. The white patch reveals the true color of the camera raw space reference white. (d) After application of the appropriate raw channel multipliers and an sRGB color rotation matrix R_.


6.

DCRaw Open-Source Raw Converter

The widely used DCRaw open-source raw converter (pronounced “dee-see-raw”) written by D. Coffin can process a wide variety of raw image file formats. It is particularly useful for scientific analysis as it can decode raw files without demosaicing, it can apply linear tone curves, and it can directly output to the camera raw space and the CIE XYZ color space. Some relevant commands are listed in Table 3. However, DCRaw by default outputs directly to the sRGB color space with a D65 illumination WP by utilizing a variation of the traditional digital camera strategy described in the previous section.28

Recall that the color rotation matrix optimized for use with the scene illumination is defined by Eq. (40):

R_ = M_sRGB^{-1} CAT_{AW→D65} T_ D_^{-1}.
Although digital cameras typically use a small set of preset rotation matrices optimized for a selection of preset illuminants, DCRaw instead takes a computationally very simple approach that uses only a single rotation matrix optimized for D65 scene illumination, R_ = R_D65. This is achieved using a characterization matrix T_D65 optimized for D65 illumination, which means that the D_^{-1} matrix contained within R_ is replaced by D_D65^{-1} and the CAT_{AW→D65} matrix is not required:

Eq. (48)

R_D65 = M_sRGB^{-1} T_D65 D_D65^{-1}.
The diagonal WB matrix D_D65 contains raw channel multipliers appropriate for D65 illumination:

Eq. (49)

D_D65 = diag(1/R(WP), 1/G(WP), 1/B(WP))_D65 = diag(1/R(D65), 1/G(D65), 1/B(D65)).
The overall transformation from the camera raw space to the linear form of sRGB is defined by

Eq. (50)

[R_L; G_L; B_L]_D65 ≈ R_D65 D_ [R; G; B]_scene,
which can be more explicitly written as

Eq. (51)

[R_L; G_L; B_L]_D65 ≈ M_sRGB^{-1} T_D65 diag(R(D65)/R(AW), G(D65)/G(AW), B(D65)/B(AW)) [R; G; B]_scene.
Consequently, all chromatic adaptation is performed using raw channel multipliers. Notice that the WB matrix D_ appropriate for the scene illumination estimate is always applied to the raw data in Eq. (50), so WB is always correctly achieved in principle.
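To make the structure of Eq. (51) concrete, the following MATLAB sketch applies it to a demosaiced image held in an H x W x 3 array. This is a minimal sketch rather than DCRaw's actual implementation: the matrices are the Olympus E-M1 values derived in Sec. 6.1, while the AW estimate and the file name are hypothetical placeholders.

    % Minimal sketch of Eq. (51): camera raw -> linear sRGB (DCRaw-style).
    % The E-M1 matrix values below are derived in Sec. 6.1; rawAW is a placeholder.
    T_D65 = inv([0.7133 -0.1841 -0.0562; ...
                -0.4015  1.1068  0.2525; ...
                -0.1281  0.2170  0.5987]);         % camera raw -> CIE XYZ, Eq. (56)
    M_sRGB = [0.4124 0.3576 0.1805; ...
              0.2126 0.7152 0.0722; ...
              0.0193 0.1192 0.9505];               % linear sRGB -> CIE XYZ
    rawAW  = [0.45; 1.00; 0.70];                   % hypothetical AW estimate (raw values)
    rawD65 = [0.4325; 1.0000; 0.7471];             % raw values of the D65 WP, Eq. (57)
    A = M_sRGB \ (T_D65 * diag(rawD65 ./ rawAW));  % overall matrix in Eq. (51)
    img = im2double(imread('raw_demosaiced.tif')); % H x W x 3 image in the camera raw space
    lin = reshape(reshape(img, [], 3) * A.', size(img));  % apply Eq. (51) per pixel
    lin = min(max(lin, 0), 1);                     % clip to the linear sRGB range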

Table 3

A selection of relevant DCRaw commands available in version 9.28. Note that the RGB output colorspace options use color rotation matrices and so should only be used with the correct raw channel multipliers due to the inbuilt CAT.

-v             Print verbose messages
-w             Use camera WB, if possible
-A <x y w h>   Average a gray box for WB
-r <r g b g>   Set custom WB
+M/-M          Use/do not use an embedded color matrix
-H [0-9]       Highlight mode (0 = clip, 1 = unclip, 2 = blend, 3+ = rebuild)
-o [0-6]       Output colorspace (raw, sRGB, Adobe, Wide, ProPhoto, XYZ, ACES)
-d             Document mode (no color, no interpolation)
-D             Document mode without scaling (totally raw)
-W             Do not automatically brighten the image
-b <num>       Adjust brightness (default = 1.0)
-g <p ts>      Set custom gamma curve (default = 2.222 4.5)
-q [0-3]       Set the interpolation quality
-h             Half-size color image (twice as fast as “-q 0”)
-f             Interpolate RGGB as four colors
-6             Write 16-bit instead of 8-bit
-4             Linear 16-bit, same as “-6 -W -g 1 1”
-T             Write TIFF instead of PPM

Although the color transformation matrix T_D65 is optimized for D65 scene illumination, applying the color rotation matrix R_D65 to transform from the camera raw space to sRGB is valid for any scene illumination CCT since color rotation matrices vary very slowly as a function of CCT, as evident from Fig. 8. However, R_D65 is the optimum choice for D65 scene illumination, so a drawback of this simplified approach is that the overall color transformation loses some accuracy when the scene illumination differs significantly from D65.

6.1.

Example: Olympus E-M1

DCRaw uses color rotation matrices obtained via Eq. (48), so a T_D65 characterization matrix is required for a given camera model. For this purpose, DCRaw uses the Adobe “ColorMatrix2” matrices from the Adobe® DNG converter.32

Due to highlight recovery logic requirements, the Adobe matrices map in the opposite direction to the conventional characterization matrices defined in Sec. 2.4, and therefore

Eq. (52)

T_D65 = ((1/c) ColorMatrix2_)^{-1},
where c is a normalization constant. For the Olympus E-M1 digital camera, the DCRaw source code stores the ColorMatrix2 entries in the following manner:
7687, -1984, -606, -4327, 11928, 2721, -1381, 2339, 6452.
Dividing by 10,000 and rearranging in matrix form yields

Eq. (53)

ColorMatrix2_ = [0.7687 -0.1984 -0.0606; -0.4327 1.1928 0.2721; -0.1381 0.2339 0.6452].
Recall from Sec. 2.5 that characterization matrices are typically normalized so that the WP of the characterization illuminant maps to raw values such that the maximum value (typically the green channel) just reaches saturation when a 100% neutral diffuse reflector is photographed under the characterization illuminant. Although the ColorMatrix2 matrices are optimized for CIE illuminant D65, they are by default normalized according to the WP of CIE illuminant D50 rather than D65:

Eq. (54)

[R(WP); G(WP); B(WP)]_D50 = ColorMatrix2_ [X(WP) = 0.9642; Y(WP) = 1.0000; Z(WP) = 0.8249]_D50,
where max{R(WP),G(WP),B(WP)}=1. Accordingly, they need to be rescaled for use with DCRaw:

Eq. (55)

[R(WP); G(WP); B(WP)]_D65 = (1/c) ColorMatrix2_ [X(WP) = 0.9504; Y(WP) = 1.0000; Z(WP) = 1.0888]_D65,
where max{R(WP),G(WP),B(WP)}=1. In the present example, it is found that c=1.0778, so

Eq. (56)

T_D65^{-1} = [0.7133 -0.1841 -0.0562; -0.4015 1.1068 0.2525; -0.1281 0.2170 0.5987].
By considering the unit vector in the sRGB color space, the above matrix can be used to obtain the raw tristimulus values for the D65 illumination WP:

Eq. (57)

[R(WP) = 0.4325; G(WP) = 1.0000; B(WP) = 0.7471]_D65 = T_D65^{-1} M_sRGB [R_L = 1; G_L = 1; B_L = 1]_D65,
where M_sRGB converts from the linear form of sRGB to CIE XYZ. Now Eq. (49) can be used to extract the raw channel multipliers for scene illumination with a D65 WP:

Eq. (58)

D_D65 = diag(2.3117, 1.0000, 1.3385).
Finally, the color rotation matrix can be calculated from Eq. (48):

Eq. (59)

R_D65 = [1.7901 -0.6689 -0.1212; -0.2167 1.7521 -0.5354; 0.0543 -0.5582 1.5039].
Each row sums to unity as required. The form of the matrix is similar to the in-camera Olympus matrices listed in Table 1. For comparison purposes, the appropriate listed matrix is the one valid for scene illuminant CCTs ranging from 5800 to 6600 K. Some numerical differences are expected since D65 illumination has a Duv=0.0032 color tint. Other numerical differences are likely due to differing characterization practices between Olympus and Adobe. Additionally, Adobe uses HSV (hue, saturation, and value) tables to emulate the final color rendering of the in-camera JPEG processing engine.
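The numerical steps of this example are straightforward to reproduce. The MATLAB sketch below follows Eqs. (52) to (59) directly; the only inputs are the ColorMatrix2 entries quoted above and the standard linear sRGB to CIE XYZ matrix.

    % Reproduce Eqs. (52)-(59) for the Olympus E-M1.
    CM2 = [ 0.7687 -0.1984 -0.0606; ...
           -0.4327  1.1928  0.2721; ...
           -0.1381  0.2339  0.6452];               % ColorMatrix2, Eq. (53): XYZ -> raw
    wpD65 = [0.9504; 1.0000; 1.0888];              % D65 WP in CIE XYZ
    c = max(CM2 * wpD65);                          % normalization constant, Eq. (55): c = 1.0778
    Tinv = CM2 / c;                                % T_D65^{-1}, Eq. (56)
    M_sRGB = [0.4124 0.3576 0.1805; ...
              0.2126 0.7152 0.0722; ...
              0.0193 0.1192 0.9505];               % linear sRGB -> CIE XYZ
    rawWP = Tinv * (M_sRGB * [1; 1; 1]);           % Eq. (57): [0.4325; 1.0000; 0.7471]
    D_D65 = diag(1 ./ rawWP);                      % Eq. (58): multipliers 2.3117, 1, 1.3385
    R_D65 = M_sRGB \ (inv(Tinv) * inv(D_D65));     % Eq. (48), giving Eq. (59)
    disp(sum(R_D65, 2));                           % each row sums to unity, as required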

6.2.

DCRaw and MATLAB

As shown in Table 3, DCRaw includes many commands that are useful for scientific research. However, it is important to note that the RGB output color space options use color rotation matrices rather than the concatenation of the raw to CIE XYZ and CIE XYZ to RGB matrices. Since color rotation matrices include an inbuilt CAT, these options will only achieve the expected result in combination with the correct raw channel multipliers. For example, setting each raw channel multiplier to unity will not prevent some partial chromatic adaptation from being performed if the sRGB output is selected since the DCRaw color rotation matrix incorporates the D_D65^{-1} matrix, which is a type of CAT_{RW→D65}.

A robust way to use DCRaw for scientific research is via the “dcraw -v -D -4 -T filename” command, which provides linear 16-bit TIFF output in the raw color space without white balancing, demosaicing, or color conversion. Subsequent processing can be performed after importing the TIFF file into MATLAB® using the conventional “imread” command. Reference 49 provides a processing tutorial. The color chart photos in the present article were produced using this methodology.

For example, after importing the file into MATLAB via the above commands, a viewable output image in the sRGB color space without any white balancing can be obtained by applying the appropriate characterization matrix T_ after the color demosaic, followed by direct application of the standard CIE XYZ to sRGB matrix, M_sRGB^{-1}.
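A minimal end-to-end sketch of this workflow is given below. The file names, the Bayer layout string passed to MATLAB's demosaic function, and the use of the maximum mosaic value as a stand-in for the raw clipping point are assumptions that depend on the camera; the characterization matrix is the E-M1 example from Sec. 6.1.

    % Shell (Table 3): linear 16-bit TIFF, no WB, no demosaic, no color conversion:
    %   dcraw -v -D -4 -T photo.orf
    cfa = imread('photo.tiff');                    % uint16 raw mosaic written by DCRaw
    rgb = double(demosaic(cfa, 'rggb'));           % Bayer layout is camera-specific
    rgb = rgb / double(max(cfa(:)));               % crude stand-in for the raw clipping point
    T = inv([0.7133 -0.1841 -0.0562; ...
            -0.4015  1.1068  0.2525; ...
            -0.1281  0.2170  0.5987]);             % raw -> CIE XYZ (E-M1 values, Eq. (56))
    M_sRGB = [0.4124 0.3576 0.1805; ...
              0.2126 0.7152 0.0722; ...
              0.0193 0.1192 0.9505];
    A = M_sRGB \ T;                                % raw -> linear sRGB, no white balancing
    lin = reshape(reshape(rgb, [], 3) * A.', size(rgb));
    srgb = lin2rgb(min(max(lin, 0), 1));           % sRGB gamma encoding (Image Processing Toolbox)
    imwrite(srgb, 'out.png');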

7.

Adobe DNG

The Adobe® DNG is an open-source raw file format developed by Adobe.32,50 The freeware DNG Converter can be used to convert any raw file into the DNG format.

Although the DNG converter does not aim to produce a viewable output image, it does perform a color conversion from the camera raw space into the profile connection space (PCS) based on the CIE XYZ color space with a D50 illumination WP.40 (This is not the actual reference white of CIE XYZ, which is CIE illuminant E.) Consequently, the color processing model used by the DNG converter must provide appropriate characterization matrices along with a strategy for achieving correct WB in relation to the PCS. When processing DNG files, raw converters can straightforwardly map from the PCS to any chosen output-referred color space and associated reference white.

The DNG specification provides two different color processing models, referred to here as method 1 and method 2. Method 1 adopts a strategy similar to that of smartphones and commercial raw converters, the difference being that the data remain in the PCS. By using raw channel multipliers, method 2 adopts a strategy similar to that of traditional digital cameras. However, the multipliers are applied in conjunction with a so-called forward matrix instead of a rotation matrix since the mapping is to the PCS rather than to an output-referred RGB color space.

7.1.

Method 1: Color Matrices

The transformation from the camera raw space to the PCS is defined as follows:

Eq. (60)

[X; Y; Z]_D50 = CAT_{AW→D50} C_^{-1} [R; G; B]_scene.
Here C_ is an Adobe color matrix optimized for the scene AW. Due to highlight recovery logic requirements, Adobe color matrices map in the direction from the CIE XYZ color space to the camera raw space:

Eq. (61)

[R; G; B]_scene = C_ [X; Y; Z]_scene.
This is the opposite direction to a conventional characterization matrix T_, so

Eq. (62)

C_ ≈ T_^{-1}.
After C_^{-1} maps from the camera raw space to CIE XYZ, the linear Bradford CAT is applied to adapt the AW to the WP of the PCS.
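For reference, a linear Bradford CAT is compact to implement. The sketch below builds a CAT between two WPs from the Bradford cone-response matrix; the function name is arbitrary, and the WPs are assumed to be supplied as CIE XYZ column vectors with Y = 1.

    % Linear Bradford CAT: maps xyzSrc (e.g., the AW) to xyzDst (e.g., the D50 WP).
    function M = bradfordCAT(xyzSrc, xyzDst)
        B = [ 0.8951  0.2664 -0.1614; ...          % Bradford cone-response matrix
             -0.7502  1.7135  0.0367; ...
              0.0389 -0.0685  1.0296];
        gain = (B * xyzDst) ./ (B * xyzSrc);       % von Kries-style scaling in cone space
        M = B \ diag(gain) * B;                    % inv(B) * diag(gain) * B
    end
    % Example: M = bradfordCAT([0.95; 1.00; 1.05], [0.9642; 1.0000; 0.8249]);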

Analogous to the issue described in Sec. 4 for smartphones, the implementation of Eq. (60) is complicated by the fact that C_ should be optimized for the scene AW. The optimized C_ matrix is determined by interpolating between two color matrices labeled ColorMatrix1 and ColorMatrix2, where ColorMatrix1 should be obtained from a characterization performed using a low-CCT illuminant such as CIE illuminant A and ColorMatrix2 should be obtained from a characterization performed using a high-CCT illuminant such as CIE illuminant D65.32

The optimized matrix C_ is calculated by interpolating between ColorMatrix1 and ColorMatrix2 based on the scene illumination CCT estimate denoted by CCT(AW), together with the CCTs associated with each of the two characterization illuminants denoted by CCT1 and CCT2, respectively, with CCT1<CCT2.

7.2.

Color Matrix Normalization

Recall from Sec. 2.5 that characterization matrices are typically normalized so that the characterization illuminant WP in the CIE XYZ color space just saturates the raw data in the camera raw space and that the green raw channel is typically the first to saturate. However, in the present context, the Adobe ColorMatrix1 and ColorMatrix2 matrices require a common normalization that is convenient for performing the interpolation. Analogous to Sec. 4.1, the AW is not known in terms of the CIE XYZ color space prior to the interpolation. Instead, ColorMatrix1 and ColorMatrix2 are by default normalized so that the WP of the PCS just saturates the raw data:

Eq. (63)

[R(WP); G(WP); B(WP)]_D50 = ColorMatrix1_ [X(WP) = 0.9642; Y(WP) = 1.0000; Z(WP) = 0.8249]_D50,
[R(WP); G(WP); B(WP)]_D50 = ColorMatrix2_ [X(WP) = 0.9642; Y(WP) = 1.0000; Z(WP) = 0.8249]_D50,
where max{R(WP),G(WP),B(WP)}=1. For example, the default ColorMatrix1 and ColorMatrix2 for the Olympus E-M1 camera are, respectively, normalized as follows:

Eq. (64)

[R(WP) = 0.5471; G(WP) = 1.0000; B(WP) = 0.6560]_D50 = [1.1528 -0.5742 0.0118; -0.2453 1.0205 0.2619; -0.0751 0.1890 0.6539] [X(WP) = 0.9642; Y(WP) = 1.0000; Z(WP) = 0.8249]_D50,
[R(WP) = 0.4928; G(WP) = 1.0000; B(WP) = 0.6330]_D50 = [0.7687 -0.1984 -0.0606; -0.4327 1.1928 0.2721; -0.1381 0.2339 0.6452] [X(WP) = 0.9642; Y(WP) = 1.0000; Z(WP) = 0.8249]_D50.

The interpolated C_ initially inherits this normalization. However, after C_ has been determined, the CIE XYZ values for the AW will be known. Consequently, the Adobe DNG SDK source code later re-normalizes Eq. (60) so that the AW in the camera raw space maps to the WP of the PCS when the raw data just saturates:

Eq. (65)

[X(WP) = 0.9642; Y(WP) = 1.0000; Z(WP) = 0.8249]_D50 = CAT_{AW→D50} C_^{-1} [R(AW); G(AW); B(AW)]_scene,
where max{R(AW), G(AW), B(AW)} = 1. This is equivalent to re-normalizing C_ as follows:

Eq. (66)

[R(AW); G(AW); B(AW)]_scene = C_ [X(AW); Y(AW); Z(AW)]_scene,
where Y(AW) = 1 and max{R(AW), G(AW), B(AW)} = 1.
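In code, this default normalization simply rescales a color matrix so that the largest raw channel of the mapped WP equals one. A minimal MATLAB sketch (the function name is arbitrary):

    % Normalize a color matrix (XYZ -> raw) so that the given WP just saturates, Eq. (63).
    function Cn = normalizeColorMatrix(C, wpXYZ)
        % wpXYZ: WP as a CIE XYZ column vector, e.g., [0.9642; 1.0000; 0.8249] for D50
        Cn = C / max(C * wpXYZ);                   % largest mapped raw channel becomes 1
    end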

7.3.

Linear Interpolation Based on Inverse CCT

The method 1 interpolation algorithm is the same as that described in Sec. 4.1, except that ColorMatrix1, ColorMatrix2, and C_ replace T1_, T2_, and T_, respectively. Furthermore, the Adobe DNG specification requires the interpolation method to be linear interpolation based upon inverse CCT.32

Again, the interpolation itself is complicated by the fact that the AW is typically calculated by the camera in terms of raw values R(AW), G(AW), and B(AW), but the corresponding CCT(AW) requires knowledge of the (x,y) chromaticity coordinates. This means converting to CIE XYZ via a matrix transformation C_ that itself depends upon the unknown CCT(AW), which can be solved using the following self-consistent iteration procedure (a code sketch follows the steps below).

  • 1. Make a guess for the AW chromaticity coordinates, (x(AW),y(AW)). For example, the chromaticity coordinates corresponding to one of the characterization illuminants could be used.

  • 2. Find the CCT value CCT(AW) that corresponds to the chromaticity coordinates (x(AW),y(AW)) using one of the methods listed in step 2 of Sec. 4.1.

  • 3. Perform a linear interpolation:

    Eq. (67)

    C_ = α ColorMatrix1_ + (1 - α) ColorMatrix2_,
    where α is a weighting factor that depends upon inverse CCT:

    Eq. (68)

    α = [(CCT(AW))^{-1} - (CCT_2)^{-1}] / [(CCT_1)^{-1} - (CCT_2)^{-1}].
    These weights (denoted by g and 1 - g in the Adobe DNG SDK source code) are illustrated in Fig. 11 for a pair of example CCT_1 and CCT_2 values. The interpolation is valid for
    CCT_1 ≤ CCT(AW) ≤ CCT_2.
    If CCT(AW) < CCT_1, then C_ should be set equal to ColorMatrix1, and if CCT(AW) > CCT_2, then C_ should be set equal to ColorMatrix2.

  • 4. Use C_ to transform the AW from the camera raw space to CIE XYZ:

    Eq. (69)

    [X(AW); Y(AW); Z(AW)]_scene = C_^{-1} [R(AW); G(AW); B(AW)]_scene.
    This yields a new guess for (x(AW),y(AW)).

  • 5. Repeat the procedure starting from step 2 until (x(AW),y(AW)), CCT(AW), and C_ all converge to a stable solution.

  • 6. Normalize the color conversion according to Eq. (65).
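The following MATLAB sketch implements steps 1 to 5 under two simplifying assumptions: McCamy's explicit formula (Ref. 44) stands in for the CCT computation of step 2, and convergence is judged by the change in CCT(AW) between iterations.

    % Self-consistent interpolation of the Adobe color matrix C_ (steps 1 to 5).
    % CM1, CM2: ColorMatrix1/2 (XYZ -> raw); cct1 < cct2 are their illuminant CCTs;
    % rawAW: the AW estimate in raw values, as a column vector.
    function C = interpColorMatrix(CM1, CM2, cct1, cct2, rawAW)
        cct = cct2;                                % step 1: initial guess (high-CCT illuminant)
        for iter = 1:50
            a = (1/cct - 1/cct2) / (1/cct1 - 1/cct2);   % Eq. (68)
            a = min(max(a, 0), 1);                 % clamp outside [CCT_1, CCT_2]
            C = a * CM1 + (1 - a) * CM2;           % Eq. (67)
            XYZ = C \ rawAW;                       % step 4, Eq. (69): AW in CIE XYZ
            x = XYZ(1) / sum(XYZ);
            y = XYZ(2) / sum(XYZ);
            n = (x - 0.3320) / (y - 0.1858);       % step 2: McCamy's approximation (Ref. 44)
            cctNew = -449*n^3 + 3525*n^2 - 6823.3*n + 5520.33;
            if abs(cctNew - cct) < 0.1             % step 5: converged
                break;
            end
            cct = cctNew;
        end
    end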

Figure 12 illustrates the results of inverse-CCT-based linear interpolation using the Adobe color matrices defined by Eq. (64) for the Olympus E-M1 camera. Note that ColorMatrix2 is the same as the matrix defined by Eq. (53), which was extracted from the DCRaw source code.

Fig. 11

Linear interpolation weighting factors α and 1 - α based on inverse CCT, with CCT_1 = 2855 K and CCT_2 = 6504 K.


Fig. 12

Optimized color matrix C_ plotted as a function of CCT and obtained via inverse-CCT-based linear interpolation of the Adobe ColorMatrix1 (illuminant A, CCT_1 = 2855 K) and ColorMatrix2 (illuminant D65, CCT_2 = 6504 K) color conversion matrices for the Olympus E-M1 camera.


Since C_ maps in the direction from the CIE XYZ color space to the camera raw space, the inverse of the interpolated C_ can be compared to a conventional characterization matrix T_ at a given illuminant CCT. Figure 13 shows the inverse of the interpolated C_ plotted as a function of CCT, and this figure can be compared with Fig. 5, which shows conventional characterization matrices for the same camera optimized for a selection of CCTs. Although the two plots use different normalizations since the characterization matrices are normalized according to their characterization illuminant WP rather than the WP of the PCS, the variation with respect to CCT is similar. However, it is evident that the interpolated C_ loses accuracy for CCTs below CCT1.

Fig. 13

Inverse of the interpolated color matrix C_ plotted in Fig. 12.


7.4.

Method 2: Forward Matrices

Consider the transformation from the camera raw space to the PCS defined by Eq. (60):

[X; Y; Z]_D50 = CAT_{AW→D50} C_^{-1} [R; G; B]_scene,
where C_ is the Adobe color matrix optimized for the scene AW. Method 2 reformulates the above transformation in the following manner:

Eq. (70)

[X; Y; Z]_D50 = F_ D_ [R; G; B]_scene.
The color conversion can be decomposed into two steps.

  • 1. Analogous to the color conversion strategy of traditional digital cameras described in Sec. 5, the diagonal matrix D_ defined by Eq. (38) contains raw channel multipliers appropriate for the AW, i.e., the estimated scene illumination WP:

    D_ = diag(1/R(AW), 1/G(AW), 1/B(AW))_scene.
    In particular, the raw channel multipliers serve to chromatically adapt the AW to the reference white of the camera raw space:

    Eq. (71)

    [1; 1; 1] = D_ [R(AW); G(AW); B(AW)]_scene.
    Note that the Adobe DNG specification also accommodates raw channel multipliers applied in the analog domain.32 However, the latest digital cameras utilize ADCs with a relatively high bit depth of order 12 or 14 and consequently elect to apply raw channel multipliers in the digital domain.

  • 2. The forward matrix F_ is a type of characterization matrix that maps from the camera raw space to the PCS and is optimized for the scene illumination. Since the PCS is based on the CIE XYZ color space with a D50 illumination WP, the forward matrix F_ includes a built-in CAT as it must also adapt the reference white of the camera raw space to the WP of D50 illumination:

    Eq. (72)

    [X(WP) = 0.9642; Y(WP) = 1.0000; Z(WP) = 0.8249]_D50 = F_ [1; 1; 1].

Since the forward matrix F_ should be optimized for the scene AW, it is in practice determined by interpolating between two forward matrices analogous to the interpolation approach used by method 1. The Adobe DNG specification provides tags for two forward matrices labeled ForwardMatrix1 and ForwardMatrix2, which should again be obtained from characterizations performed using a low-CCT illuminant and a high-CCT illuminant, respectively. The same interpolation method described in the previous section should be used, with ForwardMatrix1, ForwardMatrix2, and F_ replacing ColorMatrix1, ColorMatrix2, and C_, respectively:

Eq. (73)

F_ = α ForwardMatrix1_ + (1 - α) ForwardMatrix2_.
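A short sketch of the method 2 pipeline is given below, using the E-M1 forward matrices listed in Eq. (76) of Sec. 7.5; the AW raw values, the AW CCT estimate, and the example pixel are hypothetical placeholders.

    % Method 2: camera raw -> PCS via raw channel multipliers and a forward matrix.
    FM1 = [0.4734 0.3618 0.1291; 0.2765 0.6827 0.0407; 0.2116 0.0006 0.6129];  % ForwardMatrix1 (A)
    FM2 = [0.4633 0.3244 0.1766; 0.2779 0.6661 0.0560; 0.1722 0.0033 0.6497];  % ForwardMatrix2 (D65)
    cctAW = 5000;                                  % hypothetical AW CCT estimate
    rawAW = [0.45; 1.00; 0.70];                    % hypothetical AW in raw values
    a = (1/cctAW - 1/6504) / (1/2855 - 1/6504);    % Eq. (68)
    F = a * FM1 + (1 - a) * FM2;                   % Eq. (73)
    D = diag(1 ./ rawAW);                          % raw channel multipliers, Eq. (38)
    rawPixel = [0.30; 0.60; 0.40];                 % one demosaiced raw pixel (column vector)
    XYZ_D50 = F * (D * rawPixel);                  % Eq. (70)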
Figure 14 shows the optimized forward matrix interpolated from ForwardMatrix1 and ForwardMatrix2 and expressed as a function of CCT for the Olympus E-M1 camera.

Fig. 14

Optimized forward matrix F_ plotted as a function of CCT and obtained via inverse-CCT-based linear interpolation of the Adobe ForwardMatrix1 (illuminant A, CCT_1 = 2855 K) and ForwardMatrix2 (illuminant D65, CCT_2 = 6504 K) matrices for the Olympus E-M1 camera. Evidently, the elements of the optimized forward matrix F_ vary very slowly and in a stable manner as a function of CCT, analogous to the color rotation matrix elements illustrated in Fig. 8.


7.5.

Forward Matrix Specification

By comparing Eqs. (60) and (70), F_ is algebraically related to the color matrix C_ as follows:

Eq. (74)

F_ = CAT_{AW→D50} C_^{-1} D_^{-1}.
Since F_ is interpolated from ForwardMatrix1 and ForwardMatrix2 in practice, these are defined as

Eq. (75)

ForwardMatrix1_ = CAT_{AW→D50} ColorMatrix1_^{-1} D_^{-1},
ForwardMatrix2_ = CAT_{AW→D50} ColorMatrix2_^{-1} D_^{-1}.
According to Eq. (72), the optimized forward matrix F_ is by definition normalized such that the unit vector in the camera raw space maps to the D50 WP of the PCS.32 This means that ForwardMatrix1 and ForwardMatrix2 must also be normalized in this manner. For example, the default ForwardMatrix1 and ForwardMatrix2 for the Olympus E-M1 camera are, respectively, normalized as follows:

Eq. (76)

[X(WP) = 0.9643; Y(WP) = 0.9999; Z(WP) = 0.8251]_D50 = [0.4734 0.3618 0.1291; 0.2765 0.6827 0.0407; 0.2116 0.0006 0.6129] [1; 1; 1],
[X(WP) = 0.9643; Y(WP) = 1.0000; Z(WP) = 0.8252]_D50 = [0.4633 0.3244 0.1766; 0.2779 0.6661 0.0560; 0.1722 0.0033 0.6497] [1; 1; 1].
The official D50 WP of the PCS is actually X=0.9642, Y=1.0000, and Z=0.8249,40 which is a 16-bit fractional approximation of the true D50 WP defined by X=0.9642, Y=1.0000, and Z=0.8251.
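This normalization is easy to check numerically: summing the rows of each forward matrix (i.e., multiplying by the raw unit vector) should recover the D50 WP of the PCS to within rounding. For the matrices in Eq. (76):

    % Verify Eq. (76): each forward matrix maps the raw unit vector to the D50 WP.
    FM1 = [0.4734 0.3618 0.1291; 0.2765 0.6827 0.0407; 0.2116 0.0006 0.6129];
    FM2 = [0.4633 0.3244 0.1766; 0.2779 0.6661 0.0560; 0.1722 0.0033 0.6497];
    disp((FM1 * [1; 1; 1]).');                     % 0.9643  0.9999  0.8251
    disp((FM2 * [1; 1; 1]).');                     % 0.9643  1.0000  0.8252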

8.

Conclusions

The opening section of this paper showed how the DCRaw open-source raw converter can be used to directly characterize a camera without needing to determine and invert the OECF and illustrated how characterization matrices are normalized in practice. As a consequence of camera metameric error, the camera raw space for a typical camera was shown to be warped away from the triangular shape accessible to additive linear combinations of three fixed primaries on the xy chromaticity diagram, and the available gamut was shown to be dependent on the characterization illuminant. It was also shown that the reference white of a typical camera raw space has a strong magenta color tint.

Subsequently, this paper investigated and compared the type of color conversion strategies used by smartphone cameras and commercial raw converters, the image-processing engines of traditional digital cameras, DCRaw, and the Adobe DNG converter.

Smartphones and raw conversion software applications typically adopt the type of color conversion strategy familiar in color science. This involves the application of a characterization matrix T_ to transform from the camera raw space to the CIE XYZ color space, a CAT to chromatically adapt the estimated WP of the scene illumination to the reference white of an output-referred color space (such as D65 for sRGB), and finally a transformation from CIE XYZ to the linear form of the chosen output-referred color space. Since the optimized characterization matrix is CCT-dependent unless the Luther-Ives condition is satisfied, an optimized matrix can be determined by interpolating between two preset characterization matrices, one optimized for a low-CCT illuminant and the other optimized for a high-CCT illuminant. Simpler solutions include using a fixed characterization matrix optimized for representative scene illumination.

For traditional digital cameras, this paper showed how the overall color conversion is typically reformulated in terms of raw channel multipliers D_ along with a set of color rotation matrices R_. The raw channel multipliers act as a type of CAT by chromatically adapting the scene illumination WP estimate to the reference white of the camera raw space. Since the rows of a color rotation matrix each sum to unity, the rotation matrix subsequently transforms from the camera raw space directly to the chosen output-referred RGB color space and at the same time chromatically adapts the camera raw space reference white to that of the output-referred color space. It was shown that the variation of the elements of a color rotation matrix with respect to CCT is very small, so only a small selection of preset rotation matrices are needed, each optimized for a specified preset illuminant. This enables raw channel multipliers appropriate for the scene illumination WP estimate to be applied in combination with the preset rotation matrix associated with the closest-matching WP. The primary advantages of the reformulation are that interpolation is not required and the method can be efficiently implemented on fixed-point architecture. Furthermore, image quality can be improved by applying the raw channel multipliers prior to the color demosaic.

It was shown that DCRaw uses a model similar to that of traditional digital cameras, except that only a single color rotation matrix is used for each camera, specifically a matrix optimized for D65 illumination, R_D65. Although the overall color conversion loses some accuracy when the scene illumination differs significantly from D65, an advantage of decoupling the raw channel multipliers from the characterization information represented by the color rotation matrix is that WB can be correctly achieved for any type of scene illumination, provided that raw channel multipliers appropriate for the scene illumination are applied. It was shown that the rotation matrices used by DCRaw can be derived from the inverses of the “ColorMatrix2” color characterization matrices used by the Adobe DNG converter.

The Adobe DNG converter maps the camera raw space and scene illumination WP estimate to an intermediate stage in the overall color conversion, namely the PCS based on the CIE XYZ color space with a D50 WP. Method 1 defines the approach that is also used in commercial raw converters and advanced smartphones. A color matrix C_ optimized for the scene illumination is obtained by interpolating between the “ColorMatrix1” low-CCT and “ColorMatrix2” high-CCT preset matrices. Due to highlight recovery logic requirements, these color matrices map in the opposite direction to conventional characterization matrices. Furthermore, the ColorMatrix1 and ColorMatrix2 matrices are initially normalized according to the WP of the PCS rather than their corresponding characterization illuminants. Since the Adobe color matrices are freely available, their appropriately normalized inverses can serve as useful high-quality characterization matrices when camera characterization equipment is unavailable.

Method 2 offered by the Adobe DNG converter uses raw channel multipliers in a similar manner to traditional digital cameras. However, these are applied in combination with a so-called forward matrix rather than a rotation matrix since the Adobe DNG converter does not map directly to an output-referred RGB color space, so the forward matrix rows do not each sum to unity. Although the optimized forward matrix is determined by interpolating the “ForwardMatrix1” and “ForwardMatrix2” preset matrices, the variation of the optimized forward matrix with respect to CCT is very small, analogous to a rotation matrix.

9.

Appendix: Raw Data Model

Consider the raw values expressed as an integration over the spectral passband of the camera according to Eq. (5):

R = k ∫_{λ1}^{λ2} R_1(λ) Ẽ_{e,λ} dλ,
G = k ∫_{λ1}^{λ2} R_2(λ) Ẽ_{e,λ} dλ,
B = k ∫_{λ1}^{λ2} R_3(λ) Ẽ_{e,λ} dλ.
Although Ẽ_{e,λ} can be regarded as the average spectral irradiance at a photosite, it is more precisely described as the spectral irradiance convolved with the camera system point-spread function (PSF) h(x, y, λ) and sampled at positional coordinates (x, y) on the sensor plane:

Eq. (77)

Ẽ_{e,λ}(x, y) = [E_{e,λ,ideal}(x, y) * h(x, y, λ)] comb(x/p_x, y/p_y),
where p_x and p_y are the pixel pitches in the horizontal and vertical directions. A noise model can also be included.28,51 The quantity denoted by E_{e,λ,ideal}(x, y) is the ideal spectral irradiance at the sensor plane that would theoretically be obtained in the absence of the system PSF:

Eq. (78)

E_{e,λ,ideal}(x, y) = (π/4) L_{e,λ}(x/m, y/m) (1/N_w^2) T cos^4{φ(x/m, y/m)},
where L_{e,λ} is the corresponding scene spectral radiance, m is the system magnification, N_w is the working f-number of the lens, T is the lens transmittance factor, and φ is the object-space angle between the optical axis and the indicated scene coordinates. If the vignetting profile of the lens is known, the cosine fourth term can be replaced by the relative illumination factor, which is an image-space function describing the real vignetting profile.52

The constant k that appears in Eq. (5) places an upper bound on the magnitude of the raw values. It can be shown28 that k is given by

Eq. (79)

k = A_p t / (g_i e),
where A_p is the photosite area, t is the exposure duration, e is the elementary charge, and g_i is the conversion factor between electron counts and raw values for mosaic i, expressed using e-/DN units.53,54 The conversion factor is inversely proportional to the ISO gain G_ISO, which is the analog gain setting of the programmable gain amplifier situated upstream from the ADC:

Eq. (80)

g_i = U / G_{ISO,i},  U = n_{e,i,FWC} / n_{DN,i,clip}.
Here U is the unity gain, which is the gain setting at which g_i = 1. Full-well capacity is denoted by n_{e,i,FWC}, and n_{DN,i,clip} is the raw clipping point, which is the maximum available raw level. This value is not necessarily as high as the maximum raw level provided by the ADC given its bit depth M, which is 2^M - 1 DN, particularly if the camera includes a bias offset that is subtracted before the raw data is written.28,53

The least analog amplification is defined by G_ISO = 1, which corresponds to the base ISO gain.28,51 The numerical values of the corresponding camera ISO settings S are defined using the JPEG output rather than the raw data.55,56 These user values also take into account digital gain applied via the JPEG tone curve. When comparing raw output from cameras based on different sensor formats, equivalent rather than the same exposure settings should be used when possible.57

As noted in Sec. 2.2, the actual raw values obtained in practice are quantized values modeled by taking the integer part of Eq. (5), and it is useful to subsequently normalize them to the range [0,1] by dividing Eq. (5) by the raw clipping point.
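To make the model concrete, the MATLAB sketch below evaluates Eq. (5) numerically for hypothetical Gaussian response functions and a flat spectral irradiance; real cameras require measured responsivities, and the value of k is an arbitrary stand-in for A_p t/(g_i e).

    % Numerical evaluation of the raw data model, Eq. (5), with placeholder spectra.
    lambda = 400:5:700;                            % wavelength grid (nm)
    Ee = 1e-3 * ones(size(lambda));                % hypothetical flat spectral irradiance
    R1 = exp(-((lambda - 600)/40).^2);             % hypothetical Gaussian response functions
    R2 = exp(-((lambda - 530)/40).^2);
    R3 = exp(-((lambda - 460)/40).^2);
    k = 5e4;                                       % arbitrary stand-in for A_p*t/(g_i*e), Eq. (79)
    raw = k * [trapz(lambda, R1 .* Ee), ...
               trapz(lambda, R2 .* Ee), ...
               trapz(lambda, R3 .* Ee)];           % Eq. (5) by trapezoidal integration
    nClip = 2^12 - 1;                              % raw clipping point for a 12-bit ADC
    rawN = min(floor(raw), nClip) / nClip;         % quantize, then normalize to [0, 1]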

References

1. E. Machishima, “Color CCD imager having a system for preventing reproducibility degradation of chromatic colors,” U.S. Patent 5,253,047 (1993).
2. K. E. Spaulding, R. M. Vogel and J. R. Szczepanski, “Method and apparatus for color-correcting multi-channel signals of a digital camera,” U.S. Patent 5,805,213 (1998).
3. J. von Kries, “Chromatic adaptation,” in Sources of Color Science, pp. 145–148, MIT Press, Cambridge (1970).
4. H. E. Ives, “The relation between the color of the illuminant and the color of the illuminated object,” Trans. Illum. Eng. Soc. 7, 62–72 (1912). https://doi.org/10.1002/COL.5080200112
5. J. Nakamura, “Basics of image sensors,” in Image Sensors and Signal Processing for Digital Still Cameras, pp. 53–93, CRC Press, Taylor & Francis Group, Boca Raton, Florida (2006).
6. J. Jiang et al., “What is the space of spectral sensitivity functions for digital color cameras?,” in IEEE Workshop Appl. Comput. Vision, pp. 168–179 (2013). https://doi.org/10.1109/WACV.2013.6475015
7. R. Luther, “Aus dem Gebiet der Farbreizmetrik (On color stimulus metrics),” Z. Tech. Phys. 12, 540–558 (1927).
8. P.-C. Hung, “Sensitivity metamerism index for digital still camera,” Proc. SPIE 4922, 1–14 (2002). https://doi.org/10.1117/12.483116
9. P.-C. Hung, “Color theory and its application to digital still cameras,” in Image Sensors and Signal Processing for Digital Still Cameras, pp. 205–222, CRC Press, Taylor & Francis Group, Boca Raton, Florida (2006).
10. P. M. Hubel et al., “Matrix calculations for digital photography,” in Proc. IS&T Fifth Color Imaging Conf., pp. 105–111 (1997).
11. J. Holm, I. Tastl and S. Hordley, “Evaluation of DSC (digital still camera) scene analysis error metrics—Part 1,” in Proc. IS&T/SID Eighth Color Imaging Conf., pp. 279–287 (2000).
12. R. M. Vogel, “Digital imaging device optimized for color performance,” U.S. Patent 5,668,596 (1997).
13. G. D. Finlayson and Y. Zhu, “Finding a colour filter to make a camera colorimetric by optimisation,” Lect. Notes Comput. Sci. 11418, 53–62 (2019). https://doi.org/10.1007/978-3-030-13940-7_5
14. D. Alleysson, S. Süsstrunk and J. Hérault, “Linear demosaicing inspired by the human visual system,” IEEE Trans. Image Process. 14(4), 439–449 (2005). https://doi.org/10.1109/TIP.2004.841200
15. International Organization for Standardization, “Graphic technology and photography—colour target and procedures for the colour characterisation of digital still cameras (DSCs),” (2012).
16. International Electrotechnical Commission, “Multimedia systems and equipment—colour measurement and management—Part 2-1: colour management—default RGB colour space—sRGB,” (1999).
17. Adobe Systems Incorporated, “Adobe® RGB (1998) color image encoding,” (2005).
18. International Organization for Standardization, “Photography and graphic technology—extended colour encodings for digital image storage, manipulation and interchange—Part 2: reference output medium metric RGB colour image encoding (ROMM RGB),” (2013).
19. J. Holm, “Capture color analysis gamuts,” in Proc. 14th Color and Imaging Conf., IS&T, pp. 108–113 (2006).
20. B. E. Bayer, “Color imaging array,” U.S. Patent 3,971,065 (1976).
21. D. Coffin, (2015).
22. Y. Ohno, “Practical use and calculation of CCT and Duv,” LEUKOS 10(1), 47–55 (2014). https://doi.org/10.1080/15502724.2014.839020
23. D. L. MacAdam, “Projective transformations of I.C.I. color specifications,” J. Opt. Soc. Am. 27, 294 (1937). https://doi.org/10.1364/JOSA.27.000294
24. Commission Internationale de l’Eclairage, Proc. 14th Session, p. 36 (1959).
25. Commission Internationale de l’Eclairage, “Colorimetry,” Vienna (2004).
26. G. D. Finlayson and M. S. Drew, “White-point preserving color correction,” in Proc. IS&T Fifth Color Imaging Conf., pp. 258–261 (1997).
27. International Organization for Standardization, “Photography—electronic still picture cameras—methods for measuring opto-electronic conversion functions (OECFs),” (2009).
28. D. A. Rowlands, Physics of Digital Photography, IOP Publishing Ltd., Bristol (2017).
29. D. Varghese, R. Wanat and R. K. Mantiuk, “Colorimetric calibration of high dynamic range images with a ColorChecker chart,” in 2nd Int. Conf. and SME Workshop HDR Imaging (2014).
30. J. McCann, “Do humans discount the illuminant?,” Proc. SPIE 5666, 9–16 (2005). https://doi.org/10.1117/12.594383
31. International Organization for Standardization, “Photography—electronic still picture imaging—vocabulary,” (2012).
32. Adobe Systems Incorporated, “Adobe digital negative (DNG) specification,” (2012).
33. G. Buchsbaum, “A spatial processor model for object color perception,” J. Franklin Inst. 310, 1–26 (1980). https://doi.org/10.1016/0016-0032(80)90058-7
34. E. H. Land and J. McCann, “Lightness and Retinex theory,” J. Opt. Soc. Am. 61(1), 1–11 (1971). https://doi.org/10.1364/JOSA.61.000001
35. S. Hordley, “Scene illuminant estimation: past, present and future,” Color Res. Appl. 31, 303 (2006). https://doi.org/10.1002/col.20226
36. E. Y. Lam and G. S. K. Fung, “Automatic white balancing in digital photography,” in Single-Sensor Imaging: Methods and Applications for Digital Cameras, pp. 267–294, CRC Press, Boca Raton, Florida (2009).
37. Commission Internationale de l’Eclairage, “Fundamental chromaticity diagram with physiological axes—Part 1,” Vienna (2006).
38. M. D. Fairchild, Color Appearance Models, 3rd ed., Wiley, New York (2013).
39. K. M. Lam, “Metamerism and colour constancy,” (1985).
40. International Color Consortium, “Image technology colour management—architecture, profile format, and data structure,” (2010).
41. J. Holm et al., “Color processing for digital photography,” in Color Engineering: Achieving Device Independent Color, pp. 179–220, John Wiley & Sons, Ltd., Chichester (2002).
42. A. R. Robertson, “Computation of correlated color temperature and distribution temperature,” J. Opt. Soc. Am. 58, 1528 (1968). https://doi.org/10.1364/JOSA.58.001528
43. Q. Xingzhong, “Formulas for computing correlated color temperature,” Color Res. Appl. 12(5), 285 (1987). https://doi.org/10.1002/col.5080120511
44. C. S. McCamy, “Correlated color temperature as an explicit function of chromaticity coordinates,” Color Res. Appl. 17, 142 (1992). https://doi.org/10.1002/col.5080170211
45. J. Hernández-Andrés, R. L. Lee and J. Romero, “Calculating correlated color temperatures across the entire gamut of daylight and skylight chromaticities,” Appl. Opt. 38, 5703 (1999). https://doi.org/10.1364/AO.38.005703
46. C. Li et al., “Accurate method for computing correlated color temperature,” Opt. Express 24(13), 14066 (2016). https://doi.org/10.1364/OE.24.014066
47. K. A. Parulski, R. H. Hibbard and J. D’Luna, “Programmable digital circuit for performing a matrix multiplication,” U.S. Patent 5,001,663 (1991).
49. R. Sumner, “Processing RAW Images in MATLAB,” (2014).
50. Adobe Systems Incorporated, “Introducing the digital negative specification: information for manufacturers,” (2004).
51. D. A. Rowlands, Field Guide to Photographic Science, SPIE Press, Bellingham (2020).
52. P. Maeda, P. Catrysse and B. Wandell, “Integrating lens design with digital camera simulation,” Proc. SPIE 5678, 48 (2005). https://doi.org/10.1117/12.588153
53. E. Martinec, “Noise, dynamic range and bit depth in digital SLRs,” (2008).
54. T. Mizoguchi, “Evaluation of image sensors,” in Image Sensors and Signal Processing for Digital Still Cameras, pp. 179–203, CRC Press, Taylor & Francis Group, Boca Raton, Florida (2006).
55. Camera & Imaging Products Association, “Sensitivity of digital cameras,” (2004).
56. International Organization for Standardization, “Photography—digital still cameras—determination of exposure index, ISO speed ratings, standard output sensitivity, and recommended exposure index,” (2006).
57. D. A. Rowlands, “Equivalence theory for cross-format photographic image quality comparisons,” Opt. Eng. 57(11), 110801 (2018). https://doi.org/10.1117/1.OE.57.11.110801

Biography

D. Andrew Rowlands received his BSc degree in mathematics and physics and his PhD in physics from the University of Warwick, UK, in 2000 and 2004, respectively. He has held research positions at the University of Bristol, UK, Lawrence Livermore National Laboratory, USA, Tongji University, China, and the University of Cambridge, UK, and he has authored three books on the science of digital photography.

© 2020 Society of Photo-Optical Instrumentation Engineers (SPIE)
D. Andrew Rowlands "Color conversion matrices in digital cameras: a tutorial," Optical Engineering 59(11), 110801 (17 November 2020). https://doi.org/10.1117/1.OE.59.11.110801
Received: 24 August 2020; Accepted: 27 October 2020; Published: 17 November 2020