Color Science History and the ICC Profile Specifications
A lot of people assume that the ICC prohibition against negative tristimulus values is based on some kind of physical reality grounded in human perception of color. Actually, exactly the opposite is true: describing human color perception requires negative amounts of the primaries. Furthermore, an adequate characterization of many device primaries also requires negative tristimulus values.
Written in 2010. Updated in March 2015.
Physical primaries require negative tristimulus values
Even in technical fields of study, if you really want to understand your subject matter, history matters. To understand the ICC profile specifications, it helps to know a little bit about the history of color science and a little bit about 20th-century computer technology.
The tristimulus theory of human color perception says that any color humans can see can be reproduced, or "matched", by some combination of red, blue, and green single-wavelength light beams. The wavelengths used to make the red, blue, and green light beams are called "primaries".
Experiments conducted in the 1920s by David Wright and John Guild found that, in actual fact, many colors couldn't be matched in any straightforward way using a mix of red, blue, and green primaries (single-wavelength light sources):
The observer would alter the brightness of each of the three primary beams until a match to the test color was observed. Not all test colors could be matched using this technique. When this was the case, a variable amount of one of the primaries could be added to the test color, and a match with the remaining two primaries was carried out with the variable color spot. For these cases, the amount of the primary added to the test color was considered to be a negative value [emphasis added]. —Wikipedia, "CIE 1931 color space"
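To put that procedure in equation form (my notation, not Wright's or Guild's): suppose a test color C can only be matched when an amount r of the red primary R is added to the test color itself, while amounts g and b of the green and blue primaries light the matching spot. The match is then

    C + r·R ≡ g·G + b·B

which, rearranged, reads

    C ≡ -r·R + g·G + b·B

In other words, matching C requires a negative amount of the red primary.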
The above quote makes it sound like maybe only a few test colors needed a "negative value" to achieve a match. In fact, almost all of the test colors needed a negative tristimulus value to achieve a match. "Color: the 1931 CIE color-matching functions and chromaticity chart" (an article worth reading over several times) says "Until the blue goes to zero thru negative values, exactly one of the 3 numbers is negative. At long wavelengths, when blue is zero, we just use red and green."
"Paper and pencil" is why the 1931 CIE XYZ color space has only non-negative tristimulus values
The CIE 1931 XYZ color space is a mathematical model of the results of Wright's and Guild's experimental data from the 1920s. X, Y, and Z are derived from, but do not refer directly to, the physical wavelengths of light (the R, G, and B primaries) used in the color matching studies. Instead, Y represents luminance, and X and Z hold chromaticity information.
Unlike the primaries in the color matching studies, the XYZ color space values are all non-negative. Why? According to Douglas A. Kerr's paper, "The CIE XYZ and yxY Color Spaces" (another article worth reading over several times), the CIE wanted to avoid negative numbers for their XYZ color space to make doing mathematical calculations less error-prone:
Analytical work done with the curves involves multiplications of light spectrum values by the curve values at different wavelengths, something that is today done easily by a computer, but which, done with desktop calculators and paper, was very tedious. There was concern that the need to keep track of both positive and negative values could increase the risk of a mis-step.
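In modern notation (a sketch of the standard formulas, not Kerr's own wording), the calculations he describes are weighted sums over wavelength, with S(λ) the spectrum of the light being measured, x̄(λ), ȳ(λ), z̄(λ) the color-matching functions, and k a normalizing constant:

    X = k · Σ S(λ) · x̄(λ) · Δλ
    Y = k · Σ S(λ) · ȳ(λ) · Δλ
    Z = k · Σ S(λ) · z̄(λ) · Δλ

Color-matching functions with negative lobes would scatter sign changes through every one of those hand-computed sums, which is exactly the "risk of a mis-step" the CIE wanted to avoid.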
But cameras don't "see" like humans see
When you think about it, it makes sense that digital camera input matrix profiles might require negative XYZ tristimulus values. The 1931 CIE XYZ color space was made "just positive enough" to avoid the need to use negative values when describing human color perception. Making the space "more positive than absolutely necessary" would have meant dealing with unnecessarily big numbers. Just like negative numbers, big numbers are a source of error when doing calculations with paper and pencil.
But digital camera sensors don't respond to light waves in the same way as the human eye-brain does. If (counterfactually, of course) Wright and Guild had been testing camera sensors rather than humans, their test data would have been different, and a mathematically derived "just positive enough" CIE XYZ digital camera color space would also have been different.
To summarize: The actual 1931 CIE XYZ human color space can be used to create a matrix camera profile that describes any given camera sensor's response to light, as determined by profiling the camera. But an accurate camera profile requires using negative XYZ tristimulus values. If the CIE had moved the XYZ color space farther into the positive numbers, then today's camera input matrices wouldn't require negative tristimulus values.
1994: 20th century technology and ICC V2 profile specifications — no negative tristimulus values
OK, paper and pencil calculations explain why the CIE XYZ color space has non-negative tristimulus values. But why did the ICC likewise insist on non-negative tristimulus values? After all, the ICC specifications are for color profiles applied by computers, and computers don't make math errors (well, they do, but only under circumstances far removed from the topic at hand).
Let's start with a little history. The CIE has been around in one form or another for a long time:
The International Commission on Illumination (usually abbreviated CIE for its French name, Commission internationale de l'éclairage) is the international authority on light, illumination, color, and color spaces. It was established in 1913 as a successor to the Commission Internationale de Photométrie. —Wikipedia, "International Commission on Illumination"
However, the ICC is a relative newcomer:
The International Color Consortium was formed in 1993 by eight industry vendors in order to create an open, vendor-neutral color management system which would function transparently across all operating systems and software packages. . . . The eight founding members of the ICC were Adobe, Agfa, Apple, Kodak, Microsoft, Silicon Graphics, Sun Microsystems, and Taligent. —Wikipedia, "International Color Consortium"
Compared to the personal computers we use today, even mainframes from the 1990s were slow. Processor speeds were under 100 MHz. Level 2 cache was measured in (tens, not hundreds of) kilobytes. RAM was getting cheaper and computers were using more and more of it, but it was still very expensive.
Before the ICC released its V2 color profile specifications in 1994, color management was handled "in-house" by the big establishments that made prints from digital files (those were prints on paper, remember paper?), such as magazines, newspapers, and fine art print-makers. And every company that made prints on paper from digital files had its own procedures for handling color management.
In the 1990s, those digital image files were produced by scanning film. Digital cameras were not yet in widespread use:
In 1991, Kodak brought to market the Kodak DCS-100, the beginning of a long line of professional Kodak DCS SLR cameras that were based in part on film bodies . . . It used a 1.3 megapixel sensor and was priced at $13,000. . .
In 1997 the first megapixel cameras for consumers were marketed. . . .
1999 saw the introduction of the Nikon D1, a 2.74 megapixel camera that was the first digital SLR developed entirely by a major manufacturer, and at a cost of under $6,000 at introduction was affordable by professional photographers and high-end consumers. —Wikipedia, "History of the camera"
It is easy to see why in the early and mid 1990s the manufacturers of digital imaging hardware (monitors, scanners, and the not-yet-ready-for-prime-time digital cameras) and especially software (the first color-managed version of Adobe Photoshop came out in 1998) wanted to standardize color management: selling new hardware and software is a lot easier if the company doing the buying doesn't have to completely revamp their in-house color management procedures.
The original V2 ICC specifications were released in 1994. My best guess (there are references in the Specification itself to making computing easier) is that the ICC excluded negative XYZ values from V2 color profiles because negative numbers would have placed an additional computing burden on the hardware and software in use in the early 1990s. Somewhere in one of Dan Margulis's excellent books he talks about it taking hours to apply a single curve to a small (by today's standards) digital file.
Despite any computing advantage of disallowing negative tristimulus values, it did not escape everyone's attention that negative tristimulus values are sometimes necessary to describe devices such as digital cameras. In an essay written in 1997 called "What's wrong with the ICC profile format anyway?", Graeme Gill wrote:
This section on the XYZ type says "Tristimulus values must be non-negative. The signed encoding allows for implementation optimizations by minimizing the number of fixed formats." In fact, since the XYZ array type is used to store the matrix profile "primaries", and the collection of the RGB "primaries" forms a transform matrix from linearised device channels into XYZ, there is no guarantee that the matrix values will be positive. . . . Some real world profiles have matrix "primaries" with -ve values.
In fact, as I found when I examined the camera matrix input (inverse) profiles in the dcraw adobe_coeff table, most, probably all, digital cameras have negative tristimulus values. And as you can see from the V4 profile specifications presented below, the ICC is well aware of this fact.
2010: 21st century technology and ICC V4 profile specifications — finally, negative tristimulus values are allowed
Sixteen years and 400 digital camera models later, has the ICC changed its stance on negative primaries?
In a word, no. And yes.
First the "no". Quoting from "International Color Consortium® Specification, ICC.1:2004-10 (Profile version 4.2.0.0)", section "10.27 XYZType" says
The XYZType contains an array of three encoded values for the XYZ tristimulus values. The number of sets of values is determined from the size of the tag. When used the byte assignment and encoding shall be as given in Table 62. Tristimulus values shall be non-negative. The signed encoding allows for implementation optimizations by minimizing the number of fixed formats.
and section "5.1.11 XYZNumber" says
A set of three fixed signed 4-byte/32-bit quantities used to encode CIEXYZ tristimulus values (which cannot be negative) where byte usage is assigned as specified in table 7. . . . [Note 2 says Signed numbers are employed for this type to accommodate negative values arisiing [sic] during calculations.]
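For the curious, here is a minimal Python sketch (mine, not taken from the Specification) of the s15Fixed16Number encoding that the XYZNumber uses: three signed 32-bit integers with 16 fractional bits, stored big-endian. The encoding itself has no trouble holding negative values; the prohibition is purely a rule stated in the spec text.

```python
import struct

def encode_xyz_number(x, y, z):
    """Pack an XYZ triple as three big-endian s15Fixed16Number values."""
    def s15f16(value):
        return int(round(value * 65536))   # 1.0 encodes as 0x00010000
    return struct.pack(">3i", s15f16(x), s15f16(y), s15f16(z))

def decode_xyz_number(data):
    """Unpack twelve bytes back into a tuple of three floats."""
    return tuple(v / 65536 for v in struct.unpack(">3i", data))

# A negative tristimulus value survives the encoding just fine:
print(decode_xyz_number(encode_xyz_number(0.9642, 1.0, -0.1773)))
# prints approximately (0.9642, 1.0, -0.1773)
```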
Now for the "yes." In section 1.1.3, "PCS encoding range is limited to [0,2) [sic]" of ICC Votable Proposal Submission, Floating-Point Device Encoding Range, dated June 16, 2006, the ICC admits that the requirement that XYZ values be non-negative can be a problem for "some" devices:
It is also possible that some device values may have corresponding XYZ values that are negative. Such values can result from digital camera color analysis matrices, or chromatic adaptation transforms applied to extremely saturated blue colors. In most cases, it is acceptable to clip negative XYZ values to zero as such values do not correspond to real colors. However in some cases this may be unacceptable, for example if perfect round-tripping is desired.
One comment: The type of device to which this proposed revision of the ICC Specification is referring is digital motion picture cameras, not digital still cameras (you might find "Using ICC Profiles for Motion Picture Production" of interest). The proposed solution is to add support to ICC profiles for 32-bit floating point XYZ values. These 32-bit floating point XYZ values (but not 16-bit integer values), "are allowed to go negative when such values are supported by the device" ("ICC_Chiba", page 8, under Clause 9.1, entitled "Add new paragraph at the end").
As I said previously, a close reading of the 1994 ICC Specification suggests that the real reason negative XYZ values were excluded in the first place was that the hardware and software in use in the early 1990s would have choked trying to use negative numbers. Fast-forwarding to today, it seems that still and motion digital cameras alike will eventually be (or maybe even already are) able to use ICC-compliant 32-bit floating point profiles and computations, with allowable negative XYZ values.
Fortunately, most imaging software is perfectly happy to work with color profiles that have negative tristimulus values. It's only when you convert to CIELAB or to a standard ICC profile RGB working space like ProPhotoRGB that you risk losing colors and details that were present in your raw file.
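As a toy illustration of that risk (my own made-up numbers, not any real camera's profile): a camera input matrix with negative entries can map a very saturated raw RGB value to an XYZ value with a negative component. Clip that component to zero, as a conversion into a non-negative encoding effectively does, and the original raw value can no longer be recovered:

```python
import numpy as np

# Hypothetical camera-RGB -> XYZ matrix with negative entries (the same sign
# pattern -- negative blue Y and green Z -- as the matrices discussed below).
cam_to_xyz = np.array([[ 1.27,  0.17,  0.15],
                       [ 0.55,  0.83, -0.18],
                       [ 0.06, -0.33,  1.79]])
xyz_to_cam = np.linalg.inv(cam_to_xyz)

raw_rgb = np.array([0.02, 0.05, 0.98])   # a very saturated blue pixel
xyz     = cam_to_xyz @ raw_rgb           # the Y component comes out negative
clipped = np.clip(xyz, 0.0, None)        # what a bounded conversion forces

print("round trip, unclipped:", xyz_to_cam @ xyz)      # recovers raw_rgb
print("round trip, clipped:  ", xyz_to_cam @ clipped)  # does not
```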
ICC V4 color space specifications are being/have been altered to "allow" the negative tristimulus values that image editing software has always happily accommodated. So far, I haven't used V4 profiles in my digital darkroom. LCMS2 can handle V4 profiles, but I don't know which, if any, open source imaging programs make full use of V4 profiles.
Update, March 2015: almost all free/libre software now uses LCMS2 and handles V2 and V4 ICC profiles. The excellent ArgyllCMS as yet only works with V2 profiles, for good and sufficient reasons that are outside the scope of this article.
Does YOUR camera input profile have negative tristimulus values?
I was thinking about my own camera input profile and about the ICC prohibition of negative tristimulus values, and I got curious: how many digital camera matrix profiles have negative tristimulus values? It occurred to me that dcraw — which covers essentially every make and model of digital camera ever made — might hold an answer.
Most of the cameras in the dcraw database use "camera coefficients" to convert an image's raw color to one of four different user-specified working spaces (assuming you don't use the "-o 0" switch to ask for raw color output).
These dcraw camera coefficients are actually inverted camera matrix profiles of exactly the same type that you can create using the ArgyllCMS "scanin" and "colprof" utilities (together with an IT8 or other target shot) to make a matrix profile for your own camera.
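The same inversion can be sketched in a few lines of Python (the nine numbers below are placeholders, not any real camera's adobe_coeff entry; in dcraw.c the entries are integers scaled by 10,000 and run in the XYZ-to-camera direction, so the camera input matrix is their inverse):

```python
import numpy as np

# Placeholder adobe_coeff-style entry: nine integers, scaled by 10,000,
# describing the XYZ -> camera-RGB transform for a hypothetical camera.
adobe_coeff = [ 7500, -1800,  -500,
               -4500, 12000,  2800,
                -800,  1600,  6500]

xyz_to_cam = np.array(adobe_coeff, dtype=float).reshape(3, 3) / 10000.0
cam_to_xyz = np.linalg.inv(xyz_to_cam)   # the camera input matrix "primaries"

print(cam_to_xyz)
print("negative tristimulus values:", int((cam_to_xyz < 0).sum()))
```

For this placeholder entry the inverse comes out with two negative values, in the blue Y and green Z positions.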
I used a spreadsheet to invert the dcraw camera coefficients and calculate what percentage of the resulting camera input matrix primaries had negative tristimulus values. To summarize the spreadsheet findings in table form:
|   | red | green | blue |
|---|-----|-------|------|
| X | 0%  | 11%   | 4%   |
| Y | 3%  | 0%    | 93%  |
| Z | 12% | 100%  | 0%   |

|   | red     | green    | blue     |
|---|---------|----------|----------|
| X | 1.27307 | 0.17095  | 0.14907  |
| Y | 0.55384 | 0.83498  | -0.17736 |
| Z | 0.06345 | -0.32588 | 1.79265  |
Of the 233 cameras in my spreadsheet (these cameras are from a 2012 version of dcraw; as of 2015 dcraw supports more than 600 cameras), all of them had at least 2 negative tristimulus values. Nine camera matrices had 4 negative tristimulus values each, another 36 camera matrices had 3 negative tristimulus values each, and all the rest had 2 negative tristimulus values each. Every single camera matrix in the spreadsheet had a negative green Z tristimulus value. All but 16 cameras had negative blue Y tristimulus values.
It does seem that most or all digital cameras have negative tristimulus values in their input matrix profiles, and thus are at risk (depending on the image) of losing detail in the blue (and possibly other) channel(s) upon converting raw color output to a standard, ICC-compliant working space.
Notes
- "Round-tripping" from a suitably large custom working space to CIELAB and back destroys information exactly the same as converting to a standard ICC RGB working space profile, unless LCMS2 unbounded ICC profile conversions are used.
- Based on my spreadsheet findings, I also wonder about clipping to 0 in the green channel, but haven't yet made an investigation.
- I used dcraw V8.99 for my camera coefficient spreadsheet calculations. Only 245 of the 361 cameras supported by dcraw V8.99 have separate entries in the adobe_coeff table. Why? Sometimes multiple camera models use the same entry in the adobe_coeff table, presumably because the same basic camera was marketed under multiple makes or model numbers. And a few cameras don't use the table at all.
- Of the 245 unique camera models in the adobe_coeff table, twelve cameras have 12 rather than 9 camera coefficients, and so were excluded from further consideration (I don't know the math for deriving a camera profile matrix for 4-color cameras from the camera coefficients), leaving 233 sets of camera coefficients that I analyzed using a spreadsheet.
- You can download my dcraw-camera-coefficients spreadsheet, available in .ods or .xls format.
- If you perhaps doubt that dcraw camera coefficients are in fact derived from camera matrix input profiles, I invite you to look at my annotated and outlined dcraw code, specifically section F, which creates camera coefficients out of the values from a ColorChecker target shot. Also see section E and section M4a, which respectively calculate, then apply, camera matrices derived from Adobe DNG camera coefficients stored in section K (of my annotated dcraw C code, that is; alas, the default dcraw C code isn't neatly outlined for you!).