skip to main content

Floating Point Dcraw + DCB, AMaZE, LMMSE, AFD, & VCD
Part 3: Overview of the floating point c code

Wherein I share with you the nitty-gritty of a c-coding newbie diving head-first into c code pointers and arrays.

Written in 2011.

Summary of the double floating point raw processing procedure

  1. Take the (necessarily) integer data read off from the raw file and type cast it to double floating point, and determine the maximum and minimum image values
  2. Subtract the noise floor and multiply by the user-supplied (at the command line) white balance multipliers, clip saturated pixels (only if requested at the command line) to the default or user-specified sensor saturation value, clip negative image values to 0.0, and rescale the image so that the maximum image value is 52428.0
  3. Send the resulting image array to the double floating point interpolation routines
  4. Rescale all the interpolated image values to a maximum of 65035.0, then apply the user-specified gamma using (as specified by the user) either a gamma lookup table or full double floating point math, then type cast the image values back to integer and write the image out to disk as a 16-bit tiff.

It should be noted that I am not a coding guru. Critique of my code from people who know c code better than I do is more than welcome.

Raw data is integer data

Raw data by its very nature is integer data. Digital camera sensors are composed of an array of "sensels" (light-sensitive pixels). Typically, these sensels have red, blue, and green light-filtering caps, usually arranged in a Bayer pattern to capture red, blue, and green light levels.

An illustration of how a bayer pattern is used to filter light as it hits a CCD. (Image and description courtesy of Wikipedia Commons)

Depending on your camera's bit-depth, each sensel can record digital values ranging from 0 up to 1023, 4095, 16383, or 65535 (for 10-, 12-, 14-, and 16-bit Analog-to-Digital converters, respectively. As it is a common source of confusion, I will point out that the camera bit-depth does NOT determine the effective dynamic range between the darkest dark level and the brightest bright level of light that your camera can capture. Rather camera bit-depth determines "how many individual integer steps" there are from the darkest dark to the brightest bright light capturable by your camera sensor.

In simpler words, bit-depth doesn't determine your camera's dynamic range. Rather bit-depth divides up your camera's dynamic range, with the number/fineness/size of these divisions or "steps" determined by the bit-depth.

Of course, you don't really get the full range of integer light levels or "steps" indicated by your camera's nominal bit-depth. For example, a 12-bit camera nominally has 4096 discrete intervals from "no light" to "full well". But my Canon 400D, for example, loses roughly 256 steps to noise at the bottom end (the "noise floor"), and 370 more steps at the top end because of what's called "sensor saturation". Not to be confused with saturated colors in an image file, "sensor saturation" (the dcraw "-S <integer>" command line option) refers to the maximum amount of light a sensel can record before it has reached its maximum, "full well" state. So really, my Canon 400D has only 4096 - (255+370) or 3,470 distinguishable levels of light for capturing raw image data.

dcraw code has a very long table, called the adobe_coeff table, that holds noise floor and full well values for most of the cameras supported by dcraw. At least for some cameras, the dcraw code sections that "parse" the raw files also calculate the noise floor level on an image-by-image basis.

Rename the variable "image" as "integerimage"

The first step in making dcraw-float use floating point processing was simple: I replaced the variable name "image" with the variable name "integerimage" in all sections of the dcraw code that identify the raw file and then read the raw file values. (The other alternative was to replace all occurances of "image" in the interpolation algorithms with another variable name, perhaps "floatimage". My decision was based on the sheer number of places "image" occurs in the various interpolation algorithms.)

Convert the integer image data to floating point

To create an array of image values (necessarily integer values, because that is what is recorded in the raw file) from your raw file, dcraw (and dcraw-float) performs the following steps:

  1. Identifies your camera.
  2. Calculates the output image height and width.
  3. CalculateS your image "noise floor".
  4. Reads your image "full well" or saturation point from the adobe_coeff table containing information for all the cameras supported by dcraw (almost all the cameras — some cameras, such as Foveons and the Canon 600 Point and Shoot, require a lot of special code that's not part of the normal dcraw image processing flow).
  5. Reads off all the raw file values into an array of integer image values.

And here's the very simple dcraw-float code that converts the integer image information to double floating point:

void CLASS create_floatimage()
  unsigned int size, i, c;
  double val, saturation, dark, image_max=0, image_min=DBL_MAX;
  size = height*width;

  for (i=0; i < size*4; i++)
//type cast "integerimage" from integer to double and set val equal to integerimage
    val = (double)integerimage[0][i];
    if (!val) continue;

//keep track of maximum and minimum values
    if (val> image_max) image_max = val;
    if (val< image_min) image_min = val;

//set the double floating point array "image" equal to val

As an aside, although I use the word "array", the variables "image" and "integerimage" in the code above and below are both c code pointers, not simple arrays; unless you plan on modifying dcraw code yourself, don't worry about the difference between pointers and arrays.

If you study the above code snippet for a minute or two, you'll see that the "for" loop goes through "integerimage", keeping track of the highest and lowest values it finds. Along the way, all the integerimage values are "type cast" from type "integer" (unsigned short integer, actually) to type "double floating point," creating the floating point image "array". When the "for" loop is over, dcraw-float gives you a screen message giving you what your image maximum and minimum values are.

Subtract the noise floor values and multiply by the white balance multiplier values

Unlike dcraw-float, the default dcraw code has to do some fancy contortions, er, calculations before applying your white balance multipliers. In the process the default dcraw code performs the following steps:

  1. Visits the adobe_coeff table for your camera matrix values.
  2. Multiplies these values by an sRGB matrix (the effects of which multiplication are undone mathematically at a later stage of processing, if you really aren't planning on outputting your image in the sRGB color space).
  3. Calculates a mysterious matrix called "num".
  4. divides the camera matrix values by num (without which step your image would have a horrendously strong color cast).
  5. Takes the inverse of the camera matrix (thereby creating an actual matrix camera profile)
  6. Nses "num" to calculate "pre_mul".
  7. Divides (or is it multiplies? the default dcraw code is opaque to this c code newbie) your actual white balance multipliers by "pre_mul" . . . Whew!

Using a custom camera profile, along with specifying the white balance multipliers at the command line ("-r r g1 b g2") allows all of the above-listed mathematical cartwheels to be avoided. The code snippet below does the black point and white balance image scaling, and also produces clipped or unclipped highlights, depending on whether you use "-H 0" or "-H 1" at the command line.

void CLASS scale_colors()
  unsigned int size, i, c, noise=0, blown=0, scaleblown=0;
  double val, valmax=0, valmin=DBL_MAX, newmax=0, newmin=DBL_MAX, scale_factor, saturation;

//fetch the noise floor values and print them to your screen
  FORC4 cblack[c] += black;
  fprintf (stderr,_("6b black point cblack="));
  FORC4 fprintf (stderr, " %i", cblack[c]); fputc ('\n', stderr);

//loop through the image array
  size = height*width;
  for (i=0; i < size*4; i++)

//set val equal to the image array value located at [0][i]
    val = image[0][i];
    if (!val) continue;

//check for sensels that reached sensor saturation
    if (val>=saturation) {blown=blown+1;}

//substract the noise floor
    val -= cblack[i & 3];

//multiply by the white balance multipliers specified at the command line
    val *= user_mul[i & 3];

//keep track of maximum and minimum post-scaling image values
    if (val > valmax) valmax = val;
    if (val < valmin) valmin = val;

    if (val < 0.0)
//clip negative scaled values to 0.0
      val = 0.0;
//keep track of sensel values below the noise floor

    if ((val >= saturation) && (highlight==0))
//if using '-H 0' clip highlight values to the sensor saturation value
      val = saturation;

//keep track of image values that are "blown" by clipping to the sensor saturation value

//set the image array value located at  [0][i] back equal to val
    image[0][i] = val;

//rescale image values so that the maximum image value is 52428.0
    for (i=0; i < size*4; i++)
    image[0][i] = image[0][i]* 52428.0/valmax;
    if (image[0][i]>newmax) newmax=image[0][i];
    if (image[0][i]<newmin) newmin=image[0][i];

The first "for" loop above subtracts the "noise floor" value from each "array" value in image, and then multiplies the result by the appropriate white balance multiplier. How this loop "knows" which array entry is for which color sensel cap is a bit of c code magic I haven't figured out yet. That's part of the beauty of being a newbie programmer. You only need to figure out what's necessary for the task at hand.

The second "for" loop rescales the resulting image values so that the maximum image value is 52428.0. Why 52428.0? It's four-fifths of 65035.0, which is the maxium image value allowed in a 16-bit image.

Most of the interpolation algorithms divide each image value by 65035.0 to produce a working range between 0.0 and 1.0. If I scale the image values all the way to 65035.0, most of these algorithms will produce values greater than 65035.0 in the image highlights. Then the algorithms usually "clip" the image value to 65035.0. I've tried in the algorithm code loops to eliminate premature clipping (dcraw-float rescales at the very end). But some of those algorithms are pretty complicated, so I took the precaution of factoring in a bit of headroom.

I use double floating point, instead of just floating point, to get the degree of accuracy needed to accomodate scaling, rescaling, dividing by 65035.0 and then multiplying by 65035.0 without introducing rounding errors. c code technically doesn't require that double floating point really be more accurate than floating point, but it probably is on all modern computers (it is on mine).


In c code, there is always a function called "main" that controls the flow of the program. In the default dcraw, main calls "pre_interpolate" before it calls any of the actual interpolation routines.

I won't show the code, but changing the dcraw pre_interpolate (and also border_interpolate) to use double floating point instead of integer calculations was mostly just a matter of making sure that all the variables that get set equal to "image" and vice-versa had their "type" properly set to double rather than the default dcraw unsigned short int.

dcraw's pre_interpolate function actually does two very different tasks: It resizes the image if you ask dcraw to do its chromatic aberration correction (which functionality is disabled in dcraw-float). And it sets the value of every other green image pixel equal to the value of the green pixel two "pixel slots over" (or maybe it's "down" instead of "over"). In other words, pre_interpolate seems to do a very heavy-handed "green equilibration". I remain puzzled by the pre_interpolate code, and if you understand the code, please share!

So what, you might be asking, is "green equilibration"? It seems that for some camera sensors, the green-capped sensels on the rows that alternate red with green, have a different sensitivity to light than the green-capped sensels on the rows that alternate blue with green. The resulting image is likely to have a blocky-looking texture in what should be smoothly transitioning areas of color. The cure is to make the greens in the green-red row match the greens in the green-blue row, hence, "green equilibration".

There are more sophisticated algorithms for achieving "green equilibration" than the default dcraw "pre-interpolate", but I haven't fine-tuned dcraw-float to use them. Neither have I added any of the newer chromatic aberration correction algorithms to dcraw-float. Both tasks are on my "to-do" list (truthfully, most of the things on my "to-do" list never get done).

(If you use "-f" at the command line, this default dcraw default "heavy-handed green equilibration" step is skipped, and the code (both dcraw and dcraw-float) uses four-color VNG interpolation.)


"Main" doesn't call "border_interpolate"; rather, "border-interpolate" is called (or not called) directly by the various interpolation algorithms. What border-interpolation does should be pretty obvious by its name. Interpolation algorithms work with variously-sized "windows" that slide across the image "array" making calculations on all the pixels inside the window. Near the edges of the image there aren't enough pixels to properly fill the window. Some algorithms do their own border-interpolation. Some call upon the dcraw default border-interpolation.

The interpolation algorithms

Again, changing the interpolation algorithms from integer processing to floating point processing was mostly a matter of making sure that all the variables that get set equal to "image" and vice-versa had their "type" properly set to double rather than unsigned short int. It's not as easy as it sounds, because some of the algorithms are pretty long and complicated. The trick is to figure out which parts of the algorithm are calculating the "window" (those variables need to be integer variables) and which parts are actually sliding the window over the image array to calculate the interpolated image values.

The default dcraw algorithms, bilinear, VNG, and PPG, are made a bit tricker than they otherwise would be, because the default dcraw code tends to re-use variable names, first in the "window" loop, then in the "interpolation" loop (same variable name, very different meanings). But you can't have the same variable be both integer and floating point, at least not within the same function. So all the double-duty variable names had to be given a separate variable name for each code loop.

The final complication is that there are very efficient c code math manipulations (like "shifting bits" for efficient multiplying and dividing) that only work with integer variables. All these "efficient code statements" need to be replaced with actual floating point division and multiplication when you switch from integer to floating point processing. In case you are wondering, yes, some algorithms are noticeably slower when the efficient integer processing is replaced with floating point math, others, not so much. VNG is much slower.

Writing out the interpolated image file to disk

In dcraw-float, the function "write_ppm_tiff" is where:

  1. the floating point image "array" is rescaled to have a maximum value of 65035.0
  2. the user-chosen gamma is applied, using either a gamma-lookup-table (considerably faster, slightly less accurate) or using double floating-point math, followed by rounding to the nearest integer value), and finally
  3. the image is type cast to integer and written to disk.

Why rescale the image to a maximum value of 6035.0? I promise you that no pixels that were not already clipped in the raw file get clipped by this final rescaling. And highlight colors are also not altered (unlike in-camera-produced jpegs that apply a heavy-handed S-curve to artificially raise image highlights to a uniform white point and lower shadow values to a uniform black point to give the image more "pop"). Rather this rescaling serves to get all the image values as far from 0 as possible while it is still a true 16-bit image, without clipping the highlights, making it easier to manipulate the shadow tones in post-processing.

For more details about how dcraw-float does floating point processing, see the dcraw-float source code. Please read the licensing information at the top of the source code.