Speeding up RGB to CIE-Lab conversion code...

Speeding up RGB to CIE-Lab conversion code...

Hi all,
I have some code I converted from pseudo code to Delphi 5 code.

It converts from RGB to CIE-Lab, but I want to speed it up.

I am taking each bitmap pixel RGB and converting it using RGBToCIELab. I am
then finding the smallest Sqr difference between that result and the CIE-Lab
values stored with my palette to find the best matching colour.

If I use Difference = Sqr(paletteR - R) + Sqr(paletteG - G) + Sqr(paletteB -
B) , the whole bitmap conversion is quite fast (small fraction of a second
per pixel).

But when I use
RGBToCIELab() on the RGB bitmap value and do

Difference = Sqr(paletteCIEL - L) + Sqr(paletteCIEa - a) +
Sqr(paletteCIEb - b)

the conversion slows down to about 2 1/2 seconds per pixel

procedure RGBToCIELab(R,G,B: Byte; var CIEL,CIEa,CIEb: Single);
var
var_R,var_G,var_B: Single;
var_X,var_Y,var_Z: Single;
X,Y,Z: Single;
begin
//Observer. = 2 Illuminant = D65
var_R := ( R / 255 ); //R = From 0 to 255
var_G := ( G / 255 ); //G = From 0 to 255
var_B := ( B / 255 ); //B = From 0 to 255

if ( var_R>> 0.04045 )then
var_R := Power(( ( var_R + 0.055 ) / 1.055 ),2.4)
else
var_R := var_R / 12.92;

if ( var_G>> 0.04045 ) then
var_G := Power(( ( var_G + 0.055 ) / 1.055 ),2.4)
else
var_G := var_G / 12.92;

if ( var_B>> 0.04045 ) then
var_B := Power(( ( var_B + 0.055 ) / 1.055 ),2.4)
else
var_B := var_B / 12.92;

var_R := var_R * 100;
var_G := var_G * 100;
var_B := var_B * 100;

//Observer. = 2 Illuminant = D65
X := var_R * 0.4124 + var_G * 0.3576 + var_B * 0.1805;
Y := var_R * 0.2126 + var_G * 0.7152 + var_B * 0.0722;
Z := var_R * 0.0193 + var_G * 0.1192 + var_B * 0.9505;

var_X := X / 95.047; //Observer := 2 Illuminant := D65
var_Y := Y / 100.000;
var_Z := Z / 108.883;

if ( var>X > 0.008856 )then
var_X := Power(var_X ,( 1/3 ))
else
var_X := ( 7.787 * var_X ) + ( 16 / 116 );

if ( var>Y > 0.008856 ) then
var_Y := Power(var_Y ,( 1/3 ))
else
var_Y := ( 7.787 * var_Y ) + ( 16 / 116 );

if ( var>Z > 0.008856 ) then
var_Z := Power(var_Z ,( 1/3 ))
else
var_Z := ( 7.787 * var_Z ) + ( 16 / 116 );

CIEL := ( 116 * var_Y ) - 16;
CIEa := 500 * ( var_X - var_Y );
CIEb := 200 * ( var_Y - var_Z );
end;

Any ideas on how to speed this up?

--
Cheers,
Paul.

"The plastic veneer of civilization is easily melted in the heat of the
moment" - Paul Nicholls.
XXXX@XXXXX.COM

Speeding up RGB to CIE-Lab conversion code...

"2 1/2 seconds per pixel" ..thats uhm.. quite slow ;) I tried with your
code, and cannot even make it 2 second per image.. (allright didn't try
the 5MP digicam images...).

First thing you want get rid of is all those divisions. Define some
reciprocal constants..e.g. instead of doing R / 255, do something like:

const
r_255 = 1/255;

[..]

begin
//Observer. = 2 Illuminant = D65
var_R := ( R * r_255 );
[..]

Get the idea?

Next thing may be to recycle some of the variables, dunno how much this
would give you though, e.g:

var_X := X / 95.047; //Observer := 2 Illuminant := D65
var_Y := Y / 100.000;
var_Z := Z / 108.883;

X,Y and Z is not used anymore from here on, so why not use X,Y and Z from
now on instead of var_X...etc (not to mention the return variables in the
header... these are only used in the end... perhaps these could be used too,
and thereby let you remove another three vars..). But, as said, I'm not sure
how much you're gaining since its FP math anyway... if you could code it
solely in Fixed integer math, you'd gain alot of speed I believe.

Regards,

Michael

Speeding up RGB to CIE-Lab conversion code...

On Tue, 1 Feb 2005 16:55:18 +1100, "Paul Nicholls"

Woha. My palette generator manages to convert the bitmap to CIELab,
extract a 256 color palette (from 200k unique colors) and replace each
color in bitmap with its closest match in about 30 seconds, for a
1024x1536 image.

Are you using single's for storing L, a and b ? If so, dont :) Use
integer math (I used byte for L and shortint for a and b).

Convert like this:

result.l:= round(lab[0] * (255 / 100));
result.a:= round(lab[1] * (127 / 100));
result.b:= round(lab[2] * (127 / 100));

If that doesn't provide enough precision (I think it does), scale them
by another 16 bits.

- Asbjn

Speeding up RGB to CIE-Lab conversion code...

think you mean 2.5 milliseconds per pixel? I can't imagine that your
routine without any loops takes 2.5 seconds per pixel, unless you're on a
ZX81.

On the other hand, if you do this in each pixel every time for all your
palette entries you're doing something silly. You should store your palette
list in CIELab, not in RGB, at least for the time you're doing the
conversion of the whole picture. You then only need one conversion per
pixel, and one for each palette entry.

Nils

"Paul Nicholls" < XXXX@XXXXX.COM > wrote in message
news:41ff19f2\$ XXXX@XXXXX.COM ...
am
CIE-Lab
Sqr(paletteB -
> X := var_R * 0.4124 + var_G * 0.3576 + var_B * 0.1805;> > Y := var_R * 0.2126 + var_G * 0.7152 + var_B * 0.0722;> > Z := var_R * 0.0193 + var_G * 0.1192 + var_B * 0.9505;> >> > var_X := X / 95.047; //Observer := 2 Illuminant := D6>
> var_Y := Y / 100.000>
> var_Z := Z / 108.883>
> if ( var>X > 0.008856 )the>
> var_X := Power(var_X ,( 1/3 )>
> els>
> var_X := ( 7.787 * var_X ) + ( 16 / 116 )>
> if ( var>Y > 0.008856 ) the>
> var_Y := Power(var_Y ,( 1/3 )>
> els>
> var_Y := ( 7.787 * var_Y ) + ( 16 / 116 )>
> if ( var>Z > 0.008856 ) the>
> var_Z := Power(var_Z ,( 1/3 )>
> els>
> var_Z := ( 7.787 * var_Z ) + ( 16 / 116 )>
> CIEL := ( 116 * var_Y ) - 16>
> CIEa := 500 * ( var_X - var_Y )>
> CIEb := 200 * ( var_Y - var_Z )>
> end>
> Any ideas on how to speed this up>
> -->
> Cheers>
> Paul>
> "The plastic veneer of civilization is easily melted in the heat of th>
> moment" - Paul Nicholls>
> XXXX@XXXXX.COM >
>

Speeding up RGB to CIE-Lab conversion code...

Would it be possible to see your code?

My palette is around 450 colours, but I should be able to adapt your code...

If you can't share your code I will try this instead :-)

cheers,
Paul.

Speeding up RGB to CIE-Lab conversion code...

On Wed, 2 Feb 2005 09:25:50 +1100, "Paul Nicholls"

Not ripe for public consumption im affraid.

My algorithm goes something like this: compute "best" common color for
all colors (root cluster). While not enough colors in palette, split
the "worst" (highest error) cluster in two, and find the best color
for each of the new clusters. So you simply specify a target number of
colors and it'll do the rest :) Of course it takes more time to
produce more colors.

Seems to work nice on many images.

I've found that my code spends most of its time in two places,
RBG->CIELab conversion and finding the best color in the output based
on the generated palette (this opperation takes the most time of the
two). I'm toying with the idea to use a lookup table and interpolate
when doing the conversions. Might try it out and see if it's much
worse.

- Asbjn

Speeding up RGB to CIE-Lab conversion code...

I know how that is with many code bits I have written! <G>

so what is a cluster? a collection of colours?

This makes sense, it would be slower to find the matching CIELab value from
the final palette...

I was thinking, perhaps I could do something like this which could be fairly
fast I think (not tested):

// done once at palette loadup/creation
for each colour in the palette
{
paletteCIELab := RGBToCIELab(palette_R,palette_G,palette_B);
// store palette RGB & CIELab in a binary search list sorted by
CIELab...
}

for each pixel in bitmap
{
// if RGB not stored yet, store, calc and return value otherwise return
value.
// use a binary search list...

pixelCIELab := RGBToCIELabCache(pixel_R,pixel_G,pixel_B);

// search in binary list created from palette for nearest CIELab
value using pixelCIELab and return matching RGB values.
// store the new RGB values in bitmap...
}

Speeding up RGB to CIE-Lab conversion code...

On Wed, 2 Feb 2005 11:30:26 +1100, "Paul Nicholls"

Yes, a cluster is simply a collection of colors for which i compute a
common color (the average color). So i start with only one cluster
(all the colors in the image), and then work on that, splitting it in
half etc etc.

Yes, that is the slow step by far. I haven't really come up with any
better way though. Binary searching like you indicate might work
though.

I've done this now, and well... It might be worth it. Using only fixed
point code, the time is reduced from 1.5 seconds to 0.6 (using the
1536x1024 image). However the way i got my code right now i have to
use a floatingpoint interpolation routine to get the precision
required, which increases the time to 0.9 seconds. Im sure that if i
inlined all the interpolation using fixed point i could get the
precision and the same timing. I'll get back to you on this.

- Asbjn

Speeding up RGB to CIE-Lab conversion code...

n Wed, 02 Feb 2005 02:07:22 +0100, Lord Crc < XXXX@XXXXX.COM >
wrote:

Yeah, just did it, and got _almost_ the same precision as not using
the LUT (ie doing a regular conversion per pixel). Slight banding in
"shallow" gradients, but it didnt affect palette generation much. And
the speed dropped to a further 0.45 seconds.

Here's the code if you're interested:

type
TColorLab24 = packed record
case integer of
0: (l: byte; a, b: shortint; d: byte);
1: (long: integer);
end;
TLabPixelArray = array of TColorLab24;

function BmpToLabPixels2(Bmp: TBitmap): TLabPixelArray;
var
x, y, i: integer;
p: PRGBTriple;
c: TColorRGB;
l: TColorLab;
lut: array of TColorLab24;
r, r2, g, g2, b, b2: integer;
rf, gf, bf: integer;
t: single;
ca, cfa: array[0..255] of integer;
c1, c2: ^TColorLab24;
ltg1, ltg2: integer;
atg1, atg2: integer;
btg1, btg2: integer;
ltb1, ltb2: integer;
atb1, atb2: integer;
btb1, btb2: integer;
begin
Bmp.PixelFormat:= pf24bit;
SetLength(result, Bmp.Width * Bmp.Height);

SetLength(lut, 64*64*64);

// build LUT
for r:= 0 to 63 do
begin
for g:= 0 to 63 do
begin
for b:= 0 to 63 do
begin
i:= ((r * 64) + g) * 64 + b;

c[0]:= r * (1/63);
c[1]:= g * (1/63);
c[2]:= b * (1/63);

l:= RgbToLab(c);

lut[i].l:= round(l[0] * (255 / 100));
lut[i].a:= round(l[1] * (127 / 100));
lut[i].b:= round(l[2] * (127 / 100));
end;
end;
end;

// so we get max precision, without the cost of
// floating point code in the innerloop
for i:= 0 to 255 do
begin
t:= i * (63 / 255);
ca[i]:= trunc(t);
cfa[i]:= trunc((t-ca[i]) * 4);
end;

i:= 0;
for y:= 0 to Bmp.Height-1 do
begin
p:= Bmp.ScanLine[y];
for x:= 0 to Bmp.Width-1 do
begin
r:= p^.rgbtRed;
g:= p^.rgbtGreen;
b:= p^.rgbtBlue;

rf:= cfa[r];
r:= ca[r];

gf:= cfa[g];
g:= ca[g];

bf:= cfa[b];
b:= ca[b];

if r < 63 then
r2:= r + 1
else
r2:= 63;

if g < 63 then
g2:= g + 1
else
g2:= 63;

if b < 63 then
b2:= b + 1
else
b2:= 63;

// compute offsets
r:= r * 64*64;
r2:= r2 * 64*64;
g:= g * 64;
g2:= g2 * 64;

// perform 3d linear interpolation
// each step adds 2 bit of precision
// for a total of 6 bits

// rf
c1:= @lut[r+g+b];
c2:= @lut[r2+g+b];
ltg1:= (integer(c1.l) shl 2 + (integer(c2.l) - c1.l) * rf);
atg1:= (integer(c1.a) shl 2 + (integer(c2.a) - c1.a) * rf);
btg1:= (integer(c1.b) shl 2 + (integer(c2.b) - c1.b) * rf);

// rf
c1:= @lut[r+g2+b];
c2:= @lut[r2+g2+b];
ltg2:= (integer(c1.l) shl 2 + (integer(c2.l) - c1.l) * rf);
atg2:= (integer(c1.a) shl 2 + (integer(c2.a) - c1.a) * rf);
btg2:= (integer(c1.b) shl 2 + (integer(c2.b) - c1.b) * rf);

// gf
ltb1:= (integer(ltg1) shl 2 + (integer(ltg2) - ltg1) * gf);
atb1:= (integer(atg1) shl 2 + (integer(atg2) - atg1) * gf);
btb1:= (integer(btg1) shl 2 + (integer(btg2) - btg1) * gf);

// rf
c1:= @lut[r+g+b2];
c2:= @lut[r2+g+b2];
ltg1:= (integer(c1.l) shl 2 + (integer(c2.l) - c1.l) * rf);
atg1:= (integer(c1.a) shl 2 + (int

Speeding up RGB to CIE-Lab conversion code...

<SNIP>

Thanks!!

Just for interests sake, I may also try my binary lookup thingy to see how
it performs :-)

Thanks for the code,
cheers,
Paul

Speeding up RGB to CIE-Lab conversion code...

can I assume that

TColorRGB = record
r,g,b: Single; // ??
end;

TColorLab = record
L: Byte;
a,b: ShortInt;
end;

what is your RgbToLab() function code?

<SNIP>

cheers,
Paul.

Speeding up RGB to CIE-Lab conversion code...

On Thu, 3 Feb 2005 10:08:39 +1100, "Paul Nicholls"

Eh almost, both are simply "array[0..2] of single".

And my rgb->lab code is converted from the easyrgb reference i posted.

- Asbjn

Speeding up RGB to CIE-Lab conversion code...

oh, ok <chuckle> I'm all set then :)

cheers,
Paul.

Speeding up RGB to CIE-Lab conversion code...

On Tue, 1 Feb 2005 16:55:18 +1100, "Paul Nicholls"

var_R := RValueLookup[R];
var_G := GValueLookup[G];
var_B := BValueLookup[B];

The power function is slow and each of these tables has only 256
entries. Precalculate!
>> //Observer. = 2 Illuminant = D65> > X := var_R * 0.4124 + var_G * 0.3576 + var_B * 0.1805;> > Y := var_R * 0.2126 + var_G * 0.7152 + var_B * 0.0722;> > Z := var_R * 0.0193 + var_G * 0.1192 + var_B * 0.9505;

And you could even precalculate the multiplications if you wanted to,
but I doubt it would make much difference. This would triple the size
of the lookup table.
> > var_X := X / 95.047; //Observer := 2 Illuminant := D6>
> var_Y := Y / 100.000>
> var_Z := Z / 108.883;

Definitely fold these into the previous step. The gain is small but
the effort is also small. >
> if ( var>X > 0.008856 )the>
> var_X := Power(var_X ,( 1/3 )>
> els>
> var_X := ( 7.787 * var_X ) + ( 16 / 116 )>
> if ( var>Y > 0.008856 ) the>
> var_Y := Power(var_Y ,( 1/3 )>
> els>
> var_Y := ( 7.787 * var_Y ) + ( 16 / 116 )>
> if ( var>Z > 0.008856 ) the>
> var_Z := Power(var_Z ,( 1/3 )>
> els>
> var_Z := ( 7.787 * var_Z ) + ( 16 / 116 );

Unfortunately, another expensive set of calculations and I see no easy
way to optimize it. >
> CIEL := ( 116 * var_Y ) - 16>
> CIEa := 500 * ( var_X - var_Y )>
> CIEb := 200 * ( var_Y - var_Z )>
>end;

No ideas here, either.

Speeding up RGB to CIE-Lab conversion code...

>> //Observer. = 2 Illuminant = D65 >>> var_R := ( R / 255 ); //R = From 0 to 255 >>> var_G := ( G / 255 ); //G = From 0 to 255 >>> var_B := ( B / 255 ); //B = From 0 to 255 >>> >>> if ( var_R>> 0.04045 )then >>> var_R := Power(( ( var_R + 0.055 ) / 1.055 ),2.4) >>> else >>> var_R := var_R / 12.92; >>> >>> if ( var_G>> 0.04045 ) then >>> var_G := Power(( ( var_G + 0.055 ) / 1.055 ),2.4) >>> else >>> var_G := var_G / 12.92; >>> >>> if ( var_B>> 0.04045 ) then >>> var_B := Power(( ( var_B + 0.055 ) / 1.055 ),2.4) >>> else >>> var_B := var_B / 12.92; >>> >>> var_R := var_R * 100; >>> var_G := var_G * 100; >>> var_B := var_B * 100; >> >> var_R := RValueLookup[R]; >> var_G := GValueLookup[G]; >> var_B := BValueLookup[B]; >> >> The power function is slow and each of these tables has only 256 >> entries. Precalculate! >> >>> //Observer. = 2 Illuminant = D65>>>> X := var_R * 0.4124 + var_G * 0.3576 + var_B * 0.1805;>>>> Y := var_R * 0.2126 + var_G * 0.7152 + var_B * 0.0722;>>>> Z := var_R * 0.0193 + var_G * 0.1192 + var_B * 0.9505;> >> > And you could even precalculate the multiplications if you wanted to,> > but I doubt it would make much difference. This would triple the size> > of the lookup table.> >>>>> var_X := X / 95.047; //Observer := 2 Illuminant := D6>> >> var_Y := Y / 100.000>> >> var_Z := Z / 108.883>
> Definitely fold these into the previous step. The gain is small bu>
> the effort is also small>
>> >> if ( var>X > 0.008856 )the>> >> var_X := Power(var_X ,( 1/3 )>> >> els>> >> var_X := ( 7.787 * var_X ) + ( 16 / 116 )>> >>> >> if ( var>Y > 0.008856 ) the>> >> var_Y := Power(var_Y ,( 1/3 )>> >> els>> >> var_Y := ( 7.787 * var_Y ) + ( 16 / 116 )>> >>> >> if ( var>Z > 0.008856 ) the>> >> var_Z := Power(var_Z ,( 1/3 )>> >> els>> >> var_Z := ( 7.787 * var_Z ) + ( 16 / 116 )>
> Unfortunately, another expensive set of calculations and I see no eas>
> way to optimize it>
>> >> CIEL := ( 116 * var_Y ) - 16>> >> CIEa := 500 * ( var_X - var_Y )>> >> CIEb := 200 * ( var_Y - var_Z )>> >>end>
> No ideas here, either.

Thanks for all the ideas Loren :-)

Cheers,
Paul