Ah, after some more research I think they may use MMX.
Now I just have to learn how to use it as well.
Thanks a lot for the help.
Glenn
----- Original Message ----
From: Alexander Osipov <0...@inbox.ru>
To: Glenn Pierce
Sent: Thursday, 21 September, 2006 8:25:48 PM
Subject: Re[2]: [imagedsp] Fast Convolution
Hello Glenn,
OK, I see.
Maybe it is a simple SSE/MMX optimization (vectorization)?
To be sure, you can time this convolution code for several kernel
sizes and see how the run time scales with kernel size.
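
A minimal sketch of such a scaling test in plain C. The names conv_fn, direct_convolve and time_convolution are made up for illustration; the Imaq routine would need a small wrapper with the same signature before it could be timed this way. Note that clock() has coarse resolution, so averaging several runs is advisable.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical signature for whatever convolution routine is under test. */
typedef void (*conv_fn)(const float *in, float *out, int width, int height,
                        const float *kernel, int ksize);

/* Reference direct 2D filter (no kernel flip, so strictly a correlation).
   Pixels closer than ksize/2 to an edge are left untouched in out. */
void direct_convolve(const float *in, float *out, int width, int height,
                     const float *kernel, int ksize)
{
    int r = ksize / 2;
    for (int y = r; y < height - r; y++)
        for (int x = r; x < width - r; x++) {
            float sum = 0.0f;
            for (int j = 0; j < ksize; j++)
                for (int i = 0; i < ksize; i++)
                    sum += in[(y + j - r) * width + (x + i - r)]
                         * kernel[j * ksize + i];
            out[y * width + x] = sum;
        }
}

/* Returns the CPU seconds one call of fn takes with a ksize x ksize kernel
   of all ones on a width x height test image, or -1.0 if allocation fails. */
double time_convolution(conv_fn fn, int width, int height, int ksize)
{
    assert(fn != NULL && width > 0 && height > 0 && ksize > 0);
    size_t npix = (size_t)width * (size_t)height;
    size_t nker = (size_t)ksize * (size_t)ksize;
    float *in = malloc(npix * sizeof *in);
    float *out = calloc(npix, sizeof *out);
    float *kernel = malloc(nker * sizeof *kernel);
    double seconds = -1.0;

    if (in && out && kernel) {
        for (size_t p = 0; p < npix; p++) in[p] = (float)(p % 251);
        for (size_t k = 0; k < nker; k++) kernel[k] = 1.0f;
        clock_t start = clock();
        fn(in, out, width, height, kernel, ksize);
        seconds = (double)(clock() - start) / CLOCKS_PER_SEC;
    }
    free(in);
    free(out);
    free(kernel);
    return seconds;
}
```

Calling time_convolution for ksize = 3, 5, 7, 9 on the same image size gives the trend: time growing roughly with ksize*ksize points to a direct 2D filter, while growth roughly proportional to ksize would suggest a separable implementation.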
Thursday, September 21, 2006, 12:28:08 PM, you wrote:
GP> I thought that was likely too, but I did a test with a non-separable kernel
GP> like the one below.
GP> The speed was similar to that of a kernel of all 1's.
GP> static float array[7][7] = {{1.0, 6.0, 1.0, 1.0, 1.0, 1.0, 1.0},
GP> {1.0, 4.0, 1.0, 5.0, -1.0, 6.0, 1.0},
GP> {7.0, 1.0, 3.0, 1.0, 1.0, 1.0, 1.0},
GP> {1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0},
GP> {1.0, 7.0, 1.0, 1.0, 1.0, 1.0, 1.0},
GP> {8.0, 1.0, -1.0, 1.0, 9.0, 1.0, 1.0},
GP> {1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0}};
GP> void Imaq_Convolution (IPIImageRef in, IPIImageRef out)
GP> {
GP> int i, nPoints, fftX, fftY;
GP> IPIConvoDesc matrix;
GP> IPI_Cast (in, IPI_PIXEL_SGL);
GP> IPI_Cast (out, IPI_PIXEL_SGL);
GP> matrix.matrixWidth = 7;
GP> matrix.matrixHeight = 7;
GP> matrix.matrixElements = (float*) &array;
GP> matrix.divider = 48.0;
GP> IPI_Convolute (in, IPI_NOMASK, out, &matrix, IPI_BO_CLEAR);
GP> }
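
Whether a kernel such as the one above really is non-separable can be checked numerically: a kernel is separable exactly when it is the outer product of a column vector and a row vector (matrix rank at most 1). A small sketch of that check follows; kernel_is_separable, abs_f and the tolerance parameter tol are names invented here, not part of Imaq.

```c
#include <assert.h>
#include <stddef.h>

static float abs_f(float v) { return v < 0.0f ? -v : v; }

/* Returns 1 if the rows x cols kernel k (row-major) is separable, i.e. has
   rank <= 1, and 0 otherwise. tol is a relative tolerance. */
int kernel_is_separable(const float *k, int rows, int cols, float tol)
{
    assert(k != NULL && rows > 0 && cols > 0);
    int pr = 0, pc = 0;
    float pivot = 0.0f;

    /* Pick the largest-magnitude element as the pivot. */
    for (int r = 0; r < rows; r++)
        for (int c = 0; c < cols; c++)
            if (abs_f(k[r * cols + c]) > abs_f(pivot)) {
                pivot = k[r * cols + c];
                pr = r;
                pc = c;
            }
    if (pivot == 0.0f)
        return 1; /* an all-zero kernel is trivially separable */

    /* Rank <= 1 means k[r][c] * pivot == k[r][pc] * k[pr][c] everywhere. */
    for (int r = 0; r < rows; r++)
        for (int c = 0; c < cols; c++) {
            float lhs = k[r * cols + c] * pivot;
            float rhs = k[r * cols + pc] * k[pr * cols + c];
            if (abs_f(lhs - rhs) > tol * pivot * pivot)
                return 0;
        }
    return 1;
}
```

For the 7x7 array above, kernel_is_separable(&array[0][0], 7, 7, 1e-6f) returns 0, while a kernel of all 1's returns 1, so a two-pass shortcut cannot explain the speed measured with this kernel.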
GP> ----- Original Message ----
GP> From: Alexander Osipov <0...@inbox.ru>
GP> To: i...; g...@yahoo.co.uk
GP> Sent: Sunday, 17 September, 2006 6:33:11 PM
GP> Subject: Re: [imagedsp] Fast Convolution
GP> Hello glennpierce2001,
GP> Most probably their code is written as a separable kernel
GP> filter (as Gaussian blur is).
GP> In this case the convolution is implemented not as a 2D filter, but as
GP> two 1D filters.
GP> So instead of the Width*Height*KernelSize*KernelSize operations of the
GP> general case, it takes only 2*Width*Height*KernelSize, so you can speed
GP> up Gaussian blur by a factor of approximately KernelSize/2.
GP> In the first filtering stage you apply the 1D Gaussian kernel
GP> horizontally, and in the second stage you apply the 1D Gaussian kernel
GP> vertically.
GP> Additionally, Gaussian blur (for example) can be optimized further
GP> using recursive filtering (also separable), but you should use
GP> floating point calculations to keep the error of this convolution small,
GP> so it is often not faster overall.
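
The two-pass scheme described above can be sketched in C roughly as follows. This is only an illustration under assumptions: separable_convolve and its parameters are invented names, the image is a row-major float buffer, the kernel is given as its 1D row and column factors, and border pixels are simply skipped.

```c
#include <assert.h>
#include <stdlib.h>

/* Filters a width x height float image with the separable kernel
   col[j] * row[i] (both of odd length ksize) using two 1D passes.
   Pixels closer than ksize/2 to an edge are left untouched in out.
   Returns 0 on success, -1 if the temporary buffer cannot be allocated. */
int separable_convolve(const float *in, float *out, int width, int height,
                       const float *row, const float *col, int ksize)
{
    assert(ksize > 0 && ksize % 2 == 1);
    assert(width >= ksize && height >= ksize);
    int r = ksize / 2;
    float *tmp = calloc((size_t)width * (size_t)height, sizeof *tmp);
    if (tmp == NULL)
        return -1;

    /* Pass 1: horizontal 1D filter on every image row. */
    for (int y = 0; y < height; y++)
        for (int x = r; x < width - r; x++) {
            float sum = 0.0f;
            for (int i = 0; i < ksize; i++)
                sum += in[y * width + (x + i - r)] * row[i];
            tmp[y * width + x] = sum;
        }

    /* Pass 2: vertical 1D filter on the intermediate image. */
    for (int y = r; y < height - r; y++)
        for (int x = r; x < width - r; x++) {
            float sum = 0.0f;
            for (int j = 0; j < ksize; j++)
                sum += tmp[(y + j - r) * width + x] * col[j];
            out[y * width + x] = sum;
        }

    free(tmp);
    return 0;
}
```

Each output pixel costs about 2*ksize multiply-adds here instead of ksize*ksize, so for a 7x7 kernel that is 14 instead of 49, which matches the KernelSize/2 factor of 3.5.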
GP> Friday, September 15, 2006, 4:04:18 PM, you wrote:
GP> gycu> Hi
GP> gycu> I have previously been using an image processing library called
GP> gycu> Imaq Vision.
GP> gycu> I am now writing my own functions.
GP> gycu> In Imaq there is a convolution function that takes a kernel of
GP> gycu> any size and convolves it with an image.
GP> gycu> For an image of size 1280*1040 and a kernel of 7*7 it does this in
GP> gycu> around 250 milliseconds on my machine.
GP> gycu> My quick test case
GP> gycu> of looping through the kernel values for each pixel in the image is
GP> gycu> much slower.
GP> gycu> i.e.
GP> gycu> int x, y, i, j, out = 0;
GP> gycu> for(y=0; y < image_height; y++)
GP> gycu> for(x=0; x < image_width; x++)
GP> gycu> for(j=0; j < kernel_height; j++)
GP> gycu> for(i=0; i < kernel_width; i++)
GP> gycu> out++;
GP> gycu> printf("%d", out);
GP> gycu> This code takes around 650 milliseconds, and doesn't do anything
GP> gycu> useful.
GP> gycu> Does anyone know how the Imaq convolution code achieves its speed?
GP> gycu> I find it hard to believe that they wrote special cases for each
GP> gycu> possible kernel size to remove the inner loops.
GP> gycu> Also, I know their implementation is not Fourier based, as the image
GP> gycu> requires a border depending on the kernel size.
GP> gycu> Thanks for any help.
GP> --
GP> Best regards,
GP> Alexander mailto:0xef15h@inbox. ru
GP>
--
Best regards,
Alexander mailto:0...@inbox.ru