Re: Re[2]: Fast Convolution

Started by Glenn Pierce September 25, 2006
Ah, after some more research I think they may use MMX.
Now I just have to learn how to use it as well.

Thanks a lot for the help.

Glenn
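
For anyone following the thread, here is a minimal sketch of what this kind of vectorization can look like in C. MMX itself only handles packed integers; for float pixels the matching instructions are SSE, so the sketch uses SSE intrinsics from `<xmmintrin.h>` when the compiler defines `__SSE__` (gcc and clang do on x86-64) and falls back to a plain loop otherwise. The function name `convolve_row_simd` and its parameters are made up for illustration. Nothing here is known to be what Imaq does internally.

```c
#include <stddef.h>
#if defined(__SSE__)
#include <xmmintrin.h>
#endif

/* Hypothetical helper: 1D filtering of one row of n floats.
 * out[x] = sum over i of in[x + i] * k[i], for x = 0 .. n - ksize.
 * out must have room for n - ksize + 1 values. */
void convolve_row_simd(const float *in, float *out, int n,
                       const float *k, int ksize)
{
    int nout = n - ksize + 1;
    int x = 0;

#if defined(__SSE__)
    /* Compute four neighbouring output pixels per iteration. */
    for (; x + 4 <= nout; x += 4) {
        __m128 acc = _mm_setzero_ps();
        for (int i = 0; i < ksize; i++) {
            __m128 px = _mm_loadu_ps(in + x + i); /* 4 input pixels   */
            __m128 kv = _mm_set1_ps(k[i]);        /* kernel tap, x4   */
            acc = _mm_add_ps(acc, _mm_mul_ps(px, kv));
        }
        _mm_storeu_ps(out + x, acc);
    }
#endif

    /* Scalar tail (or the whole row when SSE is not available). */
    for (; x < nout; x++) {
        float sum = 0.0f;
        for (int i = 0; i < ksize; i++)
            sum += in[x + i] * k[i];
        out[x] = sum;
    }
}
```

The same idea extends to a 2D kernel by accumulating over kernel rows as well; the loop structure stays the same, only four pixels are processed per step.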

----- Original Message ----
From: Alexander Osipov <0...@inbox.ru>
To: Glenn Pierce
Sent: Thursday, 21 September, 2006 8:25:48 PM
Subject: Re[2]: [imagedsp] Fast Convolution

Hello Glenn,

OK, I see.
Maybe it is a simple SSE/MMX optimization (vectorization)?
To be sure, you can time this convolution code with different kernel
sizes and see how its speed scales with the kernel size.
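
A rough sketch of such a scalability test in C, using `clock()` from the standard library. The `conv_fn` signature, `conv_direct` and `time_convolution` are hypothetical names for illustration; to test a library routine you would wrap its call in a function with the `conv_fn` signature. The expectation, as an assumption: time growing roughly with KernelSize*KernelSize suggests a plain 2D loop (possibly vectorized), while roughly linear growth suggests a separable implementation.

```c
#include <stdlib.h>
#include <time.h>

/* Hypothetical signature of the routine under test. */
typedef void (*conv_fn)(const float *in, float *out, int width, int height,
                        const float *kernel, int ksize);

/* Plain 2D loop used as a reference; leaves a ksize/2 border untouched. */
void conv_direct(const float *in, float *out, int width, int height,
                 const float *kernel, int ksize)
{
    int r = ksize / 2;
    for (int y = r; y < height - r; y++)
        for (int x = r; x < width - r; x++) {
            float sum = 0.0f;
            for (int j = 0; j < ksize; j++)
                for (int i = 0; i < ksize; i++)
                    sum += in[(size_t)(y + j - r) * width + (x + i - r)]
                         * kernel[j * ksize + i];
            out[(size_t)y * width + x] = sum;
        }
}

/* Returns the CPU seconds taken by one call of fn with a ksize x ksize
 * kernel of ones on a synthetic image, or -1.0 if allocation fails. */
double time_convolution(conv_fn fn, int width, int height, int ksize)
{
    size_t n = (size_t)width * height;
    float *in = malloc(n * sizeof *in);
    float *out = calloc(n, sizeof *out);
    float *kernel = malloc((size_t)ksize * ksize * sizeof *kernel);
    double seconds = -1.0;

    if (in && out && kernel) {
        for (size_t p = 0; p < n; p++)
            in[p] = (float)(p % 255);
        for (int p = 0; p < ksize * ksize; p++)
            kernel[p] = 1.0f;

        clock_t start = clock();
        fn(in, out, width, height, kernel, ksize);
        seconds = (double)(clock() - start) / CLOCKS_PER_SEC;
    }

    free(in);
    free(out);
    free(kernel);
    return seconds;
}
```

Calling `time_convolution(conv_direct, 1280, 1040, k)` for k = 3, 5, 7, 9, 11 and comparing the results against the same calls on the library wrapper shows how each one scales. A single run is noisy, so averaging several runs is advisable.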

Thursday, September 21, 2006, 12:28:08 PM, you wrote:

GP> I thought that was likely too, but I did a test with a non-separable kernel like the one below.
GP> The speed was similar to that of a kernel of all 1's.

GP> static float array[7][7] = {{1.0, 6.0, 1.0, 1.0, 1.0, 1.0, 1.0},
GP> {1.0, 4.0, 1.0, 5.0, -1.0, 6.0, 1.0},
GP> {7.0, 1.0, 3.0, 1.0, 1.0, 1.0, 1.0},
GP> {1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0},
GP> {1.0, 7.0, 1.0, 1.0, 1.0, 1.0, 1.0},
GP> {8.0, 1.0, -1.0, 1.0, 9.0, 1.0, 1.0},
GP> {1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0}};

GP> void Imaq_Convolution (IPIImageRef in, IPIImageRef out)
GP> {
GP> int i, nPoints, fftX, fftY;
GP> IPIConvoDesc matrix;

GP> IPI_Cast (in, IPI_PIXEL_SGL);
GP> IPI_Cast (out, IPI_PIXEL_SGL);

GP> matrix.matrixWidth = 7;
GP> matrix.matrixHeight = 7;
GP> matrix.matrixElements = (float*) &array;
GP> matrix.divider = 48.0;

GP> IPI_Convolute (in, IPI_NOMASK, out, &matrix, IPI_BO_CLEAR);
GP> }

GP> ----- Original Message ----
GP> From: Alexander Osipov <0...@inbox.ru>
GP> To: i...; g...@yahoo.co.uk
GP> Sent: Sunday, 17 September, 2006 6:33:11 PM
GP> Subject: Re: [imagedsp] Fast Convolution

GP> Hello glennpierce2001,

GP> Most probably their code is written as a separable kernel
GP> filter (as a Gaussian blur is).
GP> In this case the convolution is implemented not as one 2D filter but as two 1D
GP> filters.
GP> So instead of the Width*Height*KernelSize*KernelSize operations of the general case
GP> it takes only 2*Width*Height*KernelSize, so you can speed up
GP> a Gaussian blur by a factor of approximately KernelSize/2.
GP> In the first filtering stage you apply the 1D Gaussian kernel horizontally, and
GP> in the second filtering stage you apply the 1D Gaussian kernel vertically.
GP> Additionally, a Gaussian blur (for example) can be optimized further
GP> using recursive filtering (also separable), but you have to use floating-point
GP> calculations to compute this convolution with small error,
GP> so it is often not more efficient.
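
A minimal sketch of the two-pass scheme described above, in C. It assumes the 2D kernel is the outer product of one 1D kernel `k` with itself (as for a Gaussian), a row-major float image, and edge handling by clamping coordinates; the name `convolve_separable` and the clamping choice are illustrative assumptions, not taken from any library. For the 1280*1040 image and 7*7 kernel from this thread, the direct form needs 1280*1040*49 = 65,228,800 multiply-adds and the two-pass form 2*1280*1040*7 = 18,636,800, a factor of 3.5 = KernelSize/2.

```c
#include <stdlib.h>

static int clampi(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Hypothetical helper: separable filtering with the same 1D kernel k
 * (ksize taps, odd) in both directions. Returns 0 on success, -1 if the
 * temporary buffer cannot be allocated. */
int convolve_separable(const float *in, float *out, int width, int height,
                       const float *k, int ksize)
{
    int r = ksize / 2;
    float *tmp = malloc((size_t)width * height * sizeof *tmp);
    if (!tmp)
        return -1;

    /* Stage 1: apply the 1D kernel horizontally, in -> tmp. */
    for (int y = 0; y < height; y++)
        for (int x = 0; x < width; x++) {
            float sum = 0.0f;
            for (int i = 0; i < ksize; i++) {
                int xx = clampi(x + i - r, 0, width - 1);
                sum += in[(size_t)y * width + xx] * k[i];
            }
            tmp[(size_t)y * width + x] = sum;
        }

    /* Stage 2: apply the 1D kernel vertically, tmp -> out. */
    for (int y = 0; y < height; y++)
        for (int x = 0; x < width; x++) {
            float sum = 0.0f;
            for (int i = 0; i < ksize; i++) {
                int yy = clampi(y + i - r, 0, height - 1);
                sum += tmp[(size_t)yy * width + x] * k[i];
            }
            out[(size_t)y * width + x] = sum;
        }

    free(tmp);
    return 0;
}
```

This only applies when the kernel really is separable; an arbitrary 7*7 matrix cannot in general be split this way.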

GP> Friday, September 15, 2006, 4:04:18 PM, you wrote:

GP> gycu> Hi

GP> gycu> I have previously been using an image processing library called Imaq Vision.
GP> gycu> I am now writing my own functions.

GP> gycu> In Imaq there is a convolution function that takes any sized kernel and convolves it with an image.

GP> gycu> For an image of size 1280*1040 and a kernel of 7*7 it does this in around 250 milliseconds on my machine.

GP> gycu> My quick test case of looping through the kernel values for each pixel in the image is much slower, i.e.:

GP> gycu> int x, y, i, j, out = 0;

GP> gycu> for(y=0; y < image_height; y++)
GP> gycu>   for(x=0; x < image_width; x++)
GP> gycu>     for(j=0; j < kernel_height; j++)
GP> gycu>       for(i=0; i < kernel_width; i++)
GP> gycu>         out++;

GP> gycu> printf("%d", out);

GP> gycu> This code takes around 650 milliseconds, and doesn't do anything useful.

GP> gycu> Does anyone know how the Imaq convolution code achieves its speed?
GP> gycu> I find it hard to believe that they wrote special cases for each possible kernel size to remove the inner loops.

GP> gycu> Also, I know their implementation is not Fourier-based, as the image requires a border depending on the kernel size.

GP> gycu> Thanks for any help.
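
For reference, a minimal sketch in C of the direct per-pixel loop the question describes, filled in so that it does real work. It assumes a row-major float image and an odd-sized kernel, and sets output pixels within half a kernel of the edge to zero, which is one way the border requirement mentioned above can arise. The name `convolve2d_direct` and the `divider` parameter (modelled on the `divider` field used elsewhere in this thread) are illustrative. The kernel is applied without flipping, as most image libraries do.

```c
#include <stddef.h>

/* Hypothetical helper: direct 2D filtering, kw x kh kernel (odd sizes),
 * result divided by divider. Border pixels are written as 0. */
void convolve2d_direct(const float *in, float *out, int width, int height,
                       const float *kernel, int kw, int kh, float divider)
{
    int rx = kw / 2, ry = kh / 2; /* kernel radii */

    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            /* The kernel would reach outside the image here. */
            if (y < ry || y >= height - ry || x < rx || x >= width - rx) {
                out[(size_t)y * width + x] = 0.0f;
                continue;
            }

            float sum = 0.0f;
            for (int j = 0; j < kh; j++) {
                /* Pointers to the start of the current image/kernel row
                 * keep the innermost loop to one multiply-add per tap. */
                const float *row = in + (size_t)(y + j - ry) * width + (x - rx);
                const float *krow = kernel + (size_t)j * kw;
                for (int i = 0; i < kw; i++)
                    sum += row[i] * krow[i];
            }
            out[(size_t)y * width + x] = sum / divider;
        }
    }
}
```

Keeping the border test outside the two inner loops matters for speed: the innermost loop then has no branches, which also makes it easier for a compiler to vectorize.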

GP> --
GP> Best regards,
GP> Alexander mailto:0xef15h@inbox.ru

GP>

--
Best regards,
Alexander mailto:0...@inbox.ru