Forums

Fast 2d convolution using FFTW

Started by pegels March 29, 2006
Hi Community!

I've got a problem that has been driving me mad for several days: 
I need a quick 2d convolution of a picture and a big (say 70x70) kernel. I
was thinking of doing it in frequency domain using FFTW (the programme is
in C++).


This is the way I am doing it:

1. Pad the filter to the size of the input image. (padding "all around",
so the "important" part is in the middle)
2. Compute fft of the filter and the image
3. Multiply the two spectra.
4. IFFT

It's one of the simplest thing in the world, but I am still getting
meaningless results.

The image is 630x366. and the kernel I tried was a gaussian with sigma=5 ,
which is much smaller than the image.

If I just do fft and ifft of the image, I get the original image.

The same goes for the kernel, BUT: if I look at the spectrum of the
kernel, it's strange. I tried FFT of a gaussian kernel, which should give
me a gaussian. (just a different "size") but the spectrum seems to contain
many same copies of the gaussian. Is this the mistake? (still, if I take
ifft of that strange spectrum, I get the original gaussian back)

Could you please have a look at the code to see if I'm doing something
wrong??? I am getting really desperate - I even checked if the
multiplication of complex numbers works well :) - it does.

I am using CImg library for work with images.
And I am also using FFTW++, which wraps around the fftw library.
I am doing it in "pure" complex way - but I also tried real2complex and
complex2real transform does the same thing.

    typedef std::complex<double> Complex;
    //temporary images
    CImg<double> mask_fft(input,false);//spectrum of the mask
    CImg<double> mask_ifft(mask,true); //inverse transform of transformed
mask
    CImg<double> image_fft(input,false);//spec.of the image
    CImg<double> image_ifft(input,false);//inverse transform of the
spectrum of image
                                           //multiplied by spec. of the
mask

    CImg<double> image_mask_mul_fft(input,false);//the multiplied spectra

    //mask has been padded - has the same size as the input image
    int size_x = input.dimx();
    int size_y = input.dimy();

    //pointers to memory for fftw
    Complex *in;
    Complex *out, *out_temp;
    
    //just for debugging
    double foo;

    
    //save the input images
    mask.normalize(0,255); //just for viewing
    mask.save("1_mask_spat_norm.bmp");
    input.save("2_image_spat.bmp");

    //now allocate memory blocks for fftw - all complex
    in = (Complex *)fftw_malloc(sizeof(Complex) * (size_x) * (size_y));
    out =  (Complex *) fftw_malloc(sizeof(Complex) * (size_x)*(size_y));
    out_temp =  (Complex *) fftw_malloc(sizeof(Complex) *
(size_x)*(size_y));

    //define transforms - image and mask have the same size, so I can use
the same plan
    fft2d Forward(size_x,size_y,-1,in,out);
    fft2d Backward(size_x,size_y,1,out,in);

    //now fill in the input array with data from input image
    // I have tested it, no mistake here
    for(int i=0;i<size_x;i++){
        for(int j=0;j<size_y;j++){
                foo = input(i,j,0,0); //read first channel from input
image
                in[j* size_x + i] = foo; //put the value in the array
        }
    }

    //transform input image and put value into "out"
    Forward.fft(in,out);

    //now load the mask into the input array
    for(int i=0;i<size_x;i++){
        for(int j=0;j<size_y;j++){
                foo = mask(i,j,0,0); //read data from conv.kernel
                in[j* size_x + i] = foo;
        }
    }

    //now transform the conv.kernel - mask
    Forward.fft(in,out_temp);

    //save the fft of the image


    for(int i=0;i<size_x;i++){
        for(int j=0;j<size_y;j++){
            image_fft(i,j,0,0) = image_fft(i,j,0,1) = image_fft(i,j,0,2) =
std::abs(out[j*size_x + i]);
       }
    }
    image_fft.save("3_im_fft.bmp");

    //save the fft of the filter
    //spectrum of the filter is really strange. but IFFT works, so I don't
know what to think about it.
    for(int i=0;i<size_x;i++){
        for(int j=0;j<size_y;j++){
            mask_fft(i,j,0,0) = mask_fft(i,j,0,1) = mask_fft(i,j,0,2) =
std::abs(out_temp[j*size_x + i]);
       }
    }
    mask_fft.normalize(0,255); //normalize, so that it can be viewed
easily
    mask_fft.save("4_mask_fft_norm.bmp");


    //now I have the spectrum of image and mask - out and out_temp
    //so all I need is to multiply them

//as the spectrum of the filter is ..... strange. the result is a
nonsense
    //now the real filtering - multiply spectra
    //something wrong here :((
    //multiply the two spectra
   //what am I doing wrong?
    for(int i=0;i<size_x;i++){
        for(int j=0;j<size_y;j++){
              out[j* size_x + i] = out[j*size_x + i] * out_temp[j* size_x
+ i];
                //out[j* size_x + i] = out_temp[j* size_x + i];
       }
    }

    //save the multiplied spectrums
    for(int i=0;i<size_x;i++){
        for(int j=0;j<size_y;j++){
                image_mask_mul_fft(i,j,0,0) = 
                    image_mask_mul_fft(i,j,0,1) = 
                        image_mask_mul_fft(i,j,0,2) =
std::abs(out[j*size_x + i]);
       }
    }     
    image_mask_mul_fft.save("5_multiplied spectra.bmp");

    //backward transform of the multiplied spectrums
    Backward.fftNormalized(out,in);
    
    for(int i=0;i<size_x;i++){
        for(int j=0;j<size_y;j++){
            foo =  std::abs(in[j* size_x + i]);
            image_ifft(i,j,0,0) = image_ifft(i,j,0,1) =
image_ifft(i,j,0,2) = foo;
        }
    }

    //just to check - inverse transform of spectrum of filter
    Backward.fftNormalized(out_temp,in);
    for(int i=0;i<size_x;i++){
        for(int j=0;j<size_y;j++){
            foo =  std::abs(in[j* size_x + i]);
            mask_ifft(i,j,0,0) = foo;
        }
    }
    image_ifft.save("6_im_ifft.bmp");
    mask_ifft.save("7_mask_ifft.bmp");
    return image_ifft;

I would really be grateful for any help. 


if you know that the FT of a gaussian is a gaussian, then why do fft of 
kernel?
just multiply the fft of image with invented gaussian in fourier domain, 
then IFFT

all you are doing is lo pass filtering - so you are only selecting the low 
frequency components of image right?

so you are selecting all fourier components a certain distance from origin 
(dc spatial frequency), and throwing away rest, 'apodising' higher frequency 
components so you don't get ringing...

what you need to work out is output from the FFTW program - where is origin?
also zero padding always stinks - you may have to smooth abrupt transistion 
into zeros, otherwise get ringing convolved into image.

A trick for getting origin right is to premultiply with a comb (-1)^(x+y)... 
(alternate the sign of each pixel, both horizontally and vertically) on both 
sides of fft

good luck



An FFT performs a discrete Fourier transform (DFT), not a Fourier
transform.  The DFT of a single Gaussian (a) doesn't give you exactly a
Gaussian and (b) gives you a periodic sequence of almost-Gaussians.  In
particular, there should be an almost-Gaussian centered at each corner
of your image.  See also Q3.10 in the FFTW FAQ about the origin in the
frequency domain.

Cordially,
Steven G. Johnson

Hi!
Thanks a lot for quick anwer.

Actually, there was the oldest mistake ever... Wrong indexing.