Hello! I have been trying to implement an image in image search algorithm using fft, as described in the dspguide, chapter 24 (http://www.dspguide.com/), using FFT Convolution. As far as I have understood it, the search image is used as some kind of a filter (like sharpen, etc.) on the original image (in which I search) in order to give the resulting image a distribution of similarity of the image one is searching for at that position. (Bright values = the image you are searching for fits this part of the image you search in very well.) I hope this is kind of understandable.. Anyway, this is the algorithm as I have understood it, and I have been trying to implement/copy it, but in vain: 1. embed the 2 images (large: image to search in, small: image to search in the large one) into 2 large images, the edge-size of which is to be a power of 2 and larger than the sum of the 2 original image-edge-sizes The smaller image is being embedded after having been rotated by 180 degrees. (is the restriction on the size of the edges necessary??) 2. fill the rest of the new images with 0s (black) 3. do a 2d fft on both images 4. multiply them, small * large (ordering is important, isnt it?) 5.do a 2d ifft on the product 6. this should be the desired image/matrix (high values imply a high similarity between the small image and the large image at that position) (I have complex numbers here.. they are not an image anymore. Do I just take the real part?) Unfortunately, all I get out of this is rubbish.. and I dont rather know, why. I have tried a few different possibilities, like offsetting the image (256 grayscale) values, so that they are between -128 and 127, and a few other things, but nothing seems to work really well. If you have any ideas about what I am doing wrong, what I should be doing, or where I could find additional information about the matter, please tell me!! Many thx in advance, Andreas

# convolution with fft for image in image search/similarity

Started by ●September 30, 2003

Reply by ●September 30, 20032003-09-30

Andreas, The method you're using seems OK, although I'm not sure about the reversing one image by 180 degrees. One thing you might need to do is to pre-whiten the images. The link: http://www.itee.uq.edu.au/~kootsoop/prewhiteningexample.htm shows an example of trying to find the singer's nose using correlation. It doesn't work very well unless you do some sort of pre-whitening (like doing a row-wise difference, as a simple example). Ciao, Peter K. -- Peter J. Kootsookos "I will ignore all ideas for new works [..], the invention of which has reached its limits and for whose improvement I see no further hope." - Julius Frontinus, c. AD 84

Reply by ●September 30, 20032003-09-30

In article <blcgkv$2in$1@nets3.rz.RWTH-Aachen.DE>, "Andreas" <reiff(remove_me)@amo.de> wrote: | Anyway, this is the algorithm as I have understood it, and I have been | trying to implement/copy it, but in vain: | 1. embed the 2 images (large: image to search in, small: image to search in | the large one) into 2 large images, the edge-size of which is to be a power | of 2 and larger than the sum of the 2 original image-edge-sizes You can cut this down to be the power of 2 greater or equal to the larger image size, unless you don't want "wrap-around" correlations (matches which begin at the right edge and wrap around to the left, or wrap bottom to top). | The smaller image is being embedded after having been rotated by 180 | degrees. Apparently the rotation by 180 degrees is to reverse the search image from top-to-bottom and left-to-right. This allows the subsequent multiplication in the frequency domain to perform correlation instead of convolution, but usually it is much easier to just perform the multiplication by the complex conjugate of the frequency domain components of the search image (no reversal / rotation required). | 4. multiply them, small * large (ordering is important, isnt it?) No, the order is not important. Multiplication is commutative. | 5.do a 2d ifft on the product | 6. this should be the desired image/matrix (high values imply a high | similarity between the small image and the large image at that position) (I | have complex numbers here.. they are not an image anymore. Do I just take | the real part?) At this point you should have only real components (the imaginary part should be zero, or very close to zero). The steps you state sound correct, so there appears to be something wrong in your implementation. Verify that: 1) the result of both forward FFTs are conjugate symmetric (real components of first half are mirrored in the second half, and imaginary components are mirrored and negated). 2) The result of the multiplication of the two FFTs is also conjugate symmetric. -- Tim Olson

Reply by ●October 1, 20032003-10-01

Many thanks for both your answers! It was actually my own stupidity (or the strange ideas of the programmers of Matlab). I had never used the program before and assumed when writing M1 * M2 it would do like I have been told in endless mathematics lectures. But of course, it didnt, and instead I had to use the .* operator. By the way, multiplication of matrices is associative but not communitative! This doesnt effect this application too much, because it doesnt matter which is the input and which the convolution kernel, but in general, results should be different: 1 0 * 2 3 = 2 3 0 0 0 0 0 0 2 3 * 1 0 = 2 0 0 0 0 0 0 0 (This actually is the normal multiplication.. in a program like Matlab I would have to use .* as I know now.) Just for correctness ;-) Anyway, thx a lot for all your help, I have been despairing over this for many, many days, and without you telling me I was on the right way I probably would have given up!!