Forums

Image correlation via FFT. Why doesn't this work?

Started by typewriter April 20, 2007
Hello,

I have a program that performs FFT convolution to search
for occurrences of a small image inside another image.

I feel like I understand the concept, but my program doesn't
do what it should.  The small search image is simply cut out from the
larger image, so I should see a bright spike in the resultant FFT
convolution.  But I am seeing a very ambiguous result.

I created a web page with pictures and my fairly basic source code, if
anyone has the desire to take a look.. I'd really appreciate it.

The web page is located at: http://viz.nu/fft/

Thanks very much in advance,

Kristopher


_____________________________________
Do you know a company who employs DSP engineers?  
Is it already listed at http://dsprelated.com/employers.php ?
typewriter said the following on 20/04/2007 23:36:
> Hello, > > I have a program that performs FFT convolution to search > for occurrences of a small image inside another image. > > I feel like I understand the concept, but my program doesn't > do what it should. The small search image is simply cut out from the > larger image, so I should see a bright spike in the resultant FFT > convolution. But I am seeing a very ambiguous result. > > I created a web page with pictures and my fairly basic source code, if > anyone has the desire to take a look.. I'd really appreciate it. > > The web page is located at: http://viz.nu/fft/
I tried your images in MATLAB, I get a very similar end result to you, but with a very prominent peak near the bottom right. One thing to try is to first remove the DC offset from both your input images (i.e. by subtracting the mean in the spatial domain). -- Oli
Oli, Thanks for looking at this.  I will try the DC offset removal
you mentioned.  I wonder if you can answer a couple questions I have?

1. If the original image is less than 512x512, and I am zero padding
to 512x512, then what does any correlation information mean that
falls outside the original image boundaries?

2. Is there a good way to scale the generated correlation "image"
for viewing (ie, [0-1] grayscale)? I have tried linear scaling, 
and messing around with log scaling. Maybe the scaling I am 
using to make the correlation more visible is hurting my results.

Thanks for any more information, I really do appreciate it.


Kristopher


>typewriter said the following on 20/04/2007 23:36: >> Hello, >> >> I have a program that performs FFT convolution to search >> for occurrences of a small image inside another image. >> >> I feel like I understand the concept, but my program doesn't >> do what it should. The small search image is simply cut out from the >> larger image, so I should see a bright spike in the resultant FFT >> convolution. But I am seeing a very ambiguous result. >> >> I created a web page with pictures and my fairly basic source code, if >> anyone has the desire to take a look.. I'd really appreciate it. >> >> The web page is located at: http://viz.nu/fft/ > > >I tried your images in MATLAB, I get a very similar end result to you, >but with a very prominent peak near the bottom right. > >One thing to try is to first remove the DC offset from both your input >images (i.e. by subtracting the mean in the spatial domain). > > >-- >Oli >
_____________________________________ Do you know a company who employs DSP engineers? Is it already listed at http://dsprelated.com/employers.php ?
On Apr 23, 3:34 pm, "typewriter" <kristopher.coll...@gmail.com> wrote:
> Oli, Thanks for looking at this. I will try the DC offset removal > you mentioned. I wonder if you can answer a couple questions I have? > > 1. If the original image is less than 512x512, and I am zero padding > to 512x512, then what does any correlation information mean that > falls outside the original image boundaries? > > 2. Is there a good way to scale the generated correlation "image" > for viewing (ie, [0-1] grayscale)? I have tried linear scaling, > and messing around with log scaling. Maybe the scaling I am > using to make the correlation more visible is hurting my results. > > Thanks for any more information, I really do appreciate it. > > Kristopher > > > > > > >typewriter said the following on 20/04/2007 23:36: > >> Hello, > > >> I have a program that performs FFT convolution to search > >> for occurrences of a small image inside another image. > > >> I feel like I understand the concept, but my program doesn't > >> do what it should. The small search image is simply cut out from the > >> larger image, so I should see a bright spike in the resultant FFT > >> convolution. But I am seeing a very ambiguous result. > > >> I created a web page with pictures and my fairly basic source code, if > >> anyone has the desire to take a look.. I'd really appreciate it. > > >> The web page is located at:http://viz.nu/fft/ > > >I tried your images in MATLAB, I get a very similar end result to you, > >but with a very prominent peak near the bottom right. > > >One thing to try is to first remove the DC offset from both your input > >images (i.e. by subtracting the mean in the spatial domain). > > >-- > >Oli > > _____________________________________ > Do you know a company who employs DSP engineers? > Is it already listed athttp://dsprelated.com/employers.php?- Hide quoted text - > > - Show quoted text -
Do you necessarily expect the maximum result when you get alignment? May not happen even for an ideally extracted smaller image. You might try normalizing the energy in the parts of the images used to compute the result for each 2D offset (obviously constant for the smaller one, but offset-dependent for the larger image). Also, the possibility for a "bright spike" is highly data dependent. Best to hope for a maximum, rather than a spike. Dirk
>Also, the possibility for a "bright spike" is highly data dependent. >Best to hope for a maximum, rather than a spike.
I was thinking the same thing, so I began finding the maximum. However my results show that the maximum is not always in the "correct" place. For *some* images (including the example Lichtenstein image on my page viz.nu/fft), the program indicates a maximum at the *correct* point, and for others it does not. I tried using the football/soccer images on this page : http://www.gamedev.net/reference/programming/features/imageproc/page3.asp Where the author is performing the exact same method I am. However, not only do I not match his white peak, but my maximum is in the wrong place (it falls in the man's shirt). I whipped up another web page to show what's happening : http://www.viz.nu/fft/page2.html . I've gone over my functions many times, and do not see anything suspicious. Thanks very much for your help guys. I wouldn't be posting if I hadn't done so much on my own to try and get this. So your help is greatly appreciated! Kristopher
>Do you necessarily expect the maximum result when you get alignment? >May not happen even for an ideally extracted smaller image. You might >the result for each 2D offset (obviously constant for the smaller one, >but offset-dependent for the larger image). > > >Dirk > >
>On Apr 23, 3:34 pm, "typewriter" <kristopher.coll...@gmail.com> wrote: >> Oli, Thanks for looking at this. I will try the DC offset removal >> you mentioned. I wonder if you can answer a couple questions I have? >> >> 1. If the original image is less than 512x512, and I am zero padding >> to 512x512, then what does any correlation information mean that >> falls outside the original image boundaries? >> >> 2. Is there a good way to scale the generated correlation "image" >> for viewing (ie, [0-1] grayscale)? I have tried linear scaling, >> and messing around with log scaling. Maybe the scaling I am >> using to make the correlation more visible is hurting my results. >> >> Thanks for any more information, I really do appreciate it. >> >> Kristopher >> >> >> >> >> >> >typewriter said the following on 20/04/2007 23:36: >> >> Hello, >> >> >> I have a program that performs FFT convolution to search >> >> for occurrences of a small image inside another image. >> >> >> I feel like I understand the concept, but my program doesn't >> >> do what it should. The small search image is simply cut out from
the
>> >> larger image, so I should see a bright spike in the resultant FFT >> >> convolution. But I am seeing a very ambiguous result. >> >> >> I created a web page with pictures and my fairly basic source code,
if
>> >> anyone has the desire to take a look.. I'd really appreciate it. >> >> >> The web page is located at:http://viz.nu/fft/ >> >> >I tried your images in MATLAB, I get a very similar end result to
you,
>> >but with a very prominent peak near the bottom right. >> >> >One thing to try is to first remove the DC offset from both your
input
>> >images (i.e. by subtracting the mean in the spatial domain). >> >> >-- >> >Oli >> >> _____________________________________ >> Do you know a company who employs DSP engineers? >> Is it already listed athttp://dsprelated.com/employers.php?- Hide
quoted text -
>> >> - Show quoted text - >
_____________________________________ Do you know a company who employs DSP engineers? Is it already listed at http://dsprelated.com/employers.php ?
I have uploaded an interactive Java applet of my program.  Sometimes the
peak value corresponds to the correct search location, other times it does
not.  So strange!!!

http://www.viz.nu/fft/


>>Also, the possibility for a "bright spike" is highly data dependent. >>Best to hope for a maximum, rather than a spike. > >I was thinking the same thing, so I began finding the maximum. However
my
>results show that the maximum is not always in the "correct" place. For >*some* images (including the example Lichtenstein image on my page >viz.nu/fft), the program indicates a maximum at the *correct* point, and >for others it does not. > >I tried using the football/soccer images on this page : >http://www.gamedev.net/reference/programming/features/imageproc/page3.asp >Where the author is performing the exact same method I am. However, not >only do I not match his white peak, but my maximum is in the wrong place >(it falls in the man's shirt). I whipped up another web page to show >what's happening : http://www.viz.nu/fft/page2.html . > >I've gone over my functions many times, and do not see anything >suspicious. > >Thanks very much for your help guys. I wouldn't be posting >if I hadn't done so much on my own to try and get this. So >your help is greatly appreciated! > >Kristopher > > >
_____________________________________ Do you know a company who employs DSP engineers? Is it already listed at http://dsprelated.com/employers.php ?
On 21 Apr, 00:36, "typewriter" <kristopher.coll...@gmail.com> wrote:
> Hello, > > I have a program that performs FFT convolution to search > for occurrences of a small image inside another image. > > I feel like I understand the concept, but my program doesn't > do what it should. The small search image is simply cut out from the > larger image, so I should see a bright spike in the resultant FFT > convolution. But I am seeing a very ambiguous result.
First: Correlation methods only looks for "similar" results, not "the same." So if the texture of your image is similar across, you might find partial matches all over, meaning that the actual match does not stand as clearly out as you might have hoped for. As for correlaion: I am not sure is is the correct method to use. Correlation works for zero-mean signals, as found in radar, sonar and communication applications. Images are not zero mean, so I am wondering if covariance would be better. I would try this: - Select your target sub-frame and remove its mean - Select a sub-frame in your base image and remove is mean - Compute the covariance coefficient between the two sub-frames as a double sum c(u,v) = sum_u sum_v x(n-u,m-v)y(n-u,m-v) The reason for doing this the hard way is that the mean has to be removed from each sub-frame independently. To see why removing the mean is important, consider the signals x1 = [1 0]' y1 = [0 1]' which is a cosine and sine sampled at 0 ad pi/2. The two are orthogonal, as expected: c1 = x1'y2 = 1*0 + 0*1 = 0. If we add a non-zero mean to either one, the story changes: x2 = x1 + 10 c2 = x2'y1 = (1+10)*0 + (0+10)*1 = 10. So if the mean varies significantly between subframes in the image, correlation does not work. If you want to use covariance on a frame-by-frame basis, you have to remove the mean of each particular sub-frame, and FFT-based methods can no longer be used. Rune
>>I have a program that performs FFT convolution to search >> for occurrences of a small image inside another image. >> Online Applet : http://www.viz.nu/fft/
Rune said:
>First: Correlation methods only looks for "similar" results, not "the >same." So if the texture of your image is similar across, you might >find partial matches all over, meaning that the actual match does >not stand as clearly out as you might have hoped for.
Right, but if the small image is cut out directly from the larger one, the convolution should generate a maximum at the match point(however unclear it may be to the naked eye). Also, I *want* to see "similar" areas, but have the "best" match appear as the maximum, as I understand this method to work. (see: http://www.dspguide.com/ch24/6.htm )
>As for correlaion: I am not sure is is the correct method to use. >Correlation works for zero-mean signals, as found in radar, sonar
Perhaps I'm not performing "textbook" correlation. Correct, images are not zero mean, but it should still work. This method has been used succesfully by others, ( see : http://www.gamedev.net/reference/programming/features/imageproc/page3.asp ). I could subtract the mean, but doing so for each subframe would really slow it down. I am using the same test images as the URL above, with which the author sucessfully demonstrated this concept. Although those images are among the ones that my program fails on! But it *does* work with others.. _____________________________________ Do you know a company who employs DSP engineers? Is it already listed at http://dsprelated.com/employers.php ?
On Apr 24, 3:41 pm, "typewriter" <kristopher.coll...@gmail.com> wrote:
> >Also, the possibility for a "bright spike" is highly data dependent. > >Best to hope for a maximum, rather than a spike. > > I was thinking the same thing, so I began finding the maximum. However my > results show that the maximum is not always in the "correct" place. For > *some* images (including the example Lichtenstein image on my page > viz.nu/fft), the program indicates a maximum at the *correct* point, and > for others it does not. > > I tried using the football/soccer images on this page :http://www.gamedev.net/reference/programming/features/imageproc/page3... > Where the author is performing the exact same method I am. However, not > only do I not match his white peak, but my maximum is in the wrong place > (it falls in the man's shirt). I whipped up another web page to show > what's happening :http://www.viz.nu/fft/page2.html. > > I've gone over my functions many times, and do not see anything > suspicious. > > Thanks very much for your help guys. I wouldn't be posting > if I hadn't done so much on my own to try and get this. So > your help is greatly appreciated! > > Kristopher > > > > > > > > >Do you necessarily expect the maximum result when you get alignment? > >May not happen even for an ideally extracted smaller image. You might > >the result for each 2D offset (obviously constant for the smaller one, > >but offset-dependent for the larger image). > > >Dirk > > >On Apr 23, 3:34 pm, "typewriter" <kristopher.coll...@gmail.com> wrote: > >> Oli, Thanks for looking at this. I will try the DC offset removal > >> you mentioned. I wonder if you can answer a couple questions I have? > > >> 1. If the original image is less than 512x512, and I am zero padding > >> to 512x512, then what does any correlation information mean that > >> falls outside the original image boundaries? > > >> 2. Is there a good way to scale the generated correlation "image" > >> for viewing (ie, [0-1] grayscale)? I have tried linear scaling, > >> and messing around with log scaling. Maybe the scaling I am > >> using to make the correlation more visible is hurting my results. > > >> Thanks for any more information, I really do appreciate it. > > >> Kristopher > > >> >typewriter said the following on 20/04/2007 23:36: > >> >> Hello, > > >> >> I have a program that performs FFT convolution to search > >> >> for occurrences of a small image inside another image. > > >> >> I feel like I understand the concept, but my program doesn't > >> >> do what it should. The small search image is simply cut out from > the > >> >> larger image, so I should see a bright spike in the resultant FFT > >> >> convolution. But I am seeing a very ambiguous result. > > >> >> I created a web page with pictures and my fairly basic source code, > if > >> >> anyone has the desire to take a look.. I'd really appreciate it. > > >> >> The web page is located at:http://viz.nu/fft/ > > >> >I tried your images in MATLAB, I get a very similar end result to > you, > >> >but with a very prominent peak near the bottom right. > > >> >One thing to try is to first remove the DC offset from both your > input > >> >images (i.e. by subtracting the mean in the spatial domain). > > >> >-- > >> >Oli > > >> _____________________________________ > >> Do you know a company who employs DSP engineers? > >> Is it already listed athttp://dsprelated.com/employers.php?-Hide > quoted text - > > >> - Show quoted text - > > _____________________________________ > Do you know a company who employs DSP engineers? > Is it already listed athttp://dsprelated.com/employers.php?- Hide quoted text - > > - Show quoted text -
Try doing the energy normalization (actually the root of it IIRC) on the two images that I mentioned earlier. Then all values should be in range [-1.0,1.0]. Compute something along the lines of a correlation coefficient (name might be something similar) for each possible alignment. There is an easy way (FFT based filtering) to compute the energy for each point by starting with the larger image with all values squared and a smaller image (same size as the actual smaller image) composed entirely of a constant value (you can figure out the scaling) so that the right squared values are summed for each offset. In order to test your complete algorithm I would extract a smaller image from a section of the larger image so the ideal result matching result is possible and predictable. If you don't get the ideal result, you have errors in your code. Dirk
On 25 Apr, 23:58, "typewriter" <kristopher.coll...@gmail.com> wrote:
> I could subtract the mean, but doing so for each subframe would really > slow it down.
Ah. So you would rather do things fast but incorrectly than do things correctly? I know, the FFT is a cute tool, but it is *not* a silver bullet! There *are* DSP appliations where the FFT can't be used.
> I am using the same test images as the URL above, with which the author > sucessfully demonstrated this concept. Although those images are among > the ones that my program fails on! But it *does* work with others..
If you try the same methods on the same data as were published but achieve different results, there are three possibilities: 1) You don't apply the same algorithm as was published. 2) What is published is not what was done. 3) The data have changed between test and publication, as might be a side effect of images being compressed for web publication. Of these options, item 1) is the one you can work with. Make sure you understand the algorithm and make sure you implement it correctly. Only when you are 100% sure of that, and still can't reproduce the published results, can you start thinking about the possibility of items 2) or 3) being relevant. Rune