comp.dsp | Again trouble with cross correlation

Hello group,

thanks to your help I was able to solve my last problem (zero
padding). It works now, but one small problem remains: when my
object moves by (x, y) pixels, detection works as long as x and y are
not too small. For a movement of (10, 7), for example, it works
brilliantly. But when the movement is (-1, -1), (-1, 0), (-1, 1),
(0, -1), (0, 1), (1, -1) or (1, 0), it always detects a false
positive maximum at (0, 0). These 7 cases are really the only thing
not working. As soon as the movement is (2, 0) or (2, 1) it works
exactly. I have looked at the images and it really is as I say: when
I move by (-1, -1), for example, and normalize my cross-correlated
(CCX) picture to values from 0 to 1000, there is a pixel with value
1000 (my false positive maximum) at (0, 0) and a pixel with, say,
950 at (-1, -1).

How do I get around this? I really need to accurately see small
movement too. I read something about dividing by the local variance
but did not see a formula or anything and do not actually know if this
would help my case. Can anyone think of something?

Thank you very much!
Greetings,
Frederick
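
For reference, "dividing by the local variance" most likely refers to
the normalized cross-correlation used in template matching. A common
textbook form is sketched below. The symbols are assumptions, not
taken from the post: f is the image, t the template (the object), \bar t
the template mean, and \bar f_{u,v} the mean of f over the region covered
by the template when it is placed at (u, v). Both sums run over that
region.

```latex
\gamma(u,v) =
\frac{\sum_{x,y}\bigl[f(x,y)-\bar f_{u,v}\bigr]\,\bigl[t(x-u,\,y-v)-\bar t\bigr]}
     {\sqrt{\sum_{x,y}\bigl[f(x,y)-\bar f_{u,v}\bigr]^{2}\;
            \sum_{x,y}\bigl[t(x-u,\,y-v)-\bar t\bigr]^{2}}}
```

By the Cauchy-Schwarz inequality \gamma lies in [-1, 1], so a bright or
high-contrast region cannot outscore a true match merely by having
larger pixel values. Whether this removes the false maximum at (0, 0)
depends on the data.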

Reply by Rune Allnor ● July 14, 2009

On 14 Jul, 21:35, Frederick Mengin <FrederickMen...@gmx.de> wrote:
> Hello group,
>
> thanks to your help I was able to get my last problem solved (zero
> padding). Now it works, only small problem is still there: When my
> object moves x, y pixels it works when x, y is not too small. For
> example 10, 7 movement, it works brilliant. But when movement is -1,
> -1 or -1, 0 or -1, 1 or 0, -1 or 0, 1 or 1, -1 or 1, 0 it always
> detects a false positive maximum at 0, 0. These 7 cases are really the
> only thing not working.

So to get the overview: The cases that don't work are:

(-1 -1)
(-1  0)
(-1  1)
( 0 -1)
( 0  1)
( 1 -1)
( 1  0)

What is different with (1,1)? All the others represent
movements along or across edges in the image, while
the (1,1) movement represents a movement towards the
interior of the image. Could this be an edge effect?
Is the target area located too close to the edge of
the image? Are there wrap-around effects that are not
accounted for?

> As soon as movement is 2, 0 or 2, 1 it works
> exactly. I have looked at the images and it really is so as I say:
> When I move -1, -1 for example and I normalize my CCXed picture to
> have values from 0-1000 then there is a pixel with value 1000 (my
> false positive maximum) in 0, 0 and a pixel with say 950 in -1, -1.
>
> How do I get around this? I really need to accurately see small
> movement too. I read something about dividing by the local variance
> but did not see a formula or anything and do not actually know if this
> would help my case. Can anyone think of something?

The one thing I would suggest is to use the
cross *covariance*, not the cross correlation.
Then test with perfect data, that is, noise-free
simulated data.

Rune

Reply by Frederick Mengin ● July 14, 2009

Hello Rune,

On 14 Jul., 22:02, Rune Allnor <all...@tele.ntnu.no> wrote:
> So to get the overview: The cases that don't work are:
>
> (-1 -1)
> (-1  0)
> (-1  1)
> ( 0 -1)
> ( 0  1)
> ( 1 -1)
> ( 1  0)
>
> What is different with (1,1)?

Actually nothing - it was my fault :-( The (1, 1) case doesn't work
either, just like the others above. I am sorry for this mistake.

> All the others represent
> movements along or across edges in the image, while
> the (1,1) movement represents a movement towards the
> interior of the image. Could this be an edge effect?
> Is the target area located too close to the edge of
> the image? Are there wrap-around effects that are not
> accounted for?

It cannot be a wrap-around effect as the image is quite big (640x480)
compared to the object (about 5x5) and the object is dead center.

> > As soon as movement is 2, 0 or 2, 1 it works
> > exactly. I have looked at the images and it really is so as I say:
> > When I move -1, -1 for example and I normalize my CCXed picture to
> > have values from 0-1000 then there is a pixel with value 1000 (my
> > false positive maximum) in 0, 0 and a pixel with say 950 in -1, -1.
>
> > How do I get around this? I really need to accurately see small
> > movement too. I read something about dividing by the local variance
> > but did not see a formula or anything and do not actually know if this
> > would help my case. Can anyone think of something?
>
> The one thing I would suggest is to use the
> cross *covariance*, not the cross correlation.
> Then test with perfect data, that is, noise-free
> simulated data.

I did as you instructed and tested with noise-free data: voila, it
yielded perfect results. So could it be that, because so much of the
noise stays invariant (i.e. is translated by (0, 0)), this gives me a
false positive? However, I cannot simply ignore (0, 0) because
sometimes the image does not move at all and I need to detect this
case. Your tip about using the cross covariance - how would I do
that? From what I've read so far, I cannot distinguish between cross
covariance and cross correlation, as both seem to be
\int f^*(t) g(x + t) dt (so my operation would also be
IFFT(FFT(F) * FFT(G)'), where ' is the complex conjugate). Could you
clarify this?

Thank you for your help!
Greetings,
Frederick

Reply by Rune Allnor ● July 14, 2009

On 14 Jul, 23:06, Frederick Mengin <FrederickMen...@gmx.de> wrote:

> I did as you instructed and tested with noise-free data: voila, it
> yielded perfect results. So could it be that because so many noise
> stays invariant (e.g. translated (0, 0)) this gives me a false
> positive?

Yep.

> However I cannot simply ignore (0, 0) because sometimes the
> image does not move at all and I need to detect these case. Your tip
> using the cross covariance - how would I do that? From what I've read
> so far, I cannot distinguish between cross covariance and cross
> correlation, as both seem to be \int f^*(t) g(x + t) dt (so my
> operation would also be IFFT(FFT(F) * FFT(G)') where ' is the complex
> conjugate. Could you clearify this?

Subtract the mean from both pictures:

Cgf = IFFT(FFT(F-mean(F)) * FFT(G-mean(G))');

Rune
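
To make the difference concrete, here is a minimal sketch of the cross
covariance computed directly, without an FFT. It assumes that the `*`
in the one-liner above is an element-wise product and that mean() is
the global mean over all pixels. `Image`, `SubtractMean`,
`CircularCrossCovariance` and `PeakShift` are hypothetical names
introduced for illustration. They are not part of anyone's code in
this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for an image class: img[row][col].
using Image = std::vector<std::vector<double>>;

// Return a copy of img with its global mean removed.
Image SubtractMean(const Image& img) {
    assert(!img.empty() && !img[0].empty());
    double sum = 0.0;
    for (const auto& row : img)
        for (double v : row) sum += v;
    const double mean = sum / static_cast<double>(img.size() * img[0].size());
    Image out = img;
    for (auto& row : out)
        for (double& v : row) v -= mean;
    return out;
}

// Circular cross covariance:
//   c[dy][dx] = sum over (r, col) of f0[r][col] * g0[(r+dy) mod H][(col+dx) mod W]
// with f0, g0 the mean-removed images. This is the direct form of what the
// FFT expression evaluates, up to the sign convention chosen for the lag.
Image CircularCrossCovariance(const Image& f, const Image& g) {
    assert(!f.empty() && f.size() == g.size() && f[0].size() == g[0].size());
    const Image f0 = SubtractMean(f), g0 = SubtractMean(g);
    const std::size_t H = f.size(), W = f[0].size();
    Image c(H, std::vector<double>(W, 0.0));
    for (std::size_t dy = 0; dy < H; ++dy)
        for (std::size_t dx = 0; dx < W; ++dx)
            for (std::size_t r = 0; r < H; ++r)
                for (std::size_t col = 0; col < W; ++col)
                    c[dy][dx] += f0[r][col] * g0[(r + dy) % H][(col + dx) % W];
    return c;
}

// Locate the maximum and map the wrapped index to a signed shift (dy, dx).
std::pair<int, int> PeakShift(const Image& c) {
    assert(!c.empty() && !c[0].empty());
    const int H = static_cast<int>(c.size());
    const int W = static_cast<int>(c[0].size());
    int by = 0, bx = 0;
    for (int y = 0; y < H; ++y)
        for (int x = 0; x < W; ++x)
            if (c[y][x] > c[by][bx]) { by = y; bx = x; }
    return { by > H / 2 ? by - H : by, bx > W / 2 ? bx - W : bx };
}
```

One thing worth knowing: for a purely circular correlation, removing
the global means only subtracts the same constant (N * mean(F) *
mean(G), with N the number of pixels) from every lag, so the position
of the maximum cannot change. The means only affect the peak location
when they are removed before zero padding, because then the correction
depends on the lag.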

Reply by Frederick Mengin ● July 14, 2009

Hello Rune,

On 14 Jul., 23:20, Rune Allnor <all...@tele.ntnu.no> wrote:
> On 14 Jul, 23:06, Frederick Mengin <FrederickMen...@gmx.de> wrote:
>
> > I did as you instructed and tested with noise-free data: voila, it
> > yielded perfect results. So could it be that because so many noise
> > stays invariant (e.g. translated (0, 0)) this gives me a false
> > positive?
>
> Yep.

Yikes :-(

> > However I cannot simply ignore (0, 0) because sometimes the
> > image does not move at all and I need to detect these case. Your tip
> > using the cross covariance - how would I do that? From what I've read
> > so far, I cannot distinguish between cross covariance and cross
> > correlation, as both seem to be \int f^*(t) g(x + t) dt (so my
> > operation would also be IFFT(FFT(F) * FFT(G)') where ' is the complex
> > conjugate. Could you clearify this?
>
> Subtract the mean from both pictures:
>
> Cgf = IFFT(FFT(F-mean(F)) * FFT(G-mean(G))');

Ok, I tried this - but it did not change the problem. Some things I'm
thinking about: Maybe just looking for the "hottest" pixel is not such
a good idea after all - could it help to also look at the surrounding
pixels to determine the center? Are there algorithms for doing this?
I also thought about using a crude workaround: Just ignore (0, 0) when
looking for the hottest pixel - then also subtract the images and take
the L2 norm. If it is below a certain threshold the images are equal
(i.e. not translated at all), otherwise the result from the CCX is
correct. However I am reluctant to do it this way because it would
be kind of a botch (I do not know if this is the correct word, I just
typed it into dict.cc - I'm looking for "Pfusch").

Thank you again for your help!
Greetings,
Frederick
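
The crude workaround described above could be sketched roughly as
follows. Everything here is an assumption made for illustration: the
`Image` type, the names `DifferenceNorm` and `EstimateShift`, and the
FFT-style layout of the correlation surface (zero lag at index [0][0],
negative lags wrapped to the far end of each axis).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an image class: img[row][col].
using Image = std::vector<std::vector<double>>;

struct Shift { int dy; int dx; };

// L2 norm of the pixel-wise difference of two equally sized images.
double DifferenceNorm(const Image& a, const Image& b) {
    assert(!a.empty() && a.size() == b.size() && a[0].size() == b[0].size());
    double sum = 0.0;
    for (std::size_t r = 0; r < a.size(); ++r)
        for (std::size_t c = 0; c < a[r].size(); ++c) {
            const double d = a[r][c] - b[r][c];
            sum += d * d;
        }
    return std::sqrt(sum);
}

// If the two frames differ by less than threshold, report no motion.
// Otherwise return the largest correlation value other than the zero-lag bin.
Shift EstimateShift(const Image& a, const Image& b, const Image& corr,
                    double threshold) {
    assert(!corr.empty() && !corr[0].empty());
    if (DifferenceNorm(a, b) < threshold) return {0, 0};
    const int H = static_cast<int>(corr.size());
    const int W = static_cast<int>(corr[0].size());
    int by = -1, bx = -1;
    for (int y = 0; y < H; ++y)
        for (int x = 0; x < W; ++x) {
            if (y == 0 && x == 0) continue;  // skip the zero-lag bin
            if (by < 0 || corr[y][x] > corr[by][bx]) { by = y; bx = x; }
        }
    assert(by >= 0);  // corr must contain more than one bin
    // Map wrapped indices to signed shifts.
    return { by > H / 2 ? by - H : by, bx > W / 2 ? bx - W : bx };
}
```

With noisy frames the difference norm of two unmoved images is not
zero, so the threshold would have to sit above the noise level and
below the norm produced by a one-pixel move of the object. Whether
such a gap exists for a 5x5 object in a 640x480 image would need to be
checked on real data.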

Reply by Rune Allnor ● July 14, 2009

On 14 Jul, 23:46, Frederick Mengin <FrederickMen...@gmx.de> wrote:

> > Subtract the mean from both pictures:
>
> > Cgf = IFFT(FFT(F-mean(F)) * FFT(G-mean(G))');
>
> Ok, I tried this - but it did not change the problem.

Then the next thing to try is to compute the *local*
covariance across the image. This means that you can
no longer use the FFT, but have to do a convolution.

In pseudocode:

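%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% x: image (M-by-N)
% y: (2L+1)-by-(2L+1) mask
L = (size(y,1)-1)/2;
[M,N] = size(x);
cxy = zeros(M,N);
y0 = y - mean(y(:));   % mask with its mean removed
% Skip the L-pixel border, where the window would leave the image
for n = L+1:N-L
   for m = L+1:M-L
      z = x(m-L:m+L,n-L:n+L);
      cxy(m,n) = sum(sum((z-mean(z(:))).*y0));
   end
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%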

In plain English: For every (m,n), extract the
(2L+1)*(2L+1) part of the image that is covered
by the mask y into a variable z. Subtract the mean
from this sub image and the mask y. Compute the
correlation coefficient cxy(m,n) as the inner
product between z and y, both with means subtracted.

I think this will work, but it might take a lot
longer to run. But more often than not, slow algorithms
that actually work take precedence over fast algorithms
that don't.

Rune
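
A C++ rendering of the pseudocode above might look like the sketch
below. It is only a sketch under assumptions: `Image` and
`LocalCovariance` are hypothetical names, the mask y is assumed to be
a template of the object (for instance cropped from the reference
frame, which the thread does not specify), and the `normalize` option,
which divides by the local standard deviations, goes beyond what the
pseudocode does.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an image class: img[row][col].
using Image = std::vector<std::vector<double>>;

// Local covariance between image x and a (2L+1)x(2L+1) mask y.
// Border pixels where the window does not fit are left at 0.
// With normalize == true each value is divided by the square root of the
// product of the window and mask energies, giving a coefficient in [-1, 1].
Image LocalCovariance(const Image& x, const Image& y, bool normalize) {
    assert(!x.empty() && !y.empty());
    const std::size_t M = x.size(), N = x[0].size(), K = y.size();
    assert(K % 2 == 1 && y[0].size() == K && M >= K && N >= K);
    const std::size_t L = K / 2;
    const double count = static_cast<double>(K * K);

    // Mean-removed mask and its energy (sum of squares).
    Image y0 = y;
    double ymean = 0.0;
    for (const auto& row : y)
        for (double v : row) ymean += v;
    ymean /= count;
    double yenergy = 0.0;
    for (auto& row : y0)
        for (double& v : row) { v -= ymean; yenergy += v * v; }

    Image cxy(M, std::vector<double>(N, 0.0));
    for (std::size_t m = L; m + L < M; ++m) {
        for (std::size_t n = L; n + L < N; ++n) {
            // Mean of the window z centred on (m, n).
            double zmean = 0.0;
            for (std::size_t i = 0; i < K; ++i)
                for (std::size_t j = 0; j < K; ++j)
                    zmean += x[m - L + i][n - L + j];
            zmean /= count;
            // Inner product of mean-removed window and mean-removed mask.
            double cov = 0.0, zenergy = 0.0;
            for (std::size_t i = 0; i < K; ++i)
                for (std::size_t j = 0; j < K; ++j) {
                    const double dz = x[m - L + i][n - L + j] - zmean;
                    cov += dz * y0[i][j];
                    zenergy += dz * dz;
                }
            const double denom = std::sqrt(zenergy * yenergy);
            cxy[m][n] = !normalize ? cov : (denom > 0.0 ? cov / denom : 0.0);
        }
    }
    return cxy;
}
```

A side observation: because the mean-removed mask sums to zero, the
unnormalized value sum((z-mean(z)).*(y-mean(y))) equals
sum(z.*(y-mean(y))), so subtracting the local mean of z does not by
itself change the result. It is the division by the local energy in
the normalized variant that makes each window count independently of
its brightness and contrast.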

Reply by Frederick Mengin ● July 14, 2009

On 15 Jul., 00:18, Rune Allnor <all...@tele.ntnu.no> wrote:
> On 14 Jul, 23:46, Frederick Mengin <FrederickMen...@gmx.de> wrote:
>
> > > Subtract the mean from both pictures:
>
> > > Cgf = IFFT(FFT(F-mean(F)) * FFT(G-mean(G))');
>
> > Ok, I tried this - but it did not change the problem.
>
> Then the next thing to try is to compute the *local*
> covariance across the image. This means that you can
> no longer use the FFT, but have to do a convolution.
>
> In pseudocode:
>
> %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
> x: Image
> y: (2L+1)*(2L+1) mask
>
> [M,N] = size(x)
> cxy = zeros(M,N)
> for n=1:N
>    for m=1:M
>       z = x(m-L:m+L,n-L:n+L);
>       cxy(m,n) = sum(sum((z-mean(z)).*(y-mean(y))));
>    end
> end
> %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

OK, here is how I adapted this in (not yet working) C++ code:

const unsigned int L = 5;
Picture y(2 * L + 1, 2 * L + 1);
/* How to initialize y? */
y -= y.Avg();

Picture cxy(N, M);
cxy.Clear();

/* Cutoff at L because stencil z will be undefined otherwise */
for (unsigned int n = L; n < cxy.GetWidth() - L; n++) {
	for (unsigned int m = L; m < cxy.GetHeight() - L; m++) {
		Picture z(2 * L + 1, 2 * L + 1);
		z.CopyFrom(x, m - L, n - L);
		z -= z.Avg();
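		/* Use i, j here: x and y would shadow the image and the mask */
		for (unsigned int i = 0; i < z.GetWidth(); i++) {
			for (unsigned int j = 0; j < z.GetHeight(); j++) {
				/* Accumulate into the output pixel of this window */
				cxy(n, m) += z(i, j) * y(i, j);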
			}
		}
	}
}

> In plain English: For every (m,n), extract the
> (2L+1)*(2L+1) part of the image that is covered
> by the mask y into a variable z. Subtract the mean
> from this sub image and the mask y. Compute the
> correlation coefficient cxy(m,n) as the inner
> product between z and y, both with means subtracted.

The problem is (I commented in the code also): How do I initialize y,
the mask? What kind of stencil is appropriate here? Also: What do I do
in the regions where I cannot fully copy from x into z (e.g. from 1 to
L-1 and from Width - L to Width)? Just skip over them? Did I get the
basic idea right at all?

> I think this will work, but it might take a lot
> longer to run. But more often than not, slow algorithms
> that actually work take presedence over fast algorithms
> that don't.

Yes, if it worked that'd be awesome, no matter how long it takes
(O(n^2) instead of O(n log n)); the main thing is that the result is
correct.

Thanks again,
Greetings,
Frederick
