comp.dsp | Speed up of 2D Lorentzian filter?

Hi all,

I'm looking for a way to speed up a special image reconstruction 
algorithm. [*]

Actually, the filter is very simple linear filter. In essence, the image 
is Fourier transformed, multiplied by

	1/(1+A*(u^2+v^2))

(u, v are the conjugate coordinates of x,y) and transformed back. 
Realistic images can be as large as 4096x4096, so with padding an 8kx8k 
FFT is performed on several thousand images, which takes hours even on 
powerful machines.

I am relatively ignorant of classic filter design, but looking at this 
transfer functions, I have the feeling that it should be possible to do 
using some kind of recursive filter. For instance, the 1D-Fourier 
transform of a Lorientzian is an exponential exp(-|x|) - therefore, in 
the 1D case, I think this would be a simple exponential smoothing filter 
- isn't it? Is there any way to design an IIR filter that operates 
similarly in 2D, and is rotationally symmetric?

Best regards,

	Christian



[*] The algorithm is used to correct X-ray images taken at synchrotrons 
where the X-rays are deflected from the straight line due to refraction 
and the details can be found here:

https://imagej.nih.gov/ij/plugins/ankaphase/ankaphase-userguide.html#tth_sEc1.3

Reply by Evgeny Filatov ●February 4, 20172017-02-04

On 04.02.2017 15:21, Christian Gollwitzer wrote:
> Hi all,
>
> I'm looking for a way to speed up a special image reconstruction
> algorithm. [*]
>
> Actually, the filter is very simple linear filter. In essence, the image
> is Fourier transformed, multiplied by
>
>      1/(1+A*(u^2+v^2))
>
> (u, v are the conjugate coordinates of x,y) and transformed back.
> Realistic images can be as large as 4096x4096, so with padding an 8kx8k
> FFT is performed on several thousand images, which takes hours even on
> powerful machines.
>
> I am relatively ignorant of classic filter design, but looking at this
> transfer functions, I have the feeling that it should be possible to do
> using some kind of recursive filter. For instance, the 1D-Fourier
> transform of a Lorientzian is an exponential exp(-|x|) - therefore, in
> the 1D case, I think this would be a simple exponential smoothing filter
> - isn't it? Is there any way to design an IIR filter that operates
> similarly in 2D, and is rotationally symmetric?
>
> Best regards,
>
>      Christian
>
>
>
> [*] The algorithm is used to correct X-ray images taken at synchrotrons
> where the X-rays are deflected from the straight line due to refraction
> and the details can be found here:
>
> https://imagej.nih.gov/ij/plugins/ankaphase/ankaphase-userguide.html#tth_sEc1.3
>

I don't know anything about 2D filtering. Would just comment anyway, 
because it's fun.

With the 1D transfer function exp(-|x|), it won't be a single 
exponential smoothing filter. More like, a sum of two exponential 
smoothing filters running in opposite directions.

That sort of highlights the problem you'd have with the 2D case. Suppose 
you know the 2D transfer function as f(r), with the parameter r = 
sqrt(x^2 + y^2). [I don't know the analytical answer for f(r), but even 
if there's no analytical form you could still find a suitable 
approximation or set f(r) up as a table.] It looks now that if you want 
to calculate the answer in spatial domain (as a 2D filter), it would be 
the sum of low-pass smoothing filters propagating in all possible 
directions.

What if you don't do it for all directions, but just for a few of them, 
like 8? I'd expect that you'd get some approximation to the desired 
answer, where what would supposed to be a circle appears as an octagon.

However, even if that works for you, I highly doubt that such an 
approach would outperform the 2D FFT, because the latter is highly 
optimized. In terms of complexity, it's hard to beat O(N*log(N)). My 
approach would be still O(N). However, you'd need the matrix of an 
incredible size to worry about log(N). ;-)

Gene

Reply by Evgeny Filatov ●February 5, 20172017-02-05

On 05.02.2017 4:43, Evgeny Filatov wrote:
> On 04.02.2017 15:21, Christian Gollwitzer wrote:
>> Hi all,
>>
>> I'm looking for a way to speed up a special image reconstruction
>> algorithm. [*]
>>
>> Actually, the filter is very simple linear filter. In essence, the image
>> is Fourier transformed, multiplied by
>>
>>      1/(1+A*(u^2+v^2))
>>
>> (u, v are the conjugate coordinates of x,y) and transformed back.
>> Realistic images can be as large as 4096x4096, so with padding an 8kx8k
>> FFT is performed on several thousand images, which takes hours even on
>> powerful machines.
>>
>> I am relatively ignorant of classic filter design, but looking at this
>> transfer functions, I have the feeling that it should be possible to do
>> using some kind of recursive filter. For instance, the 1D-Fourier
>> transform of a Lorientzian is an exponential exp(-|x|) - therefore, in
>> the 1D case, I think this would be a simple exponential smoothing filter
>> - isn't it? Is there any way to design an IIR filter that operates
>> similarly in 2D, and is rotationally symmetric?
>>
>> Best regards,
>>
>>      Christian
>>
>>
>>
>> [*] The algorithm is used to correct X-ray images taken at synchrotrons
>> where the X-rays are deflected from the straight line due to refraction
>> and the details can be found here:
>>
>> https://imagej.nih.gov/ij/plugins/ankaphase/ankaphase-userguide.html#tth_sEc1.3
>>
>>
>
> I don't know anything about 2D filtering. Would just comment anyway,
> because it's fun.
>
> With the 1D transfer function exp(-|x|), it won't be a single
> exponential smoothing filter. More like, a sum of two exponential
> smoothing filters running in opposite directions.
>
> That sort of highlights the problem you'd have with the 2D case. Suppose
> you know the 2D transfer function as f(r), with the parameter r =
> sqrt(x^2 + y^2). [I don't know the analytical answer for f(r), but even
> if there's no analytical form you could still find a suitable
> approximation or set f(r) up as a table.] It looks now that if you want
> to calculate the answer in spatial domain (as a 2D filter), it would be
> the sum of low-pass smoothing filters propagating in all possible
> directions.
>
> What if you don't do it for all directions, but just for a few of them,
> like 8? I'd expect that you'd get some approximation to the desired
> answer, where what would supposed to be a circle appears as an octagon.
>
> However, even if that works for you, I highly doubt that such an
> approach would outperform the 2D FFT, because the latter is highly
> optimized. In terms of complexity, it's hard to beat O(N*log(N)). My
> approach would be still O(N). However, you'd need the matrix of an
> incredible size to worry about log(N). ;-)
>
> Gene
>

I do have a question, though. You said you are using zero padding. But 
nothing in the algorithm description says it's necessary. Actually you 
do not lose any information by not using zero padding. Whatever you get 
by zero padding could be obtained by interpolation after the FFT. So you 
could sort of, perform the ffts without zero padding, and then, if you 
need to, run a simple interpolation filter.

Anyway, I just wanted to draw your attention to the fact that zero 
padding shouldn't be considered an absolute must, but rather a feature 
the use of which needs to be justified.

Gene

Reply by dbd ●February 5, 20172017-02-05

On Sunday, February 5, 2017 at 3:15:52 AM UTC-8, Evgeny Filatov wrote:
> On 05.02.2017 4:43, Evgeny Filatov wrote:
> ...
> 
> Anyway, I just wanted to draw your attention to the fact that zero 
> padding shouldn't be considered an absolute must, but rather a feature 
> the use of which needs to be justified.
> 
> Gene

Zero padding is required to achieve linear convolution instead of circular convolution which would alias the opposite edges together. This fact is recognized in the algorithm description in the OP's reference.

Dale B. Dalrymple

Reply by Evgeny Filatov ●February 5, 20172017-02-05

On 05.02.2017 20:44, dbd wrote:
> On Sunday, February 5, 2017 at 3:15:52 AM UTC-8, Evgeny Filatov wrote:
>> On 05.02.2017 4:43, Evgeny Filatov wrote:
>> ...
>>
>> Anyway, I just wanted to draw your attention to the fact that zero
>> padding shouldn't be considered an absolute must, but rather a feature
>> the use of which needs to be justified.
>>
>> Gene
>
> Zero padding is required to achieve linear convolution instead of circular convolution which would alias the opposite edges together. This fact is recognized in the algorithm description in the OP's reference.
>
> Dale B. Dalrymple
>

Oops! Indeed, you are correct, zero padding is addressed in OP's link 
and there's a good reason to use it.

Still, there doesn't seem to be any particular reason to make the 
margins that large. Perhaps it was a matter of convenience to make all 
dimensions a power of two, but it also looks like lame design. There are 
FFT routines with different radixes, so it looks like "a degree of 
freedom" which is ignored.

Gene

Reply by Tim Wescott ●February 5, 20172017-02-05

On Sat, 04 Feb 2017 13:21:21 +0100, Christian Gollwitzer wrote:

> Hi all,
> 
> I'm looking for a way to speed up a special image reconstruction
> algorithm. [*]
> 
> Actually, the filter is very simple linear filter. In essence, the image
> is Fourier transformed, multiplied by
> 
> 	1/(1+A*(u^2+v^2))
> 
> (u, v are the conjugate coordinates of x,y) and transformed back.
> Realistic images can be as large as 4096x4096, so with padding an 8kx8k
> FFT is performed on several thousand images, which takes hours even on
> powerful machines.
> 
> I am relatively ignorant of classic filter design, but looking at this
> transfer functions, I have the feeling that it should be possible to do
> using some kind of recursive filter. For instance, the 1D-Fourier
> transform of a Lorientzian is an exponential exp(-|x|) - therefore, in
> the 1D case, I think this would be a simple exponential smoothing filter
> - isn't it? Is there any way to design an IIR filter that operates
> similarly in 2D, and is rotationally symmetric?
> 
> Best regards,
> 
> 	Christian
> 
> 
> 
> [*] The algorithm is used to correct X-ray images taken at synchrotrons
> where the X-rays are deflected from the straight line due to refraction
> and the details can be found here:
> 
> https://imagej.nih.gov/ij/plugins/ankaphase/ankaphase-
userguide.html#tth_sEc1.3

I don't think you can get there from here unless you're willing to trade 
off circular symmetry for speed.

You could implement a filter with transfer function 
1/((1 + Au^2)(1 + Av^2)) by filtering each row individually, then each 
column -- but I don't think that's really what you're looking for.

-- 

Tim Wescott
Wescott Design Services
http://www.wescottdesign.com

I'm looking for work -- see my website!

Reply by Christian Gollwitzer ●February 6, 20172017-02-06

Am 06.02.17 um 04:35 schrieb Tim Wescott:
> On Sat, 04 Feb 2017 13:21:21 +0100, Christian Gollwitzer wrote:
>> https://imagej.nih.gov/ij/plugins/ankaphase/ankaphase-
> userguide.html#tth_sEc1.3
>
> I don't think you can get there from here unless you're willing to trade
> off circular symmetry for speed.
>
> You could implement a filter with transfer function
> 1/((1 + Au^2)(1 + Av^2)) by filtering each row individually, then each
> column -- but I don't think that's really what you're looking for.

Thanks! I can see it now, that the circular symmetry is impossible to 
achieve with simple 1D filtering. The Gaussian is a very special case, 
because it can be factored into two 1D-Gaussians (and there the IIR 
filter beats the FFT method easily). I can't see a way how to factor the 
Lorentzian into two 1D functions, probably it's not possible.

The thing is that the rest of the tomographic reconstruction runs on the 
GPU - using OpenCL I could get enormous speed ups and outperform a 
machine with 36 CPU cores by a factor of 2. This phase contrast filter, 
however, will be tough. FFTs on the GPU are not much faster than on the 
CPU according to some references, due to the irregular memory access 
pattern. Therefore I was hoping to get an improvement on that end. I'll 
have to do the FFT on the GPU to see how long it really takes.

Best regards,

	Christian

Reply by Evgeny Filatov ●February 6, 20172017-02-06

On 06.02.2017 10:47, Christian Gollwitzer wrote:
> The thing is that the rest of the tomographic reconstruction runs on the
> GPU - using OpenCL I could get enormous speed ups and outperform a
> machine with 36 CPU cores by a factor of 2. This phase contrast filter,
> however, will be tough. FFTs on the GPU are not much faster than on the
> CPU according to some references, due to the irregular memory access
> pattern. Therefore I was hoping to get an improvement on that end. I'll
> have to do the FFT on the GPU to see how long it really takes.
>
> Best regards,
>
>      Christian

With the current generation of GPUs the one real bottleneck is getting 
the data in and out of the GPU memory. If the entire processing is done 
on the GPU, you should also get some tangible gains with FFTs:

https://www.mathworks.com/matlabcentral/fileexchange/34080-gpubench

Regardless of the hardware, it should be possible to speed up your 
algorithm if you curtail the FFT size by reducing the margins. The exact 
size might depend on the filter parameters, but quadrupling the FFT size 
(as you suggested in the OP) might be unnecessary.

Gene

Reply by Tim Wescott ●February 6, 20172017-02-06

On Mon, 06 Feb 2017 08:47:47 +0100, Christian Gollwitzer wrote:

> Am 06.02.17 um 04:35 schrieb Tim Wescott:
>> On Sat, 04 Feb 2017 13:21:21 +0100, Christian Gollwitzer wrote:
>>> https://imagej.nih.gov/ij/plugins/ankaphase/ankaphase-
>> userguide.html#tth_sEc1.3
>>
>> I don't think you can get there from here unless you're willing to
>> trade off circular symmetry for speed.
>>
>> You could implement a filter with transfer function 1/((1 + Au^2)(1 +
>> Av^2)) by filtering each row individually, then each column -- but I
>> don't think that's really what you're looking for.
> 
> Thanks! I can see it now, that the circular symmetry is impossible to
> achieve with simple 1D filtering. The Gaussian is a very special case,
> because it can be factored into two 1D-Gaussians (and there the IIR
> filter beats the FFT method easily). I can't see a way how to factor the
> Lorentzian into two 1D functions, probably it's not possible.
> 
> The thing is that the rest of the tomographic reconstruction runs on the
> GPU - using OpenCL I could get enormous speed ups and outperform a
> machine with 36 CPU cores by a factor of 2. This phase contrast filter,
> however, will be tough. FFTs on the GPU are not much faster than on the
> CPU according to some references, due to the irregular memory access
> pattern. Therefore I was hoping to get an improvement on that end. I'll
> have to do the FFT on the GPU to see how long it really takes.

Something just popped out of my way-back machine and into my head:

I have a DSP book that was handed to me in the 1980's, and it was old 
then.  In the chapter on the DFT, they discuss a bunch of different FFT 
algorithms, including one that has the order of operations arranged so 
that you can store the input, the intermediate results, and the output 
all on one (or maybe two) tapes.

Yes, tapes.  I'm not so old to have fond memories of them, but I'm just 
old enough to have seen the glow in my slightly-older peers.

It occurs to me that -- assuming you want to pursue things that hard, and 
if memory accesses in a line is indeed the issue -- you might be able to 
use the same concept in a GPU.

"Digital Signal Processing", Alan V. Oppenheim and Ronald W. Shafer, 
Prentice-Hall, 1975, ISBN 0-13-214635-5:

Figure 6.14 on page 301, titled "Rearrangement of fig 6.10 having the 
same geometry for each stage, thereby permitting sequential data 
accessing and storage."

If you can't find any references shoot me an email and I'll see if I can 
copy the relevant pages for you.

Hope this helps...

-- 

Tim Wescott
Wescott Design Services
http://www.wescottdesign.com

I'm looking for work -- see my website!

Reply by Tim Wescott ●February 6, 20172017-02-06

On Mon, 06 Feb 2017 14:02:09 +0300, Evgeny Filatov wrote:

> On 06.02.2017 10:47, Christian Gollwitzer wrote:
>> The thing is that the rest of the tomographic reconstruction runs on
>> the GPU - using OpenCL I could get enormous speed ups and outperform a
>> machine with 36 CPU cores by a factor of 2. This phase contrast filter,
>> however, will be tough. FFTs on the GPU are not much faster than on the
>> CPU according to some references, due to the irregular memory access
>> pattern. Therefore I was hoping to get an improvement on that end. I'll
>> have to do the FFT on the GPU to see how long it really takes.
>>
>> Best regards,
>>
>>      Christian
> 
> With the current generation of GPUs the one real bottleneck is getting
> the data in and out of the GPU memory. If the entire processing is done
> on the GPU, you should also get some tangible gains with FFTs:
> 
> https://www.mathworks.com/matlabcentral/fileexchange/34080-gpubench
> 
> Regardless of the hardware, it should be possible to speed up your
> algorithm if you curtail the FFT size by reducing the margins. The exact
> size might depend on the filter parameters, but quadrupling the FFT size
> (as you suggested in the OP) might be unnecessary.
> 
> Gene

Maybe do an overlap-and-add?  It should be just as possible with 2D FFT 
as one, I think.

-- 

Tim Wescott
Wescott Design Services
http://www.wescottdesign.com

I'm looking for work -- see my website!

Previous12 Next

Speed up of 2D Lorentzian filter?

Sign in

You might also like...

Search forums

Free PDF Downloads

Blogs - Hall of Fame

Discussion Groups

Quick Links

About DSPRelated.com

Social Networks

The Related Media Group