> I read your post several times and I'm still not sure what it is that you
> actually want to do. It seems that you're trying to take existing HRIR and
> HRTF data and interpolate/extrapolate to build a "spherical response" that
> shows a reasonable approximation of HRTF in full 3D... is that it?
Exactly! As a first step, that is what I want to do: to generalize a (very)
discretely sampled HRIR sphere to a continuous one. Sorry if that was not
clear from my description; I'm not a native speaker.
These HRIRs could then be used not only to create virtual sound originating
from every point on the sphere, but also, by adjusting volume and delay, from
points closer or farther away, where the difference in angle between the ears
is larger or smaller, respectively.
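The volume-and-delay adjustment described here can be sketched as follows. This is a minimal sketch, not the tool's actual code: the function name, the speed of sound, and the inverse-distance gain law are my assumptions, and near-field changes to the HRTF shape itself are ignored.

```python
import numpy as np

def adjust_for_distance(hrir, r_measured, r_target, fs, c=343.0):
    """Emulate a different source distance by scaling the level (1/r law)
    and shifting the impulse response by the extra travel time.
    Near-field changes to the HRTF shape itself are ignored."""
    gain = r_measured / r_target                 # inverse-distance amplitude
    extra_delay = (r_target - r_measured) / c    # seconds of extra travel
    shift = int(round(extra_delay * fs))         # whole samples only
    if shift >= 0:
        return np.concatenate([np.zeros(shift), gain * hrir])
    return gain * hrir[-shift:]                  # closer: drop leading samples
```

A fractional-delay filter would be needed for sub-sample accuracy; the whole-sample shift is enough to show the idea.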
Maarten
Reply by Jeff Brower ● April 4, 2010
Maarten-
I read your post several times and I'm still not sure what it is that you
actually want to do. It seems that you're trying to take existing HRIR and
HRTF data and interpolate/extrapolate to build a "spherical response" that
shows a reasonable approximation of HRTF in full 3D... is that it?
-Jeff
Reply by maartendeprez ● April 4, 2010
Hi everyone,
I'm quite new to DSP, and I am programming a tool to optimize surround music
for listening on headphones, stereo speaker systems, etcetera. (Such tools
probably already exist, but I couldn't find any free software that does what I
want.) The best HRIR measurements I could find are those of the Listen project,
and with their help I reached some, to my mind, astonishing results that
encourage me to delve a little deeper into this subject.
The project, however, provides impulse responses in stereo pairs for a very
limited set of angles - enough when you want to place the virtual speakers on a
circle around the listener, but not nearly sufficient if you want more freedom.
To remove this constraint, I figured that the stereo HRTF pairs should be
decoupled (thanks to
http://alumnus.caltech.edu/~franko/thesis/Chapter4.html#sub4), and that it
would be useful to interpolate for the missing directions.
My tool currently uses the existing HRTFs and upsamples them using SRC to match
the input audio. Convolution is done in the frequency domain with the help of
FFTW's incredibly fast Fourier transforms. I still have some questions -
though it appears to work quite well - about the deconvolution needed for
headphone or stereo equipment (including crossfeed) cancellation, but that is
not what I would like to address in this post.
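For readers unfamiliar with the frequency-domain approach, a minimal sketch of linear convolution via the FFT, using numpy.fft in place of FFTW (the function name is mine; zero-padding to the full output length prevents circular wrap-around):

```python
import numpy as np

def fft_convolve(signal, hrir):
    """Linear convolution via the FFT: zero-pad both inputs to the full
    output length so the circular convolution cannot wrap around."""
    n = len(signal) + len(hrir) - 1
    nfft = 1 << (n - 1).bit_length()             # next power of two
    spectrum = np.fft.rfft(signal, nfft) * np.fft.rfft(hrir, nfft)
    return np.fft.irfft(spectrum, nfft)[:n]
```

To binauralize one source, each ear gets its own HRIR: `out_l = fft_convolve(mono, hrir_left)` and likewise for the right. For long inputs a real tool would use overlap-add with block-wise FFTs rather than one big transform.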
To interpolate the HRIRs in order to provide one for every point on the sphere
(except perhaps for those below -40 degrees of elevation, as the Listen project
measured no IRs below that point), and combining this with the sample-rate
upsampling, three-dimensional interpolation would be needed: in the sample rate
as well as in the azimuth and elevation directions. So, in order to gain some
experience with upsampling, I started out with image resampling.
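As a rough baseline for the azimuth/elevation part, a bilinear blend of the four nearest measured HRIRs is often used. This sketch assumes a regular azimuth/elevation grid (which the measured set is not at all elevations), and all names are hypothetical; note also that blending raw HRIRs smears the interaural delay, so time-aligning the responses before blending usually works better.

```python
import numpy as np

def bilinear_hrir(grid, az_step, el_step, az, el):
    """Blend the four measured HRIRs surrounding (az, el) bilinearly.
    grid[i_el][i_az] is the HRIR measured at elevation i_el * el_step and
    azimuth i_az * az_step; azimuth wraps around, elevation is clamped."""
    n_el, n_az = len(grid), len(grid[0])
    fa, fe = az / az_step, el / el_step
    ia = int(np.floor(fa))                          # azimuth index (wraps)
    ie = min(max(int(np.floor(fe)), 0), n_el - 2)   # elevation row, clamped
    wa, we = fa - ia, fe - ie                       # fractional weights
    h00 = grid[ie][ia % n_az]
    h01 = grid[ie][(ia + 1) % n_az]
    h10 = grid[ie + 1][ia % n_az]
    h11 = grid[ie + 1][(ia + 1) % n_az]
    return ((1 - wa) * (1 - we) * h00 + wa * (1 - we) * h01
            + (1 - wa) * we * h10 + wa * we * h11)
```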
The most straightforward way to do this is by placing the frequency spectrum of
the smaller image into that of the larger image (possibly with Lanczos
windowing, i.e. linear shading at the sides, to lessen ringing effects), and
performing the inverse Fourier transform. The equivalent in the image pixel
domain is convolution with a sinc function. Theoretically, this works as long
as the original signal is bandlimited.
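In one dimension, the spectrum-padding upsampler described here can be sketched like this, assuming a real, periodic, bandlimited input and an integer factor (the function name is mine):

```python
import numpy as np

def spectral_upsample(x, factor):
    """Upsample a real periodic signal by an integer factor by zero-padding
    its spectrum - the frequency-domain equivalent of sinc interpolation."""
    n = len(x)
    X = np.fft.rfft(x)
    X_padded = np.zeros(factor * n // 2 + 1, dtype=complex)
    X_padded[:len(X)] = X
    if n % 2 == 0:
        X_padded[n // 2] *= 0.5   # the old Nyquist bin is split in half
    return np.fft.irfft(X_padded, factor * n) * factor
```

The original samples reappear exactly at every `factor`-th output position; the final multiplication compensates for the inverse transform's 1/N normalization.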
The difficulty now is that the sampling of the HRIRs is not equidistant in the
azimuth direction, though in this case, for once, it *is* periodic. (For those
wondering why the distances between the samples differ: the impulse responses
were recorded at increments of 15 degrees around the center of the head, not
centered on each ear. This also means that the distances from the ears are not
all exactly the same.) Additionally, as the measurements are for a sphere, the
azimuth increment decreases at higher elevations. These peculiarities would
seem to rule out the use of Fourier transforms.
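The geometry in the parenthetical can be made concrete: for a source on a head-centered circle, the distance and angle seen from one ear differ from the head-centered values. A 2-D top-view sketch, with the radius and ear offset as nominal assumed values:

```python
import numpy as np

def ear_relative(az_deg, r=1.95, ear_offset=0.09):
    """Distance and azimuth of a source as seen from one ear, for a source
    on a head-centred circle of radius r (2-D top view; x = toward that
    ear, y = front). r and ear_offset are nominal, assumed values."""
    az = np.radians(az_deg)
    src = np.array([r * np.sin(az), r * np.cos(az)])  # head-centred source
    ear = np.array([ear_offset, 0.0])                 # one ear, off-centre
    d = src - ear
    return np.hypot(*d), np.degrees(np.arctan2(d[0], d[1]))
```

A frontal source (azimuth 0) is slightly farther from the ear than r and sits at a small non-zero ear-relative angle, which is exactly why equal head-centered azimuth steps are unequal steps as seen from the ear.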
The sinc trick, on the other hand, could still be used in this case. My
question, then, is whether this would be warranted - whether it would still be
in line with the sampling theorem to upsample using a sinc convolution with
non-equidistant samples, or whether, perhaps, a slightly or wholly different
function should be used. Furthermore, it would be nice to do it in the
frequency domain, though I'm not sure that would be computationally more
efficient when only a few HRIRs, rather than a full upsampled spectrum, are
needed.
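For a periodic sequence, the discrete analogue of the sinc is the Dirichlet kernel, and the sinc trick amounts to evaluating that interpolant at arbitrary angles. The sketch below (function name is mine, and each "sample" is one scalar, e.g. a single HRIR tap) is exact only for an odd number of equidistant input angles; applied to a non-equidistant grid it becomes precisely the kind of approximation the question is about.

```python
import numpy as np

def periodic_sinc_interp(samples, az_in_deg, az_out_deg):
    """Evaluate a periodic band-limited interpolant at arbitrary angles.
    Exact only when az_in_deg holds an odd number of equidistant angles."""
    n = len(samples)
    out = np.zeros(len(az_out_deg))
    for i, s in enumerate(samples):
        delta = np.radians(np.asarray(az_out_deg) - az_in_deg[i])
        # Dirichlet kernel: the periodic analogue of the sinc function
        near_zero = np.isclose(np.sin(delta / 2), 0.0)
        denom = np.where(near_zero, 1.0, n * np.sin(delta / 2))
        kernel = np.where(near_zero, 1.0, np.sin(n * delta / 2) / denom)
        out += s * kernel
    return out
```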
Beyond that, one could ask whether this upsampling would be of any real-world
use. At the very least, the volume difference should be taken into account when
using the HRTFs to emulate farther-off sounds, but it is far from certain that
the result would closely resemble the actual HRTF as it would be recorded from
that distance. For points on the sphere at the same distance at which the
original HRTFs were recorded, however, I suppose it has a good chance of
working reasonably well. And I guess, more than anything else, it's just that
I'm in for another challenge.