DSPRelated.com
Forums

Contribution of amplitude and phase information to human perception of a timbre

Started by Ross September 23, 2009
Dear all,

Does anyone know of any research into the roles of phase and amplitude
of frequency domain representations of sound in terms of human
perception of timbres?

In images, you can take the Fourier transform of two images. You then
use the amplitude information from one image, and the phase
information from the other. The image that results from the inverse
Fourier transform of this mixed data looks pretty strange as you'd
expect, but you see more of the image the phase information came from
than the other, suggesting that in images phase information dominates
over amplitude information.
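
The magnitude/phase swap described above can be sketched in a few lines of NumPy. This is only an illustration: the helper names `swap_magnitude_phase` and `correlation` are made up for this example, and seeded random arrays stand in for the two images (with real photographs you would load two equally sized grayscale arrays instead).

```python
import numpy as np

def swap_magnitude_phase(img_mag, img_phase):
    """Combine the Fourier magnitude of img_mag with the Fourier phase of img_phase."""
    spec_mag = np.fft.fft2(img_mag)
    spec_phase = np.fft.fft2(img_phase)
    hybrid_spec = np.abs(spec_mag) * np.exp(1j * np.angle(spec_phase))
    # Both inputs are real, so the hybrid spectrum is conjugate-symmetric and
    # the imaginary part of the inverse transform is only rounding error.
    return np.real(np.fft.ifft2(hybrid_spec))

def correlation(a, b):
    """Pearson correlation between two equally sized arrays."""
    return np.corrcoef(a.ravel(), b.ravel())[0, 1]

# Stand-in "images": seeded random arrays (an assumption for this sketch).
rng = np.random.default_rng(0)
img_a = rng.random((32, 32))   # donates the magnitudes
img_b = rng.random((32, 32))   # donates the phases

hybrid = swap_magnitude_phase(img_a, img_b)
print("correlation with magnitude donor:", correlation(hybrid, img_a))
print("correlation with phase donor:    ", correlation(hybrid, img_b))
```

Comparing the two correlations is a crude numerical proxy for "which image do you see more of"; it says nothing about perception as such.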

My wild guess is that for static audio timbres the opposite is true,
but I would very much like to check this out properly. Any ideas/
references/pointers?

I'm guessing that I'm probably asking in the wrong group, but don't
know where to ask. Any recommendations of other places to ask would be
greatly appreciated.
On 23 Sep, 13:20, Ross <rossclem...@gmail.com> wrote:
> Dear all,
>
> Does anyone know of any research into the roles of phase and amplitude
> of frequency domain representations of sound in terms of human
> perception of timbres.
From http://en.wikipedia.org/wiki/Timbre:

" Timbre has been called ... "the psychoacoustician's
multidimensional wastebasket category for everything that
cannot be qualified as pitch or loudness." "

Rune
Ross wrote:
> Dear all,
>
> Does anyone know of any research into the roles of phase and amplitude
> of frequency domain representations of sound in terms of human
> perception of timbres.
>
> In images, you can take the Fourier transform of two images. You then
> use the amplitude information from one image, and the phase
> information from the other. The image that results from the inverse
> Fourier transform of this mixed data looks pretty strange as you'd
> expect, but you see more of the image the phase information came from
> than the other, suggesting that in images phase information dominates
> over amplitude information.
>
> My wild guess is that for static audio timbres the opposite is true,
> but I would very much like to check this out properly. Any ideas/
> references/pointers?
>
> I'm guessing that I'm probably asking in the wrong group, but don't
> know where to ask. Any recommendations of other places to ask would be
> greatly appreciated.
You will find lots of interest in this question in the musicdsp list.

The significance of phase is pretty well canonical in audio processing, with respect to all and any combinations of sounds. Processes such as phasers and flangers combine wet and dry sounds to produce dynamic cancellation effects. There is a pretty direct audio counterpart to your image example in various techniques of cross-synthesis, hybridising and morphing of sounds. The simplest example is phase-vocoder processing, where the bin amplitudes of one sound are combined with the frequency values of another. Most of the famous "problems" of the phase vocoder arise through the smearing of phase between bins. Phase relationships (not least, the preservation of them) are also central to most multi-channel production, in either preserving or modifying the "stereo image". So in the general case audio applications seek either to preserve phase relationships, or to deliberately distort/modify them.

Human perception of timbre is a slightly different topic; it is generally asserted that we are insensitive to (static) phase - you can scramble the phases (while keeping amplitudes the same) of the partials of, say, a square wave or sawtooth wave and the listener will not notice (though needless to say there are those who claim they can distinguish them). So in broad terms your guess is correct. The general principle is that our ears are drawn to anything changing (which of course is what we experience most of the time): addition/removal of partials, and changing phase relationships.

The challenge of the subject from a research point of view is that our hearing tends to be "categorical" - given a transformation (e.g. in morphing), our perception tends to lock on one recognition until a certain point where it flips to another; somewhat akin to the famous optical illusions where we flip from seeing a vase to seeing a face, etc. In the audio case, this tends to apply even during a nominally smooth transformation.
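
The phase-scrambling experiment mentioned above (same partial amplitudes, different static phases) is easy to set up numerically. The following is a minimal sketch, assuming a band-limited square wave built from 15 sine partials over one period; `harmonic_sum` and all parameter values are illustrative choices, not anything prescribed in this thread. It only shows that the magnitude spectrum is unchanged while the waveform changes; whether a listener hears a difference has to be judged by ear.

```python
import numpy as np

def harmonic_sum(amps, phases, n_samples):
    """One period of a sum of sine partials; partial k completes k cycles."""
    t = np.arange(n_samples) / n_samples
    k = np.arange(1, len(amps) + 1)[:, None]
    partials = amps[:, None] * np.sin(2 * np.pi * k * t + phases[:, None])
    return partials.sum(axis=0)

n_partials = 15
n_samples = 256
k = np.arange(1, n_partials + 1)

# Band-limited square wave: odd harmonics at amplitude 1/k, no even harmonics.
amps = np.where(k % 2 == 1, 1.0 / k, 0.0)

rng = np.random.default_rng(0)
aligned = harmonic_sum(amps, np.zeros(n_partials), n_samples)
scrambled = harmonic_sum(amps, rng.uniform(0.0, 2 * np.pi, n_partials), n_samples)

mag_aligned = np.abs(np.fft.rfft(aligned))
mag_scrambled = np.abs(np.fft.rfft(scrambled))

print("same magnitude spectrum:", np.allclose(mag_aligned, mag_scrambled))
print("same waveform:          ", np.allclose(aligned, scrambled))
```

To listen to the result, tile each period many times and write it to a sound file at a suitable sample rate; the two signals look quite different on a scope even though their partial amplitudes match.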
See (among other references) "Auditory Scene Analysis" by Albert S Bregman. See also the work of Diana Deutsch (http://deutsch.ucsd.edu); especially "The Psychology of Music", which discusses auditory illusions, among many other things. And: "Music, Cognition, and Computerized Sound", Perry Cook.

The main sound synthesis lists will also be sources of rich and informed discussions, e.g. for PD, Csound, Max/Msp, Supercollider, etc.

Richard Dobson
Rune Allnor wrote:
> On 23 Sep, 13:20, Ross <rossclem...@gmail.com> wrote:
>> Dear all,
>>
>> Does anyone know of any research into the roles of phase and amplitude
>> of frequency domain representations of sound in terms of human
>> perception of timbres.
>
> From http://en.wikipedia.org/wiki/Timbre:
>
> " Timbre has been called ... "the psychoacoustician's
> multidimensional wastebasket category for everything that
> cannot be qualified as pitch or loudness." "
>
> Rune
A dodgy definition when first made; understanding has moved along considerably since then!

Interestingly, while the awareness of the basic nature of timbre as characterizing mixtures of partials has been around since Helmholtz and even before, it has only been with the development of digital analysis techniques that timbre has been fully emancipated, not least in relegating the notion that a saxophone, say, has "a" timbre (in some static global time-invariant sense) to those very wastebaskets of history. Chop off (or otherwise transmogrify) the attacks of most musical notes and our ability to recognize them pretty well evaporates.

Richard Dobson
Richard Dobson wrote:
> Rune Allnor wrote:
>> On 23 Sep, 13:20, Ross <rossclem...@gmail.com> wrote:
>>> Dear all,
>>>
>>> Does anyone know of any research into the roles of phase and amplitude
>>> of frequency domain representations of sound in terms of human
>>> perception of timbres.
>>
>> From http://en.wikipedia.org/wiki/Timbre:
>>
>> " Timbre has been called ... "the psychoacoustician's
>> multidimensional wastebasket category for everything that
>> cannot be qualified as pitch or loudness." "
>>
>> Rune
>
> A dodgy definition when first made; understanding has moved along
> considerably since then!
>
> Interestingly, while the awareness of the basic nature of timbre as
> characterizing mixtures of partials has been around since Helmholtz and
> even before, it has only been with the development of digital analysis
> techniques that timbre has been fully emancipated, not least in
> relegating the notion that a saxophone, say, has "a" timbre (in some
> static global time-invariant sense) to those very wastebaskets of
> history. Chop off (or otherwise transmogrify) the attacks of most
> musical notes and our ability to recognize them pretty well evaporates.
Indeed! Play a piano selection backwards and it sounds like a pipe organ whose notes end abruptly.

Jerry
-- 
Engineering is the art of making what you want from things you can get.
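
The backwards-piano observation ties in with the original question: reversing a real signal in time leaves its DFT magnitudes unchanged and alters only the phases, yet the sound is clearly different because the attack moves to the end. A rough sketch follows, in which a decaying 440 Hz sine is a crude stand-in for a struck note (the sample rate, decay constant and variable names are arbitrary assumptions, and this is in no way a piano model).

```python
import numpy as np

sample_rate = 8000                                  # Hz, arbitrary for this sketch
t = np.arange(sample_rate // 2) / sample_rate       # half a second of samples

# Stand-in for a struck note: sharp onset followed by exponential decay.
note = np.exp(-8.0 * t) * np.sin(2 * np.pi * 440.0 * t)
reversed_note = note[::-1]

mag_fwd = np.abs(np.fft.rfft(note))
mag_rev = np.abs(np.fft.rfft(reversed_note))

print("same magnitude spectrum:", np.allclose(mag_fwd, mag_rev))
print("loudest sample, forward: ", np.argmax(np.abs(note)))
print("loudest sample, reversed:", np.argmax(np.abs(reversed_note)))
```

So a whole-signal magnitude spectrum cannot tell the two versions apart; the difference lives entirely in the phase, which here encodes the time envelope rather than a static timbre.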