I can't think of a reason why sampling at a different rate wouldn't work with
the algorithm. The only thing I can think of that might be adversely
affected is the VQ part. The codebook is trained on speech at 10 ms
per frame (at 8 kHz), so feeding it a 5 ms frame (sampled at 16 kHz)
might make a mess of the codebook.
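As a rough sanity check, assuming G.729's usual 80-sample frame (10 ms at
the nominal 8 kHz), the frame-duration mismatch is easy to see:

# Frame duration for a fixed frame length in samples, at various rates.
FRAME_SAMPLES = 80  # one G.729 frame: 80 samples = 10 ms at 8 kHz

for rate_hz in (8000, 8200, 16000):
    frame_ms = 1000.0 * FRAME_SAMPLES / rate_hz
    print(f"{rate_hz} Hz -> {frame_ms:.2f} ms per frame")

# 8000 Hz -> 10.00 ms, 16000 Hz -> 5.00 ms: those 5 ms frames are the
# mismatch that could throw off a codebook trained on 10 ms frames.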
All this is just a guess; I've never really tried anything like this.
Regards
Piyush
Jack <jack8051@lightspawn.removethisbit.org> wrote in message news:<ahj5m0hf7b6at2vh2apskntn3bg0so1cni@4ax.com>...
> >It's what's used in digital telephony. That's what the standard covers.
> >
> >Jerry
>
> I realize that if I change the sampling rate I won't strictly be
> adhering to the standard any more. But when I write both the encoder
> and the decoder, it doesn't seem like such an issue (at least in my
> case). If I increase the rate from 8 kHz (to, say, 8.1 or 8.2) will it
> sound at least as good as the standard? Or is the algorithm somehow
> "optimized" for that sampling rate so that it actually sounds worse at
> a slightly higher rate?
Reply by Steve Underwood●October 5, 2004
Jerry Avins wrote:
> Jon Harris wrote:
>
> ...
>
>> I don't think the original researchers were so "dumb" as to not
>> realize that
>> speech had higher frequency components. Given the technology limits
>> of the
>> time, they chose the sample rate that allowed for "intelligible"
>> speech (not
>> perfect speech) at a reasonable cost. In other words, the criteria
>> for choosing
>> the frequency response was "what is the minimum frequency response
>> that is still
>> intelligible in normal speech" vs. "what is the minimum frequency
>> response for
>> full fidelity speech". That's what engineering is all
>> about--trade-offs!
>
>
> Originally, there was no sample rate involved. Hybrids had to be
> terminated with dummy lines that closely matched the real line impedance
> over the bandwidth of intended use. Ear pieces and carbon microphones
> had to cover the band. In all respects, bandwidth cost money. This is
> also easy to see with analog frequency-division multiplexing. The actual
> guaranteed analog high frequency was 3600 Hz, if I remember correctly,
> but actual response was usually better starting around 1950. The 8 kHz
> sample rate was adequate to preserve the quality of the analog service.
>
> Jerry
There was no sample rate, but early on there were FDM stacks. That
demanded the same choices about permitted bandwidth, and that is where
the choices we live with today were set in (somewhat flaky) concrete. On
simple local loop analogue lines, saying the bandwidth is 3600 Hz is more
a quality-of-service issue than a hard engineering one. In 99% of
cases the bandwidth there is pretty much arbitrary.
Regards,
Steve
Reply by Phil Frisbie, Jr.●October 5, 2004
Jack wrote:
>>It's what's used in digital telephony. That's what the standard covers.
>>
>>Jerry
>
>
> I realize that if I change the sampling rate I won't strictly be
> adhering to the standard any more. But when I write both the encoder
> and the decoder, it doesn't seem like such an issue (at least in my
> case). If I increase the rate from 8 kHz (to, say, 8.1 or 8.2) will it
> sound at least as good as the standard? Or is the algorithm somehow
> "optimized" for that sampling rate so that it actually sounds worse at
> a slightly higher rate?
If you vary the sampling rate slightly, everything should be fine, but what sound
card samples at 8200 Hz? Perhaps if you explain WHY you want to change the
sample rate, there is another way to do it.
However, if you doubled the sampling rate, the tone would change greatly for some
people due to the filters in the encoder.
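A sketch of why those fixed filters matter: a digital filter's cutoff is
set as a fraction of the sample rate, so feeding it a different rate moves
the effective analog cutoff. The 140 Hz figure below is the input
high-pass given in the G.729 spec; the rest is illustrative:

# A digital filter's cutoff is fixed in normalized frequency
# (cycles per sample), so the analog cutoff it realizes scales
# with whatever rate you actually feed it.
DESIGN_RATE_HZ = 8000.0
HPF_CUTOFF_HZ = 140.0   # G.729's preprocessing high-pass (per the spec)
normalized = HPF_CUTOFF_HZ / DESIGN_RATE_HZ

for actual_rate_hz in (8000.0, 8200.0, 16000.0):
    effective_hz = normalized * actual_rate_hz
    print(f"fed {actual_rate_hz:.0f} Hz audio, the filter cuts at {effective_hz:.1f} Hz")

# At 8200 Hz the shift is negligible (~143.5 Hz); at 16 kHz every
# fixed filter lands an octave too high, hence the change in tone.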
--
Phil Frisbie, Jr.
Hawk Software
http://www.hawksoft.com
Reply by Jon Harris●October 5, 2004
"Jerry Avins" <jya@ieee.org> wrote in message
news:cjuqqq$jkq$1@bob.news.rcn.net...
> Jon Harris wrote:
>
> > I don't think the original researchers were so "dumb" as to not realize that
> > speech had higher frequency components. Given the technology limits of the
> > time, they chose the sample rate that allowed for "intelligible" speech (not
> > perfect speech) at a reasonable cost. In other words, the criteria for choosing
> > the frequency response was "what is the minimum frequency response that is still
> > intelligible in normal speech" vs. "what is the minimum frequency response for
> > full fidelity speech". That's what engineering is all about--trade-offs!
>
> Originally, there was no sample rate involved. Hybrids had to be
> terminated with dummy lines that closely matched the real line impedance
> over the bandwidth of intended use. Ear pieces and carbon microphones
> had to cover the band. In all respects, bandwidth cost money. This is
> also easy to see with analog frequency-division multiplexing. The actual
> guaranteed analog high frequency was 3600 Hz, if I remember correctly,
> but actual response was usually better starting around 1950. The 8 kHz
> sample rate was adequate to preserve the quality of the analog service.
Thanks for the historical clarifications, Jerry.
Reply by Jerry Avins●October 5, 2004
Jon Harris wrote:
...
> I don't think the original researchers were so "dumb" as to not realize that
> speech had higher frequency components. Given the technology limits of the
> time, they chose the sample rate that allowed for "intelligible" speech (not
> perfect speech) at a reasonable cost. In other words, the criteria for choosing
> the frequency response was "what is the minimum frequency response that is still
> intelligible in normal speech" vs. "what is the minimum frequency response for
> full fidelity speech". That's what engineering is all about--trade-offs!
Originally, there was no sample rate involved. Hybrids had to be
terminated with dummy lines that closely matched the real line impedance
over the bandwidth of intended use. Ear pieces and carbon microphones
had to cover the band. In all respects, bandwidth cost money. This is
also easy to see with analog frequency-division multiplexing. The actual
guaranteed analog high frequency was 3600 Hz, if I remember correctly,
but actual response was usually better starting around 1950. The 8 kHz
sample rate was adequate to preserve the quality of the analog service.
Jerry
--
... they proceeded on the sound principle that the magnitude of a lie
always contains a certain factor of credibility, ... and that therefore
... they more easily fall victim to a big lie than to a little one ...
A. H.
Reply by Jon Harris●October 5, 2004
"Steve Underwood" <steveu@dis.org> wrote in message
news:cjtmsk$cp3$1@home.itg.ti.com...
> James Salsman wrote:
>
> >> Is there really something special about 8000 samples / sec?
> >
> >
> > Some dolt in Bell Labs during the 1920s decreed that voice transmission
> > requires a frequency response from 250 Hz to only 3000 Hz. Even though
> > Harry Nyquist rounded it up to 4000 Hz to be on the safe side (and
> > because we all like round numbers) around 1938, we're all still stuck
> > saying things like "S as in Sam" and "F as in Frank" over modern
> > telephones because apparently nobody actually bothered to check the
> > frequency spectrum of actual speech.
>
> I don't think that is entirely fair. For most of the life of the
> telephone network, using twice the bandwidth would have incurred
> significant additional cost. Considering how few words in a typical
> conversation cause the problem you describe, I think the compromise they
> chose was none too bad.
From a recent thread:
On the phone, it is generally quite easy to understand normal
conversational speech even with the limited frequency response. However, if
someone tries to read a string of random letters, it is quite a bit more
difficult to understand them on the other end. Losing those high frequencies
makes consonants difficult to differentiate. The brain normally does a good job
of compensating for the loss of high frequencies by using context clues. But
since very few context clues exist with a string of random letters, it becomes
difficult to understand.
So the phone is generally quite adequate for its primary intended
application--communicating normal conversational speech. However, it is
certainly not a perfect medium and doesn't do as well in other applications.
I don't think the original researchers were so "dumb" as to not realize that
speech had higher frequency components. Given the technology limits of the
time, they chose the sample rate that allowed for "intelligible" speech (not
perfect speech) at a reasonable cost. In other words, the criteria for choosing
the frequency response was "what is the minimum frequency response that is still
intelligible in normal speech" vs. "what is the minimum frequency response for
full fidelity speech". That's what engineering is all about--trade-offs!
Reply by Jerry Avins●October 5, 2004
Jack wrote:
>>It's what's used in digital telephony. That's what the standard covers.
>>
>>Jerry
>
>
> I realize that if I change the sampling rate I won't strictly be
> adhering to the standard any more. But when I write both the encoder
> and the decoder, it doesn't seem like such an issue (at least in my
> case). If I increase the rate from 8 kHz (to, say, 8.1 or 8.2) will it
> sound at least as good as the standard? Or is the algorithm somehow
> "optimized" for that sampling rate so that it actually sounds worse at
> a slightly higher rate?
I don't really know, but it doesn't seem likely.
Jerry
--
... they proceeded on the sound principle that the magnitude of a lie
always contains a certain factor of credibility, ... and that therefore
... they more easily fall victim to a big lie than to a little one ...
A. H.
Reply by Jack●October 5, 2004
>It's what's used in digital telephony. That's what the standard covers.
>
>Jerry
I realize that if I change the sampling rate I won't strictly be
adhering to the standard any more. But when I write both the encoder
and the decoder, it doesn't seem like such an issue (at least in my
case). If I increase the rate from 8 kHz (to, say, 8.1 or 8.2) will it
sound at least as good as the standard? Or is the algorithm somehow
"optimized" for that sampling rate so that it actually sounds worse at
a slightly higher rate?
Reply by Raymond Toy●October 5, 2004
>>>>> "Jack" == Jack <jack8051@lightspawn.removethisbit.org> writes:
Jack> I'm trying to understand G.729. The only compression algorithm I've
Jack> coded before is ADPCM, and it wasn't keyed to a certain sampling rate
Jack> - that is, it would be just as happy with 8100 samples per second or
Jack> 7900 samples per second as it would have been with 8000, just the
Jack> quality would be slightly higher or lower.
Jack> With G.729, all the documentation refers to a sampling rate of 8 KHz.
Jack> Is it really the only rate that makes sense? That is, if my real
Jack> sampling rate is slightly higher or lower (but constant, and same for
Jack> the encoder and the decoder) can't I just feed the samples to the
Jack> algorithm a little faster or a little slower? Is there really
Jack> something special about 8000 samples / sec?
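A side note on the quoted ADPCM comparison: ADPCM runs sample by sample,
with no constants tied to absolute time, which is why a small rate change
only nudges its quality. A toy differential coder in that spirit (not any
particular ADPCM standard):

def toy_adpcm_encode(samples):
    """Toy differential coder: predict, quantize the error, adapt the step.
    Note that nothing here refers to absolute time or a sample rate."""
    pred, step, codes = 0, 16, []
    for x in samples:
        code = max(-8, min(7, (x - pred) // step))  # 4-bit quantized error
        codes.append(code)
        pred += code * step            # the decoder applies the same update
        step = max(1, step * 2 if abs(code) >= 6 else step * 9 // 10)
    return codes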
You could probably vary the sample rate some and it would sound OK.
But G.729 tries to model the speech signal, so if things happen faster
or slower than expected, the model may no longer be as accurate. For
example, if you sampled at 16000 samples per second, the pitch period
would now be twice the number of samples. This might confuse G.729.
I don't know the fine details of G.729, so I might be wrong.
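To put rough numbers on the pitch example (the 20-143 sample lag window
below is the range usually cited for G.729's pitch search; treat it as an
assumption here):

# Pitch period in samples = sample_rate / fundamental_frequency.
# G.729's pitch search covers lags of roughly 20..143 samples,
# i.e. fundamentals of about 56..400 Hz at the nominal 8 kHz.
def pitch_lag_samples(sample_rate_hz, f0_hz):
    return sample_rate_hz / f0_hz

for rate_hz in (8000, 16000):
    lag = pitch_lag_samples(rate_hz, 100.0)  # a typical low male voice
    print(f"{rate_hz} Hz: 100 Hz pitch -> {lag:.0f}-sample lag, "
          f"searchable: {20 <= lag <= 143}")

# At 16 kHz the 160-sample lag falls outside the search window, so the
# long-term predictor would lock onto the wrong period.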
Ray
Reply by Steve Underwood●October 5, 2004
James Salsman wrote:
>> Is there really something special about 8000 samples / sec?
>
>
> Some dolt in Bell Labs during the 1920s decreed that voice transmission
> requires a frequency response from 250 Hz to only 3000 Hz. Even though
> Harry Nyquist rounded it up to 4000 Hz to be on the safe side (and
> because we all like round numbers) around 1938, we're all still stuck
> saying things like "S as in Sam" and "F as in Frank" over modern
> telephones because apparently nobody actually bothered to check the
> frequency spectrum of actual speech.
>
> So, sure, "special," as in, "special education."
I don't think that is entirely fair. For most of the life of the
telephone network, using twice the bandwidth would have incurred
significant additional cost. Considering how few words in a typical
conversation cause the problem you describe, I think the compromise they
chose was none too bad.
What was dumber was the half-hearted effort to improve things in the
early days of ISDN. The addition of a 7.1 kHz bandwidth audio mode was
handled so poorly it never caught on at all.
With modern speech compression, wider bandwidth need have little impact
on the bit rate. However, most of the newer codecs, like G.729, still
only provide for an 8 kHz sampled audio world. The latest 3GPP codecs do,
however, provide wideband modes, so maybe phone speech clarity will
improve in the next few years.
Regards,
Steve