I recently learned an interesting rule of thumb regarding the use of an amplifier to drive the input of an analog to digital converter (ADC). The rule of thumb describes how to specify the maximum allowable noise power of the amplifier.
Towards Efﬁcient and Robust Automatic Speech Recognition: Decoding Techniques and Discriminative Training●1 comment
Automatic speech recognition has been widely studied and is already being applied in everyday use. Nevertheless, the recognition performance is still a bottleneck in many practical applications of large vocabulary continuous speech recognition. Either the recognition speed is not sufﬁcient, or the errors in the recognition result limit the applications. This thesis studies two aspects of speech recognition, decoding and training of acoustic models, to improve speech recognition performance in different conditions.
This article surveys the theory of compressive sensing, also known as compressed sensing or CS, a novel sensing/sampling paradigm that goes against the common wisdom in data acquisition.
Chapter 1 of the book: "Compressed Sensing: Theory and Applications".
Chapter 1 of the book: Real-Time Digital Signal Processing: Fundamentals, Implementations and Applications, 3rd Edition
An illustrated essay with software available for free download.
This book provides an applications-oriented introduction to digital signal processing written primarily for electrical engineering undergraduates. Practicing engineers and graduate students may also ﬁnd it useful as a ﬁrst text on the subject.
Peak to Average Power Ratio (PAPR) is often used to characterize digitally modulated signals. One example application is setting the level of the signal in a digital modulator. Knowing PAPR allows setting the average power to a level that is just low enough to minimize clipping.
The sum of two equal-frequency real sinusoids is itself a single real sinusoid. However, the exact equations for all the various forms of that single equivalent sinusoid are difficult to find in the signal processing literature. Here we provide those equations.
Recently I've been reading papers on underwater acoustic communications systems and this caused me to investigate the frequency-domain effects of time-reversal of time-domain sequences. I created this article because there is so little coverage of this topic in the literature of DSP.
This article may seem a bit trivial to some readers here but, then again, it might be of some value to DSP beginners. It presents a mathematical proof of what is the magnitude of an N-point discrete Fourier transform (DFT) when the DFT's input is a real-valued sinusoidal sequence.
Elliptic filters, also known as Cauer or Zolotarev filters, achieve the smallest filter order for the same specifications, or, the narrowest transition width for the same filter order, as compared to other filter types. On the negative side, they have the most nonlinear phase response over their passband. In these notes, we are primarily concerned with elliptic filters. But we will also discuss briefly the design of Butterworth, Chebyshev-1, and Chebyshev-2 filters and present a unified method of designing all cases. We also discuss the design of digital IIR filters using the bilinear transformation method.
I have read, in some of the literature of DSP, that when the discrete Fourier transform (DFT) is used as a filter the process of performing a DFT causes an input signal's spectrum to be frequency translated down to zero Hz (DC). I can understand why someone might say that, but I challenge that statement as being incorrect. Here are my thoughts.
Section 1: reviews the mathematical deﬁnition of Hilbert transform and various ways to calculate it.
Sections 2 and 3: review applications of Hilbert transform in two major areas: Signal processing and system identiﬁcation.
Section 4: concludes with remarks on the historical development of Hilbert transform
An important drawback affecting most of the speech processing systems is the environmental noise and its harmful effect on the system performance. Examples of such systems are the new wireless communications voice services or digital hearing aid devices. In speech recognition, there are still technical barriers inhibiting such systems from meeting the demands of modern applications. Numerous noise reduction techniques have been developed to palliate the effect of the noise on the system performance and often require an estimate of the noise statistics obtained by means of a precise voice activity detector (VAD). Speech/non-speech detection is an unsolved problem in speech processing and affects numerous applications including robust speech recognition, discontinuous transmission, real-time speech transmission on the Internet or combined noise reduction and echo cancellation schemes in the context of telephony. The speech/non-speech classification task is not as trivial as it appears, and most of the VAD algorithms fail when the level of background noise increases. During the last decade, numerous researchers have developed different strategies for detecting speech on a noisy signal and have evaluated the influence of the VAD effectiveness on the performance of speech processing systems. Most of the approaches have focussed on the development of robust algorithms with special attention being paid to the derivation and study of noise robust features and decision rules. The different VAD methods include those based on energy thresholds, pitch detection, spectrum analysis, zero-crossing rate, periodicity measure, higher order statistics in the LPC residual domain or combinations of different features. This chapter shows a comprehensive approximation to the main challenges in voice activity detection, the different solutions that have been reported in a complete review of the state of the art and the evaluation frameworks that are normally used. The application of VADs for speech coding, speech enhancement and robust speech recognition systems is shown and discussed. Three different VAD methods are described and compared to standardized and recently reported strategies by assessing the speech/non-speech discrimination accuracy and the robustness of speech recognition systems.
In the classic paper, "An Economical Class of Digital Filters for Decimation and Interpolation", Hogenauer introduced an important class of digital filters called "Cascaded Integrator-Comb", or "CIC" for short (also sometimes called "Hogenauer filters"). Here, Matthew Donadio provides a more gentle introduction to the subject of CIC filters, geared specifically to the needs of practicing DSP designers.