Towards Efﬁcient and Robust Automatic Speech Recognition: Decoding Techniques and Discriminative Training●1 comment
Automatic speech recognition has been widely studied and is already being applied in everyday use. Nevertheless, the recognition performance is still a bottleneck in many practical applications of large vocabulary continuous speech recognition. Either the recognition speed is not sufﬁcient, or the errors in the recognition result limit the applications. This thesis studies two aspects of speech recognition, decoding and training of acoustic models, to improve speech recognition performance in different conditions.
This article surveys the theory of compressive sensing, also known as compressed sensing or CS, a novel sensing/sampling paradigm that goes against the common wisdom in data acquisition.
Chapter 1 of the book: "Compressed Sensing: Theory and Applications".
Chapter 1 of the book: Real-Time Digital Signal Processing: Fundamentals, Implementations and Applications, 3rd Edition
An illustrated essay with software available for free download.
This book provides an applications-oriented introduction to digital signal processing written primarily for electrical engineering undergraduates. Practicing engineers and graduate students may also ﬁnd it useful as a ﬁrst text on the subject.
This tutorial is for those people who want to learn programming in C++ and do not necessarily have any previous knowledge of other programming languages. Of course any knowledge of other programming languages or any general computer skill can be useful to better understand this tutorial, although it is not essential. It is also suitable for those who need a little update on the new features the language has acquired from the latest standards. If you are familiar with the C language, you can take the first 3 parts of this tutorial as a review of concepts, since they mainly explain the C part of C++. There are slight differences in the C++ syntax for some C features, so I recommend you its reading anyway. The 4th part describes object-oriented programming. The 5th part mostly describes the new features introduced by ANSI-C++ standard.
Correcting an Important Goertzel Filter Misconception
Audio signal processing with MATLAB and Octave code examples.
The design of conventional sensors is based primarily on the Shannon-Nyquist sampling theorem, which states that a signal of bandwidth W Hz is fully determined by its discrete-time samples provided the sampling rate exceeds 2W samples per second. For discrete-time signals, the Shannon-Nyquist theorem has a very simple interpretation: the number of data samples must be at least as large as the dimensionality of the signal being sampled and recovered. This important result enables signal processing in the discrete-time domain without any loss of information. However, in an increasing number of applications, the Shannon-Nyquist sampling theorem dictates an unnecessary and often prohibitively high sampling rate. (See Box 1 for a derivation of the Nyquist rate of a time-varying scene.) As a motivating example, the high resolution of the image sensor hardware in modern cameras reflects the large amount of data sensed to capture an image. A 10-megapixel camera, in effect, takes 10 million measurements of the scene. Yet, almost immediately after acquisition, redundancies in the image are exploited to compress the acquired data significantly, often at compression ratios of 100:1 for visualization and even higher for detection and classification tasks. This example suggests immense wastage in the overall design of conventional cameras.
During the last two decades, multirate filter banks have found various applications in many different areas, such as speech coding, scrambling, adaptive signal processing, image compression, signal and image processing applications as well as transmission of several signals through the same channel. The main idea of using multirate filter banks is the ability of the system to separate in the frequency domain the signal under consideration into two or more signals or to compose two or more different signals into a single signal.
The sum of two equal-frequency real sinusoids is itself a single real sinusoid. However, the exact equations for all the various forms of that single equivalent sinusoid are difficult to find in the signal processing literature. Here we provide those equations.
This paper describes a simple method to calculate the invers of a complex matrix. The key element of the method is to use a matrix inversion, which is available and optimised for real numbers. Some actual libraries used for digital signal processing only provide highly optimised methods to calculate the inverse of a real matrix, whereas no solution for complex matrices are available, like in . The presented algorithm is very easy to implement, while still much more efficient than for example the method presented in .  Visual DSP++ 4.0 C/C++ Compiler and Library Manual for TigerSHARC Processors; Analog Devices; 2005.  W. Press, S.A. Teukolsky, W.T. Vetterling, B.R. Flannery; Numerical Recipes in C++, The art of scientific computing, Second Edition; p52 : “Complex Systems of Equations”;Cambridge University Press 2002.
I recently learned an interesting rule of thumb regarding the use of an amplifier to drive the input of an analog to digital converter (ADC). The rule of thumb describes how to specify the maximum allowable noise power of the amplifier.
There are so many different time- and frequency-domain methods for generating complex baseband and analytic bandpass signals that I had trouble keeping those techniques straight in my mind. Thus, for my own benefit, I created a kind of reference table showing those methods. I present that table for your viewing pleasure in this document.
Earlier this year, for the Linear Audio magazine, published in the Netherlands whose subscribers are technically-skilled hi-fi audio enthusiasts, I wrote an article on the fundamentals of interpolation as it's used to improve the performance of analog-to-digital conversion. Perhaps that article will be of some value to the subscribers of dsprelated.com. Here's what I wrote: We encounter the process of digital-to-analog conversion every day—in telephone calls (land lines and cell phones), telephone answering machines, CD & DVD players, iPhones, digital television, MP3 players, digital radio, and even talking greeting cards. This material is a brief tutorial on how sample rate conversion improves the quality of digital-to-analog conversion.
Dispersive linear systems with negative group delay have caused much confusion in the past. Some claim that they violate causality, others that they are the cause of superluminal tunneling. Can we really receive messages before they are sent? This article aims at pouring oil in the fire and causing yet more confusion :-).