An important drawback affecting most of the speech processing systems is the environmental noise and its harmful effect on the system performance. Examples of such systems are the new wireless communications voice services or digital hearing aid devices. In speech recognition, there are still technical barriers inhibiting such systems from meeting the demands of modern applications. Numerous noise reduction techniques have been developed to palliate the effect of the noise on the system performance and often require an estimate of the noise statistics obtained by means of a precise voice activity detector (VAD). Speech/non-speech detection is an unsolved problem in speech processing and affects numerous applications including robust speech recognition, discontinuous transmission, real-time speech transmission on the Internet or combined noise reduction and echo cancellation schemes in the context of telephony. The speech/non-speech classification task is not as trivial as it appears, and most of the VAD algorithms fail when the level of background noise increases. During the last decade, numerous researchers have developed different strategies for detecting speech on a noisy signal and have evaluated the influence of the VAD effectiveness on the performance of speech processing systems. Most of the approaches have focussed on the development of robust algorithms with special attention being paid to the derivation and study of noise robust features and decision rules. The different VAD methods include those based on energy thresholds, pitch detection, spectrum analysis, zero-crossing rate, periodicity measure, higher order statistics in the LPC residual domain or combinations of different features. This chapter shows a comprehensive approximation to the main challenges in voice activity detection, the different solutions that have been reported in a complete review of the state of the art and the evaluation frameworks that are normally used. The application of VADs for speech coding, speech enhancement and robust speech recognition systems is shown and discussed. Three different VAD methods are described and compared to standardized and recently reported strategies by assessing the speech/non-speech discrimination accuracy and the robustness of speech recognition systems.
Digital Image processing is a topic of great relevance for practically any project, either for basic arrays of photodetectors or complex robotic systems using artificial vision. It is an interesting topic that offers to multimodal systems the capacity to see and understand their environment in order to interact in a natural and more efficient way. The development of new equipment for high speed image acquisition and with higher resolutions requires a significant effort to develop techniques that process the images in a more efficient way. Besides, medical applications use new image modalities and need algorithms for the interpretation of these images as well as for the registration and fusion of the different modalities, so that the image processing is a productive area for the development of multidisciplinary applications. The aim of this chapter is to present different digital image processing algorithms using LabView and IMAQ vision toolbox. IMAQ vision toolbox presents a complete set of digital image processing and acquisition functions that improve the efficiency of the projects and reduce the programming effort of the users obtaining better results in shorter time. Therefore, the IMAQ vision toolbox of LabView is an interesting tool to analyze in detail and through this chapter it will be presented different theories about digital image processing and different applications in the field of image acquisition, image transformations. This chapter includes in first place the image acquisition and some of the most common operations that can be locally or globally applied, the statistical information generated by the image in a histogram is commented later. Finally, the use of tools allowing to segment or filtrate the image are described making special emphasis in the algorithms of pattern recognition and matching template.
Based on the fact that noise and distortion are the main factors that limit the capacity of data transmission in telecommunications and that they also affect the accuracy of the results in the signal measurement systems, whereas, modeling and removing noise and distortions are at the core of theoretical and practical considerations in communications and signal processing. Another important issue here is that, noise reduction and distortion removal are major problems in applications such as; cellular mobile communication, speech recognition, image processing, medical signal processing, radar, sonar, and any other application where the desired signals cannot be isolated from noise and distortion. The use of wavelets in the field of de-noising audio signals is relatively new, the use of this technique has been increasing over the past 20 years. One way to think about wavelets matches the way how our eyes perceive the world when they are faced to different distances. In the real world, a forest can be seen from many different perspectives; they are, in fact, different scales of resolution. From the window of an airplane, for instance, the forest cover appears as a solid green roof. From the window of a car, the green roof gets transformed into individual trees, and if we leave the car and approach to the forest, we can gradually see details such as the trees branches and leaves. If we had a magnifying glass, we could see a dew drop on the tip of a leaf. As we get closer to even smaller scales, we can discover details that we had not seen before. On the other hand, if we tried to do the same thing with a photograph, we would be completely frustrated. If we enlarged the picture "closer" to a tree, we would only be able to see a blurred tree image; we would not be able to spot neither the branch, nor the leaf, and it would be impossible to spot the dew drop. Although our eyes can see on many scales of resolution, the camera can only display one at a time. In this chapter, we introduce the reader to a way to reduce noise in an audio signal by using wavelet transforms. We developed this technique by using the wavelet tool in MATLAB. A Simulink is used to acquire an audio signal and we use it to convert the signal to a digital format so it can be processed. Finally, a Graphical User Interface Development Environment (GUIDE) is used to create a graphical user interface. The reader can go through this chapter systematically, from the theory to the implementation of the noise reduction technique. We will introduce in the first place the basic theory of an audio signal, the noise treatment fundamentals and principles of the wavelets theory. Then, we will present the development of noise reduction when using wavelet functions in MATLAB. In the foreground, we will demonstrate the usefulness of wavelets to reduce noise in a model system where Gaussian noise is inserted to an audio signal. In the following sections, we will present a practical example of noise reduction in a sinusoidal signal that has been generated in the MATLAB, which it is followed by an example with a real audio signal captured via Simulink. Finally, the graphic noise reduction model using GUIDE will be shown.
Digital Signal Processing (DSP) is a vital tool for scientists and engineers, as it is of fundamental importance in many areas of engineering practice and scientific research. The "alphabet" of DSP is mathematics and although most practical DSP problems can be solved by using real number mathematics, there are many others which can only be satisfactorily resolved or adequately described by means of complex numbers. If real number mathematics is the language of real DSP, then complex number mathematics is the language of complex DSP. In the same way that real numbers are a part of complex numbers in mathematics, real DSP can be regarded as a part of complex DSP (Smith, 1999). Complex mathematics manipulates complex numbers - the representation of two variables as a single number - and it may appear that complex DSP has no obvious connection with our everyday experience, especially since many DSP problems are explained mainly by means of real number mathematics. Nonetheless, some DSP techniques are based on complex mathematics, such as Fast Fourier Transform (FFT), z-transform, representation of periodical signals and linear systems, etc. However, the imaginary part of complex transformations is usually ignored or regarded as zero due to the inability to provide a readily comprehensible physical explanation. One well-known practical approach to the representation of an engineering problem by means of complex numbers can be referred to as the assembling approach: the real and imaginary parts of a complex number are real variables and individually can represent two real physical parameters. Complex math techniques are used to process this complex entity once it is assembled. The real and imaginary parts of the resulting complex variable preserve the same real physical parameters. This approach is not universally-applicable and can only be used with problems and applications which conform to the requirements of complex math techniques. Making a complex number entirely mathematically equivalent to a substantial physical problem is the real essence of complex DSP. Like complex Fourier transforms, complex DSP transforms show the fundamental nature of complex DSP and such complex techniques often increase the power of basic DSP methods. The development and application of complex DSP are only just beginning to increase and for this reason some researchers have named it theoretical DSP. It is evident that complex DSP is more complicated than real DSP. Complex DSP transforms are highly theoretical and mathematical; to use them efficiently and professionally requires a large amount of mathematics study and practical experience. Complex math makes the mathematical expressions used in DSP more compact and solves the problems which real math cannot deal with. Complex DSP techniques can complement our understanding of how physical systems perform but to achieve this, we are faced with the necessity of dealing with extensive sophisticated mathematics. For DSP professionals there comes a point at which they have no real choice since the study of complex number mathematics is the foundation of DSP.
Convolution is an important mathematical tool in both ﬁelds of signal and image processing. It is employed in ﬁltering, denoising, edge detection, correlation, compression, deconvolution, simulation, and in many other applications. Although the concept of convolution is not new, the efﬁcient computation of convolution is still an open topic. As the amount of processed data is constantly increasing, there is considerable request for fast manipulation with huge data. Moreover, there is demand for fast algorithms which can exploit computational power of modern parallel architectures.
Digital Signal Processors (DSPs) have been used in accelerator systems for more than fifteen years and have largely contributed to the evolution towards digital technology of many accelerator systems, such as machine protection, diagnostics and control of beams, power supply and motors. This paper aims at familiarising the reader with DSP fundamentals, namely DSP characteristics and processing development. Several DSP examples are given, in particular on Texas Instruments DSPs, as they are used in the DSP laboratory companion of the lectures this paper is based upon. The typical system design flow is described; common difficulties, problems and choices faced by DSP developers are outlined; and hints are given on the best solution.
Novel Method of Showing Frequency Transients in the Fourier Transform and it’s Application in Time-Frequency Analysis
Fourier Transform in the frequency domain is modified to also analyse frequency transients i.e. changes in the frequency spectrum with time variable of any order. This is analytically, a very useful tool as there are many problems where frequency variation with time has to be analyzed e.g. Doppler shift, Light through different mediums in time and space. Numerical calculations are usually done for such problems when needed. Here, Fourier transform is analyzed to incorporate more variables that simultaneously do the Time lag-Frequency Analysis (TLFA) from Fourier Transform by changing the Fourier Operator. Also, the Frequency Derivative Analysis (FDA) of any order can be analyzed from Fourier Transform. Validity of the operator is examined using Eigen value analysis and operator algebra.
Modulation is the process of facilitating the transfer of information over a medium. Typically the objective of a digital communication system is to transport digital data between two or more nodes. In radio communications this is usually achieved by adjusting a physical characteristic of a sinusoidal carrier, either the frequency, phase, amplitude or a combination thereof . This is performed in real systems with a modulator at the transmitting end to impose the physical change to the carrier and a demodulator at the receiving end to detect the resultant modulation on reception. Hence, modulation can be objectively defined as the process of converting information so that it can be successfully sent through a medium. This thesis deals with the current digital modulation techniques used in industry. Also, the thesis examines the qualitative and quantitative criteria used in selection of one modulation technique over the other. All the experiments, and realted data collected were obtained using MATLAB and SIMULINK
In this paper, we propose a natural framework that allows any region-based segmentation energy to be re-formulated in a local way. We consider local rather than global image statistics and evolve a contour based on local information. Localized contours are capable of segmenting objects with heterogeneous feature profiles that would be difficult to capture correctly using a standard global method. The presented technique is versatile enough to be used with any global region-based active contour energy and instill in it the benefits of localization. We describe this framework and demonstrate the localization of three well-known energies in order to illustrate how our framework can be applied to any energy. We then compare each localized energy to its global counterpart to show the improvements that can be achieved. Next, an in-depth study of the behaviors of these energies in response to the degree of localization is given. Finally, we show results on challenging images to illustrate the robust and accurate segmentations that are possible with this new class of active contour models.
A delayless structure targeted for low-resource implementation is proposed to eliminate filterbank processing delays in subband adaptive filters (SAFs). Rather than using direct IFFT or polyphase filterbanks to transform the SAFs back into the time-domain, the proposed method utilizes a weighted overlap-add (WOLA) synthesis. Low-resource real-time implementations are targeted and as such do not involve long (as long as the echo plant) FFT or IFFT operations. Also, the proposed approach facilitates time distribution of the adaptive filter reconstruction calculations crucial for efficient real-time and hardware implementation. The method is implemented on an oversampled WOLA filterbank employed as part of an echo cancellation application. Evaluation results demonstrate that the proposed implementation outperforms conventional SAF systems since the signals used in actual adaptive filtering are not distorted by filterbank aliasing. The method is a good match for partial update adaptive algorithms since segments of the time-domain adaptive filter are sequentially reconstructed and updated.
Audio processing algorithms are increasingly used in cell phones and today’s customers are placing more demands on cell phones. Feature phones, once the advent of mobile phone technology, nowadays do more than just providing the user with MP3 play back or advanced audio effects. These features have become an integral part of medium as well as low-end phones. On the other hand, there is also an endeavor to include as improved quality as possible into products to compete in market and satisfy users’ needs. Tackling the above requirements has been partly satisfied by the advance in hardware design and manufacturing technology. However, as new hardware emerges into market the need for competence to write efficient software and exploit the new features thoroughly and effectively arises. Even though compilers are also keeping up with the new tide space for hand optimized code still exist. Wrapped in the above goal, an effort was made in this thesis to partly cover the competence requirement at Multimedia Section (part of Ericsson Mobile Platforms) to develope optimized code for new processors. Forging persistently ahead with new products, EMP has always incorporated the latest technology into its products among which ARMv6 family of processors has the main central processing role in a number of upcoming products. To fully exploit latest features provided by ARMv6, it was required to probe its new instruction set among which new media processing instructions are of outmost importance. In order to execute DSP-intensive algorithms (e.g. Audio Processing algorithms) efficiently, the implementation should be done in low-level code applying available instruction set. Meanwhile, ARMv6 comes with a number of new features in comparison with its predecessors. SIMD (Single Instruction Multiple Data) and VFP (Vector Floating Point) are the most prominent media processing improvements in ARMv6. Aligned with thesis goals and guidelines, Reverb algorithm which is among one of the most complicated audio features on a hand-held devices was probed. Consequently, its kernel parts were identified and implementation was done both in fixed-point and floating-point using the available resources on hardware. Besides execution time and amount of code memory for each part were measured and provided in tables and charts for comparison purposes. Conclusions were finally drawn based on developed code’s efficiency over ARM compiler’s as well as existing code already developed and tailored to ARMv5 processors. The main criteria for optimization was the execution time. Moreover, quantization effect due to limited precision fixed-point arithmetic was formulated and its effect on quality was elaborated. The outcomes, clearly indicate that hand optimization of kernel parts are superior to Compiler optimized alternative both from the point of code memory as well as execution time. The results also confirmed the presumption that hand optimized code using new instruction set can improve efficiency by an average 25%-50% depending on the algorithm structure and its interaction with other parts of audio effect. Despite its many draw backs, fixed-point implementation remains yet to be the dominant implementation for majority of DSP algorithms on low-power devices.
This blog proposes a novel differentiator worth your consideration. Although simple, the differentiator provides a fairly wide 'frequency range of linear operation' and can be implemented, if need be, without performing numerical multiplications.
Correcting an Important Goertzel Filter Misconception
Section 1: reviews the mathematical deﬁnition of Hilbert transform and various ways to calculate it.
Sections 2 and 3: review applications of Hilbert transform in two major areas: Signal processing and system identiﬁcation.
Section 4: concludes with remarks on the historical development of Hilbert transform
In recent years, the amount of digital data which is stored and transmitted for private and public usage has increased considerably. To allow a save transmission and storage of data despite of error-prone transmission media, error correcting codes are used. A large variety of codes has been developed, and in the past decade low-density parity-check (LDPC) codes which have an excellent error correction performance became more and more popular. Today, low-density parity-check codes have been adopted for several standards, and eﬃcient decoder hardware architectures are known for the chosen structured codes. However, the existing decoder designs lack ﬂexibility as only few structured codes can be decoded with one decoder chip. In consequence, diﬀerent codes require a redesign of the decoder, and few solutions exist for decoding of codes which are not quasi-cyclic or which are unstructured. In this thesis, three diﬀerent approaches are presented for the implementation of fully programmable LDPC decoders which can decode arbitrary LDPC codes. As a design study, the ﬁrst programmable decoder which uses a heuristic mapping algorithm is realized on an ﬁeld-programmable gate array (FPGA), and error correction curves are measured to verify the correct functionality. The main contribution of this thesis lies in the development of the second and the third architecture and an appropriate mapping algorithm. The proposed fully programmable decoder architectures use one-phase message passing and layered decoding and can decode arbitrary LDPC codes using an optimum mapping and scheduling algorithm. The presented programmable architectures are in fact generalized decoder architectures from which the known decoders architectures for structured LDPC codes can be derived.
This Master Thesis presents the design of the core of a fixed point general purpose multimedia DSP processor (MDSP) and its instruction set. This processor employs parallel processing techniques and specialized addressing models to speed up the processing of multimedia applications. The MDSP has a dual MAC structure with one enhanced MAC that provides a SIMD, Single Instruction Multiple Data, unit consisting of four parallel data paths that are optimized for accelerating multimedia applications. The SIMD unit performs four multimedia-oriented 16-bit operations every clock cycle. This accelerates computationally intensive procedures such as video and audio decoding. The MDSP uses a memory bank of four memories to provide multiple accesses of source data each clock cycle.
This thesis is about implementing the functions for reciprocal, square root, inverse square root and logarithms on a DSP platform. A multi-core DSP platform that consists of one master processor core and several SIMD coprocessor cores is currently being designed by a team at the Computer Engineering Department of Linköping University. The SIMD coprocessors’ arithmetic logic unit (ALU) has 16 multipliers to support vector multiplication instructions. By efficiently using the 16 multipliers, it is possible to evaluate polynomials very fast. The ALU does not have (hardware) support for floating point arithmetic, so the challenge is to get good precision by using fixed point arithmetic. Precise and fast solutions to implement the mathematical functions are found by converting the fixed point input to a soft floating point format before polynomial approximation, choosing a polynomial based on an error analysis of the polynomial approximation, and using Newton-Raphson or Goldschmidt iterations to improve the precision of the polynomial approximations. Finally, suggestions are made of changes and additions to the instruction set architecture, in order to make the implementations faster, by efficiently using the currently existing hardware.
Motivation: A man was interested in knowing of unknown from the very beginning of the human history. Our human eyes help us to investigate our environment by reflection of light. However, wavelengths of visible light allows transparent view through only a very small kinds of materials. On the other hand, Ultra WideBand (UWB) electromagnetic waves with frequencies of few Gigahertz are able to penetrate through almost all types of materials around us. With some sophisticated methods and a piece of luck we are able to investigate what is behind opaque walls. Rescue and security of the people is one of the most promising fields for such applications. Rescue: Imagine how useful can be information about interior of the barricaded building with terrorists and hostages inside for a policemen. The tactics of police raid can be build up on realtime information about ground plan of the room and positions of big objects inside. How useful for the firemen can be information about current interior state of the room before they get inside? Such hazardous environment, full of smoke with zero visibility, is very dangerous and each additional information can make the difference between life and death. Security: Investigating objects through plastic, rubber, dress or other nonmetallic materials could be highly useful as an additional tool to the existing x-ray scanners. Especially it could be used for scanning baggage at the airport, truckloads on borders, dangerous boxes, etc.
In the classic paper, "An Economical Class of Digital Filters for Decimation and Interpolation", Hogenauer introduced an important class of digital filters called "Cascaded Integrator-Comb", or "CIC" for short (also sometimes called "Hogenauer filters"). Here, Matthew Donadio provides a more gentle introduction to the subject of CIC filters, geared specifically to the needs of practicing DSP designers.
This article presents a way to compute and plot the image response of a decimator. I'm defining the image response as the unwanted spectrum of the impulse response after downsampling, relative to the desired passband response.