
Wavelet Denoising for TDR Dynamic Range Improvement
A technique is presented for removing large amounts of noise present in time-domain-reflectometry (TDR) waveforms to increase the dynamic range of TDR waveforms and TDR based s-parameter measurements.

Bilinear Transformation Made Easy
A formula is derived and demonstrated that is capable of directly generating digital filter coefficients from an analog filter prototype using the bilinear transformation. This formula obviates the need for any algebraic manipulation of the analog prototype filter and is ideal for use in embedded systems that must take in any general analog filter specification and dynamically generate digital filter coefficients directly usable in difference equations.

FUZZY LOGIC BASED CONVOLUTIONAL DECODER FOR USE IN MOBILE TELEPHONE SYSTEMS
Efficient convolutional coding and decoding algorithms are most crucial to successful operation of wireless communication systems in order to achieve high quality of service by reducing the overall bit error rate performance. A widely applied and well evaluated scheme for error correction purposes is well known as Viterbi algorithm [7]. Although the Viterbi algorithm has very good error correcting characteristics, computational effort required remains high. In this paper a novel approach is discussed introducing a convolutional decoder design based on fuzzy logic. A simplified version of this fuzzy based decoder is examined with respect to bit error rate (BER) performance. It can be shown that the fuzzy based convolutional decoder here proposed considerably reduces computational effort with only minor BER performance degradation when compared to the classical Viterbi approach.

Method to Calculate the Inverse of a Complex Matrix using Real Matrix Inversion
This paper describes a simple method to calculate the invers of a complex matrix. The key element of the method is to use a matrix inversion, which is available and optimised for real numbers. Some actual libraries used for digital signal processing only provide highly optimised methods to calculate the inverse of a real matrix, whereas no solution for complex matrices are available, like in [1]. The presented algorithm is very easy to implement, while still much more efficient than for example the method presented in [2]. [1] Visual DSP++ 4.0 C/C++ Compiler and Library Manual for TigerSHARC Processors; Analog Devices; 2005. [2] W. Press, S.A. Teukolsky, W.T. Vetterling, B.R. Flannery; Numerical Recipes in C++, The art of scientific computing, Second Edition; p52 : “Complex Systems of Equations”;Cambridge University Press 2002.

Real Time Implementation of Multi-Level Perfect Signal Reconstruction Filter Bank
Discrete Wavelet Transform (DWT) is an efficient tool for signal and image processing applications which has been utilized for perfect signal reconstruction. In this paper, twenty seven optimum combinations of three different wavelet filter types, three different filter reconstruction levels and three different kinds of signal for multi-level perfect reconstruction filter bank were implemented in MATLAB/Simulink. All the filters for different wavelet types were designed using Filter Design Analysis (FDA) and Wavelet toolbox. Signal to Noise Ratio (SNR) was calculated for each combination. Combination with best SNR was then implemented on TMS320C6713 DSP kit. Real time testing of perfect reconstruction on DSP kit was then carried out by two different methods. Experimental results accede with theory and simulations.

Algorithm Adaptation and Optimization of a Novel DSP Vector Co-processor
The Division of Computer Engineering at Linköping's university is currently researching the possibility to create a highly parallel DSP platform, that can keep up with the computational needs of upcoming standards for various applications, at low cost and low power consumption. The architecture is called ePUMA and it combines a general RISC DSP master processor with eight SIMD co-processors on a single chip. The master processor will act as the main processor for general tasks and execution control, while the co-processors will accelerate computing intensive and parallel DSP kernels.This thesis investigates the performance potential of the co-processors by implementing matrix algebra kernels for QR decomposition, LU decomposition, matrix determinant and matrix inverse, that run on a single co-processor. The kernels will then be evaluated to find possible problems with the co-processors' microarchitecture and suggest solutions to the problems that might exist. The evaluation shows that the performance potential is very good, but a few problems have been identified, that causes significant overhead in the kernels. Pipeline mismatches, that occurs due to different pipeline lengths for different instructions, causes pipeline hazards and the current solution to this, doesn't allow effective use of the pipeline. In some cases, the single port memories will cause bottlenecks, but the thesis suggests that the situation could be greatly improved by using buffered memory write-back. Also, the lack of register forwarding makes kernels with many data dependencies run unnecessarily slow.

Correlation and Power Spectrum
In the signals and systems course and in the first course in digital signal processing, a signal is, most often, characterized by its amplitude spectrum in the frequency-domain and its amplitude profile in the time-domain. So much a student gets used to this type of characterization, that the student finds it difficult to appreciate, when encountered in the ensuing statistical signal processing course, the fact that a signal can also be characterized by its autocorrelation function in the time-domain and the corresponding power spectrum in the frequency-domain and that the amplitude characterization is not available. In this article, the characterization of a signal by its autocorrelation function in the time-domain and the corresponding power spectrum in the frequency-domain is described. Cross-correlation of two signals is also presented.

Digital Signal Processing Maths
Modern digital signal processing makes use of a variety of mathematical techniques. These techniques are used to design and understand efficient filters for data processing and control.

Auditory Component Analysis Using Perceptual Pattern Recognition to Identify and Extract Independent Components From an Auditory Scene
The cocktail party effect, our ability to separate a sound source from a multitude of other sources, has been researched in detail over the past few decades, and many investigators have tried to model this on computers. Two of the major research areas currently being evaluated for the so-called sound source separation problem are Auditory Scene Analysis (Bregman 1990) and a class of statistical analysis techniques known as Independent Component Analysis (Hyvärinen 2001). This paper presents a methodology for combining these two techniques. It suggests a framework that first separates sounds by analyzing the incoming audio for patterns and synthesizing or filtering them accordingly, measures features of the resulting tracks, and finally separates sounds statistically by matching feature sets and making the output streams statistically independent. Artificial and acoustical mixes of sounds are used to evaluate the signal-to-noise ratio where the signal is the desired source and the noise is comprised of all other sources. The proposed system is found to successfully separate audio streams. The amount of separation is inversely proportional to the amount of reverberation present.

Fundamentals of the DFT (fft) Algorithms
In this article, a physical explanation of the fundamentals of the DFT (fft) algorithms is presented in terms of waveform decomposition. After reading the article and trying the examples, the reader is expected to gain a clear understanding of the basics of the mysterious DFT (fft) algorithms.

Voice Activity Detection. Fundamentals and Speech Recognition System Robustness
An important drawback affecting most of the speech processing systems is the environmental noise and its harmful effect on the system performance. Examples of such systems are the new wireless communications voice services or digital hearing aid devices. In speech recognition, there are still technical barriers inhibiting such systems from meeting the demands of modern applications. Numerous noise reduction techniques have been developed to palliate the effect of the noise on the system performance and often require an estimate of the noise statistics obtained by means of a precise voice activity detector (VAD). Speech/non-speech detection is an unsolved problem in speech processing and affects numerous applications including robust speech recognition, discontinuous transmission, real-time speech transmission on the Internet or combined noise reduction and echo cancellation schemes in the context of telephony. The speech/non-speech classification task is not as trivial as it appears, and most of the VAD algorithms fail when the level of background noise increases. During the last decade, numerous researchers have developed different strategies for detecting speech on a noisy signal and have evaluated the influence of the VAD effectiveness on the performance of speech processing systems. Most of the approaches have focussed on the development of robust algorithms with special attention being paid to the derivation and study of noise robust features and decision rules. The different VAD methods include those based on energy thresholds, pitch detection, spectrum analysis, zero-crossing rate, periodicity measure, higher order statistics in the LPC residual domain or combinations of different features. This chapter shows a comprehensive approximation to the main challenges in voice activity detection, the different solutions that have been reported in a complete review of the state of the art and the evaluation frameworks that are normally used. The application of VADs for speech coding, speech enhancement and robust speech recognition systems is shown and discussed. Three different VAD methods are described and compared to standardized and recently reported strategies by assessing the speech/non-speech discrimination accuracy and the robustness of speech recognition systems.

Digital Image Processing Using LabView
Digital Image processing is a topic of great relevance for practically any project, either for basic arrays of photodetectors or complex robotic systems using artificial vision. It is an interesting topic that offers to multimodal systems the capacity to see and understand their environment in order to interact in a natural and more efficient way. The development of new equipment for high speed image acquisition and with higher resolutions requires a significant effort to develop techniques that process the images in a more efficient way. Besides, medical applications use new image modalities and need algorithms for the interpretation of these images as well as for the registration and fusion of the different modalities, so that the image processing is a productive area for the development of multidisciplinary applications. The aim of this chapter is to present different digital image processing algorithms using LabView and IMAQ vision toolbox. IMAQ vision toolbox presents a complete set of digital image processing and acquisition functions that improve the efficiency of the projects and reduce the programming effort of the users obtaining better results in shorter time. Therefore, the IMAQ vision toolbox of LabView is an interesting tool to analyze in detail and through this chapter it will be presented different theories about digital image processing and different applications in the field of image acquisition, image transformations. This chapter includes in first place the image acquisition and some of the most common operations that can be locally or globally applied, the statistical information generated by the image in a histogram is commented later. Finally, the use of tools allowing to segment or filtrate the image are described making special emphasis in the algorithms of pattern recognition and matching template.

STUDY OF DIGITAL MODULATION TECHNIQUES
Modulation is the process of facilitating the transfer of information over a medium. Typically the objective of a digital communication system is to transport digital data between two or more nodes. In radio communications this is usually achieved by adjusting a physical characteristic of a sinusoidal carrier, either the frequency, phase, amplitude or a combination thereof . This is performed in real systems with a modulator at the transmitting end to impose the physical change to the carrier and a demodulator at the receiving end to detect the resultant modulation on reception. Hence, modulation can be objectively defined as the process of converting information so that it can be successfully sent through a medium. This thesis deals with the current digital modulation techniques used in industry. Also, the thesis examines the qualitative and quantitative criteria used in selection of one modulation technique over the other. All the experiments, and realted data collected were obtained using MATLAB and SIMULINK

Region based Active Contour Segmentation
In this paper, we propose a natural framework that allows any region-based segmentation energy to be re-formulated in a local way. We consider local rather than global image statistics and evolve a contour based on local information. Localized contours are capable of segmenting objects with heterogeneous feature profiles that would be difficult to capture correctly using a standard global method. The presented technique is versatile enough to be used with any global region-based active contour energy and instill in it the benefits of localization. We describe this framework and demonstrate the localization of three well-known energies in order to illustrate how our framework can be applied to any energy. We then compare each localized energy to its global counterpart to show the improvements that can be achieved. Next, an in-depth study of the behaviors of these energies in response to the degree of localization is given. Finally, we show results on challenging images to illustrate the robust and accurate segmentations that are possible with this new class of active contour models.

Adaptive distributed noise reduction for speech enhancement in wireless acoustic sensor networks
An adaptive distributed noise reduction algorithm for speech enhancement is considered, which operates in a wireless acoustic sensor network where each node collects multiple microphone signals. In previous work, it was shown theoretically that for a stationary scenario, the algorithm provides the same signal estimators as the centralized multi-channel Wiener filter, while significantly compressing the data that is transmitted between the nodes. Here, we present simulation results of a fully adaptive implementation of the algorithm, in a non-stationary acoustic scenario with a moving speaker and two babble noise sources. The algorithm is implemented using a weighted overlap-add technique to reduce the overall input-output delay. It is demonstrated that good results can be obtained by estimating the required signal statistics with a long-term forgetting factor without downdating, even though the signal statistics change along with the iterative filter updates. It is also demonstrated that simultaneous node updating provides a significantly smoother and faster tracking performance compared to sequential node updating.

EFFICIENT MAPPING OF ADVANCED SIGNAL PROCESSING ALGORITHMS ON MULTI-PROCESSOR ARCHITECTURES
Modern microprocessor technology is migrating from simply increasing clock speeds on a single processor to placing multiple processors on a die to increase throughput and power performance in every generation. To utilize the potential of such a system, signal processing algorithms have to be efficiently parallelized so that the load can be distributed evenly among the multiple processing units. In this paper, we study several advanced deterministic and stochastic signal processing algorithms and their computation using multiple processing units. Specifically, we consider two commonly used time-frequency signal representations, the short-time Fourier transform and the Wigner distribution, and we demonstrate their parallelization with low communication overhead. We also consider sequential Monte Carlo estimation techniques such as particle filtering, and we demonstrate that its multiple processor implementation requires large data exchanges and thus a high communication overhead. We propose a modified mapping scheme that reduces this overhead at the expense of a slight loss in accuracy, and we evaluate the performance of the scheme for a state estimation problem with respect to accuracy and scalability.

Closing the gap: CPU and FPGA Trends in sustainable floating-point BLAS performance
Field programmable gate arrays (FPGAs) have long been an attractive alternative to microprocessors for computing tasks — as long as floating-point arithmetic is not required. Fueled by the advance of Moore’s Law, FPGAs are rapidly reaching sufficient densities to enhance peak floating-point performance as well. The question, however, is how much of this peak performance can be sustained. This paper examines three of the basic linear algebra subroutine (BLAS) functions: vector dot product, matrix-vector multiply, and matrix multiply. A comparison of microprocessors, FPGAs, and Reconfigurable Computing platforms is performed for each operation. The analysis highlights the amount of memory bandwidth and internal storage needed to sustain peak performance with FPGAs. This analysis considers the historical context of the last six years and is extrapolated for the next six years.

BLAS Comparison on FPGA, CPU and GPU
High Performance Computing (HPC) or scientific codes are being executed across a wide variety of computing platforms from embedded processors to massively parallel GPUs. We present a comparison of the Basic Linear Algebra Subroutines (BLAS) using double-precision floating point on an FPGA, CPU and GPU. On the CPU and GPU, we utilize standard libraries on state-of-the-art devices. On the FPGA, we have developed parameterized modular implementations for the dot product and Gaxpy or matrix-vector multiplication. In order to obtain optimal performance for any aspect ratio of the matrices, we have designed a high-throughput accumulator to perform an efficient reduction of floating point values. To support scalability to large data-sets, we target the BEE3 FPGA platform. We use performance and energy efficiency as metrics to compare the different platforms. Results show that FPGAs offer comparable performance as well as 2.7 to 293 times better energy efficiency for the test cases that we implemented on all three platforms.

Wavelet Denoising for TDR Dynamic Range Improvement
A technique is presented for removing large amounts of noise present in time-domain-reflectometry (TDR) waveforms to increase the dynamic range of TDR waveforms and TDR based s-parameter measurements.

Bilinear Transformation Made Easy
A formula is derived and demonstrated that is capable of directly generating digital filter coefficients from an analog filter prototype using the bilinear transformation. This formula obviates the need for any algebraic manipulation of the analog prototype filter and is ideal for use in embedded systems that must take in any general analog filter specification and dynamically generate digital filter coefficients directly usable in difference equations.