STUDY OF DIGITAL MODULATION TECHNIQUES
Modulation is the process of facilitating the transfer of information over a medium. Typically, the objective of a digital communication system is to transport digital data between two or more nodes. In radio communications this is usually achieved by adjusting a physical characteristic of a sinusoidal carrier: its frequency, phase, amplitude, or a combination thereof. In real systems this is performed by a modulator at the transmitting end, which imposes the physical change on the carrier, and a demodulator at the receiving end, which detects the resultant modulation on reception. Hence, modulation can be objectively defined as the process of converting information so that it can be successfully sent through a medium. This thesis deals with the digital modulation techniques currently used in industry, and examines the qualitative and quantitative criteria used in selecting one modulation technique over another. All experiments and related data were obtained using MATLAB and Simulink.
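As a minimal illustration of the carrier-adjustment idea (phase modulation, in this case), the following Python/NumPy sketch generates and coherently demodulates a BPSK signal. The sample rate, carrier frequency, and bit rate are arbitrary illustrative values; the thesis experiments themselves were done in MATLAB and Simulink.

    import numpy as np

    fs, fc, rb = 8000.0, 1000.0, 100.0   # sample rate, carrier, bit rate (Hz)
    sps = int(fs / rb)                   # samples per bit

    bits = np.random.randint(0, 2, 16)   # random payload
    symbols = 2 * bits - 1               # map {0, 1} -> {-1, +1}
    baseband = np.repeat(symbols, sps)   # rectangular pulse shaping

    t = np.arange(baseband.size) / fs
    bpsk = baseband * np.cos(2 * np.pi * fc * t)  # 180-degree phase flips

    # Coherent demodulation: mix back down, integrate over each bit period.
    mixed = bpsk * np.cos(2 * np.pi * fc * t)
    decisions = (mixed.reshape(-1, sps).sum(axis=1) > 0).astype(int)
    assert np.array_equal(decisions, bits)

Frequency and amplitude modulation follow the same pattern, with the data driving the carrier's instantaneous frequency or envelope instead of its phase.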
Region-Based Active Contour Segmentation
In this paper, we propose a natural framework that allows any region-based segmentation energy to be re-formulated in a local way. We consider local rather than global image statistics and evolve a contour based on local information. Localized contours are capable of segmenting objects with heterogeneous feature profiles that would be difficult to capture correctly using a standard global method. The presented technique is versatile enough to be used with any global region-based active contour energy and instill in it the benefits of localization. We describe this framework and demonstrate the localization of three well-known energies in order to illustrate how our framework can be applied to any energy. We then compare each localized energy to its global counterpart to show the improvements that can be achieved. Next, an in-depth study of the behaviors of these energies in response to the degree of localization is given. Finally, we show results on challenging images to illustrate the robust and accurate segmentations that are possible with this new class of active contour models.
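To make the localization idea concrete, here is a minimal Python/NumPy sketch of one evolution step driven by a localized Chan-Vese-style (mean-separation) force. The function name, window radius, and step size are illustrative assumptions, and the curvature regularization and level-set reinitialization a practical implementation needs are omitted.

    import numpy as np
    from scipy.ndimage import uniform_filter

    def local_cv_step(phi, img, radius=9, dt=0.5, eps=1e-8):
        """One step of a localized Chan-Vese-style evolution.

        phi : level-set function (interior where phi < 0); img : float image.
        Interior/exterior means are taken over a (2*radius+1)^2 window
        around each pixel rather than over the whole image.
        """
        inside = (phi < 0).astype(float)
        size = 2 * radius + 1
        m_in = uniform_filter(img * inside, size) / (
            uniform_filter(inside, size) + eps)
        m_out = uniform_filter(img * (1.0 - inside), size) / (
            uniform_filter(1.0 - inside, size) + eps)
        # Negative force where the pixel matches the local interior model,
        # pulling phi down (the pixel joins the interior), and vice versa.
        force = (img - m_in) ** 2 - (img - m_out) ** 2
        return phi + dt * force  # curvature term and reinitialization omitted

The only change from the global energy is that the interior and exterior means are computed per pixel over a local window instead of over the whole image, which is what lets the contour adapt to heterogeneous feature profiles.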
LOW-RESOURCE DELAYLESS SUBBAND ADAPTIVE FILTER USING WEIGHTED OVERLAP-ADD
A delayless structure targeted for low-resource implementation is proposed to eliminate filterbank processing delays in subband adaptive filters (SAFs). Rather than using direct IFFT or polyphase filterbanks to transform the SAFs back into the time-domain, the proposed method utilizes a weighted overlap-add (WOLA) synthesis. Low-resource real-time implementations are targeted and as such do not involve long (as long as the echo plant) FFT or IFFT operations. Also, the proposed approach facilitates time distribution of the adaptive filter reconstruction calculations crucial for efficient real-time and hardware implementation. The method is implemented on an oversampled WOLA filterbank employed as part of an echo cancellation application. Evaluation results demonstrate that the proposed implementation outperforms conventional SAF systems since the signals used in actual adaptive filtering are not distorted by filterbank aliasing. The method is a good match for partial update adaptive algorithms since segments of the time-domain adaptive filter are sequentially reconstructed and updated.
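The synthesis half of the scheme can be sketched compactly. The following Python/NumPy fragment is a generic WOLA synthesis (subband frames back to the time domain via IFFT, windowing, and overlap-add); the adaptive-filter reconstruction and the oversampled analysis bank of the paper are not reproduced, and the identifiers are illustrative.

    import numpy as np

    def wola_synthesis(spec_frames, win, hop):
        """Generic weighted overlap-add (WOLA) synthesis.

        spec_frames : (n_frames, n_fft // 2 + 1) complex subband frames
        win         : length n_fft synthesis window
        hop         : frame advance in samples (oversampled when small)
        """
        n_fft = win.size
        n_frames = spec_frames.shape[0]
        out = np.zeros(hop * (n_frames - 1) + n_fft)
        for k in range(n_frames):
            frame = np.fft.irfft(spec_frames[k], n_fft)  # back to time domain
            out[k * hop : k * hop + n_fft] += win * frame  # weight and add
        return out

Because each hop reconstructs only one windowed segment, the work can be spread across frames, which is the property the paper exploits to distribute the adaptive filter reconstruction in time.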
OPTIMAL DESIGN OF DIGITAL EQUIVALENTS TO ANALOG FILTERS
The proposed optimal algorithm for the digitizing of analog filters is based on two existing filter design methods: the extended window design (EWD) and the matched-pole (MP) frequency sampling design. The latter is closely related to the filter design with iterative weighted least squares (WLS). The optimization is performed with an original MP design that yields an equiripple digitizing error. Then, a drastic reduction of the digitizing error is achieved through the introduction of a fractional time shift that minimizes the magnitude of the equiripple error within a given frequency interval. The optimal parameters thus obtained can be used to generate the EWD equations, together with a variable fractional delay output, as described in an earlier paper. Finally, in contrast to the WLS procedure, which relies on a “good guess” of the weighting function, the MP optimization is straightforward.
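The matched-pole idea is closely related to the classical matched-Z transform, in which each analog pole and zero s_k is mapped to exp(s_k*T). The Python/NumPy sketch below shows only that baseline mapping with a gain match at a chosen frequency; the paper's equiripple MP optimization and fractional time shift are not reproduced, and all names are illustrative.

    import numpy as np

    def matched_z(poles_s, zeros_s, T, f_match=0.0):
        """Matched-Z digitizing: map every analog singularity via z = exp(s*T).

        poles_s, zeros_s : analog poles and zeros in rad/s; T : sample period.
        The gain is matched to the analog response at f_match (Hz).
        """
        poles_z = np.exp(np.asarray(poles_s, dtype=complex) * T)
        zeros_z = np.exp(np.asarray(zeros_s, dtype=complex) * T)
        s = 2j * np.pi * f_match
        z = np.exp(s * T)
        h_s = np.prod(s - np.asarray(zeros_s)) / np.prod(s - np.asarray(poles_s))
        h_z = np.prod(z - zeros_z) / np.prod(z - poles_z)
        return np.abs(h_s / h_z), zeros_z, poles_z

    # Example: digitize the lowpass H(s) = 1/(s + 1) at fs = 100 Hz.
    k, zz, pz = matched_z(poles_s=[-1.0], zeros_s=[], T=0.01)
    b, a = np.atleast_1d(k * np.poly(zz)), np.poly(pz)  # digital coefficients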
A NEW PARALLEL IMPLEMENTATION FOR PARTICLE FILTERS AND ITS APPLICATION TO ADAPTIVE WAVEFORM DESIGN
Sequential Monte Carlo particle filters (PFs) are useful for estimating nonlinear non-Gaussian dynamic system parameters. As these algorithms are recursive, their real-time implementation can be computationally complex. In this paper, we analyze the bottlenecks in existing parallel PF algorithms, and we propose a new approach that integrates parallel PFs with independent Metropolis-Hastings (PPF-IMH) algorithms to improve the root mean-squared estimation error. We implement the new PPF-IMH algorithm on a Xilinx Virtex-5 field programmable gate array (FPGA) platform. For a one-dimensional problem and using 1,000 particles, the PPF-IMH architecture with four processing elements utilizes less than 5% of the Virtex-5 FPGA resources and takes 5.85 μs for one iteration. The algorithm performance is also demonstrated when designing the waveform for an agile sensing application.
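For reference, one recursion of the bootstrap particle filter that underlies the PPF-IMH architecture can be sketched in a few lines of Python/NumPy; the PPF-IMH parallelization and the FPGA mapping are not shown, and the function signature is an illustrative assumption.

    import numpy as np

    def pf_step(particles, weights, y, f, h, q_std, r_std, rng):
        """One predict/update/resample cycle of a bootstrap particle filter.

        particles : (N,) state samples; y : scalar measurement;
        f, h : state-transition and measurement functions.
        """
        N = particles.size
        particles = f(particles) + q_std * rng.standard_normal(N)  # predict
        weights = weights * np.exp(-0.5 * ((y - h(particles)) / r_std) ** 2)
        weights = weights / weights.sum()                          # update
        if 1.0 / np.sum(weights ** 2) < N / 2:   # effective size collapsed?
            u = (rng.random() + np.arange(N)) / N                # systematic
            idx = np.minimum(np.searchsorted(np.cumsum(weights), u), N - 1)
            particles, weights = particles[idx], np.full(N, 1.0 / N)
        return particles, weights

The resampling step needs all of the weights at once; that global exchange is the main parallelization bottleneck the PPF-IMH scheme is designed to break up.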
A pole-zero placement technique for designing second-order IIR parametric equalizer filters
A new procedure is presented for designing second-order parametric equalizer filters. In contrast to the traditional approach, in which the design is based on a bilinear transform of an analog filter, the presented procedure allows for designing the filter directly in the digital domain. A rather intuitive technique, known as pole-zero placement, is treated here in a quantitative way. It is shown that by making some meaningful approximations, a set of relatively simple design equations can be obtained. Design examples of both notch and resonance filters are included to illustrate the performance of the proposed method, and to compare with state-of-the-art solutions.
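A textbook version of the pole-zero placement rule makes the starting point concrete: zeros on the unit circle at the notch frequency, poles just inside at the same angle. This Python/NumPy sketch uses the common rule of thumb r ≈ 1 − πB/fs for the pole radius; the paper's refined design equations are more accurate than this approximation.

    import numpy as np

    def notch_filter(f0, bw, fs):
        """Second-order notch by direct pole-zero placement.

        Zeros on the unit circle at the notch frequency; poles just inside
        at the same angle, with radius set by the desired bandwidth.
        """
        w0 = 2 * np.pi * f0 / fs
        r = 1.0 - np.pi * bw / fs          # pole radius controls notch width
        b = np.array([1.0, -2.0 * np.cos(w0), 1.0])
        a = np.array([1.0, -2.0 * r * np.cos(w0), r * r])
        b *= a.sum() / b.sum()             # normalize for unity gain at DC
        return b, a

    b, a = notch_filter(f0=50.0, bw=5.0, fs=1000.0)  # 50 Hz hum notch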
Adaptive distributed noise reduction for speech enhancement in wireless acoustic sensor networks
An adaptive distributed noise reduction algorithm for speech enhancement is considered, which operates in a wireless acoustic sensor network where each node collects multiple microphone signals. In previous work, it was shown theoretically that for a stationary scenario, the algorithm provides the same signal estimators as the centralized multi-channel Wiener filter, while significantly compressing the data that is transmitted between the nodes. Here, we present simulation results of a fully adaptive implementation of the algorithm, in a non-stationary acoustic scenario with a moving speaker and two babble noise sources. The algorithm is implemented using a weighted overlap-add technique to reduce the overall input-output delay. It is demonstrated that good results can be obtained by estimating the required signal statistics with a long-term forgetting factor without downdating, even though the signal statistics change along with the iterative filter updates. It is also demonstrated that simultaneous node updating provides a significantly smoother and faster tracking performance compared to sequential node updating.
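For orientation, the centralized multi-channel Wiener filter that the distributed algorithm matches can be sketched per frequency bin, including the long-term forgetting-factor estimation of the signal statistics. The distributed (compressed, per-node) version and the WOLA front end are not reproduced, and the identifiers are illustrative assumptions.

    import numpy as np

    def mwf_update(Ryy, Rnn, y, speech_active, lam=0.999):
        """Per-bin multichannel Wiener filter with long-term forgetting.

        y : (M,) stacked microphone STFT coefficients for one bin and frame.
        Ryy, Rnn : running speech-plus-noise and noise-only covariances,
        typically initialized to small scaled identity matrices.
        speech_active : flag from a voice activity detector.
        """
        outer = np.outer(y, y.conj())
        if speech_active:
            Ryy = lam * Ryy + (1 - lam) * outer  # forgetting, no downdating
        else:
            Rnn = lam * Rnn + (1 - lam) * outer
        e0 = np.zeros(Ryy.shape[0])
        e0[0] = 1.0                              # reference channel selector
        w = np.linalg.solve(Ryy, (Ryy - Rnn) @ e0)  # estimate channel-0 speech
        return Ryy, Rnn, w, np.vdot(w, y)           # filtered output = w^H y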
EFFICIENT MAPPING OF ADVANCED SIGNAL PROCESSING ALGORITHMS ON MULTI-PROCESSOR ARCHITECTURES
Modern microprocessor technology is migrating from simply increasing clock speeds on a single processor to placing multiple processors on a die to increase throughput and power efficiency in every generation. To utilize the potential of such a system, signal processing algorithms have to be efficiently parallelized so that the load can be distributed evenly among the multiple processing units. In this paper, we study several advanced deterministic and stochastic signal processing algorithms and their computation using multiple processing units. Specifically, we consider two commonly used time-frequency signal representations, the short-time Fourier transform and the Wigner distribution, and we demonstrate their parallelization with low communication overhead. We also consider sequential Monte Carlo estimation techniques such as particle filtering, and we demonstrate that a multiprocessor implementation of particle filtering requires large data exchanges and thus a high communication overhead. We propose a modified mapping scheme that reduces this overhead at the expense of a slight loss in accuracy, and we evaluate the performance of the scheme for a state estimation problem with respect to accuracy and scalability.
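The contrast the paper draws can be illustrated with the short-time Fourier transform, whose frames are mutually independent and therefore parallelize with minimal communication. This Python sketch distributes frames across worker processes; the parameters are illustrative and only the standard library and NumPy are assumed.

    import numpy as np
    from concurrent.futures import ProcessPoolExecutor

    def stft_frame(args):
        """One windowed FFT; frames are independent, so the only traffic is
        scattering input segments and gathering the transformed frames."""
        segment, window = args
        return np.fft.rfft(segment * window)

    def parallel_stft(x, n_fft=512, hop=256, workers=4):
        window = np.hanning(n_fft)
        starts = range(0, len(x) - n_fft + 1, hop)
        jobs = [(x[s:s + n_fft], window) for s in starts]
        with ProcessPoolExecutor(max_workers=workers) as pool:
            return np.array(list(pool.map(stft_frame, jobs)))

    if __name__ == "__main__":
        print(parallel_stft(np.random.randn(16384)).shape)

Particle filtering lacks this structure: resampling requires gathering all particle weights on every iteration, which is the communication overhead the modified mapping scheme reduces.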
Closing the gap: CPU and FPGA trends in sustainable floating-point BLAS performance
Field programmable gate arrays (FPGAs) have long been an attractive alternative to microprocessors for computing tasks — as long as floating-point arithmetic is not required. Fueled by the advance of Moore’s Law, FPGAs are rapidly reaching sufficient densities to enhance peak floating-point performance as well. The question, however, is how much of this peak performance can be sustained. This paper examines three of the basic linear algebra subroutine (BLAS) functions: vector dot product, matrix-vector multiply, and matrix multiply. A comparison of microprocessors, FPGAs, and Reconfigurable Computing platforms is performed for each operation. The analysis highlights the amount of memory bandwidth and internal storage needed to sustain peak performance with FPGAs. This analysis considers the historical context of the last six years and is extrapolated for the next six years.
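The bandwidth analysis amounts to a roofline-style bound: sustained throughput cannot exceed arithmetic intensity times memory bandwidth. A small Python sketch with illustrative (not measured) numbers makes the point for the dot product, whose intensity is only 2 flops per 16 bytes of operands.

    # Roofline-style bound: sustained rate <= min(peak, intensity * bandwidth).
    def sustained_gflops(peak_gflops, bandwidth_gbs, flops, bytes_moved):
        intensity = flops / bytes_moved          # useful flops per byte moved
        return min(peak_gflops, intensity * bandwidth_gbs)

    # Dot product of n doubles: 2n flops against 16n bytes of operands, so
    # intensity is 0.125 flop/byte; at 10 GB/s the sustained rate is capped
    # at 1.25 GFLOPS no matter how high the device's peak compute is.
    print(sustained_gflops(peak_gflops=50.0, bandwidth_gbs=10.0,
                           flops=2e6, bytes_moved=16e6))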
BLAS Comparison on FPGA, CPU and GPU
High Performance Computing (HPC) or scientific codes are being executed across a wide variety of computing platforms from embedded processors to massively parallel GPUs. We present a comparison of the Basic Linear Algebra Subroutines (BLAS) using double-precision floating point on an FPGA, CPU and GPU. On the CPU and GPU, we utilize standard libraries on state-of-the-art devices. On the FPGA, we have developed parameterized modular implementations for the dot product and Gaxpy or matrix-vector multiplication. In order to obtain optimal performance for any aspect ratio of the matrices, we have designed a high-throughput accumulator to perform an efficient reduction of floating point values. To support scalability to large data-sets, we target the BEE3 FPGA platform. We use performance and energy efficiency as metrics to compare the different platforms. Results show that FPGAs offer comparable performance as well as 2.7 to 293 times better energy efficiency for the test cases that we implemented on all three platforms.
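The accumulator problem arises because floating-point addition is pipelined and non-associative, so partial sums must be reduced carefully. A software analogue of a tree-shaped reduction is sketched below in Python/NumPy; the paper's hardware accumulator itself is not reproduced.

    import numpy as np

    def pairwise_sum(x):
        """Tree-shaped floating-point reduction.

        Adding values in balanced pairs bounds rounding-error growth to
        O(log n) (versus O(n) for a sequential running sum) and exposes
        the parallelism that a pipelined hardware accumulator exploits.
        """
        x = np.array(x, dtype=np.float64)
        if x.size == 0:
            return 0.0
        n = x.size
        while n > 1:
            half = n // 2
            x[:half] += x[n - half : n]  # fold the top half onto the bottom
            n -= half                    # a middle leftover element survives
        return float(x[0])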
Digital Signal Processing Maths
Modern digital signal processing makes use of a variety of mathematical techniques. These techniques are used to design and understand efficient filters for data processing and control.
FUZZY LOGIC BASED CONVOLUTIONAL DECODER FOR USE IN MOBILE TELEPHONE SYSTEMS
Efficient convolutional coding and decoding algorithms are crucial to the successful operation of wireless communication systems, where a high quality of service is achieved by reducing the overall bit error rate. A widely applied and well-evaluated scheme for error correction is the Viterbi algorithm [7]. Although the Viterbi algorithm has very good error-correcting characteristics, the required computational effort remains high. In this paper a novel approach is discussed, introducing a convolutional decoder design based on fuzzy logic. A simplified version of this fuzzy-based decoder is examined with respect to bit error rate (BER) performance. It is shown that the proposed fuzzy-based convolutional decoder considerably reduces the computational effort, with only minor BER performance degradation when compared to the classical Viterbi approach.
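For the baseline being compared against, a complete hard-decision Viterbi decoder for the standard rate-1/2, constraint-length-3 code (generators 7 and 5 octal) fits in a short Python sketch; the fuzzy-logic decoder proposed in the paper is not reproduced here.

    K, G = 3, (0b111, 0b101)        # constraint length, generators (7,5 octal)
    N_STATES = 1 << (K - 1)

    def branch(state, bit):
        """Next trellis state and two coded bits for `bit` entering `state`."""
        reg = (bit << (K - 1)) | state   # shift register, newest bit on left
        out = [bin(reg & g).count("1") & 1 for g in G]
        return reg >> 1, out

    def encode(bits):
        state, out = 0, []
        for b in bits:
            state, o = branch(state, b)
            out += o
        return out

    def viterbi_decode(rx):
        """Hard-decision Viterbi: keep the minimum-Hamming-distance path."""
        INF = 10 ** 9
        metric = [0] + [INF] * (N_STATES - 1)
        paths = [[] for _ in range(N_STATES)]
        for t in range(len(rx) // 2):
            r = rx[2 * t : 2 * t + 2]
            new_metric = [INF] * N_STATES
            new_paths = [[] for _ in range(N_STATES)]
            for s in range(N_STATES):
                if metric[s] >= INF:
                    continue
                for b in (0, 1):
                    ns, o = branch(s, b)
                    m = metric[s] + (o[0] != r[0]) + (o[1] != r[1])
                    if m < new_metric[ns]:
                        new_metric[ns], new_paths[ns] = m, paths[s] + [b]
            metric, paths = new_metric, new_paths
        return paths[min(range(N_STATES), key=metric.__getitem__)]

    bits = [1, 0, 1, 1, 0, 0]
    assert viterbi_decode(encode(bits)) == bits

The inner double loop over states and branches is the computational burden the fuzzy decoder aims to reduce.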
Fundamentals of the DFT (FFT) Algorithms
In this article, a physical explanation of the fundamentals of the DFT (FFT) algorithms is presented in terms of waveform decomposition. After reading the article and trying the examples, the reader is expected to gain a clear understanding of the basics of the mysterious DFT (FFT) algorithms.
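The waveform-decomposition view corresponds directly to computing the DFT as a correlation of the signal with candidate sinusoids, as this Python/NumPy sketch shows; the FFT computes exactly the same result, only faster.

    import numpy as np

    def dft_by_correlation(x):
        """DFT as waveform decomposition: X[k] is the correlation of the
        signal with a complex sinusoid making k cycles per record."""
        N = len(x)
        n = np.arange(N)
        return np.array([np.sum(x * np.exp(-2j * np.pi * k * n / N))
                         for k in range(N)])

    x = np.cos(2 * np.pi * 3 * np.arange(8) / 8)    # 3 cycles in 8 samples
    print(np.round(np.abs(dft_by_correlation(x)), 6))  # energy in bins 3, 5
    assert np.allclose(dft_by_correlation(x), np.fft.fft(x))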
An application of neural networks to adaptive playout delay in VoIP
The statistical nature of data traffic and the dynamic routing techniques employed in IP networks result in a varying network delay (jitter) experienced by the individual IP packets which form a VoIP flow. As a result, voice packets generated at successive and periodic intervals at a source will typically be buffered at the receiver prior to playback in order to smooth out the jitter. However, the additional delay introduced by the playout buffer degrades the quality of service. Thus, the ability to forecast the jitter is an integral part of selecting an appropriate buffer size. This paper compares several neural network based models for adaptive playout buffer selection, and in particular a novel combined wavelet transform/neural network approach is proposed. The effectiveness of these algorithms is evaluated using recorded VoIP traces by comparing the buffering delay and the packet loss ratios for each technique. In addition, an output speech signal is reconstructed based on the packet loss information for each algorithm and the perceptual quality of the speech is then estimated using the PESQ MOS algorithm. Simulation results indicate that the proposed Haar-Wavelets-Packet MLP and Statistical-Model MLP adaptive scheduling schemes offer superior performance.
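As a point of reference for the neural approaches, the classical autoregressive playout-delay estimator of Ramjee et al., a standard baseline in this literature, can be stated in a few lines of Python; the constants are the commonly cited defaults, not values from this paper.

    def playout_delay(network_delays, alpha=0.998002, beta=4.0):
        """Classical autoregressive playout-delay estimate (Ramjee et al.).

        Tracks the running mean and variation of per-packet network delay
        and schedules playback at mean + beta * variation.
        """
        d_hat = v_hat = 0.0
        schedule = []
        for n in network_delays:
            d_hat = alpha * d_hat + (1.0 - alpha) * n
            v_hat = alpha * v_hat + (1.0 - alpha) * abs(n - d_hat)
            schedule.append(d_hat + beta * v_hat)
        return schedule

The neural and wavelet/neural schemes in the paper replace this fixed recursion with a learned forecast of the jitter.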
A Nonlinear Stein Based Estimator for Multichannel Image Denoising
The use of multicomponent images has become widespread with the improvement of multisensor systems having increased spatial and spectral resolutions. However, the observed images are often corrupted by an additive Gaussian noise. In this paper, we are interested in multichannel image denoising based on a multiscale representation of the images. A multivariate statistical approach is adopted to take into account both the spatial and the inter-component correlations existing between the different wavelet subbands. More precisely, we propose a new parametric nonlinear estimator which generalizes many reported denoising methods. The derivation of the optimal parameters is achieved by applying Stein’s principle in the multivariate case. Experiments performed on multispectral remote sensing images clearly indicate that our method outperforms conventional wavelet denoising techniques.
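The simplest member of the family of Stein-type multivariate shrinkage rules is the James-Stein estimator applied to the vector of coefficients across channels, sketched below in Python/NumPy. The paper's estimator is a more general parametric form with Stein-optimized parameters, and the wavelet transform itself is omitted here.

    import numpy as np

    def james_stein_shrink(coeffs, sigma2):
        """Shrink multichannel wavelet coefficients toward zero.

        coeffs : (n, C) array, each row stacking the C channel coefficients
        at one position, so cross-channel energy drives the shrinkage;
        sigma2 : noise variance. Requires C >= 3 for the classic rule.
        """
        C = coeffs.shape[1]
        norms2 = np.sum(coeffs ** 2, axis=1, keepdims=True)
        gain = np.maximum(0.0, 1.0 - (C - 2) * sigma2
                          / np.maximum(norms2, 1e-12))
        return gain * coeffs

Rows whose joint energy across channels is small (likely pure noise) are attenuated strongly, which is how the inter-component correlation is exploited.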
Update To: A Wide-Notch Comb Filter
This article presents alternatives to the wide-notch comb filter described in Reference [1].