Multirate Signal Processing Concepts in Digital Communications
Multirate systems are building blocks commonly used in digital signal processing (DSP). Their function is to alter the rate of discrete-time signals by adding or deleting a portion of the signal samples. They are essential in various standard signal processing techniques such as signal analysis, denoising, compression and so forth. During the last decade, however, they have increasingly found applications in new and emerging areas of signal processing, as well as in several neighboring disciplines such as digital communications. The main contribution of this thesis is aimed towards a better understanding of multirate systems and their use in modern communication systems. To this end, we first study a property of linear systems appearing in certain multirate structures. This property is called biorthogonal partnership, a term introduced recently to address the need for a descriptive name for such a class of filters. In the thesis we especially focus on the extensions of this simple idea to the case of vector signals (MIMO biorthogonal partners) and to nonintegral decimation ratios (fractional biorthogonal partners). The main results developed here study the properties of biorthogonal partners, e.g., the conditions for the existence of stable and of finite impulse response (FIR) partners. In this context we develop the parameterization of FIR solutions, which makes the search for the best partner in a given application analytically tractable. This proves very useful in their central application, namely, channel equalization in digital communications with signal oversampling at the receiver. A good channel equalizer in this context is one that helps neutralize the distortion introduced on the signal by channel propagation, but not at the expense of amplifying the channel noise.
In the second part of the thesis, we focus on another class of multirate systems, used at the transmitter side in order to introduce redundancy in the data stream. This redundancy generally serves to facilitate the equalization process by forcing a certain structure on the transmitted signal. We first consider transmission systems that introduce the redundancy in the form of a cyclic prefix. Examples of such systems include the discrete multitone (DMT) and the orthogonal frequency division multiplexing (OFDM) systems. We study signal precoding in such systems, aimed at improving the performance by minimizing the noise power at the receiver. We also consider a different class of communication systems with signal redundancy, namely, the multiuser systems based on code division multiple access (CDMA). We specifically focus on the special class of CDMA systems called 'a mutually orthogonal usercode receiver' (AMOUR). We show how to find the best equalizer from the class of zero-forcing solutions in such systems, and then increase the size of this class by employing alternative sampling strategies at the receiver.
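The two basic rate-changing operations mentioned at the start of this abstract (deleting samples and adding samples) can be illustrated with a minimal sketch. The function names `downsample` and `upsample` are chosen here for illustration and do not come from the thesis; the anti-alias and interpolation filters that normally accompany these operations are deliberately left out.

```python
def downsample(x, m):
    """Keep every m-th sample of x (decimation by m, no anti-alias filter)."""
    return list(x[::m])


def upsample(x, l):
    """Insert l - 1 zeros after each sample of x (expansion by l, no interpolation filter)."""
    out = []
    for sample in x:
        out.append(sample)
        out.extend([0] * (l - 1))
    return out
```

Expanding by a factor and then decimating by the same factor returns the original sequence, which is the simplest instance of the reconstruction questions that multirate structures raise.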
An application of neural networks to adaptive playout delay in VoIP
The statistical nature of data traffic and the dynamic routing techniques employed in IP networks result in a varying network delay (jitter) experienced by the individual IP packets which form a VoIP flow. As a result, voice packets generated at successive and periodic intervals at a source will typically be buffered at the receiver prior to playback in order to smooth out the jitter. However, the additional delay introduced by the playout buffer degrades the quality of service. Thus, the ability to forecast the jitter is an integral part of selecting an appropriate buffer size. This paper compares several neural network based models for adaptive playout buffer selection, and in particular a novel combined wavelet transform/neural network approach is proposed. The effectiveness of these algorithms is evaluated using recorded VoIP traces by comparing the buffering delay and the packet loss ratios for each technique. In addition, an output speech signal is reconstructed based on the packet loss information for each algorithm, and the perceptual quality of the speech is then estimated using the PESQ MOS algorithm. Simulation results indicate that the proposed Haar-Wavelets-Packet MLP and Statistical-Model MLP adaptive scheduling schemes offer superior performance.
HIERARCHICAL MOTION ESTIMATION FOR EMBEDDED OBJECT TRACKING
This paper presents an algorithm developed to provide automatic motion detection and object tracking embedded within intelligent CCTV systems. The algorithm development focuses on techniques which provide an efficient embedded systems implementation with the ability to target both FPGA and DSP devices. During algorithm development constraints on hardware implementation have been fully considered resulting in an algorithm which, when targeted at current FPGA devices, will take full advantage of the DSP resource commonly provided in such devices. The hierarchical structure of the proposed algorithm provides the system with a multi-level motion estimation process allowing low resolution estimation for motion detection and further higher resolution stages for motion estimation. An initial MATLAB prototype has demonstrated this algorithm capable of object motion estimation while compensating for camera motion, allowing a moving object to be tracked by a moving camera.
An FPGA Implementation of Hierarchical Motion Estimation for Embedded Object Tracking
This paper presents the hardware implementation of an algorithm developed to provide automatic motion detection and object tracking functionality embedded within intelligent CCTV systems. The implementation is targeted at an Altera Stratix FPGA making full use of the dedicated DSP resource. The Altera Nios embedded processor provides a platform for the tracking control loop and generic Pan Tilt Zoom camera interface. This paper details the explicit functional stages of the algorithm that lend themselves to an optimised pipelined hardware implementation. This implementation provides maximum data throughput, providing real-time operation of the described algorithm, and enables a moving camera to track a moving object in real time.
A DGPS/Radiobeacon Receiver for Minimum Shift Keying with Soft Decision Capabilities
The Global Positioning System (GPS) is now in operation, and many improvements to its performance are being sought. One such improvement is Differential GPS (DGPS), where known errors in the GPS broadcast are identified and the corrections broadcast to the end user. One implementation of DGPS being considered is the use of coastal marine radio direction finding (RDF) radiobeacons in the 285-325kHz band as transmitters for the DGPS broadcast. The normal RDF beacon signal consists of a continuous carrier on a one kilohertz boundary plus a Morse-code identification signal 1025Hz above the carrier. In the DGPS/radiobeacon implementation proposed for the US coastal regions, the differential data link signal uses minimum shift keying (MSK) at a data rate of 25, 50, 100, 200 or 400 baud (the exact baud rate has not yet been decided). This MSK signal is centered between the RDF beacon carrier and identification signal. At the frequencies at which these radiobeacons operate, the prevailing atmospheric noise is both non-Gaussian and very strong. This noise characteristic makes the design of a long-range data link difficult. One solution that has been proposed is the use of forward error correction (FEC) coding of the data. The performance of FEC decoders can be improved by the use of a soft decision receiver, which delivers both bit decisions and information about the validity of the bit decisions. This work describes the design of a radio receiver for DGPS/Radiobeacon service which is capable of reception of 400 baud MSK in the DGPS/Radiobeacon band. The receiver is designed to be easily augmented to provide soft decisions and easily modified to receive MSK at data rates of 25 to 400 baud. The radio is a microprocessor controlled dual conversion superheterodyne with an audio frequency of 1kHz. The demodulator runs on the same microprocessor that controls the radio. The weak-signal performance of the demodulator is very good: the Eb/No vs. bit error rate performance of the demodulator is only a couple of dB worse than the theoretical performance of differential phase-shift keying. The radio has a noise floor of -114dBm referenced to its 500Hz wide audio bandwidth and a 3rd order intermodulation intercept of +7dBm for a dynamic range of 83dB. This work concludes with a thumbnail analysis of the operations needed to implement a soft bit decision estimator, and some suggestions for the implementation of said soft bit decision estimator.
IMPLEMENTATION OF PERIODOGRAM SMOOTHING OF NOISY SIGNALS USING TMS320C6713 DSK
Periodogram Smoothing is a technique of power spectrum estimation. The discrete Fourier transform of a digital signal simply resolves the frequency components. The algorithm is implemented on Texas Instruments’ TMS320C6713 DSP Starter Kit (DSK). This is a 32-bit floating-point digital signal processor running at 225 MHz. The programs are basically written in the C programming language. However, those sections of code which are time-critical and memory-critical are written in assembly language of C6713. A MATLAB™ graphical user interface is also provided. The MATLAB™ program calls C programs loaded in Code Composer Studio (CCS). The C programs in turn call the assembly programs when required.
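As a rough illustration of the technique named in this abstract, the sketch below computes a raw periodogram with a direct DFT and then smooths it by averaging neighbouring frequency bins. The abstract does not say which smoothing window the authors used, so the rectangular moving average (and the names `periodogram` and `smooth_periodogram`) are assumptions made for this example; it is written in Python rather than the C and assembly of the original implementation.

```python
import cmath
import math


def periodogram(x):
    """Raw periodogram |X[k]|^2 / N, using a direct O(N^2) DFT."""
    n = len(x)
    spectrum = []
    for k in range(n):
        acc = 0j
        for i, xi in enumerate(x):
            acc += xi * cmath.exp(-2j * math.pi * k * i / n)
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


def smooth_periodogram(p, half_width):
    """Average each bin with its neighbours (circular, rectangular window)."""
    n = len(p)
    width = 2 * half_width + 1
    return [
        sum(p[(k + j) % n] for j in range(-half_width, half_width + 1)) / width
        for k in range(n)
    ]
```

Averaging over adjacent bins trades frequency resolution for a lower-variance estimate; with this circular window the total power across all bins is unchanged.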
Hidden Markov Model based recognition of musical pattern in South Indian Classical Music
Automatic recognition of musical patterns plays a crucial part in musicological and ethnomusicological research and can become an indispensable tool for the search and comparison of music extracts within a large multimedia database. This paper presents an efficient method for recognizing isolated musical patterns in a monophonic environment, using a Hidden Markov Model (HMM). Each pattern to be recognized is converted into a sequence of frequency jumps by means of a fundamental frequency tracking algorithm, followed by a quantizer. The resulting sequence of frequency jumps is presented to the input of the recognizer, which uses the HMM. The main characteristic of the HMM is that it utilizes the stochastic information from the musical frame to recognize the pattern. The methodology is tested in the context of South Indian Classical Music, which exhibits certain characteristics that make the classification task harder than in the Western musical tradition. A recognition rate of 100% has been obtained for the six typical music patterns used in practice. The flute, a South Indian classical instrument, is used for the whole experiment.
Design and implementation of odd-order wave digital lattice lowpass filters, from specifications to Motorola DSP56307EVM module
This thesis is dedicated to applying and developing explicit formulas for the design and implementation of odd-order lattice lowpass wave digital filters (WDFs) on a Digital Signal Processor (DSP), such as a Motorola DSP56307EVM (Evaluation Module). The direct design method of Gazsi for filter types such as Butterworth, Chebyshev, inverse Chebyshev, and Cauer (Elliptic) provides a straightforward method for calculating the coefficients without an extensive knowledge of digital signal processing. A program package to design and implement odd-order WDFs, including detailed procedures and examples, is presented in this thesis and includes not only the calculations of the coefficients, but also the simulation on a MATLAB platform and an implementation on a Motorola DSP56307EVM board. With the package, the user enters a few parameters according to the general specifications and can then quickly and conveniently obtain the coefficients, verify the characteristics of the designed filter, simulate the filter on the MATLAB platform, implement the filter on the DSP board, and compare the results of the simulation and the implementation.
Implementing IS-95, the CDMA Standard, on TMS320C6201 DSP
IS-95 is the present U.S. 2nd generation CDMA standard. Currently, the 2nd generation CDMA phones are produced by Qualcomm. Texas Instruments (TI) has an ASIC design for the Viterbi Decoder on the C54x. Several of the components in the forward link process are also implemented in hardware. However, having to design specific hardware for a particular application is expensive and time consuming. Thus, the possibility of alternative implementations is of great interest to both customers and TI itself. This research has achieved a successful implementation of IS-95 entirely in software on the TI fixed-point DSP TMS320C6201, and met the real-time constraint. The IS-95 system, the industrial standard for CDMA, is a very complicated system and extremely computationally demanding. The transmission rate for an IS-95 system is 1.2288 Mcps. This research project includes all the major components of the demodulation process for the forward link system: PN Descrambling, Walsh Despreading, Phase Correction & Maximal Ratio Combining, Deinterleaver, Digital Automatic Gain Control, and Viterbi Decoder. The entire demodulation process is done completely in C, which makes it a very attractive alternative implementation for future applications. It is well known that ASIC design is not only expensive but also time consuming; programming in assembly is easier and cheaper, but programming in C is a much easier and more efficient way out, in particular for general computer engineers. During the whole process, efforts have been devoted to developing various specific techniques to optimize the design for all the components involved. These developments are successfully achieved by making the best use of the following techniques: simplifying the algorithms before programming, looking for regularity in the problem, working toward the compiler's full efficiency, and using C intrinsics whenever possible. All these attributes together make the implementation scheme well suited to DSP applications.
The benchmark results compare very well to the TI-internal hand scheduled assembly performance of the same type of decoders. The estimated percentage usage of all the components (excluding PN) is only 21.18% of the total CPU cycles available (4,000 K), which is very efficient and impressive.
Towards a Real-Time Implementation of Loudness Enhancement Algorithms on a Motorola DSP 56600
Most of the cellular phone companies with audio speaker capabilities focus on reducing the current drain to extend battery life. None of these companies concentrate on modifying the speech signal itself to make it sound louder in noisy listener environments without adding additional energy. Such algorithms have been described in the literature by Boillot and form the backbone of this thesis. The current project focusses on taking a step towards running these algorithms in real-time on a 16-bit fixed-point Motorola DSP 56600. Implementation of the autocorrelation, Levinson-Durbin, FIR, and IIR filters in assembly for the Motorola DSP 56600 has been investigated in the thesis. The challenges and alternate solutions to circumvent the challenges have been described, and experimental results have been presented. Results indicate that the modified signed LMS algorithm, which can be considered to be a blend between the LMS and signed LMS algorithms, turns out to be an elegant solution to circumvent the challenges in implementing the Levinson-Durbin recursion.
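For readers unfamiliar with the Levinson-Durbin recursion named in this abstract, a minimal floating-point sketch of the textbook form is given below. It is not the thesis's fixed-point assembly implementation; the function name and the sign convention (prediction-error filter with `a[0] = 1`) are choices made for this example.

```python
def levinson_durbin(r, order):
    """Textbook Levinson-Durbin recursion.

    r     : autocorrelation values r[0], r[1], ..., r[order]
    order : prediction order
    Returns (a, err): prediction-error filter coefficients with a[0] = 1
    and the final prediction error power.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for m in range(1, order + 1):
        # Correlation between the current predictor and lag m.
        acc = r[m]
        for k in range(1, m):
            acc += a[k] * r[m - k]
        reflection = -acc / err
        # Order update of the coefficients.
        updated = a[:]
        for k in range(1, m):
            updated[k] = a[k] + reflection * a[m - k]
        updated[m] = reflection
        a = updated
        err *= 1.0 - reflection * reflection
    return a, err
```

The division by `err` and the growth in dynamic range of the intermediate values are typical of the steps that are awkward on a 16-bit fixed-point processor.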
Bilinear Transformation Made Easy
A formula is derived and demonstrated that is capable of directly generating digital filter coefficients from an analog filter prototype using the bilinear transformation. This formula obviates the need for any algebraic manipulation of the analog prototype filter and is ideal for use in embedded systems that must take in any general analog filter specification and dynamically generate digital filter coefficients directly usable in difference equations.
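The abstract does not reproduce the formula itself, so the following is only a sketch of one generic way to obtain digital coefficients from analog ones: substitute s = 2·fs·(1 − z⁻¹)/(1 + z⁻¹) and expand the polynomials numerically. The function names and the ascending-power coefficient ordering are assumptions made for this example, and no frequency prewarping is applied.

```python
def _poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


def _poly_pow(p, n):
    """Raise a polynomial to a non-negative integer power."""
    out = [1.0]
    for _ in range(n):
        out = _poly_mul(out, p)
    return out


def bilinear(b_analog, a_analog, fs):
    """Map H(s) to H(z) with s = 2*fs*(1 - z^-1)/(1 + z^-1).

    b_analog, a_analog : analog coefficients in ascending powers of s
    Returns (b, a) in ascending powers of z^-1, normalised so a[0] = 1.
    """
    order = max(len(b_analog), len(a_analog)) - 1
    k = 2.0 * fs

    def transform(coeffs):
        acc = [0.0] * (order + 1)
        for i, c in enumerate(coeffs):
            # c * s^i becomes c * k^i * (1 - z^-1)^i * (1 + z^-1)^(order - i)
            term = _poly_mul(_poly_pow([1.0, -1.0], i),
                             _poly_pow([1.0, 1.0], order - i))
            for j, t in enumerate(term):
                acc[j] += c * (k ** i) * t
        return acc

    b = transform(b_analog)
    a = transform(a_analog)
    return [x / a[0] for x in b], [x / a[0] for x in a]
```

The returned lists can be used directly in a difference equation of the form y[n] = Σ b[j]·x[n−j] − Σ a[j]·y[n−j] (the second sum starting at j = 1).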
Automatic Parallel Memory Address Generation for Parallel DSP Computing
The concept of parallel vector (scratch pad) memories (PVM) was introduced as one solution for parallel computing in DSP; it can provide efficient parallel memory addressing with minimum latency. Parallel programming can be made more efficient by using the parallel address generator for PVM proposed in this thesis. However, without a cache to hide the complexities, the cost of programming is high. To minimize the programming cost, automatic parallel memory address generation is needed to hide the complexities of memory access. This thesis investigates methods for implementing conflict-free vector addressing algorithms on a parallel hardware structure. In particular, vector addressing requirements extracted from the behaviour model are matched to a prepared parallel memory addressing template, in order to supply data in parallel from the main memory to the on-chip vector memory. According to the template and the usage of the main and on-chip parallel vector memory, models for data pre-allocation and permutation in the scratch pad memories of an ASIP can be decided and configured. By exposing the parallel memory accesses of the source code, a memory access flow graph (MFG) is generated. The MFG is then combined with hardware information to match templates in the template library. When a template matches, a suitable permutation equation is obtained, and the permutation table that includes the target addresses for data pre-allocation and permutation is created. Thus it is possible to automatically generate memory addresses for parallel memory accesses. A tool that achieves this goal, Permutator, has been created; it is implemented in C++ combined with XML. A memory access coding template is selected, which in turn specifies the permutation formulas. The PVM address table can then be generated to perform the data pre-allocation, so that efficient parallel memory access is possible.
The results show that the memory access complexities are hidden by using Permutator, so that the programming cost is reduced. It works well when each algorithm, with its related hardware information, corresponds to a template case, so that extra memory cost is eliminated.
EngD thesis: Reduced-Complexity Signal Detection in Digital Communications Receivers
The Author began this Engineering Doctorate (EngD) in Autumn 1999, whilst already in full-time employment as a DSP software engineer at Nortel Networks, Harlow. This EngD comprises a set of three projects. The first project was focused on the development of dual-tone multi-frequency (DTMF) signal detection software. DTMF signals are currently used for operating menu-driven services such as voice-mail, telephone banking and share-dealing. The need for detection software in a packet networking environment exists because such signals become degraded when they travel through speech compression circuits. In addition, if they appear as echoes on a telephone line, they can affect the operation of echo cancellation systems. In this thesis a number of DSP algorithms are discussed where fast detection and minimum complexity are key characteristics. A key contribution here was the development of a novel detection algorithm based on the extraction of parameters that model the DTMF signal. The thesis reports a method combining parameter extraction with the technique of maximum likelihood to perform DTMF detection, resulting in very short time-frames when compared to standard approaches. Reducing the complexity of detection techniques is also important in today’s communication systems. To this end a key contribution here was the development of the ‘split Goertzel algorithm’, which permitted an overlapping of analysis windows without the need for reprocessing input data. Besides being applied to voice-band signals, such as in the case of DTMF, the Author also had the opportunity during the EngD to apply the skills and knowledge acquired in signal processing to higher-rate data-streams. This involved work concerning the equalisation of channel distortion commonly found in wireless communication systems. This covers two projects, with the first being conducted at Verticalband Ltd. (now no longer operational) in the area of digital television receivers. 
In this part of the thesis a real-time DSP implementation is discussed for enhancing a simulation system developed for wireless channels. A number of channel equalisation approaches are studied. The work concludes that for high-rate signals, non-linear algorithms have the best error rate performance. On the basis of the studies carried out, the thesis considers development and implementation issues of designs based on the decision feedback equaliser. The thesis reports on a software design which applies the method of least squares to carry out filter coefficient adaptation. The Verticalband studies reported lead on to further research within the context of channel equalisation, specifically the detection of data in multiple-input multiple-output (MIMO) wireless local area network (WLAN) systems. This work was undertaken at Philips Research in Eindhoven, The Netherlands. The thesis discusses implementation scenarios of multi-element antenna arrays that aim to provide bit-rates orders of magnitude higher than today’s WLAN offerings, as required by emerging standards such as 802.11n. The complexity of optimal detection techniques, such as maximum likelihood, scales exponentially with the number of transmit antennas. This translates to high processing costs and power consumption, rendering such techniques unsuitable for use in battery-powered devices. The initial main contribution here was the analysis of the complexity of algorithms whose performance had already been tested, such as the non-linear maximum likelihood approach and also less complex methods utilising linear filtering. This resulted in the development of new formulae to predict processing costs of algorithms based on the number of transmit and receive antennas. Other key contributions were defining a method to reduce the complexity of matrix inversion when using the Moore-Penrose pseudo-inverse, and the application of matrix decomposition to obviate the need for costly matrix inversion altogether.
Some on-going research into sub-optimal detection is also discussed, which describes methods to reduce the search-space for the maximum likelihood algorithm.
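The first project in this abstract builds on the Goertzel algorithm for DTMF detection. The sketch below shows only the standard Goertzel recursion and a naive "strongest tone in each group" decision; it is not the thesis's 'split Goertzel algorithm' or its parameter-extraction method, whose details the abstract does not give. The function names are chosen for this example, and the frequency lists are the standard DTMF row and column tones.

```python
import math

DTMF_LOW = (697.0, 770.0, 852.0, 941.0)        # row frequencies in Hz
DTMF_HIGH = (1209.0, 1336.0, 1477.0, 1633.0)   # column frequencies in Hz


def goertzel_power(samples, fs, freq):
    """Squared magnitude of the DTFT of `samples` at `freq` (standard Goertzel)."""
    w = 2.0 * math.pi * freq / fs
    coeff = 2.0 * math.cos(w)
    s1 = 0.0  # state s[n-1]
    s2 = 0.0  # state s[n-2]
    for x in samples:
        s0 = x + coeff * s1 - s2
        s2 = s1
        s1 = s0
    return s1 * s1 + s2 * s2 - coeff * s1 * s2


def strongest_dtmf_pair(samples, fs):
    """Return the (low, high) DTMF frequencies with the most energy in the block."""
    low = max(DTMF_LOW, key=lambda f: goertzel_power(samples, fs, f))
    high = max(DTMF_HIGH, key=lambda f: goertzel_power(samples, fs, f))
    return low, high
```

Each tone costs one multiply and two additions per sample, which is why Goertzel-type detectors are attractive when only a handful of frequencies matter. A practical detector would add energy thresholds and twist checks, which are omitted here.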
A Subspace Based Approach to the Design, Implementation and Validation of Algorithms for Active Vibration Isolation Control
Vibration isolation endeavors to reduce the transmission of vibration energy from one structure (the source) to another (the receiver), to prevent undesirable phenomena such as sound radiation. A well-known method for achieving this is passive vibration isolation (PVI). In the case of PVI, mounts are used - consisting of springs and dampers - to connect the vibrating source to the receiver. The stiffness of the mount determines the fundamental resonance frequency of the mounted system and vibrations with a frequency higher than the fundamental resonance frequency are attenuated. Unfortunately, however, other design requirements (such as static stability) often impose a minimum allowable stiffness, thus limiting the achievable vibration isolation by passive means. A more promising method for vibration isolation is hybrid vibration isolation control. This entails that, in addition to PVI, an active vibration isolation control (AVIC) system is used with sensors, actuators and a control system that compensates for vibrations in the lower frequency range. Here, the use of a special form of AVIC using statically determinate stiff mounts is proposed. The mounts establish a statically determinate system of high stiffness connections in the actuated directions and of low stiffness connections in the unactuated directions. The latter ensures PVI in the unactuated directions. This approach is called statically determinate AVIC (SD-AVIC). The aim of the control system is to produce antidisturbance forces that counteract the disturbance forces stemming from the source. Using this approach, the vibration energy transfer from the source to the receiver is blocked in the mount due to the anti-forces. This thesis deals with the design of controllers generating the anti-forces by applying techniques that are commonly used in the field of signal processing. The control approaches - that are model-based - are both adaptive and fixed gain and feedforward and feedback oriented. 
The control approaches are validated using two experimental vibration isolation setups: a single reference single actuator single error sensor (SR-SISO) setup and a single reference input multiple actuator input multiple error sensor output (SR-MIMO) setup. Finding a plant model can be a problem. This is solved by using a black-box modelling strategy. The plants are identified using subspace model identification. It is shown that accurate linear models can be found in a straightforward manner by using small batches of recorded (sampled) time-domain data only. Based on the identified models, controllers are designed, implemented and validated. Due to resonance in mechanical structures, adaptive SD-AVIC systems are often hampered by slow convergence of the controller coefficients. In general, it is desirable that the SD-AVIC system yields fast optimum performance after it is switched on. To achieve this result and speed up the convergence of the adaptive controller coefficients, the so-called inverse outer factor model is included in the adaptive control scheme. The inner/outer factorization, that has to be performed to obtain the inverse outer factor model, is completely determined in state space to enable a numerically robust computation. The inverse outer factor model is also incorporated in the control scheme as a state space model. It is found that fast adaptation of the controller coefficients is possible. Controllers are designed, implemented and validated to suppress both narrowband and broadband disturbances. Scalar regularization is used to prevent actuator saturation and an unstable closed loop. In order to reduce the computational load of the controllers, several steps are taken including controller order reduction and implementation of lower order models. It is found that in all experiments the simulation and real-time results correspond closely for both the fixed gain and adaptive control situation. 
On the SR-SISO setup, reductions up to 5.0 dB are established in real-time for suppressing a broadband disturbance output (0-2 kHz) using feedback-control. On the SR-MIMO vibration isolation setup, using feedforward-control reductions of broadband disturbances (0-1 kHz) of 9.4 dB are established in real-time. Using feedback-control, reductions are established up to 3.5 dB in real-time (0-1 kHz). In case of the SR-MIMO setup, the values for the reduction are obtained by averaging the reductions obtained in all sensor outputs. The results pave the way for the next generation of algorithms for SD-AVIC.
Interaction with Sound and Pre-Recorded Music: Novel Interfaces and Use Patterns
Computers are changing the way sound and recorded music are listened to and used. The use of computers to playback music makes it possible to change and adapt music to different usage situations in ways that were not possible with analog sound equipment. In this thesis, interaction with pre-recorded music is investigated using prototypes and user studies. First, different interfaces for browsing music on consumer or mobile devices were compared. It was found that the choice of input controller, mapping and auditory feedback influences how the music was searched and how the interfaces were perceived. Search performance was not affected by the tested interfaces. Based on this study, several ideas for the future design of music browsing interfaces were proposed. Indications that search time depends linearly on distance to target were observed and examined in a related study where a movement time model for searching in a text document using scrolling was developed. Second, work practices of professional disc jockeys (DJs) were studied and a new design for digital DJing was proposed and tested. Strong indications were found that the use of beat information could reduce the DJ’s cognitive workload while maintaining flexibility during the musical performance. A system for automatic beat extraction was designed based on an evaluation of a number of perceptually important parameters extracted from audio signals. Finally, auditory feedback in pen-gesture interfaces was investigated through a series of informal and formal experiments. The experiments point to several general rules of auditory feedback in pen-gesture interfaces: a few simple functions are easy to achieve, gaining further performance and learning advantage is difficult, the gesture set and its computerized recognizer can be designed to minimize visual dependence, and positive emotional or aesthetic response can be achieved using musical auditory feedback.
Blind Adaptive Dereverberation of Speech Signals Using a Microphone Array
In this thesis, we present a blind adaptive speech dereverberation method based on the use of a reduced mutually referenced equalizers (RMRE) criterion. The method is based on the idea of the inversion of single-input multiple-output FIR linear systems, and as such requires the use of multiple microphones. However, unlike many traditional microphone array methods, there is no need for a specific array configuration or geometry. The RMRE method finds a subset of equalizers for a given delay in a single step, without the need for the typical channel estimation step. This makes the method practical in terms of implementation and avoids the pitfalls of the more complicated two-step dereverberation approach, typical in many inversion methods. Additionally, only the second-order statistics of the signals recorded by the microphones are used, without the need for utilizing higher-order statistics information typically needed when the channels have a nonminimum phase response, as is the case with room impulse responses. We present simulations and experimental results that demonstrate the applicability of the method when the input is speech, and show that in the noiseless case, perfect dereverberation can be achieved. We also evaluate its performance in the presence of noise, and we present a possible way to modify the proposed RMRE to work for very low SNR values. We also explore the problems when model-order mismatches are present, and demonstrate that the under-modeling of the channel impulse response order can be combated by increasing the number of microphones. For order over-estimation, we show that RMRE can handle such errors with no modification.
Image Analysis Using a Dual-Tree M-Band Wavelet Transform
We propose a 2D generalization to the M-band case of the dual-tree decomposition structure (initially proposed by N. Kingsbury and further investigated by I. Selesnick) based on a Hilbert pair of wavelets. We particularly address (i) the construction of the dual basis and (ii) the resulting directional analysis. We also revisit the necessary pre-processing stage in the M-band case. While several reconstructions are possible because of the redundancy of the representation, we propose a new optimal signal reconstruction technique, which minimizes potential estimation errors. The effectiveness of the proposed M-band decomposition is demonstrated via denoising comparisons on several image types (natural, texture, seismics), with various M-band wavelets and thresholding strategies. Significant improvements in terms of both overall noise reduction and direction preservation are observed.
Wavelet Filter Banks in Perceptual Audio Coding
This thesis studies the application of the wavelet filter bank (WFB) in perceptual audio coding by providing brief overviews of perceptual coding, psychoacoustics, wavelet theory, and existing wavelet coding algorithms. Furthermore, it describes the poor frequency localization property of the WFB and explores one filter design method, in particular, for improving channel separation between the wavelet bands. A wavelet audio coder has also been developed by the author to test the new filters. Preliminary tests indicate that the new filters provide some improvement over other wavelet filters when coding audio signals that are stationary-like and contain only a few harmonic components, and similar results for other types of audio signals that contain many spectral and temporal components. It has been found that the WFB provides a flexible decomposition scheme through the choice of the tree structure and basis filter, but at the cost of poor localization properties. This flexibility can be a benefit in the context of audio coding but the poor localization properties represent a drawback. Determining ways to fully utilize this flexibility, while minimizing the effects of poor time-frequency localization, is an area that is still very much open for research.
Least Squares and Adaptive Multirate Filtering
This thesis addresses the problem of estimating a random process from two observed signals sampled at different rates. The case where the low-rate observation has a higher signal-to-noise ratio than the high-rate observation is addressed. Both adaptive and non-adaptive filtering techniques are explored. For the non-adaptive case, a multirate version of the Wiener-Hopf optimal filter is used for estimation. Three forms of the filter are described. It is shown that using both observations with this filter achieves a lower mean-squared error than using either sequence alone. Furthermore, the amount of training data to solve for the filter weights is comparable to that needed when using either sequence alone. For the adaptive case, a multirate version of the LMS adaptive algorithm is developed. Both narrowband and broadband interference are removed using the algorithm in an adaptive noise cancellation scheme. The ability to remove interference at the high rate using observations taken at the low rate without the high-rate observations is demonstrated.
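To make the adaptive noise cancellation scheme mentioned in this abstract concrete, here is a sketch of the ordinary single-rate LMS canceller. It is not the multirate LMS algorithm developed in the thesis, whose update equations are not given in the abstract; the function names, the tap ordering and the step size used in any call are assumptions made for this example.

```python
import math


def sinusoid(freq, n, amp=1.0, phase=0.0):
    """Helper: n samples of amp * sin(2*pi*freq*i + phase), freq in cycles/sample."""
    return [amp * math.sin(2.0 * math.pi * freq * i + phase) for i in range(n)]


def lms_noise_canceller(primary, reference, num_taps, mu):
    """Single-rate LMS adaptive noise canceller.

    primary   : desired signal plus interference
    reference : signal correlated with the interference only
    Returns (errors, weights); the error sequence is the cleaned output.
    """
    weights = [0.0] * num_taps
    taps = [0.0] * num_taps  # most recent reference samples, newest first
    errors = []
    for d, u in zip(primary, reference):
        taps = [u] + taps[:-1]
        y = sum(w * t for w, t in zip(weights, taps))   # interference estimate
        e = d - y                                       # canceller output
        weights = [w + mu * e * t for w, t in zip(weights, taps)]
        errors.append(e)
    return errors, weights
```

With a sinusoidal reference, two taps are enough to match any amplitude and phase of the interference, so the residual decays towards zero when the primary input contains nothing else.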