
Prabindh Sundareson
Prabindh Sundareson is a Senior Design Engineer with the Portable Audio & Video Group, at Texas Instruments, Bangalore. He is the current secretary of the IEEE Signal Processing Society, Bangalore Chapter. He holds 3 patents, and has several pending at the USPTO. He holds a Masters degree in Electronics from Indian Institute of Science. His interests lie in Audio compression algorithms, Signal transforms, Content classification, and security in embedded systems. In his free time, he tends to read Patent Law.


  

Components in Audio recognition - Part 1

Posted by Prabindh Sundareson on Nov 20 2007 under Audio DSP | Academia / Research   

Audio recognition is the task of identifying a particular piece of audio (music, a ring-tone, or speech) from a given set of audio tracks.

The Human Auditory System (HAS) is unique in that the tasks of "familiarisation" with unknown tracks and of finding "similar" tracks come naturally to us. Tunes from the not-so-recent past can still haunt the human brain many years later, when triggered by a similar tune. The way the brain stores and responds to music is reported to differ from the way it processes speech and other behaviour. The field of audio recognition tries to emulate this behaviour by using concepts from biological modeling, signal processing theory and pattern recognition theory.

Audio recognition systems are used mainly to retrieve similar tracks from a database. This could be for various reasons, including copyright management and personal playlist management. A very different kind of system relies on "social rating", which depends on peer ratings of media files to decide where they belong. It is not covered in this series, but will be compared where relevant.

A typical audio recognition system consists of the following components.

  • A system that "stores" the archive of tracks that need to be managed. This could be a simple SQL database indexing files stored on a 100 TB server.
  • A system that "analyses" the archive, fingerprints the characteristics of each track, and forms various "groups" or "sets" of tracks based on their overlapping characteristics. This will typically include components from the modeling, signal processing, and pattern recognition fields.
  • A system that can "receive" an audio track that needs to be "placed" into one of the many given groups or sets. This is typically a front end, that is, a user interface of some kind, followed by more signal processing blocks.
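As a rough illustration of how the three components fit together, here is a minimal Python sketch. Everything in it is an assumption made for the example, not a description of any real system: the archive is an in-memory SQLite table, the "tracks" are synthetic sine tones, and the fingerprint is a toy pair of features (zero-crossing rate and RMS energy) that is far too crude for real music. The helper names (tone, fingerprint, build_archive, closest_track) are hypothetical.

```python
import math
import sqlite3


def tone(freq_hz, amplitude=1.0, seconds=0.5, rate=8000):
    """Synthetic sine tone, standing in for decoded audio samples."""
    n = int(seconds * rate)
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / rate) for i in range(n)]


def fingerprint(samples):
    """Toy fingerprint (an assumption): zero-crossing rate and RMS energy."""
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0)
    )
    zcr = crossings / (len(samples) - 1)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return zcr, rms


def build_archive(tracks):
    """'Store' and 'analyse': fingerprint each track and index it in SQL."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE tracks (name TEXT PRIMARY KEY, zcr REAL, rms REAL)")
    for name, samples in tracks.items():
        zcr, rms = fingerprint(samples)
        db.execute("INSERT INTO tracks VALUES (?, ?, ?)", (name, zcr, rms))
    db.commit()
    return db


def closest_track(db, samples):
    """'Receive': fingerprint an incoming track and find its nearest neighbour."""
    q_zcr, q_rms = fingerprint(samples)
    rows = db.execute("SELECT name, zcr, rms FROM tracks").fetchall()
    return min(rows, key=lambda r: math.hypot(r[1] - q_zcr, r[2] - q_rms))[0]


archive = build_archive({
    "tone_220": tone(220),
    "tone_440": tone(440),
    "tone_880": tone(880),
})
print(closest_track(archive, tone(450, amplitude=0.9)))
```

The query here is a slightly detuned, slightly quieter copy of the 440 Hz tone, so the nearest stored fingerprint is the one for tone_440. A practical system would replace the toy features with something far more robust and would group tracks into sets rather than returning a single nearest neighbour.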

Portable implementations of the above can also be built, with smaller storage and with analysis capabilities and front ends that are more efficient but more limited. These can, for example, be used in portable media players. The Rio Volt had an early implementation of such an interface.

In the next articles in this series, we will see how each of these components is typically implemented. We will also look at some reference implementations and discuss why one approach is better or worse than another. If you have a specific topic to discuss, email me at prabindh a't yahoo a't com.

For those of you looking for a place to start your scholarly searches, start at http://www.music-ir.org/

Looking forward to your feedback,

Prabindh




  


Comments


 

SteveSmith wrote:

11/25/2007
 
Interesting topic! I’m always amazed (and depressed) that my eyes and ears can perform signal processing about a thousand times better than the algorithms I write. Thanks in advance for the articles.
