comp.dsp | Vector quantisation for clustering

I am hoping to identify clusters within a set of vectors using vector
quantisation. I have only just started looking into vector quantisation so
I'd like to ask a couple of questions. 

I'm implementing the LBG algorithm in Scilab, and it has just occurred to
me - what if I want to identify five clusters - or three? The LGB seems to
work only for 2^x number of clusters am I right? Perhaps this is blatantly
obvious, but I failed to notice until now.

Can anyone point me towards another algorithm which will perform
clustering for an arbitrary number of sets? My set of data is likely to be
in two to four dimensions and I'd like to split the data into anywhere from
2 to 6 subsets.

Cheers,
Dougie



		
This message was sent using the Comp.DSP web interface on
www.DSPRelated.com

Reply by Carlos Moreno ●October 25, 20052005-10-25

smuglr wrote:
> I am hoping to identify clusters within a set of vectors using vector
> quantisation. I have only just started looking into vector quantisation so
> I'd like to ask a couple of questions. 
> 
> I'm implementing the LBG algorithm in Scilab, and it has just occurred to
> me - what if I want to identify five clusters - or three? The LGB seems to
> work only for 2^x number of clusters am I right? Perhaps this is blatantly
> obvious, but I failed to notice until now.
> 
> Can anyone point me towards another algorithm which will perform
> clustering for an arbitrary number of sets? My set of data is likely to be
> in two to four dimensions and I'd like to split the data into anywhere from
> 2 to 6 subsets.

Why would you say that LBG only works for 2^N??  Where could
you possibly have read that?

How does:

Assume N initial centroids
Loop:
     For each centroid, determine the set of points that
     are closer to this centroid than to any other centroid

     For each set of points, determine the new centroid
Until ( .... whatever suitable stop condition ....)

Restrict you to work with a power of 2?

That for the purpose of data compression, it *only makes sense*
to use a number that is a power of 2, that's a different thing.

The algorithm works for *any* number that is less than or equal
to the number of training points (although in practice, the
number should be much less than the number of training points).

HTH,

Carlos
--

Reply by smuglr ●October 26, 20052005-10-26

>Why would you say that LBG only works for 2^N??  Where could
>you possibly have read that?
>
>How does:
>
>Assume N initial centroids
>Loop:
>     For each centroid, determine the set of points that
>     are closer to this centroid than to any other centroid
>
>     For each set of points, determine the new centroid
>Until ( .... whatever suitable stop condition ....)
>
>Restrict you to work with a power of 2?
>
>Carlos
>--

Ahhh -*moment of clarity*- the implementation I was following at 
http://www.data-compression.com/vq.shtml uses the splitting method to
iteratively build an initial codebook which is then optimised by LBG, I
thought the whole algorithm was the LBG method. I see what I need to look
into now.

Many thanks for pointing this out,

Dougie

Vector quantisation for clustering

Sign in

You might also like...

Search forums

Free PDF Downloads

Blogs - Hall of Fame

Discussion Groups

Quick Links

About DSPRelated.com

Social Networks

The Related Media Group