DSPRelated.com
Forums

Can we correct 2t errors in a BCH code?

Started by Broken Arrow July 18, 2008
Eric Jacobsen wrote:

> On Sat, 19 Jul 2008 23:25:00 -0700 (PDT), Broken Arrow > <moon.zia@gmail.com> wrote:
>>Thanks Vladimir and Eric for your responses. I guess I need to >>explain the problem in more detail. The work is a practical >>implementation of Juels and Wattenberg's seminal paper named "Fuzzy >>commitment" using iris scan and some error control code. I explain the >>scheme below.
(snip)
>>I think now it should be very clear what the problem is. From your >>posts, it seems that LDPC and Turbo codes can do the job. Can they? >>Sorry for my lack of knowledge about these codes as my research area >>is biometrics and not the coding theory. Thanks.
> Yes, the problem is clear now. Thanks for the clarification. I am > much less confused. ;)
> First, Turbo Codes and LDPCs work best when soft information is > available, i.e., a confidence level on each bit as to whether it is a > one or a zero. Your application may lend itself to this, since the > measurement of the biometric may not be binary for each bit that was > originally quantized. In other words, if the biomeasure is made on > a scale rather than a hard binary decision for each bit, using that > "soft" scaling will significantly improve the performance of many > decoders and enable the efficient use of Turbo Codes or LDPCs.
> Second, would it not be possible to make just the information part of > C the same length as W?
(snip)
> That's a quite bad error rate to try to overcome. Usually "very bad" > error rates are in the 10e-2 region, and 10e-1 would be "extremely > awful". I think correcting an error rate of one in three with a > codeword of only 2047 bits is going to be a tough nut to crack.
It seems to me, though, that the actual corrected value isn't needed. All that is needed is to know that that sufficient information is available to do it. It would seem a much easier problem, at least in terms of the computation needed. -- glen

Broken Arrow wrote:


> 1. Size of iris template is 2047 bits. Hence the codeword size should > be 2047 bits. > 2. The threshold for differentiating between same class and different > class is 32%. So the code should be able to correct 0.32*2047=655 > errors. The errors can be in message part or the parity part (as Eric > has asked). > 3. The message size is not fixed but as it is used as the secret key > (hence the term biometric key generation), it should be as long as > possible to make the brute force attack infeasible. 250+ bits is > great. > > I think now it should be very clear what the problem is. From your > posts, it seems that LDPC and Turbo codes can do the job. Can they?
LDPC/Turbo would be a poor choice for several reasons: 1) "Soft" information is unavailable 2) "Standard" codes won't fit. It should be the specially designed LDPC/Turbo code with the low rate. 3) The block size is too small for the random codes to be efficient. 4) The second layer of coding would be required to combat the error floor. I would take a tailbiting convolutional code with the rate of 1/7, and concatenate it with a Reed-Solomon. Also, a CRC or something like that would be required to ensure the data integrity.
> Sorry for my lack of knowledge about these codes as my research area > is biometrics and not the coding theory. Thanks.
Hire the consultant. Vladimir Vassilevsky DSP and Mixed Signal Design Consultant http://www.abvolt.com
On Sun, 20 Jul 2008 20:43:55 -0700 (PDT), Broken Arrow
<moon.zia@gmail.com> wrote:

>> First, Turbo Codes and LDPCs work best when soft information is >> available, i.e., a confidence level on each bit as to whether it is a >> one or a zero. Your application may lend itself to this, since the >> measurement of the biometric may not be binary for each bit that was >> originally quantized. In other words, if the biomeasure is made on >> a scale rather than a hard binary decision for each bit, using that >> "soft" scaling will significantly improve the performance of many >> decoders and enable the efficient use of Turbo Codes or LDPCs. > >Unfortunately, no soft information is available. The iris template is >binary having no systematic correlation.
Bummer. That'll hurt for many codes as even convolutional codes do better with soft input information.
>> Second, would it not be possible to make just the information part of >> C the same length as W? Then XORing W and K (assuming C is a >> systematic code) will produce P, then P plus the parity bits from C >> are stored together (or would that give away too much information >> about the key?) > >Thats true. Storing P in clear will leak too much information.
Okay, that helps narrow it down.
>> That's a quite bad error rate to try to overcome. Usually "very bad" >> error rates are in the 10e-2 region, and 10e-1 would be "extremely >> awful". I think correcting an error rate of one in three with a >> codeword of only 2047 bits is going to be a tough nut to crack. > >We may be able to increase the template size to 2 times (4094 bits) >but 32% error correction will still be needed. Will it help? > >Moon
Larger codeword sizes help since it increases "bit diversity" and the size of the code space. How much it helps in this case is hard to quantify. Also, I liked Glen's suggestion that since it isn't necessary to correct *all* the errors that some sneaky tricks might be applied. Leveraging the correlations that you'd mentioned earlier is probably useful as well. Personally I'm not aware of an existing system that would be obvious to apply to the problem, but what you've described helps narrow things down a bit. Eric Jacobsen Minister of Algorithms Abineau Communications http://www.ericjacobsen.org Blog: http://www.dsprelated.com/blogs-1/hf/Eric_Jacobsen.php
On Sun, 20 Jul 2008 20:43:55 -0700 (PDT), Broken Arrow
<moon.zia@gmail.com> wrote:

>> First, Turbo Codes and LDPCs work best when soft information is >> available, i.e., a confidence level on each bit as to whether it is a >> one or a zero. Your application may lend itself to this, since the >> measurement of the biometric may not be binary for each bit that was >> originally quantized. In other words, if the biomeasure is made on >> a scale rather than a hard binary decision for each bit, using that >> "soft" scaling will significantly improve the performance of many >> decoders and enable the efficient use of Turbo Codes or LDPCs. > >Unfortunately, no soft information is available. The iris template is >binary having no systematic correlation.
Bummer. That'll hurt for many codes as even convolutional codes do better with soft input information.
>> Second, would it not be possible to make just the information part of >> C the same length as W? Then XORing W and K (assuming C is a >> systematic code) will produce P, then P plus the parity bits from C >> are stored together (or would that give away too much information >> about the key?) > >Thats true. Storing P in clear will leak too much information.
Okay, that helps narrow it down.
>> That's a quite bad error rate to try to overcome. Usually "very bad" >> error rates are in the 10e-2 region, and 10e-1 would be "extremely >> awful". I think correcting an error rate of one in three with a >> codeword of only 2047 bits is going to be a tough nut to crack. > >We may be able to increase the template size to 2 times (4094 bits) >but 32% error correction will still be needed. Will it help? > >Moon
Larger codeword sizes help since it increases "bit diversity" and the size of the code space. How much it helps in this case is hard to quantify. Also, I liked Glen's suggestion that since it isn't necessary to correct *all* the errors that some sneaky tricks might be applied. Leveraging the correlations that you'd mentioned earlier is probably useful as well. Personally I'm not aware of an existing system that would be obvious to apply to the problem, but what you've described helps narrow things down a bit. Eric Jacobsen Minister of Algorithms Abineau Communications http://www.ericjacobsen.org Blog: http://www.dsprelated.com/blogs-1/hf/Eric_Jacobsen.php
On Sun, 20 Jul 2008 20:43:55 -0700 (PDT), Broken Arrow
<moon.zia@gmail.com> wrote:

>> First, Turbo Codes and LDPCs work best when soft information is >> available, i.e., a confidence level on each bit as to whether it is a >> one or a zero. Your application may lend itself to this, since the >> measurement of the biometric may not be binary for each bit that was >> originally quantized. In other words, if the biomeasure is made on >> a scale rather than a hard binary decision for each bit, using that >> "soft" scaling will significantly improve the performance of many >> decoders and enable the efficient use of Turbo Codes or LDPCs. > >Unfortunately, no soft information is available. The iris template is >binary having no systematic correlation.
Bummer. That'll hurt for many codes as even convolutional codes do better with soft input information.
>> Second, would it not be possible to make just the information part of >> C the same length as W? Then XORing W and K (assuming C is a >> systematic code) will produce P, then P plus the parity bits from C >> are stored together (or would that give away too much information >> about the key?) > >Thats true. Storing P in clear will leak too much information.
Okay, that helps narrow it down.
>> That's a quite bad error rate to try to overcome. Usually "very bad" >> error rates are in the 10e-2 region, and 10e-1 would be "extremely >> awful". I think correcting an error rate of one in three with a >> codeword of only 2047 bits is going to be a tough nut to crack. > >We may be able to increase the template size to 2 times (4094 bits) >but 32% error correction will still be needed. Will it help? > >Moon
Larger codeword sizes help since it increases "bit diversity" and the size of the code space. How much it helps in this case is hard to quantify. Also, I liked Glen's suggestion that since it isn't necessary to correct *all* the errors that some sneaky tricks might be applied. Leveraging the correlations that you'd mentioned earlier is probably useful as well. Personally I'm not aware of an existing system that would be obvious to apply to the problem, but what you've described helps narrow things down a bit. Eric Jacobsen Minister of Algorithms Abineau Communications http://www.ericjacobsen.org Blog: http://www.dsprelated.com/blogs-1/hf/Eric_Jacobsen.php
> I would take a tailbiting convolutional code with the rate of 1/7, and > concatenate it with a Reed-Solomon.
Similar technique was used by Hao, Anderson and Daugman in their paper. In fact they used RS with Hadamard. We want to use a single code to have a finer control on the error rate, if possible.

Broken Arrow wrote:

>>I would take a tailbiting convolutional code with the rate of 1/7, and >>concatenate it with a Reed-Solomon. > > > Similar technique was used by Hao, Anderson and Daugman in their > paper. In fact they used RS with Hadamard. We want to use a single > code to have a finer control on the error rate, if possible.
Why do you expect better performance from the single code? How much of improvement do you think is possible? Did you compare their solution to the theoretical bounds? Vladimir Vassilevsky DSP and Mixed Signal Design Consultant http://www.abvolt.com
On Jul 22, 8:42 am, Vladimir Vassilevsky <antispam_bo...@hotmail.com>
wrote:

> > Why do you expect better performance from the single code? > How much of improvement do you think is possible? > Did you compare their solution to the theoretical bounds? >
Even better, one should go back to basics and challenge the OP to defend his assumptions. Each user is assigned a random key K that is encoded into a codeword W. How many users does the OP need to accommodate? Don't 32 bits suffice to assign a distinct key K to almost every human being on this planet? If so, can the OP manage with just a few bits, say a dozen or so, for K? Since no soft information is available, correcting 32% of the bits is going to need a Hamming distance of more than 64%. This is hard to do; codes with Hamming distance of 50% are already very low-rate (the Hadamard code mentioned which is the same as a first-order Reed-Muller code, or orthogonal code has distance exactly 50%), and codes with higher distance have even lower rates and are harder to find. If I may use an analogy, one cannot go more than half-way into a forest; after that one is coming out of the forest. But something can still be done. Here is an example of four codewords of length 3 that differ in 66.66% of the places: 000 110 011 101 Can this be generalized to longer lengths with more than four codewords? --Dilip Sarwate

sarwate@YouEyeYouSee.edu wrote:
> On Jul 22, 8:42 am, Vladimir Vassilevsky <antispam_bo...@hotmail.com> > wrote: > >>Why do you expect better performance from the single code? >>How much of improvement do you think is possible? >>Did you compare their solution to the theoretical bounds? >> > > Even better, one should go back to basics and challenge the > OP to defend his assumptions. Each user is assigned a > random key K that is encoded into a codeword W. How many > users does the OP need to accommodate? Don't 32 bits > suffice to assign a distinct key K to almost every human being > on this planet?
It requires more then twice as many bits because of the birthday paradox.
> If so, can the OP manage with just a few bits, > say a dozen or so, for K?
:-)
> Since no soft information is available, correcting 32% of the > bits is going to need a Hamming distance of more than 64%. > This is hard to do; codes with Hamming distance of 50% are > already very low-rate (the Hadamard code mentioned which is > the same as a first-order Reed-Muller code, or orthogonal code > has distance exactly 50%), and codes with higher distance > have even lower rates and are harder to find. If I may use an > analogy, one cannot go more than half-way into a forest; after > that one is coming out of the forest. But something can > still be done. Here is an example of four codewords of length > 3 that differ in 66.66% of the places: > > 000 > 110 > 011 > 101 > > Can this be generalized to longer lengths with more than four > codewords?
A high distance code can be made from the original linear code by the exclusion of the low weight codewords. However this is non-systematic and inconvenient way; perhaps some tricks could be applied to it. Vladimir Vassilevsky DSP and Mixed Signal Design Consultant http://www.abvolt.com