comp.dsp | Reed-Solomon FEC necessary to have the same characteristic for symbols and words?

Hi there,

I have implemented an almost "universal" (read: compatible with most other
decoders output) Reed-Solomon codec. I feel like I have a good intuitive
grasp on the whole codec, except for one thing: I just can't understand
why we are tied to the same characteristic at both the symbol level and at
the word level.

Let me explain: if for example you choose to work on GF(2^8), this means
that you will encode each symbol of a message using 256 possible values,
which is perfect for binary data. However, in all the implementations I
have seen, it seems this also ties the maximum size of the codeword being
256 symbols/polynomial terms.

Intuitively, it seems to me that we could use a different characteristic
for both levels, eg, GF(2^8) for the "symbol level" by effecting only on
the primitive polynomial, and GF(2^32) for the "word level" effecting only
on the generator polynomial.

Am I missing some causal relationship here between those two levels? If
not, then is it really possible to use a different GF for each level, and
do you know of any paper/implementation that did this?
---------------------------------------
Posted through http://www.DSPRelated.com

Reply by Steve Pope ●June 30, 20152015-06-30

lrq3000 <106516@DSPRelated> wrote:

>I have implemented an almost "universal" (read: compatible with most other
>decoders output) Reed-Solomon codec. I feel like I have a good intuitive
>grasp on the whole codec, except for one thing: I just can't understand
>why we are tied to the same characteristic at both the symbol level and at
>the word level.

>Let me explain: if for example you choose to work on GF(2^8), this means
>that you will encode each symbol of a message using 256 possible values,
>which is perfect for binary data. However, in all the implementations I
>have seen, it seems this also ties the maximum size of the codeword being
>256 symbols/polynomial terms.

>Intuitively, it seems to me that we could use a different characteristic
>for both levels, eg, GF(2^8) for the "symbol level" by effecting only on
>the primitive polynomial, and GF(2^32) for the "word level" effecting only
>on the generator polynomial.

>Am I missing some causal relationship here between those two levels? 

Nope, you're not missing anything, the category of codes you are
talking about have been called (with minor variations) Algebraic
Geometry codes, Goppa codes, and/or nonbinary BCH codes.

Steve

Reply by tsd82 ●June 30, 20152015-06-30

>Let me explain: if for example you choose to work on GF(2^8), this means
>that you will encode each symbol of a message using 256 possible values,
>which is perfect for binary data. However, in all the implementations I
>have seen, it seems this also ties the maximum size of the codeword
being
>256 symbols/polynomial terms.
>
>Intuitively, it seems to me that we could use a different characteristic
>for both levels, eg, GF(2^8) for the "symbol level" by effecting only on
>the primitive polynomial, and GF(2^32) for the "word level" effecting
only
>on the generator polynomial.
>

Hi,

If you work with GF(2^8), then you have only 256 possible values in your
field, and RS code is based on the evaluation of a degree k (k being the
number of useful symbols) on n different symbols (n being the total number
of symbols transmitted). So n <= 2^8, otherwise you send redundant
informations (same evaluations sent more than once).

Actually, it seems that often n is choosen to be  2^p-1 (255 in your
case), instead of 2^p, but I don't kwow why.

Julien

---------------------------------------
Posted through http://www.DSPRelated.com

Reply by Steve Pope ●June 30, 20152015-06-30

tsd82 <105802@DSPRelated> wrote:

>If you work with GF(2^8), then you have only 256 possible values in your
>field, and RS code is based on the evaluation of a degree k (k being the
>number of useful symbols) on n different symbols (n being the total number
>of symbols transmitted). So n <= 2^8, otherwise you send redundant
>informations (same evaluations sent more than once).
>
>Actually, it seems that often n is choosen to be  2^p-1 (255 in your
>case), instead of 2^p, but I don't kwow why.
>
>Julien

Short explanation: the field is order 256, meaning there are 255
non-zero elements in the field.  It is simplest to associate
the loations in the codeword with the non-zero field elements; but if
the zero element is included then it is called an extended RS
code.  A further construction allows including an extra location
(at infinity) which results in a doubly-extended RS code, length
257 is this case.

Steve

Reply by dvsarwate ●July 2, 20152015-07-02

On Monday, June 29, 2015 at 8:11:48 PM UTC-5, lrq3000 wrote:
> Hi there,
> 
> I have implemented an almost "universal" (read: compatible with most other
> decoders output) Reed-Solomon codec. I feel like I have a good intuitive
> grasp on the whole codec, except for one thing: I just can't understand
> why we are tied to the same characteristic at both the symbol level and at
> the word level.
> 
> Let me explain: if for example you choose to work on GF(2^8), this means
> that you will encode each symbol of a message using 256 possible values,
> which is perfect for binary data. However, in all the implementations I
> have seen, it seems this also ties the maximum size of the codeword being
> 256 symbols/polynomial terms.
> 
> Intuitively, it seems to me that we could use a different characteristic
> for both levels, eg, GF(2^8) for the "symbol level" by effecting only on
> the primitive polynomial, and GF(2^32) for the "word level" effecting only
> on the generator polynomial.
> 
> Am I missing some causal relationship here between those two levels? If
> not, then is it really possible to use a different GF for each level, and
> do you know of any paper/implementation that did this?
> ---------------------------------------
> Posted through http://www.DSPRelated.com

The _characteristic_ of both GF(2^8) and GF(2^32)
is 2; what you mean to ask is whether the fields 
need to be of the same size (or cardinality if you
like fancier words). If you want to use a generator
polynomial whose coefficients are in GF(2^32), then
the codeword symbols will also be in GF(2^32) which
can be represented as 4-vectors of symbols (bytes)
from GF(2^8). This is true even if you insist that
the information symbols that you plan on using are
restricted to GF(2^8): when you do the encoding (with
a systematic code), the data symbols will be single
bytes but the parity-check symbols will be from
GF(2^32) and will need 4 bytes to represent. To get
a code that is purely in GF(2^8) will require that
you allow as data only those k-byte patterns that
will give you (4-byte) parity-check symbols of the
form (0,0,0,x) so that the leading zero-valued bytes
need not be transmitted. This is called a **subfield
subcode** since we are restricting _all_ the symbols
(32 bits or 4 bytes) to be in the _subfield_ GF(2^8).
They are not nearly as good as Reed-Solomon codes
designed from the start to have byte-size symbols and
to operate over GF(2^8) as both the symbol field as
well as the coefficient field.

The TL;DR summary is that Yes, it is possible to
use fields of different size but not worth the bother.

Dilip Sarwate

Reply by lrq3000 ●July 16, 20152015-07-16

Thank you all very much, all the answers were very helpful.

I still tried to use longer codewords than what is normally permitted by
the field's size (eg, codeword of length 510 in field 2^8), and I adapted
the different algorithms to support that, but there's a theoretical wall
that I can't beat: values above 2^8 will just be modulo reduced, so that
any error will have two locations: one in range 0-255 and one in range
256-510. So now I better understand the intuition behind the fact that the
field's size must be the same for both at the codeword length level and at
the symbols values level.

Also thank you Dilip for your explanation, this is an alternative I have
thought of and that I would probably have spent time trying without your
clear explanation that it's not worthwile at all.

Steve Pope, I am interested in learning more about singly and doubly
extended RS code, what is a good resource you would advise me (if possible
with algorithms, not only mathematical equations as I intend to implement
them), and what are the advantages of extended RS codes compared to
standard RS?
---------------------------------------
Posted through http://www.DSPRelated.com

Reply by lrq3000 ●July 16, 20152015-07-16

>Thank you all very much, all the answers were very helpful.
>
>I still tried to use longer codewords than what is normally permitted by
>the field's size (eg, codeword of length 510 in field 2^8), and I
adapted
>the different algorithms to support that, but there's a theoretical wall
>that I can't beat: values above 2^8 will just be modulo reduced, so that
>any error will have two locations: one in range 0-255 and one in range
>256-510. So now I better understand the intuition behind the fact that
the
>field's size must be the same for both at the codeword length level and
at
>the symbols values level.
>
>Also thank you Dilip for your explanation, this is an alternative I have
>thought of and that I would probably have spent time trying without your
>clear explanation that it's not worthwile at all.
>
>Steve Pope, I am interested in learning more about singly and doubly
>extended RS code, what is a good resource you would advise me (if
possible
>with algorithms, not only mathematical equations as I intend to
implement
>them), and what are the advantages of extended RS codes compared to
>standard RS?
>---------------------------------------
>Posted through http://www.DSPRelated.com

PS: to extend my explanation about why the field's size needs to be the
same, it's because the values of the generated ECC symbols and of any
symbol in fact are in fine used to generate the positions of the errors:
if a symbol's value is limited to 2^8, you cannot generate a position
above, and thus a codeword cannot be of a longer size. So in reality,
there is no field's size for the "codeword level" as I imagined: there is
only the field's size at the symbol's level, the limit on codeword length
is just a natural consequence.

Anyway I wonder if some genius mathematician could devise an error
correcting code that would not have any limit on the codeword length. I've
heard about Cauchy-Reed-Solomon, but in fact it's a trick because it just
moves the limit by aggregating several symbols into blocks, and the
codeword limit is then on the number of blocks instead of the number of
symbols.

Have someone heard of any ecc without a limit on the codeword length?
---------------------------------------
Posted through http://www.DSPRelated.com

Reply by dvsarwate ●July 16, 20152015-07-16

On Thursday, July 16, 2015 at 6:53:51 AM UTC-5, lrq3000 asked:

> Anyway I wonder if some genius mathematician could devise an error
> correcting code that would not have any limit on the codeword length.

About 45 years ago, "genius mathematicians" named Bose and 
Chaudhuri, and independently Hocquenghem, devised a method
for constructing t-error-correcting codes that can be used
to make binary codes (and more generally, codes over a
specified finite field such as GF(256)) that are as long as 
one wants. Note that t is fixed but the block length n can
be made as large as one wants.

What exactly do you need?

Dilip Sarwate

Reply by Eric Jacobsen ●July 16, 20152015-07-16

On Thu, 16 Jul 2015 06:53:48 -0500, "lrq3000" <106516@DSPRelated>
wrote:

>>Thank you all very much, all the answers were very helpful.
>>
>>I still tried to use longer codewords than what is normally permitted by
>>the field's size (eg, codeword of length 510 in field 2^8), and I
>adapted
>>the different algorithms to support that, but there's a theoretical wall
>>that I can't beat: values above 2^8 will just be modulo reduced, so that
>>any error will have two locations: one in range 0-255 and one in range
>>256-510. So now I better understand the intuition behind the fact that
>the
>>field's size must be the same for both at the codeword length level and
>at
>>the symbols values level.
>>
>>Also thank you Dilip for your explanation, this is an alternative I have
>>thought of and that I would probably have spent time trying without your
>>clear explanation that it's not worthwile at all.
>>
>>Steve Pope, I am interested in learning more about singly and doubly
>>extended RS code, what is a good resource you would advise me (if
>possible
>>with algorithms, not only mathematical equations as I intend to
>implement
>>them), and what are the advantages of extended RS codes compared to
>>standard RS?
>>---------------------------------------
>>Posted through http://www.DSPRelated.com
>
>PS: to extend my explanation about why the field's size needs to be the
>same, it's because the values of the generated ECC symbols and of any
>symbol in fact are in fine used to generate the positions of the errors:
>if a symbol's value is limited to 2^8, you cannot generate a position
>above, and thus a codeword cannot be of a longer size. So in reality,
>there is no field's size for the "codeword level" as I imagined: there is
>only the field's size at the symbol's level, the limit on codeword length
>is just a natural consequence.
>
>Anyway I wonder if some genius mathematician could devise an error
>correcting code that would not have any limit on the codeword length. I've
>heard about Cauchy-Reed-Solomon, but in fact it's a trick because it just
>moves the limit by aggregating several symbols into blocks, and the
>codeword limit is then on the number of blocks instead of the number of
>symbols.
>
>Have someone heard of any ecc without a limit on the codeword length?
>---------------------------------------
>Posted through http://www.DSPRelated.com

Use a FEC like a convolutional code?  You can stream it as long as you
like.


Eric Jacobsen
Anchor Hill Communications
http://www.anchorhill.com

Reply by Steve Pope ●July 16, 20152015-07-16

Eric Jacobsen <eric.jacobsen@ieee.org> wrote:

>On Thu, 16 Jul 2015 06:53:48 -0500, "lrq3000" <106516@DSPRelated>

>>Have someone heard of any ecc without a limit on the codeword length?

>Use a FEC like a convolutional code?  You can stream it as long as you
>like.

Yes, but the simplest such approaches have multiple coderwords
in the stream.  Which is fine.  One approach I like is Berlekamp's
helical interleaver, in which the codeword length N and the interleaving
depth D are selected to be relatively prime.

Steve

Previous12 Next

Reed-Solomon FEC necessary to have the same characteristic for symbols and words?

Sign in

You might also like...

Search forums

Free PDF Downloads

Blogs - Hall of Fame

Discussion Groups

Quick Links

About DSPRelated.com

Social Networks

The Related Media Group