DSPRelated.com
Forums

bit stuffing in serial protocols

Started by robert bristow-johnson April 1, 2016
this question is really only for my own interest.  i don't work in communications and the only time i ever got paid for doing anything like this was in 1981 or 82 when i implemented a "Kansas City Standard" for saving and loading binary data onto audio cassette tape for a 6809 based platform.

i've asked this before, but cannot find the post in our new, improved Google Groups but i am still sorta obsessed with OQPSK thing because it seems beautifully natural to me.  i like engineering that is beautifully natural.

restating a previous post, if you have a serial bit stream a[n] which is either 0 or 1, the bipolar binary signal is

     x[n]  =  2*(-1/2 + a[n])

so x[n] is either -1 or +1

you define these staggered gating signals as

even-gating function: 

    g[n]      =  (1/2)*(1 + (-1)^n) =  1 for n even, 0 for n odd 
              =  (1/2)*(1 + e^(j*pi*n)) 

and its complement is the odd-gating function: 

    1 - g[n]  =  (1/2)*(1 - (-1)^n) =  0 for n even, 1 for n odd 
              =  (1/2)*(1 - e^(j*pi*n)) 

the two quadrature signals are

    i[n]  =      g[n]*x[n]   +  (1-g[n])*x[n-1]

      = ((1 + e^(j*pi*n))*x[n] + (1 - e^(j*pi*n))*x[n-1])/2 
      = (x[n]+x[n-1])/2 + e^(j*pi*n)*(x[n]-x[n-1])/2
 

    q[n]  =  (1-g[n])*x[n]   +      g[n]*x[n-1]

      = ((1 - e^(j*pi*n))*x[n] + (1 + e^(j*pi*n))*x[n-1])/2 
      = (x[n]+x[n-1])/2 - e^(j*pi*n)*(x[n]-x[n-1])/2


for even n      i[n+1] = i[n]
and for odd n   q[n+1] = q[n]

so both i[n] and q[n] remain constant for two bit periods and change at staggered bit boundaries, which cuts their data rates in half.  i[n] changes only when an odd n changes to an even n and q[n] changes when an even n advances to an odd n.

the IF quadrature vector is  i[n] + j*q[n]  and always transitions from one quadrant to an adjacent quadrant during a single bit transition and never to the opposite quadrant.  this is why i think Offset QPSK is so cool.  it keeps the signal level up, like FM.

and the transmitted signal looks like

    s(t) =  i[n]*cos(2*pi*f_c*t)  -  q[n]*sin(2*pi*f_c*t)

         =  Re{ (i[n] + j*q[n]) * e^(j*2*pi*f_c*t) }

where f_c is the carrier frequency and

   n = floor(f_d * t)

where f_d is the data rate.

now you can plug all this in to the same mess and get

   s(t)  =  1/2 * Re{ (1+j)*(x[n]+x[n-1]) * e^(j*2*pi*f_c*t)) }
    +  1/2 * Re{ (1-j)*(x[n]-x[n-1])*e^(j*pi*n) * e^(j*2*pi*f_c*t)) }

now, i remember slamming this out before and getting for a data stream that is all
  ...0000000000000000...   or
  ...1111111111111111...   or
  ...0101010101010101...   or
  ...1010101010101010...

you would get a solid sinusoid out at the carrier frequency of f_c.  the IF vector does not move.

and when the data is
  ...0011001100110011...
you get a frequency of f_c + f_d/2.  the IF vector is rotating clockwise.
  
and when the data is
  ...0110011001100110...   (offset by one bit)
you get a frequency of f_c - f_d/2.  the IF vector is rotating counter-clockwise.

now does that all make sense?

now it seems to me that the logical thing to do is use one of the four first bit patterns to define an idle channel and the receiver would get a solid f_c to lock onto.

and it would seem to me that using a word like one of the bottom two as a frame sync flag would be good (say 0011001100110011) because it would give you a burst at the extreme frequency f_c + f_d/2 for the longest duration, something that should be able to trip a simple tuned filter (or a matched filter) to force synchronization no matter what.

perhaps an end-of-frame that would be the other word 0110011001100110 that would give you a burst of f_c - f_d/2 would make for a nice, robust protocol.

now do any OQPSK schemes do that?  any that you've heard of or know about?

then of course the danger is: what if your data contains the word 0011001100110011, wouldn't that cause a spurious frame sync?  why not use bit stuffing so that whenever there is a stream of data that looks like 00110011001100 an extra 0 is stuffed into the stream or when 11001100110011 is in the data, an extra 1 is stuffed into the stream?  the receiver would recognize the same patterns, with the stuffed bit, and remove the stuffed bit from the stream.  if the stuffed bit is not there, it would be a frame sync.

i remember SDLC from my olden daze and that was used so that the flag and the beginning and end flags were 01111110 and any five 1 bits in a row always got a zero stuffed in afterward, no matter what the following bit was.

what does OQPSK (or regular QPSK) do for idle channel and for frame synchronization?

sorry for all the pedanticness.

r b-j
On 01.04.2016 19:45, Evgeny Filatov wrote:
(snip)
> That's PSK with a constellation consisting of 3 possible states. The > uncoded TPSK is superior to QPSK in terms of bit-error rate for the same > SNR (up to 0.75 dB gain without bandwidth expansion).
Oops! I've meant "with the same ratio of bandwidth to bit rate". If you go from BPSK to QPSK within the same bandwidth, you double the bit rate but you also have to double the power to retain the same SNR. The ratio of bandwidth to bit rate remains the same. You gain nothing. While if you go from BPSK to TPSK retaining the same bandwidth, you increase the bit rate by a factor of log2(3)=1.585, and you only need to increase the power by a factor of 4/3=1.333. That's a solid 0.75 dB gain! Evgeny
On Fri, 01 Apr 2016 19:45:18 +0300, Evgeny Filatov
<filatov.ev@mipt.ru> wrote:

>On 01.04.2016 9:13, robert bristow-johnson wrote: > >(snip) > >> i've asked this before, but cannot find the post in our new, improved Google Groups but i am still sorta obsessed with OQPSK thing because it seems beautifully natural to me. i like engineering that is beautifully natural. > >(snip) > >> the IF quadrature vector is i[n] + j*q[n] and always transitions from one quadrant to an adjacent quadrant during a single bit transition and never to the opposite quadrant. this is why i think Offset QPSK is so cool. it keeps the signal level up, like FM. > >Do you know a modulation more beautifully natural? Ternary PSK (TPSK). > >That's PSK with a constellation consisting of 3 possible states. The >uncoded TPSK is superior to QPSK in terms of bit-error rate for the same >SNR (up to 0.75 dB gain without bandwidth expansion). > >Like in OQPSK, there are no transitions through zero in TPSK, but TPSK >does that without staggered bits. I've seen claims in the literature >that TPSK is even more spectrally efficient than OQPSK. > >(snip)
I'd never heard of that before. Is that something you came up with? Given that complexity is cheap these days, that might be something fun to play with.
>> what does OQPSK (or regular QPSK) do for idle channel and for frame synchronization? > >That is specified by the protocol, and there are numerous protocols. > >Frame synchronization is typically done with the use of an unique word >(specified by the protocol). One way to deal with spurious locks is that >frame synchronizer is actually a state machine. If it's already in lock >it _expects_ to see the unique word at the start of the next frame, so >it would ignore the unique word if it occurs in the middle of the frame. >If locked frame synchronizer fails to see the unique word at the start >of the next frame it lowers its state and finally gets out of lock. > >Evgeny >
As mentioned, this is very system specific. Robert, are you thinking of a continuous stream system or a bursty system? For a continuous system the framing is often done after demodulation, i.e., in the decoded bits, and is then independent of the modulation and coding (FEC). Sometimes there are advantages to synchronizing the coding, and in those cases Unique Word is usually used or some like a modulated preamble to define a super frame and aid in initial acquisition. For bursty systems the demodulator often has to fully synchronize frequency, phase, and symbol timing from burst-to-burst, so the preamble must contain features that facilitate all of those as well as unambiguously mark the start of modulated data. Sadly, the patterns you mention aren't very good for those tasks, but they are often used for testing. They're a good way to detect distortion like channel balance, phase balance, etc. Eric Jacobsen Anchor Hill Communications http://www.anchorhill.com
On 01.04.2016 20:28, Eric Jacobsen wrote:

> I'd never heard of that before. Is that something you came up with? > Given that complexity is cheap these days, that might be something fun > to play with.
It was one of those "eureka! wait, but some guy has already figured it out over 30 years ago" moments. There are several papers about ternary PSK in Google Scholar. The oldest reference seems to be 1980 article "Comparison of three-phase modulation with two-phase and four-phase modulation" by Pierce, but unfortunately that doesn't seem to be accessible online. Also, with channel codes any initial gain would be greatly decreased (you cannot beat Shannon). So it's sort of moot. Evgeny
Evgeny Filatov  <filatov.ev@mipt.ru> wrote:

>There are several papers about ternary PSK in Google Scholar. The oldest >reference seems to be 1980 article "Comparison of three-phase modulation >with two-phase and four-phase modulation" by Pierce, but unfortunately >that doesn't seem to be accessible online. > >Also, with channel codes any initial gain would be greatly decreased >(you cannot beat Shannon). So it's sort of moot.
I'm not sure I would be ready to conclude the advantages disappear with coding -- TPSK may have a better mutual-information capacity than BPSK (the mutual-information capacity of which falls roughly 2 dB short of the Shannon limit). Steve
On Fri, 01 Apr 2016 20:43:14 +0300, Evgeny Filatov
<filatov.ev@mipt.ru> wrote:

>On 01.04.2016 20:28, Eric Jacobsen wrote: > >> I'd never heard of that before. Is that something you came up with? >> Given that complexity is cheap these days, that might be something fun >> to play with. > >It was one of those "eureka! wait, but some guy has already figured it >out over 30 years ago" moments. > >There are several papers about ternary PSK in Google Scholar. The oldest >reference seems to be 1980 article "Comparison of three-phase modulation >with two-phase and four-phase modulation" by Pierce, but unfortunately >that doesn't seem to be accessible online. > >Also, with channel codes any initial gain would be greatly decreased >(you cannot beat Shannon). So it's sort of moot. > >Evgeny
Modulation types each have their own capacity, and when you're trying to reach "capacity" in a link, it's that capacity that you're trying to hit. Coding helps you get there. Usually coding capacity is shown for antipodal binary, which is essentially BPSK. All modulation types have their own capacity curve, and I'd expect that ternary PSK would fall between BPSK and QPSK, so it would fill a niche that may be useful. I was trying to sort out how to map the constellation, since there isn't a power-of-two number of points. Even mapping across multiple symbols isn't straightforward, since 3 doesn't go into 2^N evenly for practical N. It might make a really cool way to do an R = 1/3 code, though. ;) Eric Jacobsen Anchor Hill Communications http://www.anchorhill.com
On Friday, April 1, 2016 at 1:28:43 PM UTC-4, Eric Jacobsen wrote:
> On Fri, 01 Apr 2016 19:45:18 +0300, Evgeny Filatov > <filatov.ev@mipt.ru> wrote: > > >On 01.04.2016 9:13, robert bristow-johnson wrote: > > > >(snip) > > > >> i've asked this before, but cannot find the post in our new, improved Google Groups but i am still sorta obsessed with OQPSK thing because it seems beautifully natural to me. i like engineering that is beautifully natural. > > > >(snip) > > > >> the IF quadrature vector is i[n] + j*q[n] and always transitions from one quadrant to an adjacent quadrant during a single bit transition and never to the opposite quadrant. this is why i think Offset QPSK is so cool. it keeps the signal level up, like FM. > > > >Do you know a modulation more beautifully natural? Ternary PSK (TPSK). > > > >That's PSK with a constellation consisting of 3 possible states. The > >uncoded TPSK is superior to QPSK in terms of bit-error rate for the same > >SNR (up to 0.75 dB gain without bandwidth expansion). > > > >Like in OQPSK, there are no transitions through zero in TPSK, but TPSK > >does that without staggered bits. I've seen claims in the literature > >that TPSK is even more spectrally efficient than OQPSK.
it's like an equilateral triangle constellation of symbols?
> I'd never heard of that before. Is that something you came up with? > Given that complexity is cheap these days, that might be something fun > to play with.
i remember, when i was a student, talk of "multi-valued logic" and base-3 logic was electrically simple with two MOSFETs connected between ground and Vcc. if both MOSFETs were "on", the output voltage would be halfway between Vcc and Gnd (or maybe it was both "off"). if one MOSFET was "on" and the other "off, then the output voltage was either Vcc or Gnd. 3 symbols. must be an absolute copulating female canine to encode normal (binary) data (the way God meant it to be) into such an abomination. and back again. sorry Evgeny, my kinda "beautifully natural" is not a Rube Goldberg machine. base-4 multi-valued logic was supposed to be easier to convert to binary (it would be 2 bits per wire), but harder to do 4-level logic, electrically. all this was supposed to reduce pin count and the number of PCB-traces. i have never in my life seen any of this multi-valued logic stuff in any product. i am sorta averse to really newfangled goofy shit like this. guess it's the Luddite in me. i've also never heard of 3-state PSK. but why not? :-) i can imagine this 3-state PSK thingie (essentially 3-QAM with an equilateral triangle constellation) to do *differential* binary coding. so you wouldn't give a rat's ass which actual point you're at, the receiver only cares if the movement of the quadrature IF vector is clockwise or counter-clockwise. that sorta makes sense to me. could be *real* clean and simple. like OQPSK, each movement of the IF vector corresponds to exactly one bit of data. i think it's accurate to say that any QPSK or 4-QAM is simple 4-valued data transmission (2 bits per symbol). another reason why i think that *Offset* QPSK is more beautiful than "regular" QPSK is that the serial stream does not have to be divided up into "di-bit" symbols. the new bit arrives from the data stream just in time for the quadrature IF vector to know whether it's going clockwise or counter-clockwise from its present position.
> >> what does OQPSK (or regular QPSK) do for idle channel and for frame synchronization? > > > >That is specified by the protocol, and there are numerous protocols. > > > >Frame synchronization is typically done with the use of an unique word > >(specified by the protocol). One way to deal with spurious locks is that > >frame synchronizer is actually a state machine.
of course it is. that's the way it frames up and decodes the data. it has to know whether it's an even bit time or an odd bit time.
> > If it's already in lock > >it _expects_ to see the unique word at the start of the next frame, so > >it would ignore the unique word if it occurs in the middle of the frame.
it should do more than ignore it. what if it ignores it and it doesn't see the next frame sync when it expects to? then it's "ooops, i guess that was meant to be a frame sync."
> >If locked frame synchronizer fails to see the unique word at the start > >of the next frame it lowers its state and finally gets out of lock.
what if it sees that unique word at the start of the next frame, but it was out of lock? how should it deal with the unique word? what if it thinks a possible version of a frame sync is just happenstance data in the middle of the frame?
> As mentioned, this is very system specific. Robert, are you thinking > of a continuous stream system or a bursty system?
both. either. whichever is worst case. but even if the receiver *thinks* it bursty, i think it should work with a forever continuous stream. and vice-versa. even with a forever continuous stream, there should be occasional frame syncs, just to keep everyone on the same page.
> For a continuous system the framing is often done after demodulation, > i.e., in the decoded bits, and is then independent of the modulation > and coding (FEC).
but you don't necessarily have valid decoded bits after demodulation if the synchronization wasn't right to start with. doesn't there have to be some unambiguous transmission, that if received without error, tells the receiver in no uncertain terms, that we're starting a new frame?
> Sometimes there are advantages to synchronizing > the coding, and in those cases Unique Word is usually used or some > like a modulated preamble to define a super frame and aid in initial > acquisition. > > For bursty systems the demodulator often has to fully synchronize > frequency, phase, and symbol timing from burst-to-burst, so the > preamble must contain features that facilitate all of those as well as > unambiguously mark the start of modulated data.
that's what i am thinking. i am thinking of a stream of data already happening, and you turn on your receiver that has and *idea* of what f_c and f_d are and will be totally useless until the first frame-sync ("preamble", that's the word for it, much better than "unique word") that it sees. but with OQPSK, when the receiver is turned on and listening to a solid f_c, it doesn't have any idea where the IF vector is pointing. could be 00, 01, 11, or 10. the only way it can do that is have an unambiguous preamble that can never appear in the data. and the only way i know of doing that is to do bit stuffing to prevent a combination of bits that happen to look like the preamble from being transmitted inside the data block of the frame.
> Sadly, the patterns you mention aren't very good for those tasks,
why? those two patterns are the only patterns that will detune the carrier away from its center value to the maximum deviation. it would seem to me to be the easiest and cleanest to adapt a matched filter to.
> but they are often used for testing. They're a good way to detect > distortion like channel balance, phase balance, etc.
for OQPSK, what is a good preamble? like what would be a good bit pattern? and, without knowledge of phase sync, how does a receiver detect it (and get all sync'd up)? and if you don't do bit stuffing, like SDLC does to prevent a faux preamble from transmission, how do you robustly keep the receiver from getting confused? thanks Eric. thanks Evgeny. r b-j
robert bristow-johnson  <rbj@audioimagination.com> wrote:

>for OQPSK, what is a good preamble? like what would be a good bit >pattern? and, without knowledge of phase sync, how does a receiver >detect it (and get all sync'd up)?
Some (many?) standards that employ OQPSK have an alternating zero-one preamble pattern, i.e. 0101010101010101010101010101010101 , followed by more of a traditional sync pattern with good distance properties. The advantage of this is that it does, in fact, establish an absolute phase reference point, as the carrier phase is simply slewing back and forth between 0 and pi/2. The disadvantages are I think fairly obvious. (And I'm on warning not to post obvious stuff here...) Steve
spope33@speedymail.org (Steve Pope) writes:

> (And I'm on warning not to post obvious stuff here...)
Damn right! :) -- Randy Yates, DSP/Embedded Firmware Developer Digital Signal Labs http://www.digitalsignallabs.com
On Friday, April 1, 2016 at 11:45:28 PM UTC-4, Steve Pope wrote:
> robert bristow-johnson <rbj@audioimagination.com> wrote: > > >for OQPSK, what is a good preamble? like what would be a good bit > >pattern? and, without knowledge of phase sync, how does a receiver > >detect it (and get all sync'd up)? > > Some (many?) standards that employ OQPSK have an alternating zero-one > preamble pattern, i.e. 0101010101010101010101010101010101 ,
wouldn't that be an idle pattern?
> followed by > more of a traditional sync pattern with good distance properties.
what would that be? i dunno what others think a "traditional sync pattern with good distance properties" is. seems to me it should be 0011001100110011 or 0110011001100110, which detune the instantaneous frequency the most.
> The advantage of this is that it does, in fact, establish an > absolute phase reference point, as the carrier phase is simply slewing > back and forth between 0 and pi/2.
sure, you get a lock on f_c, but you have no idea whether it's 01010101... or 10101010... or 11111111... or 00000000... all of those patterns will give you the same solid sinusoid at f_c (but with different phase, relative to the transmitter clock, which is not known by the receiver, the receiver has to derive it and cannot from any of those four patterns). i can see any one of these as an idle pattern, but i *really* don't see how that is useful at all for a preamble which should, besides telling you that a frame of data is about to come your way, tells you how to sync up to the whole thing. how can you possibly get the quadrature IF vector pointing in the right direction with such a preamble if your reference to phase was ambiguous before? seems to me you need 0011001100110011... or 0110011001100110... in order to do that.
> The disadvantages are I think fairly obvious. (And I'm on warning > not to post obvious stuff here...)
well, dunno what the obvious disadvantages are, because i don't see *any* advantages. r b-j