The relatively recent book "Principles of Communication Systems Simulation with Wirless Applications" by Tranter & Shanmugan & Rappaport & Kosbar talks about a Monte Carlo simulation technique called "block-serial": page 380, and also page 747 code listing, with the "block size" NIterate = 1e3. This technique is, of course, not new. To summarize, to simulate/estimate the bit-error rate (BER), data blocks (e.g. of 1000 symbols) are processed iteratively until some N total symbols are reached. Then: BER = (total number of errors) / (total number of bits) Hence, it can also be viewed as averaging over many realizations (each block = 1 realization): block 1 has associated BER_1, block 2 has BER_2, etc. If there are K blocks processed then, BER_avg = (BER_1 + BER_2 + ... + BER_K) / K Note that this technique is actually "required" in some cases, (i.e., cannot run a single realization with N total symbols), notably when fading is considered. In such cases, various realizations must be simulated to account for "random" channels: each realization = a different channel condition. Now, it is quite well-known what the total number of symbols N (or the equivalent number of bits) required for a certain BER resolution (e.g. BER=1e-6) should be. However, my question is, does the same N apply when we're segmenting data symbols into "blocks" as described? Most notably, if the data block size is 1000 (and asumming binary symbols), then the BER resolution for each block is 1e-3. So what is the "reliable" BER resolution that one can expect when averaging over many blocks? And what reasons/theorems can be used to justify such claims? Of course, in the ideal case (independent trials, ergodicity assumptions, etc.), there should not be much difference between block-serial and single-block simulations. However, can the same be said for "typical" simulations, e.g. with channel fading, coding techniques, all introducing some amount of correlation? Is it evident that in this case, averaging over (possibly dependent) data blocks would possibly result in biased and/or inconsistent BER estimates? The literature seems to agree on the fact that, to get a reliable BER estimate, some mininum number of raw independent bit errors (e.g. 10 or 100) should be obtained in the simulation. Can this criteria still be reliable in the block-serial approach? To exaggerate, let's suppose we use short blocks of say 20 data symbols, and then repeat as many times as necessary until there are 50 bit errors reported. And suppose that the BER is reported as 1e-5. Will the BER obtained in this case still be "reliable"? How would this compare to a different simulation using blocks of say 5000 data symbols? So I guess the MAIN QUESTION is the following: "WHAT SHOULD THE MINIMUM BLOCK-SIZE BE, TO RELIABLY OBTAIN A CERTAIN BER-RESOLUTION?" For example, to reliably estimate BER=1e-4, should data blocks have at least 1e4 bits? Note that in the cited book by Tranter et al, there are detailed discussions on picking the total number of symbols N to satisfy a certain BER resolution; however, much less is said on how to choose the individual block size (1000 symbols/block assumed in the given examples). And last but not least, if the block-size is an important factor, would this mean a practical problem in BER estimation when considering short-burst systems? e.g. GSM, where the block-size is by design limited to short blocks. Whew, I did not intend to make this a long post, but giving more information is perhaps always better than too few :-) Also, if this topic has been discussed before, I'd appreciate URL pointers, or article citations. Thanks all for your attention and time :-) -- MT
BER estimation with "block-serial" approach
Started by ●April 17, 2005
Reply by ●April 18, 20052005-04-18
The block-serial approach hinges on block independence. So the answer to your question, "what is the minimum block-size" is given simply as the size which will achieve good approximation to block independence. There are two factors determining the correlation between blocks. The first is the dispersive nature of the communications channel, which will spread information between blocks. So the block size should at least be a few orders of magnitude larger than the channel coherence time (impulse response duration). Secondly the channel noise may be coloured and not white. A coloured noise generator can be modelled as the output of an appropriate ARMA (auto-regressive moving average) filter driven by a white input. The level of noise auto-correlation over time will be given by the impulse duration of the ARMA filter. Again, the block size should be orders of magnitude larger than this. If you have a good model for your channel and noise envornment, then it should be a straightforward process to pick your block size. As an example, suppose you are expecting a bit error rate 10^-3. You would measure 10 errors by simulating approximately 10,000 samples. Now your question was, should that be 10*1000, 20*500, 50*200 etc, in the block serial approach. If your channel has impulse response duration of about 100 samples (say a DSL channel) then your block size should be large (10*1000) whereas if the channel is a multipath wireless channel with arrivals over 10 samples, you can use a much shorter block. Porterboy
Reply by ●April 18, 20052005-04-18
Thanks a lot for the suggestions! It makes sense that the filter order and the channel coherence time / Doppler spread (or rather, its inverse) would affect the block size selection. Still, in cases where coding is involved, with intentional redundancy/correlation, perhaps some added conservatism, i.e. larger block sizes, is warranted. And on a somewhat related note, pp. 164-169 of Steven Kay's "Estimation Theory" investigate the effect of finite data record on maximum likelihood estimation. The conclusions seem to be that the precise data record length required for guaranteed asympotic optimality is not generally known in all cases. Fortunately, however, it is found in practice (through simulations) that the data length does not have to be excessive for good estimates. Still, the given simulation examples deal primarily with quantities on the order of 1e-1. Anyway, unless analytical lower bounds can be established, I still feel suspicious of extremely low BERs obtained with small block sizes. After all, with BER as low as say 1e-6, it is quite unlikely that block with small sizes would result in any bit error. And each new block is like "resetting" the simulation state. Thus, the many zero-error blocks would tend to "skew" the average BER, towards something even lower than the true BER value?
Reply by ●April 19, 20052005-04-19
It is not necessarily true that you are resetting the simulation state with each block in the block-serial approach. For example suppose you are using the matlab function rand, or randn for noise/data generation. This can be seeded to run from different points over it's (very long) period giving the same statistics for a large number of small blocks as for a small number of large blocks. Have a look at the documentation for randn.