Inline interpolation 320 to 720

Started by Alex Z June 23, 2009
Hello group.
I'm now working on converting a 320x240 RGB video stream (50 fps, progressive) into a standard D1 stream (YCbCr 4:2:2, BT.656 sync-coded) so that it can interface to virtually any commercial video encoder device.
The gamma correction on RGB and the subsequent RGB to YCbCr conversion have to be done first, and both are straightforward.
Now I need to convert from 320 pixels per line to the PAL-standard 720 (non-square pixels).
I do not intend to do any real line-wise interlacing, since I have 50 fps at the input: I'll just stream the frames to the output, and the encoder will treat every two adjacent input frames as the two fields of a single frame, which is fine for me. The only line-related consideration is converting 240 input lines into 288 output lines (to make up the standard 576 lines of a PAL frame), which I plan to do by duplicating every 5th input line, so that every 6th output line is a repeat.
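To make the line count concrete: 240 -> 288 reduces to 5:6, so one repeated line per group of five input lines gives exactly 48 extra lines. A minimal sketch of that mapping follows; the function name and the choice of which line in each group gets repeated are assumptions for illustration only.

```python
def output_line_sources(n_in=240, n_out=288):
    """For each output line, return the index of the input line to emit.

    240:288 reduces to 5:6, so every group of 5 input lines
    produces 6 output lines, one of which is a repeat.
    """
    return [(k * n_in) // n_out for k in range(n_out)]


src = output_line_sources()
print(len(src))   # 288
print(src[:7])    # [0, 0, 1, 2, 3, 4, 5]
```

In hardware this amounts to re-reading one line buffer once every five input lines, so it should not need more than a line or two of storage.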
The real question is about 320 -> 720 in-line interpolation.
After the RGB (gamma-corrected) to YCbCr conversion and the 4:4:4 to 4:2:2 subsampling, I'll have 320 Y samples along with 320 chroma samples (160 Cb and 160 Cr) in a single line.
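A minimal sketch of the 4:4:4 to 4:2:2 step, assuming the simplest possible method: averaging each adjacent pair of chroma samples. The function name `to_422` is hypothetical, and pair averaging is only one choice; as far as I know BT.656 expects chroma co-sited with the even luma samples, so a small symmetric filter centred on the even sample may be a closer fit.

```python
def to_422(cb, cr):
    """Halve chroma horizontally by averaging each adjacent pair (rounded)."""
    assert len(cb) == len(cr) and len(cb) % 2 == 0
    cb2 = [(cb[i] + cb[i + 1] + 1) // 2 for i in range(0, len(cb), 2)]
    cr2 = [(cr[i] + cr[i + 1] + 1) // 2 for i in range(0, len(cr), 2)]
    return cb2, cr2


# 320 Cb and 320 Cr samples in, 160 of each out.
cb_422, cr_422 = to_422(list(range(320)), [128] * 320)
print(len(cb_422), len(cr_422))   # 160 160
```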
My application emphasizes the Y information (the image is inherently B&W); color is far less important (only for system menus, etc.). So for chroma I'd consider simply duplicating the CbCr samples (or some simple interpolation) to reach 720 color samples per line. However, I think obtaining 720 Y samples out of 320 requires a more robust algorithm to preserve reasonable luma quality.
Another consideration: I'm going to implement all of this in an FPGA, plus memory and DSP blocks where necessary.

I'd prefer to avoid algorithms that require complex computations or deep memory, and I'm considering only one-dimensional approaches, so bicubic and bilinear are lower on the list.

The options I'd like to check out are simple linear interpolation or, alternatively, 9:4 resampling (sample rate conversion).
9:4 resampling looks viable, but it requires an LPF (an anti-imaging/anti-aliasing filter between the interpolation and decimation stages), and I'm a bit uncertain whether an FIR of the necessary length is feasible in an FPGA. The potential issue is tied to the data rates at the input and output: the input rate is 50 fps x 253 lines x 320 ~ 4.4 MHz, while the required output rate (D1) is 27 MHz.
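For the linear option, 720/320 reduces to 9/4, so output sample n sits at input position 4n/9 and the fractional part only ever takes nine values. A minimal integer-only sketch, assuming edge samples are simply clamped (the function name and rounding choice are illustrative, not a reference design):

```python
def linear_320_to_720(x, up=9, down=4):
    """Linear interpolation by up/down with an integer phase in units of 1/up."""
    n_out = len(x) * up // down
    last = len(x) - 1
    y = []
    for n in range(n_out):
        i, frac = divmod(n * down, up)   # input index and phase (0..up-1)
        j = min(i + 1, last)             # clamp the right neighbour at the line end
        # Weighted sum of the two neighbours, rounded to nearest.
        y.append((x[i] * (up - frac) + x[j] * frac + up // 2) // up)
    return y


ramp = [9 * i for i in range(320)]
out = linear_320_to_720(ramp)
print(len(out), out[:5])   # 720 [0, 4, 8, 12, 16]
```

Since only nine weight pairs occur, the division by 9 could presumably be replaced by a small table of pre-scaled coefficients, leaving two multiplies per output sample.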
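On the FIR length concern: in a polyphase arrangement the zero-stuffed 9x signal is never actually built, and each output sample only uses every 9th tap of the prototype filter. A minimal floating-point sketch, assuming a Hamming-windowed sinc prototype with 8 taps per phase (72 taps total); the tap count, window, and edge clamping are assumptions for illustration, not tuned values.

```python
import math

L, M = 9, 4            # interpolate by 9, decimate by 4 (320 * 9 / 4 = 720)
TAPS_PER_PHASE = 8     # assumed; sets the multiplies per output sample


def design_phases(taps_per_phase=TAPS_PER_PHASE):
    """Windowed-sinc lowpass (cutoff at 1/L of the upsampled Nyquist), split into L phases."""
    n_taps = L * taps_per_phase
    centre = (n_taps - 1) / 2.0
    h = []
    for j in range(n_taps):
        t = (j - centre) / L
        sinc = 1.0 if t == 0 else math.sin(math.pi * t) / (math.pi * t)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * j / (n_taps - 1))
        h.append(sinc * window)
    phases = [h[p::L] for p in range(L)]
    # Normalise each phase to unity DC gain so flat areas stay flat.
    return [[c / sum(ph) for c in ph] for ph in phases]


def resample_line(x, phases):
    """9:4 polyphase resampling of one line; input indices are clamped at the edges."""
    n_out = len(x) * L // M
    last = len(x) - 1
    y = []
    for n in range(n_out):
        base, phase = divmod(n * M, L)   # newest input sample and sub-filter to use
        acc = 0.0
        for k, c in enumerate(phases[phase]):
            acc += c * x[min(max(base - k, 0), last)]
        y.append(acc)
    return y


phases = design_phases()
flat = resample_line([100.0] * 320, phases)
print(len(flat))   # 720
```

With these assumed numbers each output sample costs 8 multiply-accumulates rather than 72, and the filter only needs the last 8 input samples of the current line, so no multi-line buffering is implied. The output is delayed by roughly 4 input samples because the filter is causal.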

The largest data buffer could be the size of a frame, but that would require an external SRAM chip, whereas I'd prefer to use the FPGA's internal memory blocks, which would allow only a few tens of lines to be stored.

I'd be happy to hear educated advice on such in-line interpolation approaches, particularly if backed by relevant experience.

Thanks in advance, Alex