There are 6 messages in this thread.
So, I've been looking at a few ways of creating a module that does "video-in" -> "convert to stream of individual JPEG images" -> "send over a network". One way might be to use a Blackfin DSP.

I'm having some difficulty trying to figure out whether the DSP would be up to the job, though. On the ADI site, they seem to be quoting ~31 cycles/pixel (http://www.analog.com/en/processors-dsp/blackfin/bf_jpeg_motion-jpeg/products/product.html) for quality 60, or ~15 million cycles for an SD video frame (or 25ms/frame). That means at 30fps I'd need 450MHz, which is comfortably within the 600MHz budget of the part (and in fact there are two DSPs on board, so hey, this ought to be simple, right? Right??)

Then, on the other hand, I see people who've been doing this a lot longer than I have (this is my first foray into DSPs) getting a 752x512 frame (not a million miles away from SD video) out in 92ms. What? That's only ~10 fps, and basically means it's worthless to me.

The ADI site does say the memory setup is optimal, but with the sizes of images they're using, they must be storing them in SDRAM, so it must be streaming them from SDRAM to L1/L2 and out again after operating on them. In other words, I don't think the ADI setup is *too* fake.

So, where's the beef? And, at the end of the day, is it possible to do this on the Blackfin?

I've looked at FPGAs (I have some experience there), at XMOS chips to parallelise the problem, and at a honking ARM chip (they're actually pretty fast these days, and I've worked with the NEON stuff before). Quite apart from the attraction of learning a new tool (the DSP), it seems there's less "support stuff" needed for the Blackfin, and it's in a fairly friendly package (it's a BGA, but only the outer two rows are used).

Any help gratefully appreciated :)

Simon
______________________________
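For anyone following along, the arithmetic in the post above can be reproduced with a few lines of Python. This is only a sketch of the numbers as quoted: the ~15 million cycles per frame and the 600MHz clock come from the post, while applying a 600MHz clock to the 92ms result is purely an assumption, since the Blackfin model and clock behind that figure aren't stated.

```python
"""Back-of-envelope cycle budget for the figures quoted in the post."""

ADI_CYCLES_PER_FRAME = 15e6   # ~15 million cycles per SD frame (quoted in the post)
CORE_CLOCK_HZ = 600e6         # 600MHz part (quoted in the post)
TARGET_FPS = 30


def frame_time_ms(cycles_per_frame, clock_hz):
    """Time to encode one frame, in milliseconds."""
    return cycles_per_frame / clock_hz * 1e3


def required_clock_mhz(cycles_per_frame, fps):
    """Core clock needed to sustain the given frame rate, in MHz."""
    return cycles_per_frame * fps / 1e6


def fps_from_frame_time(frame_ms):
    """Frame rate achievable if each frame takes frame_ms milliseconds."""
    return 1e3 / frame_ms


def implied_cycles_per_pixel(frame_ms, width, height, clock_hz):
    """Cycles per pixel implied by a measured frame time at an ASSUMED clock."""
    return frame_ms / 1e3 * clock_hz / (width * height)


if __name__ == "__main__":
    print("ADI frame time (ms):", frame_time_ms(ADI_CYCLES_PER_FRAME, CORE_CLOCK_HZ))
    print("Clock needed for 30fps (MHz):",
          required_clock_mhz(ADI_CYCLES_PER_FRAME, TARGET_FPS))
    print("fps at 92ms/frame:", fps_from_frame_time(92))
    # Assumption: the 92ms result was measured on a 600MHz core.
    print("Implied cycles/pixel for 752x512 in 92ms:",
          implied_cycles_per_pixel(92, 752, 512, CORE_CLOCK_HZ))
```

On that assumed clock, the 92ms result works out to roughly 143 cycles/pixel against ADI's ~31, so the gap is a factor of four to five rather than a rounding error. A slower part would shrink that factor accordingly.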
On Sun, 09 Sep 2012 12:29:22 -0700, krudthebarbarian wrote:

> The ADI site does say the memory setup is optimal, but with the sizes of
> images they're using, they must be storing them in SDRAM, so it must be
> streaming them from SDRAM to L1/L2 and out again after operating on
> them. In other words, I don't think the ADI setup is *too* fake.
>
> So, where's the beef? And, at the end of the day, is it possible to do
> this on the Blackfin?

Whatever ADI is quoting, it's very likely a core algorithm, not close to everything you need.

So it may be that getting things decoded into fast memory (which is way easy to write to) may not be the real bottleneck, or may be only one of the bottlenecks to getting things decoded and out onto a network.

Can you take a canned set of 30 JPEG images of the size you want and pump them out over your network at the speed you want? How much of the processor is left over while you're doing this for decoding?

--
My liberal friends think I'm a conservative kook.
My conservative friends think I'm a liberal kook.
Why am I not happy that they have found common ground?

Tim Wescott, Communications, Control, Circuits & Software
http://www.wescottdesign.com
______________________________
On Sunday, September 9, 2012 1:46:32 PM UTC-7, Tim Wescott wrote:
> On Sun, 09 Sep 2012 12:29:22 -0700, krudthebarbarian wrote:
>
> > The ADI site does say the memory setup is optimal, but with the sizes of
> > images they're using, they must be storing them in SDRAM, so it must be
> > streaming them from SDRAM to L1/L2 and out again after operating on
> > them. In other words, I don't think the ADI setup is *too* fake.
> >
> > So, where's the beef? And, at the end of the day, is it possible to do
> > this on the Blackfin?
>
> Whatever ADI is quoting, it's very likely a core algorithm, not close to
> everything you need.

At the end of the day, I don't really care if I get JPEG files out, as long as there's a relatively painless way to make them into JPEGs.

> So it may be that getting things decoded into fast memory (which is way
> easy to write to) may not be the real bottleneck, or may be only one of
> the bottlenecks to getting things decoded and out onto a network.

Right, but the 92ms figure above is for memory->memory too, although to be fair he doesn't say which Blackfin he's using. ADI are using a slower/less capable Blackfin than the one I had in mind (they are using a single-core '548, I'm looking at the dual-core '561). The network stuff is an additional overhead (although I don't think it'll be too onerous).

I was also trying to think how I could set up a JPEG compression test to be, ahem, advantageous to the marketing people. Given that they're quoting a final JPEG file size, and results for various JPEG compression ratios in terms of DSP cycles, I'm having trouble thinking that it's anything other than it appears to be. Which is why I'm asking, of course :)

The idea is to have multiple video feeds streaming information into a central server. I'm using single-frame compression (JPEG, maybe JPEG-2k) because of the unique nature of the video sources - I'll be back-stepping and overwriting individual frames with new data very frequently but still want to play back arbitrary sequences of frames afterwards.

So, I have to have compression done at the video source (or I'll overload the network as well as put too large a load on the server to do the compression), but on playback it'll be a web-browser / QuickTime interface, so if I can easily transform the data to an MJPEG stream (or similar), I'll be fine. If that means prepending a header, appending a footer or whatever to every frame, I'm fine with that. I don't have to have everything perfectly standard at the instant it's compressed, is what I'm trying to say.

> Can you take a canned set of 30 jpeg images of the size you want and pump
> them out over your network at the speed you want? How much of the
> processor is left over while you're doing this for decoding?

Yeah, the JPEG compression brings them down to ~60k each. A 100Mbit network can easily handle that load. I was figuring that if one of the DSP cores is doing the JPEG compression, the other could do the network management - a simple producer/consumer queue would handle that simply enough. On the Mac, it doesn't even blip the processor to do this :) On the DSP, I was figuring I could use the ENC28J60 and then it's just a matter of sending data via SPI. Depending on how I can arrange it, I might be able to do this via DMA and not even bother the DSP itself.

Of course, it'd be nice to know it was at least feasible before plonking down the $500 for the evaluation kit :)

Simon
______________________________
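A quick sanity check on the network side, as a sketch. It takes the ~60k per frame from the post as 60,000 bytes of payload (an assumption; protocol overhead is ignored) and compares the resulting stream against a link rate. One thing worth double-checking against the Microchip datasheet: as far as I know the ENC28J60 is a 10BASE-T controller, so the 10 Mbit/s figure below is my assumption about that part, not something stated in the thread.

```python
"""Rough payload bandwidth for a stream of independently compressed frames."""

FRAME_BYTES = 60_000          # "~60k each" from the post, taken as 60,000 bytes
FPS = 30                      # target frame rate from the original post
FAST_ETHERNET_MBIT = 100.0    # the 100Mbit network mentioned in the post
ENC28J60_LINK_MBIT = 10.0     # ASSUMPTION: 10BASE-T link rate; check the datasheet


def stream_mbit_per_s(frame_bytes, fps):
    """Payload bit rate of the stream in Mbit/s (no protocol overhead)."""
    return frame_bytes * 8 * fps / 1e6


def link_utilisation(frame_bytes, fps, link_mbit):
    """Fraction of the link consumed by the stream; above 1.0 means it won't fit."""
    return stream_mbit_per_s(frame_bytes, fps) / link_mbit


if __name__ == "__main__":
    rate = stream_mbit_per_s(FRAME_BYTES, FPS)
    print("Stream rate (Mbit/s):", rate)
    print("Share of 100 Mbit link:",
          link_utilisation(FRAME_BYTES, FPS, FAST_ETHERNET_MBIT))
    print("Share of assumed 10 Mbit link:",
          link_utilisation(FRAME_BYTES, FPS, ENC28J60_LINK_MBIT))
```

If that assumption holds, a single 30 fps feed at ~60k/frame needs about 14.4 Mbit/s: fine on 100 Mbit Ethernet, but more than a 10 Mbit/s link can carry, so the frame rate, the frame size or the choice of Ethernet part would have to give.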
On Sun, 09 Sep 2012 14:33:17 -0700, krudthebarbarian wrote:

> Of course, it'd be nice to know it was at least feasible before plonking
> down the $500 for the evaluation kit :)

Well, ADI's numbers certainly point to it being feasible. No matter what idea you choose as a candidate, I think it's a Really Good Idea to buy an eval kit and try it out before you start laying out a board -- and make sure that you've identified as many possible sources of slowness as you can, and have as many of them in there (or at least accounted for) as you can.

--
Tim Wescott, Communications, Control, Circuits & Software
http://www.wescottdesign.com
______________________________
On Monday, September 10, 2012 9:46:54 AM UTC-7, Tim Wescott wrote:

> Well, ADI's numbers certainly point to it being feasible. No matter what
> idea you choose as a candidate, I think it's a Really Good Idea to buy an
> eval kit and try it out before you start laying out a board

Oh, absolutely. When diving into a technology that's new (at least to me - I think I understand the theory, but theory and practice are two different things :), I like to have a "known-good" position on at least something. That gives me a way to build the knowledge on a firm foundation: start off simple with the appropriate "hello, world" code, and iterate through increasing levels of complexity until you get a solution.

There's actually a newer Blackfin board that seems to be "the future", but since it doesn't come with video-in onboard, I'm sticking to the one that does. "Reduce risk" is something of a mantra for me :) Not being on the bleeding edge while learning stuff is generally a good idea too.

> -- and make
> sure that you've identified as many possible sources of slowness as you
> can, and have as many of them in there (or at least accounted for) as you
> can.

Yeah, this is where the uncertainty lies. They actually have an eval kit that does everything I could ask for (http://www.analog.com/en/evaluation/bf561-ezlite/eb.html): video in routed to the DSP, lots of I/O (albeit on prototyping-unfriendly connectors) and sufficient onboard RAM.

This isn't a business idea, it's an attempt to reincarnate a (really expensive) solution that my company (before I sold it :) used to supply about a decade ago. Since it's just a hobby project, I'd like to have a *bit* of certainty before I splash the cash :)

I've been trying to join Analog Devices' forum to ask pertinent questions there, but after you type in your details they try to send an email to confirm your address, and that email is never sent (or at least it never arrives). Thus no ability to log into their forum. Thus frustration.

I've sent emails to their web team, but we'll see if that garners any reaction.
______________________________
Hi there,

the reply might come a bit late, but anyhow: I haven't done any high-speed JPEG encoding on the BF561, but I did some PAL YUV420 MJPEG at 10 fps (with only the 2D DCT coded in asm), plus a rather stupid Bayer conversion, under uClinux on the single-core Blackfins. Actually, the Bayer conversion burns many cycles. If you have a video source that already delivers some YUV format, like a few CMOS sensors do, you'll likely reach 25 fps with a dual core.

To be on the safe side, you could put the JPEG encoding into an FPGA. That works rather well. Some more information here: http://tech.section5.ch/news/?p=219

Cheers,

- Martin
______________________________
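On the earlier point in the thread about turning individually compressed frames into something a browser can play by prepending a header to every frame: one common way is MJPEG over HTTP, using a multipart/x-mixed-replace response. A minimal sketch, assuming each frame is already a complete JPEG file. The boundary string and function names are made up for illustration, and whether the playback client in question accepts this format is untested.

```python
"""Wrap complete JPEG frames as parts of a multipart/x-mixed-replace stream."""

BOUNDARY = b"frame"  # hypothetical boundary string; any token not found in the data


def http_stream_header(boundary=BOUNDARY):
    """Response header sent once, before the first frame."""
    return b"".join([
        b"HTTP/1.0 200 OK\r\n",
        b"Content-Type: multipart/x-mixed-replace; boundary=", boundary, b"\r\n",
        b"\r\n",
    ])


def wrap_jpeg(jpeg_bytes, boundary=BOUNDARY):
    """Prepend the per-frame part header and append the trailing CRLF."""
    return b"".join([
        b"--", boundary, b"\r\n",
        b"Content-Type: image/jpeg\r\n",
        b"Content-Length: ", str(len(jpeg_bytes)).encode("ascii"), b"\r\n",
        b"\r\n",
        jpeg_bytes,
        b"\r\n",
    ])


if __name__ == "__main__":
    # Placeholder payload: just the JPEG start (FFD8) and end (FFD9) markers,
    # standing in for a real compressed frame.
    fake_frame = b"\xff\xd8\xff\xd9"
    stream = http_stream_header() + wrap_jpeg(fake_frame) + wrap_jpeg(fake_frame)
    print(len(stream), "bytes in a two-frame example stream")
```

The compressed data itself is left untouched, so the encoder at the video source doesn't need to know anything about the playback format; the wrapping could be done on the server at playback time.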