Hello DSP gurus.

I am busy researching the possibilities of detecting and recognizing text in motion video (MPEG/WMV stream). So far I've found lots of university projects and a handful of commercial applications that kind of do what I seek.

I am not interested in spending months or years of research and development to build my own software. I am looking for a currently stable and functional SDK to add text recognition to my existing A/V research and archiving software. I am looking for a system that performs the recognition once, so I can put the metadata in a database for later use (in information searches).

Is anyone here familiar with this, and can you point me in the right direction?

I'd appreciate any response!

Best regards,

Rob Vermeulen
Arbor Audiocommunications BV
rvermeulenatarbor-audiodotcom
Text recognition in motion video
Started by ●August 3, 2004
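The "recognize once, search later" idea in the post above can be sketched with a small metadata table. This is only a minimal illustration using Python's standard `sqlite3` module; the table name `video_text`, its columns, and the helper names `open_index`, `store_hit` and `search` are assumptions made up for this example and do not come from any SDK mentioned in the thread.

```python
import sqlite3


def open_index(path=":memory:"):
    """Open (or create) a table with one row per recognized text snippet."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS video_text ("
        " video TEXT NOT NULL,"    # file or stream identifier
        " start_s REAL NOT NULL,"  # first time the text is visible, in seconds
        " end_s REAL NOT NULL,"    # last time the text is visible, in seconds
        " text TEXT NOT NULL)"     # recognizer output
    )
    return db


def store_hit(db, video, start_s, end_s, text):
    """Store one recognition result; called once per detected text occurrence."""
    db.execute(
        "INSERT INTO video_text VALUES (?, ?, ?, ?)",
        (video, start_s, end_s, text),
    )
    db.commit()


def search(db, term):
    """Return all stored occurrences whose text contains the search term."""
    rows = db.execute(
        "SELECT video, start_s, end_s, text FROM video_text"
        " WHERE text LIKE ? ORDER BY video, start_s",
        ("%" + term + "%",),
    )
    return rows.fetchall()
```

With this layout the expensive recognition step runs once per video, and later searches only touch the database. SQLite's `LIKE` is case-insensitive for ASCII by default, which suits noisy recognizer output reasonably well.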
Reply by ●August 3, 2004
I have no experience with this, but SRI's ConTEXTract may be worth a look. I heard about it when I was browsing Virage's web site a while ago (they were offering it bundled with one of their products).

http://www.esd.sri.com/automation/video_recog.html
Reply by ●August 3, 2004
Thanks, I already found that one while googling around. It is indeed quite impressive.

Cheers,

Rob

"David Gelbart" <gelbart@ICSI.Berkeley.EDU> wrote in message news:ceo12r$6hs$1@agate.berkeley.edu...
> I have no experience with this, but SRI's ConTEXTract may be worth a look. I heard about it when I was browsing Virage's web site a while ago (they were offering it bundled with one of their products).
>
> http://www.esd.sri.com/automation/video_recog.html
Reply by ●August 3, 2004
I would suggest also posting to comp.speech.research. IIRC there was a recent discussion there of identifying speech occurring in the presence of other audio.
Reply by ●August 3, 2004
Thanks for your reply, but what has this got to do with speech recognition? :-)

Perhaps you misinterpreted the word 'text', which in this case means graphically written (drawn/rendered) words. I am not looking for a speech recognition algorithm; that I already have.

But any input is welcome!

Cheers,

Rob

"Richard Owlett" <rowlett@atlascomm.net> wrote in message news:10gvu2q9gks3qd6@corp.supernews.com...
> I would suggest posting also to comp.speech.research .
> IIRC there was recent discussion of identifying speech occurring in
> presence of other audio.
Reply by ●August 3, 2004
Oops. You're right. I'm biased towards problems *I* wish to solve ;]

Now, do you have a lead on an end-user "phoneme recognizer" as opposed to a "speech recognizer"?
Reply by ●August 4, 2004
> Now do you have lead on end user "phoneme recognizer" as opposed to
> "speech recognizer"?

I use the Nexidia SDK (formerly known as Fasttalk) for speech recognition. It is indeed phoneme-based recognition, which is language- and dialect-dependent. I haven't got the capacity or time to develop my own, although it is a very interesting subject. But I do like to combine technologies into something that is worth twice as much as the sum of the parts :-)

hth,

Rob
Reply by ●August 4, 2004
This may require too much work on your side, but it is at least worth mentioning.

If your software is able to provide a still image of the text, then the OCR tools that are used with scanners might be applicable. I am thinking of "gocr", which should be possible to integrate.

Bernhard
Reply by ●August 4, 2004
Bernhard,

Thanks for this. I have been searching for OCR SDKs for the past few days, but gocr did not show up in Google. I'll look it up and see if it is usable.

Yes, I can produce still images which I can feed to an OCR algorithm. I have already tested this with several algorithms, and even with an OCR routine I wrote myself (which turned out to be basically simple).

The problem with OCR is that the algorithm only works accurately when text is placed on a solid background, which isn't the case in video material most of the time. I want to detect subtitles and other "overlaid text", but also "scene text" such as license plates on cars and company logos on buildings. Text in the last category can also appear at any angle, rotated in any direction, and even in perspective.

What I need is more than just OCR. It must first do text detection, classification, deblurring and other preprocessing before it recognizes characters.

But I haven't looked into gocr yet, so it still might surprise me :-)

I'm still open to other suggestions.

Cheers,

Rob

"Bernhard Holzmayer" <holzmayer.bernhard@deadspam.com> wrote in message news:7278517.ib1pVHU4g4@holzmayer.ifr.rt...
> Maybe it's requiring too much work from your side, but it might be
> at least worth mentioning...
>
> If your software is able to provide a still image of the text, then
> the OCR tools which are used for scanners, might be applicable.
> I think of "gocr" which should be integrable.
>
> Bernhard
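To make the "text detection before OCR" step described above a little more concrete, here is a rough sketch of one simple heuristic: rows of a grayscale frame that contain many strong left-to-right intensity jumps are flagged as candidate text bands, on the assumption that rendered characters produce dense vertical strokes. This is not what ConTEXTract, gocr or any other product in this thread does; the function names, the thresholds and the list-of-rows frame format are all assumptions chosen for illustration, and the heuristic only covers the detection stage, not classification, deblurring or perspective correction.

```python
def edge_density_rows(frame, threshold=40):
    """For each row of a grayscale frame (list of rows of 0..255 values),
    return the fraction of neighbouring pixel pairs whose difference is
    at least `threshold`."""
    densities = []
    for row in frame:
        strong = sum(1 for a, b in zip(row, row[1:]) if abs(a - b) >= threshold)
        densities.append(strong / max(len(row) - 1, 1))
    return densities


def find_text_bands(frame, threshold=40, min_density=0.2, min_height=2):
    """Return (top, bottom) row ranges, bottom exclusive, whose edge density
    suggests that they may contain text."""
    bands = []
    start = None
    for y, density in enumerate(edge_density_rows(frame, threshold)):
        if density >= min_density:
            if start is None:
                start = y  # a candidate band begins here
        elif start is not None:
            if y - start >= min_height:
                bands.append((start, y))
            start = None
    # Close a band that runs to the bottom edge of the frame.
    if start is not None and len(frame) - start >= min_height:
        bands.append((start, len(frame)))
    return bands
```

Only the bands returned by `find_text_bands` would then be cropped and passed on to later stages, so the recognizer never sees the rest of the picture. Busy scene content (fences, foliage) will also trigger this heuristic, so a real system would need a classification step after it, as noted in the post.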
Reply by ●August 4, 2004
Rob Vermeulen wrote:

...
> The problem with OCR is that the algorithm only works accurately when text is
> placed on a solid background, which isn't the case in video material most of
> the time. I want to detect subtitles and other "overlaid text" but also
> "scene text" such as license plates on cars and company logos on buildings.
> The text in the last category can also appear at any angle, rotated in
> any direction and even in perspective.
> What I need is more than just OCR. It must first do text detection,
> classification, deblurring and other preprocessing before it
> recognizes characters.

There may be image enhancement processes that you could use to isolate the letter outlines. Once you have those, you can put them on any background you like. Some OCR programs may be able to use that information directly.

I can't begin to estimate the resources needed. They could well be excessive.

Jerry
--
Engineering is the art of making what you want from things you can get.
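One possible reading of the suggestion above (isolate the letters, then place them on a background of your choice) is a binarization step. The sketch below uses Otsu's global threshold, a standard technique that is swapped in here for illustration and is not named anywhere in the thread. It assumes the region has already been cropped to a candidate text area, that gray values are integers in 0..255, and that the text is brighter than its surroundings unless `text_is_bright=False` is passed. The function names are hypothetical.

```python
def otsu_threshold(pixels):
    """Return the gray level t that maximizes the between-class variance
    when pixels <= t form one class and pixels > t the other."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))
    sum_low = 0      # sum of gray values in the low class
    weight_low = 0   # number of pixels in the low class
    best_t, best_var = 0, -1.0
    for t in range(256):
        weight_low += hist[t]
        if weight_low == 0:
            continue
        weight_high = total - weight_low
        if weight_high == 0:
            break
        sum_low += t * hist[t]
        mean_low = sum_low / weight_low
        mean_high = (sum_all - sum_low) / weight_high
        between = weight_low * weight_high * (mean_low - mean_high) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t


def isolate_text(region, text_is_bright=True):
    """Binarize a cropped grayscale region so that text pixels become 0
    (black) and everything else becomes 255 (white), which is the solid
    background a scanner-style OCR tool expects."""
    t = otsu_threshold([p for row in region for p in row])
    return [
        [0 if (p > t) == text_is_bright else 255 for p in row]
        for row in region
    ]
```

A single global threshold like this only works when the cropped region is roughly two-toned, for example subtitles over a dark band. Cluttered scene text would likely need local thresholding or more elaborate enhancement, which is where the resource concern raised above comes in.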