Jeff,
--- Jeff Brower <jbrower@jbro...> wrote:
> Mike-
>
> > I put this on the shelf and had an occasion to look
> > at some c6727 issues in more detail [one group of
> > functions picked up a 30%+ speed increase with only a
> > recompile and a noticeable reduction in code size].
> > After trying to 'reverse engineer' where the
> > improvements were located, I thought that I would
> > apply some of my own advice - RTFM.
> >
> > My enlightened personal suggestion is that you perform
> > some serious benchmarking on the c6727 if your 6713 is
> > running out of gas - and maybe even if it isn't.
> >
> > I found the following documents very interesting and
> > helpful - especially the migration guide. It's
> > amazing how much more meaningful some of the
> > information can be after you have 'mucked with the
> > details' and measured it.
> >
> > SPRAA78, May 2005
> > TMS320C6713 to TMS320C672x Migration Guide
> >
> > SPRS277A, May 2005, revised November 2005
> > TMS320C672x Floating-Point Digital Signal Processor ROM
> > - On-Chip Bootloader
> > - Full-Feature Version of DSP/BIOS Operating System
> > - Optimized Math Library (FastRTS)
> > - Library of Commonly Used DSP Functions (DSPLIB)
> >
> > RE. Jeff's cache/c5502/c6727 comments
> > Jeff's comments seemed a bit odd, but I was busy at
> > the time... After seeing a very significant
> > performance increase with the c6727, I decided that
> > 'something was wrong with the picture'. I am not sure
> > if Jeff's code just has 'not much to do' or what.
> > Although it is true that there is no L2 cache on the
> > c6727, it is also true that the c6727 L1P cache is
> > 32 KB [vs. 4 KB for 6713 L1P or 64 KB max for 6713 L2].
> > I looked up the c5502 and it had even less cache: 16 KB.
>
> I wish our code had not much to do :-) My
> experience has been that C5xxx code
> compiles far more compactly than C67xx, making
> onchip memory and cache more
> effective.
>
> C6727 small size is impressive for floating-point,
> but I still don't see a persuasive
> advantage for audio/acoustic applications vs.
> smaller C55xx devices that have lower
> power consumption and a wider range of peripherals
> those apps typically need. Vs.
> C6713 it's faster but what if the app needs SDRAM?
> 64k x combined total data + prog
> memory is sort of a throwback, not to mention
> requiring very compact code (see
> previous point).
>
> If no L2 cache, no low power, no suite of I/O
> peripherals, then why not double the
> clock rate? C641x runs at 1 GHz, TigerSharc runs
> 600 MHz. Where's the compelling
> reason?
I guess that you have listed some of the reasons that
TI makes more than one architecture.
mikedunn
>
> -Jeff
>
> > --- Andrew Elder <andrew_elder@andr...> wrote:
> >
> > > Mike and Jeff,
> > >
> > > Thanks for the comments. I guess the most useful
> > > observation is that the C6727 is not necessarily an
> > > obvious upgrade path from a C6713.
> > >
> > > Mike, what are ROM DSPLIB functions ?
> > > Has anyone noticed whether updated DSPLIB routines
> > > for the 6727 have been released by TI ?
> > >
> > > - Andrew E
> > >
> > > Mike Dunn wrote:
> > >
> > > >Andrew,
> > > >
> > > >some comments below.
> > > >
> > > >mikedunn
> > > >
> > > >--- Jeff Brower <jbrower@jbro...> wrote:
> > > >
> > > >
> > > >
> > > >>Andrew-
> > > >>
> > > >>
> > > >>
> > > > >>>I wonder if anyone has real-world performance
> > > > >>>comments on the C6727 ?
> > > > >>>Is it MUCH faster than the 6713 ?
> > > > >>>
> > > > >Just because it is newer, doesn't always mean faster
> > > > >[apps].
> > > > >
> > > > >>>Do the extra 32 registers allow the compiler to
> > > > >>>"go to town" on optimization ?
> > > > >>>
> > > > >>>We currently use the C6713 and enjoy the L2
> > > > >>>caching performance speedup, even if it is at the
> > > > >>>expense of 100% deterministic behaviour. Since the
> > > > >>>C6727 doesn't have any L2 cache, it looks like we
> > > > >>>would have to move things like FIR filter coeffs to
> > > > >>>internal memory at runtime to get any sort of
> > > > >>>performance. In fact we would have to restructure
> > > > >>>how many of our algorithms run in order to support
> > > > >>>shuffling arrays of data in and out of internal
> > > > >>>memory.
> > > > >>>
> > > > >>>General comments anyone ?
> > > > >>>
> > > > >>We compared C6727 closely to C5502 for
> > > > >>high-performance, high-precision acoustic/audio
> > > > >>applications and stayed with C5502, which has an
> > > > >>instruction cache and is extremely efficient with
> > > > >>32-bit precision operations at 300 MHz. The lack of
> > > > >>L2 cache is a big deal.
> > > > >>
> > > >Some random comments...
> > > > >- I think that you really need to take a close look
> > > > >at the C6727 before selecting [or not selecting] it
> > > > >[isn't that always the case??].
> > > > >- I think that this part is much more 'polarizing'
>
=== message truncated ===
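
For the point raised above about moving FIR filter coeffs into internal memory and shuffling blocks of data in and out when there is no L2 cache, here is a minimal sketch in plain C. It is only an illustration, not anyone's actual code from this thread: NUM_TAPS, BLOCK_LEN, fir_load_coeffs and fir_process are made-up names, the static arrays merely stand in for buffers that a linker would place in on-chip RAM, and memcpy stands in for whatever copy or DMA mechanism a real design would use.

```c
#include <stddef.h>
#include <string.h>

#define NUM_TAPS  4   /* hypothetical filter length */
#define BLOCK_LEN 8   /* hypothetical block size that fits in fast memory */

/* Assumption: on a real target these would be placed in on-chip RAM
 * by the linker. Here they are ordinary static arrays. */
static float fast_coeffs[NUM_TAPS];
static float fast_in[BLOCK_LEN + NUM_TAPS - 1];  /* history + one block */
static float fast_out[BLOCK_LEN];

/* Copy the coefficients into the fast buffer once, before processing. */
void fir_load_coeffs(const float *coeffs)
{
    memcpy(fast_coeffs, coeffs, sizeof(fast_coeffs));
}

/* Filter n samples, staging one block at a time through the fast buffers.
 * Computes y[i] = sum over k of h[k] * x[i - k], with x[j] = 0 for j < 0. */
void fir_process(const float *in, float *out, size_t n)
{
    const size_t hist = NUM_TAPS - 1;  /* samples carried between blocks */
    size_t done = 0;

    /* Start with an all-zero history. */
    memset(fast_in, 0, hist * sizeof(float));

    while (done < n) {
        size_t len = (n - done < BLOCK_LEN) ? (n - done) : BLOCK_LEN;

        /* Stage the next input block in, right after the history. */
        memcpy(fast_in + hist, in + done, len * sizeof(float));

        /* Run the filter entirely out of the fast buffers. */
        for (size_t i = 0; i < len; i++) {
            float acc = 0.0f;
            for (size_t k = 0; k < NUM_TAPS; k++)
                acc += fast_coeffs[k] * fast_in[hist + i - k];
            fast_out[i] = acc;
        }

        /* Stage the results back out. */
        memcpy(out + done, fast_out, len * sizeof(float));

        /* Keep the newest samples as history for the next block. */
        memmove(fast_in, fast_in + len, hist * sizeof(float));
        done += len;
    }
}
```

Calling fir_load_coeffs() once and then fir_process() per buffer keeps the inner loop working only on the small staged arrays. Whether that restructuring pays off on a given part is exactly the kind of thing that needs to be benchmarked.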