SPRA921

TMS320C6713 Digital Signal Processor Optimized for High Performance Multichannel Audio Systems

Table 1. C6713 Benchmark Performance

Algorithm                     Description                           Parameter Values    Cycles  Time

Biquad filter                 nx input/output cycles                nx = 60             316     1.4 µs
(IIR filter direct form II)                                         nx = 90             436     1.9 µs

Real FIR filter               nh coefficients,                      nh = 24, nr = 64    802     3.6 µs
                              nr output samples                     nh = 30, nr = 50    795     3.5 µs

IIR filter                    nr number of output samples           nr = 64             443     2.0 µs

IIR lattice filter            nr number of samples,                 nk = 10, nr = 100   4125    18.3 µs
                              nk number of reflection coefficients

Dot product                   nx number of values                   nx = 512            281     1.2 µs
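
The Time column of Table 1 is consistent with dividing each cycle count by a 225 MHz CPU clock, the maximum device frequency quoted in this report. The following Python sketch reproduces that conversion. The function and dictionary names are illustrative, and the assumption that the benchmarks were timed at 225 MHz is inferred from the numbers rather than stated in the table.

```python
# Assumption: the times in Table 1 correspond to a 225 MHz CPU clock.
CPU_CLOCK_MHZ = 225.0


def cycles_to_microseconds(cycles, clock_mhz=CPU_CLOCK_MHZ):
    """Convert a CPU cycle count to execution time in microseconds."""
    # One cycle at f MHz lasts 1/f microseconds.
    return cycles / clock_mhz


# Cycle counts copied from Table 1.
BENCHMARK_CYCLES = {
    "Biquad filter, nx = 60": 316,
    "Biquad filter, nx = 90": 436,
    "Real FIR filter, nh = 24, nr = 64": 802,
    "Real FIR filter, nh = 30, nr = 50": 795,
    "IIR filter, nr = 64": 443,
    "IIR lattice filter, nk = 10, nr = 100": 4125,
    "Dot product, nx = 512": 281,
}

if __name__ == "__main__":
    for name, cycles in BENCHMARK_CYCLES.items():
        time_us = cycles_to_microseconds(cycles)
        print(f"{name}: {cycles} cycles = {time_us:.1f} us")
```

The same helper can be used to estimate the time for other cycle counts or a lower clock rate by passing a different clock_mhz value.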

3 Two-Level Cache

3.1 Cache Overview

The TMS320C6713 device uses a highly efficient two-level real-time cache for internal
program and data storage. The cache delivers high performance without the cost of large arrays
of on-chip memory. The efficiency of the cache makes low-cost, high-density external memory,
such as SDRAM, as effective as on-chip memory.

The first level of the memory architecture has dedicated 4K-byte instruction and data caches,
L1I and L1D respectively. The L1I is direct-mapped, whereas the L1D provides 2-way
associativity to handle multiple types of data. The second level (L2) consists of a total of 256K
bytes of memory. 64K bytes of this can be configured in one of five ways:

64K 4-way associative cache

48K 3-way associative cache, 16K mapped RAM

32K 2-way associative cache, 32K mapped RAM

16K direct-mapped cache, 48K mapped RAM

64K mapped RAM
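
The five configurations above can be written down as a small table in code, which makes the pattern visible: each way of associativity corresponds to 16K bytes of cache, and the cache and mapped RAM portions always add up to 64K bytes. This is a minimal Python sketch; the names L2Mode, L2_MODES, and total_mapped_ram_kb are hypothetical, and it assumes the remaining 192K bytes of L2 are always addressable as mapped RAM.

```python
from collections import namedtuple

# One entry per configuration of the 64K-byte configurable L2 region.
L2Mode = namedtuple("L2Mode", ["cache_kb", "ways", "mapped_ram_kb"])

TOTAL_L2_KB = 256        # total L2 memory
CONFIGURABLE_KB = 64     # portion that can be cache, mapped RAM, or a mix

L2_MODES = [
    L2Mode(cache_kb=64, ways=4, mapped_ram_kb=0),
    L2Mode(cache_kb=48, ways=3, mapped_ram_kb=16),
    L2Mode(cache_kb=32, ways=2, mapped_ram_kb=32),
    L2Mode(cache_kb=16, ways=1, mapped_ram_kb=48),  # direct-mapped
    L2Mode(cache_kb=0, ways=0, mapped_ram_kb=64),   # all mapped RAM
]


def total_mapped_ram_kb(mode):
    """Mapped RAM in a mode: the non-configurable part plus the configurable part."""
    # Assumption: the 192K bytes outside the configurable region are always mapped RAM.
    fixed_ram_kb = TOTAL_L2_KB - CONFIGURABLE_KB
    return fixed_ram_kb + mode.mapped_ram_kb


if __name__ == "__main__":
    for mode in L2_MODES:
        print(f"{mode.cache_kb}K cache ({mode.ways}-way), "
              f"{total_mapped_ram_kb(mode)}K mapped RAM in total")
```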

Dedicated L1 caches eliminate conflicts for the memory resources between the program and
data busses. A unified L2 memory provides flexible memory allocation between program and
data for accesses that do not reside in L1.
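
The difference between the direct-mapped L1I and the 2-way associative L1D described above can be illustrated with a simple set-index calculation. The Python sketch below is a generic cache model, not a description of the device's actual lookup logic. The 32-byte line size is an assumption chosen for illustration and is not given in this report, and the function names are hypothetical.

```python
def set_index(address, cache_bytes, ways, line_bytes):
    """Return the cache set that a byte address maps to."""
    num_sets = cache_bytes // (ways * line_bytes)
    return (address // line_bytes) % num_sets


def fits_without_eviction(addresses, cache_bytes, ways, line_bytes):
    """True if every distinct line touched by the addresses can be cached at once."""
    lines_per_set = {}
    for line_start in {a - a % line_bytes for a in addresses}:
        index = set_index(line_start, cache_bytes, ways, line_bytes)
        lines_per_set[index] = lines_per_set.get(index, 0) + 1
    # A set can hold at most `ways` lines at the same time.
    return all(count <= ways for count in lines_per_set.values())


if __name__ == "__main__":
    CACHE_BYTES = 4096   # 4K-byte L1 cache, as stated in the text
    LINE_BYTES = 32      # assumption: illustrative line size
    pair = [0x0000, 0x1000]  # two addresses exactly 4K bytes apart

    # Direct-mapped: both addresses compete for the same single line.
    print(fits_without_eviction(pair, CACHE_BYTES, 1, LINE_BYTES))
    # 2-way: the same set has room for both lines.
    print(fits_without_eviction(pair, CACHE_BYTES, 2, LINE_BYTES))
```

In this model, two buffers that are a multiple of the cache size apart evict each other in a direct-mapped cache but can coexist in a 2-way cache, which is one reason associativity helps when several data streams are in use.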

3.2 Cache Hides Off-Chip Latency

The external memories that interface to the TMS320C6713 may operate at a maximum of
100 MHz, while the device operates at a maximum frequency of 225 MHz. All external memory
devices also have significant start-up latencies. For example, SDRAMs typically have a read
latency of 2-4 bus cycles. The lower frequency and additional latency of external memories
would normally degrade processor performance significantly. Retrieving data from on-chip L2
memory takes far less time than retrieving it from external memory, so the intermediate L2
cache hides this latency from the user. Using the fast L2 memory to cache the slower external
memories reduces the latency of external accesses by a factor of five.
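
To put the clock figures above in CPU terms, the short Python sketch below converts external bus cycles into CPU cycles at the maximum frequencies quoted in this section. It covers only the SDRAM read latency itself, not the full cost of an external access, and the function name is illustrative.

```python
CPU_MHZ = 225.0   # maximum device frequency
EMIF_MHZ = 100.0  # maximum external memory frequency


def bus_cycles_to_cpu_cycles(bus_cycles, cpu_mhz=CPU_MHZ, bus_mhz=EMIF_MHZ):
    """Express a latency given in external bus cycles as CPU cycles."""
    # Each bus cycle spans cpu_mhz / bus_mhz CPU cycles.
    return bus_cycles * cpu_mhz / bus_mhz


if __name__ == "__main__":
    for bus_cycles in (2, 4):
        cpu_cycles = bus_cycles_to_cpu_cycles(bus_cycles)
        print(f"{bus_cycles} bus cycles = {cpu_cycles} CPU cycles")
```

At these clock rates a 2-4 bus cycle SDRAM read latency corresponds to 4.5 to 9 CPU cycles during which the CPU could otherwise be executing instructions.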
