This month both Texas Instruments and Tilera announced new multicore chips. TI announced the TMS320C6472, which includes six ‘C64x+ processor cores running at 500-700 MHz (depending on the family member). Tilera announced a new chip family, the TILE-Gx, which will include variants with 16-100 cores running at 1.25-1.5 GHz. The ‘C6472 is available now, while Tilera does not expect to start sampling TILE-Gx chips until late 2010. According to Tilera, TILE-Gx chips will be fabbed in a 40 nm process. These announcements represent two of the common approaches to multicore today: putting a handful of processors that were originally designed for standalone use on a single die (TI) and creating a new architecture incorporating numerous cores (Tilera).
According to TI, the ‘C6472 is essentially the same chip as the TNETV3020, which has been marketed since 2007 for carrier infrastructure gateway applications. The ‘C6472 targets higher clock speeds and is being offered for general-purpose use. (Moving from a specific market to the general-purpose market is a common chip trajectory for TI; the company did the same thing with its triple-core ‘C6474 family.)
Because the TI chip is based on an existing architecture, the ‘C64x+, it is software backwards-compatible with previous ‘C64x+ chips. TI expects that customers who are currently using multiple single-core ‘C6415 chips will migrate to the ‘C6472 and realize cost and power savings by having all the cores on one chip. The 500 MHz ‘C6472 costs $140 in 1K quantities and consumes 3.7 watts, according to TI, compared to $75 (each) and 1.0 watt (each) for a 500 MHz single-core ‘C6415.
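As a rough back-of-the-envelope check on that migration argument, the sketch below uses only the figures quoted above. It assumes, purely for illustration, that one six-core ‘C6472 replaces six single-core ‘C6415 chips, and it ignores board-level costs and any per-core performance differences; the function and constant names are hypothetical.

```python
# Figures quoted by TI in the article (1K-quantity prices, 500 MHz parts).
C6472_PRICE_USD = 140.0
C6472_POWER_W = 3.7
C6472_CORES = 6
C6415_PRICE_USD = 75.0   # per single-core chip
C6415_POWER_W = 1.0      # per single-core chip


def compare(num_single_core_chips=C6472_CORES):
    """Compare N single-core 'C6415 chips against one 'C6472.

    Assumption (not from TI): one 'C6472 replaces exactly N 'C6415 chips.
    """
    total_price = num_single_core_chips * C6415_PRICE_USD
    total_power = num_single_core_chips * C6415_POWER_W
    return {
        "single_core_total_price_usd": total_price,
        "single_core_total_power_w": total_power,
        "price_saving_usd": total_price - C6472_PRICE_USD,
        "power_saving_w": total_power - C6472_POWER_W,
    }


if __name__ == "__main__":
    for name, value in compare().items():
        print(f"{name}: {value:.2f}")
```

Under that assumption, six ‘C6415 chips come to $450 and 6.0 watts, so the single ‘C6472 would save roughly $310 and 2.3 watts.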
With the ‘C6472, TI is providing its customers with a straightforward upgrade path. The software development paradigm for single-core DSPs (like the ‘C64x+) is well understood and TI’s Code Composer Studio tool suite is very familiar to many DSP programmers. TI has added some tool support for multicore software development; for example, TI provides a Parallel Debug Manager that enables users to view registers and memory on all cores simultaneously. In addition, the user can set global breakpoints that halt execution on all cores when a breakpoint is hit on a single core. More generally, TI has an outstanding track record of nurturing an extensive application development ecosystem around its chips. In short, TI has the incumbent’s advantage, though its cores were not designed to be sprayed across a chip in large quantities.
Tilera’s approach to multicore architectures is quite different. Several years ago the company developed a mesh architecture that supports a large array of homogeneous processor cores. The company’s initial products were the TILE64 and TILEPro, based on an array of 32-bit, 3-way VLIW cores. Tilera designed the architecture for multicore from the outset and optimized the bus structure for supporting many cores, which requires high bandwidth onto the chip and efficient inter-core communications. One advantage of this approach is that it is highly scalable; the size of the array can vary from chip to chip while maintaining software compatibility. In the case of the new TILE-Gx family, customers will be able to choose 16, 36, 64, or 100 cores, covering a large range of performance points.
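To give a feel for how a mesh scales across those core counts, here is a minimal sketch. It is not based on Tilera documentation: it assumes each variant is laid out as a square two-dimensional grid (all four core counts are perfect squares), that cores are numbered row by row, and that a message travels one hop at a time along rows and columns. All function names are hypothetical.

```python
import math


def mesh_side(num_cores):
    """Side length of a square mesh (assumed layout, not confirmed by Tilera)."""
    side = math.isqrt(num_cores)
    if side * side != num_cores:
        raise ValueError(f"{num_cores} cores do not form a square mesh")
    return side


def hops(src, dst, side):
    """Hop count between two cores numbered row by row, moving along rows and columns."""
    src_row, src_col = divmod(src, side)
    dst_row, dst_col = divmod(dst, side)
    return abs(src_row - dst_row) + abs(src_col - dst_col)


def max_hops(num_cores):
    """Worst case: from one corner of the mesh to the opposite corner."""
    side = mesh_side(num_cores)
    return hops(0, num_cores - 1, side)


if __name__ == "__main__":
    for cores in (16, 36, 64, 100):
        side = mesh_side(cores)
        print(f"{cores} cores: {side}x{side} grid, worst case {max_hops(cores)} hops")
```

In this simplified model the worst-case distance grows with the side of the grid (6 hops for 16 cores, 18 hops for 100 cores) rather than with the total core count, which is one reason mesh interconnects are often chosen for large arrays.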
The TILE-Gx is based on Tilera’s earlier TILE architectures, but is significantly enhanced. The TILE-Gx uses a new 64-bit instruction set architecture that includes 75 new instructions relative to the TILE processors (20 of which are SIMD instructions). Tilera has added instruction-set support for bit manipulation and quad multiply-accumulate operations—useful for a variety of DSP-oriented algorithms. The new chips also have a packet processing accelerator, the “mPIPE,” that isn’t included in the TILE families, and accelerators for compression and encryption.
Tilera’s approach to multicore offers much higher performance than TI’s, but like all massively parallel chips, it requires a somewhat different development mindset relative to traditional DSPs. According to Tilera, its processors are intended to be programmed in C/C++, which helps, but it can be difficult to determine how to effectively partition (and debug) an application on a huge array of processing engines. Tilera’s Multicore Development Environment supports full-chip simulation and debugging, and Tilera says that it has partnered with tools vendors CriticalBlue and Nema Labs to help address the partitioning challenge. These companies provide tools that help identify opportunities for parallelization and implement them on the Tilera platform.
For Tilera, cost-performance may also prove to be a hurdle. In 2008 BDTI benchmarked Tilera’s TILE64 processor on the BDTI Communications Benchmark (OFDM). The 64-core TILE64 ran at 866 MHz and, at the time, cost $889 in 1K quantities. Its benchmark results were compared with those of competing chips from picoChip, TI, Freescale, and Xilinx, among others. The results indicated that the TILE64 had very strong performance (much better than the DSP processors, not as good as the FPGA, and comparable to picoChip’s massively parallel PC102), but it had the worst cost-performance of the chips BDTI had evaluated on this benchmark. It’s difficult to estimate how the new chip family will compare to the TILE64 in terms of performance and cost-performance, since it is a significantly different architecture. In addition, TILE-Gx core speeds are higher, and chip pricing is also somewhat different: Tilera says that prices will range from “under $400 for the TILE-Gx36 to less than $1,000 for the TILE-Gx100” in low volumes.
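One narrow way to read the prices quoted above is price per core, sketched below. The TILE-Gx figures are the ceilings Tilera gave ("under $400", "less than $1,000"), so the results are upper bounds rather than actual prices, and the TILE64 figure is a 1K-quantity price while the TILE-Gx figures are low-volume prices. Because the cores and clock speeds differ between families, this says nothing about cost-performance on a benchmark; the names in the code are hypothetical.

```python
# (cores, price in USD) taken from the article. The TILE-Gx prices are
# quoted ceilings, so the per-core values for them are upper bounds only.
CHIPS = {
    "TILE64 (2008, 1K qty)": (64, 889.0),
    "TILE-Gx36 (ceiling)": (36, 400.0),
    "TILE-Gx100 (ceiling)": (100, 1000.0),
}


def price_per_core(cores, price_usd):
    """Chip price divided evenly across its cores."""
    return price_usd / cores


if __name__ == "__main__":
    for name, (cores, price) in CHIPS.items():
        print(f"{name}: ${price_per_core(cores, price):.2f} per core")
```

On these numbers the TILE64 works out to about $13.89 per core, against ceilings of about $11.11 and $10.00 per core for the TILE-Gx36 and TILE-Gx100.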
TI’s ‘C6472 and Tilera’s TILE-Gx chips are likely to compete head-to-head in some applications, though it’s important to note that by the time the Tilera chips are available TI may well have higher-performance offerings. TI is targeting high-performance, multi-channel applications such as communications infrastructure, high-speed computing, video, imaging, and military applications. Tilera targets many of these same applications, and is also emphasizing cloud computing.