When people talk about massively parallel, multicore chips, they’re usually talking about chips for high-performance line-powered applications, like WiMAX base stations or desktop video processing. But 3DLabs is headed in a different direction. The fabless chip company offers a massively parallel media processor, the DMS-02, which the company says is a perfect fit for portable multimedia devices with demanding video and audio processing requirements—such as high-end cellular handsets and portable media players. According to 3DLabs, the chip is in full production and costs $40 in small (1K) quantities. The company is currently shipping chips to initial customers, including a video surveillance equipment vendor, Grandeye.
The DMS-02 incorporates two ARM9 processors that are connected (through a cache) to a SIMD array of 24 32-bit floating-point processing elements. The ARM9s run at 200 MHz, while the floating-point array runs at 100 MHz. The SIMD elements are grouped into three 8-element “clusters,” each of which can perform a different operation; this is somewhat more flexible than the more common approach of requiring all SIMD elements to do the same thing. 3DLabs expects most of its customers to access the array via its library of multimedia-oriented software functions, which are called from C programs running on the ARM cores. The idea is that programmers can harness the processing power of the array without having to deal with its complexity. (3DLabs isn’t the only chip vendor that thinks this is a good idea; Texas Instruments, for example, has taken a similar route with its video-oriented DaVinci products.) 3DLabs also provides reference designs and a development tool suite; customers who want to implement proprietary algorithms on the SIMD array can use these tools to do so. As is the case with all massively parallel processors, however, programming the DMS-02 is likely to be fairly challenging.
Figure 1. 3DLabs DMS-02 chip.
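To make the clustered arrangement concrete, here is a minimal sketch of the idea in Python: 24 lanes split into three groups of eight, with every lane in a group applying the same operation while different groups may apply different ones. This is only a toy model of the concept described above. The function name `clustered_simd_step` and its interface are hypothetical and are not part of 3DLabs' library or tools.

```python
from operator import add, mul, sub
from typing import Callable, List, Sequence

CLUSTERS = 3            # from the article: three clusters
LANES_PER_CLUSTER = 8   # from the article: eight elements per cluster

Op = Callable[[float, float], float]


def clustered_simd_step(a: Sequence[float], b: Sequence[float],
                        ops: Sequence[Op]) -> List[float]:
    """Apply ops[c] to every lane of cluster c (hypothetical toy model)."""
    total_lanes = CLUSTERS * LANES_PER_CLUSTER
    if len(a) != total_lanes or len(b) != total_lanes:
        raise ValueError(f"expected {total_lanes} values per operand")
    if len(ops) != CLUSTERS:
        raise ValueError(f"expected one operation per cluster ({CLUSTERS})")

    out: List[float] = []
    for cluster, op in enumerate(ops):
        start = cluster * LANES_PER_CLUSTER
        # All eight lanes in this cluster execute the same operation.
        for lane in range(start, start + LANES_PER_CLUSTER):
            out.append(op(a[lane], b[lane]))
    return out


a = [float(i) for i in range(24)]
b = [2.0] * 24
# Cluster 0 adds, cluster 1 subtracts, cluster 2 multiplies.
result = clustered_simd_step(a, b, [add, sub, mul])
```

In a conventional single-instruction SIMD array, the three entries in `ops` would have to be identical; allowing them to differ is the extra flexibility the clustering provides.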
Off-the-shelf multimedia functions include (among others) H.264 encode/decode, MPEG-4 decode, WMV9 encode/decode, MP3 encode/decode, and AAC encode/decode; the video codecs run on the SIMD array, while the audio codecs run on an ARM processor. According to 3DLabs, the chip can perform H.264 D1 encode at 30 frames per second. Without knowing the exact conditions under which the data was generated (e.g., the input video content), it’s difficult to assess how the DMS-02’s video performance compares to that of other chips. (And as we’ve written about before, this is a problem with published video performance data in general.)
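For a rough sense of scale, the figures quoted above can be turned into a per-pixel processing budget. The sketch below assumes D1 means 720×480 (the NTSC variant; the article does not specify) and counts one "element-cycle" as one SIMD element being available for one clock cycle. It says nothing about how many useful operations an element completes per cycle, so it is a back-of-the-envelope figure, not a performance claim.

```python
# Figures taken from the article.
ELEMENTS = 24                  # SIMD processing elements
ARRAY_CLOCK_HZ = 100_000_000   # 100 MHz array clock
FPS = 30                       # quoted H.264 D1 encode frame rate

# Assumption: D1 is 720x480 (NTSC); the article does not state the variant.
WIDTH, HEIGHT = 720, 480


def element_cycles_per_pixel(elements: int, clock_hz: int,
                             width: int, height: int, fps: int) -> float:
    """Element-cycles available per pixel at the given resolution and rate."""
    pixels_per_second = width * height * fps
    return elements * clock_hz / pixels_per_second


budget = element_cycles_per_pixel(ELEMENTS, ARRAY_CLOCK_HZ, WIDTH, HEIGHT, FPS)
print(f"{budget:.1f} element-cycles per pixel")  # prints "231.5 element-cycles per pixel"
```

Under these assumptions the array has roughly 230 element-cycles to spend on each pixel of each frame, which is the envelope any encoder running on it would have to fit within.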
The big question, of course, is whether a multi-core, massively parallel, floating-point chip is really energy-efficient enough for portable applications. Indeed, given the power constraints of the chip’s target applications, it’s worth asking why its designers chose floating-point at all. The answer lies in the company’s background (and in its name). 3DLabs has a long history of developing products for 3D graphics processing on PCs, an application area that typically requires floating-point capabilities. 3DLabs wanted the DMS-02 to be able to handle 3D graphics, and thus gave the chip 32-bit floating-point capabilities, though it also supports 16-bit floating-point and integer data. In any case, the company says that a 130 nm DMS-02 can run H.264 D1 resolution decoding in under 500 mW, including power for memory and I/O.
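The quoted power figure can be restated as energy per decoded frame, which is often an easier number to compare across chips. The sketch below treats 500 mW as an upper bound and assumes a 30 frames-per-second decode rate; the article gives 30 fps only for encode, so the decode rate here is an assumption.

```python
POWER_W = 0.5      # upper bound quoted by 3DLabs (includes memory and I/O)
ASSUMED_FPS = 30   # assumption: the article does not state the decode frame rate


def energy_per_frame_mj(power_w: float, fps: float) -> float:
    """Energy per frame in millijoules at a constant power draw."""
    return power_w / fps * 1000.0


mj = energy_per_frame_mj(POWER_W, ASSUMED_FPS)
print(f"at most about {mj:.1f} mJ per frame")  # prints "at most about 16.7 mJ per frame"
```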
3DLabs has gone through some changes in recent years. It is currently a wholly owned subsidiary of Creative Technology, having been acquired in 2002. In late 2006, 3DLabs and Creative announced that 3DLabs would be spun back out as its own company, but that hasn’t yet materialized. According to 3DLabs, the companies are waiting for a better economic climate in which to complete the deal. In the meantime, 3DLabs is pushing forward with the DMS-02, and says that it may offer scaled versions of the chip with differently sized SIMD arrays and/or different ARM cores. This could enable the company to tackle a broader range of applications while maintaining software compatibility, which is potentially a big advantage. It remains to be seen, though, whether a massively parallel chip can really find success in portable products.