Startup Connex Introduces Architecture for HD Video

Submitted by BDTI on Tue, 06/27/2006 - 19:00

In May, fabless semiconductor startup Connex announced its CA1024 processor, which is based on the ConnexArray vector architecture.  After receiving initial funding from In-Q-Tel, a not-for-profit venture group established by the U.S. Central Intelligence Agency, Connex first applied the ConnexArray architecture to database search applications.  Now Connex is setting its sights on high-definition television, hoping to provide a programmable, highly parallel architecture that can compete with ASICs and a growing number of programmable SoCs targeting digital video.

The CA1024 processor contains a linear array of 1,024 simple RISC-like processors, which Connex calls PEs (processing elements), that have been designed specifically with digital television in mind.  For example, to conserve silicon area and power, the PEs do not include a multiplier.  This is because many of the most computationally intensive functions in video codecs, such as deblocking and motion compensation, have been designed to avoid multiplications that can’t be implemented efficiently using shifts and adds.  Multiplies that can be implemented with a single shift and add can be done on a Connex PE in one cycle with a combined shift/add instruction.  Multiplies by non-constant values require 9 to 10 cycles.
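To make the shift-and-add idea concrete, here is a minimal sketch in Python. It is not Connex code: the function names, the 16-bit wraparound, and the bit-serial loop are assumptions chosen for illustration, and the loop is a generic textbook method that does not model how a PE reaches its 9 to 10 cycle figure.

```python
MASK16 = 0xFFFF  # assumption: results wrap to 16 bits, matching the ALU width


def shift_add(x, shift):
    """Return (x << shift) + x, i.e. x * (2**shift + 1), wrapped to 16 bits.

    This is the kind of constant multiply that a single combined
    shift/add operation can cover (x*3, x*5, x*9, ...).
    """
    return ((x << shift) + x) & MASK16


def multiply_shift_add(x, y):
    """Generic bit-serial multiply of two unsigned 16-bit values.

    For every set bit in y, add the correspondingly shifted copy of x.
    Hypothetical illustration only, not the Connex instruction sequence.
    """
    acc = 0
    x &= MASK16
    y &= MASK16
    while y:
        if y & 1:
            acc = (acc + x) & MASK16
        x = (x << 1) & MASK16
        y >>= 1
    return acc


times_three = shift_add(7, 1)   # 7 * 3
times_five = shift_add(7, 2)    # 7 * 5
product = multiply_shift_add(123, 45)
```

Constants of the form 2^n + 1 need only one shift and one add, which is why they fit the single-cycle case described above; an arbitrary multiplier needs one conditional add per bit in this sketch.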

Figure 1. The Connex Array of processing elements, each comprising a 16-bit ALU, conditional execution flags, eight registers (R0-R7), and a 16x256 local RAM.

Each PE consists of a 16-bit ALU, eight accumulator registers, a 16x256 local RAM, and a bi-directional 16-bit inter-PE bus allowing data transfer only between adjacent PEs.  In addition, there are two special-purpose registers that are used to control conditional execution of instructions sent to the PEs.  In a given clock cycle, all PEs execute a common instruction, or perform a NOP based on the conditional execution flags.  In selecting this vector architecture, Connex’s design goal was to provide a conceptually simple platform that can efficiently handle algorithms that apply the same operations to large amounts of data.  Tasks that are not efficiently parallelizable are not implemented on the array.  For instance, the variable-length encoding found in video compression algorithms is performed on a RISC co-processor with a stream accelerator, which is integrated into the CA1024.
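The execution model described above (one common instruction per cycle, per-PE conditional flags, and transfers only between adjacent PEs) can be sketched as a toy simulation in Python. Everything here is hypothetical: the function names, the single boolean flag per PE, and the zero fill at the end of the array are assumptions for illustration, not the Connex instruction set or the Connex Programming Language.

```python
NUM_PES = 1024   # array size stated in the article
MASK16 = 0xFFFF  # assumption: values wrap to the 16-bit ALU width


def broadcast_add(regs, active, value):
    """Issue one 'add value' instruction to every PE.

    PEs whose flag is True execute the add; the others perform a NOP
    and keep their register unchanged.
    """
    return [(r + value) & MASK16 if a else r for r, a in zip(regs, active)]


def shift_right(regs, fill=0):
    """Each PE hands its value to its right-hand neighbor.

    What PE 0 receives is not stated in the article; using `fill`
    is an assumption.
    """
    return [fill] + regs[:-1]


# One register per PE, initialized with the PE index.
regs = list(range(NUM_PES))
# Assumed selection rule for the demo: only even-numbered PEs are active.
active = [i % 2 == 0 for i in range(NUM_PES)]

after_add = broadcast_add(regs, active, 100)
after_shift = shift_right(after_add)
```

In this sketch the work per step is independent of how many PEs are active, which mirrors why such an array suits algorithms that apply the same operation across large amounts of data and fits data-dependent, serial tasks poorly.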

One of the key advantages the CA1024 has over fixed-function ASICs is that it is programmable.  This allows users to adapt to shifting standards as well as implement unique features to differentiate their products, such as post-processing tailored to the physics of different displays.  Connex expects to offer H.264, VC-1, and MPEG-2 HD decoders as well as encoders and transcoders for H.264 and MPEG-2 Main Profile HD, a range of audio codecs, and post-processing algorithms including scaling, deinterlacing, and various image management algorithms.  Connex will provide software modules with libraries and an SDK, and customers will incorporate their own unique IP with support from Connex.  The CA1024 will be programmed using the Connex Programming Language, an extension of C with Connex-specific operators for features such as vector operations and selection of PEs.  For relatively simple algorithms this programming model may be easy to use, but for more complex algorithms such as those found in video codecs, it could prove difficult.

Traditionally, fixed-function ASICs have been the only devices offering the performance demanded by HD video.  The key problems with ASICs are that they’re expensive and time-consuming to develop and, most importantly, cannot be reprogrammed to adapt to changing standards and application requirements.  It is easy to see why video system designers might be interested in a programmable solution such as the CA1024.  Connex, however, faces a growing number of competitors targeting HD video.  For instance, the DM64x “DaVinci” part from Texas Instruments can decode up to 1080i Main Profile MPEG-2 or 720p MPEG-4.  In this architecture, an ARM926 processor is coupled with a C64x+ DSP and a host of video-related accelerators and peripherals.  Another heterogeneous multimedia processor capable of HD decoding is the Nexperia PNX8850 from Philips, featuring a MIPS32 CPU and dual 240 MHz TriMedia processors.  The PNX8850 is capable of decoding one MPEG-2 HD MP stream or two standard-definition MP MPEG-2 streams.  According to Connex, the CA1024 will outperform these chips, decoding two channels of HD H.264 or encoding one channel of HD H.264.
