Synopsys' EM5D and EM7D Processor Cores: The ARC Architecture Gains DSP Capabilities

Submitted by BDTI on Mon, 07/21/2014 - 22:00

Although the ARC brand has kept a relatively low profile since being acquired by Synopsys in 2009, Synopsys reports that the ARC family of licensable cores is on track to ship more than 1.5 billion units this year. Until recently, ARC's offerings were "vanilla" Harvard architecture CPUs with no DSP-optimized features. That's all changed with the latest EM5D and EM7D (the "D" standing for "DSP"), the first two members of the EM DSP family, which were introduced in late May and are generally available for licensing and implementation beginning this month.

ARC's DSP embrace is the result of two primary factors: evolving market requirements, and Synopsys' deep R&D pockets. As the April 2014 edition of InsideDSP noted, the ARC processor core came into the Synopsys fold via a multi-step process; ARC International was first acquired by Virage Logic in 2008, with Synopsys subsequently purchasing Virage Logic one year later. And, as that same InsideDSP article also noted, ARC isn't the only processor suite in Synopsys' product line; fully custom processor offerings are also available via Processor Designer (obtained via the 2012 acquisition of CoWare) and through the custom processor tool flow recently acquired in Synopsys' purchase of Target Compiler Technologies.

The April 2014 InsideDSP article referred to ARC's cores as "fully formed offerings." That statement is accurate in a comparative sense, relative to custom processor design tools. In an absolute sense, however, ARC's architecture is quite flexible, capable of being user-tailored for particular needs. Elements of the instruction set, register set, functional units and buses can all be customized or (if not needed) eliminated via the supplied development tool suite.

Speaking of architecture, the EM and "big brother" HS families are both based on the "v2" ARC instruction set, in contrast to the earlier generations built on the v1 instruction set. These earlier (but still sold and supported) ARCv1 ISA products include the 600 and 700 series families, along with the audio processing-focused AS200 product line (Figure 1).


Figure 1. Synopsys' various ARC offerings use a mix of v1 and v2 instruction sets, with the newly introduced v2 EM DSP slotting in between the v2 EM and v1 600 families.

The EM5D and EM7D follow in the footsteps of the EM4 and EM6, three-stage pipelined CPUs intended for deeply embedded fixed-function applications. The fundamental difference between the EM4 and EM6, and between the EM5D and EM7D, involves integrated memory. The EM4 and EM5D both offer instruction and data CCMs (closely coupled memories with single-cycle access), up to 1 MByte in size each with the EM5D. The EM6 and EM7D supplement the CCMs with instruction and data cache support, up to 32 KBytes in size for the EM7D.

What the EM5D and EM7D share, and what neither the EM4 nor the EM6 provides, is a separate, parallel processing pipeline that builds on the ARCv2 ISA with support for more than 100 DSP instructions. The DSP pipeline is fully clock-gated to minimize power consumption when not in use. And because it is separate from the conventional instruction set pipeline, its inclusion doesn't result in clock speed degradation versus the EM4 or EM6. Hardware elements encompassed in the DSP pipeline include a unified single-cycle 32x32 multiplier/multiply-accumulator, square root, divide and FFT butterfly acceleration units, and a 64+8-bit accumulator alternatively configurable as two 32+8-bit accumulators (Figure 2).


Figure 2. The DSP facilities operate in parallel with the conventional three-stage processor pipeline, and are further supplemented by an optional floating-point unit
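
To make the role of the accumulator's 8 guard bits concrete, here is a minimal Python model of a multiply-accumulate loop using one 32+8-bit accumulator. It is a sketch under stated assumptions, not Synopsys' implementation: the article does not describe the ARCv2DSP instruction semantics, so the 16-bit operand width, the saturating behavior, and every name used here (mac_q15, dot_q15, to_sat32) are hypothetical and chosen only for illustration.

```python
# Hypothetical model of a 32+8-bit accumulator (40 significant bits).
# The layout and saturation rules are assumptions, not taken from the article.
ACC_BITS = 40
ACC_MAX = (1 << (ACC_BITS - 1)) - 1   # 2**39 - 1
ACC_MIN = -(1 << (ACC_BITS - 1))      # -2**39
INT32_MAX = (1 << 31) - 1
INT32_MIN = -(1 << 31)


def saturate(value, lo, hi):
    """Clamp value to the inclusive range [lo, hi]."""
    return max(lo, min(hi, value))


def mac_q15(acc, a, b):
    """Add the product of two signed 16-bit samples to the 40-bit accumulator."""
    if not (-32768 <= a <= 32767 and -32768 <= b <= 32767):
        raise ValueError("operands must fit in a signed 16-bit word")
    return saturate(acc + a * b, ACC_MIN, ACC_MAX)


def dot_q15(xs, ys):
    """Dot product of two 16-bit sample sequences, accumulated in 40 bits."""
    acc = 0
    for a, b in zip(xs, ys):
        acc = mac_q15(acc, a, b)
    return acc


def to_sat32(acc):
    """Narrow the accumulator to a signed 32-bit result with saturation."""
    return saturate(acc, INT32_MIN, INT32_MAX)


if __name__ == "__main__":
    # Worst case: every product is (-32768) * (-32768) = 2**30.
    worst = [-32768] * 256
    acc = dot_q15(worst, worst)
    print(acc, to_sat32(acc))
```

In this model, 256 worst-case products of 2^30 each sum to 2^38, which still fits within the 40-bit range, whereas a plain 32-bit accumulator would overflow on the second term. Overflow handling is therefore deferred to a single saturation step when the result is narrowed to 32 bits.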

Foundation DSP capabilities of the EM5D and EM7D are fixed-point in nature (Table 1).


Table 1. Functions supported by ARC EM DSP processors' hardware

Also shown in the above block diagram (Figure 2), however, is the single- and double-precision floating-point "assist" engine, which builds on the ARCv2DSP ISA base and is an optional add-on for all EM-series cores. And for both fixed- and floating-point modes, SoC designers can supplement the foundation instruction set with user-defined acceleration resources via Synopsys' APEX (ARC Processor EXtensions) support.

Synopsys believes that EM DSP cores deliver a superior combination of performance, power consumption and silicon area relative to competitive IP offerings from companies like ARM, Imagination Technologies (MIPS), and Cadence (Tensilica). The following table summarizes Synopsys' claimed capabilities for an EM DSP core fabricated on a 40 nm LP (low power) process (Table 2).


Table 2. Synopsys’ reported performance, power and area data for the EM DSP core

How do these specs translate into real-life implementation results? As a case study, Synopsys partnered with software provider Sensory to implement Sensory's Low-Power Sound Detection algorithm, part of Sensory's v3 TrulyHandsfree Voice Control suite. The code ran on an EM5D core fabricated on TSMC's 28 nm HPM (high performance for mobile) process. Synopsys claims that the EM DSP core can implement the Sensory algorithm with less than 1/4 the power consumption of competing cores. More generally, Synopsys believes that its EM and EM DSP devices compare favorably against competing ARM, MIPS and Tensilica products (Figure 3).


Figure 3. Synopsys claims that its EM products are highly competitive in performance, power consumption, and size

Synopsys Product Marketing Manager Paul Garden believes that the total available market for SoCs servicing applications with low-energy control and signal processing needs (aside from those in PCs) will reach 16 billion units by 2018 (Figure 4).


Figure 4. EM DSP target markets have strong growth forecasts in the coming years

As the above graphic makes clear, Synopsys is putting faith into the rapid and sizeable maturation of IoT (Internet of Things) products. First-generation IoT offerings, along with wearable devices such as fitness monitors and smart watches, have entered the market with limited success. In response, manufacturers are rapidly evolving them, adding features and lengthening battery life, while simultaneously reducing prices. The resultant demands on building-block processor suppliers such as Synopsys are perhaps obvious (Figure 5).


Figure 5. Aggressive system performance, power consumption and price trends drove the EM DSP's development and will define its continued evolution

More generally, Garden notes that common functions employed by SoCs in the company's target markets consolidate into the following four categories:

  • Sensor processing
  • Voice/speech recognition
  • Baseband control for wireless devices, and
  • Audio (host CPU off-load)

Not surprisingly, Garden believes that EM DSP is the optimum processor architecture to tackle those functions. Is he right? Only time will tell. Synopsys has the "clean slate" benefit of not needing to conform to a legacy architecture heritage in crafting its cores' DSP facilities. That same "clean slate" status is also a hindrance, however, in that potential customers aren't able to immediately tap into an expansive and diverse software ecosystem. How quickly and extensively Synopsys matures not only its own building-block software offerings but those of its partners will go a long way toward defining the degree of EM DSP's success (Figure 6).


Figure 6. The MetaWare toolkit supports ARC cores with an extensive DSP software library and C/C++ compiler, among other facilities.
