This month Intrinsity and Samsung jointly announced a new, highly optimized implementation of the ARM Cortex-A8 CPU core, called “Hummingbird.” According to Samsung and Intrinsity, an initial Hummingbird sample has achieved 1 GHz in Samsung’s 45nm low-power process. The companies say that Hummingbird is both faster and lower power than other Cortex-A8 implementations, though as of this writing they have declined to provide power data. Samsung says that it is currently developing Hummingbird-based SoCs for mobile products, but has not yet announced any products.
Hummingbird represents an interesting business model. Samsung has a license for the Cortex-A8 from ARM, and a license for Hummingbird from Intrinsity. Intrinsity received a fee to develop Hummingbird, and both ARM and Intrinsity will collect royalties when the core is used in SoCs. Hummingbird belongs to Samsung; Intrinsity can’t license it, re-use it, or port it to another process. However, any company with a license for the Cortex-A8 can engage Intrinsity to develop a custom core using the same methodology. In theory, then, any other SoC vendor with a Cortex-A8 license can get an optimized core that performs as well as Hummingbird, provided the vendor has access to a comparable fabrication process.
The Hummingbird core is a continuation of an overall shift in Intrinsity’s business model. The company was founded in 1997 to develop a design approach and related tools for high-speed processor implementations, collectively referred to as “Fast14.” In 2002, Intrinsity used the Fast14 technology to develop and implement its own high-speed, massively parallel processor, FastMATH, which ran at up to 2.5 GHz. (FastMATH was based on a MIPS32 core mated to a vector math unit.) More recently, Intrinsity has become primarily an IP licensing company focused on using Fast14 to develop high-speed, optimized versions of other vendors’ embedded cores, which it then offers for license. In addition to the optimized Cortex-A8, Intrinsity has previously developed a PowerPC core for AMCC, and a high-speed (600 MHz) version of the Cortex-R4. Intrinsity uses dynamic domino logic and other techniques (such as custom-designed memory blocks) in its hard cores to achieve up to a 1.5X speed-up compared to standard synthesized implementations.
Intrinsity is not the only company to invest in developing faster variants of the ARMv7-based Cortex-A8 core. Qualcomm’s 1 GHz Scorpion core is an optimized implementation of the ARMv7 architecture, and TI has optimized the Cortex-A8 core for use in its own chips. The Scorpion core, however, includes changes to the ISA and microarchitecture and as such is not cycle-for-cycle compatible with the Cortex-A8 core. The handcrafted Cortex-A8 used in TI’s OMAP35x chips currently in production has a top speed of 600 MHz. This is lower than the Hummingbird sample speed of 1 GHz, but that comparison comes with a number of important caveats. First, Hummingbird is implemented in 45nm and the TI core is currently implemented in 65nm, so it’s not quite an apples-to-apples comparison. (ARM quotes a top speed of 1.1 GHz for the Cortex-A8 in a 65nm GP process.) Furthermore, it’s not yet clear whether Samsung will actually ship 1 GHz chips in production volumes or if the 1 GHz speed is just the speed of the initial demo chip. BDTI Benchmark results for the Cortex-A8 are available on BDTI’s website, at /Resources/BenchmarkResults/BDTIMark2000.
Intrinsity’s approach of making highly optimized cores available to anyone who wants to license them (assuming they already have a license for the original core) may level the playing field among SoC developers. It won’t be only the big vendors like TI and Qualcomm who can reap the benefits of a faster, more power-efficient core implementation. If so, then SoC vendors may need to find other ways in which to differentiate their products.