Microsoft's Cognitive Toolkit (formerly CNTK) is a relatively recent entrant to the open-source deep learning framework market. And, as the company's Principal Researcher Cha Zhang acknowledged in a recent briefing, it has some way to go before it can match the developer base enjoyed by well-known alternatives such as Caffe and Google's TensorFlow. Last year's transition from Microsoft's own CodePlex open source hosting site to the more widely known GitHub repository, along with
Read more...
At last year's Embedded Vision Summit, Cadence unveiled the Tensilica Vision P6 DSP, which augmented the imaging and vision processing capabilities of its predecessors with the ability to efficiently execute deep neural network (DNN) inference functions. Cadence returned to the Summit this year with a new IP offering, the Vision C5 DSP core, focused exclusively on deep neural networks. Vision C5 is intended for use alongside another core, such as the Vision P6, which will handle image signal
Read more...
Jetson TX2 is NVIDIA's latest board-level product targeted at computer vision, deep learning, and other embedded AI tasks, particularly focused on "at the edge" inference (when a neural network analyzes new data it’s presented with, based on its previous training) (Figure 1). It acts as an upgrade to both the Tegra K1 SoC-based Jetson TK1, covered in InsideDSP in the spring of 2014, and the successor Tegra X1-based Jetson TX1, which BDTI evaluated for deep learning and other computer vision
Read more...
HPC (high-performance computing) servers, which have notably embraced the GPGPU (general-purpose computing on graphics processing units) concept in recent years, are increasingly being employed for computer vision and other deep learning-based applications. Beginning in late 2014, NVIDIA supplemented its general-purpose CUDA toolset for GPU-accelerated heterogeneous computing with its proprietary cuDNN software library, which codifies the basic mathematical and data operations at the core of
Read more...
NVIDIA was an early and aggressive advocate of leveraging graphics processors for other massively parallel processing tasks (often referred to as general-purpose computing on graphics processing units, or GPGPU). The company's CUDA software toolset for GPU computing has to date secured only modest success in mobile and desktop PCs, for example in game physics acceleration and in still and video image processing acceleration. However, GPGPU has been embraced in the HPC (high-
Read more...
Look back over the history of processors, and you'll see many examples of tasks initially restricted to running on high-end processors that, once they became popular and standardized, eventually attracted specialized co-processor or processor support (Figure 1). Consider, for example, video encoding and decoding, nowadays efficiently handled by a multimedia co-processor core sitting alongside the main processor in a SoC. Or consider graphics processing; initially, only BitBlt and other bitmap-
Read more...
Hard on the heels of the public release of CEVA's second-generation convolutional neural network toolset, CDNN2, the company is putting the final touches on its fifth-generation processor core, the CEVA-XM6, designed to run software generated by that toolset. Liran Bar, the company's Director of Product Marketing, acknowledged in a recent briefing that the new core represents an evolutionary step, rather than a revolutionary break, from its predecessors: the CEVA-MM3101 (introduced in 2012) and the
Read more...
Last year, when CEVA introduced the initial iteration of its CDNN (CEVA Deep Neural Network) toolset, company officials expressed an aspiration for CDNN to eventually support multiple popular deep learning frameworks. At the time, however, CDNN launched with support only for the well-known Caffe framework, and only for a subset of possible layers and topologies based on it. The recently released second-generation CDNN2 makes notable advancements in all of these areas, including both more fully
Read more...
Modern SoCs increasingly contain a variety of processing resources: one or more CPU cores and a GPU, often with a DSP, programmable logic, or one or more special-purpose co-processors for tasks such as computer vision. Properly harnessed, such heterogeneous processors often deliver impressive performance at low cost and low power consumption. But mapping applications onto heterogeneous processors is challenging. OpenCL, a standard language and runtime specification from the Khronos Group,
Read more...
In late January of this year, Movidius and Google broadened their collaboration plans, which had begun with 2014's Project Tango prototype depth-sensing smartphone. As initially announced, the companies’ broader intention to "accelerate the adoption of deep learning within mobile devices" was somewhat vague. However, as of earlier this month, at least some of the details of the planned collaboration became clearer, thanks to the unveiling of Movidius' Fathom Software Framework and Neural
Read more...