Algorithms are the essence of embedded applications; they are the mathematical processes that transform data in useful ways. They’re also often computationally demanding. When designing a new product, companies often need to assess whether an algorithm will fit within their cost and power consumption targets. Sometimes, an algorithm won’t fit in its initial form.
Most algorithms can be formulated in many different ways, and a given formulation may be efficient on one processor and inefficient on another. In some cases, small changes to the algorithm can make a large difference in the processing load without significantly changing the output. The key to successful algorithm engineering is a clear understanding of both the algorithm and the target processor architecture.
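As a small illustration of how a reformulation can change the processing load, the sketch below computes the same sliding-window sum in two ways. It is not drawn from any BDTI project; the function names, the window length of 64, and the synthetic input are assumptions chosen only for the example.

```python
def moving_sum_direct(samples, window):
    """Direct form: re-add every sample in the window for each output."""
    outputs, operations = [], 0
    for end in range(window, len(samples) + 1):
        total = 0
        for value in samples[end - window:end]:
            total += value
            operations += 1
        outputs.append(total)
    return outputs, operations


def moving_sum_running(samples, window):
    """Running form: update the previous sum with one add and one subtract."""
    if len(samples) < window:
        return [], 0
    total, operations = 0, 0
    for value in samples[:window]:
        total += value
        operations += 1
    outputs = [total]
    for index in range(window, len(samples)):
        # Add the sample entering the window, subtract the one leaving it.
        total += samples[index] - samples[index - window]
        operations += 2
        outputs.append(total)
    return outputs, operations


# Hypothetical input: 1000 synthetic integer samples, window of 64.
samples = [(7 * i) % 13 for i in range(1000)]
direct_out, direct_ops = moving_sum_direct(samples, 64)
running_out, running_ops = moving_sum_running(samples, 64)
print(direct_out == running_out, direct_ops, running_ops)
```

With integer data the two forms produce identical outputs, while the running form needs far fewer additions. With floating-point data the running form can accumulate small rounding differences, which is the kind of trade-off between processing load and output fidelity described above.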
Over the course of numerous algorithm engineering projects, BDTI has developed an effective methodology for fitting complex algorithms on processors. One such project was for a company considering the acquisition of an algorithm development firm. The objective was to evaluate the processing load of a key algorithm on two target processors and to identify any impediments to efficient implementation. First, BDTI carefully studied the instruction set, microarchitecture, and memory organization of each target processor, identifying the key attributes that would affect efficient algorithm implementation. On one processor, the cache-based memory architecture was likely to introduce memory access delays. In addition, one processor could perform data loads in parallel with arithmetic operations, while the other could not. With these factors in mind, BDTI determined the best mapping of the algorithm onto each processor. Then, BDTI created initial optimized implementations of the critical algorithm sections. Based on these optimized code fragments, BDTI was able to predict the algorithm's processor cycle and memory requirements for various data set sizes.
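The kind of extrapolation described above, from measured code fragments to whole-algorithm requirements, can be sketched with a simple cost model. Everything here is an assumption made for illustration: the `ProcessorModel` fields, the cold-cache streaming-access model, and all numeric values are hypothetical placeholders, not figures from the project or from any real processor.

```python
from dataclasses import dataclass


@dataclass
class ProcessorModel:
    name: str
    cycles_per_element: float  # taken from an optimized inner-loop fragment
    setup_cycles: int          # fixed overhead per call
    cache_line_bytes: int      # 0 means no data cache is modeled
    miss_penalty_cycles: int   # stall cycles per cache-line miss


def estimate_cycles(model, num_elements, bytes_per_element):
    """Estimate cycles for one pass over num_elements input elements."""
    compute = model.setup_cycles + model.cycles_per_element * num_elements
    if model.cache_line_bytes == 0:
        return compute
    # Assumption: streaming access with a cold cache, one miss per line.
    total_bytes = num_elements * bytes_per_element
    lines = -(-total_bytes // model.cache_line_bytes)  # ceiling division
    return compute + lines * model.miss_penalty_cycles


def estimate_data_bytes(num_elements, bytes_per_element, num_buffers):
    """Estimate data memory, assuming equally sized working buffers."""
    return num_elements * bytes_per_element * num_buffers


# Hypothetical processors: A overlaps loads with arithmetic but has a cache;
# B needs separate load cycles but uses uncached on-chip memory.
processor_a = ProcessorModel("A", 1.0, 200, 32, 20)
processor_b = ProcessorModel("B", 2.0, 150, 0, 0)

for size in (256, 1024, 4096):
    print(size,
          estimate_cycles(processor_a, size, 2),
          estimate_cycles(processor_b, size, 2),
          estimate_data_bytes(size, 2, 3))
```

A model this simple ignores effects such as cache reuse and pipeline stalls, so in practice its parameters would need to be calibrated against measurements of the optimized fragments on each target.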
In another project, BDTI was engaged to optimize Google's sophisticated Tango 3D computer vision algorithms to run efficiently and in real time on the Lenovo Phab 2 Pro smartphone, the world's first smartphone to include this technology. The challenge was to reduce the computational demands of the algorithms so that they could run on the Qualcomm Snapdragon 652 without degrading the quality of the results. BDTI studied the algorithms carefully, architected an implementation approach based on image tiling and multi-threading, and then recoded the algorithms to implement this architecture. Next, BDTI optimized the code for the multiple processing engines in the Snapdragon 652, using hand-coded assembly, compiler intrinsics, refactoring, and other techniques. Thanks to the cumulative gains from all of these techniques, the optimized implementation achieved both real-time performance and low power consumption.
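The general idea of image tiling combined with multi-threading can be sketched as follows. This is a generic illustration, not the Tango implementation: the per-pixel threshold operation, the tile size, the worker count, and all function names are assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def threshold_pixel(value, level=128):
    """Stand-in per-pixel operation (hypothetical)."""
    return 255 if value >= level else 0


def make_tiles(height, width, tile_size):
    """Return (row_start, row_end, col_start, col_end) for each tile."""
    tiles = []
    for r in range(0, height, tile_size):
        for c in range(0, width, tile_size):
            tiles.append((r, min(r + tile_size, height),
                          c, min(c + tile_size, width)))
    return tiles


def process_tile(image, output, tile):
    """Apply the per-pixel operation to one tile, writing into output."""
    r0, r1, c0, c1 = tile
    for r in range(r0, r1):
        for c in range(c0, c1):
            output[r][c] = threshold_pixel(image[r][c])


def process_tiled(image, tile_size=16, workers=4):
    height, width = len(image), len(image[0])
    output = [[0] * width for _ in range(height)]
    tiles = make_tiles(height, width, tile_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Tiles do not overlap, so no two workers write the same pixel.
        list(pool.map(lambda t: process_tile(image, output, t), tiles))
    return output


# Synthetic 40 x 50 test image; edge tiles are smaller than tile_size.
image = [[(r * 31 + c * 17) % 256 for c in range(50)] for r in range(40)]
reference = [[threshold_pixel(v) for v in row] for row in image]
tiled = process_tiled(image)
print(tiled == reference)
```

The sketch shows only the structure. In standard CPython, threads do not speed up CPU-bound loops like this one, whereas on an embedded target each tile would typically be dispatched to a separate core or processing engine and sized to fit in fast local memory. Operations that read neighboring pixels would also need tiles that overlap at their borders.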
BDTI's knowledge of embedded processor architectures, skill in architecting efficient software, and thorough understanding of algorithms were the key to success on these and other projects. To discuss how BDTI's algorithm engineering services can help you create winning products, contact Jeremy Giddings via the web or by phone at +1 925 954 1411.