BDTI recently completed an in-depth analysis of AutoESL’s AutoPilot high-level synthesis tool via the BDTI High-Level Synthesis Tool Certification Program™. BDTI evaluated the process of implementing applications on a Xilinx FPGA using AutoPilot, comparing it with traditional FPGA design based on hand-written RTL code and with DSP processor software development. Overall, AutoPilot demonstrated a strong ability to generate high-quality RTL code, with resource utilization equivalent to that of hand-written RTL code.
AutoPilot accepts C, C++ and SystemC functional descriptions as its input. The design flow using AutoPilot is shown in Figure 1. The BDTI Certification Program uses two C language example applications: a video motion analysis application and a wireless receiver. BDTI’s hands-on evaluation primarily relied on the video application. After receiving training from AutoESL, installing AutoPilot, and becoming familiar with the tool, a BDTI engineer implemented the video application on a Xilinx Spartan-3A DSP FPGA using AutoPilot in conjunction with the Xilinx RTL tools. As the first step in this process, the engineer assessed the extent to which the initial (purely algorithmic) C code had to be modified in order to be accepted by AutoPilot. Surprisingly, the initial C code did not require any modification to produce initial RTL code. Obtaining an efficient FPGA implementation of the application, however, did require significant modification of the original C code. This modification focused on restructuring the purely algorithmic C code to make it better suited to a hardware implementation and to reduce hardware resource utilization, rather than on accommodating restrictions imposed by AutoPilot.
FIGURE 1. Design Flow Using the AutoPilot High-Level Synthesis Tool with Xilinx “ISE” RTL Tools (Figure based on a diagram provided by AutoESL.)
Memory access was the most important resource constraint encountered during optimization of the video workload, a characteristic typical of video applications. For example, as initially designed, a video application may pass entire video frames of intermediate data between algorithm blocks. For a more efficient implementation, the code can be restructured to transfer small blocks of pixels between functions instead, which shrinks the intermediate storage the hardware must provide. Another important optimization was to represent variables using the minimum precision required. AutoPilot provides a C language library that supports arbitrary-precision integers to enable experimentation with numeric precision. For instance, “int10” represents a 10-bit integer. Overall, BDTI found the process of restructuring C code with AutoPilot to be much simpler than traditional RTL coding.
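As a rough illustration of the two optimizations described above, the following C sketch contrasts a frame-based version of a simple two-stage computation with a row-based restructuring of it, and then shows a reduced-precision accumulator. The function names, the tiny frame dimensions, and the difference-and-threshold computation are hypothetical and are not taken from BDTI’s video application. The 10-bit type is emulated with a standard C bit-field as a stand-in; AutoPilot’s own arbitrary-precision library is not used here.

```c
#include <stdint.h>
#include <stdlib.h>

#define WIDTH  8   /* hypothetical frame size, kept tiny for illustration */
#define HEIGHT 4

/* Frame-based style: stage 1 fills a full-frame intermediate buffer,
   then stage 2 reads it back. Intermediate storage: WIDTH * HEIGHT bytes. */
void motion_mask_frame(const uint8_t *cur, const uint8_t *prev,
                       uint8_t threshold, uint8_t *mask)
{
    uint8_t diff[WIDTH * HEIGHT];
    int i;

    for (i = 0; i < WIDTH * HEIGHT; i++)
        diff[i] = (uint8_t)abs((int)cur[i] - (int)prev[i]);
    for (i = 0; i < WIDTH * HEIGHT; i++)
        mask[i] = (uint8_t)(diff[i] > threshold);
}

/* Restructured style: the same two stages exchange one row at a time,
   so the intermediate storage shrinks to WIDTH bytes. */
void motion_mask_rows(const uint8_t *cur, const uint8_t *prev,
                      uint8_t threshold, uint8_t *mask)
{
    uint8_t diff_row[WIDTH];
    int x, y;

    for (y = 0; y < HEIGHT; y++) {
        const uint8_t *c = cur + y * WIDTH;
        const uint8_t *p = prev + y * WIDTH;

        for (x = 0; x < WIDTH; x++)
            diff_row[x] = (uint8_t)abs((int)c[x] - (int)p[x]);
        for (x = 0; x < WIDTH; x++)
            mask[y * WIDTH + x] = (uint8_t)(diff_row[x] > threshold);
    }
}

/* Stand-in for a 10-bit unsigned integer, built from a standard C bit-field. */
typedef struct { unsigned int v : 10; } u10_t;

/* Sum of four 8-bit pixels: at most 4 * 255 = 1020, which fits in 10 bits,
   so a wider accumulator would spend bits that are never used. */
unsigned int sum4_u10(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    u10_t acc;

    acc.v = (unsigned int)a + b + c + d;
    return acc.v;
}
```

Both mask functions produce the same output for the same inputs; only the amount of intermediate data held between the two stages differs, which is the property the restructuring is after.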
In addition to restructuring C code, the AutoPilot user provides hints to the tool about implementation. These hints, in the form of synthesis directives, can be embedded in the C code as pragmas or included in scripts. Typical directives describe memory organization, mapping of specific variables to registers or wires, mapping of arrays to an FPGA RAM block, and pipelining suggestions. While AutoPilot’s documentation is limited, the tool provides helpful pull-down menus for generating constraints.
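To give a sense of what a directive embedded as a pragma can look like, here is a minimal sketch of a pipelining hint placed inside a loop. The directive spelling (`#pragma AP pipeline`) is an assumption based on the style used by this family of tools and should be checked against the tool’s documentation; the function name and block size are hypothetical. A standard C compiler ignores an unrecognized pragma (possibly with a warning), so the code below still compiles and runs as ordinary C.

```c
#include <stdint.h>

#define N 16   /* hypothetical block size */

/* Multiply-accumulate over a small block of samples. */
int32_t dot_product(const int16_t a[N], const int16_t b[N])
{
    int32_t acc = 0;
    int i;

    for (i = 0; i < N; i++) {
/* Assumed directive spelling: asks the synthesis tool to pipeline this loop.
   It has no effect when the file is built with an ordinary C compiler. */
#pragma AP pipeline
        acc += (int32_t)a[i] * b[i];
    }
    return acc;
}
```

Keeping the hint next to the loop it applies to makes the intent visible in the source, while placing the same directive in a script leaves the C code untouched; the article notes that both options are available.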
Another surprising finding is AutoPilot’s fast run times. For the moderately complex workloads in BDTI’s Certification Program, AutoPilot typically generated RTL output, including a report that estimates the FPGA resource utilization, latency, and throughput, in under 30 seconds. This speed enables the user to quickly evaluate numerous implementation options. It also enables an iterative development approach that provides quick feedback on each modification.
In addition to generating RTL code, AutoPilot can produce a C or SystemC language representation of the synthesized implementation to enable verification and (in the case of SystemC) cycle-based performance testing of the tool’s output. Using this capability to test the functionality of the tool’s output can save significant design time, as C simulations run much faster than RTL simulations. For BDTI’s video application, C simulations ran in minutes, while RTL simulations took hours to days. This C or SystemC simulation is not a complete substitute for RTL simulation, however, as it may not include components developed outside of AutoPilot (such as DRAM controllers) and, in the case of C language models, does not model concurrency.
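The general idea of checking functionality at the C level can be sketched as follows: a reference model and a hardware-oriented variant of the same computation are compared over all inputs. This is a plain-C illustration only; it does not reproduce the C or SystemC model that AutoPilot generates, and the function names and the 3/4 scaling operation are hypothetical.

```c
#include <stdint.h>

/* Reference ("golden") model: scale a pixel by 3/4 with plain integer math. */
uint8_t scale_ref(uint8_t x)
{
    return (uint8_t)((x * 3) / 4);
}

/* Hardware-oriented variant: the same scaling using a shift, an add,
   and a final shift instead of a multiply and a divide. */
uint8_t scale_hw(uint8_t x)
{
    unsigned int t = ((unsigned int)x << 1) + x;   /* 3 * x */

    return (uint8_t)(t >> 2);                      /* divide by 4 */
}

/* Exhaustive C-level check over every 8-bit input.
   Returns the number of inputs for which the two versions disagree. */
int count_mismatches(void)
{
    int x;
    int bad = 0;

    for (x = 0; x < 256; x++) {
        if (scale_ref((uint8_t)x) != scale_hw((uint8_t)x))
            bad++;
    }
    return bad;
}
```

A check of this kind runs almost instantly as compiled C, which is why functional testing at this level is so much faster than RTL simulation; as the article notes, it still says nothing about concurrency or about components built outside the tool.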
Once satisfactory RTL code was obtained from AutoPilot, BDTI used the Xilinx RTL tools to complete the FPGA implementation. While AutoPilot itself was quite easy to use, the Xilinx tools are complex, especially for new users. For FPGA engineers who have already mastered those tools, inserting AutoPilot into the design flow should be straightforward. For DSP software engineers with hardware knowledge, learning and using AutoPilot should be easy, but coming up to speed on the Xilinx RTL tools will be a challenge. Such engineers may want to enlist the help of a seasoned FPGA designer.
More detailed evaluation results and a description of BDTI’s methodology are available on the BDTI web site. A BDTI white paper containing further analysis of AutoPilot is also available.