The following dictionary of DSP terms is provided as a resource to the community.
Absolute addressing Alternative term for memory-direct addressing.
Accumulator A register used to hold the output of the ALU or multiply-accumulate unit. On fixed-point processors, accumulators are usually at least twice as wide as the processor’s basic word width (in bits) and may be wider. DSP processors typically have from one to four accumulators.
ALU Arithmetic/logic unit. An execution unit on a processor responsible for arithmetic (add/subtract, shift, etc.; typically not including multiply operations) and logic (and, or, not, exclusive-or) operations.
ASIC Application-specific integrated circuit. An integrated circuit intended for use in a particular product or set of products, designed by users of the IC. Contrast with ASSP.
ASSP Application-specific standard product. An ASSP is different from an ASIC in that it is intended to be used in a range of products within a particular application field, whereas an ASIC is often used only in a single product.
Biquad filter A second-order digital filter commonly used in signal processing. Biquads are often used as building blocks in higher-order digital filters.
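As an illustration of the biquad entry above, a second-order section can be sketched in the common "direct form I" arrangement. The function name and the coefficient names (b0, b1, b2, a1, a2) are assumptions chosen for this example, and the sign convention for a1 and a2 (subtracted in the difference equation) is one of two in common use.

```python
def biquad(samples, b0, b1, b2, a1, a2):
    """Direct form I biquad: y[n] = b0*x[n] + b1*x[n-1] + b2*x[n-2]
                                     - a1*y[n-1] - a2*y[n-2]."""
    x1 = x2 = y1 = y2 = 0.0  # previous two inputs and outputs
    out = []
    for x in samples:
        y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, x  # shift the input history
        y2, y1 = y1, y  # shift the output history
        out.append(y)
    return out

# Impulse response of a section with one feedback coefficient.
response = biquad([1.0, 0.0, 0.0], 1.0, 0.0, 0.0, -0.5, 0.0)
```

Higher-order filters are often built by feeding the output of one such section into the next.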
Bit-reversed addressing An addressing mode in which the order of the bits used to form a memory address is reversed. This simplifies reading the output from radix-2 FFT algorithms, which produce their results in a scrambled order.
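A software sketch of the index mapping that bit-reversed addressing performs in hardware is shown here. The helper name bit_reverse is hypothetical; a DSP processor would typically produce these addresses as part of its address generation rather than with a loop.

```python
def bit_reverse(index: int, num_bits: int) -> int:
    """Return index with its lowest num_bits bits in reversed order."""
    result = 0
    for _ in range(num_bits):
        result = (result << 1) | (index & 1)  # move the low bit of index into result
        index >>= 1
    return result

# Read order for an 8-point transform (3 address bits).
order = [bit_reverse(i, 3) for i in range(8)]
```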
Block floating-point Block floating-point arithmetic is a form of floating-point arithmetic that is sometimes used on fixed-point processors. (Many fixed-point DSP processors have hardware and instructions to support block floating-point arithmetic.) With block floating-point formats, a block of data is assigned a single exponent (rather than each data word having its own exponent, as in floating-point). The exponent is typically determined by the data element in the block with the largest magnitude. This technique is used for variables that can’t be represented with sufficient fidelity using the processor’s native fixed-point format. The conversion from fixed-point to block floating-point representation is performed explicitly by the programmer, in software.
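The shared-exponent idea can be sketched as follows. This is a minimal illustration, assuming a non-empty block of integers and a 16-bit mantissa; the function name and the choice of right shifts for scaling are assumptions of this example, not a description of any particular processor's instructions.

```python
def to_block_floating_point(values, word_bits=16):
    """Scale a block so every element fits in word_bits, sharing one exponent.

    Returns (mantissas, exponent), where each original value is
    approximately mantissa * 2**exponent.
    """
    limit = (1 << (word_bits - 1)) - 1       # largest positive mantissa
    peak = max(abs(v) for v in values)       # largest magnitude sets the exponent
    exponent = 0
    while (peak >> exponent) > limit:
        exponent += 1
    mantissas = [v >> exponent for v in values]  # arithmetic shift, rounds toward -infinity
    return mantissas, exponent

mantissas, exponent = to_block_floating_point([100000, -3, 40000])
```

Small values in the block lose precision (here -3 becomes -1 at an exponent of 2), which is the usual cost of sharing one exponent.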
Branch A change in a processor’s flow of execution to continue execution at a new address.
Butterfly An operation used in the computation of the FFT. A butterfly computation involves a multiplication, addition, and subtraction of complex numbers.
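One common form, the radix-2 decimation-in-time butterfly, can be sketched as below. The function and parameter names are assumptions of this example; "twiddle" stands for the complex twiddle factor.

```python
def butterfly(a: complex, b: complex, twiddle: complex):
    """Radix-2 butterfly: one complex multiply, one add, one subtract."""
    t = b * twiddle
    return a + t, a - t

upper, lower = butterfly(1 + 0j, 1 + 0j, 1 + 0j)
```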
Circular (modulo) addressing An addressing mode where post-increments are done using modulo arithmetic, so that the pointer returns to the beginning address after it steps through the buffer. (This addressing mode is used to implement circular buffers.)
Circular buffer A region of memory used as a buffer that appears to wrap around. Circular buffers are typically implemented in software on conventional processors and via modulo addressing on DSP processors.
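A software circular buffer of the kind described above might look like the following sketch. The class and method names are hypothetical; the modulo operation on the index is the step that modulo addressing hardware performs at no extra cost.

```python
class CircularBuffer:
    """Fixed-size buffer whose write index wraps around with modulo arithmetic."""

    def __init__(self, size: int):
        self.data = [0] * size
        self.index = 0  # position of the next write

    def push(self, sample):
        self.data[self.index] = sample
        self.index = (self.index + 1) % len(self.data)  # wrap to the start

    def newest_first(self):
        n = len(self.data)
        return [self.data[(self.index - 1 - k) % n] for k in range(n)]

buf = CircularBuffer(3)
for sample in [1, 2, 3, 4]:
    buf.push(sample)  # the fourth push overwrites the oldest sample
```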
Codec Encoder/decoder, or sometimes compression/decompression. Refers to an algorithm used to encode/decode or compress/decompress digital audio or video signals for transmission or storage.
Convergent rounding A rounding technique used to avoid the bias inherent in the conventional round-to-nearest approach, which always rounds up when the input value lies exactly halfway between two output values. Convergent rounding instead balances these halfway cases, most commonly by rounding to the nearest even output value, so that about half of them are rounded up and the other half are rounded down.
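The bias can be seen in a small sketch that compares conventional rounding with one common form of convergent rounding, in which halfway cases go to the nearest even value. The function names are assumptions of this example.

```python
import math

def round_half_up(x: float) -> int:
    """Conventional round-to-nearest: halfway cases always round up."""
    return math.floor(x + 0.5)

def round_convergent(x: float) -> int:
    """Round to nearest; halfway cases go to the nearest even integer."""
    lower = math.floor(x)
    diff = x - lower
    if diff > 0.5:
        return lower + 1
    if diff < 0.5:
        return lower
    return lower if lower % 2 == 0 else lower + 1

ties = [0.5, 1.5, 2.5, 3.5]                           # true sum is 8.0
sum_half_up = sum(round_half_up(t) for t in ties)
sum_convergent = sum(round_convergent(t) for t in ties)
```

For these four halfway values, conventional rounding overshoots the true sum while the convergent results add up to it exactly.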
Convolutional encoding An error control coding technique used to encode bits before transmission over a noisy channel. Used in modems and digital cellular telephony. Convolutional encoding is usually decoded via the Viterbi algorithm.
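As a small illustration, the sketch below encodes bits with a rate-1/2 code of constraint length 3. The generator values 7 and 5 (octal) are a widely used textbook choice and are an assumption of this example, as is the function name.

```python
def conv_encode(bits, g1=0b111, g2=0b101):
    """Rate-1/2 convolutional encoder: two output bits per input bit."""
    state = 0  # 3-bit shift register holding the current and two previous bits
    out = []
    for bit in bits:
        state = ((state << 1) | bit) & 0b111
        out.append(bin(state & g1).count("1") % 2)  # parity over the taps of g1
        out.append(bin(state & g2).count("1") % 2)  # parity over the taps of g2
    return out

encoded = conv_encode([1, 0, 1, 1])
```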
Core Refers to the central execution units of a processor, excluding such things as memory and peripherals. A processor core can be used in many chips with different combinations of memory and peripherals, thus creating a chip family with the same processor architecture. Some cores can be licensed and used to create customized ASICs.
Data path A collection of execution units (adder, multiplier, shifter, etc.) that process data. A processor’s data path determines the mathematical operations possible on that processor.
Delay line A buffer used to store a fixed number of past samples. Delay lines are used to implement both FIR and IIR filters.
Digital signal controller (DSC) A processor that is intended to be used for both control-loop applications (such as motor control) and signal processing. Examples include Microchip’s dsPIC and Freescale’s 56800/56800E.
DSP Digital signal processor or digital signal processing.
Embedded system A system containing a processor where the processor is not generally reprogrammable by the end user. For example, a cell phone containing a DSP processor is an embedded system.
FFT Fast Fourier transform. A computationally efficient method of computing the discrete Fourier transform, which is used to estimate the frequency spectrum of a signal. FFT algorithms are widely used in DSP systems. There are a number of different FFT algorithms, including radix-2 and radix-4. They may or may not include data unscrambling (also called bit-reversal), and they may be calculated "in-place" or "out-of-place."
Fixed-point Refers to a number format where the binary point is in a fixed location. The format uses a fixed number of bits, and of these, a fixed subset specifies the integer part, with the remaining subset specifying the fractional part. The advantage of the fixed-point format is that it can be implemented in hardware more cheaply and with better energy efficiency than floating-point format, and thus it is very commonly used in embedded systems. The disadvantage is that it provides less dynamic range and requires values to be carefully scaled to avoid overflow or saturation. Integer formats are a sub-class of fixed-point formats.
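A sketch of one widely used fixed-point format, Q15 (16-bit words with 15 fractional bits), illustrates the scaling and saturation mentioned above. The helper names are hypothetical, and Python integers stand in for the 16-bit registers of a real processor.

```python
Q15_ONE = 1 << 15      # scale factor: 15 fractional bits
Q15_MAX = 32767        # largest representable value, just under 1.0
Q15_MIN = -32768       # represents exactly -1.0

def saturate(value: int) -> int:
    """Clamp to the 16-bit range instead of wrapping around."""
    return max(Q15_MIN, min(Q15_MAX, value))

def float_to_q15(x: float) -> int:
    return saturate(round(x * Q15_ONE))

def q15_to_float(q: int) -> float:
    return q / Q15_ONE

def q15_mul(a: int, b: int) -> int:
    """Multiply two Q15 values; the raw product has 30 fractional bits."""
    return saturate((a * b) >> 15)

half = float_to_q15(0.5)
quarter = q15_mul(half, half)
```

Note that 1.0 itself is not representable in this format, so converting it saturates to 32767.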
Floating-point Refers to a number format where the position of the binary point "floats" depending on the magnitude of the number being represented. Each floating-point number is composed of three fixed-point numbers: one specifies the sign, another specifies the "mantissa," and the third specifies the "exponent." The value of the number being represented is equal to the mantissa times the base (usually 2) raised to the power given by the exponent. The advantage of the floating-point format is that it provides a very wide dynamic range with good precision. The disadvantage is that the hardware required to support this format consumes more power and is more expensive than fixed-point hardware.
FPGA Field-programmable gate array. A chip composed of an array of configurable logic cells (also called logic blocks). Each cell can be configured, or programmed, to perform one of a variety of simple functions, such as computing the logical AND of two inputs. FPGA logic cells can be used as building blocks to implement any kind of functionality desired, from low-complexity state machines to complete microprocessors. Recent DSP-oriented FPGAs include hard-wired data paths and/or processor cores.
FIR filter Finite impulse response filter. A category of digital filters. As compared to the other category, IIR filters, FIR filters generally require more computation to achieve comparable results, but offer several attractive design characteristics (such as guaranteed stability and the option of linear phase).
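A minimal FIR filter sketch follows, combining a delay line with a multiply-accumulate loop. The function name and the use of a Python list as the delay line are assumptions of this example; a DSP processor would typically use a circular buffer and a hardware loop instead.

```python
def fir_filter(samples, coeffs):
    """Each output is the sum of coeffs[k] * x[n-k] over the delay line."""
    delay_line = [0.0] * len(coeffs)  # most recent sample first
    out = []
    for x in samples:
        delay_line = [x] + delay_line[:-1]  # shift in the new sample
        acc = 0.0
        for c, d in zip(coeffs, delay_line):
            acc += c * d                    # multiply-accumulate
        out.append(acc)
    return out

# The impulse response equals the coefficients, then stays at zero.
impulse_response = fir_filter([1.0, 0.0, 0.0, 0.0], [0.5, 0.25])
```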
General-purpose processor A processor that was designed to serve a variety of applications, rather than being highly tailored to one specific application (or class of applications). Examples include the ARM, MIPS, and PowerPC processors.
Guard bits Extra bits in an accumulator used to prevent overflow during accumulation operations. DSP processors commonly provide four to eight guard bits in their accumulators.
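The effect of guard bits can be estimated with a short sketch that counts how many worst-case positive products fit in an accumulator before it overflows. The 16-bit operand width, the 32-bit and 40-bit accumulator widths, and the function name are assumptions chosen for illustration.

```python
def accumulations_before_overflow(acc_bits: int, product: int) -> int:
    """Count how many times product can be added to a signed accumulator."""
    limit = (1 << (acc_bits - 1)) - 1  # largest positive accumulator value
    acc = 0
    count = 0
    while acc + product <= limit:
        acc += product
        count += 1
    return count

worst_product = 32767 * 32767  # largest positive product of two 16-bit values

no_guard = accumulations_before_overflow(32, worst_product)    # 32-bit accumulator
with_guard = accumulations_before_overflow(40, worst_product)  # 8 guard bits
```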
Hardware loop A programming construct in which one or more instructions are repeated under the control of specialized hardware that minimizes the overhead for the loop.
Harvard architecture A processor memory architecture with two (or more) separate banks of memory and multiple on-chip memory buses. In traditional Harvard architectures, the processor fetches instructions from one memory bank and data from the other. In many modern processors, however, the memory banks can store both instructions and data. Most processors that target signal processing applications are based on variants of the basic Harvard architecture.
IIR filter Infinite impulse response filter. A category of digital filters. As compared to FIR filters, IIR filters generally require less computation to achieve comparable results, but sacrifice certain design characteristics (such as guaranteed stability) that are often desirable.
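A first-order example shows where the "infinite" in the name comes from: each output feeds back into the next, so in exact arithmetic the response to a single impulse decays but never reaches zero. The function name and the particular low-pass form used here are assumptions of this sketch.

```python
def one_pole_lowpass(samples, a):
    """First-order IIR filter: y[n] = (1 - a) * x[n] + a * y[n-1]."""
    y = 0.0
    out = []
    for x in samples:
        y = (1.0 - a) * x + a * y  # feedback from the previous output
        out.append(y)
    return out

impulse_response = one_pole_lowpass([1.0, 0.0, 0.0, 0.0], 0.5)
```

With a magnitude of a below 1 the filter is stable; at 1 or above the output would not decay, which is the stability concern noted in the entry.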
Immediate addressing A processor addressing mode in which the value to be used is specified as part of the instruction word. An example is the instruction "MOV #3,R0," which moves the value "3" into register R0.
Instruction cycle The time required to execute the fastest instruction on a processor.
Instruction set A processor’s instruction set is the set of assembly language instructions that it is able to execute.
Instruction-set simulator A software development tool that simulates the execution of programs on a specific processor. Instruction-set simulators provide a software view of the processor; that is, they display program instructions, registers, memory, and flags, and allow the user to manipulate the register and memory contents.
Interlocking pipeline A pipeline architecture in which instructions that cause contention for resources are delayed by some number of instruction cycles. This technique ensures that instructions produce logically expected results; an instruction that uses the results of a previous instruction, for example, will stall (if needed) until the results from the previous instruction are ready.
Kernel (1) Software (such as an operating system) that provides services to other programs. (2) A small portion of code that forms the heart of an algorithm (as in algorithm kernel benchmarks).
Loop unrolling A programming or compiler strategy whereby instructions that are executed within a loop are copied one or more times to reduce (or eliminate) the number of times the loop is executed. This technique can reduce the overhead associated with looping, but it increases instruction memory usage.
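The structure of an unrolled loop can be sketched with a dot product. The function names and the unrolling factor of four are assumptions of this example, and Python is used only to show the shape of the code; the speed benefit applies to compiled or assembly code, where the loop overhead is a larger share of the work.

```python
def dot_rolled(a, b):
    """One multiply-accumulate per loop iteration."""
    acc = 0
    for i in range(len(a)):
        acc += a[i] * b[i]
    return acc

def dot_unrolled_by_4(a, b):
    """Four multiply-accumulates per loop iteration, plus a cleanup loop."""
    n = len(a)
    acc = 0
    i = 0
    while i + 4 <= n:  # loop body copied four times
        acc += a[i] * b[i] + a[i + 1] * b[i + 1] + a[i + 2] * b[i + 2] + a[i + 3] * b[i + 3]
        i += 4
    while i < n:       # handle lengths that are not a multiple of four
        acc += a[i] * b[i]
        i += 1
    return acc

a = [1, 2, 3, 4, 5, 6]
b = [1, 1, 1, 1, 1, 1]
```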
MAC Multiply-accumulate. A common operation in many DSP applications, where operands are multiplied and then added into a running total in an accumulator register.
Memory-direct addressing A processor addressing mode where the address is specified as a constant that forms part of the instruction. For example, "MOV X:1234,X0" moves the contents of X memory location 1234 into register X0.
SIMD Single instruction, multiple data. A processor architectural technique in which one instruction causes multiple, identical operations to be performed on different sets of data. For example, a single SIMD multiply instruction might perform two (or four, or eight, etc.) multiplications on separate sets of input data, producing multiple results.
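The idea can be emulated in software by packing two 16-bit values into one 32-bit word and adding both lanes in one call. The function names and the two-lane layout are assumptions of this sketch; a real SIMD instruction performs the lane operations in parallel hardware.

```python
def pack(lo: int, hi: int) -> int:
    """Pack two 16-bit lanes into one 32-bit word."""
    return ((hi & 0xFFFF) << 16) | (lo & 0xFFFF)

def simd_add16(x: int, y: int) -> int:
    """Add the two 16-bit lanes of x and y independently (wrapping)."""
    lo = ((x & 0xFFFF) + (y & 0xFFFF)) & 0xFFFF
    hi = (((x >> 16) & 0xFFFF) + ((y >> 16) & 0xFFFF)) & 0xFFFF
    return (hi << 16) | lo

result = simd_add16(pack(1, 2), pack(10, 20))
```

A carry out of the low lane does not spill into the high lane, which is what distinguishes this from an ordinary 32-bit addition.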
Superscalar A processor architecture in which the processor can execute multiple instructions (typically two or four) per instruction cycle. In superscalar processors, instructions are grouped for parallel execution by processor hardware rather than by the programmer or compiler. Many general-purpose processors that were initially single-issue architectures have been augmented for superscalar execution; this enables the architecture to increase performance while maintaining binary compatibility with previous generations. Contrast with VLIW.
Viterbi algorithm A computationally efficient (but still relatively complex) algorithm for decoding a convolutionally encoded bit stream.
VLIW Very long instruction word. A processor architecture in which the processor issues and executes multiple instructions (typically ranging from 4 to 8) per instruction cycle. In VLIW processors, the programmer or compiler, rather than the processor hardware, is responsible for grouping instructions for parallel execution. This increases programming complexity, but simplifies the hardware relative to the other main class of multi-issue architectures, superscalar.
Von Neumann architecture A processor memory architecture in which there is a single memory bank and single bus for transferring both instructions and data. Von Neumann architectures are rarely used in data-intensive signal processing applications because they tend to result in crippling memory bottlenecks. Contrast with Harvard architecture.