Project

Description

Many embedded systems contain applications integrating mathematical processing. To satisfy the constraints (area, energy consumption, execution time) inherent to embedded systems, fixed-point arithmetic is widely used. Applications are designed and simulated using floating-point data types and then implemented on fixed-point architectures. The limited number of bits available to encode the data requires an in-depth analysis of the dynamic range, to ensure that no overflow occurs, and of the numerical accuracy, to guarantee a limited degradation of application quality. Manual fixed-point conversion is a tedious, time-consuming and error-prone task that slows down the implementation of embedded applications. Thus, reducing time-to-market while application complexity increases requires high-level development tools that automate fixed-point conversion.

The main objective of the DEFIS project is to propose new approaches that improve the efficiency of the fixed-point conversion process and to provide a complete design flow for fixed-point refinement of complex applications. The benefit obtained with this design flow, in terms of development time and quality of the generated solution, will be demonstrated through experiments. This infrastructure will significantly reduce design time by automating the fixed-point conversion and will make it possible to explore the trade-off between application quality and implementation cost. Moreover, this flow will guarantee and validate the numerical behavior of the resulting implementation.

In the DEFIS workflow, the application is described at the system level through a set of blocks. Each block is either specified in C code using floating-point data types or generated from a parameterized block for kernels such as digital signal processing (DSP) systems or polynomial evaluation. The infrastructure handles a complete application through a hierarchical approach and is organized in four modules. The first module defines, for each block, the numerical accuracy constraint according to the global application quality. This constraint is used for the fixed-point conversion of each block. The second module corresponds to the algorithm-level transformation and carries out a set of transformations to find the best structure for the computation. The third module evaluates the dynamic range to determine the number of bits for the integer part. Two kinds of methods will be proposed. The first one guarantees the absence of overflow by using techniques based on static analysis. The second one aims to minimize the integer-part word-length by tolerating some overflows as long as the overall application quality is maintained. Finally, the fourth module determines the number of bits for the fractional part. The objective is to optimize the fixed-point specification so as to minimize the implementation cost for a given numerical accuracy constraint. This optimization process requires evaluating the numerical accuracy many times. The existing approaches proposed by the DEFIS project partners will be extended and combined to obtain an accurate approach supporting a wide range of applications.

After the fixed-point specification has been determined and optimized, the infrastructure generates new C code with fixed-point data types. This C code is then implemented on the target architectures with classical tools. Three platforms are considered: a SIMD accelerator designed by Thales, with its associated parallelization tool; a hard-coded IP block synthesized using a High-Level Synthesis tool; and a processor (ARM Cortex) with its associated C compiler.
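As an illustration of what C code with fixed-point data types can look like, here is a minimal sketch of a 16-bit format with 1 sign bit and 15 fractional bits (often called Q15). The format, the type name `q15_t`, and the helper names are assumptions made for this example. They do not describe the code actually generated by the DEFIS infrastructure, whose word-lengths are chosen per variable by the optimization process.

```c
#include <assert.h>
#include <stdint.h>

#define FRAC_BITS 15            /* fractional bits of the assumed format */

typedef int16_t q15_t;          /* hypothetical fixed-point type */

/* Clamp a 32-bit intermediate result to the 16-bit range. */
static q15_t saturate_q15(int32_t v)
{
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (q15_t)v;
}

/* Convert a floating-point value in [-1, 1] to Q15, rounding to nearest.
 * The value +1.0 is not representable and saturates to the largest code. */
static q15_t q15_from_double(double x)
{
    assert(x >= -1.0 && x <= 1.0);
    double scaled = x * (double)(1 << FRAC_BITS);
    int32_t rounded = (int32_t)(scaled >= 0.0 ? scaled + 0.5 : scaled - 0.5);
    return saturate_q15(rounded);
}

/* Fixed-point replacement for the floating-point product a * b.
 * The 32-bit product has 30 fractional bits; it is rounded and shifted
 * back to 15. The right shift of a negative value is assumed to be
 * arithmetic, which holds on common compilers but is
 * implementation-defined in C. */
static q15_t q15_mul(q15_t a, q15_t b)
{
    int32_t product = (int32_t)a * (int32_t)b;
    product += 1 << (FRAC_BITS - 1);
    return saturate_q15(product >> FRAC_BITS);
}
```

With these helpers, the floating-point expression `0.5 * 0.5` becomes `q15_mul(q15_from_double(0.5), q15_from_double(0.5))`, which yields the code 8192, that is 0.25 in Q15. The saturation step only matters for the corner case -1 times -1, whose exact result +1 lies just outside the format.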

The proposed infrastructure will be validated on two real applications provided by the industrial partners. The results obtained with the infrastructure developed in the DEFIS project will be compared with reference implementations provided by the industrial partners, each of which has been optimized manually. To assess the quality of our approach, the gain in development time will be measured and the cost (resources, energy consumption, execution time) of the implementations will be compared.