Signal Processing Generator

This tool should still be functional inside Simplifide but is no longer being developed. Its features are included in ScalaDL, where development continues.

The signal processing generator is an add-on tool for Simplifide which aids in the creation of datapath rtl as well as bit-matched models of the expression. It operates like other embedded templates: the code is embedded into the rtl and can be expanded using a context menu in the editor or project navigator. The purpose of this tool is to provide a more natural language for creating signal processing hardware designs that are easier to implement, understand, debug and support. The tool uses syntax as close as possible to standard signal processing notation to create optimal rtl code as well as matching c-code, greatly reducing the development time of datapath blocks.

Verilog and VHDL are extremely good languages for describing low level hardware operations but are somewhat lacking when describing signal processing functionality.

y[n]  =  RC<16,8>(x[n] + RC<16,8>(a0*y[n-1]) + RC<16,8>(a1*y[n-2]));

z[n]  =  RC<16,8>(b0*y[n-1]) + RC<16,8>(b1*y[n-2]) + RC<16,8>(b2*y[n-3]);

The code above contains standard signal processing notation with the exception of RC<16,8>, which stands for Round and Clip with an internal precision of 16 bits, 8 of which are fractional bits.

Ignoring the fixed point RC<16,8> notation, the example code above shows an expression which should be easily identifiable as an IIR filter to anyone who has taken a signal processing course. This snippet generates an rtl version of the filter along with both a floating point and a fixed point model of the design in C. Done by hand, this design would typically take at least 1 day to design and 2 days to verify and bit match. With this tool an optimal design can be done in 10 minutes.
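
To make the two difference equations concrete, the sketch below evaluates them in floating point with the RC<16,8> operations left out. It is only an illustration and is not the C model the tool generates: the function name iir_float and the zero initial state of the delay line are assumptions, and the coefficient names a0, a1, b0, b1, b2 are taken from the example above.

```python
def iir_float(x, a0, a1, b0, b1, b2):
    """Floating point evaluation of
         y[n] = x[n] + a0*y[n-1] + a1*y[n-2]
         z[n] = b0*y[n-1] + b1*y[n-2] + b2*y[n-3]
    Returns the list of z[n] values."""
    # Delay line holding y[n-1], y[n-2], y[n-3]; assumed to reset to zero.
    y1, y2, y3 = 0.0, 0.0, 0.0
    z = []
    for xn in x:
        yn = xn + a0 * y1 + a1 * y2
        zn = b0 * y1 + b1 * y2 + b2 * y3
        z.append(zn)
        # Shift the delay line by one sample.
        y1, y2, y3 = yn, y1, y2
    return z


# Impulse response with arbitrary illustrative coefficients.
print(iir_float([1.0, 0.0, 0.0, 0.0], 0.5, 0.25, 1.0, 0.0, 0.0))
```

A bit-matched model would additionally apply the rounding and clipping at each point where RC<16,8> appears in the expression.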

Similar examples are listed below:

·         FIR example

·         IIR example

·         Complex Multiplier with 2 cycles per operation

·         Complex Multiplier with 4 cycles per operation

Background

RTL languages are well suited for the design of general purpose hardware but are somewhat lacking when it comes to signal processing code. Code generators based on C or other high level languages suffer from two issues: the resulting design is often inefficient, and control and simple hardware structures are difficult to specify. This tool addresses these issues in three ways:

1.       The language is not an abstraction but rather a more efficient way to describe common structures

2.       The language is embedded in rtl code allowing it to only be used for blocks which have a good fit

3.       The language is a natural notation designed specifically for signal processing

This tool is an alpha version of the 4th generation of a signal processing generator. The basic features are already implemented with a couple of key extensions planned over the next couple of weeks and months. The main planned features are support for vectors as well as for functions and better parameterization. 

Getting Started

The easiest way to get up and running is to use one of the example projects included in Simplifide. The projects can be loaded by creating a standard example suite; the suites can be found underneath the Signal Processing menu. These projects all have the same structure. The main project, contained under the projects directory, is work, which contains 3 subdirectories:

1.       design - Contains design files and testbench

2.       src_c    - Contains a directory which contains the source files generated from the design

3.       test      - Contains operations to test the project

a.       data         - Contains the data which is used for the simulation

b.      matlab    - Contains matlab files to generate and test results

c.       Makefile - A makefile which runs an ISIM simulation as well as the c-code

  

The important targets in the Makefile are build, which builds the project, and run, which runs the ISIM and C simulations. The matlab directory contains a few basic files. The data creation file is prefixed with "cr". The test routine, which compares the c data with the rtl data, is prefixed with "test".

Operation

Embedded templates can be expanded in 2 separate ways:

1.       From a context sensitive action (Expand Templates) in the editor

2.       From a context sensitive action (Expand Templates) in the project navigator

These operations will remove the existing templates and replace them with newly generated blocks. In addition, a header and a source c file will be created in a src_c directory underneath the project directory. These contain a floating point model of the operation as well as a bit-matched fixed point model. The c files are simple and depend on only one extra c file and its header:

dsp_basic.c

dsp_basic.h

Syntax

The syntax has been designed to be as simple as possible to describe only signal processing operations in notation which should be familiar to anyone with a signal processing background.  There are numerous extensions planned which are going to keep the philosophy of signal processing notation.  Since this code is embedded in rtl, control and other operations can and will be handled by rtl.

Dsp Body

Embedded templates are defined inside comments in the code. The DSP templates are defined with the dsp prefix in a body as shown below. The name shown below is used as the name for the c functions which are created.

dsp name {

     Clock Definition (Optional);

     Signal Declarations

     Statements

}

Clock Definition

The clock definition section defines the clock which is used for this block. An example is shown below. The name specifies a previously defined clock declaration. This is required for blocks which don't use the default clocking structure as well as blocks which share logic and require some control for the input muxing.

clock_head alpha;

Signal Declarations

The signal declaration section specifies the signals, constants, and parameters that are used to create the code.

Fixed point definition

All of the code generation is done in fixed point and uses the standard fixed point definition: the width of the signal, the width of the fractional part and whether the signal is signed or unsigned. This is defined using the format shown below:

                <signed, width, frac>

The signed field can take one of the values listed below. It is optional and defaults to signed when unspecified.

·         s - Signed

·         u - Unsigned

·         c - Control

The width value is the total number of bits and the frac value is the number of fractional bits. The example shown below creates a parameter iwidth which is signed and has a width of 8 with 4 fractional bits.

   fixed  <s,8,4>  iwidth

 

Signals can be either declared using a parameter like iwidth or directly using the <> notation.
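
As a rough illustration of what a definition such as <s,8,4> implies, the sketch below computes the representable range and resolution of a fixed point format. It assumes the usual two's complement convention in which the width includes the sign bit; the tool's exact convention is not stated here, the function name fixed_range is hypothetical, and the control type (c) is not modelled.

```python
def fixed_range(signed, width, frac):
    """Return (minimum, maximum, step) for a fixed point format.

    Assumes two's complement for signed values, with `width` counting
    every bit including the sign bit and `frac` counting fractional bits.
    """
    step = 2.0 ** -frac
    if signed:
        lo = -(1 << (width - 1))
        hi = (1 << (width - 1)) - 1
    else:
        lo = 0
        hi = (1 << width) - 1
    return lo * step, hi * step, step


# <s,8,4>: signed, 8 bits wide, 4 fractional bits
print(fixed_range(True, 8, 4))   # (-8.0, 7.9375, 0.0625)
```

Under these assumptions a constant such as 1.5 fits exactly in <s,8,4>, since it is a multiple of the 0.0625 step.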

Signal Definition

Other than the fixed point definition there are 4 other signal types which can be used.

Input Declaration

An input declaration specifies a signal which is an input to the block. The distinction of an input from other signals is that a declaration is not created inside the rtl body, and that it is added to the input parameter list of the c functions which are created. An example is shown below. Like other signals an integer surrounded by brackets at the end of the declaration defines the number of delayed versions of this signal.

input  <s,8,5>  x[2];

Output Declaration

An output declaration specifies a signal which is an output to the block. The distinction of an output from other signals is that a declaration is not created inside the rtl body, and that it is added to the parameter list of the c functions which are created as a pointer. 

output   <s,8,5>  z;

Signal Declaration

A signal declaration specifies an internal signal. The example below shows 2 signals which each contain an extra signal associated with 1 delay.

signal <16,8>   Xr[1],Xi[1];

Constant Declaration

A constant declaration specifies a signal which is a constant. A constant is treated differently than a signal in many operations. An example is a multiply by a constant, which is broken down into its adds.

constant <iwidth> beta  = 1.5;

Statements

The statement section of the code defines standard signal processing expressions with add, subtract, and multiply operations and methods to specify delay and functions used to define clipping and rounding operations, as well as operations for sharing blocks.

Standard Statement Syntax

Statements use a syntax that is standard in both programming languages and signal processing. A statement consists of an output signal on the left and a standard expression of the inputs on the right.

z[n] = alpha*x[n] + beta*x[n-1] + alpha*x[n-2];

The expression above shows a very short, simple filter. It multiplies the constants with the input x and with delayed versions of x taken from a delay line. The index subtracted from n defines the delay of the signal.

z[n] = alpha*(x[n]+x[n-2]) + beta*x[n-1];

 

The first expression can be optimized to the second expression. This tool is a WYSIWYG tool so optimizations like this need to be explicitly coded.
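
The equivalence of the two expressions can be checked with a small floating point sketch. The function names are hypothetical, and samples before the start of the input are assumed to be zero. The direct form uses three multiplies per output while the folded form uses two, which is the saving that has to be coded by hand.

```python
def delayed(x, n, d):
    """Return x[n-d], treating samples before the start as zero (assumed)."""
    return x[n - d] if n - d >= 0 else 0.0


def fir_direct(x, alpha, beta):
    # z[n] = alpha*x[n] + beta*x[n-1] + alpha*x[n-2]   (3 multiplies)
    return [alpha * delayed(x, n, 0) + beta * delayed(x, n, 1)
            + alpha * delayed(x, n, 2) for n in range(len(x))]


def fir_folded(x, alpha, beta):
    # z[n] = alpha*(x[n] + x[n-2]) + beta*x[n-1]       (2 multiplies)
    return [alpha * (delayed(x, n, 0) + delayed(x, n, 2))
            + beta * delayed(x, n, 1) for n in range(len(x))]


x = [1.0, 2.0, 3.0, 4.0]
print(fir_direct(x, 0.5, 1.5))
print(fir_folded(x, 0.5, 1.5))
```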

Fixed Point Handling

The constructs in this tool handle fixed point operations automatically. If no rounding operation is specified then the input is automatically converted to the output format using truncation and overflow. This is done so that the default requires the least hardware. For cases where more control over the operations is desired, the standard rounding and truncation methods are available using a function type syntax.

y[n]  =  RC<16,8>(x[n] + RC<16,8>(a0*y[n-1]) + RC<16,8>(a1*y[n-2]));

 

An example of this is shown above. The RC<16,8> notation stands for round and clip with an internal width of 16 and 8 fractional bits. This might be slightly counterintuitive: the output width is predefined by the operation, so the internal width is all that needs to be specified. There are many choices for defining these operations:

·         RC or round_clip            : Round and clip

·         R or round                       : Round

·         TC or truncate_clip        : Truncate and clip

·         T or truncate                   : Truncate

·         UC                                     : Unbiased round and clip

·         U                                       : Unbiased round

Currently the unbiased round operation is not supported.
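
The difference between the supported modes can be sketched as follows. This is a behavioural illustration only, not the tool's implementation: it assumes signed two's complement values, rounding by adding half an LSB, truncation by dropping the extra LSBs, clipping as saturation, and wrap-around when no clipping is requested. The function name quantize and its parameters are hypothetical.

```python
import math


def quantize(value, width, frac, rounding="round", clip=True):
    """Convert a real value to a signed fixed point value with `width`
    total bits and `frac` fractional bits, returned as a float.

    RC -> rounding="round",    clip=True
    R  -> rounding="round",    clip=False
    TC -> rounding="truncate", clip=True
    T  -> rounding="truncate", clip=False
    """
    scaled = value * (1 << frac)
    if rounding == "round":
        q = math.floor(scaled + 0.5)      # assumed: add half an LSB
    elif rounding == "truncate":
        q = math.floor(scaled)            # assumed: drop the extra LSBs
    else:
        raise ValueError("unknown rounding mode: " + rounding)
    lo = -(1 << (width - 1))
    hi = (1 << (width - 1)) - 1
    if clip:
        q = max(lo, min(hi, q))           # saturate at the format limits
    else:
        q = (q - lo) % (1 << width) + lo  # overflow wraps around
    return q / (1 << frac)


print(quantize(0.003, 16, 8, "round", True))      # RC<16,8>
print(quantize(0.003, 16, 8, "truncate", True))   # TC<16,8>
print(quantize(200.0, 16, 8, "round", True))      # clipped
print(quantize(200.0, 16, 8, "truncate", False))  # wrapped
```

Under these assumptions, clipping costs a comparison against the format limits and rounding costs an extra add, which is consistent with truncation and overflow being the default that needs the least hardware.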

Logic Sharing

Logic sharing based on a higher speed clock is standard for signal processing design, and is normally handled by muxing the inputs to the signal processing blocks in the design. This can be handled automatically by using a slightly different notation specifying the period and offset. An example of this is shown below. This must be coupled with a clock definition which defines an index equivalent to the offset to be used in the muxing.

   // Real Multiplier

   Xr[2k]    = Ar[2k]   * Br[2k];

   Xr[2k+1]  = Ai[2k+1] * Bi[2k+1];

The example above will create code with 2 muxes at the inputs of the shared multiplier, which select:

·         Time 0 --- Ar and Br

·         Time 1 --- Ai and Bi

The sharing operation assumes that the expressions defined in the 2 segments have exactly the same form.

   // Real Addition Segment

   Yr[2k]    = RC<16,8>(Yr[2k-1]   +   -Xr[2k-1]);

   Yr[2k+1]  = RC<16,8>(0.0        +    Xr[2k]);

The block above illustrates this requirement. A constant 0.0 is used in the second expression, and the first expression adds a negated term rather than subtracting, so that both expressions have the form of a two-input addition.
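
To see how the two shared segments combine, the sketch below steps through the multiplier and addition statements one clock cycle at a time. It is a behavioural illustration under several assumptions: each input sample is held for both cycles of a period, all state resets to zero, and the RC<16,8> rounding is left out. The function name shared_real_part is hypothetical.

```python
def shared_real_part(ar, br, ai, bi):
    """Cycle-by-cycle model of one shared multiplier and one shared adder.

    Even cycles (offset 0): Xr = Ar*Br,  Yr = Yr[t-1] + -Xr[t-1]
    Odd cycles  (offset 1): Xr = Ai*Bi,  Yr = 0.0     +  Xr[t-1]
    Returns Yr sampled on even cycles, i.e. Ar*Br - Ai*Bi per input pair.
    """
    n = len(ar)
    x_prev = 0.0   # Xr[t-1]
    y_prev = 0.0   # Yr[t-1]
    results = []
    # One extra even cycle is needed to flush the last result.
    for t in range(2 * n + 1):
        k, offset = divmod(t, 2)
        in_range = k < n
        # Input mux controlled by the offset index.
        if offset == 0:
            a, b = (ar[k], br[k]) if in_range else (0.0, 0.0)
        else:
            a, b = (ai[k], bi[k]) if in_range else (0.0, 0.0)
        x_now = a * b                  # the single shared multiplier
        if offset == 0:
            y_now = y_prev + -x_prev   # Yr[2k]   = Yr[2k-1] + -Xr[2k-1]
        else:
            y_now = 0.0 + x_prev       # Yr[2k+1] = 0.0 + Xr[2k]
        if offset == 0 and t >= 2:
            results.append(y_now)      # result for input pair k-1
        x_prev, y_prev = x_now, y_now
    return results


# Real part of (2+4j)*(3+5j) and (1+1j)*(1+1j)
print(shared_real_part([2, 1], [3, 1], [4, 1], [5, 1]))
```

In this model the completed result for each input pair appears on the following even cycle, after both products have passed through the shared adder.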

Clock Declaration

The clock, reset, and optional enable need to be defined for this block, as well as an optional index which is used for sharing operations. This declaration is not part of the signal processing description, but is required to define the clock operations.

clock_head alpha {

      clock  "clk"    posedge

      reset  "reset"  async     active_low

      enable "enable"

      index  "ind"    1

}

clock

The clock command defines the clock name and the edge which is used by the clock.

reset (optional)

The reset command defines the name of the reset along with whether it is synchronous and whether it is active high or low.

enable (optional)

The enable command defines the name of the enable.

index (optional)

The index command defines the name of an index which is used to control an input mux for sharing operations.