The VHDL module "multiply_cs" (see symbol)
calculates the signed product of a multiplicand and a multiplier.
It uses carry-save adders to achieve better timing.
In a carry-save adder, a carry is not propagated to the next bit of the same adder but is passed on to the next adder
of the multiplication adder structure.
As a result, the timing of the circuit does not depend on the width of the operands,
but only on the number of consecutive additions that are executed in one clock period.
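The carry-save principle described above can be sketched in a few lines of VHDL. This is a minimal illustration, not the code of cs_package: the package name cs_sketch, the function names cs_sum and cs_carry, and the representation of a carry-save number as two separate signed vectors are all assumptions made for this example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical package for illustration only (NOT the cs_package of multiply_cs).
package cs_sketch is
    -- 3:2 compression: three operands of equal width are reduced to a sum
    -- vector and a carry vector without any carry rippling through the bits.
    function cs_sum   (a, b, c : signed) return signed;
    function cs_carry (a, b, c : signed) return signed;
end package cs_sketch;

package body cs_sketch is
    function cs_sum (a, b, c : signed) return signed is
    begin
        return a xor b xor c;  -- each bit position is computed independently
    end function cs_sum;

    function cs_carry (a, b, c : signed) return signed is
        variable majority : signed(a'length - 1 downto 0);
    begin
        majority := (a and b) or (a and c) or (b and c);
        return shift_left(majority, 1);  -- a carry has the weight of the next bit
    end function cs_carry;
end package body cs_sketch;

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
use work.cs_sketch.all;

entity cs_sketch_tb is
end entity cs_sketch_tb;

architecture sim of cs_sketch_tb is
begin
    check : process
        constant x : signed(7 downto 0) := to_signed(23, 8);
        constant y : signed(7 downto 0) := to_signed(-5, 8);
        constant z : signed(7 downto 0) := to_signed(100, 8);
        variable s, c : signed(7 downto 0);
    begin
        s := cs_sum(x, y, z);
        c := cs_carry(x, y, z);
        -- Only this final conversion needs a carry-propagating adder.
        assert s + c = x + y + z report "carry-save mismatch" severity failure;
        wait;
    end process check;
end architecture sim;
```

The delay of cs_sum and cs_carry is that of a single full adder regardless of the operand width, which is why several such additions can be chained within one clock period. The carry-propagating addition `s + c` is needed only once, when the carry-save number is converted into a 2's complement number.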
The module "multiply_cs" has better timing than the modules "multiply" and "multiply_bsc", which are also available from this web site.
Note that if you are using an advanced synthesis tool such as Synopsys Design Compiler Ultra,
the "multiply_cs" design may not give better timing than the "multiply" design,
because Design Compiler Ultra already applies advanced arithmetic optimisations that implement fast addition structures.
The numbers of bits of the multiplicand and the multiplier are configured by generics.
Product, multiplicand and multiplier are numbers in 2's complement format.
The module uses flipflops only for storing the product (and for control).
For quick access to the multiplier bits, the multiplier is first stored in the product flipflops and
then replaced by the upcoming product bits through shift operations.
The latency of the module can be configured by a generic independently from the width of the operands.
This means, the module is configurable by generics in order
The module "multiply_cs" was developed with HDL-SCHEM-Editor.
Port name | Direction | Description |
---|---|---|
res_i | input | Asynchronous reset input, 1-active. Can be clamped to 0 when g_latency=0. |
clk_i | input | Clock input. Can be clamped to 0 when g_latency=0. |
start_i | input | Expects a 1-active pulse of 1 clock cycle width to start the calculation. When g_latency=0, back-to-back pulses can be used. |
multiplicand_i(g_multiplicand_width-1:0) | input | Signed multiplicand (g_multiplicand_width is a generic). The input must be stable during the calculation. |
multiplier_i(g_multiplier_width-1:0) | input | Signed multiplier (g_multiplier_width is a generic). The input is latched at start_i=1 and can be changed afterwards. |
ready_o | output | 1-active pulse of 1 clock cycle width, issued when the calculation is finished (at latency 0 it becomes active in the same clock cycle in which start_i becomes active). |
product_o(g_multiplicand_width+g_multiplier_width-1:0) | output | Signed product. Valid at ready_o=1. Not stable during calculation. |
Generic name | Minimum Value | Maximum Value | Description |
---|---|---|---|
g_multiplicand_width | 2 | none | Number of bits of the multiplicand. The first bit represents the sign, as the operands have to be coded in 2's complement. |
g_multiplier_width | 2 | none | Number of bits of the multiplier. The first bit represents the sign, as the operands have to be coded in 2's complement. |
g_latency_mul | 0 | none | Latency of the multiplication algorithm in clock cycles. When g_latency_mul is 0, the multiplication is a combinatorial design. |
g_latency_convert | 0 | 1 | Latency of the submodule multiply_cs_convert in clock cycles. This module converts the product from a carry-save number into a 2's complement number. When g_latency_convert is 0, multiply_cs_convert is a combinatorial design. |
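
The ports and generics listed above could be wired up in a testbench roughly as follows. This is only a sketch under several assumptions that the tables do not confirm: the data ports are assumed to be of type `signed` and the control ports of type `std_logic`, the downloaded VHDL files are assumed to be compiled into library `work`, and the testbench name, signal names and chosen generic values are hypothetical. If the real entity uses `std_logic_vector` ports, the signal declarations must be adapted.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical usage testbench; needs the multiply_cs sources in library work.
entity multiply_cs_usage_tb is
end entity multiply_cs_usage_tb;

architecture sketch of multiply_cs_usage_tb is
    constant c_multiplicand_width : positive := 8;
    constant c_multiplier_width   : positive := 8;
    signal done         : boolean   := false;
    signal res          : std_logic := '0';
    signal clk          : std_logic := '0';
    signal start        : std_logic := '0';
    signal ready        : std_logic;
    -- ASSUMPTION: data ports are of type signed.
    signal multiplicand : signed(c_multiplicand_width - 1 downto 0) := (others => '0');
    signal multiplier   : signed(c_multiplier_width - 1 downto 0)   := (others => '0');
    signal product      : signed(c_multiplicand_width + c_multiplier_width - 1 downto 0);
begin
    -- 100 MHz clock that stops when the stimulus process has finished.
    clk <= not clk after 5 ns when not done else '0';

    dut : entity work.multiply_cs
        generic map (
            g_multiplicand_width => c_multiplicand_width,
            g_multiplier_width   => c_multiplier_width,
            g_latency_mul        => 8,   -- 1 multiplier bit per clock cycle
            g_latency_convert    => 1)   -- registered carry-save conversion
        port map (
            res_i          => res,
            clk_i          => clk,
            start_i        => start,
            multiplicand_i => multiplicand,
            multiplier_i   => multiplier,
            ready_o        => ready,
            product_o      => product);

    stimulus : process
    begin
        res <= '1';
        wait for 20 ns;
        res <= '0';
        wait until rising_edge(clk);
        multiplicand <= to_signed(-7, c_multiplicand_width);  -- keep stable until ready
        multiplier   <= to_signed(13, c_multiplier_width);
        start        <= '1';                                  -- 1 clock cycle wide pulse
        wait until rising_edge(clk);
        start <= '0';
        wait until rising_edge(clk) and ready = '1';          -- product is valid here
        assert product = to_signed(-91, product'length)
            report "unexpected product" severity error;
        done <= true;
        wait;
    end process stimulus;
end architecture sketch;
```

The multiplicand is held constant until ready_o is seen, because the port description requires it to be stable during the calculation, while the multiplier could be changed right after the start pulse.
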
The module "multiply_cs" is a hierarchical module, which is built from 3 submodules and the package cs_package.
Submodule name | Functionality |
---|---|
cs_package | The package cs_package contains all type definitions and functions needed to handle "carry-save" numbers. |
multiply_cs_step | The "multiply_cs_step" module processes 1 bit of the multiplier. Depending on the multiplier bit processed, 0 or the multiplicand is added to the partial product. If the processed multiplier bit is the sign bit and has a value of 0, 0 is added to the partial product. |
multiply_cs_convert | The "multiply_cs_convert" module converts the product from a carry-save number into a 2's complement number. If g_latency_convert=0 then the "multiply_cs_convert" submodule is a combinatorial design. |
multiply_cs_control | The multiply_cs_control module generates all the control signals that are needed. |
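
The bit-by-bit processing, together with the idea of storing the multiplier in the product register and shifting it out, can be modelled by a small behavioural function. This is a sketch of the textbook shift-and-add method, not the actual code of multiply_cs_step: it uses ordinary adders instead of carry-save adders, the names shift_add_sketch and shift_add_multiply are hypothetical, and subtracting the multiplicand when the sign bit of the multiplier is 1 is an assumption based on the usual 2's complement scheme (the description above only states the case where the sign bit is 0).

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical behavioural model, NOT the code of multiply_cs_step.
package shift_add_sketch is
    function shift_add_multiply (multiplicand, multiplier : signed) return signed;
end package shift_add_sketch;

package body shift_add_sketch is
    function shift_add_multiply (multiplicand, multiplier : signed) return signed is
        constant m_width : positive := multiplicand'length;
        constant q_width : positive := multiplier'length;
        -- One guard bit on top avoids overflow of the partial product.
        variable reg   : signed(m_width + q_width downto 0);
        variable upper : signed(m_width downto 0);
    begin
        -- The multiplier starts out in the lower bits of the product register.
        reg := (others => '0');
        reg(q_width - 1 downto 0) := multiplier;
        for step in 0 to q_width - 1 loop
            upper := reg(m_width + q_width downto q_width);
            if reg(0) = '1' then
                if step = q_width - 1 then
                    -- ASSUMPTION: the sign bit has negative weight, so subtract.
                    upper := upper - resize(multiplicand, m_width + 1);
                else
                    upper := upper + resize(multiplicand, m_width + 1);
                end if;
            end if;
            -- The arithmetic shift drops the used multiplier bit and makes room
            -- for the next product bit.
            reg := shift_right(upper & reg(q_width - 1 downto 0), 1);
        end loop;
        return reg(m_width + q_width - 1 downto 0);
    end function shift_add_multiply;
end package body shift_add_sketch;

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
use work.shift_add_sketch.all;

entity shift_add_sketch_tb is
end entity shift_add_sketch_tb;

architecture sim of shift_add_sketch_tb is
begin
    check : process
    begin
        -- Exhaustive check for 4 bit x 4 bit operands.
        for a in -8 to 7 loop
            for b in -8 to 7 loop
                assert shift_add_multiply(to_signed(a, 4), to_signed(b, 4)) = to_signed(a * b, 8)
                    report "product mismatch" severity failure;
            end loop;
        end loop;
        wait;
    end process check;
end architecture sim;
```

Each loop iteration corresponds to one use of a step module. In hardware, several iterations may be chained within one clock cycle, depending on the configured latency.
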
There are no limitations for the generics g_multiplicand_width and g_multiplier_width (except that they must be greater than 1).
These generics are usually determined by the environment in which the module multiply_cs is used.
There is also no limitation for the generic g_latency_mul. However, this generic determines not only the latency but also
how difficult it will be to reach timing closure: the smaller the chosen value, the harder it will be to reach timing closure.
If timing closure cannot be reached and g_multiplicand_width is greater than g_multiplier_width, swapping
multiplicand and multiplier may be a solution. The reason is that the width of the adders used by the module multiply_cs_step depends
on g_multiplicand_width, and the smaller this number is, the faster the adders get.
If g_latency_mul is equal to g_multiplier_width, then in each clock cycle 1 bit of the multiplier is handled.
If g_latency_mul is smaller than g_multiplier_width, then in each clock cycle more than 1 bit of the multiplier is handled.
How many bits of the multiplier are handled per clock cycle can be calculated by rounding up g_multiplier_width/g_latency_mul to the next integer.
Note that handling more than 1 bit of the multiplier in a clock cycle may prevent reaching timing closure.
If g_latency_mul is bigger than g_multiplier_width, then the number of bits of the multiplier is internally increased to g_latency_mul and
again in each clock cycle 1 bit of the (extended) multiplier is handled. Of course this leads to an internal product register which has
the same number of additional bits as the multiplier.
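The three cases above can be summarised in two small helper functions. This is a sketch for illustration: the package and function names are hypothetical and not part of multiply_cs, the division is assumed to use the multiplier width (since it counts multiplier bits), and treating g_latency_mul=0 as "all multiplier bits in a single combinatorial pass" is an interpretation of the generic description.

```vhdl
-- Hypothetical helper package, not part of multiply_cs.
package latency_sketch is
    -- Number of multiplier bits handled in one clock cycle.
    function bits_per_cycle (multiplier_width : positive; latency_mul : natural) return positive;
    -- Width of the (possibly extended) multiplier used internally.
    function internal_multiplier_width (multiplier_width : positive; latency_mul : natural) return positive;
end package latency_sketch;

package body latency_sketch is
    function bits_per_cycle (multiplier_width : positive; latency_mul : natural) return positive is
    begin
        if latency_mul = 0 then
            return multiplier_width;  -- ASSUMPTION: combinatorial, all bits at once
        elsif latency_mul >= multiplier_width then
            return 1;                 -- multiplier is extended, 1 bit per cycle
        else
            -- Integer form of rounding up multiplier_width/latency_mul.
            return (multiplier_width + latency_mul - 1) / latency_mul;
        end if;
    end function bits_per_cycle;

    function internal_multiplier_width (multiplier_width : positive; latency_mul : natural) return positive is
    begin
        if latency_mul > multiplier_width then
            return latency_mul;       -- extended to one bit per clock cycle
        else
            return multiplier_width;
        end if;
    end function internal_multiplier_width;
end package body latency_sketch;

use work.latency_sketch.all;

entity latency_sketch_tb is
end entity latency_sketch_tb;

architecture sim of latency_sketch_tb is
begin
    check : process
    begin
        assert bits_per_cycle(8, 8)  = 1 report "case latency = width"  severity failure;
        assert bits_per_cycle(8, 3)  = 3 report "case latency < width"  severity failure;
        assert bits_per_cycle(8, 4)  = 2 report "case latency < width"  severity failure;
        assert bits_per_cycle(8, 12) = 1 report "case latency > width"  severity failure;
        assert internal_multiplier_width(8, 12) = 12 report "extension" severity failure;
        assert internal_multiplier_width(8, 4)  = 8  report "no extension" severity failure;
        wait;
    end process check;
end architecture sim;
```

For example, an 8 bit multiplier with g_latency_mul=3 gives 3 multiplier bits per clock cycle, so three additions have to fit into one clock period.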
Source code for HDL-SCHEM-Editor and HDL-FSM-Editor for module "multiply_cs" and its testbench.
With these files the schematics and the state-diagram of module multiply_cs can be loaded into HDL-SCHEM-Editor or
HDL-FSM-Editor and can be easily read and modified:
All module VHDL-files of the module "multiply_cs".
These files were generated by HDL-SCHEM-Editor and HDL-FSM-Editor:
All testbench VHDL-files of the module "multiply_cs".
These files were generated by HDL-SCHEM-Editor and HDL-FSM-Editor:
You should extract all archives into a folder named "multiply_cs".
Then you should load the toplevel (probably the testbench) into HDL-SCHEM-Editor.
When you navigate through the design hierarchy by double-clicking each symbol,
HDL-SCHEM-Editor will find the submodules on your disk and ask whether it may replace
the original path to the submodule with the new one on your disk.
After the changed modules have been stored, the relocation of the source files is complete
(alternatively, you can use a text editor to replace "M:/gesicherte Daten/Programmieren/VHDL/multiply_cs" in all
"hdl_editor_designs/*.hse" source files with your path to this directory).
Now you can navigate through the design with HDL-SCHEM-Editor and generate HDL with HDL-SCHEM-Editor for
all modules except multiply_cs_control, for which the HDL must be generated with HDL-FSM-Editor.
You only need to generate HDL if you change something in the modules, because the generated HDL is already contained in VHDL_designs.zip and VHDL_testbenches.zip.
If you want to simulate or modify the modules with HDL-SCHEM-Editor, you must also adapt the information in the Control-tab of the toplevel you want to work on.
There you must define a "Compile through hierarchy command", an "Edit command", the path to your HDL-FSM-Editor and a "Working directory".
Version 1.0 (27.04.2025):
If you detect any bugs or have any questions,
please send a mail to "matthias.schweikart@gmx.de".