The module "multiply"
a VHDL implementation of the multiplication algorithm

The VHDL module "multiply" (see symbol) calculates the signed product of a multiplicand and a multiplier.

The bit widths of the multiplicand and the multiplier are configured by generics.
Product, multiplicand and multiplier are numbers in 2's complement format.
The module uses flipflops only for storing the product (and for controlling).
For quick access to the multiplier bits, the multiplier is first stored in the product flipflops and
then replaced by the upcoming product bits through shift operations.
The latency of the module can be configured by a generic, independently of the width of the operands.
The module has 2 architectures: one which implements the classic written multiplication algorithm and a second which uses the VHDL multiplication operator '*'.

This means, the module is configurable by generics in order

But of course there is no guarantee that timing closure can be reached with the selected values
for the generics, as the timing depends on the technology which is used at synthesis.

The module "multiply" was developed with HDL-SCHEM-Editor.

Ports:

| Port name | Direction | Description |
|---|---|---|
| res_i | input | Asynchronous reset input, 1-active. Can be tied to 0 when g_latency=0. |
| clk_i | input | Clock input. Can be tied to 0 when g_latency=0. |
| start_i | input | Expects a 1-active pulse of 1 clock cycle width in order to start the calculation. When g_latency=0, back-to-back pulses can be used. |
| multiplicand_i(g_multiplicand_width-1:0) | input | Signed multiplicand (g_multiplicand_width is a generic). The input must be stable during the calculation. |
| multiplier_i(g_multiplier_width-1:0) | input | Signed multiplier (g_multiplier_width is a generic). The input is latched when start_i=1 and can be changed afterwards. |
| ready_o | output | 1-active pulse of 1 clock cycle width when the calculation is finished (at latency 0 it becomes active in the same clock cycle in which start_i becomes active). |
| product_o(g_multiplicand_width+g_multiplier_width-1:0) | output | Signed product. Valid when ready_o=1. Not stable during the calculation. |

Generics:

| Generic name | Minimum value | Maximum value | Description |
|---|---|---|---|
| g_multiplicand_width | 2 | none | Number of bits of the multiplicand. The most significant bit represents the sign, as the operands must be coded in 2's complement. |
| g_multiplier_width | 2 | none | Number of bits of the multiplier. The most significant bit represents the sign, as the operands must be coded in 2's complement. |
| g_latency | 0 | none | Latency of the module in clock cycles. When g_latency is 0, the module multiply is a combinatorial design. |
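
The width of product_o, g_multiplicand_width+g_multiplier_width, is sufficient for every pair of 2's complement operands. The following Python sketch only illustrates this arithmetic fact; the function names are hypothetical and not part of the module.

```python
def signed_range(width: int) -> tuple[int, int]:
    """Smallest and largest value of a 2's complement number with `width` bits."""
    return -(1 << (width - 1)), (1 << (width - 1)) - 1


def product_fits(multiplicand_width: int, multiplier_width: int) -> bool:
    """Check that every product fits into multiplicand_width + multiplier_width bits."""
    lo_a, hi_a = signed_range(multiplicand_width)
    lo_b, hi_b = signed_range(multiplier_width)
    # The extreme products occur at the corners of the two operand ranges.
    corners = [a * b for a in (lo_a, hi_a) for b in (lo_b, hi_b)]
    lo_p, hi_p = signed_range(multiplicand_width + multiplier_width)
    return lo_p <= min(corners) and max(corners) <= hi_p


# Minimum widths allowed by the generics: (-2) * (-2) = 4 fits into 4 bits (-8..7).
print(product_fits(2, 2))  # True
```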

The module "multiply" is a hierarchical module, which is built from two submodules.

Submodule name Functionality
multiply_step

The module multiply_step processes 1 bit of the multiplier.
It is instantiated as many times as multiplier bits are processed during 1 clock cycle (which depends on the generic g_latency).

Depending on the processed multiplier bit, 0 or the multiplicand is added to the partial product.

When the processed multiplier bit is the sign bit and has the value 0, 0 is added to the partial product.
When the processed multiplier bit is the sign bit and has the value 1, the multiplicand is subtracted from the partial product.
This subtraction compensates for the error that arises when the bits of a negative multiplier are handled as if the multiplier were positive.

multiply_control

The multiply_control module generates all the control signals which are needed.
It enables the internal registers for the intermediate or final results.
It identifies the clock period in which the sign bit of the multiplier is handled.
It activates the ready_o output at the end of the calculation.
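
The handling of the multiplier bits described for multiply_step can be illustrated by a small behavioral model. The following Python sketch is not the VHDL source of the module: the function names are hypothetical, and the model shifts the multiplicand to the left instead of shifting the product register as the hardware does, which gives the same arithmetic result.

```python
def to_signed(value: int, width: int) -> int:
    """Interpret the lowest `width` bits of `value` as a 2's complement number."""
    value &= (1 << width) - 1
    return value - (1 << width) if value & (1 << (width - 1)) else value


def multiply_model(multiplicand: int, multiplier: int,
                   multiplicand_width: int, multiplier_width: int) -> int:
    """Signed multiplication which processes 1 multiplier bit per step."""
    a = to_signed(multiplicand, multiplicand_width)
    multiplier_bits = multiplier & ((1 << multiplier_width) - 1)
    product = 0
    for i in range(multiplier_width):
        bit = (multiplier_bits >> i) & 1
        addend = (a << i) if bit else 0  # 0 or the (weighted) multiplicand
        if i == multiplier_width - 1:
            # Sign bit of the multiplier: it has the weight -2**i,
            # so the multiplicand is subtracted instead of added.
            product -= addend
        else:
            product += addend
    return product


# (-3) * (-2) with a 4 bit multiplicand and a 3 bit multiplier
print(multiply_model(-3, -2, 4, 3))  # 6
```

For the multiplier -2 (bit pattern 110) the model first adds -3*2 = -6 and then, at the sign bit, subtracts -3*4 = -12, which results in 6.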

The module multiply has 2 architectures named "struct" and "fpga".

The architecture "struct" implements a signed multiplication algorithm whilst the architecture "fpga" uses the VHDL multiplication operator.
Because the architecture "fpga" also depends on the generics, it has the same latency and of course the same interface as the architecture "struct".

Using the architecture "fpga" usually makes sense only in the following scenario:
An ASIC is designed which uses the architecture "struct". But the ASIC is also implemented as an FPGA prototype.
Then the architecture "fpga" can be used in the FPGA prototype, which often makes reaching timing closure in the FPGA
much easier, because the FPGA flow inserts a DSP block for the multiplication when it finds the VHDL multiplication operator '*'.

There are no limitations for the generics g_multiplicand_width and g_multiplier_width (except that they must be bigger than 1).
These generics are usually determined by the environment in which the module multiply is used.

There is also no limitation for the generic g_latency. But this generic determines not only the latency but also
how difficult it will be to reach timing closure: the smaller the chosen value, the harder it will be to reach timing closure.

If timing closure cannot be reached and g_multiplicand_width is bigger than g_multiplier_width, it may be a solution to swap
multiplicand and multiplier. The reason is that the width of the adders which are used by the module multiply depends
on g_multiplicand_width, and the smaller this number is, the faster the adders get.

If g_latency is equal to g_multiplier_width, then in each clock cycle 1 bit of the multiplier is handled.

If g_latency is smaller than g_multiplier_width, then in each clock cycle more than 1 bit of the multiplier is handled.
How many bits of the multiplier are handled can be calculated by rounding up g_multiplier_width/g_latency to the next integer.
Be aware that handling more than 1 bit of the multiplier in a clock cycle may prevent reaching timing closure.
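
The number of multiplier bits per clock cycle can be estimated with a few lines of Python. This is only a sketch with a hypothetical function name. It assumes that the bit count is derived from the multiplier width (which matches the case g_latency = g_multiplier_width, where 1 bit per clock cycle is handled) and that at g_latency = 0 all multiplier bits are handled by the combinatorial logic at once.

```python
def bits_per_clock_cycle(multiplier_width: int, latency: int) -> int:
    """Number of multiplier bits handled in 1 clock cycle."""
    if multiplier_width < 2 or latency < 0:
        raise ValueError("multiplier_width must be >= 2 and latency must be >= 0")
    if latency == 0:
        # Assumption: a combinatorial design handles all bits at once.
        return multiplier_width
    # Round up multiplier_width / latency to the next integer.
    return -(-multiplier_width // latency)


print(bits_per_clock_cycle(8, 8))  # 1
print(bits_per_clock_cycle(8, 3))  # 3
```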

If g_latency is bigger than g_multiplier_width, then the number of bits of the multiplier is internally increased to g_latency and
again in each clock cycle 1 bit of the (extended) multiplier is handled. Of course this leads to an internal product register which has
the same number of additional bits as the multiplier.
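
The extension of the multiplier can be sketched in Python as follows. The function names are hypothetical, and the sketch assumes that the multiplier is extended by repeating its sign bit, which leaves the value of a 2's complement number unchanged; the text above only states that the number of bits is increased.

```python
def extended_multiplier_width(multiplier_width: int, latency: int) -> tuple[int, int]:
    """Return (internal multiplier width, number of additional bits)."""
    internal_width = max(multiplier_width, latency)
    return internal_width, internal_width - multiplier_width


def sign_extend(bits: int, width: int, new_width: int) -> int:
    """Extend a `width` bit 2's complement pattern to `new_width` bits."""
    if new_width < width:
        raise ValueError("new_width must not be smaller than width")
    bits &= (1 << width) - 1
    if bits & (1 << (width - 1)):
        # Negative number: fill all additional upper bits with 1.
        bits |= ((1 << new_width) - 1) & ~((1 << width) - 1)
    return bits


def signed_value(bits: int, width: int) -> int:
    """Value of a `width` bit 2's complement pattern."""
    return bits - (1 << width) if bits & (1 << (width - 1)) else bits


# A 3 bit multiplier with g_latency = 6 gets 3 additional bits.
print(extended_multiplier_width(3, 6))       # (6, 3)
print(bin(sign_extend(0b110, 3, 6)))         # 0b111110
print(signed_value(sign_extend(0b110, 3, 6), 6))  # -2
```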

[Images: symbol]

Source code for HDL-SCHEM-Editor and HDL-FSM-Editor for module "multiply" and its testbench.
With these files the schematics and the state-diagram of module multiply can be loaded into HDL-SCHEM-Editor or HDL-FSM-Editor and can be easily read and modified:

All module VHDL-files of the module "multiply".
These files were generated by HDL-SCHEM-Editor and HDL-FSM-Editor:

All testbench VHDL-files of the testbench of the module "multiply".
These files were generated by HDL-SCHEM-Editor and HDL-FSM-Editor:

Relocation hints:

You should extract all archives into a folder named "multiply".
Then you must replace "M:/gesicherte Daten/Programmieren/VHDL/multiply" in all "hdl_editor_designs/*.hse" source files with your path to this directory.
Now you can navigate through the design with HDL-SCHEM-Editor and generate HDL with HDL-SCHEM-Editor for all modules except multiply_control,
for which the HDL must be generated with HDL-FSM-Editor.
Of course you only need to generate HDL if you change something in the modules, because you can find the HDL in VHDL_designs.zip and VHDL_testbenches.zip.
If you want to simulate or modify the modules with HDL-SCHEM-Editor, you must also adapt the information in the Control-tab of the toplevel you want to work on.
There you must define a "Compile through hierarchy command", an "Edit command", the path to your HDL-FSM-Editor and a "Working directory".
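
The replacement of the path in the "hdl_editor_designs/*.hse" files can also be done by a small script. The following Python sketch uses a hypothetical function name and assumes that the path is stored in the files exactly as written above (with forward slashes, UTF-8 encoded). It works on bytes, so that the line endings of the files are not changed.

```python
from pathlib import Path

# Path used in the published *.hse files (taken from the hints above).
OLD_PATH = "M:/gesicherte Daten/Programmieren/VHDL/multiply"


def relocate_designs(project_dir: str, new_path: str, old_path: str = OLD_PATH) -> int:
    """Replace old_path by new_path in all hdl_editor_designs/*.hse files.

    Returns the number of files which were modified.
    """
    old_bytes = old_path.encode("utf-8")  # assumption: files are UTF-8 encoded
    new_bytes = new_path.encode("utf-8")
    changed = 0
    for hse_file in sorted(Path(project_dir, "hdl_editor_designs").glob("*.hse")):
        content = hse_file.read_bytes()
        if old_bytes in content:
            hse_file.write_bytes(content.replace(old_bytes, new_bytes))
            changed += 1
    return changed


# Usage example (the target path is only an illustration):
# relocate_designs("C:/work/multiply", "C:/work/multiply")
```

It is advisable to keep a copy of the extracted archives before running such a script.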

Change log:

Version 1.0 (05.04.2024):

If you detect any bugs or have any questions,
please send a mail to "matthias.schweikart@gmx.de".