The VHDL module "square_root" (see symbol) calculates the root of a radicand by an exact algorithm.
The module is configurable by generics in order
The differences to the module "cordic_square_root" (also available at this website) are (when both modules create the same number of root bits):
The module "square_root" was developed with HDL-SCHEM-Editor.
Port name | Direction | Description |
---|---|---|
res_i | input | asynchronous reset input, 1-active |
clk_i | input | clock input |
start_i | input | This input expects an 1-active impulse of 1 clock cycle width in order to start the calculation. |
radicand_i(g_radicand_width-1:0) | input | Unsigned radicand (g_radicand_width is a generic) The input value is latched at start=1. |
ready_o | output | 1-active impulse of 1 clock cycle width, when the calculation is ready (at latency 0 it gets active in the same clock cycle in which start_i gets active). |
root_o (g_radicand_width/2-1+ g_additional_root_bits+g_radicand_width mod 2:0) |
output | Unsigned root (with g_additional_root_bits beneath the binary point) Valid at ready_o=1 Not stable between 2 impulses at ready_o |
Generic name | Minimum Value | Maximum Value | Description |
---|---|---|---|
g_radicand_width | 1 | n/a | Number of bits of the radicand |
g_additional_root_bits | 0 | n/a | Number of additional root bits after the binary point |
g_latency | 0 | n/a | Latency of the module |
The module "square_root" is a hierarchical module, which is built by 2 submodules.
Submodule name | Functionality |
---|---|
square_root_control |
This module enables the internal registers and controls the output ready_o. |
square_root_step |
The combinatorial submodule square_root_step executes 1 iteration of the algorithm. The module is instantiated several times, when more than 1 iteration has to be executed in 1 clock cycle. |
The generic g_latency works in this way:
As each iteration calculates 1 bit of the output root_o, as many iterations are needed as root_o has bits.
If g_latency="number of root_o bits"=(g_radicand_width+g_radicand_width mod 2)/2+g_additional_root_bits then
1 iteration has to be done in 1 clock cycle, so the submodule square_root_by_cordic_step is only instantiated once.
If g_latency>(g_radicand_width+g_radicand_width mod 2)/2+g_additional_root_bits then again
1 iteration is done in 1 clock cycle and the submodule square_root_by_cordic_step is only instantiated once.
In this case extra root bits are calculated but not propagated to the output.
If g_latency<(g_radicand_width+g_radicand_width mod 2)/2+g_additional_root_bits then
more than 1 iteration has to be done in 1 clock cycle, so the submodule square_root_by_cordic_step is instantiated several times.
If (g_radicand_width+g_radicand_width mod 2)/2+g_additional_root_bits is not dividable without remainder by g_latency, then
again more than the needed number of root bits are calculated.
The integer part of the calculated root at output root_o is always smaller than or equal to the exact root.
The accuracy can be increased by setting g_additional_root_bits to a value different from 0,
which creates additional bits at root_o beneath the binary point.
The difference between the exact root and the calculated root at output root_o is always less than 2**(-g_additional_root_bits).
In comparison to the module "cordic_square_root" (also available at this website) the module "square_root" gives better accuracy,
especially when g_radicand_width is bigger than 10 bits (when cordic_square_root and square_root are configured in a way that
they generate the same number of root bits).
This module calculates the square root of a positive integer (radicand).
As the calculated root has a limited number of bits, it cannot represent the exact square root of the radicand (except when the radicand itself is a square number).
But the calculated root will always exactly fulfill this equation:
radicand = root**2 + remainder (0)
where remainder is always positive or zero and is made as small as possible.
To find the smallest remainder, the algorithm sets each bit of the root to 1 on a trial basis (starting at the most significand bit, MSB) and
checks the sign of the resulting remainder after each set bit.
When the remainder is still positive, then setting the bit was correct and can be kept.
The algorithm starts with root=0 and remainder=radicand, which is always correct (step i=0, see below).
At the next step the MSB of root is set and the sign of the new remainder is determined by calculating the difference radicand-root**2.
When the sign of the difference is positive, then setting the bit to 1 was correct, otherwise the bit must be reset to 0.
Afterwards this procedure is done with each next bit of root, until the last bit of root is determined.
Calculating the difference radicand-root**2 is not so easy, as the multiplication root*root is involved.
But this multiplication can be avoided:
The radicand has 2*n bits (if g_radicand_width is an odd number, an additional MSB with value 0 is added to the radicand),
where n is the number of bits of the root.
When in step i (i=1..n) of the algorithm a next bit of the root is determined, then
the previous result root[i-1] and remainder[i-1] already fulfill equation 0:
radicand = root[i-1]**2 + remainder[i-1] (1)
After step i the new determined root(i) must also fulfill equation 0:
radicand = root[i]**2 + remainder[i] (2)
As the sign of remainder[i] must be checked, the equation 2 is resolved to remainder[i]:
remainder[i] = radicand - root[i]**2 (3)
The term root[i] is equal to root[i-1] with the bit in evaluation set to 1 (at step i=1 the MSB is set) which is equivalent to this addition:
root[i] = root[i-1] + 2**(n-i)
Now in equation 3 the term root[i] can be replaced:
remainder[i] = radicand - (root[i-1] + 2**(n-i))**2 (4)
remainder[i] = radicand - root[i-1]**2 - 2*root[i-1]*2**(n-i) - 2**(2*n-2*i) (5)
The difference "radicand - root[i-1]**2" is known from equation 1 and has the value remainder[i-1]:
remainder[i] = remainder[i-1] - 2*root[i-1]*2**(n-i) - 2**(2*n-2*i)
remainder[i] = remainder[i-1] - ( root[i-1]*2**(n-i+1) + 2**(2*n-2*i)) (6)
Adding 2**(2*n-2*i) to the product root[i-1]*2**(n-i+1) means that bit (2*n-2*i) of the product must be set to 1.
The product root[i-1]*2**(n-i+1) means that root[i-1] must be shifted left in each step by a decreasing number (n,n-1,...,1) of bits.
Instead of implementing these different left shifts, root[0] is implemented as a signal with 2n bits, which is already shifted left by n bits and has all bits at value 0.
If in step i=1 the MSB of root has to be set, then root[1] is calculated by setting the MSB of root[0] (which has 2n bits) to 1.
Otherwise root[1] is identical to root[0].
Then root[1] is shifted 1 bit to the right for the next step, this is the same as shifting left by (n-2+1) bits in the next step.
In the next step when the next bit of root has to be set, the shifted position of root has to be taken into account.
The index of the next bit of root to be set is determined by 2*n-1-2*(i-1).
Again the new calculated root is shift right by 1 bit at last.
After the last step root was shifted right by 1 bit so often, that it is shifted to the correct position.
At each step the new calculated remainder can only be used, if it is positive, otherwise the remainder is kept unchanged.
Source code for HDL-SCHEM-Editor and HDL-FSM-Editor for module "square_root" and its testbenches (Number of downloads =
17 ).
With these files the schematics and the state-diagram can be loaded into HDL-SCHEM-Editor or HDL-FSM-Editor and can be easily read and modified:
All module VHDL-files of the module "square_root" (Number of downloads =
17 ).
These files were generated by HDL-SCHEM-Editor and HDL-FSM-Editor:
All testbench VHDL-files of the 2 testbenches of the module "square_root" (Number of downloads =
18 ).
These files were generated by HDL-SCHEM-Editor and HDL-FSM-Editor:
You should extract all archives into a folder named "square_root".
Then you must replace "M:/gesicherte Daten/Programmieren/VHDL/square_root" in all "hdl_editor_designs/*.hse" source files by your path to this directory.
Now you can navigate through the design by HDL-SCHEM-Editor and generate HDL by HDL-SCHEM-Editor for all modules except square_root_control,
for which the HDL must be generated by HDL-FSM-Editor.
Of course there is only need for generating HDL, if you change something at the modules, because you can find the HDL in VHDL_designs.zip and VHDL_testbenches.zip.
If you want to simulate or modify the modules by HDL-SCHEM-Editor you also must adapt the information in the Control-tab of the toplevel you want to work on.
There you must define a "Compile through hierarchy command", an "Edit command", the path to your HDL-FSM-Editor and a "Working directory".
Version 1.0 (13.01.2025):
If you detect any bugs or have any questions,
please send a mail to "matthias.schweikart@gmx.de".