The module "cordic_square_root"
a VHDL implementation of the Hyperbolic Cordic Algorithm

The VHDL module "cordic_square_root" (see symbol) calculates the square root of a radicand.
The radicand is converted into a vector with an x-component and a y-component.
This vector is then rotated onto the x-axis by the Hyperbolic Cordic Algorithm.
Because the x- and y-components are built from the radicand in a special way, the x-component of the rotated vector is equal to the root.
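
The idea can be illustrated with a small floating-point model. It is only a sketch of the textbook hyperbolic Cordic in vectoring mode, not a description of the VHDL code: the start values x = r + 1/4 and y = r - 1/4, the repeated iterations 4, 13, 40, ... and the software gain compensation are assumptions taken from the standard algorithm, and the function names are hypothetical. With these start values x*x - y*y = r, so after y has been rotated to 0 the x-component is the root multiplied by the Cordic gain.

```python
import math

def hyperbolic_iteration_indices(count):
    """Return `count` shift indices 1, 2, 3, 4, 4, 5, ... (4, 13, 40, ... are repeated)."""
    indices = []
    i = 1
    repeat = 4
    while len(indices) < count:
        indices.append(i)
        if i == repeat and len(indices) < count:
            indices.append(i)          # repeated iteration keeps the algorithm convergent
            repeat = 3 * repeat + 1
        i += 1
    return indices

def cordic_sqrt(radicand, iterations=16):
    """Square root by hyperbolic Cordic; radicand must lie roughly between 0.03 and 2.3."""
    x = radicand + 0.25                # assumed textbook start vector
    y = radicand - 0.25
    gain = 1.0
    for i in hyperbolic_iteration_indices(iterations):
        factor = 2.0 ** -i
        if y < 0:                      # rotate so that y moves towards 0
            x, y = x + y * factor, y + x * factor
        else:
            x, y = x - y * factor, y - x * factor
        gain *= math.sqrt(1.0 - factor * factor)
    return x / gain                    # remove the accumulated Cordic gain

if __name__ == "__main__":
    for r in (0.25, 0.5, 1.0, 2.0):
        print(r, cordic_sqrt(r), math.sqrt(r))
```

The limited convergence range of this model is one reason why a radicand has to be brought into a suitable range before the rotation.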

The module executes at least (g_radicand_width + g_radicand_width mod 2)/2 Hyperbolic Cordic micro-rotations.
The module calculates not only the integer part of the root but also bits after the binary point.
The module extends the internal operands by additional least significant bits if the generic g_improved_accuracy is set to true.
The latency of the module can be configured (by generics) independently of the number of micro-rotations.

This means, the module is configurable by generics in order

But of course there is no guarantee that timing closure can be reached with the selected values
for the generics, as the timing depends on the technology which is used at synthesis.

The differences to the module "square_root" (also available at this website) are (when both modules create the same number of root bits):

The module "cordic_square_root" was developed with HDL-SCHEM-Editor.

Ports:

Port name | Direction | Description
res_i | input | Asynchronous reset input, 1-active.
clk_i | input | Clock input.
start_i | input | Expects a 1-active pulse of 1 clock cycle width in order to start the calculation.
radicand_i(g_radicand_width-1:0) | input | Unsigned positive radicand (g_radicand_width is a generic). The input value is latched at start_i=1.
ready_o | output | 1-active pulse of 1 clock cycle width, which gets active when the calculation is ready (at latency 0 it gets active in the same clock cycle in which start_i gets active).
square_root_o(g_radicand_width/2+g_radicand_width mod 2:0) | output | Unsigned positive root of the radicand. Valid at ready_o=1. Stable between 2 pulses at ready_o if g_latency_shift_back!=0.
square_root_fract_o(g_radicand_width/2-1+g_radicand_width mod 2:0) | output | Unsigned positive root bits after the binary point. Valid at ready_o=1. Stable between 2 pulses at ready_o if g_latency_shift_back!=0.

Generics:

Generic name | Minimum value | Maximum value | Description
g_radicand_width | 1 | 1328 | Number of bits of the radicand. The maximum is caused by the constant c_mod_value_unsigned (module cordic_square_root_prepare_operands). This limit only holds for g_improved_accuracy=false; otherwise it is slightly smaller.
g_improved_accuracy | false | true | If set to true, additional least significant bits are used for the internal adders. The number of additional bits is calculated as log2(g_radicand_width).
g_latency_shift_radicand | 0 | 1 | Latency of the sub-module which shifts the radicand left to make it as big as possible. Values bigger than 1 are handled as 1.
g_latency_prepare | 0 | 1 | Latency of the sub-module which calculates the x- and the y-component from the radicand. Values bigger than 1 are handled as 1.
g_latency_rotate_by_cordic | 0 | | Latency of the sub-module which implements the cordic algorithm. The recommended maximum value is (g_radicand_width + g_radicand_width mod 2)/2, which is the minimal number of iterations the algorithm uses.
g_latency_shift_back | 0 | 1 | Latency of the sub-module which shifts the root to the right to compensate the left shift. Values bigger than 1 are handled as 1.
g_latency_shorten_vector | 0 | 1 | Latency of the sub-module which shortens the vector to its original length. Values bigger than 1 are handled as 1.

The module "cordic_square_root" is a hierarchical module, which is built from several submodules.

Submodule name Functionality
cordic_square_root_shift_radicand

The Hyperbolic Cordic Algorithm handles small radicands with poor accuracy.
Therefore this module shifts every radicand left as far as possible.
The number of shift positions must be even, because the calculated root has to be shifted
to the right by half of this number.

The latency of the module can be configured to be 0 (combinatorial circuit) or 1 (sequential circuit with 1 flipflop stage).

cordic_square_root_prepare_operands

The Hyperbolic Cordic Algorithm rotates a vector which has an x- and a y-component.
These components are calculated here from the radicand.
If g_improved_accuracy is true, then additional least significant bits are added.

The latency of the module can be configured to be 0 (combinatorial circuit) or 1 (sequential circuit with 1 flipflop stage).

cordic_square_root_rotate

This module is built from the two submodules cordic_square_root_control (small FSM) and cordic_square_root_rotate_step and from additionally necessary glue logic.

The combinatorial submodule cordic_square_root_rotate_step executes 1 micro-rotation of the Hyperbolic Cordic Algorithm. Which micro-rotation is executed is selected by an input which is controlled by the submodule cordic_square_root_control. As the submodule executes only 1 micro-rotation, it has to be instantiated several times and/or reused several times.

The exact structure of the submodule cordic_square_root_rotate is automatically determined by the generics g_radicand_width and g_latency_rotate_by_cordic. The module contains the infrastructure to reuse the submodule cordic_square_root_rotate_step several times.

The latency of the module can be configured to be 0 (combinatorial circuit) or different from 0 (sequential circuit). When the configured latency is 0 or 1, then for each micro-rotation a submodule cordic_square_root_rotate_step is instantiated. When the latency is identical to the minimal number of micro-rotations the algorithm uses, then the submodule cordic_square_root_rotate_step is instantiated only once. See tab "Configuration of latency" for some examples.

cordic_square_root_shift_back

The shift left of the radicand is undone by a shift right of the calculated root here.
The calculated root is split into an integer and a fractional part here.

The latency of the module can be configured to be 0 (combinatorial circuit) or 1 (sequential circuit with 1 flipflop stage).
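
The interplay of cordic_square_root_shift_radicand and cordic_square_root_shift_back can be sketched with plain integers. This is only an illustration under assumptions: the function names are hypothetical, and math.isqrt stands in for the Cordic rotation, so the sketch shows the shift arithmetic and not the accuracy of the real module. A left shift of the radicand by an even number s multiplies the root by 2^(s/2), which is undone by a right shift of s/2.

```python
import math

def normalize_radicand(radicand, width):
    """Shift left by the largest even amount that still fits into `width` bits."""
    if radicand == 0:
        return 0, 0
    leading_zeros = width - radicand.bit_length()
    shift = leading_zeros - (leading_zeros % 2)   # round down to an even number
    return radicand << shift, shift

def shift_back(root_fixed, shift, fract_bits):
    """Undo the normalization and split the root into integer and fractional bits.

    root_fixed is the root of the shifted radicand with `fract_bits` bits
    after the binary point.
    """
    root = root_fixed >> (shift // 2)             # half of the radicand shift
    integer_part = root >> fract_bits
    fract_part = root & ((1 << fract_bits) - 1)
    return integer_part, fract_part

if __name__ == "__main__":
    width, fract_bits = 8, 4
    shifted, shift = normalize_radicand(5, width)
    # Stand-in for the Cordic result: root of the shifted radicand, 4 fractional bits.
    root_fixed = math.isqrt(shifted << (2 * fract_bits))
    print(shifted, shift, shift_back(root_fixed, shift, fract_bits))
```

For the radicand 5 with 8 bits the sketch shifts by 4 positions and later shifts the root back by 2 positions.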

The generics g_latency_shift_radicand, g_latency_prepare and g_latency_shift_back
all work in the same way: the value 0 makes the submodule a combinatorial circuit, the value 1 adds 1 flipflop stage, and values bigger than 1 are handled as 1.

Which value to use for each of these generics depends on the connected logic and the technology used,
and must be derived from the synthesis timing reports and from the requirements.

The generic g_latency_rotate_by_cordic for the submodule cordic_square_root_rotate works in a different way:

If g_latency_rotate_by_cordic=(g_radicand_width + g_radicand_width mod 2)/2, then the latency is equal to the number of needed micro-rotations
and exactly 1 micro-rotation is done in each clock cycle, so the submodule cordic_square_root_rotate_step is instantiated only once.

If g_latency_rotate_by_cordic is bigger than (g_radicand_width + g_radicand_width mod 2)/2, then additional micro-rotations are executed,
but again the submodule cordic_square_root_rotate_step is instantiated only once.

If g_latency_rotate_by_cordic is smaller than (g_radicand_width + g_radicand_width mod 2)/2, then more than 1 micro-rotation has to be done in 1 clock cycle,
and the submodule cordic_square_root_rotate_step is instantiated several times. When (g_radicand_width + g_radicand_width mod 2)/2 is not divisible without
remainder by g_latency_rotate_by_cordic, then again additional micro-rotations are added.
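
The three cases can be summarized in a small helper. It is a sketch derived from the rules above, not taken from the VHDL code: the function name is hypothetical, and it assumes that the number of instances is the needed number of micro-rotations divided by the latency and rounded up, so that the total number of micro-rotations is instances times latency.

```python
def rotate_structure(radicand_width, latency):
    """Return (instances of rotate_step, total micro-rotations) for a latency setting."""
    needed = (radicand_width + radicand_width % 2) // 2
    if latency <= 1:
        # Latency 0 or 1: one instance per micro-rotation.
        return needed, needed
    if latency >= needed:
        # One micro-rotation per clock cycle, extra cycles add micro-rotations.
        return 1, latency
    # Several micro-rotations per clock cycle (assumed: rounded up).
    per_cycle = -(-needed // latency)
    return per_cycle, per_cycle * latency

if __name__ == "__main__":
    for latency in (0, 1, 3, 4, 8, 10):
        print(latency, rotate_structure(16, latency))
```

For g_radicand_width=16 (8 needed micro-rotations) and a latency of 3 this model yields 3 instances and 9 micro-rotations, one more than needed.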

An exact result would require an infinite number of micro-rotations, which is not possible.

The module calculates the root by (g_radicand_width + g_radicand_width mod 2)/2 micro-rotations.
If you increase the number of bits of the radicand, the results are improved not only by the increased number of micro-rotations,
but also by the increased number of bits of square_root_fract_o.

In order to increase accuracy without increasing the number of bits of the radicand you can set the generic g_improved_accuracy to "true".
Then the module rotate calculates how many additional bits should be used internally at the adders to improve accuracy.
The number of micro-rotations is not increased.
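
Why additional least significant bits help can be shown with an integer model of the micro-rotations. This is a sketch under assumptions and not the implementation of the module: the function name, the start values r + 1/4 and r - 1/4, the iteration sequence and the floating-point gain compensation are chosen for illustration only. Every right shift truncates, and with guard bits these truncation errors are collected below the bits which are finally kept.

```python
import math

def cordic_sqrt_fixed(radicand, fract_bits, guard_bits, indices):
    """Integer hyperbolic Cordic (vectoring mode) with truncating right shifts.

    All internal values are integers scaled by 2**(fract_bits + guard_bits).
    radicand is a float inside the convergence range of the algorithm.
    """
    scale_bits = fract_bits + guard_bits
    r = int(round(radicand * (1 << scale_bits)))
    quarter = 1 << (scale_bits - 2)
    x = r + quarter                    # assumed textbook start vector
    y = r - quarter
    gain = 1.0
    for i in indices:
        dx = y >> i                    # arithmetic shift, lower bits are lost
        dy = x >> i
        if y < 0:
            x, y = x + dx, y + dy
        else:
            x, y = x - dx, y - dy
        gain *= math.sqrt(1.0 - 4.0 ** -i)
    x >>= guard_bits                   # drop the guard bits again
    return x / (1 << fract_bits) / gain

if __name__ == "__main__":
    indices = [1, 2, 3, 4, 4, 5, 6, 7]           # 8 micro-rotations
    exact = math.sqrt(0.7)
    for guard_bits in (0, 4):                    # 4 corresponds to log2(16)
        result = cordic_sqrt_fixed(0.7, 8, guard_bits, indices)
        print(guard_bits, result, abs(result - exact))
```

The number of micro-rotations is the same in both runs; only the width of the internal operands differs.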

In comparison to the module "square_root" (also available at this website) the module "cordic_square_root" gives poorer accuracy,
especially when g_radicand_width is bigger than 10 bits (cordic_square_root and square_root must be configured in a way that
they generate the same number of root bits).


Source code for HDL-SCHEM-Editor and HDL-FSM-Editor for module "cordic_square_root" and its testbench.
With these files the schematics and the state-diagram can be loaded into HDL-SCHEM-Editor or HDL-FSM-Editor and can be easily read and modified:

All module VHDL-files of the module "cordic_square_root".
These files were generated by HDL-SCHEM-Editor and HDL-FSM-Editor:

All VHDL-files of the testbench of the module "cordic_square_root".
These files were generated by HDL-SCHEM-Editor and HDL-FSM-Editor:

Relocation hints:

You should extract all archives into a folder named "cordic_square_root".
Then you must replace "M:/gesicherte Daten/Programmieren/VHDL/cordic_square_root" in all "hdl_editor_designs/*.hse" source files with your path to this folder.
Now you can navigate through the design with HDL-SCHEM-Editor and generate HDL with HDL-SCHEM-Editor for all modules except cordic_square_root_control,
for which the HDL must be generated with HDL-FSM-Editor.
Generating HDL is only necessary if you change something in the modules, because the generated HDL is already contained in VHDL_designs.zip and VHDL_testbenches.zip.
If you want to simulate or modify the modules with HDL-SCHEM-Editor, you must also adapt the information in the Control-tab of the toplevel you want to work on.
There you must define a "Compile through hierarchy command", an "Edit command", the path to your HDL-FSM-Editor and a "Working directory".

Change log:

Version 1.1 (27.01.2025):

Version 1.0 (13.01.2025):

If you detect any bugs or have any questions,
please send a mail to "matthias.schweikart@gmx.de".