Computes the signed remainder of the signed dividend and divisor, and uses the status of each iterated subtraction to control the Quotient module. Part of the Signed Integer Divider module. Not really usable by itself. But if you want to do so, be sure to short-circuit the control handshake interface.
The calculation starts after any pending results have been read out by the output ready/valid handshake, and an input ready/valid handshake provides a new dividend and divisor. Each calculation step is synchronized with the Quotient module through a control handshake.
Think of the dividend
and divisor
as points along a number line.
Depending on their initial signs, we will iteratively add or subtract the
largest possible multiple of divisor
to/from the dividend
as necessary
to bring the dividend
towards zero without passing zero. No initial
calculations of absolute values or final sign corrections are necessary,
which saves a lot of hardware and cycles.
`default_nettype none module Remainder_Integer_Signed #( parameter WORD_WIDTH = 0, parameter STEP_WORD_WIDTH = 0 ) ( input wire clock, input wire clear, input wire input_valid, output reg input_ready, input wire [WORD_WIDTH-1:0] dividend, input wire [WORD_WIDTH-1:0] divisor, output reg output_valid, input wire output_ready, output wire [WORD_WIDTH-1:0] remainder, output wire divide_by_zero, output reg control_valid, input wire control_ready, output reg step_ok ); `include "clog2_function.vh" localparam WORD_ZERO = {WORD_WIDTH{1'b0}}; initial begin input_ready = 1'b0; output_valid = 1'b0; control_valid = 1'b0; step_ok = 1'b0; end
Some basic definitions to establish our two's-complement signed representation.
localparam ADD = 1'b0; localparam SUB = 1'b1; localparam POSITIVE = 1'b0; localparam NEGATIVE = 1'b1;
We have to internally compute with one extra bit of range to allow the
minimum signed number to be expressible as an unsigned number. Else, we
cannot subtract any multiple of itself from this minimum signed number.
For example, -8 / 1
needs to add +8
to the remainder to reach
a remainder of 0
, and with the minimum number of bits (4) to represent
-8
, we can only represent up to +7
.
localparam WORD_WIDTH_LONG = WORD_WIDTH + 1; localparam WORD_ZERO_LONG = {WORD_WIDTH_LONG{1'b0}}; localparam WORD_ONE_LONG = {{WORD_WIDTH_LONG-1{1'b0}},1'b1}; localparam WORD_ONES_LONG = {WORD_WIDTH_LONG{1'b1}};
We must then also increase the step word width by one to avoid an unnecessary extra calculation step. See the Multiprecision Adder/Subtractor module for details.
localparam STEP_WORD_WIDTH_LONG = STEP_WORD_WIDTH + 1;
wire [WORD_WIDTH_LONG-1:0] divisor_long; Width_Adjuster #( .WORD_WIDTH_IN (WORD_WIDTH), .SIGNED (1), .WORD_WIDTH_OUT (WORD_WIDTH_LONG) ) divisor_extend ( .original_input (divisor), .adjusted_output (divisor_long) );
Extract and store the initial sign of the divisor.
reg divisor_sign_initial_load = 1'b0; wire divisor_sign_initial; Register #( .WORD_WIDTH (1), .RESET_VALUE (1'b0) ) divisor_sign_initial_storage ( .clock (clock), .clock_enable (divisor_sign_initial_load), .clear (1'b0), .data_in (divisor_long [WORD_WIDTH_LONG-1]), .data_out (divisor_sign_initial) );
Store the divisor aside and shift its LSB into the MSB of the remainder_increment at each division step, while sign-extending the divisor. The initial load does the first shift implicitly.
NOTE: We do the signed shift right manually rather than use the Verilog
operator ">>>" since that would need us to declare divisor_long
, and only
it, as signed, which is asking for unexpected bugs.
reg divisor_enable = 1'b0; reg divisor_load = 1'b0; wire [WORD_WIDTH_LONG-1:0] divisor_loaded; reg [WORD_WIDTH_LONG-1:0] divisor_initial = WORD_ZERO_LONG; reg [WORD_WIDTH_LONG-1:0] divisor_next = WORD_ZERO_LONG; always @(*) begin divisor_initial = {divisor_long [WORD_WIDTH_LONG-1], divisor_long [WORD_WIDTH_LONG-1:1]}; end Register_Pipeline #( .WORD_WIDTH (WORD_WIDTH_LONG), .PIPE_DEPTH (1), .RESET_VALUES (WORD_ZERO_LONG) ) divisor_storage ( .clock (clock), .clock_enable (divisor_enable), .clear (1'b0), .parallel_load (divisor_load), .parallel_in (divisor_initial), // verilator lint_off PINCONNECTEMPTY .parallel_out (), // verilator lint_on PINCONNECTEMPTY .pipe_in (divisor_next), .pipe_out (divisor_loaded) );
The initial shift into the MSB is done at load.
reg remainder_increment_enable = 1'b0; reg remainder_increment_load = 1'b0; wire [WORD_WIDTH_LONG-1:0] remainder_increment_loaded; reg [WORD_WIDTH_LONG-1:0] remainder_increment_initial = WORD_ZERO_LONG; reg [WORD_WIDTH_LONG-1:0] remainder_increment_next = WORD_ZERO_LONG; always @(*) begin remainder_increment_initial = {divisor_long [0], WORD_ZERO}; end Register_Pipeline #( .WORD_WIDTH (WORD_WIDTH_LONG), .PIPE_DEPTH (1), .RESET_VALUES (WORD_ZERO_LONG) ) remainder_increment_storage ( .clock (clock), .clock_enable (remainder_increment_enable), .clear (1'b0), .parallel_load (remainder_increment_load), .parallel_in (remainder_increment_initial), // verilator lint_off PINCONNECTEMPTY .parallel_out (), // verilator lint_on PINCONNECTEMPTY .pipe_in (remainder_increment_next), .pipe_out (remainder_increment_loaded) );
wire [WORD_WIDTH_LONG-1:0] dividend_long; Width_Adjuster #( .WORD_WIDTH_IN (WORD_WIDTH), .SIGNED (1), .WORD_WIDTH_OUT (WORD_WIDTH_LONG) ) dividend_extend ( // It's possible some input bits are truncated away // verilator lint_off UNUSED .original_input (dividend), // verilator lint_on UNUSED .adjusted_output (dividend_long) );
Extract the initial sign of the dividend.
reg dividend_sign_initial_load = 1'b0; wire dividend_sign_initial; Register #( .WORD_WIDTH (1), .RESET_VALUE (1'b0) ) dividend_sign_initial_storage ( .clock (clock), .clock_enable (dividend_sign_initial_load), .clear (clear), .data_in (dividend_long [WORD_WIDTH_LONG-1]), .data_out (dividend_sign_initial) );
The dividend
(numerator of a fraction) is stored as the remainder
of
the division. We repeatedly add/subtract the remainder_increment
unless
the remainder
would become too small and flip its sign.
reg remainder_enable = 1'b0; reg remainder_load = 1'b0; wire [WORD_WIDTH_LONG-1:0] remainder_loaded; wire [WORD_WIDTH_LONG-1:0] remainder_next; Register_Pipeline #( .WORD_WIDTH (WORD_WIDTH_LONG), .PIPE_DEPTH (1), .RESET_VALUES (WORD_ZERO_LONG) ) remainder_storage ( .clock (clock), .clock_enable (remainder_enable), .clear (1'b0), .parallel_load (remainder_load), .parallel_in (dividend_long), // verilator lint_off PINCONNECTEMPTY .parallel_out (), // verilator lint_on PINCONNECTEMPTY .pipe_in (remainder_next), .pipe_out (remainder_loaded) ); Width_Adjuster #( .WORD_WIDTH_IN (WORD_WIDTH_LONG), .SIGNED (1), .WORD_WIDTH_OUT (WORD_WIDTH) ) remainder_shorten ( .original_input (remainder_loaded), .adjusted_output (remainder) );
Shift the LSB of the divisor into the MSB of the remainder_increment. Shifts are signed, and done manually to avoid Verilog pitfalls.
always @(*) begin divisor_next = {divisor_loaded [WORD_WIDTH_LONG-1], divisor_loaded [WORD_WIDTH_LONG-1:1]}; remainder_increment_next = {divisor_loaded [0], remainder_increment_loaded [WORD_WIDTH_LONG-1:1]}; end
Now, depending on the divisor sign, check the contents of divisor to see if there are still non-sign bits in it, which means the remainder_increment is invalid, and check if the sign of the remainder_increment does not match the sign of the divisor, which means we also haven't yet shifted enough bits into the remainder_increment to make it a valid number.
NOTE: The bit reduction when testing if the divisor only contains sign bits may end up being a critical path (but a high-speed one). It's not currently pipelinable without major control changes.
reg remainder_increment_sign = 1'b0; reg divisor_all_sign_bits = 1'b0; reg remainder_increment_valid = 1'b0; always @(*) begin remainder_increment_sign = remainder_increment_loaded [WORD_WIDTH_LONG-1]; divisor_all_sign_bits = (divisor_sign_initial == POSITIVE) ? (divisor_loaded == WORD_ZERO_LONG) : (divisor_loaded == WORD_ONES_LONG); remainder_increment_valid = (remainder_increment_sign == divisor_sign_initial) && (divisor_all_sign_bits == 1'b1); end
Then apply the remainder_increment to the remainder
reg remainder_add_sub = 1'b0; always @(*) begin remainder_add_sub = (divisor_sign_initial == dividend_sign_initial) ? SUB : ADD; end reg remainder_input_valid = 1'b0; // wire remainder_input_ready; wire remainder_output_valid; reg remainder_output_ready = 1'b0; wire remainder_next_overflow; Adder_Subtractor_Binary_Multiprecision #( .WORD_WIDTH (WORD_WIDTH_LONG), .STEP_WORD_WIDTH (STEP_WORD_WIDTH_LONG) ) remainder_calc ( .clock (clock), .clock_enable (1'b1), .clear (clear), .input_valid (remainder_input_valid), //verilator lint_off PINCONNECTEMPTY .input_ready (), //verilator lint_on PINCONNECTEMPTY .add_sub (remainder_add_sub), // 0/1 -> A+B/A-B .A (remainder_loaded), .B (remainder_increment_loaded), .output_valid (remainder_output_valid), .output_ready (remainder_output_ready), .sum (remainder_next), // verilator lint_off PINCONNECTEMPTY .carry_out (), .carries (), // verilator lint_on PINCONNECTEMPTY .overflow (remainder_next_overflow) );
If the next division step would overshoot past zero and change the sign of the remainder, meaning too much was added/subtracted, then this division step does nothing.
NOTE: The bit reduction when testing if the remainder has reached zero exactly may end up being a critical path (but a high-speed one). It's not currently pipelinable without major control changes.
reg remainder_next_valid = 1'b0; always @(*) begin remainder_next_valid = ((remainder_next [WORD_WIDTH_LONG-1] == dividend_sign_initial) && (remainder_next_overflow == 1'b0)) || (remainder_next == WORD_ZERO_LONG); end
And report if we tried to divide by zero. We do this after the pipeline since it's a reduction operation. We can load this at the end of the first calculation step.
We have to reconstruct the initially loaded divisor by undoing the first shift into the remainder increment at load time.
reg divisor_is_zero = 1'b0; reg divide_by_zero_load = 1'b0; always @(*) begin divisor_is_zero = ({divisor_loaded [0 +: WORD_WIDTH], remainder_increment_loaded [WORD_WIDTH_LONG-1]} == WORD_ZERO_LONG); end Register #( .WORD_WIDTH (1), .RESET_VALUE (1'b0) ) divide_by_zero_storage ( .clock (clock), .clock_enable (divide_by_zero_load), .clear (clear), .data_in (divisor_is_zero), .data_out (divide_by_zero) );
We denote state as two bits, with the following transitions: LOAD -> CALC -> DONE -> LOAD -> ... We don't handle the fourth, impossible case. The state encoding is arbitrary.
localparam STATE_WIDTH = 2; localparam [STATE_WIDTH-1:0] STATE_LOAD = 2'b00; localparam [STATE_WIDTH-1:0] STATE_CALC = 2'b10; localparam [STATE_WIDTH-1:0] STATE_DONE = 2'b11; localparam [STATE_WIDTH-1:0] STATE_ERROR = 2'b01; // Never reached
The state bits, from which we derive the control outputs and the internal control signals.
reg [STATE_WIDTH-1:0] state_next = STATE_LOAD; wire [STATE_WIDTH-1:0] state; Register #( .WORD_WIDTH (STATE_WIDTH), .RESET_VALUE (STATE_LOAD) ) state_storage ( .clock (clock), .clock_enable (1'b1), .clear (clear), .data_in (state_next), .data_out (state) );
Each division takes WORD_WIDTH_LONG
steps, from WORD_WIDTH_LONG-1
to 0
, plus
one step to initially load the dividend and divisor. Thus, we need
a counter of the correct width.
localparam STEPS_WIDTH = clog2(WORD_WIDTH_LONG); localparam STEPS_INITIAL = WORD_WIDTH_LONG - 1; localparam STEPS_ZERO = {STEPS_WIDTH{1'b0}}; localparam STEPS_ONE = {{STEPS_WIDTH-1{1'b0}},1'b1};
Count down WORD_WIDTH-1 calculation steps. Stops at zero, and reloads when leaving STATE_LOAD.
reg calculation_step_clear = 1'b0; reg calculation_step_do = 1'b0; wire [STEPS_WIDTH-1:0] calculation_step; Counter_Binary #( .WORD_WIDTH (STEPS_WIDTH), .INCREMENT (STEPS_ONE), .INITIAL_COUNT (STEPS_INITIAL [STEPS_WIDTH-1:0]) ) calculation_steps ( .clock (clock), .clear (calculation_step_clear), .up_down (1'b1), // 0/1 -> up/down .run (calculation_step_do), .load (1'b0), .load_count (STEPS_ZERO), .carry_in (1'b0), // verilator lint_off PINCONNECTEMPTY .carry_out (), .carries (), .overflow (), // verilator lint_on PINCONNECTEMPTY .count (calculation_step) );
Accept inputs when empty (after results are read out) or frehsly reset/cleared). Declare outputs valid when calculation is done. Perform a control handshake each time the addition/subtraction is complete.
always @(*) begin output_valid = (state == STATE_DONE); input_ready = (state == STATE_LOAD); control_valid = (state == STATE_CALC) && (remainder_output_valid == 1'b1); end
reg load_inputs = 1'b0; // When we load the dividend and divisor. reg read_outputs = 1'b0; // When we read out the remainder. reg read_control = 1'b0; // When other computations acknowledge the calculation step. reg calculating = 1'b0; // High while performing the division steps. reg step_done = 1'b0; // High when a calculation exits the adder/subtractor. reg first_calculation = 1'b0; // High during the first calculation step. reg last_calculation = 1'b0; // High during the last calculation step. always @(*) begin load_inputs = (input_ready == 1'b1) && (input_valid == 1'b1); read_outputs = (output_valid == 1'b1) && (output_ready == 1'b1); read_control = (control_valid == 1'b1) && (control_ready == 1'b1); calculating = (state == STATE_CALC); step_done = (read_control == 1'b1); first_calculation = (read_control == 1'b1) && (calculation_step == STEPS_INITIAL [STEPS_WIDTH-1:0]); last_calculation = (read_control == 1'b1) && (calculation_step == STEPS_ZERO); end
Past this point, we should not refer directly to the FSM states, but to these events which are combinations of states and signals.
There is no handling of erroneous states.
always @(*) begin state_next = (load_inputs == 1'b1) ? STATE_CALC : state; state_next = (last_calculation == 1'b1) ? STATE_DONE : state_next; state_next = (read_outputs == 1'b1) ? STATE_LOAD : state_next; end
Signal to the Quotient module if this calculation step was valid.
always @(*) begin step_ok = (remainder_next_valid == 1'b1) && (remainder_increment_valid == 1'b1) && (calculating == 1'b1) && (step_done == 1'b1); end
always @(*) begin calculation_step_clear = (load_inputs == 1'b1) || (clear == 1'b1); calculation_step_do = (step_done == 1'b1); end
Adder/Subtractor Control
always @(*) begin remainder_input_valid = (calculating == 1'b1); remainder_output_ready = (control_ready == 1'b1); end
Divisor and Remainder Increment Control
always @(*) begin divide_by_zero_load = (first_calculation == 1'b1); divisor_load = (load_inputs == 1'b1); divisor_enable = (load_inputs == 1'b1) || ((calculating == 1'b1) && (step_done == 1'b1)); divisor_sign_initial_load = (load_inputs == 1'b1); remainder_increment_load = (divisor_load == 1'b1); remainder_increment_enable = (divisor_enable == 1'b1); end
Dividend and the Remainder Control
always @(*) begin dividend_sign_initial_load = (load_inputs == 1'b1); remainder_load = (load_inputs == 1'b1); remainder_enable = (load_inputs == 1'b1) || ((step_ok == 1'b1) && (step_done == 1'b1)); end endmodule