Reduces multiple words into a single word, using the given Boolean operation. Put differently: it's a bit-reduction of each bit position across all words. The words_in input contains all the input words concatenated one after the other.
A common use case is to compute multiple results and their selecting conditions in parallel, then annul all but the result you want and OR-reduce them into a single result. Or don't annul the results, but NAND them to see each bit position where the results disagree, and then maybe bit-reduce that to signal if any of the results disagree, possibly signalling an error.
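As a minimal sketch of the first use case, the following wrapper annuls three results with a plain AND mask and OR-reduces them. The module name Result_Select_Example, its ports, and the one-hot select input are hypothetical, and the "OR" operation string is an assumption about what the Bit_Reducer accepts. Note how the words are concatenated so that word i occupies words_in[WORD_WIDTH*i +: WORD_WIDTH].

```verilog
// Hypothetical example: pick one of three results with a one-hot select.
module Result_Select_Example
#(
    parameter WORD_WIDTH = 8
)
(
    input  wire [WORD_WIDTH-1:0] result_0,
    input  wire [WORD_WIDTH-1:0] result_1,
    input  wire [WORD_WIDTH-1:0] result_2,
    input  wire [2:0]            select,    // assumed one-hot
    output wire [WORD_WIDTH-1:0] selected
);

    // Annul (force to zero) every result whose select bit is not set.
    wire [WORD_WIDTH-1:0] annulled_0 = result_0 & {WORD_WIDTH{select[0]}};
    wire [WORD_WIDTH-1:0] annulled_1 = result_1 & {WORD_WIDTH{select[1]}};
    wire [WORD_WIDTH-1:0] annulled_2 = result_2 & {WORD_WIDTH{select[2]}};

    // OR-reduce the annulled words: only the selected result survives.
    // The "OR" string is an assumption about the Bit_Reducer operations.
    Word_Reducer
    #(
        .OPERATION  ("OR"),
        .WORD_WIDTH (WORD_WIDTH),
        .WORD_COUNT (3)
    )
    or_reduce
    (
        // Word 0 sits in the least-significant bits of words_in.
        .words_in   ({annulled_2, annulled_1, annulled_0}),
        .word_out   (selected)
    );

endmodule
```

If select is not one-hot, the output is the OR of every selected result, or zero if none is selected.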
`default_nettype none

module Word_Reducer
#(
    parameter OPERATION   = "",
    parameter WORD_WIDTH  = 0,
    parameter WORD_COUNT  = 0,

    // Don't change at instantiation
    parameter TOTAL_WIDTH = WORD_WIDTH * WORD_COUNT
)
(
    input   wire    [TOTAL_WIDTH-1:0]   words_in,
    output  wire    [WORD_WIDTH-1:0]    word_out
);

    localparam BIT_ZERO = {WORD_COUNT{1'b0}};
Instantiate the following hardware once for each bit position in a word. The bit_word gathers the bit at a given position from all the words (e.g.: all the first bits, all the second bits, etc.). Then, for each word, extract the given bit position into the bit_word.
    generate
        genvar i, j;

        for (j=0; j < WORD_WIDTH; j=j+1) begin : per_bit

            reg [WORD_COUNT-1:0] bit_word = BIT_ZERO;

            for (i=0; i < WORD_COUNT; i=i+1) begin : per_word
                always @(*) begin
                    bit_word[i] = words_in[(WORD_WIDTH*i)+j];
                end
            end
Then reduce the bit_word into the output bit using the specified Boolean function (i.e.: the first bits of all input words, gathered into bit_word, reduce to the first bit of the output word). I use the Bit_Reducer here both to express that word reduction is a composition of bit reductions, and to avoid having to rewrite each possible case along with the special linter directives to avoid width warnings.
The downside is that the list of possible operations is not visible here, but if you need to find them out, then reading the bit reducer code is the best documentation. And if you need to add an operation, then the word reducer code remains unchanged.
            Bit_Reducer
            #(
                .OPERATION      (OPERATION),
                .INPUT_COUNT    (WORD_COUNT)
            )
            bit_position
            (
                .bits_in        (bit_word),
                .bit_out        (word_out[j])
            );
        end
    endgenerate

endmodule
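To see the gather-then-reduce structure in isolation, here is a minimal OR-only sketch that replaces the Bit_Reducer with Verilog's unary reduction operator. The module name Word_Reducer_OR_Sketch and the default parameter values are hypothetical. It is not a substitute for the module above, since it hard-codes a single operation.

```verilog
// Hypothetical OR-only sketch of the gather-then-reduce structure.
module Word_Reducer_OR_Sketch
#(
    parameter WORD_WIDTH  = 4,
    parameter WORD_COUNT  = 3,

    // Don't change at instantiation
    parameter TOTAL_WIDTH = WORD_WIDTH * WORD_COUNT
)
(
    input  wire [TOTAL_WIDTH-1:0] words_in,
    output wire [WORD_WIDTH-1:0]  word_out
);

    generate
        genvar i, j;

        for (j=0; j < WORD_WIDTH; j=j+1) begin : per_bit

            // Bit j of every input word, one bit per word.
            wire [WORD_COUNT-1:0] bit_word;

            for (i=0; i < WORD_COUNT; i=i+1) begin : per_word
                assign bit_word[i] = words_in[(WORD_WIDTH*i)+j];
            end

            // Unary OR reduction: 1 if bit j is set in any word.
            assign word_out[j] = |bit_word;
        end
    endgenerate

endmodule
```

With the default parameters, words_in = 12'b0100_0010_0001 holds word 2 = 0100, word 1 = 0010, and word 0 = 0001, so word_out is 4'b0111.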
There exists an alternate implementation of word reduction which is differently elegant, but has a couple of pitfalls and cannot re-use the bit reducer code. I'll outline it here because it uses looped partial calculations with a peeled-out first iteration, which is a common code pattern.
Repeatedly using a register in an unclocked loop expresses a combinational logic loop, which must be avoided: without special effort the CAD tool cannot analyze it for timing, or sometimes even synthesize it. So we create an array of registers to hold each partial result, and initialize them to zero.
reg [WORD_WIDTH-1:0] partial_reduction [WORD_COUNT-1:0];

integer i;

initial begin
    for(i=0; i < WORD_COUNT; i=i+1) begin
        partial_reduction[i] = ZERO;
    end
end
First, connect the zeroth input word to the zeroth partial result. This peels out the first loop iteration, where the read index would be out of range (negative!) otherwise.
always @(*) begin
    partial_reduction[0] = words_in[0 +: WORD_WIDTH];
Then OR the previous partial result with the current input word, creating the next partial result. Note the start index of i=1, because of the peeled-out first iteration. This is where you would implement each possible operation, and most of the code would be duplicated boilerplate, differing only by the Boolean operator. This is dull, error-prone, and drags synthesis-time complications, such as linter directives and operation selection, into the middle of run-time code.
    for(i=1; i < WORD_COUNT; i=i+1) begin
        partial_reduction[i] = partial_reduction[i-1] | words_in[WORD_WIDTH*i +: WORD_WIDTH];
    end
The last partial result is the final result.
    word_out = partial_reduction[WORD_COUNT-1];
end
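Assembled into one module, the fragments above might look like the following sketch. Several details are assumptions not stated in the text: the module name Word_Reducer_OR_Looped, the default parameter values, the definition of ZERO as a WORD_WIDTH-wide zero, and the output declared as reg so it can be assigned inside the always block.

```verilog
// Hypothetical assembly of the looped, OR-only alternate implementation.
module Word_Reducer_OR_Looped
#(
    parameter WORD_WIDTH  = 4,
    parameter WORD_COUNT  = 3,

    // Don't change at instantiation
    parameter TOTAL_WIDTH = WORD_WIDTH * WORD_COUNT
)
(
    input  wire [TOTAL_WIDTH-1:0] words_in,
    output reg  [WORD_WIDTH-1:0]  word_out    // reg: assigned procedurally
);

    // Assumed definition: a zero as wide as one word.
    localparam ZERO = {WORD_WIDTH{1'b0}};

    // One register per partial result, so no value feeds back onto itself.
    reg [WORD_WIDTH-1:0] partial_reduction [WORD_COUNT-1:0];

    integer i;

    initial begin
        for(i=0; i < WORD_COUNT; i=i+1) begin
            partial_reduction[i] = ZERO;
        end
    end

    always @(*) begin
        // Peeled-out first iteration: nothing precedes word 0.
        partial_reduction[0] = words_in[0 +: WORD_WIDTH];

        // Each partial result combines the previous one with the next word.
        for(i=1; i < WORD_COUNT; i=i+1) begin
            partial_reduction[i] = partial_reduction[i-1] | words_in[WORD_WIDTH*i +: WORD_WIDTH];
        end

        // The last partial result is the final result.
        word_out = partial_reduction[WORD_COUNT-1];
    end

endmodule
```

Supporting another operation here would mean duplicating the loop body with a different operator, which is the boilerplate the Bit_Reducer composition avoids.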