Memory Controller

Contains the MIG IP DDR3 controller instance that connects to the DDR3 RAM, and the logic necessary to read/write a given number of MEM_DATA_WIDTH-wide blocks of data and transfer them from/to the DDR3 controller.

The memory controller only reads/writes whole blocks of data. The data MUST be in MEM_DATA_WIDTH-wide blocks and the address MUST be aligned to a MEM_DATA_WIDTH block boundary. In effect, the DDR is accessed as a block-oriented device, and any address/data translation must be done in software. The length is counted in MEM_DATA_WIDTH blocks. A length of 0 makes the operation a no-op which ends immediately.
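As a minimal sketch of the alignment rule, the hypothetical helper below (not part of the original design) turns a block index into a block-aligned address. It assumes addresses count 16-bit DDR locations and that each block spans MEM_BURST_LENGTH (8) of them, as the parameter comments and the address counter increment suggest. The module name, ALIGN_BITS, and INDEX_WIDTH are illustrative names only.

```verilog
`default_nettype none

// Hypothetical helper, not part of the original design: converts a block
// index into a block-aligned address suitable for operation_address.
module block_index_to_address_sketch
#(
    parameter MEM_ADDR_WIDTH    = 28,
    // Assumption: one block spans 8 address units (MEM_BURST_LENGTH),
    // so a block-aligned address has log2(8) = 3 zero low bits.
    parameter ALIGN_BITS        = 3,
    parameter INDEX_WIDTH       = MEM_ADDR_WIDTH - ALIGN_BITS
)
(
    input  wire [INDEX_WIDTH-1:0]       block_index,
    output wire [MEM_ADDR_WIDTH-1:0]    block_address
);

    // Block N starts at address N * 8, so the low ALIGN_BITS are always zero.
    assign block_address = {block_index, {ALIGN_BITS{1'b0}}};

endmodule
```

Building addresses this way makes misalignment impossible by construction, instead of checking for it after the fact.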

The memory controller will accept an operation on a ready/valid interface, keep ready low while operating, and raise ready once the data transfer is complete.

While an operation is in progress, either the read OR the write interface is active until all the data has been transferred. There is no interleaving of operations.

NOTE: the operation interface is in the clk_main domain, but the read/write interfaces are in the mem_clk domain! It's not obvious, but mem_clk and mem_rst are outputs from the DDR controller and define that clock domain.
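The operation handshake described above could be driven by user logic along these lines. This is only a sketch of a generic ready/valid sender in the clk_main domain, not code from the original design: the module name, the start_* ports, and the default widths are assumptions standing in for MEM_CTRL_OP_WIDTH, MEM_ADDR_WIDTH, and REPEAT_COUNT_WIDTH.

```verilog
`default_nettype none

// Hypothetical user-side logic (clk_main domain), not part of the original
// design. Presents one operation and holds it stable until it is accepted.
module operation_issuer_sketch
#(
    parameter OP_WIDTH      = 1,    // Assumption: stands in for MEM_CTRL_OP_WIDTH
    parameter ADDR_WIDTH    = 28,   // Stands in for MEM_ADDR_WIDTH
    parameter LENGTH_WIDTH  = 9     // Assumption: stands in for REPEAT_COUNT_WIDTH
)
(
    input  wire                     clk_main,
    input  wire                     rst_main_n,

    input  wire                     start,              // One-cycle request pulse
    input  wire [OP_WIDTH-1:0]      start_operation,
    input  wire [ADDR_WIDTH-1:0]    start_address,
    input  wire [LENGTH_WIDTH-1:0]  start_length,

    input  wire                     operation_ready,
    output reg                      operation_valid,
    output reg  [OP_WIDTH-1:0]      operation,
    output reg  [ADDR_WIDTH-1:0]    operation_address,
    output reg  [LENGTH_WIDTH-1:0]  operation_length
);

    initial begin
        operation_valid     = 1'b0;
        operation           = {OP_WIDTH{1'b0}};
        operation_address   = {ADDR_WIDTH{1'b0}};
        operation_length    = {LENGTH_WIDTH{1'b0}};
    end

    always @(posedge clk_main) begin
        if (rst_main_n == 1'b0) begin
            operation_valid <= 1'b0;
        end
        else if ((operation_valid == 1'b1) && (operation_ready == 1'b1)) begin
            // Handshake completes: stop presenting the operation.
            operation_valid <= 1'b0;
        end
        else if ((operation_valid == 1'b0) && (start == 1'b1)) begin
            // Capture the request and present it until it is accepted.
            operation_valid     <= 1'b1;
            operation           <= start_operation;
            operation_address   <= start_address;
            operation_length    <= start_length;
        end
    end

endmodule
```

The read and write data ports would need their own logic clocked by mem_clk, since only the operation interface lives in the clk_main domain.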

Parameters, Ports, and Constants

`default_nettype none

module memory_controller
#(
    parameter MAX_REPEAT_COUNT          = 255,   // Max number of reads or writes per operation.

    parameter READ_FIFO_RAMSTYLE        = "block",  // Implementation of read data FIFO.
    parameter READ_FIFO_DEPTH           = 64,       // FIFO depth in MEM_DATA_WIDTH blocks (SEE "FIFO DEPTH LIMITS" NOTE BELOW!!!)

    // MEM is for the User Interface

    parameter MEM_DATA_WIDTH            = 128,  // 16 bit words * 8, from 4:1 clock ratio and Double Data Rate
    parameter MEM_ADDR_WIDTH            = 28,   // Set by IP block, but only 2**27 exists
    parameter MEM_CMD_WIDTH             = 3,    // Read (3'b001) or Write (3'b000) (see UG 586, p.93) 
    parameter MEM_MASK_WIDTH            = 16,   // Bytemask for writes
    parameter MEM_BURST_LENGTH          = 8,    // How many 16-bit DDR locations per read/write

    // DDR3 is for the connections to the DDR3 RAM device, not seen/touched by the user.

    parameter DDR3_ADDR_WIDTH           = 14,
    parameter DDR3_BANK_WIDTH           = 3,
    parameter DDR3_DATA_WIDTH           = 16,
    parameter DDR3_MASK_WIDTH           = 2,
    parameter DDR3_STROBE_WIDTH         = 2,

    // *** Do not set the parameters after this line at instantiation, except in Vivado IPI. ***

    parameter REPEAT_COUNT_WIDTH        = clog2(MAX_REPEAT_COUNT) + 1 // +1 for exact representation, not a 0-index

)
(
    // Main clock domain (50 MHz)

    input  wire                             clk_main,
    input  wire                             rst_main_n,

    output wire                             memory_calibration_complete, // Cannot use the controller until this goes high

    output wire                             operation_ready,
    input  wire                             operation_valid,
    input  wire [MEM_CTRL_OP_WIDTH-1:0]     operation,
    input  wire [MEM_ADDR_WIDTH-1:0]        operation_address,
    input  wire [REPEAT_COUNT_WIDTH-1:0]    operation_length,

    // MIG User Interface domain (100 MHz)

    output wire                             mem_clk,            // 4:1 ratio to DDR3 clock, so 100 MHz
    output wire                             mem_rst,           

    output wire                             write_data_ready,
    input  wire                             write_data_valid,
    input  wire [MEM_DATA_WIDTH-1:0]        write_data,

    input  wire                             read_data_ready,
    output wire                             read_data_valid,
    output wire [MEM_DATA_WIDTH-1:0]        read_data,

    output reg                              read_data_fifo_overflow, // Raises when FIFO drops data from DDR

    // DDR3 Interface domain (no user signals here)

    input  wire                             ddr3_clk,           // 400 MHz (DDR3 operating speed)
    input  wire                             ddr3_rst,           // Raise for 5ns minimum
    input  wire                             ddr3_clk_ref,       // IDELAYCTRL reference clock: 200 MHz

    // These connect to the DDR3 RAM device, and nothing else.

    output wire [DDR3_ADDR_WIDTH-1:0]       ddr3_addr,
    output wire [DDR3_BANK_WIDTH-1:0]       ddr3_ba,
    output wire                             ddr3_cas_n,
    output wire                             ddr3_ck_n,
    output wire                             ddr3_ck_p,
    output wire                             ddr3_cke,
    output wire                             ddr3_ras_n,
    output wire                             ddr3_reset_n,
    output wire                             ddr3_we_n,
    inout  wire [DDR3_DATA_WIDTH-1:0]       ddr3_dq,
    inout  wire [DDR3_STROBE_WIDTH-1:0]     ddr3_dqs_n,
    inout  wire [DDR3_STROBE_WIDTH-1:0]     ddr3_dqs_p,
    output wire                             ddr3_cs_n,
    output wire [DDR3_MASK_WIDTH-1:0]       ddr3_dm,
    output wire                             ddr3_odt
);

    `include "clog2_function.vh"
    `include "memory_controller_operations.vh"

    localparam MEM_ADDR_ZERO    = {MEM_ADDR_WIDTH{1'b0}};
    localparam MEM_ADDR_ONE     = {{MEM_ADDR_WIDTH-1{1'b0}},1'b1};
    localparam MEM_DATA_ZERO    = {MEM_DATA_WIDTH{1'b0}};
    localparam MEM_MASK_NONE    = {MEM_MASK_WIDTH{1'b0}};

    // These are the internal MIG DDR3 controller operation encodings, from
    // the datasheet. DO NOT CHANGE.

    localparam [MEM_CMD_WIDTH-1:0] MEM_CMD_WRITE = 'd0;
    localparam [MEM_CMD_WIDTH-1:0] MEM_CMD_READ  = 'd1;

    initial begin
        read_data_fifo_overflow = 1'b0;
    end

Operation Interface, CDC, and Buffer

First, we take in an operation and do CDC from the main clock domain into the memory controller interface clock domain. The synchronizer has no output buffer, which keeps operation_ready low for the duration of the operation; later logic raises ready at the end of the operation.

    wire                            operation_cdc_ready;
    wire                            operation_cdc_valid;
    wire [MEM_CTRL_OP_WIDTH-1:0]    operation_cdc;
    wire [MEM_ADDR_WIDTH-1:0]       operation_cdc_address;
    wire [REPEAT_COUNT_WIDTH-1:0]   operation_cdc_length;

    CDC_Word_Synchronizer
    #(
        .WORD_WIDTH             (MEM_CTRL_OP_WIDTH + MEM_ADDR_WIDTH + REPEAT_COUNT_WIDTH),
        .EXTRA_CDC_DEPTH        (0),
        .OUTPUT_BUFFER_TYPE     ("NONE"),   // "NONE", "HALF", "SKID", "FIFO"
        .OUTPUT_BUFFER_CIRCULAR (0),        // non-zero to enable
        // verilator lint_off PINNOCONNECT
        .FIFO_BUFFER_DEPTH      (),         // Only for "FIFO"
        .FIFO_BUFFER_RAMSTYLE   ()          // Only for "FIFO"
        // verilator lint_on  PINNOCONNECT
    )
    operation_buffer
    (
        .sending_clock          (clk_main),
        .sending_clear          (~rst_main_n),
        .sending_data           ({operation,         operation_address,         operation_length}),
        .sending_valid          (operation_valid),
        .sending_ready          (operation_ready),

        .receiving_clock        (mem_clk),
        .receiving_clear        (mem_rst),
        .receiving_data         ({operation_cdc,     operation_cdc_address,     operation_cdc_length}),
        .receiving_valid        (operation_cdc_valid),
        .receiving_ready        (operation_cdc_ready)
    );

Signal the CDC synchronizer when the operation is complete by pulsing operation_cdc_ready high when the Pipeline Handshake Multiplier completes its operation and its gate opens (in the case of pending reads). Otherwise the handshake would complete at the CDC synchronizer, and that would tell the outside world the operation is complete while it is still running.

    wire operation_cdc_ready_internal;

    Pulse_Generator
    operation_completion_detector
    (
        .clock              (mem_clk),
        .level_in           (operation_cdc_ready_internal),
        .pulse_posedge_out  (operation_cdc_ready),
        // verilator lint_off PINCONNECTEMPTY
        .pulse_negedge_out  (),
        .pulse_anyedge_out  ()
        // verilator lint_on  PINCONNECTEMPTY
    );

Signal the address counter when a new operation appears at the output of the CDC synchronizer. This loads the counter just ahead of the multiplied operation, and removes the need to pass the address through the Pipeline Handshake Multiplier or its gate.

This pulse also signals the Pipeline Handshake Multiplier that a new operation begins.
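For readers without the Pulse_Generator source at hand, the rising-edge pulse used here can be pictured with the simplified stand-in below. It is an assumption about what pulse_posedge_out does (one pulse when level_in goes from low to high), not the module's actual implementation, and the module name is hypothetical.

```verilog
`default_nettype none

// Simplified stand-in for a rising-edge detector. This is an assumption
// about the behaviour of pulse_posedge_out, not the Pulse_Generator source.
module posedge_pulse_sketch
(
    input  wire clock,
    input  wire level_in,
    output reg  pulse_posedge_out
);

    reg level_delayed = 1'b0;

    initial begin
        pulse_posedge_out = 1'b0;
    end

    // Remember the level from the previous clock cycle.
    always @(posedge clock) begin
        level_delayed <= level_in;
    end

    // High only while the input is high and was low one cycle earlier.
    always @(*) begin
        pulse_posedge_out = (level_in == 1'b1) && (level_delayed == 1'b0);
    end

endmodule
```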

    wire operation_cdc_valid_pulse;

    Pulse_Generator
    new_operation_detector
    (
        .clock              (mem_clk),
        .level_in           (operation_cdc_valid),
        .pulse_posedge_out  (operation_cdc_valid_pulse),
        // verilator lint_off PINCONNECTEMPTY
        .pulse_negedge_out  (),
        .pulse_anyedge_out  ()
        // verilator lint_on  PINCONNECTEMPTY
    );

Gate the CDC-transferred input to the Pipeline Handshake Multiplier. The gate is closed whenever there are pending data reads (the operation was loaded, but the read data has not all been sent out yet) so we don't start a new operation before all read data is returned from DDR and accepted. Writes are unaffected by this gate.

    reg open_operation_gate = 1'b0;

    wire                            operation_gated_ready;
    wire                            operation_gated_valid;
    wire [MEM_CTRL_OP_WIDTH-1:0]    operation_gated;
    wire [REPEAT_COUNT_WIDTH-1:0]   operation_gated_length;

    Pipeline_Gate
    #(
        .WORD_WIDTH     (MEM_CTRL_OP_WIDTH + REPEAT_COUNT_WIDTH),
        .IMPLEMENTATION ("AND"),
        .GATE_DATA      (0)
    )
    pending_read_data_gate
    (
        .enable         (open_operation_gate),

        .input_ready    (operation_cdc_ready_internal),
        .input_valid    (operation_cdc_valid_pulse),
        .input_data     ({operation_cdc,   operation_cdc_length}),

        .output_valid   (operation_gated_valid),
        .output_ready   (operation_gated_ready),
        .output_data    ({operation_gated, operation_gated_length})
    );

Then we buffer the CDC'ed operation into a Handshake Multiplier, which will repeat the operation "length" times. A length of zero immediately ends the operation, without effect.

    wire                            operation_current_ready;
    wire                            operation_current_valid;
    wire [MEM_CTRL_OP_WIDTH-1:0]    operation_current_raw;

    Pipeline_Handshake_Multiplier
    #(
        .WORD_WIDTH         (MEM_CTRL_OP_WIDTH),
        .MAX_REPEAT_COUNT   (MAX_REPEAT_COUNT)
    )
    operation_repeat
    (
        .clock                      (mem_clk),
        .clear                      (mem_rst),

        .input_data_valid           (operation_gated_valid),
        .input_data_ready           (operation_gated_ready),
        .input_data                 (operation_gated),
        .input_data_repeat_count    (operation_gated_length),

        .output_data_valid          (operation_current_valid),
        .output_data_ready          (operation_current_ready),
        .output_data                (operation_current_raw)
    );

Finally, translate the MEM_CTRL_OP_WRITE/READ into the bit pattern for read/write commands for the DDR controller.

    reg [MEM_CMD_WIDTH-1:0] operation_current = MEM_CMD_WRITE;

    always @(*) begin
        operation_current = (operation_current_raw == MEM_CTRL_OP_WRITE) ? MEM_CMD_WRITE : MEM_CMD_READ;
    end

Depending on the operation, we Branch the operation handshake to either a write command pipeline or a read command pipeline, which each do different things as handshakes complete along them.

    localparam OPERATION_BRANCH_COUNT = 2;

    reg [OPERATION_BRANCH_COUNT-1:0] operation_branch_selector = 2'b00; // All zero, so no branch selected.

    always @(*) begin
        operation_branch_selector = (operation_current == MEM_CMD_WRITE) ? 2'b01 : 2'b10;
    end

    wire                        operation_read_valid;
    wire                        operation_read_ready;
    wire [MEM_CMD_WIDTH-1:0]    operation_read;

    wire                        operation_write_valid;
    wire                        operation_write_ready;
    wire [MEM_CMD_WIDTH-1:0]    operation_write;

    Pipeline_Branch_One_Hot
    #(
        .WORD_WIDTH     (MEM_CMD_WIDTH),
        .OUTPUT_COUNT   (2),
        .IMPLEMENTATION ("AND")
    )
    operation_branch
    (
        .selector       (operation_branch_selector),

        .input_valid    (operation_current_valid),
        .input_ready    (operation_current_ready),
        .input_data     (operation_current),

        .output_valid   ({operation_read_valid, operation_write_valid}),
        .output_ready   ({operation_read_ready, operation_write_ready}),
        .output_data    ({operation_read,       operation_write})
    );

The write pipeline first lazily synchronizes the write operation handshake with the write data handshake since both the operation (and its address, which is already loaded into the address counter) and the data must be presented at the same time to the DDR controller.

(The datasheet allows the write data to arrive from 1 cycle before to 2 cycles after the command, but neither is proper blocking behaviour and so can fail, so let's force simultaneity.)

Since the operation and data pipelines have different widths, we have to expand to the larger width, then truncate after. At least MEM_DATA_WIDTH is always greater than MEM_CMD_WIDTH.

(The expansion bits are constant zeros, so they will optimize away.)

    wire [MEM_DATA_WIDTH-1:0] operation_write_expanded;

    Width_Adjuster
    #(
        .WORD_WIDTH_IN  (MEM_CMD_WIDTH),
        .SIGNED         (0),
        .WORD_WIDTH_OUT (MEM_DATA_WIDTH)
    )
    operation_expander
    (
        // It's possible some input bits are truncated away
        // verilator lint_off UNUSED
        .original_input     (operation_write),
        // verilator lint_on  UNUSED
        .adjusted_output    (operation_write_expanded)
    );

    wire                        operation_write_valid_synced;
    wire                        operation_write_ready_synced;
    wire [MEM_DATA_WIDTH-1:0]   operation_write_synced_expanded;

    wire                        write_data_valid_synced;
    reg                         write_data_ready_synced = 1'b0;
    wire [MEM_DATA_WIDTH-1:0]   write_data_synced;

    Pipeline_Synchronizer_Lazy
    #(
        .WORD_WIDTH (MEM_DATA_WIDTH),
        .PORT_COUNT (2)
    )
    write_synchronizer
    (
        .input_data_ready   ({operation_write_ready,    write_data_ready}),
        .input_data_valid   ({operation_write_valid,    write_data_valid}),
        .input_data         ({operation_write_expanded, write_data}),

        .output_data_ready  ({operation_write_ready_synced,    write_data_ready_synced}),
        .output_data_valid  ({operation_write_valid_synced,    write_data_valid_synced}),
        .output_data        ({operation_write_synced_expanded, write_data_synced})
    );

    wire [MEM_CMD_WIDTH-1:0]    operation_write_synced;

    Width_Adjuster
    #(
        .WORD_WIDTH_IN  (MEM_DATA_WIDTH),
        .SIGNED         (0),
        .WORD_WIDTH_OUT (MEM_CMD_WIDTH)
    )
    operation_truncator
    (
        // It's possible some input bits are truncated away
        // verilator lint_off UNUSED
        .original_input     (operation_write_synced_expanded),
        // verilator lint_on  UNUSED
        .adjusted_output    (operation_write_synced)
    );

Then we lazily Merge the read operation and synchronized write operation pipelines, letting the operation decide which gets sent to the DDR controller command interface.

We necessarily use the same selector (and port ordering!) as the Pipeline_Branch_One_Hot, else we'd have a deadlock!

    // MEM Command FIFO

    wire  [MEM_CMD_WIDTH-1:0]   app_cmd;
    wire                        app_en;
    wire                        app_rdy;

    Pipeline_Merge_One_Hot_Lazy
    #(
        .WORD_WIDTH         (MEM_CMD_WIDTH),
        .INPUT_COUNT        (OPERATION_BRANCH_COUNT),
        .HANDSHAKE_MERGE    ("OR"),
        .DATA_MERGE         ("OR"),
        .IMPLEMENTATION     ("AND")
    )
    operation_pipeline_selector
    (
        .selector           (operation_branch_selector),

        .input_valid        ({operation_read_valid, operation_write_valid_synced}),
        .input_ready        ({operation_read_ready, operation_write_ready_synced}),
        .input_data         ({operation_read,       operation_write_synced}),

        .output_valid       (app_en),
        .output_ready       (app_rdy),
        .output_data        (app_cmd)
    );

The operation address is kept in a counter, loaded at the initial operation load, and incremented by MEM_BURST_LENGTH when each multiplied command is accepted by the DDR controller.

    // MEM Command FIFO

    wire [MEM_ADDR_WIDTH-1:0] app_addr;

    reg address_counter_increment = 1'b0;

    always @(*) begin
        address_counter_increment = (app_en == 1'b1) && (app_rdy == 1'b1);
    end

    Counter_Binary
    #(
        .WORD_WIDTH     (MEM_ADDR_WIDTH),
        .INCREMENT      (MEM_BURST_LENGTH [MEM_ADDR_WIDTH-1:0]),
        .INITIAL_COUNT  (MEM_ADDR_ZERO)
    )
    operation_address_counter
    (
        .clock          (mem_clk),
        .clear          (mem_rst),

        .up_down        (1'b0), // 0/1 --> up/down
        .run            (address_counter_increment),

        .load           (operation_cdc_valid_pulse),
        .load_count     (operation_cdc_address),

        // verilator lint_off PINCONNECTEMPTY
        .carry_in       (1'b0),
        .carry_out      (),
        .carries        (),
        .overflow       (),
        // verilator lint_on  PINCONNECTEMPTY

        .count          (app_addr)
    );
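The address counter's role above can be modelled with the simplified behavioural sketch below. It is not the Counter_Binary source: the module name is hypothetical, and giving load priority over increment is an assumption made for illustration.

```verilog
`default_nettype none

// Simplified behavioural model of the address counter as used above.
// Not the Counter_Binary source: load priority over increment is an assumption.
module address_counter_sketch
#(
    parameter MEM_ADDR_WIDTH    = 28,
    parameter MEM_BURST_LENGTH  = 8
)
(
    input  wire                         mem_clk,
    input  wire                         mem_rst,

    input  wire                         load,           // New operation pulse
    input  wire [MEM_ADDR_WIDTH-1:0]    load_address,   // Start address of the operation

    input  wire                         app_en,
    input  wire                         app_rdy,

    output reg  [MEM_ADDR_WIDTH-1:0]    app_addr
);

    localparam [MEM_ADDR_WIDTH-1:0] ADDRESS_ZERO = {MEM_ADDR_WIDTH{1'b0}};
    localparam [MEM_ADDR_WIDTH-1:0] ADDRESS_STEP = MEM_BURST_LENGTH;

    initial begin
        app_addr = ADDRESS_ZERO;
    end

    always @(posedge mem_clk) begin
        if (mem_rst == 1'b1) begin
            app_addr <= ADDRESS_ZERO;
        end
        else if (load == 1'b1) begin
            // A new operation sets the start address.
            app_addr <= load_address;
        end
        else if ((app_en == 1'b1) && (app_rdy == 1'b1)) begin
            // Each accepted command moves on to the next block.
            app_addr <= app_addr + ADDRESS_STEP;
        end
    end

endmodule
```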

FIFO DEPTH LIMITS

The DDR controller read data interface has no backpressure support, so to present a proper interface to the outside, we run it through a FIFO buffer deep enough to hold the pending reads of a given read operation (before multiplication), given the latency before the read output starts being read out.

One read command (multiplied) in == one data read out.

It's unclear how many read operations the DDR controller can have pending. The MIG documentation (UG586) hints at one per DDR bank controller, so up to 8, but there may be extra buffering internally.

However, we can never have more pending reads than a multiplied read operation can perform, as the operation will not report as done until all read data has been read out, so MAX_REPEAT_COUNT is our worst-case FIFO depth.

In practice, a FIFO as deep as MAX_REPEAT_COUNT is often impractical (a large MAX_REPEAT_COUNT can imply a multi-MB FIFO buffer), so you will usually have to accept less read buffering. In that case, when the depth is less than MAX_REPEAT_COUNT, you must ensure the read port is read out soon enough after the read command is accepted, and at a sufficient rate, to guarantee the FIFO will never fill up and thus quietly drop read data from the DDR controller.

A "FIFO overflow" error signal is brought out to help test these cases.
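To make the trade-off concrete, the simulation-only sketch below prints the storage implied by the configured depth and by the worst case. The module name is hypothetical; the parameter defaults are the ones from this module. With those defaults, the configured FIFO holds 64 * 128 = 8192 bits (1 KiB) and the worst case is 255 * 128 = 32640 bits (4080 bytes); a larger MAX_REPEAT_COUNT scales the worst case linearly.

```verilog
`default_nettype none

// Hypothetical sizing helper for simulation only: prints FIFO storage sizes.
module read_fifo_sizing_sketch
#(
    parameter MEM_DATA_WIDTH    = 128,
    parameter READ_FIFO_DEPTH   = 64,
    parameter MAX_REPEAT_COUNT  = 255
)
();

    // Storage for the configured depth: 64 * 128 = 8192 bits with the defaults.
    localparam CONFIGURED_BITS = READ_FIFO_DEPTH  * MEM_DATA_WIDTH;

    // Storage to never drop data: 255 * 128 = 32640 bits with the defaults.
    localparam WORST_CASE_BITS = MAX_REPEAT_COUNT * MEM_DATA_WIDTH;

    initial begin
        $display("Configured FIFO storage: %0d bits", CONFIGURED_BITS);
        $display("Worst-case FIFO storage: %0d bits", WORST_CASE_BITS);
    end

endmodule
```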

    // MEM read data FIFO

    wire                            app_rd_data_valid;
    wire                            app_rd_data_ready;
    wire  [MEM_DATA_WIDTH-1:0]      app_rd_data;
    // verilator lint_off UNUSED
    wire                            app_rd_data_end; // Equivalent to app_rd_data_valid since each read is a complete MEM_DATA_WIDTH burst.
    // verilator lint_on  UNUSED

    wire                            read_fifo_valid;
    wire                            read_fifo_ready;
    wire  [MEM_DATA_WIDTH-1:0]      read_fifo_data;

    Pipeline_FIFO_Buffer
    #(
        .WORD_WIDTH         (MEM_DATA_WIDTH),
        .DEPTH              (READ_FIFO_DEPTH),
        .RAMSTYLE           (READ_FIFO_RAMSTYLE),
        .CIRCULAR_BUFFER    (0)  // non-zero to enable
    )
    read_data_fifo
    (
        .clock          (mem_clk),
        .clear          (mem_rst),

        .input_valid    (app_rd_data_valid),
        .input_ready    (app_rd_data_ready),
        .input_data     (app_rd_data),

        .output_valid   (read_fifo_valid),
        .output_ready   (read_fifo_ready),
        .output_data    (read_fifo_data)
    );

If the DDR controller sends out data when the FIFO is full, then we've dropped data. Signal this to the enclosing module.

    always @(*) begin
        read_data_fifo_overflow = (app_rd_data_valid == 1'b1) && (app_rd_data_ready == 1'b0);
    end
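Since read_data_fifo_overflow is combinational, it is only high during the cycles where data is actually dropped, which is easy to miss. One possible way to catch it in an enclosing module is a sticky flag in the mem_clk domain, sketched below. This is hypothetical wrapper logic, not part of the original design, and the names sticky_overflow_sketch and overflow_seen are illustrative.

```verilog
`default_nettype none

// Hypothetical wrapper logic (mem_clk domain), not part of the original design.
// Latches the overflow flag so a single-cycle event stays visible.
module sticky_overflow_sketch
(
    input  wire mem_clk,
    input  wire mem_rst,
    input  wire read_data_fifo_overflow,    // Combinational flag from the controller
    output reg  overflow_seen               // Stays high until mem_rst
);

    initial begin
        overflow_seen = 1'b0;
    end

    always @(posedge mem_clk) begin
        if (mem_rst == 1'b1) begin
            overflow_seen <= 1'b0;
        end
        else if (read_data_fifo_overflow == 1'b1) begin
            overflow_seen <= 1'b1;
        end
    end

endmodule
```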

The FIFO output is then run through a credit gate, which both allows reads to complete only when a read operation is in progress, and prevents the current read operation (before the handshake multiplier) from reporting as complete until all pending reads are complete. We need as many possible credits as there can be pending reads, so MAX_REPEAT_COUNT.

We add a credit for each multiplied read operation the DDR controller accepts. A credit is consumed for each read accepted at the read data port.

    reg                             add_credit_pulse = 1'b0;
    wire                            current_credit_count_zero;
    // These signals are left here for debugging
    // verilator lint_off UNUSED
    wire                            add_credit_fail;
    wire [REPEAT_COUNT_WIDTH-1:0]   current_credit_count;
    wire                            current_credit_count_max;
    // verilator lint_on  UNUSED

    always @(*) begin
        add_credit_pulse    = (operation_read_ready == 1'b1) && (operation_read_valid == 1'b1);
        open_operation_gate = (current_credit_count_zero == 1'b1);
    end

    Pipeline_Credit_Gate
    #(
        .WORD_WIDTH         (MEM_DATA_WIDTH),
        .MAX_CREDIT_COUNT   (MAX_REPEAT_COUNT)
    )
    pending_read_tracker
    (
        .clock                      (mem_clk),
        .clear                      (mem_rst),

        .input_data_valid           (read_fifo_valid),
        .input_data_ready           (read_fifo_ready),
        .input_data                 (read_fifo_data),

        .add_credit_pulse           (add_credit_pulse),
        .add_credit_fail            (add_credit_fail),
        .current_credit_count       (current_credit_count),
        .current_credit_count_max   (current_credit_count_max),
        .current_credit_count_zero  (current_credit_count_zero),

        .output_data_valid          (read_data_valid),
        .output_data_ready          (read_data_ready),
        .output_data                (read_data)
    );
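The credit bookkeeping described above can be pictured with the simplified counter below. It is not the Pipeline_Credit_Gate source: the module and port names are hypothetical, and both the saturation at the limits and the handling of a simultaneous add and consume are assumptions made for illustration.

```verilog
`default_nettype none

// Simplified model of the credit count only, not the Pipeline_Credit_Gate source.
// Assumptions: the count saturates at its limits, and a simultaneous
// add and consume leaves the count unchanged.
module credit_counter_sketch
#(
    parameter MAX_CREDIT_COUNT  = 255,
    parameter COUNT_WIDTH       = 9     // Assumption: stands in for REPEAT_COUNT_WIDTH
)
(
    input  wire                     clock,
    input  wire                     clear,
    input  wire                     add_credit_pulse,   // A read command was accepted
    input  wire                     consume_credit,     // A read data handshake completed
    output reg  [COUNT_WIDTH-1:0]   credit_count,
    output wire                     credit_count_zero   // No pending reads
);

    localparam [COUNT_WIDTH-1:0] COUNT_ZERO = {COUNT_WIDTH{1'b0}};
    localparam [COUNT_WIDTH-1:0] COUNT_ONE  = {{COUNT_WIDTH-1{1'b0}},1'b1};
    localparam [COUNT_WIDTH-1:0] COUNT_MAX  = MAX_CREDIT_COUNT;

    initial begin
        credit_count = COUNT_ZERO;
    end

    always @(posedge clock) begin
        if (clear == 1'b1) begin
            credit_count <= COUNT_ZERO;
        end
        else if ((add_credit_pulse == 1'b1) && (consume_credit == 1'b0)) begin
            credit_count <= (credit_count == COUNT_MAX)  ? COUNT_MAX  : credit_count + COUNT_ONE;
        end
        else if ((add_credit_pulse == 1'b0) && (consume_credit == 1'b1)) begin
            credit_count <= (credit_count == COUNT_ZERO) ? COUNT_ZERO : credit_count - COUNT_ONE;
        end
    end

    assign credit_count_zero = (credit_count == COUNT_ZERO);

endmodule
```

In the controller, the zero flag is what reopens the operation gate, so a read operation only reports as done once every credit it added has been consumed.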

Finally, the DDR3 Controller generated by the Xilinx MIG.

One side connects to the DDR3 device, the other presents the UI (User Interface), which is a set of ready/valid interfaces into command and read/write data FIFOs, and a couple of req/ack pulse interfaces for low-level control of refresh, calibration, etc. (not used here).

    // MEM write data FIFO

    reg   [MEM_DATA_WIDTH-1:0]      app_wdf_data    = MEM_DATA_ZERO;
    reg                             app_wdf_end     = 1'b0;
    reg                             app_wdf_wren    = 1'b0;
    wire                            app_wdf_rdy;
    localparam [MEM_MASK_WIDTH-1:0] app_wdf_mask    = MEM_MASK_NONE; // CONSTANT

    // Connect to output of write_synchronizer
    // This is sync'ed to the write operation handshake

    always @(*) begin
        app_wdf_data            = write_data_synced;
        app_wdf_wren            = write_data_valid_synced;
        app_wdf_end             = write_data_valid_synced; // A 4:1 interface writes a burst of 8 in one cycle
        write_data_ready_synced = app_wdf_rdy;
    end

    // MEM special requests/status

    localparam                      app_sr_req      = 1'b0; // RESERVED, UNUSED
    wire                            app_sr_active;

    reg                             app_zq_req      = 1'b0; // ZQ calibration request, UNUSED
    wire                            app_zq_ack;

    reg                             app_ref_req     = 1'b0; // Refresh command, UNUSED
    wire                            app_ref_ack;

    // Memory calibration flag (mem_clk domain)

    wire                            init_calib_complete;

    // Taken from MIG_DDR3.veo in ip/MIG_DDR3
    // There are pins present in the definition not shown in the instantiation template.
    // verilator lint_off PINNOCONNECT
    MIG_DDR3 
    DDR3_Controller
    (
        // Memory interface ports to/from DDR3 device

        .ddr3_addr              (ddr3_addr),            // output [13:0]    ddr3_addr
        .ddr3_ba                (ddr3_ba),              // output [2:0]     ddr3_ba
        .ddr3_cas_n             (ddr3_cas_n),           // output           ddr3_cas_n
        .ddr3_ck_n              (ddr3_ck_n),            // output [0:0]     ddr3_ck_n
        .ddr3_ck_p              (ddr3_ck_p),            // output [0:0]     ddr3_ck_p
        .ddr3_cke               (ddr3_cke),             // output [0:0]     ddr3_cke
        .ddr3_ras_n             (ddr3_ras_n),           // output           ddr3_ras_n
        .ddr3_reset_n           (ddr3_reset_n),         // output           ddr3_reset_n
        .ddr3_we_n              (ddr3_we_n),            // output           ddr3_we_n
        .ddr3_dq                (ddr3_dq),              // inout [15:0]     ddr3_dq
        .ddr3_dqs_n             (ddr3_dqs_n),           // inout [1:0]      ddr3_dqs_n
        .ddr3_dqs_p             (ddr3_dqs_p),           // inout [1:0]      ddr3_dqs_p
        .ddr3_cs_n              (ddr3_cs_n),            // output [0:0]     ddr3_cs_n
        .ddr3_dm                (ddr3_dm),              // output [1:0]     ddr3_dm
        .ddr3_odt               (ddr3_odt),             // output [0:0]     ddr3_odt

        // Application interface ports (outputs own clock)

        .app_addr               (app_addr),             // input [27:0]     app_addr
        .app_cmd                (app_cmd),              // input [2:0]      app_cmd
        .app_en                 (app_en),               // input            app_en
        .app_rdy                (app_rdy),              // output           app_rdy

        .app_wdf_data           (app_wdf_data),         // input [127:0]    app_wdf_data
        .app_wdf_end            (app_wdf_end),          // input            app_wdf_end
        .app_wdf_wren           (app_wdf_wren),         // input            app_wdf_wren
        .app_wdf_rdy            (app_wdf_rdy),          // output           app_wdf_rdy
        .app_wdf_mask           (app_wdf_mask),         // input [15:0]     app_wdf_mask

        .app_rd_data            (app_rd_data),          // output [127:0]   app_rd_data
        .app_rd_data_end        (app_rd_data_end),      // output           app_rd_data_end
        .app_rd_data_valid      (app_rd_data_valid),    // output           app_rd_data_valid

        .app_ref_ack            (app_ref_ack),          // output           app_ref_ack
        .app_ref_req            (app_ref_req),          // input            app_ref_req

        .app_zq_req             (app_zq_req),           // input            app_zq_req
        .app_zq_ack             (app_zq_ack),           // output           app_zq_ack

        .app_sr_req             (app_sr_req),           // input            app_sr_req
        .app_sr_active          (app_sr_active),        // output           app_sr_active

        .ui_clk                 (mem_clk),              // output           ui_clk
        .ui_clk_sync_rst        (mem_rst),              // output           ui_clk_sync_rst

        .init_calib_complete    (init_calib_complete),  // output           init_calib_complete

        // System Clock Ports (operating clock for DDR3)

        .sys_clk_i              (ddr3_clk),
        .sys_rst                (ddr3_rst),              // input            sys_rst

        // Reference Clock Ports (200 MHz (typ) for IDELAYCTRL)

        .clk_ref_i              (ddr3_clk_ref)
    );
    // verilator lint_on PINNOCONNECT

Let's CDC init_calib_complete into the main clock domain so we can use it to control external logic feeding the memory controller. We don't want to start sending data to memory until it is calibrated and ready, and that can take a relatively long time after startup (~100us).

Fortunately, init_calib_complete is a single-bit flag, and usually only transitions once, so we only need a bit synchronizer.

    CDC_Bit_Synchronizer
    #(
        .EXTRA_DEPTH        (0)  // Must be 0 or greater
    )
    init_calib_complete_synchronizer
    (
        .receiving_clock    (clk_main),
        .bit_in             (init_calib_complete),
        .bit_out            (memory_calibration_complete)
    );

endmodule
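On the user side, memory_calibration_complete could be used to hold off operations until the memory is ready, for example with the small gate sketched below. This is hypothetical logic in the clk_main domain, not part of the original design, and the request_valid and request_ready names are assumptions.

```verilog
`default_nettype none

// Hypothetical user-side gating (clk_main domain), not part of the original design.
// Blocks the operation handshake until DDR3 calibration has completed.
module calibration_gate_sketch
(
    input  wire memory_calibration_complete,    // From memory_controller
    input  wire request_valid,                  // Upstream wants to issue an operation
    output wire request_ready,                  // Upstream sees ready only when calibrated
    output wire operation_valid,                // To memory_controller
    input  wire operation_ready                 // From memory_controller
);

    // Both sides of the handshake are masked, so neither end sees a transfer
    // complete before calibration is done.
    assign operation_valid = request_valid   && memory_calibration_complete;
    assign request_ready   = operation_ready && memory_calibration_complete;

endmodule
```

This relies on the flag staying high once raised; if it could drop while a request is pending, valid would be withdrawn mid-handshake, so a registered version would be safer in that case.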

fpgacpu.ca