by GateForge Consulting Ltd.
In a project with multiple FPGAs and boards and connecting paths between them, all designed by separate engineering teams, I came up with a common code architecture on each FPGA which made managing various mismatches, errors, and last-minute changes much easier, as well as improving portability.
I'll start at the bottom and work upwards. Each higher layer instantiates the one below it as a single module, and performs one specific function to support that underlying module. As we move upwards, the functions shift from managing the application logic to managing the interface to the outside world:
The Core logic of the application contains all the parts needed in the abstract, without concern for interface protocols, size of FPGA, number of instances, etc... It may culminate in a single module or a few. The only FPGA-specific parts are technology-mapping issues like which DSP or BRAM blocks are used. Other than porting to another FPGA family, this code should never change. For example, the core logic could contain the control and data paths of a soft-CPU, and the building blocks for a network-on-chip, but all as a library of parts.
The Instance culminates in a single module which instantiates the Core logic, and specifies the size and number of core logic modules, as well as all the wiring between them. The Instance primarily deals with scaling the Core logic to the specific FPGA device being used. For example, the instance defines how many data paths a soft-CPU will have if it's a SIMD system, and how many CPUs in total and the network-on-chip which connects them. There are still no concerns about particular external interfaces: buses and clocks are simple signals.
At this point, the design can be simulated to check for logical correctness, and synthesized for a preliminary estimate of its final operating frequency, assuming the instance is surrounded by a temporary test harness of registers to ensure correct static timing analysis.
The Adapter instantiates the Instance, defines the final interface ports expected by the board upon which the FPGA resides, and wires up the instance to those ports. The Adapter adds any logic necessary to support the Instance and also satisfy the FPGA CAD tools. For example, the Adapter will contain any clock generation and management hardware, add any differential buffers to signals that need it, even if unused, to satisfy CAD tool constraints, and convert interface protocols as required (e.g.: as AXI masters/slaves, bi-directional or tri-state buffers, etc...).
At this point, synthesis of the design should find any final errors, such as insufficient clock buffers or logic resources, expose any missing/violated constraints, and in general point out anything that needs attention. You should, of course, fix things to minimize the number of warnings from the CAD tool.
The Shim instantiates the Adapter, and nothing else. It has the same interface ports as the Adapter, and the Shim's ports are at the top level and physically connect to the FPGA pins. What the Shim does, however, is handle the problems that crop up with PCBs: unusable connections, schematic errors, incompatible voltage standards, etc... For example, if some top-level pins are unusable due to a design or manufacturing error, the Shim can shuffle the signals between the top-level pins and the adapter ports so as to present a uniform, maybe reduced, interface to the Adapter. If you have spare board pins, this is where you use them.
Another example is where, due to how a PCB had been designed, the pins of a bus were not connected to consecutive pins along the FPGA's I/O Banks, scattering the bus (and it's associated logic) all over the device. The Shim file translates the logical bus pins from the adapter to physically consecutive pins on the device, which helped keep related logic grouped together. Of course, any other FPGA on the PCB needs a similar Shim file to re-organize the now scrambled bus signals coming in.
The Shim file is also a good place to annotate the top level pins: which ones are clock-capable, are they usable as differential pairs, which clock region or I/O Bank they reside in, etc... which helps future alterations.
Finally, you now can do a full synthesis to get the final operating frequency, do post-placement synthesis to check things like the I/O delay lines, and do final floorplanning.