Unable to setup a custom CL on F2 instance FPGA and run it through my host script #685

Open
PratMaha opened this issue Feb 19, 2025 · 14 comments

Comments

@PratMaha

I am trying to send data through the PCIe interface to my custom logic, but I'm not able to get anything to work. I would appreciate some support with this, since the documentation you have provided has not been helpful.

@PratMaha changed the title from "Unable to setup a custom CL on FPGA and run it through my host script" to "Unable to setup a custom CL on F2 instance FPGA and run it through my host script" on Feb 19, 2025
@czfpga
Contributor

czfpga commented Feb 19, 2025

@PratMaha,

Have you fully verified the CL in simulation to ensure that it works as expected? Please refer to this doc for details on running RTL simulation. If the design has passed simulation, I recommend checking that the correct AFI is loaded on the instance. Before passing large chunks of data, start with something basic, like accessing a CL register, to ensure the shell interfaces are functional.
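
As a rough illustration of the "start with a basic register access" advice, here is a minimal write/read-back smoke test. It is only a sketch: `reg_io_t`, `mock_poke`, `mock_peek`, and `reg_smoke_test` are hypothetical names introduced for this example, and the backend is an in-memory array so the code runs without an FPGA. On a real instance the two callbacks would presumably wrap the SDK's `fpga_pci_poke()` (which takes the value) and `fpga_pci_peek()` (which takes a pointer to the result), and the register offset would have to be one your CL actually implements.

```c
#include <stdint.h>

/* Hypothetical register-access backend (not part of the AWS SDK).
 * On hardware, poke/peek would wrap fpga_pci_poke()/fpga_pci_peek(). */
typedef struct {
    int (*poke)(void *ctx, uint64_t offset, uint32_t value);
    int (*peek)(void *ctx, uint64_t offset, uint32_t *value);
    void *ctx;
} reg_io_t;

#define MOCK_NUM_REGS 16

/* Mock backend: a small array of 32-bit registers, word-aligned access only. */
static int mock_poke(void *ctx, uint64_t offset, uint32_t value) {
    uint32_t *regs = ctx;
    if (offset % 4 != 0 || offset / 4 >= MOCK_NUM_REGS) return -1;
    regs[offset / 4] = value;
    return 0;
}

static int mock_peek(void *ctx, uint64_t offset, uint32_t *value) {
    const uint32_t *regs = ctx;
    if (offset % 4 != 0 || offset / 4 >= MOCK_NUM_REGS) return -1;
    *value = regs[offset / 4];
    return 0;
}

/* Write a pattern, read it back, and compare.
 * Returns 0 on match, 1 on mismatch, -1 on access error. */
static int reg_smoke_test(const reg_io_t *io, uint64_t offset, uint32_t pattern) {
    uint32_t readback = 0;
    if (io->poke(io->ctx, offset, pattern) != 0) return -1;
    if (io->peek(io->ctx, offset, &readback) != 0) return -1;
    return readback == pattern ? 0 : 1;
}
```

To try it, create a `uint32_t regs[MOCK_NUM_REGS]` array, fill a `reg_io_t` with the two mock callbacks and that array as `ctx`, and call `reg_smoke_test(&io, 0x00, 0xDEADBEEF)`. Keeping the access layer behind callbacks means the same check can run against the mock first and against the real BAR later.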

@PratMaha
Author

PratMaha commented Feb 19, 2025

EDIT: I got past the previous issues. Now I'm getting this error:
ERROR: [VRFC 10-2996] 'ocl' is not found for implicit .* port connection [/home/ubuntu/src/project_data/aws-fpga/hdk/common/verif/models/fpga/fpga.sv:42]
ERROR: [XSIM 43-3322] Static elaboration of top level Verilog design unit(s) in library work failed.
This is my top-level module:
module cl_gpu(
    input clk_main_a0,
    input rst_main_n,
    axi4_lite.slave ocl
);

@czfpga
Contributor

czfpga commented Feb 19, 2025

Refer to one of the HDK examples for the CL top-module I/O connections: https://github.com/aws/aws-fpga/blob/f2/hdk/cl/examples/cl_sde/design/cl_sde.sv#L23-L27. The CL-Shell interfaces are already defined and fixed there; they cannot be modified.

@PratMaha
Author

PratMaha commented Feb 19, 2025

So, to clarify, these are the interfaces I use to connect my host to my CL design right?
What do I implement in my host code for the communication?

@PratMaha
Author

For reference, this is my host code. Does it look about right?
#include <fpga_pci.h>
#include <fpga_mgmt.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define APP_PF_BAR0 0
#define RESET_DELAY_US 250

void send_instruction(pci_bar_handle_t bar0, const char* line) {
    static uint32_t ctrl_val = 0;
    static uint32_t data_val = 0;
    int rc;

    if(line[0] == 'S') {  // STATE command (type 0)
        sscanf(line, "STATE %x", &ctrl_val);
        // fpga_pci_poke takes the value itself, not a pointer to it
        rc = fpga_pci_poke(bar0, 0x00, ctrl_val);
        if (rc) {
            fprintf(stderr, "Write error: %s\n", fpga_mgmt_strerror(rc));
            exit(rc);
        }
    }
    else if(line[0] == 'D') {  // DATA command (type 1)
        sscanf(line, "DATA %x", &data_val);
        rc = fpga_pci_poke(bar0, 0x04, data_val);
        if (rc) {
            fprintf(stderr, "Write error: %s\n", fpga_mgmt_strerror(rc));
            exit(rc);
        }
    }
    else if(line[0] == 'R') {  // READ_DATA command (type 1)
        uint32_t read_data;
        rc = fpga_pci_peek(bar0, 0x08, &read_data);
        if (rc) {
            fprintf(stderr, "Read error: %s\n", fpga_mgmt_strerror(rc));
            exit(rc);
        }
        printf("READ_DATA %08x\n", read_data);
    }

    // Wait for GPU acknowledgment
    uint32_t status;
    do {
        rc = fpga_pci_peek(bar0, 0x0C, &status);
        if (rc) {
            fprintf(stderr, "Read error: %s\n", fpga_mgmt_strerror(rc));
            exit(rc);
        }
    } while(!(status & 0x1));  // Block until ack
}

int main() {
    // Initialize the management library before any other SDK call
    int rc = fpga_mgmt_init();
    if (rc) {
        fprintf(stderr, "Error %d: %s\n", rc, fpga_mgmt_strerror(rc));
        exit(1);
    }

    pci_bar_handle_t bar0;
    rc = fpga_pci_attach(0, FPGA_APP_PF, APP_PF_BAR0, 0, &bar0);
    if (rc) {
        fprintf(stderr, "Error %d: %s\n", rc, fpga_mgmt_strerror(rc));
        exit(1);
    }

    // F2-compliant reset sequence
    const uint32_t reset_seq[] = {0x1, 0x0};
    for(int i = 0; i < 2; i++) {
        rc = fpga_pci_poke(bar0, 0x20, reset_seq[i]);
        if (rc) {
            fprintf(stderr, "Write error: %s\n", fpga_mgmt_strerror(rc));
            exit(rc);
        }
        usleep(RESET_DELAY_US);
    }

    // Execute command stream
    FILE* cmd_file = fopen("gpu_instructions.txt", "r");
    if (cmd_file == NULL) {  // fgets on a NULL stream would crash
        perror("gpu_instructions.txt");
        fpga_pci_detach(bar0);
        exit(1);
    }
    char cmd_line[256];
    while(fgets(cmd_line, sizeof(cmd_line), cmd_file)) {
        if(cmd_line[0] == '#') continue;  // Skip comments
        send_instruction(bar0, cmd_line);
    }
    fclose(cmd_file);

    fpga_pci_detach(bar0);
    return 0;
}
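
One thing worth noting about the acknowledgment loop in the host code above: it spins until bit 0 of the status register is set, so if the CL never raises the ack (for example because the wrong AFI is loaded), the program hangs. A bounded poll is one way around that. The sketch below is an assumption-laden illustration, not SDK code: `peek_fn`, `poll_ack`, `mock_status_t`, and `mock_status_peek` are hypothetical names, and the mock stands in for a callback that would wrap `fpga_pci_peek()` on real hardware.

```c
#include <stdint.h>

/* Hypothetical read callback; on hardware it would wrap fpga_pci_peek(). */
typedef int (*peek_fn)(void *ctx, uint64_t offset, uint32_t *value);

/* Poll `offset` until (value & mask) is nonzero or max_polls reads were made.
 * Returns 0 on ack, 1 on timeout, -1 if a read fails. */
static int poll_ack(peek_fn peek, void *ctx, uint64_t offset,
                    uint32_t mask, unsigned max_polls) {
    for (unsigned i = 0; i < max_polls; i++) {
        uint32_t status = 0;
        if (peek(ctx, offset, &status) != 0) return -1;
        if (status & mask) return 0;
    }
    return 1;
}

/* Mock status register: reports the ack bit once `reads_until_ack`
 * reads have happened, so both outcomes can be exercised off-target. */
typedef struct {
    unsigned reads_until_ack;
    unsigned reads;
} mock_status_t;

static int mock_status_peek(void *ctx, uint64_t offset, uint32_t *value) {
    mock_status_t *m = ctx;
    (void)offset;  /* the mock has a single register */
    m->reads++;
    *value = (m->reads >= m->reads_until_ack) ? 0x1u : 0x0u;
    return 0;
}
```

With a `mock_status_t` initialised to `{3, 0}`, `poll_ack(mock_status_peek, &m, 0x0C, 0x1, 10)` returns 0 after three reads; with `reads_until_ack` larger than `max_polls` it returns 1 instead of hanging. On hardware, a short `usleep()` between reads would probably be sensible so the loop does not saturate the bus.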

@czfpga
Contributor

czfpga commented Feb 19, 2025

Correct, the Shell and CL are connected through these interfaces.

For the host code, you can refer to the runtime example code provided in the CL examples.

@PratMaha
Author

I don't see any code in the link you have provided. Can you provide a link to a file corresponding to the same example you sent? That example has no host code in this repo.

@czfpga
Contributor

czfpga commented Feb 19, 2025

I strongly recommend reviewing the document above and going through all the CL examples to get familiar with the HDK development environment first. That will help you quickly find out the files needed in the repo as well as build your own example. The code examples can be found in the software/runtime/ directory of each example.

@PratMaha
Author

There are no examples here that have runtime code for SH to CL communication. Can you point me to one? All that's there is a null example

@PratMaha
Author

Also, how do I resolve synthesis errors like these:
ERROR: [Synth 8-5809] Error generated from encrypted envelope. [/home/ubuntu/src/project_data/aws-fpga/hdk/cl/examples/cl_gpu/build/src_post_encryption/global_mem_controller_synth.sv:35]
INFO: [Synth 8-9084] Verilog file '/home/ubuntu/src/project_data/aws-fpga/hdk/cl/examples/cl_gpu/build/src_post_encryption/global_mem_controller_synth.sv' ignored due to errors
ERROR: [Synth 8-5809] Error generated from encrypted envelope. [/home/ubuntu/src/project_data/aws-fpga/hdk/cl/examples/cl_gpu/build/src_post_encryption/gpu_die.sv:54]
ERROR: [Synth 8-5809] Error generated from encrypted envelope. [/home/ubuntu/src/project_data/aws-fpga/hdk/cl/examples/cl_gpu/build/src_post_encryption/gpu_die.sv:2]
ERROR: [Synth 8-5809] Error generated from encrypted envelope. [/home/ubuntu/src/project_data/aws-fpga/hdk/cl/examples/cl_gpu/build/src_post_encryption/cl_gpu.sv:2]
Like I said, I've run everything on Vivado locally, but it's failing on AWS

@czfpga
Contributor

czfpga commented Feb 19, 2025

You can comment out these two lines in your example directory. It will stop the source encryption so that you can see the true errors.

@PratMaha
Author

PratMaha commented Feb 24, 2025

Is there any basic example that works completely (without caveats such as XDMA not being supported) for Host->FPGA PCIe communication?
I want to minimise the effort involved in moving from my local Vivado setup to AWS (I don't have enough time to ramp up and learn every detail of the provided examples), but there's no useful example that just demonstrates basic Host<->FPGA PCIe communication.

@PratMaha
Author

PratMaha commented Feb 26, 2025

Okay, so I'm getting this error right now (some protocol_checker error):
tb.card.u_ddr4_rdimm.rcd_enabled.NOT_LRDIMM.u_ddr4_dimm.rank_instances[1].mc_ca_mirroring_odd_rank.u_ddr4_rank.Micron_model.instance_of_sdram_devices[17].micron_mem_model.u_ddr4_model:Configured as x4 16G stack:1
5052.000 ns : tb.card.fpga.sh.axi_pc_mstr_inst_pcim.REP : BIT( 0) : ERROR : Invalid state x
5052.000 ns : tb.card.fpga.sh.axi_pc_mstr_inst_pcim.REP : BIT( 7) : ERROR : Invalid state x
5052.000 ns : tb.card.fpga.sh.axi_pc_mstr_inst_pcim.REP : BIT( 37) : ERROR : Invalid state x
5052.000 ns : tb.card.fpga.sh.axi_pc_mstr_inst_pcim.REP : BIT( 44) : ERROR : Invalid state x
5052.000 ns : tb.card.fpga.sh.axi_pc_mstr_inst_pcim.REP : BIT( 90) : ERROR : Invalid state x
5052.000 ns : tb.card.fpga.sh.axi_pc_mstr_inst_pcim.REP : BIT( 91) : ERROR : Invalid state x
5052.000 ns : tb.card.fpga.sh.axl_pc_ocl_slv_inst.REP : BIT( 59) : ERROR : Invalid state x
5052.000 ns : tb.card.fpga.sh.axl_pc_ocl_slv_inst.REP : BIT( 60) : ERROR : Invalid state x
5052.000 ns : tb.card.fpga.sh.axl_pc_ocl_slv_inst.REP : BIT( 83) : ERROR : Invalid state x
5052.000 ns : tb.card.fpga.sh.axl_pc_ocl_slv_inst.REP : BIT( 84) : ERROR : Invalid state x
5052.000 ns : tb.card.fpga.sh.axl_pc_sda_slv_inst.REP : BIT( 59) : ERROR : Invalid state x
5052.000 ns : tb.card.fpga.sh.axl_pc_sda_slv_inst.REP : BIT( 60) : ERROR : Invalid state x
5052.000 ns : tb.card.fpga.sh.axl_pc_sda_slv_inst.REP : BIT( 83) : ERROR : Invalid state x
5052.000 ns : tb.card.fpga.sh.axl_pc_sda_slv_inst.REP : BIT( 84) : ERROR : Invalid state x
[ 6100.500 ns] : Detected 0 errors
[ 6100.500 ns] : Checking protocol checker error status...
[ 6100.500 ns] : *** 'X' propagation detected in protocol checker status bits. Please dump waves and look at pc_status bits for more information***

This is my top-level module :
module cl_gpu(
input clk_main_a0,
input rst_main_n,

// PCIS AXI-4 interface between host and cl
// Address Write Signals
input logic [15:0]  sh_cl_dma_pcis_awid,
input logic [63:0]  sh_cl_dma_pcis_awaddr,
input logic [7:0]   sh_cl_dma_pcis_awlen,
input logic [2:0]   sh_cl_dma_pcis_awsize,
input logic [1:0]   sh_cl_dma_pcis_awburst,
input logic         sh_cl_dma_pcis_awvalid,
output logic        cl_sh_dma_pcis_awready,
// Data Write Signals
input logic [15:0]  sh_cl_dma_pcis_wid,
input logic [511:0] sh_cl_dma_pcis_wdata,
input logic [63:0]  sh_cl_dma_pcis_wstrb,
input logic         sh_cl_dma_pcis_wlast,
input logic         sh_cl_dma_pcis_wvalid,
output logic        cl_sh_dma_pcis_wready,
// Response Write Signals
output logic [15:0] cl_sh_dma_pcis_bid,
output logic [1:0]  cl_sh_dma_pcis_bresp,
output logic        cl_sh_dma_pcis_bvalid,
input logic         sh_cl_dma_pcis_bready,
// Address Read Signals
input logic [15:0]  sh_cl_dma_pcis_arid,
input logic [63:0]  sh_cl_dma_pcis_araddr,
input logic [7:0]   sh_cl_dma_pcis_arlen,
input logic [2:0]   sh_cl_dma_pcis_arsize,
input logic [1:0]   sh_cl_dma_pcis_arburst,
input logic         sh_cl_dma_pcis_arvalid,
output logic        cl_sh_dma_pcis_arready,
// Data Read Signals
output logic [15:0] cl_sh_dma_pcis_rid,
output logic [511:0] cl_sh_dma_pcis_rdata,
output logic [1:0]  cl_sh_dma_pcis_rresp,
output logic        cl_sh_dma_pcis_rlast,
output logic        cl_sh_dma_pcis_rvalid,
input logic         sh_cl_dma_pcis_rready

);

And this is my testbench:
`include "common_base_test.svh"

module cl_gpu_base_test();
  import tb_type_defines_pkg::*;

  initial begin
    tb.power_up();
    #500
    tb.power_down();
    report_pass_fail_status();
    $finish;
  end
endmodule

What's going on? Why is such a simple testbench not working?

@PratMaha
Author

I'm also getting this error when I try to push my design through the RTL->AFI flow:
AWS FPGA: (22:27:44): Start linking customer design cl_gpu

add_files ${AWS_DCP_DIR}/cl_bb_routed.${SHELL_MODE}.dcp
add_files ${checkpoints_dir}/${CL}.${TAG}.post_synth.dcp
set_property SCOPED_TO_CELLS {WRAPPER/CL} \
[get_files ${checkpoints_dir}/${CL}.${TAG}.post_synth.dcp]
link_design -mode default \
-reconfig_partitions {WRAPPER/CL} \
-top top

Command: link_design -mode default -reconfig_partitions WRAPPER/CL -top top
Design is defaulting to srcset: sources_1
Design is defaulting to constrset: constrs_1
Design is defaulting to dcp part: xcvu47p-fsvh2892-2-e
INFO: [Project 1-454] Reading design checkpoint '/home/ubuntu/src/project_data/aws-fpga/hdk/cl/examples/cl_gpu/build/checkpoints/cl_gpu.2025_02_26-222645.post_synth.dcp' for cell 'WRAPPER/CL'
1 Infos, 0 Warnings, 0 Critical Warnings and 1 Errors encountered.
link_design failed
ERROR: [Netlist 29-77] Could not replace (cell 'cl_simple_bb', library 'work_CL_CL_1', file 'NOFILE') with (cell 'cl_gpu', library 'work', file 'cl_gpu.edf') because of a port interface mismatch; 3235 ports are missing on the replacing cell. 5 of the missing ports are: 'CLK_DIMM_DN' 'CLK_DIMM_DP' 'M_ACT_N' 'sh_cl_status_vdip[1]' 'sh_cl_status_vdip[0]'.
Resolution: Modify RTL to reference correct ports from the netlist

while executing

"source ${scripts_dir}/build_level_1_cl.tcl"
("default" arm line 3)
invoked from within
"switch $BUILD_FLOW {
"SynthCL" {
source ${scripts_dir}/synth_${CL}.tcl
}

"ImplCL" {
source ${scripts_dir}/build_level_1_cl.tcl
}

d..."
(file "build_all.tcl" line 265)
INFO: [Common 17-206] Exiting Vivado at Wed Feb 26 22:27:46 2025...
Save the existing to_aws/ to to_aws_backup_2025_02_26-222747/
ERROR: Did not find the post-route DCP file from /home/ubuntu/src/project_data/aws-fpga/hdk/cl/examples/cl_gpu/build/checkpoints/

Any idea what I could be doing wrong?


2 participants