Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CTG tests are not reproducible #566

Open
trdthg opened this issue Nov 21, 2024 · 9 comments · May be fixed by #570
Open

CTG tests are not reproducible #566

trdthg opened this issue Nov 21, 2024 · 9 comments · May be fixed by #570

Comments

@trdthg
Copy link
Contributor

trdthg commented Nov 21, 2024

After some testing, I found that the tests ctg (without -r option) generated each time is different. This is completely different from what I expected. I also had another person try it, and the results were also different.

I think Tim have already found this issue a few month ago: riscv-software-src/riscv-ctg#115

I'm trying to fix this, but simply appending a line random.seed(10) after all import random statements does not solve the problem. I will check it again.

Update: The reason is the use of set in the code, and I am replacing them with ordered_set


if you don't know what -r option does:

-r, --randomize Randomize Outputs.

If randomization is enabled we use the MinConflictsSolver solver to find solutions.

if self.random:
problem = Problem(MinConflictsSolver())
else:
problem = Problem()

I haven't tested the behavior of MinConflictsSolver, which will use random inside

Originally posted by @trdthg in #565 (comment)


Update: In summary, two steps are needed to solve the problem here

@trdthg trdthg changed the title I thought this is strange too. CTG tests are not reproducible Nov 21, 2024
@trdthg
Copy link
Contributor Author

trdthg commented Nov 26, 2024

But I don't know how to handle random, according to the following code, it seems intentional, or it is necessary because we need to choose one from various solutions

# Randomly choose an instruction
instr = random.choice(instrs_sol)
isa_set += (self.OP_TEMPLATE[instr]['isa'])

However, I think we can set the random seed as a parameter, which ensures reproducibility while also allowing for some randomness

@allenjbaum
Copy link
Collaborator

allenjbaum commented Nov 26, 2024 via email

@trdthg
Copy link
Contributor Author

trdthg commented Nov 26, 2024

I roughly understand what you mean; ctg only generates new test cases for points that have not yet been covered. I will give it a try

I highly recommend adding an option to filter the instruction set, as it would make debugging much simpler, and I will send a PR later, I've already add a --filter=<PATH> for riscof riscv-software-src/riscof#127

@allenjbaum
Copy link
Collaborator

If you are generating tests for a specific extension, you can add a filter,
Riscof generates a test list default, so you could filter it, but there is also an option to skip generating the test list and just test using an existing test list, which probably does what you want.

@trdthg
Copy link
Contributor Author

trdthg commented Dec 3, 2024

Some other research about random

Currently riscv_ctg will use random in the following places

  • misc/bitmanip_real_world.py: It is a standalone script, and the seed has already been set inside
  • GeneratorCSRComb(base_isa, xlen, randomize): The parameters were passed, but not used
  • cross_comb.py: There are many random calls here, but I didn't trigger them during testing, and I didn't find any cgf that used cross_comb
  • Generator: Used to select MinConflictsSolver(which use random inside), if -r is not used, this is disabled

So currently, it seems that there is no need to modify any random seed, #570 is enough

@allenjbaum
Copy link
Collaborator

allenjbaum commented Dec 3, 2024 via email

@trdthg
Copy link
Contributor Author

trdthg commented Dec 4, 2024

No, that will be another PR, #570 only fix reproducible when you rerun riscv_ctg

@allenjbaum
Copy link
Collaborator

allenjbaum commented Dec 4, 2024 via email

@trdthg
Copy link
Contributor Author

trdthg commented Dec 4, 2024

Oh, sorry I misunderstood your meaning, I thought you were saying that you are comparing with the existing test cases in the ctg repository currently.

should it?

Yes


Here are some of my test results

If you for example

  • Whether adding or reducing (including inserting and appending)
  • Or just adjust the order

The results of the generated test cases may be different.

I did a test: taking sraw as an example, I added a new coverage point to its val_comb

    val_comb:
      'rs1_val == 0 and rs2_val >= 0 and rs2_val < xlen': 0
      'rs1_val < 0 and rs2_val > 0 and rs2_val < xlen': 0
      'rs1_val > 0 and rs2_val > 0 and rs2_val < xlen': 0
      'rs1_val < 0 and rs2_val == 0': 0
+      'rs1_val == (-2**(xlen-1)) and rs2_val >= 0 and rs2_val < xlen': 0
      'rs1_val == rs2_val and rs2_val > 0 and rs2_val < xlen': 0
      'rs1_val > 0 and rs2_val == 0': 0
      'rs1_val == (2**(xlen-1)-1) and rs2_val >= 0 and rs2_val < xlen': 0
      'rs1_val == 1 and rs2_val >= 0 and rs2_val < xlen': 0

结果是这样的

diff --git a/tests/sraw-01.S b/tests/sraw-01.S
index 549b42a1..a135e14e 100644
--- a/tests/sraw-01.S
+++ b/tests/sraw-01.S
@@ -2,7 +2,7 @@
 // -----------
 // This file was generated by riscv_ctg (https://github.com/riscv-software-src/riscv-ctg)
 // version   : 0.12.2
-// timestamp : Wed Dec  4 10:06:40 2024 GMT
+// timestamp : Wed Dec  4 10:06:56 2024 GMT
 // usage     : riscv_ctg \
 //                  -- cgf //                  --cgf /workspaces/riscv-arch-test/coverage/dataset.cgf \
 //                  --cgf /workspaces/riscv-arch-test/coverage/i/rv64i.cgf \
@@ -64,9 +64,9 @@ inst_5:
 TEST_RR_OP(sraw, x26, x25, x24, -0x1, 0x7fffffffffffffff, 0x1f, x1, 5*XLEN/8, x2)
 
 inst_6:
-// rs1==x24, rs2==x26, rd==x25, rs1_val < 0 and rs2_val > 0 and rs2_val < xlen, 
-// opcode: sraw ; op1:x24; op2:x26; dest:x25; op1val:-0xb504f332;  op2val:0x1f
-TEST_RR_OP(sraw, x25, x24, x26, 0x0, -0xb504f332, 0x1f, x1, 6*XLEN/8, x2)
+// rs1==x24, rs2==x26, rd==x25, rs1_val == (-2**(xlen-1)) and rs2_val >= 0 and rs2_val < xlen, rs1_val < 0 and rs2_val > 0 and rs2_val < xlen
+// opcode: sraw ; op1:x24; op2:x26; dest:x25; op1val:-0x8000000000000000;  op2val:0x1f
+TEST_RR_OP(sraw, x25, x24, x26, 0x0, -0x8000000000000000, 0x1f, x1, 6*XLEN/8, x2)
 
 inst_7:
 // rs1==x26, rs2==x25, rd==x24, rs1_val < 0 and rs2_val == 0, 
@@ -208,6 +208,21 @@ inst_34:
 // rs1_val == 1 and rs2_val >= 0 and rs2_val < xlen, 
 // opcode: sraw ; op1:x30; op2:x29; dest:x31; op1val:0x1;  op2val:0x1f
 TEST_RR_OP(sraw, x31, x30, x29, 0x0, 0x1, 0x1f, x9, 10*XLEN/8, x10)
+
+inst_35:
+// 
+// opcode: sraw ; op1:x30; op2:x29; dest:x31; op1val:-0x8000000000000000;  op2val:0x0
+TEST_RR_OP(sraw, x31, x30, x29, 0x0, -0x8000000000000000, 0x0, x9, 11*XLEN/8, x10)
+
+inst_36:
+// 
+// opcode: sraw ; op1:x30; op2:x29; dest:x31; op1val:-0x8000000000000000;  op2val:0x0
+TEST_RR_OP(sraw, x31, x30, x29, 0x0, -0x8000000000000000, 0x0, x9, 12*XLEN/8, x10)
+
+inst_37:
+// 
+// opcode: sraw ; op1:x30; op2:x29; dest:x31; op1val:-0x8000000000000000;  op2val:0x0
+TEST_RR_OP(sraw, x31, x30, x29, 0x0, -0x8000000000000000, 0x0, x9, 13*XLEN/8, x10)
 #endif
 
 
@@ -239,7 +254,7 @@ signature_x1_1:
 
 
 signature_x9_0:
-    .fill 11*((XLEN/8)/4),4,0xdeadbeef
+    .fill 14*((XLEN/8)/4),4,0xdeadbeef
 
 #ifdef rvtest_mtrap_routine
 tsig_begin_canary:

If you have added a new cover point, now given an input combination,ctg will try to find test cases that cover as many points as possible. Therefore, some things in inst_6 have been modified to cover more coverage points at the same time

However, these changes can be reviewed through the comments on the test cases, I think it's fine

About adjusting the order: I have tested changing the order, but the generated result is the same. I think at least the order of cover points in comment should be different, maybe points has been sorted in some place, I don't know yet

@trdthg trdthg linked a pull request Dec 23, 2024 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants