CTG tests are not reproducible #566

trdthg · 2024-11-21T02:43:56Z

After some testing, I found that the tests ctg (without -r option) generated each time is different. This is completely different from what I expected. I also had another person try it, and the results were also different.

I think Tim have already found this issue a few month ago: riscv-software-src/riscv-ctg#115

I'm trying to fix this, but simply appending a line random.seed(10) after all import random statements does not solve the problem. I will check it again.

Update: The reason is the use of set in the code, and I am replacing them with ordered_set

if you don't know what -r option does:

-r, --randomize Randomize Outputs.

If randomization is enabled we use the MinConflictsSolver solver to find solutions.

riscv-arch-test/riscv-ctg/riscv_ctg/generator.py

Lines 493 to 496 in cd94912

    
           if self.random: 
        
               problem = Problem(MinConflictsSolver()) 
        
           else: 
        
               problem = Problem()

I haven't tested the behavior of MinConflictsSolver, which will use random inside

Originally posted by @trdthg in #565 (comment)

Update: In summary, two steps are needed to solve the problem here

~~random: <>~~
set: [CTG] Replace set with OrderedSet #570

The text was updated successfully, but these errors were encountered:

trdthg · 2024-11-26T03:16:44Z

But I don't know how to handle random, according to the following code, it seems intentional, or it is necessary because we need to choose one from various solutions

riscv-arch-test/riscv-ctg/riscv_ctg/cross_comb.py

Lines 223 to 225 in cd94912

    
           # Randomly choose an instruction 
        
           instr = random.choice(instrs_sol) 
        
           isa_set += (self.OP_TEMPLATE[instr]['isa'])

However, I think we can set the random seed as a parameter, which ensures reproducibility while also allowing for some randomness

allenjbaum · 2024-11-26T04:57:44Z

While I wish it were reproducible, what is important is really the coverage. The only reason tests are recreated is to add specific cover points, Test generation should not be adding tests that don't add new coverpoints, so as CTG runs, it must be keeping track of how much coverage has already been found. If we specify just new coverpoints, AND add the capability of pointing to existing cover points from a previous run, then new tests that match only existing coverpoints could be dropped, only tests matching new coverpoints will be added, and existing tests won't remain unchanged. That would make reviewing new tests much simpler. I don't think that's an option that CTG has, though internally it must do something like that when creating tests, so shouldn't be hard to add. (I can say this because I know I'm incapable of making that change....:-

…

On Mon, Nov 25, 2024 at 7:17 PM MingZhu Yan ***@***.***> wrote: But I don't know how to handle random, according to the following code, it seems intentional ( I haven't looked closely at why to do this way) https://github.com/riscv-non-isa/riscv-arch-test/blob/cd94912fed2aab75d7d5f115b441da0813fdce8d/riscv-ctg/riscv_ctg/cross_comb.py#L223-L225 — Reply to this email directly, view it on GitHub <#566 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AHPXVJVNH4AH2ITBER3VM5L2CPR3JAVCNFSM6AAAAABSF6BSO2VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDIOJZGU3DIMZVGQ> . You are receiving this because you are subscribed to this thread.Message ID: ***@***.***>

trdthg · 2024-11-26T06:00:45Z

I roughly understand what you mean; ctg only generates new test cases for points that have not yet been covered. I will give it a try

I highly recommend adding an option to filter the instruction set, as it would make debugging much simpler, and I will send a PR later, I've already add a --filter=<PATH> for riscof riscv-software-src/riscof#127

allenjbaum · 2024-11-26T06:11:27Z

If you are generating tests for a specific extension, you can add a filter,
Riscof generates a test list default, so you could filter it, but there is also an option to skip generating the test list and just test using an existing test list, which probably does what you want.

trdthg · 2024-12-03T03:45:45Z

Some other research about random

Currently riscv_ctg will use random in the following places

misc/bitmanip_real_world.py: It is a standalone script, and the seed has already been set inside
GeneratorCSRComb(base_isa, xlen, randomize): The parameters were passed, but not used
cross_comb.py: There are many random calls here, but I didn't trigger them during testing, and I didn't find any cgf that used cross_comb
Generator: Used to select MinConflictsSolver(which use random inside), if -r is not used, this is disabled

So currently, it seems that there is no need to modify any random seed, #570 is enough

allenjbaum · 2024-12-03T19:05:35Z

Just to be clear: PR#570 will enable someone to add coverage, but not change the existing test?

…

On Mon, Dec 2, 2024 at 7:46 PM MingZhu Yan ***@***.***> wrote: Some other research about random Currently riscv_ctg will use random in the following places - misc/bitmanip_real_world.py: It is a standalone script, and the seed has already been set inside - GeneratorCSRComb(base_isa, xlen, randomize): The parameters were passed, but not used - cross_comb.py: There are many random calls here, but I didn't trigger them during testing, and I didn't find any cgf that used cross_comb - Generator: Used to select MinConflictsSolver(which use random inside), if -r is not used, this is disabled So currently, it seems that there is no need to modify any random seed, #570 <#570> is enough — Reply to this email directly, view it on GitHub <#566 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AHPXVJTIZ533EYP33K7APIL2DUSP7AVCNFSM6AAAAABSF6BSO2VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDKMJTGQ4DKNBYGE> . You are receiving this because you commented.Message ID: ***@***.***>

trdthg · 2024-12-04T00:53:07Z

No, that will be another PR, #570 only fix reproducible when you rerun riscv_ctg

allenjbaum · 2024-12-04T08:46:40Z

I'm not quite sure of the difference. If eCTG is rerun with this fix, then then running it twice should produce the same tests. If I then run it again with extra coverpoints, then it should reproduce the same tests, and then add more test for the extra coverpoints (assuming they haven't eren covered by the earlier tests), should it?

…

On Tue, Dec 3, 2024 at 4:53 PM MingZhu Yan ***@***.***> wrote: No, that will be another PR, #570 <#570> only fix reproducible when you rerun riscv_ctg — Reply to this email directly, view it on GitHub <#566 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AHPXVJSI62YDNLIHQSN4AVT2DZHARAVCNFSM6AAAAABSF6BSO2VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDKMJVHA4TONRYG4> . You are receiving this because you commented.Message ID: ***@***.***>

trdthg · 2024-12-04T10:23:54Z

Oh, sorry I misunderstood your meaning, I thought you were saying that you are comparing with the existing test cases in the ctg repository currently.

should it?

Yes

Here are some of my test results

If you for example

Whether adding or reducing (including inserting and appending)
Or just adjust the order

The results of the generated test cases may be different.

I did a test: taking sraw as an example, I added a new coverage point to its val_comb

    val_comb:
      'rs1_val == 0 and rs2_val >= 0 and rs2_val < xlen': 0
      'rs1_val < 0 and rs2_val > 0 and rs2_val < xlen': 0
      'rs1_val > 0 and rs2_val > 0 and rs2_val < xlen': 0
      'rs1_val < 0 and rs2_val == 0': 0
+      'rs1_val == (-2**(xlen-1)) and rs2_val >= 0 and rs2_val < xlen': 0
      'rs1_val == rs2_val and rs2_val > 0 and rs2_val < xlen': 0
      'rs1_val > 0 and rs2_val == 0': 0
      'rs1_val == (2**(xlen-1)-1) and rs2_val >= 0 and rs2_val < xlen': 0
      'rs1_val == 1 and rs2_val >= 0 and rs2_val < xlen': 0

结果是这样的

diff --git a/tests/sraw-01.S b/tests/sraw-01.S
index 549b42a1..a135e14e 100644
--- a/tests/sraw-01.S
+++ b/tests/sraw-01.S
@@ -2,7 +2,7 @@
 // -----------
 // This file was generated by riscv_ctg (https://github.com/riscv-software-src/riscv-ctg)
 // version   : 0.12.2
-// timestamp : Wed Dec  4 10:06:40 2024 GMT
+// timestamp : Wed Dec  4 10:06:56 2024 GMT
 // usage     : riscv_ctg \
 //                  -- cgf //                  --cgf /workspaces/riscv-arch-test/coverage/dataset.cgf \
 //                  --cgf /workspaces/riscv-arch-test/coverage/i/rv64i.cgf \
@@ -64,9 +64,9 @@ inst_5:
 TEST_RR_OP(sraw, x26, x25, x24, -0x1, 0x7fffffffffffffff, 0x1f, x1, 5*XLEN/8, x2)
 
 inst_6:
-// rs1==x24, rs2==x26, rd==x25, rs1_val < 0 and rs2_val > 0 and rs2_val < xlen, 
-// opcode: sraw ; op1:x24; op2:x26; dest:x25; op1val:-0xb504f332;  op2val:0x1f
-TEST_RR_OP(sraw, x25, x24, x26, 0x0, -0xb504f332, 0x1f, x1, 6*XLEN/8, x2)
+// rs1==x24, rs2==x26, rd==x25, rs1_val == (-2**(xlen-1)) and rs2_val >= 0 and rs2_val < xlen, rs1_val < 0 and rs2_val > 0 and rs2_val < xlen
+// opcode: sraw ; op1:x24; op2:x26; dest:x25; op1val:-0x8000000000000000;  op2val:0x1f
+TEST_RR_OP(sraw, x25, x24, x26, 0x0, -0x8000000000000000, 0x1f, x1, 6*XLEN/8, x2)
 
 inst_7:
 // rs1==x26, rs2==x25, rd==x24, rs1_val < 0 and rs2_val == 0, 
@@ -208,6 +208,21 @@ inst_34:
 // rs1_val == 1 and rs2_val >= 0 and rs2_val < xlen, 
 // opcode: sraw ; op1:x30; op2:x29; dest:x31; op1val:0x1;  op2val:0x1f
 TEST_RR_OP(sraw, x31, x30, x29, 0x0, 0x1, 0x1f, x9, 10*XLEN/8, x10)
+
+inst_35:
+// 
+// opcode: sraw ; op1:x30; op2:x29; dest:x31; op1val:-0x8000000000000000;  op2val:0x0
+TEST_RR_OP(sraw, x31, x30, x29, 0x0, -0x8000000000000000, 0x0, x9, 11*XLEN/8, x10)
+
+inst_36:
+// 
+// opcode: sraw ; op1:x30; op2:x29; dest:x31; op1val:-0x8000000000000000;  op2val:0x0
+TEST_RR_OP(sraw, x31, x30, x29, 0x0, -0x8000000000000000, 0x0, x9, 12*XLEN/8, x10)
+
+inst_37:
+// 
+// opcode: sraw ; op1:x30; op2:x29; dest:x31; op1val:-0x8000000000000000;  op2val:0x0
+TEST_RR_OP(sraw, x31, x30, x29, 0x0, -0x8000000000000000, 0x0, x9, 13*XLEN/8, x10)
 #endif
 
 
@@ -239,7 +254,7 @@ signature_x1_1:
 
 
 signature_x9_0:
-    .fill 11*((XLEN/8)/4),4,0xdeadbeef
+    .fill 14*((XLEN/8)/4),4,0xdeadbeef
 
 #ifdef rvtest_mtrap_routine
 tsig_begin_canary:

If you have added a new cover point, now given an input combination，ctg will try to find test cases that cover as many points as possible. Therefore, some things in inst_6 have been modified to cover more coverage points at the same time

However, these changes can be reviewed through the comments on the test cases, I think it's fine

About adjusting the order: I have tested changing the order, but the generated result is the same. I think at least the order of cover points in comment should be different, maybe points has been sorted in some place, I don't know yet

UmerShahidengr · 2024-12-31T09:41:54Z

Going to merge #570 well done @trdthg . Will close this one after merging the PR

trdthg changed the title ~~I thought this is strange too.~~ CTG tests are not reproducible Nov 21, 2024

trdthg mentioned this issue Nov 26, 2024

[CTG] Add new cli param '--filter' #571

Merged

trdthg mentioned this issue Dec 23, 2024

[CTG] Replace set with OrderedSet #570

Merged

UmerShahidengr closed this as completed Dec 31, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

CTG tests are not reproducible #566

CTG tests are not reproducible #566

trdthg commented Nov 21, 2024 •

edited

Loading

trdthg commented Nov 26, 2024 •

edited

Loading

allenjbaum commented Nov 26, 2024 via email

trdthg commented Nov 26, 2024

allenjbaum commented Nov 26, 2024

trdthg commented Dec 3, 2024

allenjbaum commented Dec 3, 2024 via email

trdthg commented Dec 4, 2024

allenjbaum commented Dec 4, 2024 via email

trdthg commented Dec 4, 2024 •

edited

Loading

UmerShahidengr commented Dec 31, 2024

CTG tests are not reproducible #566

CTG tests are not reproducible #566

Comments

trdthg commented Nov 21, 2024 • edited Loading

trdthg commented Nov 26, 2024 • edited Loading

allenjbaum commented Nov 26, 2024 via email

trdthg commented Nov 26, 2024

allenjbaum commented Nov 26, 2024

trdthg commented Dec 3, 2024

allenjbaum commented Dec 3, 2024 via email

trdthg commented Dec 4, 2024

allenjbaum commented Dec 4, 2024 via email

trdthg commented Dec 4, 2024 • edited Loading

UmerShahidengr commented Dec 31, 2024

trdthg commented Nov 21, 2024 •

edited

Loading

trdthg commented Nov 26, 2024 •

edited

Loading

trdthg commented Dec 4, 2024 •

edited

Loading