Blacksmith on non-Coffee Lake CPUs #4

Open
hariv opened this issue Nov 18, 2021 · 31 comments
Comments

@hariv

hariv commented Nov 18, 2021

Did anyone try running blacksmith on CPUs other than Coffee Lake?

I was able to run it successfully on Kaby Lake, but it didn't work on Comet Lake. It errors out immediately saying it could not find conflicting address sets and asks if the number of banks has been defined correctly (which I checked is correct).

@pjattke
Collaborator

pjattke commented Nov 18, 2021

Hi @hariv

The DRAM address functions are obtained from an i7-8700K (Coffee Lake). It is very likely that those functions are different on other CPUs. In this case, you would need to first reverse-engineer them (e.g., using DRAMA or TRRespass' DRAMA) and then update the DRAM addressing matrices in DRAMAddr.cpp.

@hariv
Author

hariv commented Nov 18, 2021

Got it. Thank you @pjattke.

@hariv hariv closed this as completed Nov 18, 2021
@heechul

heechul commented Nov 20, 2021

Hi @pjattke,
I got the following output from TRRespass's DRAMA on a Skylake machine. Can you explain how DRAMAddr.cpp should be modified to reflect this mapping? Thanks.

         Valid Function: 0x4080                  bits: 7 + 14
         Valid Function: 0x88000                 bits: 15 + 19
         Valid Function: 0x110000                bits: 16 + 20
         Valid Function: 0x220000                bits: 17 + 21
         Valid Function: 0x440000                bits: 18 + 22
         Valid Function: 0x4b300                 bits: 8 + 9 + 12 + 13 + 15 + 18
0x4080
0x88000
0x110000
0x220000
0x440000
0x4b300

@pjattke pjattke reopened this Nov 23, 2021
@DominikBucko

Could we get a follow-up on this? I used the DRAMA tool as well and obtained the memory functions, but I don't know how to enter them into the code.

@pjattke
Collaborator

pjattke commented Nov 28, 2021

Dear @heechul and @DominikBucko,

I'm sorry for the late reply; I haven't had time to work on this yet. Within the next few days I'll provide a script that generates the addressing matrices that you can plug into Blacksmith.

Thanks for your patience and understanding!

@pjattke
Collaborator

pjattke commented Dec 5, 2021

Dear @heechul and @DominikBucko,

Finally, I managed to find some time to update our DRAM addressing matrices script. Sorry for the delay.

You can find the script mat-gen.py in this gist. You can find a # TODO note showing the section that you need to edit. It should be enough if you fill out dram_fns and row_fn based on the information from DRAMA.

Regards,
Patrick

@heechul

heechul commented Dec 6, 2021

Hi @pjattke

Thanks a lot for sharing the script. I have a question.

If I'm not mistaken, the output of the script differs slightly from the default configuration in the Blacksmith repository when dram_fns and row_fn in the script are configured to match the known functions in the repository (i.e., dram_fns = [0x2040, 0x24000, 0x48000, 0x90000], row_fn = 0x3ffe0000). For example, in the single_rank configuration, DRAM_MTX[4] to DRAM_MTX[10] are shifted right by 1 bit in the generated matrix compared to the matrix in the repository. Can you clarify this?

Thanks

Heechul.

@pjattke
Collaborator

pjattke commented Dec 14, 2021

Hi @heechul

To summarize what my colleague told me, who has implemented this part of Blacksmith:

It shouldn't matter all that much because changing one bit changes also the other one if you want to stay in the same bank, so it's either/or. If we assume we have the row/col bit overlapping with a bank function (multiple XORed bits), then having that row bit on the higher bit or the lower bit shouldn't matter since you can't change one without changing the other.

This is because the bank/rank functions on our CPU (i7-8700K) consist each of two bits that are combined by XOR. So if you change any of them, you will end up in a different bank.
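
To make the XOR argument concrete, here is a minimal Python sketch (not Blacksmith code). It assumes the four bank/rank masks quoted in this thread for the i7-8700K and, as an illustrative choice, treats the first mask as the most significant bank bit. The helper names parity and bank_of and the example address are made up for this illustration.

```python
def parity(x):
    """XOR of all bits of x (1 if an odd number of bits are set)."""
    return bin(x).count("1") & 1

# Bank/rank masks quoted in this thread for the i7-8700K.
BANK_FNS = [0x2040, 0x24000, 0x48000, 0x90000]

def bank_of(addr):
    """One bank bit per mask: the XOR of the address bits selected by that mask."""
    bank = 0
    for fn in BANK_FNS:
        bank = (bank << 1) | parity(addr & fn)
    return bank

addr = 0x12345000                              # arbitrary example address
flip_one = addr ^ (1 << 14)                    # flip only bit 14
flip_both = addr ^ (1 << 14) ^ (1 << 17)       # flip both bits of mask 0x24000

print(bank_of(addr), bank_of(flip_one), bank_of(flip_both))
```

Flipping only bit 14 changes the parity of 0x24000 and therefore the bank, while flipping bits 14 and 17 together leaves every parity unchanged, which is the "either/or" behaviour described above.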

However, coming back to your question: I cannot tell why the output is different (shifted by one bit). My colleague told me that he will look into this more once he finds time. Meanwhile, you can just try to use the output generated by mat-gen.py and report back if that worked for you.

In any case, I will try to replace this DRAM addressing part in the next couple of weeks with something that is easier to work with, as I recognize that the current solution is cumbersome.

@heechul

heechul commented Dec 15, 2021

Hi @pjattke

Thanks for following this up.

  • I was able to obtain the same i7-8700K (Coffee Lake) processor and generate bit flips with the original code in the repository. I will see if I can also generate bit flips using the tables generated by mat-gen.py.
  • Below is my attempt to understand the addr <--> bank|col|row mapping tables. Can you confirm whether this is correct? The addressing part is indeed a bit tricky to understand, so it would be great if you could make it easier to follow.
struct MemConfiguration single_rank= {
..
 // bank_rank_functions = std::vector<uint64_t>({0x2040, 0x24000, 0x48000, 0x90000});
  .DRAM_MTX = { /* addr -> bank (4 bits) | col (13 bits) | row (13 bits) */
    0b000000000000000010000001000000, /* 0x02040 bank b3 = addr b6 + b13 */
    0b000000000000100100000000000000, /* 0x24000 bank b2 = addr b14 + b17 */
    0b000000000001001000000000000000, /* 0x48000 bank b1 = addr b15 + b18 */
    0b000000000010010000000000000000, /* 0x90000 bank b0 = addr b16 + b19 */
    0b000000000000000010000000000000, /* col b12 = addr b13 */
    0b000000000000000001000000000000, /* col b11 = addr b12 */
    0b000000000000000000100000000000, /* col b10 = addr b11 */
    0b000000000000000000010000000000, /* col b9 = addr b10 */
    0b000000000000000000001000000000, /* col b8 = addr b9 */
    0b000000000000000000000100000000, /* col b7 = addr b8*/
    0b000000000000000000000010000000, /* col b6 = addr b7 */
    0b000000000000000000000000100000, /* col b5 = addr b5 (not b6)*/
    0b000000000000000000000000010000, /* col b4 = addr b4*/
    0b000000000000000000000000001000, /* col b3 = addr b3 */
    0b000000000000000000000000000100, /* col b2 = addr b2 */
    0b000000000000000000000000000010, /* col b1 = addr b1 */
    0b000000000000000000000000000001, /* col b0 = addr b0*/
    0b100000000000000000000000000000, /* row b12 = addr b29 */
    0b010000000000000000000000000000, /* row b11 = addr b28 */
    0b001000000000000000000000000000, /* row b10 = addr b27 */
    0b000100000000000000000000000000, /* row b9 = addr b26 */
    0b000010000000000000000000000000, /* row b8 = addr b25 */
    0b000001000000000000000000000000, /* row b7 = addr b24 */
    0b000000100000000000000000000000, /* row b6 = addr b23 */
    0b000000010000000000000000000000, /* row b5 = addr b22 */
    0b000000001000000000000000000000, /* row b4 = addr b21 */
    0b000000000100000000000000000000, /* row b3 = addr b20 */
    0b000000000010000000000000000000, /* row b2 = addr b19 */
    0b000000000001000000000000000000, /* row b1 = addr b18 */
    0b000000000000100000000000000000, /* row b0 = addr b17 */
  },
  .ADDR_MTX =  { /* bank | col | row --> addr */
    0b000000000000000001000000000000, /* addr b29 = row b12 */
    0b000000000000000000100000000000, /* addr b28 = row b11 */
    0b000000000000000000010000000000, /* addr b27 = row b10 */
    0b000000000000000000001000000000, /* addr b26 = row b9 */
    0b000000000000000000000100000000, /* addr b25 = row b8 */
    0b000000000000000000000010000000, /* addr b24 = row b7 */
    0b000000000000000000000001000000, /* addr b23 = row b6 */
    0b000000000000000000000000100000, /* addr b22 = row b5 */
    0b000000000000000000000000010000, /* addr b21 = row b4 */
    0b000000000000000000000000001000, /* addr b20 = row b3 */
    0b000000000000000000000000000100, /* addr b19 = row b2 */
    0b000000000000000000000000000010, /* addr b18 = row b1 */
    0b000000000000000000000000000001, /* addr b17 = row b0 */
    0b000100000000000000000000000100, /* addr b16 = bank b0 + row b2 (addr b19) */
    0b001000000000000000000000000010, /* addr b15 = bank b1 + row b1 (addr b18) */
    0b010000000000000000000000000001, /* addr b14 = bank b2 + row b0 (addr b17) */
    0b000010000000000000000000000000, /* addr b13 = col b12 */
    0b000001000000000000000000000000, /* addr b12 = col b11 */
    0b000000100000000000000000000000, /* addr b11 = col b10 */
    0b000000010000000000000000000000, /* addr b10 = col b9 */
    0b000000001000000000000000000000, /* addr b9 = col b8 */
    0b000000000100000000000000000000, /* addr b8 = col b7 */
    0b000000000010000000000000000000, /* addr b7 = col b6 */
    0b100010000000000000000000000000, /* addr b6 = bank b3 + col b12 (addr b13)*/
    0b000000000001000000000000000000, /* addr b5 = col b5 */
    0b000000000000100000000000000000, /* addr b4 = col b4 */
    0b000000000000010000000000000000, /* addr b3 = col b3 */
    0b000000000000001000000000000000, /* addr b2 = col b2 */
    0b000000000000000100000000000000, /* addr b1 = col b1 */
    0b000000000000000010000000000000  /* addr b0 = col b0 */
  }
};
  • I found that my i5-6500 Skylake processor (the one I originally used before getting the Coffee Lake machine) has the same mapping as the Coffee Lake machine when one DIMM is plugged in.
  • Lastly, I tried to find a Tiger Lake machine's bank/rank mapping functions using TRRespass's DRAMA but was not successful. I wonder if you have a Tiger Lake machine and, if so, whether you were able to reverse-engineer the mapping.
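
One way to read the annotated DRAM_MTX above is as a matrix-vector product over GF(2): each matrix row is a mask, and each output bit is the XOR of the address bits that the mask selects. The Python sketch below is only an illustration under that reading, not the code in DRAMAddr.cpp. It rebuilds the 30 rows from the annotations instead of copying the literals, and the names mat_vec and decode are hypothetical.

```python
def parity(x):
    """XOR of all bits of x."""
    return bin(x).count("1") & 1

# Rows in the order annotated above: 4 bank rows, 13 column rows, 13 row rows.
BANK_FNS = [0x2040, 0x24000, 0x48000, 0x90000]
COL_BITS = [13, 12, 11, 10, 9, 8, 7, 5, 4, 3, 2, 1, 0]   # addr b6 is skipped
ROW_BITS = list(range(29, 16, -1))                       # addr b29 down to b17

DRAM_MTX = BANK_FNS + [1 << b for b in COL_BITS] + [1 << b for b in ROW_BITS]

def mat_vec(mtx, vec):
    """Multiply a GF(2) matrix (list of row masks) by a bit vector; first row is the MSB."""
    out = 0
    for row in mtx:
        out = (out << 1) | parity(row & vec)
    return out

def decode(addr):
    """Split the 30-bit result into bank (4 bits) | col (13 bits) | row (13 bits)."""
    lin = mat_vec(DRAM_MTX, addr)
    return (lin >> 26) & 0xF, (lin >> 13) & 0x1FFF, lin & 0x1FFF

print(decode(1 << 17))   # addr b17 feeds both bank b2 and row b0
```

Because addr b17 appears in the mask 0x24000 and in the row b0 line, setting only that bit changes both the bank and the row, which is the overlap the annotations on ADDR_MTX compensate for.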

Thanks

pjattke added a commit that referenced this issue Dec 29, 2021
@pjattke
Collaborator

pjattke commented Dec 29, 2021

Dear @heechul,

Thanks for your update.

  • I am glad to hear that you could reproduce bit flips on an i7-8700K (Coffee Lake). Do you already have results for the matrices generated by mat-gen.py? It would be helpful to know that so I can start integrating mat-gen.py more properly.
  • I can confirm that your reverse-engineered annotations are indeed correct. I added your annotations to the repo's code so that they will be easier for people to understand in the future. Ideally, however, nobody should have to change anything in these matrices (except for replacing the whole matrix with the output of mat-gen.py when using a CPU with a different microarchitecture). As a first step towards making this easier, I have added a CPU model check in Blacksmith.cpp.
  • Regarding the Skylake address functions, I am sorry but I cannot help as we don't have any Skylake machines. However, there is some existing work, e.g., DRAMA (Fig. 4c and Table 2b) and work by Barenghi et al. (Fig. 4) that you may want to use to compare with. From what I see, it looks like the bank/rank functions are slightly different than on Coffee Lake.
  • No, I am sorry we do not have any Tiger Lake system.

Regards,
Patrick

@heechul

heechul commented Dec 30, 2021

Hi @pjattke

Thanks for confirming the annotation.
I can report that we got bitflips with the mat-gen.py generated matrices.
Thanks for the pointers regarding Skylake mapping functions.

Happy new year!
Heechul

@JKRde

JKRde commented Feb 10, 2022

Hi Patrick,

For an i3-8350K system I have created a log with DRAMA; see the attachment.
Unfortunately, I don't know how to derive the values for dram_fns and row_fn from this information.
Could you explain how to determine them?

drama_output.log

@pjattke
Collaborator

pjattke commented Feb 18, 2022

Hi @JKRde,
Were you able to figure it out in the meantime, or do you still need help? Basically, you need to take the bits that DRAMA found to be part of each mask, write each mask in hexadecimal, and then use the mat-gen.py script to translate the masks into the DRAM addressing matrices used by Blacksmith.
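
As a small illustration of the "bits to hexadecimal mask" step, the following Python sketch converts the bit lists that DRAMA prints into masks and back. It uses the Skylake output posted earlier in this thread as input; the helper names bits_to_mask and mask_to_bits are invented for this example and are not part of DRAMA or mat-gen.py.

```python
def bits_to_mask(bits):
    """Return the integer with exactly the given bit positions set."""
    mask = 0
    for b in bits:
        mask |= 1 << b
    return mask

def mask_to_bits(mask):
    """Return the sorted list of bit positions that are set in mask."""
    return [i for i in range(mask.bit_length()) if (mask >> i) & 1]

# Bit lists from the "Valid Function" lines of the Skylake DRAMA output above.
drama_bits = [
    [7, 14],
    [15, 19],
    [16, 20],
    [17, 21],
    [18, 22],
    [8, 9, 12, 13, 15, 18],
]

dram_fns = [bits_to_mask(bits) for bits in drama_bits]
print([hex(m) for m in dram_fns])

# The reverse direction helps when checking a row mask such as 0x3ffe0000.
print(mask_to_bits(0x3ffe0000))
```

DRAMA already prints the hexadecimal value next to each bit list, so this is mainly useful for double-checking a mask or for building one by hand.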

@JKRde

JKRde commented Feb 22, 2022

Hi Patrick,

Unfortunately, I have not yet managed to determine the dram_fns and row_fn values with TRRespass' DRAMA tool. Maybe you could give me a step-by-step guide for dummies ;-)

BR
Jens

@TheSilentDawn

Hi @pjattke,
I ran DRAMA from the TRRespass repo and got the following DRAM mapping functions.

~~~~~~~~~~ Found Functions ~~~~~~~~~~
	 Valid Function: 0x8000 		 bits: 15 
	 Valid Function: 0x10000 		 bits: 16 
	 Valid Function: 0x20080 		 bits: 7 + 17 
	 Valid Function: 0x1000040 		 bits: 6 + 24 
	 Valid Function: 0x2200000 		 bits: 21 + 25 
	 Valid Function: 0x4400000 		 bits: 22 + 26 
	 Valid Function: 0x8800000 		 bits: 23 + 27 
	 Valid Function: 0x145140 		 bits: 6 + 8 + 12 + 14 + 18 + 20 
0x8000
0x10000
0x20080
0x1000040
0x2200000
0x4400000
0x8800000
0x145140
~~~~~~~~~~ Looking for row bits ~~~~~~~~~~
[LOG] - Set #0
[LOG] - 184716da80 - 18824693eb	 Time: 273 <== GOTCHA
[LOG] - 184716da80 - 18553ecda1	 Time: 270 <== GOTCHA
[LOG] - 184716da80 - 18160cf469	 Time: 270 <== GOTCHA
[LOG] - 184716da80 - 189352a594	 Time: 267 <== GOTCHA
[LOG] - 184716da80 - 180714b92f	 Time: 264 <== GOTCHA
[LOG] - Set #1
[LOG] - 1833714c40 - 18541b138a	 Time: 273 <== GOTCHA
[LOG] - 1833714c40 - 1899d90349	 Time: 276 <== GOTCHA
[LOG] - 1833714c40 - 18373f65f1	 Time: 279 <== GOTCHA
[LOG] - 1833714c40 - 1808cd7a0f	 Time: 279 <== GOTCHA
[LOG] - 1833714c40 - 1822712c20	 Time: 279 <== GOTCHA
[LOG] - Row mask: 0xffff800000 		 bits: 23 + 24 + 25 + 26 + 27 + 28 + 29 + 30 + 31 + 32 + 33 + 34 + 35 + 36 + 37 + 38 + 39 
0xffff800000

Next, I entered them into the script mat-gen.py as follows.

num_channels = 4
num_dimms = 16
num_ranks = 2
num_banks = 16

dram_fns = [0x8000, 0x10000, 0x20080, 0x1000040, 0x2200000, 0x4400000, 0x8800000, 0x145140]
row_fn = 0xffff800000
col_fn = 8192 - 1

However, mat-gen.py throws an error because https://gist.github.com/pjattke/b56baff62be77f16ad8d33376789be67#file-mat-gen-py-L56 requires a 30x30 square matrix, which my DRAMA result and the values I entered into mat-gen.py do not produce.
Why is 30x30 enforced? My DRAMA result clearly does not fit that shape.

@pjattke
Collaborator

pjattke commented Apr 5, 2023

Hi @TheSilentDawn. Thanks for your interest in Blacksmith. Could you please provide us with some more information:

  1. Which CPU are these functions from?
  2. Do you really have a system equipped with 4 x 16 = 64 dual-rank DIMMs? The timing-based DRAMA cannot figure out the DIMM/channel functions.

The 30x30 constraint comes from the fact that we are using a superpage, and thus cannot control bit 30 or any higher address bit. The matrix needs to be square and invertible (i.e., have full rank) so that we can compute the inverse translation matrix using linear algebra.
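
The inversion step can be sketched as Gauss-Jordan elimination over GF(2). The Python below is only an illustration and is not taken from mat-gen.py. It rebuilds the 30x30 Coffee Lake DRAM_MTX from the annotations posted earlier in this thread, inverts it, and checks a round trip. The names gf2_inverse and mat_vec are hypothetical.

```python
def parity(x):
    return bin(x).count("1") & 1

def mat_vec(mtx, vec):
    """GF(2) matrix-vector product; rows are masks, first row gives the MSB."""
    out = 0
    for row in mtx:
        out = (out << 1) | parity(row & vec)
    return out

def gf2_inverse(rows):
    """Invert a square GF(2) matrix given as row masks; return None if singular."""
    n = len(rows)
    # Augment each row with the matching identity row: [A | I].
    aug = [(r << n) | (1 << (n - 1 - i)) for i, r in enumerate(rows)]
    for c in range(n):
        pivot_bit = 1 << (2 * n - 1 - c)
        pivot = next((r for r in range(c, n) if aug[r] & pivot_bit), None)
        if pivot is None:
            return None                      # not full rank
        aug[c], aug[pivot] = aug[pivot], aug[c]
        for r in range(n):
            if r != c and aug[r] & pivot_bit:
                aug[r] ^= aug[c]             # row addition over GF(2) is XOR
    low = (1 << n) - 1
    return [a & low for a in aug]            # right half is the inverse

# Coffee Lake matrix as annotated in this thread: bank | col | row.
BANK_FNS = [0x2040, 0x24000, 0x48000, 0x90000]
COL_BITS = [13, 12, 11, 10, 9, 8, 7, 5, 4, 3, 2, 1, 0]
ROW_BITS = list(range(29, 16, -1))
DRAM_MTX = BANK_FNS + [1 << b for b in COL_BITS] + [1 << b for b in ROW_BITS]

ADDR_MTX = gf2_inverse(DRAM_MTX)
addr = 0x23456789                            # any value below 2**30
print(mat_vec(ADDR_MTX, mat_vec(DRAM_MTX, addr)) == addr)
```

If gf2_inverse returns None, the rows are linearly dependent, which corresponds to the "not invertible" case discussed here.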

Best
Patrick

@TheSilentDawn

Hi @pjattke ,
Thanks for your prompt reply.

  1. I'm using Intel Xeon E5-2690 v3.
  2. Yes, the server is equipped with 64 dual-rank DIMMs. However, to simplify the process, I have unplugged all but one DIMM, whose information is below.
num_channels = 1
num_dimms = 1
num_ranks = 2
num_banks = 16

I reran DRAMA from TRRespass; the result is below.

root@ubuntu: ~/trrespass-master/drama/obj#./tester -s 16 -t 460 -o access.csv -v
...
~~~~~~~~~~ Found Functions ~~~~~~~~~~
	 Valid Function: 0x2000 		 bits: 13 
	 Valid Function: 0x200040 		 bits: 6 + 21 
	 Valid Function: 0x440000 		 bits: 18 + 22 
	 Valid Function: 0x880000 		 bits: 19 + 23 
	 Valid Function: 0x1100000 		 bits: 20 + 24 
0x2000
0x200040
0x440000
0x880000
0x1100000
~~~~~~~~~~ Looking for row bits ~~~~~~~~~~
[LOG] - Set #0
[LOG] - 3d6b5b940 - 3bb6a30f0	 Time: 413 <== GOTCHA
[LOG] - 3d6b5b940 - 39cb725cd	 Time: 407 <== GOTCHA
[LOG] - 3d6b5b940 - 3aa59e8af	 Time: 458 <== GOTCHA
[LOG] - 3d6b5b940 - 3e097750e	 Time: 458 <== GOTCHA
[LOG] - 3d6b5b940 - 3961e2b9a	 Time: 458 <== GOTCHA
[LOG] - Set #1
[LOG] - 3c86ac200 - 3e0485651	 Time: 458 <== GOTCHA
[LOG] - 3c86ac200 - 3e8c106cb	 Time: 458 <== GOTCHA
[LOG] - 3c86ac200 - 3b1f18208	 Time: 458 <== GOTCHA
[LOG] - 3c86ac200 - 3f53f1a9f	 Time: 458 <== GOTCHA
[LOG] - 3c86ac200 - 3ddf18e91	 Time: 458 <== GOTCHA
[LOG] - Row mask: 0xffff000000 		 bits: 24 + 25 + 26 + 27 + 28 + 29 + 30 + 31 + 32 + 33 + 34 + 35 + 36 + 37 + 38 + 39 
0xffff000000

Based on my understanding, the variable dram_fns in mat-gen.py should be set to [0x2000, 0x200040, 0x440000, 0x880000, 0x1100000] and the variable row_fn should be set to 0xffff000000, following the result from TRRespass's DRAMA. However, I'm unsure about the variable col_fn: which value should it have? If I try to create a 30x30 matrix, it should be 524288 - 1, but then mat-gen.py throws an error saying the matrix is not invertible.
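
A quick rank check can show whether a candidate configuration has any chance of being invertible. The Python sketch below makes a strong assumption that may not match mat-gen.py: it builds one matrix row per DRAM function plus one single-bit row for every set bit of col_fn and row_fn, truncated to 30 address bits. The helper names gf2_rank and unit_rows are hypothetical.

```python
ADDR_BITS = 30   # only the lower 30 address bits are considered (assumption)

def gf2_rank(rows):
    """Rank of a GF(2) matrix given as a list of row masks."""
    pivots = {}                      # leading bit position -> reduced row
    for r in rows:
        while r:
            top = r.bit_length() - 1
            if top not in pivots:
                pivots[top] = r      # new independent row
                break
            r ^= pivots[top]         # eliminate the leading bit
    return len(pivots)

def unit_rows(mask):
    """One single-bit row per set bit of mask, limited to ADDR_BITS."""
    return [1 << b for b in range(ADDR_BITS) if (mask >> b) & 1]

# Values from the Xeon E5-2690 v3 configuration described above.
dram_fns = [0x2000, 0x200040, 0x440000, 0x880000, 0x1100000]
row_fn = 0xffff000000
col_fn = 524288 - 1

rows = dram_fns + unit_rows(col_fn) + unit_rows(row_fn)
print(len(rows), gf2_rank(rows))
```

Under these assumptions the matrix is square (30 rows) but its rank is 29: the function 0x2000 (bit 13) coincides with a bit that the column mask already covers, and bits 19 and 23 only ever appear together in 0x880000. Whether this is the actual cause of the error depends on how mat-gen.py really builds its rows.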

@TheSilentDawn

Hi @pjattke ,
I also got another result on an Intel(R) Xeon(R) CPU E5-2690 with 1x16G DRAM whose part number is HMT42GR7BFR4A-PB.
The DRAMA output is below.

xxx:~/trrespass-master/drama # ./obj/tester -s 8 -o ddr3.csv -v
~~~~~~~~~~ Found Functions ~~~~~~~~~~
         Valid Function: 0x4000                  bits: 14
         Valid Function: 0x80000                 bits: 19
         Valid Function: 0x42000                 bits: 13 + 18
0x4000
0x80000
0x42000
~~~~~~~~~~ Looking for row bits ~~~~~~~~~~
[LOG] - Set #0
[LOG] - 2d9cde780 - 2fe09de11    Time: 288 <== GOTCHA
[LOG] - 2d9cde780 - 2de7842b5    Time: 280 <== GOTCHA
[LOG] - 2d9cde780 - 2a4b8df0f    Time: 272 <== GOTCHA
[LOG] - 2d9cde780 - 2e74ac4b4    Time: 280 <== GOTCHA
[LOG] - 2d9cde780 - 2b10de59e    Time: 304 <== GOTCHA
[LOG] - Set #1
[LOG] - 2f0f163c0 - 2b636d892    Time: 280 <== GOTCHA
[LOG] - 2f0f163c0 - 2af73e536    Time: 304 <== GOTCHA
[LOG] - 2f0f163c0 - 29eb7593b    Time: 284 <== GOTCHA
[LOG] - 2f0f163c0 - 2d2d4c2c3    Time: 284 <== GOTCHA
[LOG] - 2f0f163c0 - 2f3074bc3    Time: 276 <== GOTCHA
[LOG] - Row mask: 0xffff000000           bits: 24 + 25 + 26 + 27 + 28 + 29 + 30 + 31 + 32 + 33 + 34 + 35 + 36 + 37 + 38 + 39
0xffff000000

Could you please explain what the configuration in mat-gen.py should be?

@pjattke
Collaborator

pjattke commented Apr 11, 2023

Hi @TheSilentDawn. I'm sorry, but I do not have the resources to help with this further in the near future. There is a small chance that one of my students will have time to clean up mat-gen.py over the next weeks, but I cannot promise anything.

You will need to study mat-gen.py carefully. It basically just computes a translation matrix, so you need a square matrix (e.g., 30x30) with full rank (i.e., linearly independent rows). If this is not the case, you either have the wrong functions (or row/column masks), or you need to augment the matrix with "dummy" rows (these would correspond to bits not involved in DRAM addressing).

I'm sorry that I cannot give you a more positive reply. I hope you understand. Good luck!

@its-luca

@pjattke We have a student who is using Blacksmith in a project. As part of the project he polished the address-function import part of Blacksmith and will post a PR soon.

@missyoufenglan87

Has anyone successfully caused bit flips with Blacksmith on DRAM produced in 2023-2024? I ran the project without any error message, but I didn't flip any bits on any DRAM. Can this method no longer bypass the existing TRR mechanism?

@pjattke
Collaborator

pjattke commented Sep 20, 2024

Hi @missyoufenglan87, we have also found more recent DIMMs on which we could no longer trigger bit flips with Blacksmith. We have not investigated this on a larger scale, though. It is likely that the DRAM vendors have improved their mitigations in the meantime.

@Chapoly1305

Chapoly1305 commented Oct 11, 2024

Great work. This repo is the best one, and the only one whose results we could confirm and reproduce.

Memory Module: HMA81GU6JJR8N-VK This is a 1 Rank 16 Banks module.
PC Model: OptiPlex-7060/Intel 8700
Detailed Computer Spec: dmidecode.txt
OS: Ubuntu 22.04

Steps:

  1. Get a copy of TRRespass.

  2. Using sudo, run TRespass/drama/hugepage.sh. Run it again after every reboot; you can ignore the error saying that the directory already exists.

  3. Compile TRespass/drama.

  4. Using sudo, run TRespass/drama/obj/tester -s 16 --mem 1024000. This step should produce a result fairly quickly; we expect the whole process to finish within 30 seconds. If it does not, kill it and try again. We sometimes observe a result that is offset from what most runs report; we don't know why. Our results are 0x2040,0x24000,0x48000,0x90000 (use for dram_fns) and 0x3ffe0000 (use for row_fn).

  5. Get mat-gen.py. Run it with the updated dram_fns and row_fn.

  6. Update blacksmith/src/Memory/DRAMAddr.cpp with the generated content. We have also updated blacksmith/include/GlobalDefines.hpp to match our module.
    GlobalDefines.zip

  7. Ensure that hugepages are enabled; if not, repeat step 2.

  8. Compile and run Blacksmith: sudo blacksmith/build/blacksmith --dimm-id 1 --ranks 1 -t 2160000

@jpjy

jpjy commented Dec 13, 2024

@Chapoly1305

Thanks for sharing. I get the same dram_fns and row_fn results after running TRRespass's DRAMA, and I followed your steps to update mat-gen.py, DRAMAddr.cpp, and GlobalDefines.hpp. However, when I run Blacksmith, it keeps running without printing anything to stdout.log. Even when I use sudo ./blacksmith --dimm-id 1 --ranks 1 -t 60 to limit the runtime to 1 minute, it does not stop after that time.

I tried to update the threshold in GlobalDefines.hpp. With TRRespass's DRAMA, I found that the row buffer conflict threshold should be around 300, but when I use this value, Blacksmith prints "Could not find conflicting address sets. Is the number of banks (8) defined correctly?". If I set the threshold to 430 (the default value in this repo), it just runs without stopping and does not print anything.

I would appreciate any hints about this issue.

@Chapoly1305

Chapoly1305 commented Dec 13, 2024

@jpjy
  0. You MUST run TRespass/drama/hugepage.sh. If you see no log, you probably missed this step.

  1. I have the same issue that the program does not stop after the given time, in particular when sweeping is enabled. If you want to stop at a fixed time, for example after a minute or an hour, my hint would be timeout 1m ./blacksmith ..., or timeout 1h. Linux will simply kill the program when the timeout expires. If you only run for a minute, you might not see any bit flips; in my experience you need at least 3~5 minutes. The console log is not updated frequently because it is buffered. If you want line-buffered output (each line printed in real time), a few modifications to Logging.cpp are required. You can also ask GPT to make them for you. We are about to go on holiday, so I will post a MR later this month.
  2. I am also new to this field and find the number of banks confusing. If it says 8 is too small, my experience is to just set it to 16 and try again. I think the difference is that dram_fns and row_fn would differ slightly. My RAM sometimes worked with 8, but sometimes I got the same notice; I just stopped the program and ran it again, and sometimes it then worked. After changing it from 8 to 16, my RAM worked more reliably. I find that a threshold of 400 or 430 works best for my RAM.
  3. Not all RAM can reproduce the bit flips. I have 4 other models on which no issue was discovered; only 1 module, the one listed above, reproduces reliably.
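
For anyone who prefers to script the time limit rather than use the timeout command, a rough Python equivalent could look like the sketch below. It only relies on the standard library; run_with_timeout is a made-up helper name, and the demo runs a harmless sleep instead of Blacksmith. Note that a wrapper like this can only kill a process it is allowed to signal, so it is assumed to run with the same privileges as the program it starts.

```python
import subprocess
import sys

def run_with_timeout(cmd, seconds):
    """Run cmd and kill it after `seconds`. Returns (timed_out, captured output)."""
    try:
        done = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
            timeout=seconds,
        )
        return False, done.stdout
    except subprocess.TimeoutExpired as exc:
        out = exc.stdout or ""
        if isinstance(out, bytes):           # partial output may arrive as bytes
            out = out.decode(errors="replace")
        return True, out

if __name__ == "__main__":
    # Stand-in command; for Blacksmith this would be something like
    # ["./blacksmith", "--dimm-id", "1", "--ranks", "1", "-t", "60"].
    demo = [sys.executable, "-c", "import time; time.sleep(5)"]
    timed_out, _ = run_with_timeout(demo, 0.5)
    print("killed after timeout:", timed_out)
```

Because the child's output is collected through a pipe, text that the program still holds in its own buffers when it is killed will be lost, which matches the buffering behaviour mentioned in point 1.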

@jpjy

jpjy commented Dec 15, 2024

@Chapoly1305

Thanks for your prompt reply.
I found that the issue is the check if ((acts.size()%200)==0 && compute_std(acts, running_sum, acts.size())<3.0) break;. If the standard deviation cannot drop below 3.0 (on my machine it can exceed 20), the program never starts hammering, and the runtime limit has no effect. So I hard-coded program_args.acts_per_trefi to 90, which should be a reasonable value.
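
The behaviour described here can be reproduced with a small Python analogue of the quoted condition. This is not the Blacksmith code: the function name, the use of statistics.pstdev as the standard deviation, and the max_samples cap are all assumptions added for illustration. The cap shows one possible way to avoid looping forever on a noisy machine.

```python
import statistics

def count_acts_per_ref(samples, window=200, std_threshold=3.0, max_samples=None):
    """Consume measurements until they look stable.

    Mirrors the quoted check: every `window` samples, stop if the standard
    deviation is below `std_threshold`. `max_samples` is a hypothetical
    safety cap. Returns (rounded mean, converged flag).
    """
    acts = []
    for s in samples:
        acts.append(s)
        if len(acts) % window == 0 and statistics.pstdev(acts) < std_threshold:
            return round(statistics.mean(acts)), True
        if max_samples is not None and len(acts) >= max_samples:
            break
    mean = round(statistics.mean(acts)) if acts else None
    return mean, False

stable = [79] * 200                 # quiet machine: converges at the first check
noisy = [50, 110] * 500             # std of about 30: the check never passes

print(count_acts_per_ref(stable))
print(count_acts_per_ref(noisy, max_samples=1000))
```

With an endless stream of noisy measurements and no cap, the loop would never return, which matches the observation that hammering never starts. Falling back to the running mean, or to a fixed value as done above with 90, are two ways out.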

I ran sudo ./blacksmith --dimm-id 1 --ranks 1 -t 3600 to hammer for one hour; the program fuzzed 553 patterns, none of which was effective. The output looks like the following, with 274568 row activations in one pattern test.
Is this number of row activations sufficient to induce a bit flip? Do you think I should extend my test time to fuzz more patterns? I saw that your test time is 600 hours. I would appreciate any suggestions.

[+] Running pattern #18 (4cd2c869-5d29-4be5-b1b8-174c43827b4b) for address set 0 (1b2b682f-7d25-41e9-b249-a965d02fb20a) at DRAM location #0.
[+] Printing code jitting-related fuzzing parameters:
sync_each_ref: false
wait_until_start_hammering_refs: 327
num_aggressors_for_sync: 2
[+] Hammering the last generated pattern.
[+] Synchronization stats:
Total sync acts: 274568
Number of pattern reps while hammering: 3472
Number of total synced REFs (est.): 3472
Avg. number of acts per sync: 79
[+] Checking 514 victims for bit flips.

@Chapoly1305

@jpjy

Congrats on getting it to work.
I don't think time is a significant factor here. According to their paper and my limited experience, if a RAM module is vulnerable, a bit flip should be detected fairly quickly (the first report always comes within a few minutes of starting the program). 3600 is just a number I picked. In fact, the module I used reports bit flips with smaller values as well (I also tried 30, 60, 120), even with the default value. My recommendation is to try other modules, maybe starting with ones manufactured in early 2016 (when the DDR4 just came out). Try to get some from the used market, such as eBay, to save cost; those old 8GB sticks are about 10 dollars each. Once you find a working one, you can try others.

I think the authors made it quite clear that only some CPUs are supported, so I would recommend confirming that yours is on their list of supported ones. I tried this project on two independent workstations and can confirm that both the 8700 and the 8700K work without changing the code (for the CPU). I also recommend trying one stick at a time, at least initially; don't install multiple. I am not sure how the program selects the DIMMs.

@jpjy

jpjy commented Dec 17, 2024

@Chapoly1305

Thanks for your suggestions, and happy holidays. I will try other memory modules.
Would it be possible for you to post your stdout.log file, so that I can look at the parameters of the patterns that induce bit flips?

@missyoufenglan87

Thanks for your prompt reply, @pjattke. I deployed your latest project, "ZenHammer", on an Intel platform and successfully triggered bit flips, but I haven't found the code for bit flip exploitation mentioned in your paper. Could you provide this part of the code at your convenience?

@pjattke
Collaborator

pjattke commented Dec 23, 2024

@missyoufenglan87 Could you please drop me an email about that? You can find my email address in the Blacksmith paper. Thanks!

@missyoufenglan87

@pjattke I have sent an email to your ETH mailbox. Looking forward to your reply, and thanks again.
