How do I make sure Variorum can use msr_batch? #542
-
Hi @brandonbiggs: I can reproduce the incorrect version reported by Variorum on my end, and I'll open an issue to fix it. A couple of questions to help debug:
-
Hi @brandonbiggs:
-
# sudo LD_LIBRARY_PATH=/opt/variorum/install/lib/ python variorum-print-power-python-example.py
=== Running Variorum Print Power:
server:/opt/variorum/src/variorum/msr/msr_core.c:do_batch_op():215: _ERROR_VARIORUM_MSR_BATCH: IOctl failed, does /dev/cpu/msr_batch exist?
CPU 0, MSR 0x611, ERR (Unknown error -13)
CPU 8, MSR 0x611, ERR (Unknown error -13)
CPU 0, MSR 0x619, ERR (Unknown error -13)
CPU 8, MSR 0x619, ERR (Unknown error -13)
server:/opt/variorum/src/variorum/msr/msr_core.c:do_batch_op():215: _ERROR_VARIORUM_MSR_BATCH: IOctl failed, does /dev/cpu/msr_batch exist?
CPU 0, MSR 0x606, ERR (Unknown error -13)
CPU 8, MSR 0x606, ERR (Unknown error -13)
_PACKAGE_ENERGY_STATUS Offset Host Socket Bits Energy_J Power_W Elapsed_sec Timestamp_sec
_PACKAGE_ENERGY_STATUS 0x611 server 0 0 0.000000 0.000000 0.000000 0.000005
_PACKAGE_ENERGY_STATUS 0x611 server 1 0 0.000000 0.000000 0.000000 0.000005
_DRAM_ENERGY_STATUS Offset Host Socket Bits Energy_J Power_W Elapsed_sec Timestamp_sec
_DRAM_ENERGY_STATUS 0x619 server 0 0 0.000000 0.000000 0.000000 0.000005
_DRAM_ENERGY_STATUS 0x619 server 1 0 0.000000 0.000000 0.000000 0.000005
I did see this, but I figured that since I'm running as root I may not need to set those groups up yet. I wanted to get this running as root first to eliminate some of the potential layers of complexity, e.g. messed-up permissions when running as a non-root user.
You were right! Sorry. Yes, the examples did get installed. I was looking in
[root@server: (dev) /opt/variorum/build/examples ]
# ./variorum-print-power-example
server:/opt/variorum/src/variorum/msr/msr_core.c:do_batch_op():215: _ERROR_VARIORUM_MSR_BATCH: IOctl failed, does /dev/cpu/msr_batch exist?
CPU 0, MSR 0x611, ERR (Unknown error -13)
CPU 8, MSR 0x611, ERR (Unknown error -13)
CPU 0, MSR 0x619, ERR (Unknown error -13)
CPU 8, MSR 0x619, ERR (Unknown error -13)
server:/opt/variorum/src/variorum/msr/msr_core.c:do_batch_op():215: _ERROR_VARIORUM_MSR_BATCH: IOctl failed, does /dev/cpu/msr_batch exist?
CPU 0, MSR 0x606, ERR (Unknown error -13)
CPU 8, MSR 0x606, ERR (Unknown error -13)
_PACKAGE_ENERGY_STATUS Offset Host Socket Bits Energy_J Power_W Elapsed_sec Timestamp_sec
_PACKAGE_ENERGY_STATUS 0x611 server 0 0 0.000000 0.000000 0.000000 0.000004
_PACKAGE_ENERGY_STATUS 0x611 server 1 0 0.000000 0.000000 0.000000 0.000004
_DRAM_ENERGY_STATUS Offset Host Socket Bits Energy_J Power_W Elapsed_sec Timestamp_sec
_DRAM_ENERGY_STATUS 0x619 server 0 0 0.000000 0.000000 0.000000 0.000004
_DRAM_ENERGY_STATUS 0x619 server 1 0 0.000000 0.000000 0.000000 0.000004
Final result: inf
server:/opt/variorum/src/variorum/msr/msr_core.c:do_batch_op():215: _ERROR_VARIORUM_MSR_BATCH: IOctl failed, does /dev/cpu/msr_batch exist?
CPU 0, MSR 0x611, ERR (Unknown error -13)
CPU 8, MSR 0x611, ERR (Unknown error -13)
CPU 0, MSR 0x619, ERR (Unknown error -13)
CPU 8, MSR 0x619, ERR (Unknown error -13)
_PACKAGE_ENERGY_STATUS 0x611 server 0 0 0.000000 0.000000 0.167754 0.167732
_PACKAGE_ENERGY_STATUS 0x611 server 1 0 0.000000 0.000000 0.167754 0.167732
_DRAM_ENERGY_STATUS 0x619 server 0 0 0.000000 0.000000 0.167754 0.167732
_DRAM_ENERGY_STATUS 0x619 server 1 0 0.000000 0.000000 0.167754 0.167732
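The -13 in the messages above looks like a negated Linux errno value. On Linux, errno 13 is EACCES (permission denied); "Unknown error" is printed because the negative number itself is not a valid errno. A small sketch for decoding such codes follows; describe_msr_error is a hypothetical helper for illustration, not part of Variorum or msr-safe.

```python
import errno
import os


def describe_msr_error(code: int) -> str:
    """Translate a (possibly negated) errno value into its name and message."""
    number = abs(code)  # batch errors are printed negated, e.g. -13
    name = errno.errorcode.get(number, "UNKNOWN")
    return f"{name}: {os.strerror(number)}"


if __name__ == "__main__":
    # The code reported for MSRs 0x611, 0x619 and 0x606 in the output above.
    print(describe_msr_error(-13))
```

A permission error on every register, even as root, would be consistent with the register accesses being refused rather than with the device node being absent, which is where the allowlist discussion later in the thread picks up.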
-
@brandonbiggs Could you, just for the sake of trying, set up your group permissions for the root user? Essentially, make it look like:
If that doesn't work, try setting both group and other permissions for root.
Let me know if none of those ideas work...
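Before changing any permissions, it can help to record what the device nodes currently look like. The sketch below only inspects the two paths named in the Variorum messages in this thread; describe_node is a hypothetical helper, and the script does not modify anything.

```python
import os
import stat

# Device nodes named in the Variorum messages in this thread.
MSR_SAFE_NODES = ("/dev/cpu/msr_batch", "/dev/cpu/msr_allowlist")


def describe_node(path: str) -> str:
    """Return a one-line summary of a path: mode string, owner uid and gid."""
    try:
        info = os.stat(path)
    except FileNotFoundError:
        return f"{path}: missing"
    mode = stat.filemode(info.st_mode)  # e.g. 'crw-rw----' for a char device
    return f"{path}: {mode} uid={info.st_uid} gid={info.st_gid}"


if __name__ == "__main__":
    for node in MSR_SAFE_NODES:
        print(describe_node(node))
```

Running it before and after a permission change gives a simple record of what actually changed on the nodes.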
-
I did check. I thought it was supported as a Broadwell CPU.
Here's the first group. The remaining 32 look similar, just different IDs.
# cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 79
model name : Intel(R) Xeon(R) CPU E5-2667 v4 @ 3.20GHz
stepping : 1
microcode : 0xb00002e
cpu MHz : 1824.207
cache size : 25600 KB
physical id : 0
siblings : 16
core id : 0
cpu cores : 8
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 20
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 cdp_l3 invpcid_single pti intel_ppin ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm rdt_a rdseed adx smap intel_pt xsaveopt cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts flush_l1d
bugs : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs taa itlb_multihit mmio_stale_data
bogomips : 6399.93
clflush size : 64
cache_alignment : 64
address sizes : 46 bits physical, 48 bits virtual
power management:
This went through all 32, I just cut the output. The registers were the same for all of them.
# ./rdmsr -a 0x610
CPU 0: 7851000158438
CPU 1: 7851000158438
CPU 2: 7851000158438
# ./variorum-print-power-example
server:/opt/variorum/src/variorum/msr/msr_core.c:do_batch_op():215: _ERROR_VARIORUM_MSR_BATCH: IOctl failed, does /dev/cpu/msr_batch exist?
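For comparison with the failing batch path, the rdmsr read of 0x610 above can be reproduced without msr_batch. This is a minimal sketch, assuming the stock Linux msr kernel module is loaded and exposes /dev/cpu/N/msr, where an 8-byte read at a file offset equal to the register address returns that register. read_msr and its template parameter are hypothetical names, and the sketch does not show how Variorum itself performs reads.

```python
import os
import struct


def read_msr(cpu: int, reg: int, template: str = "/dev/cpu/{cpu}/msr") -> int:
    """Read one 64-bit MSR by seeking to its address in the per-CPU msr node."""
    fd = os.open(template.format(cpu=cpu), os.O_RDONLY)
    try:
        data = os.pread(fd, 8, reg)  # offset = MSR address, length = 8 bytes
    finally:
        os.close(fd)
    if len(data) != 8:
        raise OSError(f"short read from MSR {reg:#x} on CPU {cpu}")
    return struct.unpack("<Q", data)[0]  # little-endian unsigned 64-bit


if __name__ == "__main__":
    try:
        # Needs root and the msr module; prints hex like rdmsr does.
        print(f"CPU 0: {read_msr(0, 0x610):x}")
    except OSError as exc:
        print(f"could not read MSR 0x610: {exc}")
```

If this read succeeds as root while the batch interface returns -13 for other registers, that points at the msr-safe side rather than at the hardware.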
Something like this, right?
[root@server: (dev) /opt/variorum/build/examples ]
# rmmod msr-safe
[root@server: (dev) /opt/variorum/build/examples ]
# ./variorum-print-power-example
Warning: <variorum> Could not stat /dev/cpu/msr_allowlist: stat_module(): No such file or directory: server:/opt/variorum/src/variorum/msr/msr_core.c::321
/dev/cpu/msr_batch: No such file or directory
Warning: <variorum> No /dev/cpu/msr_batch, using compatibility batch: compatibility_batch(): No such file or directory: server:/opt/variorum/src/variorum/msr/msr_core.c::103
Warning: <variorum> No /dev/cpu/msr_batch, using compatibility batch: compatibility_batch(): No such file or directory: server:/opt/variorum/src/variorum/msr/msr_core.c::103
_PACKAGE_ENERGY_STATUS Offset Host Socket Bits Energy_J Power_W Elapsed_sec Timestamp_sec
_PACKAGE_ENERGY_STATUS 0x611 server 0 0xe11342d 14404.815247 0.000000 0.000000 0.000005
_PACKAGE_ENERGY_STATUS 0x611 server 1 0x67fac00c 106475.000732 0.000000 0.000000 0.000005
_DRAM_ENERGY_STATUS Offset Host Socket Bits Energy_J Power_W Elapsed_sec Timestamp_sec
_DRAM_ENERGY_STATUS 0x619 server 0 0x6a02b726 27138.715424 0.000000 0.000000 0.000005
_DRAM_ENERGY_STATUS 0x619 server 1 0xecd6863c 60630.524353 0.000000 0.000000 0.000005
Final result: inf
Warning: <variorum> Could not stat /dev/cpu/msr_allowlist: stat_module(): No such file or directory: server:/opt/variorum/src/variorum/msr/msr_core.c::321
Warning: <variorum> No /dev/cpu/msr_batch, using compatibility batch: compatibility_batch(): No such file or directory: server:/opt/variorum/src/variorum/msr/msr_core.c::103
_PACKAGE_ENERGY_STATUS 0x611 server 0 0xe11d10f 14407.266541 14.620014 0.167667 0.167659
_PACKAGE_ENERGY_STATUS 0x611 server 1 0x67fbb6a8 106478.854004 22.981693 0.167667 0.167659
_DRAM_ENERGY_STATUS 0x619 server 0 0x6a03924f 27139.571518 5.105920 0.167667 0.167659
_DRAM_ENERGY_STATUS 0x619 server 1 0xecd78af8 60631.542847 6.074503 0.167667 0.167659
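The working output above can be checked by hand. In this sketch the Bits column is treated as a raw counter scaled by a power-of-two energy unit, and Power_W as the energy difference between two samples divided by the elapsed time. The exponents 14 (package) and 16 (DRAM) are inferred from the Bits/Energy_J pairs printed in this thread, not read from MSR 0x606, so treat them as assumptions for this machine only. The function names are hypothetical.

```python
def energy_joules(raw_bits: int, unit_exponent: int) -> float:
    """Scale a raw energy counter by 1 / 2**unit_exponent joules per count."""
    return raw_bits / (1 << unit_exponent)


def average_power_watts(energy_start: float, energy_end: float,
                        elapsed_sec: float) -> float:
    """Average power between two energy samples (ignores counter wraparound)."""
    if elapsed_sec <= 0:
        raise ValueError("elapsed_sec must be positive")
    return (energy_end - energy_start) / elapsed_sec


if __name__ == "__main__":
    # Socket 0 package (0x611) and DRAM (0x619) raw values from the first sample.
    print(energy_joules(0xE11342D, 14))
    print(energy_joules(0x6A02B726, 16))
    # Socket 0 package energy from both samples and the reported elapsed time.
    print(average_power_watts(14404.815247, 14407.266541, 0.167667))
```

This also suggests why the first sample reports 0 W: a power figure needs two energy readings, and the first call has no earlier reading to subtract.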
-
@tpatki Is the link you shared the
-
The one I pointed to is a good starting point. Go ahead and copy that, uncomment say
We have typically left the options for setting write masks to the site and its system administrators, as every site has different policies on which registers users should have write permissions to. As a result, I cannot give you direct recommendations on what to use there. For your individual testing purposes only, if you want to set power caps, you can use the
We also recommend reading MSRs first with Variorum/msr-safe and learning about MSRs in detail. The Intel Software Developer's Manual (SDM) is the best resource for learning about MSRs. On the Variorum website, we have information on some of the MSRs in our documentation: https://variorum.readthedocs.io/en/latest/Intel.html. Intel SDMs are here: https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html
I got
I'll create an issue for adding this to our docs; we should have more details in our documentation around allowlists for new users.
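To make the read-versus-write distinction concrete, here is a minimal sketch of how an allowlist can be inspected. It assumes the allowlist is plain text with one "address write-mask" pair per line in hexadecimal and "#" starting a comment; verify that against the allowlist linked in this thread before relying on it. The sample entries are illustrative only, cover just the three registers that appear in the error output above, and use a write mask of zero (read-only). parse_allowlist and is_writable are hypothetical helpers.

```python
# Illustrative entries only (assumed format): MSR address, then write mask.
SAMPLE_ALLOWLIST = """\
# MSR        Write mask           Comment
0x00000606   0x0000000000000000   # RAPL power unit (read-only)
0x00000611   0x0000000000000000   # package energy status (read-only)
0x00000619   0x0000000000000000   # DRAM energy status (read-only)
"""


def parse_allowlist(text: str) -> dict[int, int]:
    """Map each listed MSR address to its write mask, skipping comments."""
    entries = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        address, mask = line.split()
        entries[int(address, 16)] = int(mask, 16)
    return entries


def is_writable(entries: dict[int, int], msr: int) -> bool:
    """An MSR is treated as writable only if listed with a nonzero write mask."""
    return entries.get(msr, 0) != 0


if __name__ == "__main__":
    entries = parse_allowlist(SAMPLE_ALLOWLIST)
    for msr in (0x606, 0x611, 0x619, 0x610):
        listed = "listed" if msr in entries else "not listed"
        print(f"{msr:#x}: {listed}, writable={is_writable(entries, msr)}")
```

With zero write masks everywhere, a list like this would only permit reads, which matches the advice above to start by reading MSRs and leave write masks to site policy.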
-
Hi,
I was able to build Variorum, and so far most things seem to be working. However, when running some of the Python examples, I often see a message like:
/opt/variorum/src/variorum/msr/msr_core.c:do_batch_op():215: _ERROR_VARIORUM_MSR_BATCH: IOctl failed, does /dev/cpu/msr_batch exist?
I did build msr_safe and rebuilt Variorum, but I may not be linking them correctly? Apparently I'm using Variorum version 0.5.0. It seems like git clone is pulling an older version, or maybe the version number hasn't been updated?
# python variorum-get-current-version-python-example.py
=== Running Variorum Get Current Version:
Variorum version is: 0.5.0
/dev/cpu/msr_batch does exist:
Here's how I built Variorum:
Here's the full output of a few of the Python example commands: