Ubuntu 22.04 to ESXI SCSI/ISER connectivity/Reliability issues #41

Open
cryptz2k opened this issue Apr 24, 2022 · 5 comments

@cryptz2k

I recently upgraded my server running SCST (latest release) to Ubuntu 22.04, and I am having issues. Things start out fine, but once I start trying to vMotion VMs to the datastore, it either errors out on each VM or, eventually, the host loses connectivity and needs to be rebooted. This was fine prior to the Ubuntu upgrade. The logs are a bit over my head; I don't see a smoking gun other than cmd aborts. I am assuming something changed with the newer kernel. I am happy to enable whatever logging might be required and work with someone to try to nail down the issue.

It is a bit hard to fully describe: connectivity is generally present but not stable. I am fairly sure the equipment is fine, as I have been running this for years and the problem appeared immediately after the Ubuntu upgrade. There is no switch present. There are 2 ESXi hosts, each with 2x 100Gb connections (Mellanox) directly into the Ubuntu box, which has 2x dual-port adapters. Each physical link is on a 255.255.255.248 subnet. I create 4x vmkernel interfaces per physical link, which I know sounds a little weird, but I found that is the sweet spot to force ESXi to really push the storage.

The immediate log below has net.ipv4.conf.all.arp_ignore = 2 set. I wasn't aware of this setting before and had not encountered any problems with my setup without it. I can see how not having it might have created a problem: since the two interfaces are isolated, an ARP for the other address on the wrong interface would definitely be odd. It's too early for me to call this resolved, but that setting does seem to have helped the storage at least be functional.

I also changed the SCST queue depth from 32 to 128 between these two sets of logs (the immediate one below being 128). It seems slightly more functional now, though I am not really sure why that would have helped. Generally I would associate a higher queue depth with pushing the storage harder, but maybe it's letting things get further behind and then they catch up. I haven't pushed it much yet, though. I would like someone with a better understanding of what these aborts mean to determine whether they are somewhat normal or indicative of a problem.
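As a quick sanity check on the addressing described above, here is a small Python sketch of how many host addresses a 255.255.255.248 subnet offers. The network 192.0.2.0 is a documentation-only placeholder, not an address from this setup, and the "four vmkernel interfaces plus one target port per link" count is my reading of the description, not something confirmed in the thread.

```python
import ipaddress

def usable_addresses(network):
    """List the assignable host addresses in an IPv4 network."""
    return [str(host) for host in ipaddress.ip_network(network).hosts()]

# 192.0.2.0 is a placeholder network; substitute the real per-link subnet.
link_addresses = usable_addresses("192.0.2.0/255.255.255.248")

# Assumed layout: four vmkernel interfaces plus one target port per link.
needed = 4 + 1
print(len(link_addresses), "usable addresses,", needed, "needed")
```

A /29 leaves six usable addresses, so under that assumption each point-to-point link has room for all five endpoints.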

Prior to those two changes above it was unusable/unstable after the ubuntu upgrade.
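To confirm which ARP policy is actually in effect on the target, a minimal Python sketch like the one below reads `arp_ignore` for every interface from `/proc/sys/net/ipv4/conf`. The function names are made up for illustration. As far as I understand the kernel's ip-sysctl documentation, the value applied to an interface is the maximum of the `all` entry and that interface's own entry, which is what `effective_arp_ignore` assumes.

```python
from pathlib import Path

def read_arp_ignore(conf_dir="/proc/sys/net/ipv4/conf"):
    """Return {name: arp_ignore value} for every entry under conf_dir."""
    values = {}
    for entry in sorted(Path(conf_dir).iterdir()):
        setting = entry / "arp_ignore"
        if setting.is_file():
            values[entry.name] = int(setting.read_text().strip())
    return values

def effective_arp_ignore(values):
    """Per-interface value, assuming max('all', interface) is what applies."""
    base = values.get("all", 0)
    return {
        name: max(base, value)
        for name, value in values.items()
        if name not in ("all", "default")  # pseudo-entries, not real interfaces
    }
```

Calling `effective_arp_ignore(read_arp_ignore())` on the target should report 2 for every storage interface once net.ipv4.conf.all.arp_ignore = 2 is set.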

SCST:

[ 1897.244291] [21658]: scst: scst_rx_mgmt_fn:7000:TM fn ABORT_TASK/0 (mcmd 00000000bc68ffe8, initiator iqn.1998-01.com.vmware:esxi-2.psc.net:1134957192:65, target iqn.PSC.tgt)
[ 1897.244295] [21658]: scst_rx_mgmt_fn:7008:sess=00000000ead6b191, tag_set 1, tag 1073741865, lun_set 1, lun=0, cmd_sn_set 1, cmd_sn 5544771, priv 000000008f780b48
[ 1897.244298] [21658]: scst_post_rx_mgmt_cmd:6929:Adding mgmt cmd 00000000bc68ffe8 to active mgmt cmd list
[ 1897.244308] [2703]: scst_tm_thread:6815:Deleting mgmt cmd 00000000bc68ffe8 from active cmd list
[ 1897.244320] [2703]: scst_abort_task:6302:Aborting task (cmd 00000000a5da2161, sn -20, set 1, tag 1073741865, queue_type 1)
[ 1897.244324] [2703]: scst: scst_abort_cmd:5331:Aborting cmd 00000000a5da2161 (tag 1073741865, op READ(10))
[ 1897.244328] [2703]: scst_abort_cmd:5416:cmd 00000000a5da2161 (tag 1073741865) is being executed
[ 1897.244330] [2703]: scst: scst_abort_cmd:5439:cmd 00000000a5da2161 (tag 1073741865, sn 4294967276) being executed/xmitted (state EXEC_WAIT, op READ(10), proc time 40 sec., timeout 60 sec.), deferring ABORT (cmd_done_wait_count 1, cmd_finish_wait_count 1, internal 0, mcmd fn 0 (mcmd 00000000bc68ffe8), initiator iqn.1998-01.com.vmware:esxi-2.psc.net:1134957192:65, target iqn.PSC.tgt)
[ 1897.244339] [2703]: iscsi_on_abort_cmd:2540:Scheduling abort check for scst_cmd 00000000a5da2161
[ 1897.244344] [2703]: scst: scst_set_mcmd_next_state:5492:cmd_done_wait_count(1) not 0, preparing to wait
[ 1897.244349] [302]: iscsi_cmnd_abort_fn:2484:Checking aborted scst_cmd 00000000a5da2161 (cmnd 0000000008fd665b)
[ 1897.244353] [302]: __cmnd_abort:2323:Aborting cmd 0000000008fd665b, scst_cmd 00000000a5da2161 (scst state 4, ref_cnt 1, on_write_timeout_list 0, write_start 0, ITT 40000029, sn 5517670, op 1, r2t_len_to_receive 0, r2t_len_to_send 0, CDB op 28, size to write 0, outstanding_r2t 0, sess->exp_cmd_sn 5544771, conn 0000000049710af5, rd_task 00000000d69ad377, read_cmnd 0000000000000000, read_state 0)
[ 1897.244360] [302]: __cmnd_abort:2348:Setting conn_tm_active for conn 0000000049710af5
[ 1897.244361] [302]: __cmnd_abort:2360:Mod timer on 4295369090 (conn 0000000049710af5)
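For anyone reading along, the most informative line in this excerpt is probably the "deferring ABORT" one: it reports that the READ(10) had already been in flight for 40 seconds (against a 60 second timeout) when ESXi asked to abort it. The Python sketch below pulls those fields out of a kernel log so they can be compared across many aborts. The regular expression is fitted to the line format shown above and is an assumption about other SCST versions; the function name is hypothetical.

```python
import re

# Fitted to the scst_abort_cmd line format in the excerpt above.
DEFERRED_ABORT = re.compile(
    r"\(tag (?P<tag>\d+), sn (?P<sn>\d+)\) being executed/xmitted "
    r"\(state (?P<state>\w+), op (?P<op>[^,]+), "
    r"proc time (?P<proc>\d+) sec\., timeout (?P<timeout>\d+) sec\.\)"
)

def parse_deferred_aborts(lines):
    """Extract timing details from 'deferring ABORT' lines in a kernel log."""
    found = []
    for line in lines:
        m = DEFERRED_ABORT.search(line)
        if m is None:
            continue
        tag = int(m.group("tag"))
        found.append({
            # In the excerpt this hex form matches the ITT logged by __cmnd_abort.
            "tag_hex": f"{tag:08x}",
            "state": m.group("state"),
            "op": m.group("op"),
            "proc_time_s": int(m.group("proc")),
            "timeout_s": int(m.group("timeout")),
        })
    return found
```

Two small decoding notes that follow from the numbers themselves: tag 1073741865 is 0x40000029, the same value logged as ITT 40000029, and sn 4294967276 is -20 viewed as an unsigned 32-bit integer.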

ESXi:

2022-04-24T02:00:34.747Z cpu31:2097914)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "eui.495345522d505343" state in doubt; requested fast path state update...
2022-04-24T02:00:35.058Z cpu11:2098194)WARNING: World: vm 2108243: 3817: vm not found
2022-04-24T02:00:55.223Z cpu31:2097914)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "eui.495345522d505343" state in doubt; requested fast path state update...
2022-04-24T02:00:55.532Z cpu13:2098194)WARNING: World: vm 2108623: 3817: vm not found
2022-04-24T02:00:56.223Z cpu18:2097913)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "eui.495345522d505343" state in doubt; requested fast path state update...
2022-04-24T02:00:56.535Z cpu8:2098194)WARNING: World: vm 2108242: 3817: vm not found

I also see these in dmesg quite a bit, from my understanding it shouldn't cause a problem but from past mailing list convos it looks like these should have been suppressed:

1452.609301] [2209078]: scst: scst_report_luns_local:46:Unsupported SELECT REPORT value 0x11 in REPORT LUNS command
[61452.609345] [2209078]: scst: scst_report_luns_local:46:Unsupported SELECT REPORT value 0x11 in REPORT LUNS command
[61452.676383] [2209078]: scst: scst_report_luns_local:46:Unsupported SELECT REPORT value 0x11 in REPORT LUNS command
[61452.676448] [2209078]: scst: scst_report_luns_local:46:Unsupported SELECT REPORT value 0x11 in REPORT LUNS command
[61452.676487] [2209078]: scst: scst_report_luns_local:46:Unsupported SELECT REPORT value 0x11 in REPORT LUNS command

Earlier attempts, before the config changes:

SCST dmesg

[59959.506899] [0]: conn_rsp_timer_fn:562:TM active: making conn 00000000dc6f1fcf RD active
[59959.506917] [0]: conn_rsp_timer_fn:562:TM active: making conn 000000004128dbaa RD active
[59961.035311] [2311262]: iscsi-scst: execute_task_management:2625:iSCSI TM fn 1
[59961.035316] [2311262]: execute_task_management:2627:TM req 00000000b178f2e7, ITT 80000062, RTT 80000077, sn 4355068, con 00000000c7135b92
[59961.035322] [2311262]: cmnd_abort_pre_checks:2430:cmd RTT 80000077 not found
[59961.035325] [2311262]: iscsi_send_task_mgmt_resp:3634:TM req 00000000b178f2e7 finished
[59961.035329] [2311262]: iscsi-scst: iscsi_send_task_mgmt_resp:3635:iSCSI TM fn 1 finished, status 1, dropped 0
[59961.035334] [2311262]: iscsi_get_send_cmnd:517:Going to send TM response 000000000d5ed8e9 (status 1, fn 1, parent_req 00000000b178f2e7)
[59963.036057] [2311262]: iscsi-scst: execute_task_management:2625:iSCSI TM fn 1
[59963.036061] [2311262]: execute_task_management:2627:TM req 0000000069c6f3f8, ITT 80000059, RTT 80000077, sn 4355068, con 00000000c7135b92
[59963.036066] [2311262]: cmnd_abort_pre_checks:2430:cmd RTT 80000077 not found
[59963.036070] [2311262]: iscsi_send_task_mgmt_resp:3634:TM req 0000000069c6f3f8 finished
[59963.036073] [2311262]: iscsi-scst: iscsi_send_task_mgmt_resp:3635:iSCSI TM fn 1 finished, status 1, dropped 0
[59963.036077] [2311262]: iscsi_get_send_cmnd:517:Going to send TM response 000000000d5ed8e9 (status 1, fn 1, parent_req 0000000069c6f3f8)
[59963.090622] [0]: conn_rsp_timer_fn:562:TM active: making conn 00000000fdb858f7 RD active
[59965.036091] [2311262]: iscsi-scst: execute_task_management:2625:iSCSI TM fn 1
[59965.036095] [2311262]: execute_task_management:2627:TM req 000000002e6e7792, ITT 8000000c, RTT 80000077, sn 4355068, con 00000000c7135b92
[59965.036101] [2311262]: cmnd_abort_pre_checks:2430:cmd RTT 80000077 not found
[59965.036103] [2311262]: iscsi_send_task_mgmt_resp:3634:TM req 000000002e6e7792 finished
[59965.036106] [2311262]: iscsi-scst: iscsi_send_task_mgmt_resp:3635:iSCSI TM fn 1 finished, status 1, dropped 0
[59965.036111] [2311262]: iscsi_get_send_cmnd:517:Going to send TM response 000000000d5ed8e9 (status 1, fn 1, parent_req 000000002e6e7792)
[59965.036190] [2311262]: iscsi-scst: execute_task_management:2625:iSCSI TM fn 1
[59965.036192] [2311262]: execute_task_management:2627:TM req 0000000091e85b21, ITT 8000000b, RTT 80000053, sn 4355068, con 00000000c7135b92
[59965.036196] [2311262]: cmnd_abort_pre_checks:2430:cmd RTT 80000053 not found
[59965.036199] [2311262]: iscsi_send_task_mgmt_resp:3634:TM req 0000000091e85b21 finished
[59965.036202] [2311262]: iscsi-scst: iscsi_send_task_mgmt_resp:3635:iSCSI TM fn 1 finished, status 1, dropped 0
[59965.036205] [2311262]: iscsi_get_send_cmnd:517:Going to send TM response 000000000d5ed8e9 (status 1, fn 1, parent_req 0000000091e85b21)
[59965.906472] [0]: conn_rsp_timer_fn:562:TM active: making conn 00000000f58edacd RD active
[59966.875307] [1725755]: req_add_to_write_timeout_list:1087:Setting conn_tm_active for conn 00000000d4decc12
[59967.036496] [2311262]: iscsi-scst: execute_task_management:2625:iSCSI TM fn 1
[59967.036500] [2311262]: execute_task_management:2627:TM req 00000000b83d99bd, ITT 80000035, RTT 80000077, sn 4355068, con 00000000c7135b92
[59967.036505] [2311262]: cmnd_abort_pre_checks:2430:cmd RTT 80000077 not found
[59967.036507] [2311262]: iscsi_send_task_mgmt_resp:3634:TM req 00000000b83d99bd finished
[59967.036510] [2311262]: iscsi-scst: iscsi_send_task_mgmt_resp:3635:iSCSI TM fn 1 finished, status 1, dropped 0
[59967.036514] [2311262]: iscsi_get_send_cmnd:517:Going to send TM response 000000000d5ed8e9 (status 1, fn 1, parent_req 00000000b83d99bd)
[59968.978145] [0]: conn_rsp_timer_fn:562:TM active: making conn 0000000094a01306 RD active
[59969.036884] [2311262]: iscsi-scst: execute_task_management:2625:iSCSI TM fn 1
[59969.036888] [2311262]: execute_task_management:2627:TM req 000000002d62bbc5, ITT 80000018, RTT 80000077, sn 4355068, con 00000000c7135b92
[59969.036894] [2311262]: cmnd_abort_pre_checks:2430:cmd RTT 80000077 not found
[59969.036897] [2311262]: iscsi_send_task_mgmt_resp:3634:TM req 000000002d62bbc5 finished
[59969.036900] [2311262]: iscsi-scst: iscsi_send_task_mgmt_resp:3635:iSCSI TM fn 1 finished, status 1, dropped 0
[59969.036904] [2311262]: iscsi_get_send_cmnd:517:Going to send TM response 000000000d5ed8e9 (status 1, fn 1, parent_req 000000002d62bbc5)
[59971.038307] [2311262]: iscsi-scst: execute_task_management:2625:iSCSI TM fn 1
[59971.038314] [2311262]: execute_task_management:2627:TM req 00000000bd791c5c, ITT 8000000a, RTT 80000077, sn 4355068, con 00000000c7135b92
[59971.038319] [2311262]: cmnd_abort_pre_checks:2430:cmd RTT 80000077 not found
[59971.038322] [2311262]: iscsi_send_task_mgmt_resp:3634:TM req 00000000bd791c5c finished
[59971.038326] [2311262]: iscsi-scst: iscsi_send_task_mgmt_resp:3635:iSCSI TM fn 1 finished, status 1, dropped 0
[59971.038331] [2311262]: iscsi_get_send_cmnd:517:Going to send TM response 000000000d5ed8e9 (status 1, fn 1, parent_req 00000000bd791c5c)
[59973.040328] [2311262]: iscsi-scst: execute_task_management:2625:iSCSI TM fn 1
[59973.040332] [2311262]: execute_task_management:2627:TM req 00000000087f9614, ITT 80000009, RTT 80000077, sn 4355068, con 00000000c7135b92
[59973.040337] [2311262]: cmnd_abort_pre_checks:2430:cmd RTT 80000077 not found
[59973.040340] [2311262]: iscsi_send_task_mgmt_resp:3634:TM req 00000000087f9614 finished
[59973.040344] [2311262]: iscsi-scst: iscsi_send_task_mgmt_resp:3635:iSCSI TM fn 1 finished, status 1, dropped 0
[59973.040348] [2311262]: iscsi_get_send_cmnd:517:Going to send TM response 000000000d5ed8e9 (status 1, fn 1, parent_req 00000000087f9614)
[59975.042354] [2311262]: iscsi-scst: execute_task_management:2625:iSCSI TM fn 1
[59975.042359] [2311262]: execute_task_management:2627:TM req 000000006e4ada2b, ITT 80000037, RTT 80000077, sn 4355068, con 00000000c7135b92
[59975.042364] [2311262]: cmnd_abort_pre_checks:2430:cmd RTT 80000077 not found
[59975.042367] [2311262]: iscsi_send_task_mgmt_resp:3634:TM req 000000006e4ada2b finished
[59975.042371] [2311262]: iscsi-scst: iscsi_send_task_mgmt_resp:3635:iSCSI TM fn 1 finished, status 1, dropped 0
[59975.042375] [2311262]: iscsi_get_send_cmnd:517:Going to send TM response 000000000d5ed8e9 (status 1, fn 1, parent_req 000000006e4ada2b)
[59975.876686] [2507423]: req_add_to_write_timeout_list:1087:Setting conn_tm_active for conn 0000000019ddb073
[59977.044351] [2311262]: iscsi-scst: execute_task_management:2625:iSCSI TM fn 1
[59977.044354] [2311262]: execute_task_management:2627:TM req 000000001b998dc7, ITT 8000005f, RTT 80000053, sn 4355068, con 00000000c7135b92
[59977.044359] [2311262]: cmnd_abort_pre_checks:2430:cmd RTT 80000053 not found
[59977.044362] [2311262]: iscsi_send_task_mgmt_resp:3634:TM req 000000001b998dc7 finished
[59977.044365] [2311262]: iscsi-scst: iscsi_send_task_mgmt_resp:3635:iSCSI TM fn 1 finished, status 1, dropped 0
[59977.044370] [2311262]: iscsi_get_send_cmnd:517:Going to send TM response 000000000d5ed8e9 (status 1, fn 1, parent_req 000000001b998dc7)
[59977.937543] [0]: conn_rsp_timer_fn:562:TM active: making conn 00000000d4decc12 RD active
[59979.045093] [2311262]: iscsi-scst: execute_task_management:2625:iSCSI TM fn 1
[59979.045098] [2311262]: execute_task_management:2627:TM req 00000000f5769a79, ITT 8000005b, RTT 80000077, sn 4355068, con 00000000c7135b92
[59979.045106] [2311262]: cmnd_abort_pre_checks:2430:cmd RTT 80000077 not found
[59979.045109] [2311262]: iscsi_send_task_mgmt_resp:3634:TM req 00000000f5769a79 finished
[59979.045113] [2311262]: iscsi-scst: iscsi_send_task_mgmt_resp:3635:iSCSI TM fn 1 finished, status 1, dropped 0
[59979.045118] [2311262]: iscsi_get_send_cmnd:517:Going to send TM response 000000000d5ed8e9 (status 1, fn 1, parent_req 00000000f5769a79)
[59979.045212] [2311262]: iscsi-scst: execute_task_management:2625:iSCSI TM fn 1
[59979.045215] [2311262]: execute_task_management:2627:TM req 0000000036485a65, ITT 80000025, RTT 80000053, sn 4355068, con 00000000c7135b92
[59979.045220] [2311262]: cmnd_abort_pre_checks:2430:cmd RTT 80000053 not found
[59979.045223] [2311262]: iscsi_send_task_mgmt_resp:3634:TM req 0000000036485a65 finished
[59979.045226] [2311262]: iscsi-scst: iscsi_send_task_mgmt_resp:3635:iSCSI TM fn 1 finished, status 1, dropped 0
[59979.045230] [2311262]: iscsi_get_send_cmnd:517:Going to send TM response 000000000d5ed8e9 (status 1, fn 1, parent_req 0000000036485a65)
[59981.047145] [2311262]: iscsi-scst: execute_task_management:2625:iSCSI TM fn 1
[59981.047151] [2311262]: execute_task_management:2627:TM req 00000000f41b1b69, ITT 80000031, RTT 80000077, sn 4355068, con 00000000c7135b92
[59981.047157] [2311262]: cmnd_abort_pre_checks:2430:cmd RTT 80000077 not found
[59981.047161] [2311262]: iscsi_send_task_mgmt_resp:3634:TM req 00000000f41b1b69 finished
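The excerpt above shows the same referenced task tag (RTT 80000077) being aborted again roughly every 2 seconds, with the target answering "not found" each time. A hedged Python sketch for counting such retries in a longer log follows. The regular expression is fitted to the `execute_task_management` lines shown here, and the function names are invented for illustration. My understanding of the iSCSI RFC is that TM function 1 is ABORT TASK and response 1 is "task does not exist", but I have not checked how SCST maps its logged status onto those codes.

```python
import re
from collections import defaultdict

# Fitted to the "TM req ..., ITT ..., RTT ..." lines in the excerpt above.
TM_REQ = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\].*TM req [0-9a-f]+, "
    r"ITT (?P<itt>[0-9a-f]+), RTT (?P<rtt>[0-9a-f]+)"
)

def abort_retries_by_rtt(lines):
    """Group TM request timestamps (seconds since boot) by referenced task tag."""
    seen = defaultdict(list)
    for line in lines:
        m = TM_REQ.match(line)
        if m:
            seen[m.group("rtt")].append(float(m.group("ts")))
    return dict(seen)

def summarize(retries):
    """Return {rtt: (request count, seconds between first and last request)}."""
    return {
        rtt: (len(stamps), round(max(stamps) - min(stamps), 3))
        for rtt, stamps in retries.items()
    }
```

Running `summarize(abort_retries_by_rtt(open("kern.log")))` over a full log would show whether the initiator keeps retrying aborts for a handful of tags or for many different ones.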

ESXi vmkernel output

2022-04-23T17:01:39.711Z cpu24:2102106)WARNING: SVM: 1861: scsi0:0 IO failed on handle 2497915, childToken 0x45d9001810c0 failed: No connection
2022-04-23T17:01:39.712Z cpu24:2102106)WARNING: SVM: 1861: scsi0:0 IO failed on handle 2497915, childToken 0x45d900056740 failed: No connection
2022-04-23T17:01:39.962Z cpu23:2102106)WARNING: SVM: 1861: scsi0:0 IO failed on handle 2497915, childToken 0x45d90006a740 failed: No connection
2022-04-23T17:01:40.213Z cpu22:2102106)WARNING: SVM: 1861: scsi0:0 IO failed on handle 2497915, childToken 0x45d900062540 failed: No connection
2022-04-23T17:01:40.464Z cpu20:2102106)WARNING: SVM: 1861: scsi0:0 IO failed on handle 2497915, childToken 0x45d92f60afc0 failed: No connection
2022-04-23T17:01:40.714Z cpu18:2102106)WARNING: SVM: 1861: scsi0:0 IO failed on handle 2497915, childToken 0x45d900062540 failed: No connection
2022-04-23T17:01:40.965Z cpu30:2102106)WARNING: SVM: 1861: scsi0:0 IO failed on handle 2497915, childToken 0x45d90014fac0 failed: No connection
2022-04-23T17:01:41.216Z cpu30:2102106)WARNING: SVM: 1861: scsi0:0 IO failed on handle 2497915, childToken 0x45d9000d2c00 failed: No connection
2022-04-23T17:01:41.216Z cpu30:2102106)WARNING: SVM: 1861: scsi0:0 IO failed on handle 2497915, childToken 0x45d9000a8680 failed: No connection
2022-04-23T17:01:41.466Z cpu21:2102106)WARNING: SVM: 1861: scsi0:0 IO failed on handle 2497915, childToken 0x45d9001c7840 failed: No connection
2022-04-23T17:01:41.482Z cpu1:2097445)iser: iser_ScsiTaskMgmt: path vmhba65:3:0:0 TaskMgmt 0x4538c929bf00 invoked abort on CmdSN 0x0
2022-04-23T17:01:41.482Z cpu1:2097445)iser: iser_AbortCommands: session 0x4316de037270 taskMgmt 0x4538c929bf00
2022-04-23T17:01:41.482Z cpu1:2097445)iser: iser_AbortCommands: session 0x4316de037270 failing sc 0x45d900d01a08 itt 0x77 state 3 status 1
2022-04-23T17:01:41.482Z cpu1:2097445)iser: iser_SendMgmtTask: session: 0x4316de037270 tmf set timeout
2022-04-23T17:01:41.482Z cpu11:2097898)iser: iser_TmfResponse: TMF task response: 1, task state: 5
2022-04-23T17:01:41.482Z cpu1:2097445)iser: iser_AbortCommands: task 0x4316de139040 tmf state 5
2022-04-23T17:01:41.553Z cpu27:2115657)iser: iser_ScsiTaskMgmt: path vmhba65:13:0:0 TaskMgmt 0x4538e771bef0 invoked abort on CmdSN 0x1d87536
2022-04-23T17:01:41.553Z cpu27:2115657)iser: iser_AbortCommands: session 0x4316de4ce800 taskMgmt 0x4538e771bef0
2022-04-23T17:01:41.553Z cpu27:2115657)iser: iser_ScsiTaskMgmt: path vmhba65:3:0:0 TaskMgmt 0x4538e771bef0 invoked abort on CmdSN 0x1d87536
2022-04-23T17:01:41.553Z cpu27:2115657)iser: iser_AbortCommands: session 0x4316de037270 taskMgmt 0x4538e771bef0
2022-04-23T17:01:41.553Z cpu27:2115657)iser: iser_AbortCommands: session 0x4316de037270 failing sc 0x45b9323e83c8 itt 0x53 state 3 status 1
2022-04-23T17:01:41.553Z cpu27:2115657)iser: iser_SendMgmtTask: session: 0x4316de037270 tmf set timeout
2022-04-23T17:01:41.553Z cpu11:2097898)iser: iser_TmfResponse: TMF task response: 1, task state: 5
2022-04-23T17:01:41.553Z cpu27:2115657)iser: iser_AbortCommands: task 0x4316de1407e0 tmf state 5
2022-04-23T17:01:41.717Z cpu22:2102106)WARNING: SVM: 1861: scsi0:0 IO failed on handle 2497915, childToken 0x45d90001f4c0 failed: No connection
2022-04-23T17:01:41.794Z cpu17:2097935)HBX: 747: Reading HB at 4030464 on vol 'SAN.PSC.Net' failed: No connection
2022-04-23T17:01:41.972Z cpu29:2102106)WARNING: SVM: 1861: scsi0:0 IO failed on handle 2497915, childToken 0x45d90002bf40 failed: No connection

@cryptz2k
Author

Just wanted to leave an update: I retried moving everything and was able to move all 16 VMs and ~4TB of data without any errors. It seems those two settings at least made it workable. Benchmarks are good as well.

@lnocturno
Contributor

Could you please provide the version of SCST you are using, the SCST config and full target logs if possible?

> I also see these in dmesg quite a bit, from my understanding it shouldn't cause a problem but from past mailing list convos it looks like these should have been suppressed:

What type of device are you using under scst_local?

> It seems those two settings at least made it workable

So, as far as I understand the workaround that helps you is setting net.ipv4.conf.all.arp_ignore = 2?

@cryptz2k
Author

What is the best way to obtain full target logs? I was using dmesg; I did not see an SCST log file on the server under /var/log.

SCST version: v3.6-120-g0fcc4cbb

config:

HANDLER vdisk_fileio {
    DEVICE ISER-PSC-Net {
        filename /dev/zvol/PSC.Net/ISER.PSC.Net
        #threads_num 8
        # Non-key attributes
        #blocksize 512
        nv_cache 1
        read_only 0
        removable 0
        rotational 0
        t10_dev_id ISER-PSC-Net
        # thin_provisioned 1
        threads_pool_type shared
        # usn 46d13b96
        write_through 0
    }
}

TARGET_DRIVER iscsi {
    enabled 1
    TARGET iqn.PSC.tgt {
        #Allowed_portal *
        # MaxOutstandingR2T 8
        QueuedCommands 128
        LUN 0 ISER-PSC-Net
        io_grouping_type auto
        RDMAExtensions Yes

        enabled 1
    }
}
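Since the queue depth changed between the two sets of logs, it can help to record which QueuedCommands value a given config file actually carries. Below is a rough Python sketch, assuming the flat layout shown in this config. It is a line scanner rather than a real scst.conf parser (it does not track closing braces), and the function name is hypothetical.

```python
import re

def queued_commands_by_target(conf_text):
    """Map each TARGET name to its QueuedCommands value (None if not set)."""
    result = {}
    current = None
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and commented-out attributes
        target = re.match(r"TARGET\s+(\S+)\s*\{", line)
        if target:
            # "TARGET_DRIVER ..." does not match because of the underscore.
            current = target.group(1)
            result[current] = None
            continue
        depth = re.match(r"QueuedCommands\s+(\d+)$", line)
        if depth and current is not None:
            result[current] = int(depth.group(1))
    return result
```

For the config in this comment it should report 128 for iqn.PSC.tgt.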

net.ipv4.conf.all.arp_ignore = 2 plus increasing the queue depth seems to have at least masked the issue. I went from being completely unable to copy VMs to the SCST datastore to having everything on there and seemingly running OK.

@lnocturno
Contributor

> I was using dmesg; I did not see an SCST log file on the server under /var/log.

/var/log/messages or /var/log/kern.log are fine if they exist. Otherwise, dmesg -TL will be fine too.

@cryptz2k
Author

cryptz2k commented Apr 25, 2022

kern.log is attached; /var/log/messages did not exist.

kern.log.1.log

kern.log.1 is from when I was really having issues; kern.log (broken into 2) is more of a recent steady state.
kern.log

4_24am_kern.log

I switched the SCST QueuedCommands from 32 to 128 right around the same time I implemented net.ipv4.conf.all.arp_ignore = 2, if that helps you decipher the timing of the change.
