Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Segmentation faults in python with v1.4.20240928 on JKBMS BLE #86

Open
DenkBrettl opened this issue Oct 8, 2024 · 10 comments
Open

Segmentation faults in python with v1.4.20240928 on JKBMS BLE #86

DenkBrettl opened this issue Oct 8, 2024 · 10 comments
Labels
bug Something isn't working help wanted Extra attention is needed wontfix This will not be worked on

Comments

@DenkBrettl
Copy link

Describe the bug

When running v1.4.20240928, I regularly get segfaults in the python binary using a JKBMS via BLE.

I've downgraded again to v1.3.20240705 and this one has run fine since a day now. I know this is hard to debug, but I'm happy to help. If you know how to enable core dumps on this, it would be helpful, otherwise I can research it.

How to reproduce

Install latest release, wait for a certain time (few hours), segfaults regularly occur.

Expected behavior

No segfaults

Driver version of the currently installed driver

v1.4.20240928

Driver version of the last known working driver

v1.3.20240705

Venus OS device type

Raspberry Pi 4

Venus OS version

v3.42

BMS type

JKBMS (Heltec BMS)

Cell count

16

Battery count

2

Connection type

Bluetooth

Config file

[DEFAULT]

; If you want to add custom values/settings, then check the values/settings you want to change in "config.default.ini"
; and insert them below to persist future driver updates.

; Example (remove the semicolon ";" to uncomment and activate the value/setting):

LOGGING = INFO

; May need to run bluetoothctl; then in the cli "scan on" and "trust MAC" for new devices
; C8:47:8C:E2:D1:A9 MoarBatt01
; C8:47:8C:E2:DB:1F MoarBatt02
BLUETOOTH_BMS = Jkbms_Ble C8:47:8C:E2:D1:A9, Jkbms_Ble C8:47:8C:E2:DB:1F

MAX_BATTERY_CHARGE_CURRENT = 280.0
MAX_BATTERY_DISCHARGE_CURRENT = 280.0


; --------- Cell Voltages ---------
; Description:
;     Cell min/max voltages which are used for:
;         - Calculating the min/max battery voltage
;         - Trigger the SoC reset when SoC calculation is enabled
; Example:
;     16 cells * 3.45 V/cell = 55.2 V max charge voltage.
;     16 cells * 2.90 V/cell = 46.4 V min discharge voltage
MIN_CELL_VOLTAGE   = 2.850
; Max voltage (can seen as absorption voltage)
MAX_CELL_VOLTAGE   = 3.450
; Float voltage (can be seen as resting voltage)
FLOAT_CELL_VOLTAGE = 3.375


; --------- SOC reset voltage (needs to match BMS settings) ---------
; +++ This has nothing to do with "SOC calculation" in a section below +++
; This is one of the possibilities to reset the SoC to 100%, because of SoC drift.
; Description:
;     May be needed to reset the SoC to 100% once in a while for some BMS, because of SoC drift.
;     Some BMS may needed to reset the SoC to 100% once in a while, because of SoC drift. Some
;     devices, like JKBMS, will reset their internal SOC value, if they reach the upper voltage level.
;     Using this method, the charging voltage can be raised regularly to achieve that.
;     (other BMS like Daly need an active overwriting of the SOC parameter. This happens each time when
;     the charging mode changes from Bulk/Absorption to float (and the cells are equalised). They do
;     not need this feature here.)
;     Specify the cell voltage where the SoC should be reset to 100% by the BMS.
;       - JKBMS: SoC is reset to 100% if one cell reaches OVP (over voltage protection) voltage
;     As you have to adopt this value to your system, I recommend to start with
;     OVP voltage - 0.030 (see Example).
;       - Try to increase (add) by 0.005 in steps, if the system does not switch to float mode, even if
;         the target voltage SOC_RESET_VOLTAGE * CELL_COUNT is reached.
;       - Try to decrease (lower) by 0.005 in steps, if the system hits the OVP too fast, before all
;         cells could be balanced and the system goes into protection mode multiple times.
; Example:
;     If OVP is 3.650, then start with 3.620 and increase/decrease by 0.005
; Note:
;     The value has to be higher as the MAX_CELL_VOLTAGE
;     You also have to set CELL_VOLTAGES_WHILE_CHARGING accordingly, if you set CCCM_CV_ENABLE to true
;         else the charging current will be reduced to 0 before the target voltage is reached and the
;         battery will never switch to float
SOC_RESET_VOLTAGE = 3.570
; Specify after how many days the SOC reset voltage should be reached again
; The timer is reset when the SOC reset voltage is reached
; Leave empty if you don't want to use the SOC reset feature
; Example:
;     Value is set to 15
;     day 1: SOC reset reached once
;     day 16: SOC reset reached twice
;     day 31: SOC reset not reached since it's very cloudy
;     day 34: SOC reset reached since the sun came out
;     day 49: SOC reset reached again, since last time it took 3 days to reach SOC reset voltage
SOC_RESET_AFTER_DAYS = 2

; --------- Cell Voltage Current limitation (affecting CCL/DCL) ---------
; Description: Maximal charge / discharge current will be in-/decreased depending on min and max cell voltages
; Example:
;     18 cells * 3.55 V/cell = 63.9 V max charge voltage 
;     18 cells * 2.70 V/cell = 48.6 V min discharge voltage
;     But in reality not all cells reach the same voltage at the same time. The (dis)charge current
;     will be (in-/)decreased, if even ONE SINGLE BATTERY CELL reaches the limits
                          
; Charge current control management referring to cell-voltage enable (True/False).
CCCM_CV_ENABLE = True     
; Discharge current control management referring to cell-voltage enable (True/False).
DCCM_CV_ENABLE = True
                                                                     
; Set steps to reduce battery current                                     
; The current will be changed linear between those steps if LINEAR_LIMITATION_ENABLE is set to True
CELL_VOLTAGES_WHILE_CHARGING      = 3.60, 3.50, 3.45, 3.30
MAX_CHARGE_CURRENT_CV_FRACTION    =    0, 0.05,  0.5,    1                                      
                                                                                              
CELL_VOLTAGES_WHILE_DISCHARGING   = 2.70, 2.80, 2.90, 3.10                                            
MAX_DISCHARGE_CURRENT_CV_FRACTION =    0,  0.1,  0.5,    1

Relevant log output

'current' log of one of the batteries:

2024-10-07 18:48:55.139862500 
2024-10-07 18:48:57.045012500 INFO:SerialBattery:
2024-10-07 18:48:57.045307500 INFO:SerialBattery:Starting dbus-serialbattery
2024-10-07 18:48:57.046355500 INFO:SerialBattery:Venus OS v3.42
2024-10-07 18:48:57.046633500 INFO:SerialBattery:dbus-serialbattery v1.4.20240928
2024-10-07 18:49:13.314624500 INFO:SerialBattery:Init of Jkbms_Ble at C8:47:8C:E2:DB:1F
2024-10-07 18:49:13.314917500 INFO:SerialBattery:Test of Jkbms_Ble at C8:47:8C:E2:DB:1F
2024-10-07 18:49:17.324509500 INFO:SerialBattery:BAT: JKBMS 10.XG 16S (20231003)
2024-10-07 18:49:17.325302500 INFO:SerialBattery:Connection established to Jkbms_Ble
2024-10-07 18:49:17.812952500 INFO:SerialBattery:Found existing battery with DeviceInstance = 1
2024-10-07 18:49:17.955688500 INFO:SerialBattery:DeviceInstance = 1
2024-10-07 18:49:17.956666500 INFO:SerialBattery:PID file created successfully: /var/tmp/dbus-serialbattery_1.pid
2024-10-07 18:49:17.956972500 INFO:SerialBattery:Used DeviceInstances = ['1', '3']
2024-10-07 18:49:17.957280500 INFO:SerialBattery:com.victronenergy.battery.ble_c8478ce2db1f__0
2024-10-07 18:49:17.980172500 INFO:SerialBattery:publish config values = False
2024-10-07 18:49:17.992789500 INFO:SerialBattery:Battery JKBMS BLE connected to dbus from ble_c8478ce2db1f
2024-10-07 18:49:17.992794500 INFO:SerialBattery:========== Settings ==========
2024-10-07 18:49:17.992797500 INFO:SerialBattery:> Connection voltage: 53.0 V | Current: -13.1 A | SoC: 91%
2024-10-07 18:49:17.992801500 INFO:SerialBattery:> Cell count: 16 | Cells populated: 16
2024-10-07 18:49:17.992804500 INFO:SerialBattery:> LINEAR LIMITATION ENABLE: True
2024-10-07 18:49:17.992807500 INFO:SerialBattery:> MIN CELL VOLTAGE: 2.850 V | MAX CELL VOLTAGE: 3.450 V| FLOAT CELL VOLTAGE: 3.375 V
2024-10-07 18:49:17.992878500 INFO:SerialBattery:> MAX BATTERY CHARGE CURRENT: 280.0 A | MAX BATTERY DISCHARGE CURRENT: 280.0 A
2024-10-07 18:49:17.992882500 INFO:SerialBattery:> MAX BATTERY CHARGE CURRENT: 150.0 A | MAX BATTERY DISCHARGE CURRENT: 150.0 A (read from BMS)
2024-10-07 18:49:17.992886500 INFO:SerialBattery:> CVCM:     True
2024-10-07 18:49:17.992889500 INFO:SerialBattery:> CCCM CV:  True  | DCCM CV:  True
2024-10-07 18:49:17.992918500 INFO:SerialBattery:> CCCM T:   True  | DCCM T:   True
2024-10-07 18:49:17.992921500 INFO:SerialBattery:> CCCM SOC: False | DCCM SOC: False
2024-10-07 18:49:17.992924500 INFO:SerialBattery:> CHARGE FET: True | DISCHARGE FET: True | BALANCE FET: True
2024-10-07 18:49:17.992928500 INFO:SerialBattery:Serial Number/Unique Identifier: 2032809157
2024-10-07 18:55:40.719322500 [CHG] Device C8:47:8C:E2:DB:1F RSSI: -81
2024-10-07 18:55:40.719328500 [CHG] Device C8:47:8C:E2:DB:1F RSSI: -84
2024-10-07 18:59:50.874364500 ./run: line 12:  4199 Segmentation fault      python /opt/victronenergy/dbus-serialbattery/dbus-serialbattery.py Jkbms_Ble C8:47:8C:E2:DB:1F
2024-10-07 18:59:51.030523500 
2024-10-07 18:59:51.030625500 
2024-10-07 18:59:51.030791500 INFO:Preparing Bluetooth for connection to BMS
2024-10-07 18:59:51.030860500 INFO:Bluetooth details
2024-10-07 18:59:58.995073500 Attempting to disconnect from C8:47:8C:E2:DB:1F
2024-10-07 18:59:58.995077500 [CHG] Device C8:47:8C:E2:D1:A9 RSSI is nil
2024-10-07 18:59:58.995081500 [CHG] Controller E4:5F:01:06:4B:BD Discovering: no
2024-10-07 18:59:58.995084500 [CHG] Device C8:47:8C:E2:DB:1F ServicesResolved: no
2024-10-07 18:59:58.995088500 Successful disconnected
2024-10-07 19:00:04.130834500 Device C8:47:8C:E2:DB:1F (public)
2024-10-07 19:00:04.130839500 	Alias: MoarBatt02
2024-10-07 19:00:04.130842500 	Paired: no
2024-10-07 19:00:04.130844500 	Trusted: yes
2024-10-07 19:00:04.130846500 	Blocked: no
2024-10-07 19:00:04.130848500 	Connected: no
2024-10-07 19:00:04.130851500 	LegacyPairing: no
2024-10-07 19:00:04.130853500 	UUID: Device Information        (0000180a-0000-1000-8000-00805f9b34fb)
2024-10-07 19:00:04.130856500 	RSSI: -80
2024-10-07 19:00:04.133142500 
2024-10-07 19:00:06.012735500 INFO:SerialBattery:
2024-10-07 19:00:06.013011500 INFO:SerialBattery:Starting dbus-serialbattery
2024-10-07 19:00:06.014625500 INFO:SerialBattery:Venus OS v3.42
2024-10-07 19:00:06.014894500 INFO:SerialBattery:dbus-serialbattery v1.4.20240928
2024-10-07 19:00:22.260999500 INFO:SerialBattery:Init of Jkbms_Ble at C8:47:8C:E2:DB:1F
2024-10-07 19:00:22.261284500 INFO:SerialBattery:Test of Jkbms_Ble at C8:47:8C:E2:DB:1F
2024-10-07 19:00:26.268169500 INFO:SerialBattery:BAT: JKBMS 10.XG 16S (20231003)
2024-10-07 19:00:26.268858500 INFO:SerialBattery:Connection established to Jkbms_Ble
2024-10-07 19:00:26.723916500 INFO:SerialBattery:Found existing battery with DeviceInstance = 1
2024-10-07 19:00:26.853614500 INFO:SerialBattery:DeviceInstance = 1
2024-10-07 19:00:26.854515500 INFO:SerialBattery:PID file created successfully: /var/tmp/dbus-serialbattery_1.pid
2024-10-07 19:00:26.854800500 INFO:SerialBattery:Used DeviceInstances = ['1', '3']
2024-10-07 19:00:26.855096500 INFO:SerialBattery:com.victronenergy.battery.ble_c8478ce2db1f__0
2024-10-07 19:00:26.874699500 INFO:SerialBattery:publish config values = False
2024-10-07 19:00:26.887890500 INFO:SerialBattery:Battery JKBMS BLE connected to dbus from ble_c8478ce2db1f
2024-10-07 19:00:26.887896500 INFO:SerialBattery:========== Settings ==========
2024-10-07 19:00:26.887899500 INFO:SerialBattery:> Connection voltage: 52.83 V | Current: -30.1 A | SoC: 90%
2024-10-07 19:00:26.887903500 INFO:SerialBattery:> Cell count: 16 | Cells populated: 16
2024-10-07 19:00:26.887906500 INFO:SerialBattery:> LINEAR LIMITATION ENABLE: True
2024-10-07 19:00:26.887909500 INFO:SerialBattery:> MIN CELL VOLTAGE: 2.850 V | MAX CELL VOLTAGE: 3.450 V| FLOAT CELL VOLTAGE: 3.375 V
2024-10-07 19:00:26.887982500 INFO:SerialBattery:> MAX BATTERY CHARGE CURRENT: 280.0 A | MAX BATTERY DISCHARGE CURRENT: 280.0 A
2024-10-07 19:00:26.887987500 INFO:SerialBattery:> MAX BATTERY CHARGE CURRENT: 150.0 A | MAX BATTERY DISCHARGE CURRENT: 150.0 A (read from BMS)
2024-10-07 19:00:26.887991500 INFO:SerialBattery:> CVCM:     True
2024-10-07 19:00:26.887994500 INFO:SerialBattery:> CCCM CV:  True  | DCCM CV:  True
2024-10-07 19:00:26.888023500 INFO:SerialBattery:> CCCM T:   True  | DCCM T:   True
2024-10-07 19:00:26.888026500 INFO:SerialBattery:> CCCM SOC: False | DCCM SOC: False
2024-10-07 19:00:26.888029500 INFO:SerialBattery:> CHARGE FET: True | DISCHARGE FET: True | BALANCE FET: True
2024-10-07 19:00:26.888189500 INFO:SerialBattery:Serial Number/Unique Identifier: 2032809157
2024-10-07 19:02:51.979184500 [CHG] Device C8:47:8C:E2:DB:1F RSSI: -80
2024-10-07 19:02:51.979190500 [CHG] Device C8:47:8C:E2:DB:1F RSSI: -75


dmesg -T capture of one occurrence:
[Mon Oct  7 18:59:49 2024] potentially unexpected fatal signal 11.
[Mon Oct  7 18:59:49 2024] CPU: 1 PID: 4199 Comm: python Tainted: G         C        5.10.110-rpi-venus-4 #1
[Mon Oct  7 18:59:49 2024] Hardware name: BCM2711
[Mon Oct  7 18:59:49 2024] PC is at 0xb6afd898
[Mon Oct  7 18:59:49 2024] LR is at 0xb6ae5698
[Mon Oct  7 18:59:49 2024] pc : [<b6afd898>]    lr : [<b6ae5698>]    psr: 60000010
[Mon Oct  7 18:59:49 2024] sp : beea33b0  ip : b6b1e2ac  fp : 00000000
[Mon Oct  7 18:59:49 2024] r10: 00000000  r9 : b69d8a0c  r8 : beea3474
[Mon Oct  7 18:59:49 2024] r7 : 01945848  r6 : 0192b3e0  r5 : b4102dd0  r4 : b412c248
[Mon Oct  7 18:59:49 2024] r3 : 00000000  r2 : 00000000  r1 : 00000000  r0 : 00000000
[Mon Oct  7 18:59:49 2024] Flags: nZCv  IRQs on  FIQs on  Mode USER_32  ISA ARM  Segment user
[Mon Oct  7 18:59:49 2024] Control: 30c5383d  Table: 08c88f00  DAC: 55555555
[Mon Oct  7 18:59:49 2024] CPU: 1 PID: 4199 Comm: python Tainted: G         C        5.10.110-rpi-venus-4 #1
[Mon Oct  7 18:59:49 2024] Hardware name: BCM2711
[Mon Oct  7 18:59:49 2024] Backtrace: 
[Mon Oct  7 18:59:49 2024] [<c0c181c0>] (dump_backtrace) from [<c0c18534>] (show_stack+0x20/0x24)
[Mon Oct  7 18:59:49 2024]  r7:ffffffff r6:00000000 r5:60000113 r4:c14e9918
[Mon Oct  7 18:59:49 2024] [<c0c18514>] (show_stack) from [<c0c1ca0c>] (dump_stack+0xc8/0xf4)
[Mon Oct  7 18:59:49 2024] [<c0c1c944>] (dump_stack) from [<c0208b84>] (show_regs+0x1c/0x20)
[Mon Oct  7 18:59:49 2024]  r10:fffffe30 r9:c481ff58 r8:00000f00 r7:0000000a r6:c457e3c0 r5:c481e000
[Mon Oct  7 18:59:49 2024]  r4:c481ffb0 r3:10b4f0f3
[Mon Oct  7 18:59:49 2024] [<c0208b68>] (show_regs) from [<c023bd64>] (get_signal+0x9fc/0xa00)
[Mon Oct  7 18:59:49 2024] [<c023b368>] (get_signal) from [<c020b6d4>] (do_work_pending+0x15c/0x568)
[Mon Oct  7 18:59:49 2024]  r10:fffffe30 r9:5ac3c35a r8:c481ffb0 r7:00000001 r6:c481ff44 r5:c481ff28
[Mon Oct  7 18:59:49 2024]  r4:ffffe000
[Mon Oct  7 18:59:49 2024] [<c020b578>] (do_work_pending) from [<c0200074>] (slow_work_pending+0xc/0x20)
[Mon Oct  7 18:59:49 2024] Exception stack(0xc481ffb0 to 0xc481fff8)
[Mon Oct  7 18:59:49 2024] ffa0:                                     00000000 00000000 00000000 00000000
[Mon Oct  7 18:59:49 2024] ffc0: b412c248 b4102dd0 0192b3e0 01945848 beea3474 b69d8a0c 00000000 00000000
[Mon Oct  7 18:59:49 2024] ffe0: b6b1e2ac beea33b0 b6ae5698 b6afd898 60000010 ffffffff
[Mon Oct  7 18:59:49 2024]  r10:00000000 r9:c481e000 r8:00000000 r7:30c5383d r6:ffffffff r5:60000010
[Mon Oct  7 18:59:49 2024]  r4:b6afd898

Amount of times this is happening:
[Mon Oct  7 16:33:52 2024] potentially unexpected fatal signal 11.
[Mon Oct  7 16:44:40 2024] potentially unexpected fatal signal 11.
[Mon Oct  7 17:21:26 2024] potentially unexpected fatal signal 11.
[Mon Oct  7 17:44:54 2024] potentially unexpected fatal signal 11.
[Mon Oct  7 18:43:24 2024] potentially unexpected fatal signal 11.
[Mon Oct  7 18:48:46 2024] potentially unexpected fatal signal 11.
[Mon Oct  7 18:59:49 2024] potentially unexpected fatal signal 11.
[Mon Oct  7 19:03:50 2024] potentially unexpected fatal signal 11.
[Mon Oct  7 19:34:44 2024] potentially unexpected fatal signal 11.

Any other information that may be helpful

No response

@DenkBrettl DenkBrettl added the bug Something isn't working label Oct 8, 2024
@DenkBrettl
Copy link
Author

Ok, while the BLE setup has been running for a while and was mostly stable, it has always had some moments where it would just hang. I've been thinking about a watchdog process that would monitor that there are new values from the BMS within a certain time frame. Let me know if you think this is a good idea and I might find some time in the next weeks to implement that.

Due to the instability, I'll switch back to serial connection.

@Honusnap
Copy link

I had the same thing, gone back to 1.32

@mr-manuel
Copy link
Owner

Unfortunately I troubleshooted this over 100 hours and found no real solution to all this Bluetooth problems. Therefore I decided to not put any other effort into the Bluetooth part. Another reason is, that the users apparently do not appreciate the work I do and they cannot immagine how much time consuming this all is.

For that 1-2 donations, if at all, a month on over 9.000 dbus-serialbattery installations it is not worth it. If everyone would donate 1 €/year then the motivation would be another, but that is not the case.

Feel free to open a PR :-)

@mr-manuel mr-manuel added help wanted Extra attention is needed wontfix This will not be worked on labels Oct 11, 2024
@Honusnap

This comment was marked as off-topic.

@mr-manuel

This comment was marked as off-topic.

@Honusnap

This comment was marked as off-topic.

@DenkBrettl

This comment was marked as off-topic.

@mr-manuel

This comment was marked as off-topic.

@Honusnap
Copy link

Just when i see in log :
" Cell count: 16 | Cells populated: 32"

Then in battery.py
cell_counter = len(self.cells)
.
.
.
logger.info(
f"> Cell count: {self.cell_count} | Cells populated: {cell_counter}"
)
Mmmmm ok ...do not see where this 32 come from, i got 16 cells.
I do find pretty strange that the self.cells array is 32 in size ... but maybe there is some other things in it ...

@mr-manuel
Copy link
Owner

See also Louisvdw#1092 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working help wanted Extra attention is needed wontfix This will not be worked on
Projects
None yet
Development

No branches or pull requests

3 participants