Cleanup branch: PMON #37

mslaffin · 2024-12-15T20:56:51Z

This PR is for miscellaneous cleanup and changes related to the ProcessMonitorSubsystem.

mslaffin · 2024-12-17T00:03:52Z

This was tested and confirmed to be working on 12/16.

Video recording:
Recording 2024-12-16 174706.zip

Log file:
log_2024-12-16_17-32-58.txt

There's still a bug where one or more units briefly fail to respond in time, despite increasing the fault tolerance at various levels (_poll_single_unit, poll_all_units and the subsystem's update_temperatures). This presents itself as a brief flash of the grey "Disconnected" error bar before recovering back to the nominal colored bar.

I do not think this is a pressing problem though, based on how quickly it recovers from the lack of response. The problem doesn't seem to be isolated to any particular unit. My hunch is that we're pushing an aggressive polling cycle that involves multiple reads (status, process_value) every time, and occasionally the DP16 units are polled in an unprepared state. Like I said, I'm not too concerned about this as an immediate issue.
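
For reference, the fault tolerance mentioned above boils down to a per-unit consecutive-failure counter. A minimal sketch, assuming a threshold of three misses (per the commit message). `FailureTracker`, `record` and `MAX_CONSECUTIVE_FAILURES` are illustrative names, not the ones used in the driver:

```python
from collections import defaultdict

MAX_CONSECUTIVE_FAILURES = 3  # assumed threshold, taken from the commit message


class FailureTracker:
    """Hypothetical helper: flags a unit only after repeated missed reads."""

    def __init__(self, threshold=MAX_CONSECUTIVE_FAILURES):
        self.threshold = threshold
        self._misses = defaultdict(int)  # unit number -> current miss streak

    def record(self, unit, ok):
        """Update the miss streak for `unit`; return True if it should show as disconnected."""
        if ok:
            self._misses[unit] = 0  # any good read clears the streak
        else:
            self._misses[unit] += 1
        return self._misses[unit] >= self.threshold
```

With this shape, a single late response never reaches the GUI as "Disconnected"; only an unbroken streak of misses does.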

mslaffin · 2024-12-23T16:29:46Z

I mentioned this in our meeting on 12/20, but I think there's a problem with how the modbus_lock is implemented inside poll_all_units that could be leading to slower/less predictable update cycles.

The temperatures are updating slower than they should. This behavior can be seen in the video I posted in PR#34.

The modbus_lock is held for the entire pass over ALL units, which is probably an inefficient way to do this. It currently looks like:

with self.modbus_lock:  # Lock held for all units
    for unit in self.unit_numbers:
        self._poll_single_unit(unit)  # Each unit can take up to 0.5s before timing out
time.sleep(0.1) # single sleep at end

The get_all_temperatures method in the subsystem class returns a copy of the cached readings under response_lock, so we're always updating the GUI every 500ms, but not necessarily with new values until all units have been polled or timed out.

I'm going to pursue some changes to acquire and release modbus_lock between each unit, and try to track the actual time spent polling so that we're not making unnecessarily long time.sleep() calls.
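
A rough sketch of what that change could look like, assuming a poller shaped like the snippet above. The `Poller` class, the `poll_fn` argument and the 0.1s value for `BASE_DELAY` are placeholders for illustration, not the actual driver code:

```python
import threading
import time

BASE_DELAY = 0.1  # assumed target spacing between unit polls, in seconds


class Poller:
    """Hypothetical stand-in for the driver; only the locking pattern matters."""

    def __init__(self, unit_numbers, poll_fn):
        self.unit_numbers = list(unit_numbers)
        self.modbus_lock = threading.Lock()
        self._poll_single_unit = poll_fn  # placeholder for the real Modbus read

    def poll_all_units(self):
        for unit in self.unit_numbers:
            start = time.monotonic()
            # Hold the lock only for this unit's transaction so other
            # callers can use the bus between units.
            with self.modbus_lock:
                self._poll_single_unit(unit)
            elapsed = time.monotonic() - start
            # Sleep only for whatever is left of BASE_DELAY; skip the sleep
            # entirely if the poll (or a timeout) already used it up.
            remaining = BASE_DELAY - elapsed
            if remaining > 0:
                time.sleep(remaining)
```

The idea is that a unit that times out no longer blocks the bus for everyone else, and a slow poll is not followed by a full extra sleep on top of it.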

add more debug logging calls for visibility during testing

7105996

mslaffin requested review from bwalkerMIR and mark11778 December 15, 2024 20:56

mslaffin self-assigned this Dec 15, 2024

mslaffin added 4 commits December 16, 2024 17:01

sensor error too aggressive. Only indicate unit error after three con…

5023052

…secutive failed readings

adjust minimum temp scale limit. Tone back error classification

5d26943

quiet down logging

294e55d

increase missed package tolerance and clean up logs

a02629c

mslaffin and others added 5 commits December 20, 2024 15:29

updated initialization flowchart

1d3b247

updated update_temperature callback flowchart

b915cc8

working on read me for the process_monitor.py

35810c9

small changes, same as above commit ^

4d1a3b0

fixed typos in process_monitor.py readme and added imports to DP16 dr…

80a6155

…iver readme

mslaffin and others added 14 commits December 23, 2024 10:50

release modbus_lock between units and maintain BASE_DELAY

f6383e6

add poll_all_units flowchart to README

3af7d66

add DP16ProcessMonitor driver init flowchart to README

1c688e2

more small changes to the Readme documentation

27c49b5

reduce connection state check frequency

78fcb08

each unit operation now fully atomic, buffer clearing before each uni…

7e7261a

…t read. Simplify error paths

add disconnection check

0a83fac

Merge branch 'bugfix/cleanup-PMON' of https://github.com/bwalkerMIR/E…

ef2db7c

…BEAM_dashboard into bugfix/cleanup-PMON

increase inter-unit poll delay

84968a3

remove blue coloring

b1f9662

re-enable error counting

5ea2184

consolidate exception handling, reduce Modbus timeout

75a2be2

enforce rate limiting for critical polling error

89b10c0

spelling

c41ef73
