-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feat/stateful workflow #29
Conversation
rmm-ch
commented
Jan 31, 2025
- Added a state machine to control progression through the user workflow. The state is used to determine what to present, and is stable even when user interactions (e.g. tabs are switched) cause app reruns. Resolves bug: elements shown are not persistent if tabs are switched #23
- Input handling now incorporates coordinate extraction when present. Resolves feat: implement coordinate extraction from metadata #19
- InputObservation class got an overhaul as part of the stateful workflow. Resolves chore: cleanup InputObservation class #27
- We can also close bug: time_input failing with parsed dates #21 , the present implementation correctly populates the date if the file exifdata has it.
- classifier_image gets split into multiple functions: run inference; manual review/validation (with display of results); just display of results - workflow_state implements a FSM using `transitions`. It is a bit too simple in the end, as the idea was to have a trigger that allows moving to the next state without needing to know what that state is called (otherwise the specification of pathways by data structs doesn't simplify it). But it is too buggy this way, you can advance in places that you shouldn't. So will refactor this. - in main - we convert some more session_state to dicts, to handle image batches - add a simple widget to show progress through the workflow - the main effort to stop losing progress is on the tab_inference. lots of testing state to check what action to take. you can see several attempts, to clean up now I understand a big bug was in gating everything by the inference button.
This reverts commit 80b4be6. - I learned what I needed to but I don't like the FSM implementation, and I created plenty of mess in main that doesn't need to remain. --> reverting.
- fsm implementation uses the `transitions` package. - added unique keys to the input forms, so can check when all are filled - included a basic viz/feedback on the state
they were singular and just overwritten on each image.
- dropped the "ML running" phase for now as we don't do it async
- main bug was that every interaction with the UI led to the file_uploader being re-instantiated, and then all the inputs got re-parsed, the hashes recalculated, and the data lost. - solution is via callback, and using the session state to implicitly store the file_uploader return value (not well documented) - on change of the file_uploader state, we dynamically generate the input elements to supply the metadata. And process them inline. - TODO: the data is stable in the session_state, but the UI loses the elements for the list -- because the list hasn't changed! the callback doesn't get triggered. - Good: we don't overwrite our loaded data, and the ML/presentation can continue, but... - Bad: we don't redraw the elements. -> more caching I suppose.
- testing for the presence of the key in session_state is no longer sufficient with current implementation (we explicitly set the keys so we can handle >1 file)
- can't generate widgets within callbacks, they are not stable - flow instead is: 1. normal flow: add file_uploader with callback 2. buffer files in the callback (st.session_state) 3. normal flow: add UI elements to get metadata, for each file in buffer
see issue #27. - added image_md5 as input (1) - removed duplicate methods (3) - removed unused hash func (5) - checked by inspection the attribute coverage (4) - still to test
- the timestamp is taken from the clock at the moment, infeasible to be the same so would always update (overwrite) all observations on reruns. - also added a difference highlighter method for inspection
- hash is passed to Observation constructor
see issue #27. - added typehints (6) for all arguments, except (date, time) which always seem to be none, and need investigation to resolve (see 2) - updated the attribute uploaded_filename to uploaded_file, since it is *not* a filename, but a BytesIO-like object, `UploadedFile`.
- also stopped passing the viewcontainer arg to setup_input, since it uses the `with st.sidebar` context, which is simpler.
- still some ambiguity with the names "date_option" and "time_option" but the present change is more involved, while renaming those two can happen after if valid
- renamed the session_state variable from singular to plural - was initialised as a dict, but then overwritten each time, meaning access was order-dependent and lossy
- current implementation is to open the HF handle once, then prepare and push each observation individually. Could check docs about pushing multiple observations in one transaction. - At present the `api.upload_file` call is commented out, just get log/visual info about the actions
- no more "options", too ambiguous. The goal of this implementation is to have one date and one time for the observation, which could have been corrected manually. - see beca8fa where I did the first update
- basically all phases seem ok, almost ready for validation
- when manual validation is performed (dropdown selection among species), it is written to the observations (And not the dynamically-created dicts). - TODO: decide if we need to retain public_observations in session_state, or just generate the dict each time it is needed.
Maybe important to note:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes requested:
- add some documentation to some of the functions
- ensure variable names are explicit and easily understandable
- make the main file as small as possible. Anything that can be outsourced elsewhere should be
- basic markdown colouring doesn't support using :primary[<text>] in our version, so the code gets messy for an 'icing' feature.
st_logs provide the logging functionality, and the current method for the few places used was not persistent to tab-switches.
removed unused code, primarily commented out / older versions
I'm developing in a venv with python 3.10 as per the deployed version. I can't actually quickly test locally as the oldest I have is 3.10 (10-2021). Step 1: can you verify it is ok now? step 2: can you upgrade your dev environment to match the deployment? (Unless we need to support older versions?) |
I did a first round of main file cleaning (mentioned above re: session_state; and also obselete code removed). |
Coverage Report
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
first implementation looks very nice. good to merge!