Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Campaign observations - Abstract Point and Image data metadata #156

Merged
merged 40 commits into from
Oct 22, 2024

Conversation

jomey
Copy link
Member

@jomey jomey commented Oct 11, 2024

This is quite a big one and refactors the structure around metadata for Point and Image data as discussed in #137 and closes related issues (bottom of this description)

It also updates the test suite to use factory boy along with the integration into ptytest.

I will highlight more major changes in code comments.

Grab yourself a ☕ (or 🍺) this PR has a lot to unfold.

Close #137
Close #136
Close #132

Add new overarching table that will hold the metadata for point and
image tables. This table is set up to use Single Table Inheritance to
create a subclass to specific point and image observations.
Adds classes to handle specific functions for Point or Image
observations.
Add a class to inherit when linking a one-to-one relationship to an
observer.
This splits out metadata for individual points and image data and links
through a campaign observation. Each will have its own subclass for
specific information.
This will break the point data measurement tests and fixed in a
follow-up commit.
Update relationships for Image and Point data to CampaignObservation and
link through the respective subclasses.
Break up the get_db connection method to break out reading the
credential file and constructing the connection string into a separate
method. This prepares re-using the logic for the test DB setup.
Add the facotry-boy package along with a pytest integration. This should
make it easier to create test data for tests, especially with
relational data.
Add global pytest fixtures for the db session that allows for rolling
back any DB changes after a test has run. This allows for keeping the
schema during a whole test suite run, while changing specific test
run data setups.
Add tests for the updated scheme that tests each model/table on the
class level. This also replaces some of the schema tests to this module.
The suite as a whole is not passing since there is still a mix and match
between test data setups.
Recent renaming of the relationship needed a few more instance to
replace 'measurement' with 'measurement_type'
Make the test method name more descriptive and remove redundant
information.
Adding the date column will allow us to faster query points for a
certain date than going through the PointData table. The date on the
PointData table is also a datetime, which allows for finer grained time
specifications. This removes the date column for the ImageData since
those won't need fine-grained access as the point data.
This has been moved to the measurement_types table and is linked through
the observations relationship.
Since we are using transactions for the test suite, we might as well
commit records.
Use the SQLAlchemy 'hybrid_attribute' feature to query for dates only on
the datetime column.

Reference:
https://docs.sqlalchemy.org/en/20/orm/extensions/hybrid.html#module-sqlalchemy.ext.hybrid
Add factories to patch db connections and sessions in test class runs.
Also updates tests to use factories. The suite as a whole still fails
with the LayerData not using factories. This is the next action item.
This is required to make the tests pass as a suite since the DBSetup
method does not use transactions that can be rolled back.
snowexsql/api.py Show resolved Hide resolved
snowexsql/api.py Show resolved Hide resolved
GH actions were timing out on tests and not having the proper connection
info is hopefully the cause. This also makes for better tests :-)
So we hopefully don't wait forever if we have connection issues and
raise errors early.
@jomey
Copy link
Member Author

jomey commented Oct 11, 2024

I am out of ideas why tests for Python 3.9 through 3.12 are hanging with GitHub actions.
Maybe @micah-prime or @aaarendt could check out this PR and run it locally.
Of course, all tests run fine with my local setup and Python 3.12

@jomey
Copy link
Member Author

jomey commented Oct 11, 2024

I am out of ideas why tests for Python 3.9 through 3.12 are hanging with GitHub actions. Maybe @micah-prime or @aaarendt could check out this PR and run it locally. Of course, all tests run fine with my local setup and Python 3.12

Well now also the tests for 3.8 are hanging and I only added a test case that checks for more information in the connection string.

When I shutdown my DB locally, the tests start failing fairly quickly. Makes me think it is not a DB connection issue

@aaarendt
Copy link
Contributor

The tests hang for me when it gets to test_db.py. The api and tables tests succeed.

I'm looking at the content of pg_stat_activity while the test hangs and see this:

image

Wondering if it's something related to postgresql being strict about locking concurrent transactions, and your new use of rolling back after each test? Maybe useful: sqlalchemy/sqlalchemy#4685

Since the tables are deleted manually (as a temporary workaround), there
no need to try to rollback data.
Separate the test file in one class that tests DB tables and the other
the db module of the library.
@jomey
Copy link
Member Author

jomey commented Oct 12, 2024

The tests hang for me when it gets to test_db.py. The api and tables tests succeed.

I'm looking at the content of pg_stat_activity while the test hangs and see this:

Wondering if it's something related to postgresql being strict about locking concurrent transactions, and your new use of rolling back after each test? Maybe useful: sqlalchemy/sqlalchemy#4685

Thanks for the great detective work @aaarendt
I have updated the test suite and removed the rollback with the old session setup. I am trying not to overhaul the whole test suite with this PR and do this in a separate effort. Might have to bite the bullet though and at least replace the tests in test_db_tables with the new structure.

Along with the factory. This is the last table level test that replaces
the test_db_tables cases
@aaarendt
Copy link
Contributor

For my own learning, is it correct that e.g. lines 37-39 in api/test_layer_measurements.py (with reference to "fakeinstrument") is an example of old tests that will be superseded by your factory boy approach?
(weird that you can't comment on code that hasn't been changed in a commit...?)

Copy link
Contributor

@aaarendt aaarendt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the tests work locally when I switch to builder username. I've poked around the rest of the code to try to learn things and don't know enough yet to do a really rigorous review.

tests/test_db.py Outdated Show resolved Hide resolved
@jomey
Copy link
Member Author

jomey commented Oct 15, 2024

For my own learning, is it correct that e.g. lines 37-39 in api/test_layer_measurements.py (with reference to "fakeinstrument") is an example of old tests that will be superseded by your factory boy approach? (weird that you can't comment on code that hasn't been changed in a commit...?)

Correct. Just logged an issue for that #158

Copy link
Contributor

@micah-prime micah-prime left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work! Just a couple comments/questions

snowexsql/tables/image_data.py Show resolved Hide resolved
snowexsql/tables/point_data.py Outdated Show resolved Hide resolved
snowexsql/tables/single_location.py Outdated Show resolved Hide resolved
snowexsql/conversions.py Show resolved Hide resolved
snowexsql/api.py Show resolved Hide resolved
tests/factories/layer_data.py Show resolved Hide resolved
Mimics the used option in the client library too.
Use a datetime column for the Site and PointData for consistency. This
replaces the separate time and date column for site and combines them
into one. Also adds the hybrid attribute date to the Site object to
be able to only query for dates.

Both table now carry the `datetime` column to indicate this behavior.

PR-feedback
@jomey jomey merged commit ec449ee into release-0.6.x Oct 22, 2024
6 checks passed
@jomey jomey deleted the campaign_observations branch October 22, 2024 14:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants