SQLAlchemy 2.0 #17778

jdavcs · 2024-03-18T17:31:50Z

This PR enables SQLAlchemy 2.0 without changing any behavior. It contains bug fixes, SA 2.0 fixes, typing (not mypy-specific) error fixes, mypy error fixes, and relevant refactoring. Mypy error fixes are either edits that help mypy or type-ignores for errors that (a) are false positives, (b) require DatasetInstance models to be mapped declaratively (which will allow us to add appropriate type hints to their mapped attributes), or (c) require nontrivial refactoring or analysis that goes beyond the scope of this PR, which is narrowly focused on enabling SA 2.0. To the best of my knowledge, none of the type-ignores silence mypy errors caused by incorrect usage of SQLAlchemy-related code.

There's more SA2.0-specific typing and refactoring that can be applied to the model definition. However, I think it's best to leave it for follow-up PRs because (a) much is dependent on remapping the DatasetInstance and TS RepositoryMetadata models declaratively (for which we need to rename the metadata attribute in those classes, which is difficult), and (b) there's so much more new/better SA2.0 syntax, patterns, abstractions to consider and apply to our code base than could fit in one PR.

The commits have been cleaned up. The names are self-explanatory; more details are provided in commit descriptions where needed.

The PR is very large - sorry about that! Please let me know if I can reorganize it in any way to simplify review.

How to test the changes?

(Select all options that apply)

I've included appropriate automated tests.
This is a refactoring of components with existing test coverage.
Instructions for manual testing are as follows:
1. [add testing steps and prerequisites here if you didn't write automated tests covering all your changes]

License

I agree to license these and all my past contributions to the core galaxy codebase under the MIT license.

davelopez

Thank you @jdavcs for this titanic effort! 🎉

Most of the changes look good to me. Some of the bigger refactorings on complex queries escape my limited knowledge of SQLAlchemy and those particular models/tables, so someone else might have a closer look.

I wish we had better typing in place so we don't need that many type: ignore workarounds, but that seems like something we can try to iron out in follow-ups.

jdavcs · 2024-04-02T14:04:07Z

> I wish we had better typing in place so we don't need that many type: ignore workarounds, but that seems like something we can try to iron out in follow-ups.

Yes - I completely agree! @jmchilton expressed the same concern last month in Austin. Adding type-ignores was my last resort - in each case I made sure I tried all other options to fix the type without making major changes to the code base. In some cases I simply didn't find a way to do that: it would either require adding proper types to the 2 DatasetInstance class definitions (which requires metadata to be renamed, which is nontrivial), or require making significant changes to the code. I wanted to keep this PR as narrowly focused as possible (for a manageable size and scope of changes), so I made changes other than enabling SA2.0 only when it was a bug or something trivial. I'll go over every single added type-ignore as a follow-up (or follow-ups).

Included models: galaxy, tool shed, tool shed install

Column types:
- DateTime
- Integer
- Boolean
- Unicode
- String (Text/TEXT/TrimmedString/VARCHAR)
- UUID
- Numeric

NOTE on typing of nullability: db schema != python app
- Mapped[datetime] specifies correct type for the python app;
- nullable=True specifies correct mapping to the db schema (that's what the CREATE TABLE sql statement will reflect).

mapped_column.nullable takes precedence over typing annotation of Mapped. So, if we have:
    foo: Mapped[str] = mapped_column(String, nullable=True)
that means that the foo db field will allow NULL, but the python app will not allow foo = None. And vice-versa:
    bar: Mapped[Optional[str]] = mapped_column(String, nullable=False)
the bar db field is NOT NULL, but bar = None is OK.

This might need to be applied to other column definitions, but for now this addresses specific mypy errors.

Ref: https://docs.sqlalchemy.org/en/20/orm/declarative_tables.html#mapped-column-derives-the-datatype-and-nullability-from-the-mapped-annotation

1. In 2.0, when the statement contains "returning", the result type is ChunkedIteratorResult, which does not have the rowcount attr, because:
2. result.rowcount should not be used for statements containing the returning clause.

Ref: https://docs.sqlalchemy.org/en/20/core/connections.html#sqlalchemy.engine.CursorResult.rowcount

jmchilton

The linting and unit test errors seem legitimate - are they not? - I'd fix those before merging but otherwise I've reviewed this a few times and it looks good. I am sure there will be errors - but I don't know how we would find them without merging and getting this deployed.

jdavcs · 2024-04-02T18:16:04Z

> The linting and unit test errors seem legitimate - are they not? - I'd fix those before merging but otherwise I've reviewed this a few times and it looks good. I am sure there will be errors - but I don't know how we would find them without merging and getting this deployed.

@jmchilton They were legit - surfaced after I rebased; they were introduced in 77ef4a3. I've fixed them. They did not show up in dev because one was a warning under SA 1.4 and the other was only checked by mypy after I added the SA 2.0 types to the model definitions.

Thank you for reviewing!!

galaxyproject-sentryintegration · 2024-05-31T06:58:59Z

Suspect Issues

This pull request was deployed and Sentry observed the following issues:

‼️ OperationalError: (psycopg2.OperationalError) SSL connection has been closed unexpectedly (sqlalchemy.orm.session in get)
‼️ OperationalError: (psycopg2.OperationalError) SSL connection has been closed unexpectedly (sqlalchemy.orm.session in get)

jdavcs added the kind/enhancement, kind/refactoring, area/database, highlight, and highlight/dev labels Mar 18, 2024

jdavcs added this to the 24.1 milestone Mar 18, 2024

jdavcs force-pushed the dev_sa20 branch 2 times, most recently from 1b2b7ad to ab299a3 March 25, 2024 15:39

jdavcs requested a review from a team April 2, 2024 04:10

davelopez reviewed Apr 2, 2024

lib/galaxy/managers/export_tracker.py (resolved)

lib/galaxy/managers/notification.py (resolved)

lib/galaxy/model/__init__.py (outdated, resolved)

jdavcs force-pushed the dev_sa20 branch from 512f51c to f7f91a7 April 2, 2024 13:39

jdavcs added 18 commits April 2, 2024 10:08

Upgrade SQLAlchemy to 2.0

fa344f7

This conflicts with dependency requirements for sqlalchemy-graphene (used only in toolshed, new WIP client)

Remove RemovedIn20Warning from config

7510325

This does not exist in SQLAlchemy 2.0

Update import path for DeclarativeMeta

61c463f

Move declaration of injected attrs into constructor

9d7ae1b

Remove unused import
For context: https://github.com/galaxyproject/galaxy/pull/14717/files#r1486979280
Also, remove model attr type hints that conflict with SA2.0

Add typing to JSON columns, fix related mypy errors

7c2a1a4

Columns: MutableJSONType, JSONType, DoubleEncodedJsonType
TODO: I think we need a type alias for json-typed columns: bytes understand iteration, but not access by key.

Use correct type hints to define common model attrs

1a26f75

Start applying Mapped to relationship definitions in the model

00fc1ee

Remove column declaration from HasTags parent class

5c27fe1

Fix SA2.0 error: wrap sql in text()

937292d

Fix SA2.0 error: pass bind to create_all

3c4543b
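
A minimal sketch of the pattern this commit title refers to, assuming SQLAlchemy 2.0: `MetaData` no longer carries a bound engine, so the engine is passed to `create_all()` explicitly. The table name is hypothetical.

```python
from sqlalchemy import Column, Integer, MetaData, Table, create_engine, inspect

metadata = MetaData()  # no bind is stored on MetaData in 2.0
Table("example", metadata, Column("id", Integer, primary_key=True))  # hypothetical table

engine = create_engine("sqlite://")  # in-memory database, for illustration only
metadata.create_all(engine)  # the engine is passed explicitly

table_names = inspect(engine).get_table_names()
print(table_names)
```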

Fix SA2.0 error: use Row._mapping for keyed attribute access

fb018c5

Ref: https://docs.sqlalchemy.org/en/20/changelog/migration_20.html#result-rows-act-like-named-tuples
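
The row behavior referenced above can be sketched as follows, assuming SQLAlchemy 2.0 and an in-memory SQLite database. The query and column labels are made up for illustration.

```python
from sqlalchemy import create_engine, text

engine = create_engine("sqlite://")  # in-memory database, for illustration only

with engine.connect() as conn:
    row = conn.execute(text("SELECT 1 AS id, 'abc' AS label")).one()

# Rows behave like named tuples: positional and attribute access both work.
by_position = row[1]
by_attribute = row.label
# Dictionary-style access by column name goes through Row._mapping.
by_key = row._mapping["label"]
as_dict = dict(row._mapping)
print(by_position, by_attribute, by_key, as_dict)
```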

Fix SA2.0 error: show password in url

ecd6c36

SA 1.4: str(url) renders connection string with password
SA 2.0: str(url) renders connection string WITHOUT password
Solution: Use render_as_string(hide_password=False)

Fix SA2.0 error: use attribute_keyed_dict

3aae92b

Replaces attribute_mapped_collection (SA20)

Fix SA2.0 error: make select stmt a subquery

2254054

Rename variable to fix mypy

Fix SA2.0 error: explicitly use subquery() for select-from argument

b3cef4f
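
The idea behind this fix can be sketched as below, assuming SQLAlchemy 2.0 and an in-memory SQLite database. The table and query are hypothetical; the point is that a `select()` used as a FROM source is converted with `.subquery()` explicitly.

```python
from sqlalchemy import Column, Integer, MetaData, Table, create_engine, func, insert, select

metadata = MetaData()
example = Table("example", metadata, Column("id", Integer, primary_key=True))  # hypothetical table

engine = create_engine("sqlite://")  # in-memory database, for illustration only
metadata.create_all(engine)

inner = select(example.c.id).where(example.c.id > 1)
# The inner SELECT is turned into a named subquery before it is selected from.
stmt = select(func.count()).select_from(inner.subquery())

with engine.begin() as conn:
    conn.execute(insert(example), [{"id": 1}, {"id": 2}, {"id": 3}])
    count = conn.execute(stmt).scalar_one()

print(count)
```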

Fix SA2.0 error: replace session.bind with session.get_bind()

72dcc40
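
A short sketch of the replacement call, assuming SQLAlchemy 2.0. For a session created with a single engine, `get_bind()` returns that engine.

```python
from sqlalchemy import create_engine
from sqlalchemy.orm import Session

engine = create_engine("sqlite://")  # in-memory database, for illustration only

with Session(engine) as session:
    # get_bind() resolves the engine (or connection) the session would use.
    bound = session.get_bind()

print(bound is engine)
```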

Fix SA2.0 error: joinedload does not take str args

0e61d61

jdavcs added 22 commits April 2, 2024 10:08

Mypy: type-ignore hda attr-defined error

a6162ed

Need to map declaratively to remove this

Convert visualization manager index query to SA Core

82f8438

Mypy: session is not none

3fb0c9c

Mypy: type-ignore what requires more refactoring

b146837

Mypy: type-ignore hda, ldda attrs: need declarative mapping

4255c15

Also, minor SA2.0 syntax fix

Mypy: type-ignores to handle late evaluation of relationship arguments

eb46a08

Mypy: type-ignore column property assignments (type is correct)

f3740c6

Mypy: typing errors, misc. fixes

78a77d9

Mypy: all statements are reachable

f9cebc6

Mypy: need to map hda declaratively, then its parent is model.Base

3e5c46b

Fix typing errors: sharable, secured

05a5b01

Fix package mypy errors

28fd6ad

Wrap call to ensure session is closed

6a800c0

Otherwise there's an idle transaction left in the database (+locks)

Ensure session is closed on TS Registry load

3c74f9d

Same as prev. commit: otherwise db locks are left

Fix SA2.0 error: list arg to select; mypy

631b504

Use NullPool for sqlite engines

f52d35b

This restores the behavior under SQLAlchemy 1.4 (Note that we set the pool for sqlite only if it's not an in-memory db)
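
One way the described rule could look, as a sketch only: the `pool_kwargs` helper below is hypothetical and is not Galaxy's actual code. It assumes SQLAlchemy 2.0 and uses a simplified in-memory check that does not cover URI-style filenames.

```python
import os
import tempfile

from sqlalchemy import create_engine
from sqlalchemy.engine import make_url
from sqlalchemy.pool import NullPool


def pool_kwargs(url: str) -> dict:
    """Hypothetical helper: request NullPool only for file-based SQLite URLs."""
    parsed = make_url(url)
    is_sqlite = parsed.get_backend_name() == "sqlite"
    in_memory = parsed.database in (None, "", ":memory:")
    return {"poolclass": NullPool} if is_sqlite and not in_memory else {}


# A throwaway file path; create_engine() does not open the file yet.
db_path = os.path.join(tempfile.mkdtemp(), "example.sqlite")
file_url = f"sqlite:///{db_path}"

file_engine = create_engine(file_url, **pool_kwargs(file_url))
memory_engine = create_engine("sqlite://", **pool_kwargs("sqlite://"))

print(type(file_engine.pool).__name__, type(memory_engine.pool).__name__)
```

With `NullPool`, each checkout opens a new connection and closing it releases the file handle, which is the behavior the commit message says it restores for file-based databases.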

Help mypy: job is never None

2e0ec6f

Add Decimal to accepted types by util.nice_size()

dc2621b

Fix linting after rebase

e35d017

Use model_dump_json() instead of deprecated json()

f07a6f2

Add AssociationProxy type, drop type-ignore

a8f2f2c

jmchilton approved these changes Apr 2, 2024

jdavcs added 2 commits April 2, 2024 14:03

Fix new bug: wrap raw sql in text()

1c4b114

Fix new bug: incorrect type in calculate_disk_usage_per_objectstore

2c527d4

jdavcs force-pushed the dev_sa20 branch from 0e84001 to 2c527d4 April 2, 2024 18:10

jdavcs merged commit af53d03 into galaxyproject:dev Apr 3, 2024
51 of 54 checks passed

jdavcs mentioned this pull request Apr 12, 2024

Model typing and SA2.0 follow-up #17958

Merged

4 tasks
