Skip to content
This repository has been archived by the owner on Sep 18, 2024. It is now read-only.

[Task]: Finish adding Postgres Integration to Analytics Library #72

Merged
merged 14 commits into from
Jun 12, 2024

Conversation

aplybeah
Copy link

@aplybeah aplybeah commented Jun 6, 2024

Summary

Fixes #45

Time to review: 3 mins

Changes proposed

  • update config.py database url
  • add function in cli.py
  • updated packages in poetry.lock

Context for reviewers

N/A

Additional information

Row created manually in the database alongside a row created via test_connection
Screen Shot 2024-06-11 at 1 49 53 PM

Copy link
Collaborator

@chouinar chouinar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some notes regarding the DB connection

Comment on lines 23 to 25
]
database_url="postgresql://user:password@location:port/dbname" # get vars from the api?
,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you've got access to env vars here, could build this like we do for the flask API:

https://github.com/navapbc/simpler-grants-gov/blob/main/api/src/adapters/db/clients/postgres_client.py#L79

This just takes in a config object that is loading env vars, defined in: https://github.com/navapbc/simpler-grants-gov/blob/main/api/src/adapters/db/clients/postgres_config.py

Handles IAM auth as well.


Note, you might want the start of this to be postgresql+psycopg to tell it to use the latest version of psycopg (I think it defaults to psycopg2? They named the 3rd version just psycopg, its weird): https://docs.sqlalchemy.org/en/20/dialects/postgresql.html#module-sqlalchemy.dialects.postgresql.psycopg

A SQLAlchemy engine object representing the connection to the database.
"""

return create_engine(settings.database_url, pool_pre_ping=True)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we have any PII data, we should set hide_parameters=True so that any DB errors don't dump the contents of our queries to the logs.

If we want connection pooling (no idea the run pattern of this tool), can see some of that in the API implementation of this: https://github.com/navapbc/simpler-grants-gov/blob/main/api/src/adapters/db/clients/postgres_client.py#L49

@aplybeah
Copy link
Author

Food for thought: The analytics linter is failing in two places; a docstring and a command due to length (both are over 100 char) Splitting the command is going to break it, and I haven't been able to split the docstrings any further than what is present. Is there a way to tweak this check?

@aplybeah aplybeah requested a review from chouinar June 11, 2024 16:32
@aplybeah aplybeah marked this pull request as ready for review June 11, 2024 16:32
@aplybeah aplybeah requested a review from coilysiren as a code owner June 11, 2024 16:32
@coilysiren
Copy link
Collaborator

@aplybeah

Is there a way to tweak this check?

astral-sh/ruff#3442

connection = engine.connect()

# Test INSERT INTO action
# splitting this to appease E501 Line too long (184 > 100 characters will break the command
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For what its worth, you can split this without breaking the command, you use a multi-line string, eg.

text(
    "INSERT INTO audit_log (topic,timestamp, end_timestamp, user_id, details) "
    "VALUES('test', ' 2024-06-11 10:41:15+00', '2024-06-11 10:54:15+00', 87654, 'test from command');",
)

Two strings next to each other creates a multi-line string.

You can also of course use no-qa on the linter.

A SQLAlchemy engine object representing the connection to the database.
"""
return create_engine(
f"postgresql+psycopg://{settings.user}:{settings.password}@{settings.host}:{settings.port}",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⭐   (very much optional, just nice to have) recommend testing and documenting exactly how to override these settings with environment variables.

Comment on lines 1 to 4
NAME = "app"
HOST = "0.0.0.0"
USER = "app"
PORT = 5432
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⭐⭐   (optional but seriously consider) recommend prefixing all of these with POSTGRES_, eg. POSTGRES_USER

@chouinar
Copy link
Collaborator

@aplybeah

Is there a way to tweak this check?

astral-sh/ruff#3442

https://github.com/navapbc/simpler-grants-gov/blob/main/analytics/pyproject.toml#L53 - it might also be worth adjusting the rules we have enabled. Whoever setup the analytics code enabled everything in ruff and then ignored a few.

In the API, we did the opposite (enabled specific rules), partially because the tool we were using before had fewer rules, and partially due to not agreeing with / not wanting to fix 1000+ issues.

Copy link
Collaborator

@coilysiren coilysiren left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Comment on lines +139 to +140
"INSERT INTO audit_log (topic,timestamp, end_timestamp, user_id, details)"
"VALUES('test','2024-06-11 10:41:15','2024-06-11 10:54:15',87654,'test from command');",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"INSERT INTO audit_log (topic,timestamp, end_timestamp, user_id, details)"
"VALUES('test','2024-06-11 10:41:15','2024-06-11 10:54:15',87654,'test from command');",
"INSERT INTO audit_log (topic,timestamp, end_timestamp, user_id, details) "
"VALUES('test','2024-06-11 10:41:15','2024-06-11 10:54:15',87654,'test from command');",

Multiline strings don't turn newlines into spaces by default, so you need to add a trailing space on each line

@aplybeah
Copy link
Author

Expected Failures:

FAILED tests/integrations/test_slack.py::test_fetch_slack_channels - TypeError: 'NoneType' object is not subscriptable
FAILED tests/integrations/test_slack.py::test_upload_files_to_slack_channel - TypeError: 'NoneType' object is not subscriptable

These require Slack Auth

@aplybeah aplybeah merged commit ddf0d75 into main Jun 12, 2024
6 of 7 checks passed
@aplybeah aplybeah deleted the export-to-database branch June 12, 2024 14:37
acouch pushed a commit that referenced this pull request Sep 18, 2024
Fixes #45

* update `config.py` database url
* add function in `cli.py`
* updated packages in `poetry.lock`

N/A

Row created manually in the database alongside a row created via
`test_connection`
![Screen Shot 2024-06-11 at 1 49 53
PM](https://github.com/navapbc/simpler-grants-gov/assets/37313082/b83afad8-5fe1-404f-adf3-c94945740bbe)
acouch pushed a commit that referenced this pull request Sep 18, 2024
Fixes #45

* update `config.py` database url
* add function in `cli.py`
* updated packages in `poetry.lock`

N/A

Row created manually in the database alongside a row created via
`test_connection`
![Screen Shot 2024-06-11 at 1 49 53
PM](https://github.com/navapbc/simpler-grants-gov/assets/37313082/b83afad8-5fe1-404f-adf3-c94945740bbe)
acouch pushed a commit to HHS/simpler-grants-gov that referenced this pull request Sep 18, 2024
…pbc#72)

Fixes #45

* update `config.py` database url
* add function in `cli.py`
* updated packages in `poetry.lock`

N/A

Row created manually in the database alongside a row created via
`test_connection`
![Screen Shot 2024-06-11 at 1 49 53
PM](https://github.com/navapbc/simpler-grants-gov/assets/37313082/b83afad8-5fe1-404f-adf3-c94945740bbe)
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Task]: Finish adding Postgres Integration to Analytics Library
3 participants