Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(oonifindings): add postgresql support #862

Merged
merged 11 commits into from
Aug 10, 2024
Merged

feat(oonifindings): add postgresql support #862

merged 11 commits into from
Aug 10, 2024

Conversation

DecFox
Copy link
Contributor

@DecFox DecFox commented Jul 17, 2024

This diff replaces the clickhouse database used in the oonifindings (earlier incidents) service with a postgresql database. Closes #855.

sa.Column("published", sa.Integer(), nullable=False),
sa.Column("deleted", sa.Integer(), nullable=False, default=0),
sa.Column("CCs", sa.JSON(), nullable=True),
sa.Column("ASNs", sa.JSON(), nullable=True),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use consistent capitalization for these columns?

I would suggest country_codes and asns

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done here: af0efde

sa.String(),
nullable=False,
primary_key=True,
),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a column which is called something like finding_slug?

Also since we are now calling them findings, maybe this should be called finding_id or maybe even just id?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you elaborate a little more on what the finding_slug column represents? made the finding_id change here: 2289ced

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically we would like to have the ability to have some kind of slug for each finding so that instead of it being this opaque number for example https://explorer.ooni.org/findings/45013413801 it should be possible to configure it be https://explorer.ooni.org/findings/2024-07-bangladesh-blocks-facebook. This needs to be stored in the DB and we need to have the ability to lookup both by number (we can assume a number is the old format) and then to a permanent redirect to the new slug based format (which is nicer for humans and search engines).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for explaining this! I added a nullable finding_slug column for now: https://github.com/ooni/backend/pull/862/files#diff-ebc4627fea3725c4e6f1dafa16bfb842ed5cfdb74c8b292e203fc07cd2581ecbR30 along with a TODO note in the models. We should be able to add the endpoint in the next version and also make this non-nullable (if required)

@@ -3,26 +3,26 @@
"""

from datetime import datetime, timezone
from typing import List, Dict, Optional, Union, Tuple, Any, Annotated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we maybe bump the version number on the router endpoint? As in make into v2. That way if we break something badly we can always revert back to speaking to the old backend as v1

@@ -122,7 +134,7 @@ class OONIFindingIncidents(BaseModel):
@router.get(
"/v1/incidents/search",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there is an opportunity here to improve a bit the semantics of these endpoints. We should speak to @majakomel about it though.

For example if we want to make this a truly restful endpoint, then we should not have the verb in the endpoint, but rather just have it be /v2/findings?[query_params]

incident_model = OONIFinding.model_validate(incident)
incident_models.append(incident_model)
return OONIFindingIncidents(incidents=incident_models)
return OONIFindingIncidents(incidents=findings)


@router.get(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, there is no need in RESTful API design to include the verb in the path, it can just be /v2/finding/{finding_id | slug}

raise HTTPException(status_code=400, detail="invalid query paramters")
return True


@router.post(
"/v1/incidents/create",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, no verb is needed, we can just call it POST /v2/finding

setnocacheresponse(response)
return OONIFindingsUpdateResponse(r=r, id=incident_id)
return OONIFindingsUpdateResponse(r=1, id=incident_id)


@router.post(
"/v1/incidents/update",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PUT /v2/finding/

models.OONIFinding.incident_id == incident_id
)
if token["role"] == "user":
q = q.filter(models.OONIFinding.creator_account_id == account_id)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if you can just do all of this in a single query, where you get one row and then apply the authentication checks once you have the row depending on the token role.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestion. You are right. I made some changes here: f4c97af. In most cases, I have tried to assemble the query logic into a single query.

setnocacheresponse(response)
return {}



@router.post(
"/v1/incidents/{action}",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no need for this to be a separate endpoint, it can just be part of the update endpoint where an update request is made with the update column being set.

Copy link
Member

@hellais hellais left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note my comments are mostly related on how I think we ought to make the future of ooni findings look like.

I think we can start off by just swapping out the DB (as you did) and then improve the API based on my comments in a second moment, once we ensure that the existing API is 100% compatible with the existing one.

When and if we decide to do this, it's essential we also coordinate with @majakomel so that we don't break anything on the frontend.

nullable=False,
primary_key=True,
),
sa.Column("create_time", sa.DateTime(timezone=True), nullable=False),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you are missing the finding_slug here and you should rename the incident_id to finding_id, also the capitalization of the CCs ASNs, etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you for pointing this out! This was an unrelated change we had to undo since this is the oonirun service. I have removed this from the diff now.

Copy link
Member

@hellais hellais left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

We can do all the v2 stuff later.

@DecFox DecFox merged commit 53a3bf7 into master Aug 10, 2024
6 checks passed
@DecFox DecFox deleted the issue/855 branch August 10, 2024 12:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Port findings platform to use postgresql
2 participants