Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add CI workflow to check for incorrect catalog updates #6091

Closed
wants to merge 2 commits into from

Conversation

konskov
Copy link
Contributor

@konskov konskov commented Sep 19, 2023

When adding or dropping columns to timescaledb catalog tables, the
right way to do this is to drop and recreate the table with the desired
definition, instead of doing ALTER TABLE .. ADD/DROP COLUMN. This is
required to ensure consistent attribute numbers across versions.
This workflow will fail with an error if it detects such an incorrect
modification of a catalog table.

Fixes #6049

@codecov
Copy link

codecov bot commented Sep 19, 2023

Codecov Report

Merging #6091 (38754f1) into main (42c3750) will decrease coverage by 0.04%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##             main    #6091      +/-   ##
==========================================
- Coverage   81.52%   81.48%   -0.04%     
==========================================
  Files         246      246              
  Lines       56781    56735      -46     
  Branches    12588    12569      -19     
==========================================
- Hits        46288    46233      -55     
- Misses       8092     8109      +17     
+ Partials     2401     2393       -8     

see 44 files with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@konskov konskov force-pushed the ci_check_latest_dev branch 10 times, most recently from 4e689b0 to ee6d793 Compare September 20, 2023 08:45
@konskov konskov changed the title Add python script and workflow to check catalog updates Add CI workflow to check for incorrect catalog updates Sep 20, 2023
@konskov konskov marked this pull request as ready for review September 20, 2023 08:47
@github-actions
Copy link

@jnidzwetzki, @erimatnor: please review this pull request.

Powered by pull-review

# also need a function to separate comments from actual commands
def remove_comments(lines):
code_lines = []
for line in lines:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More idiomatic I believe:

code_lines = (line for line in lines if not line.startswith("--"))
return "\n".join(code_lines)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @wrobell

ref: ${{ github.event.pull_request.head.sha }}
fetch-depth: 0
- name: Check if latest-dev was modified
shell: bash --norc --noprofile {0}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the reason for these parameters? Could you add a comment?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tbh I'm not sure why, I see we specify these in other workflows but I don’t know the reason, so let’s test without them and find out if they’re needed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it seems bash {0} is enough and -eo pipefail {0} is what we need to avoid in this workflow which is by default set if we simply use the bash option (https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#jobsjob_idstepsshell)

Copy link
Contributor

@jnidzwetzki jnidzwetzki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks reasonable. Added a few minor comments.

@konskov konskov force-pushed the ci_check_latest_dev branch 5 times, most recently from cfdaf90 to 1ebb4fb Compare September 25, 2023 08:37
@konskov konskov force-pushed the ci_check_latest_dev branch 6 times, most recently from 5f72946 to ba6b5c4 Compare September 25, 2023 10:01
@konskov konskov marked this pull request as draft September 25, 2023 10:39
@konskov konskov force-pushed the ci_check_latest_dev branch from ec24cb0 to 9150d4f Compare September 25, 2023 11:15
When adding or dropping columns to timescaledb catalog tables, the
right way to do this is to drop and recreate the table with the desired
definition, instead of doing ALTER TABLE .. ADD/DROP COLUMN. This is
required to ensure consistent attribute numbers across versions.
This workflow will fail with an error if it detects such an incorrect
modification of a catalog table.

Fixes timescale#6049
@konskov konskov force-pushed the ci_check_latest_dev branch from 9150d4f to 2c62b33 Compare September 25, 2023 11:15
@konskov konskov marked this pull request as ready for review September 25, 2023 11:17
@timescale timescale deleted a comment from github-actions bot Sep 25, 2023
@konskov konskov requested review from svenklemm and removed request for akuzm and erimatnor September 25, 2023 11:18
@konskov konskov force-pushed the ci_check_latest_dev branch from 2c62b33 to 38754f1 Compare September 25, 2023 11:20
@konskov
Copy link
Contributor Author

konskov commented Sep 26, 2023

As discussed at standup today, two more checks need to be added:

  • CREATE TEMP TABLE (should not be allowed because pgextwlist does not allow the use of temporary tables, so better to catch this early)
  • ALTER TABLE _timescaledb_catalog.oldname RENAME TO newname

@konskov konskov marked this pull request as draft September 26, 2023 14:05
@svenklemm
Copy link
Member

Have you considered using proper sql parser instead of "grepping" the text. Using proper sql parser would be more robust.

pgspot uses https://pypi.org/project/pglast/

Comment on lines +26 to +33
run: |
file="sql/updates/latest-dev.sql"
updated=0
diff=$(gh pr view $PR_NUMBER --json files --jq '.files[] | select(.path == "sql/updates/latest-dev.sql") | .path')
if [[ ${diff} != "" ]]; then
updated=1
fi
echo "latestdev_updated=$updated" >> $GITHUB_OUTPUT
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we want to run this always. This should be pretty quick check and not running it when the script itself is adjusted seems suboptimal. In addition we also want to run this on reverse-dev.sql

@konskov
Copy link
Contributor Author

konskov commented Sep 28, 2023

I will update this PR or create a new one to use pglast instead of regular expressions, working on the changes here https://github.com/konskov/timescaledb/tree/ci_check_alternative_pglast

@konskov
Copy link
Contributor Author

konskov commented Sep 29, 2023

closing in favor of #6130

@konskov konskov closed this Sep 29, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Enhancement]: CI check/warning for latest-dev updates
4 participants