Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Run postgres datastore tests with pgbouncer #1594

Merged
merged 2 commits into from
Oct 27, 2023

Conversation

bradengroom
Copy link
Contributor

@bradengroom bradengroom commented Oct 19, 2023

Fixes #1217

@bradengroom
Copy link
Contributor Author

I'll take a look at the test failures as I have time

@jzelinskie jzelinskie added the area/datastore Affects the storage system label Oct 19, 2023
@github-actions github-actions bot removed the area/datastore Affects the storage system label Oct 20, 2023
@bradengroom bradengroom force-pushed the pgbouncer branch 2 times, most recently from 64b89cc to 97ad115 Compare October 20, 2023 03:34
Comment on lines 97 to 98
"POOL_MODE=session", // https://github.com/authzed/spicedb/issues/1217
"IGNORE_STARTUP_PARAMETERS=plan_cache_mode", // Tell pgbouncer to pass this param thru to postgres.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are the two pgbouncer settings that need to be documented.

@bradengroom
Copy link
Contributor Author

The spanner tests failed, but that feels like an unrelated flaky test

@bradengroom bradengroom marked this pull request as ready for review October 20, 2023 03:44
@bradengroom bradengroom requested a review from a team October 20, 2023 03:44
Copy link
Contributor

@vroldanbet vroldanbet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bradengroom this is cool! I think we would want to retain the tests talking directly to Postgres too, would you mind redesigning this so it's an option? then we can run this as another datastore GitHub Action in parallel with the non-pgbouncer enabled test suite.

@bradengroom
Copy link
Contributor Author

@bradengroom this is cool! I think we would want to retain the tests talking directly to Postgres too, would you mind redesigning this so it's an option? then we can run this as another datastore GitHub Action in parallel with the non-pgbouncer enabled test suite.

Sure thing. Makes sense!

@github-actions github-actions bot added the area/tooling Affects the dev or user toolchain (e.g. tests, ci, build tools) label Oct 23, 2023
Comment on lines 24 to 25
testArgs := append([]string{"test", "-failfast", "-count=1"}, args...)
return RunSh(goCmdForTests(), WithV(), WithDir(dir), WithArgs(testArgs...))(path)
testArgs := append([]string{"test", path, "-failfast", "-count=1"}, args...)
return RunSh(goCmdForTests(), WithV(), WithDir(dir))(testArgs...)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tests were unable to access the -enablePgbouncer flag if it came before the path for some reason. After looking at this, it didn't feel like there was any reason that the path needed to be treated as a unique argument. Now I treat them similarly and place path at the front.

@bradengroom
Copy link
Contributor Author

I didn't want to structure this like all of the other datastore tests by creating a new folder for pgbouncer tests. I wanted to run all existing postgres tests through the pgbouncer setup. So I enabled this by adding an -enablePgbouncer flag that may be passed to the postgres tests.

I'm open to other ideas in supporting this.

@bradengroom
Copy link
Contributor Author

Rebased. I think the currently failing test is possibly flaky. (unrelated? )

@vroldanbet
Copy link
Contributor

@bradengroom 🤔 The way this is tested seems to introduce a new pattern into the codebase instead of using the ones in place. The way I imagined we would implement is by adding a new parameter to testdatastore.RunPostgresForTesting:

b := testdatastore.RunPostgresForTesting(t, "", config.targetMigration, config.pgVersion)

You can see that TestPostgresDatastore method fans out in different combinations of test environments that are run in parallel, like for example various postgres versions:

for _, config := range []struct {
targetMigration string
migrationPhase string
pgVersion string
}{
{"head", "", pgversion.MinimumSupportedPostgresVersion},
{"head", "", "14"},
{"head", "", "15"},
{"head", "", "16"},
} {
config := config
t.Run(fmt.Sprintf("postgres-%s-%s-%s", config.pgVersion, config.targetMigration, config.migrationPhase), func(t *testing.T) {
t.Parallel()
b := testdatastore.RunPostgresForTesting(t, "", config.targetMigration, config.pgVersion)
test.All(t, test.DatastoreTesterFunc(func(revisionQuantization, gcInterval, gcWindow time.Duration, watchBufferLength uint16) (datastore.Datastore, error) {
ds := b.NewDatastore(t, func(engine, uri string) datastore.Datastore {
ds, err := newPostgresDatastore(uri,
RevisionQuantization(revisionQuantization),
GCWindow(gcWindow),
GCInterval(gcInterval),
WatchBufferLength(watchBufferLength),
DebugAnalyzeBeforeStatistics(),
MigrationPhase(config.migrationPhase),
)
require.NoError(t, err)
return ds
})
return ds, nil
}))

@github-actions github-actions bot added the area/datastore Affects the storage system label Oct 26, 2023
@bradengroom
Copy link
Contributor Author

@vroldanbet Thanks for the review!

then we can run this as another datastore GitHub Action in parallel with the non-pgbouncer enabled test suite.

I think I over-indexed on this comment in order to get the separate CI steps. I've updated the postgres steps to include pgbouncer tests. Let me know if I've still misunderstood something. I don't mind iterating on it until the preferred patterns are met.

@bradengroom
Copy link
Contributor Author

There is a failing CI step, but it looks like a vulnerable dependency that should be addressed in a separate PR.

Copy link
Contributor

@vroldanbet vroldanbet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes look good to me! I left a few questions. I think my main concern is that CI time has increased greatly, making the postgres datastore test now the slowest of the CI jobs:

image

This is not surprising given we are greatly increasing the combination of parallel tasks to run, and please note we are using beefier runners here.

Sorry for the back and forth - I think the foundation is the right one. We can certainly run these tests on our local machines but CI may need more granular jobs. Another alternative would be to reduce the number of combinations, although I think being exhaustive here is a good thing. I think that for now adding a new pgbouncer datastore as you did initially may be a good compromise, but hopefully we can now avoid using specific environment variables to tweak the behaviour.

The failed trivy is yet another gRPC vulnerability (we fixed one just 2 weeks ago) that is triggered via grpc-health-check. I see they have updated to a new version but it hasn't been released yet, so we will need to build from a specific commit. I'll open a PR. EDIT: I've opened #1613

internal/testserver/datastore/postgres.go Show resolved Hide resolved
internal/testserver/datastore/postgres.go Show resolved Hide resolved
internal/datastore/postgres/postgres_test.go Outdated Show resolved Hide resolved
internal/datastore/postgres/postgres_test.go Outdated Show resolved Hide resolved
internal/testserver/datastore/postgres.go Show resolved Hide resolved
"AUTH_TYPE=md5", // use the same auth type as postgres
"MAX_CLIENT_CONN=" + POSTGRES_TEST_MAX_CONNECTIONS,
// params needed for spicedb
"POOL_MODE=session", // https://github.com/authzed/spicedb/issues/1217
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm aware this is advice provided by the community (I certainly didn't verify this myself) but it's surprising given session is the default. I wonder if this is related to the zalando operator mentioned by the user in Discord not setting session as the default. Anyway, it does not hurt adding it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah it looks like the Zalando operator chooses a different default (transaction) for some reason:
https://github.com/zalando/postgres-operator/blob/409e4c78344f397d2c771129bcfb6968f84c1972/docs/reference/operator_parameters.md?plain=1#L998

By the way, we don't have to specify this setting here. I was just trying to explicitly document it in the test. I can remove this line and the tests continue to run successfully though. I'm happy to remove it if you prefer

@bradengroom
Copy link
Contributor Author

bradengroom commented Oct 27, 2023

I think that for now adding a new pgbouncer datastore as you did initially may be a good compromise, but hopefully we can now avoid using specific environment variables to tweak the behaviour.

@vroldanbet Do you have thoughts on what that'd look like? I think maybe I'm missing an alternate pattern that you're seeing.

My goal is to avoid writing new tests specifically for pgbouncer. (Every test for postgres should be a valid test for pgbouncer). I'd like to just point all of the existing tests at both options.

So I need to signal to the test suite somehow that we care about testing pgbouncer vs postgres in the current run.

Each github action points to a function in the magefile setup. Each function invokes this datastoreTest function which invokes a new go test command.

return goTest(fmt.Sprintf("./internal/datastore/%s/...", datastore), "-tags", "ci,docker", "-timeout", "10m")

Since we have to go through a go test command, it feels like I have to communicate my intent through the command somehow. Leaving me with few options: environment variables or cli flags. Maybe the build tags are also an option? I'm less familiar with that.

Sorry if I'm overlooking something obvious and making this more complicated than it should be 😅

This is done solely so that CI builds don't take too long.
As a solution this is run as a separate datastore test
leveraging build tags

consistency tests are left out of scope for now
@vroldanbet
Copy link
Contributor

@bradengroom I've pushed 8d1dc41 with what I think is a sensible approach to have those tests run in different CI jobs. Thoughts?

Copy link
Contributor

@vroldanbet vroldanbet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Kudos for all the clean up and being willing to look into all suggestions! 🎉

I suspect the tests were already slow the moment I added the PR to test multiple PG versions 😢 They continue to take the same amount of time. I think we can work on that on a follow up if needed.

@vroldanbet
Copy link
Contributor

I think there may be a race happening with database creation, but I think it predated this PR:

Error Trace:	/home/runner/actions-runner/_work/spicedb/spicedb/internal/testserver/datastore/postgres.go:[13](https://github.com/authzed/spicedb/actions/runs/6665578278/job/18115460184?pr=1594#step:4:14)4
            	            				/home/runner/actions-runner/_work/spicedb/spicedb/internal/datastore/postgres/postgres_shared_test.go:207
            	            				/home/runner/actions-runner/_work/spicedb/spicedb/pkg/datastore/test/datastore.go:33
            	            				/home/runner/actions-runner/_work/spicedb/spicedb/pkg/datastore/test/namespace.go:38
            	            				/home/runner/actions-runner/_work/spicedb/spicedb/pkg/datastore/test/datastore.go:72
            	Error:      	Received unexpected error:
            	            	failed to connect to `host=localhost user=postgres database=db5d7fdd59`: server error (FATAL: database "db5d7fdd59" does not exist (SQLSTATE 3D000))
            	Test:       	TestPostgresDatastoreWithPgBouncerWithoutCommitTimestamps/postgres-13.8

@vroldanbet vroldanbet added this pull request to the merge queue Oct 27, 2023
@vroldanbet vroldanbet removed this pull request from the merge queue due to a manual request Oct 27, 2023
@vroldanbet vroldanbet added this pull request to the merge queue Oct 27, 2023
Merged via the queue into authzed:main with commit 6770f9b Oct 27, 2023
@github-actions github-actions bot locked and limited conversation to collaborators Oct 27, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area/datastore Affects the storage system area/tooling Affects the dev or user toolchain (e.g. tests, ci, build tools)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

support pgbouncer with SpiceDB Postgres datastore w/o session pooler mode
3 participants