Geospatial support for Snowflake #516

Merged: Amogh-Bharadwaj merged 12 commits into main from postgis-support on Oct 16, 2023

Conversation

Amogh-Bharadwaj

  • Supports syncing PostGIS GEOGRAPHY and GEOMETRY columns from PostgreSQL to Snowflake, for both CDC and QRep
  • Supports syncing Postgres' built-in POINT data type to Snowflake
  • Adds a unique timestamp to the prefix used by the S3 test

@@ -75,6 +76,7 @@ func GetAvroSchemaDefinition(
	nullableFields := map[string]bool{}

	for _, qField := range qRecordSchema.Fields {
		log.Infof("qField name: %s, qField type: %s", qField.Name, qField.Type)

remove log

@@ -0,0 +1,6 @@
package model

type ColumnInformation struct {

add comments on what column map example would look like
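
One way the requested comments could look is sketched below. The fields of `ColumnInformation` are not visible in this excerpt, so the `ColumnMap` and `Columns` fields, and the example values in the comments, are assumptions made purely for illustration.

```go
package main

import "fmt"

// ColumnInformation is a hypothetical reconstruction: the real struct's
// fields are not shown in the diff excerpt, so both fields are assumed.
type ColumnInformation struct {
	// ColumnMap maps a column name to its destination column type.
	// Example: {"id": "NUMBER", "location": "GEOGRAPHY"}
	ColumnMap map[string]string
	// Columns lists the column names in table order, because Go maps
	// do not preserve insertion order.
	// Example: ["id", "location"]
	Columns []string
}

func main() {
	info := ColumnInformation{
		ColumnMap: map[string]string{"id": "NUMBER", "location": "GEOGRAPHY"},
		Columns:   []string{"id", "location"},
	}
	// Iterate in the stable order given by Columns.
	for _, col := range info.Columns {
		fmt.Printf("%s -> %s\n", col, info.ColumnMap[col])
	}
}
```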

iskakaushik left a comment

Let's store the typid-to-typename map in the constructor rather than passing around the ConnStr, and pass the map around as needed.
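
A minimal sketch of that suggestion, assuming a simplified executor: the map is loaded once in the constructor and then looked up wherever it is needed. `QueryExecutor`, `typeLoader`, `NewQueryExecutor`, and `TypeName` are hypothetical names, and the loader stands in for whatever query the real code runs against the database.

```go
package main

import "fmt"

// typeLoader is a hypothetical stand-in for a catalog query that
// returns a map from type OID to type name.
type typeLoader func() (map[uint32]string, error)

// QueryExecutor is a simplified stand-in for the executor in this PR.
type QueryExecutor struct {
	customTypeMap map[uint32]string
}

// NewQueryExecutor loads the OID-to-name map once, so later calls do not
// need a connection string to resolve custom types.
func NewQueryExecutor(load typeLoader) (*QueryExecutor, error) {
	m, err := load()
	if err != nil {
		return nil, fmt.Errorf("failed to load custom types: %w", err)
	}
	return &QueryExecutor{customTypeMap: m}, nil
}

// TypeName resolves a type OID to its name, reporting whether it was found.
func (qe *QueryExecutor) TypeName(oid uint32) (string, bool) {
	name, ok := qe.customTypeMap[oid]
	return name, ok
}

func main() {
	// OIDs of extension types differ per database; these values are made up.
	loader := func() (map[uint32]string, error) {
		return map[uint32]string{16390: "geometry", 16999: "geography"}, nil
	}
	qe, err := NewQueryExecutor(loader)
	if err != nil {
		panic(err)
	}
	name, ok := qe.TypeName(16390)
	fmt.Println(name, ok)
}
```

Helpers that only need the lookup can then take the `map[uint32]string` directly, which matches the `customTypeMap` parameter added to `mapRowToQRecord` in this PR.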


var qValueKind qvalue.QValueKind
switch typeName {
case "geometry":

"point"?

Amogh-Bharadwaj (author) commented on Oct 15, 2023

A PostGIS point falls under GEOMETRY, for example GEOMETRY('POINT(1 2)'), just like GEOMETRY('LINESTRING(1 2, 3 4)').
The point data type this PR adds support for is Postgres' built-in point type, which is not a custom type.
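
The distinction can be sketched as follows: the built-in point type has a fixed OID (600 in the Postgres catalog), while PostGIS types are extension types whose OIDs vary per database and so have to be resolved by name. The `Kind` constants and `kindForOID` are hypothetical names for illustration, not the identifiers used in this PR.

```go
package main

import "fmt"

// Kind is a hypothetical stand-in for the value-kind enum in this PR.
type Kind string

const (
	KindGeometry  Kind = "geometry"
	KindGeography Kind = "geography"
	KindPoint     Kind = "point"
	KindInvalid   Kind = "invalid"
)

// pointOID is the fixed OID of Postgres' built-in point type.
const pointOID uint32 = 600

// kindForOID checks the built-in point type first, then falls back to
// the name-based lookup needed for extension types such as PostGIS.
func kindForOID(oid uint32, customTypes map[uint32]string) Kind {
	if oid == pointOID {
		return KindPoint
	}
	switch customTypes[oid] {
	case "geometry":
		return KindGeometry
	case "geography":
		return KindGeography
	default:
		return KindInvalid
	}
}

func main() {
	// Extension type OIDs below are made up for the example.
	custom := map[uint32]string{16390: "geometry", 16999: "geography"}
	fmt.Println(kindForOID(600, custom))   // built-in point
	fmt.Println(kindForOID(16390, custom)) // PostGIS geometry, e.g. POINT(1 2)
	fmt.Println(kindForOID(42, custom))    // unknown type
}
```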

@@ -395,7 +413,7 @@ func (qe *QRepQueryExecutor) ExecuteAndProcessQueryStream(
return totalRecordsFetched, nil
}

-func mapRowToQRecord(row pgx.Rows, fds []pgconn.FieldDescription) (*model.QRecord, error) {
+func mapRowToQRecord(row pgx.Rows, fds []pgconn.FieldDescription, customTypeMap map[uint32]string) (*model.QRecord, error) {

🚫 [golangci] reported by reviewdog 🐶
line is 124 characters (lll)

case "NUMBER":
	transformations = append(transformations, fmt.Sprintf("$1:\"%s\" AS \"%s\"", colName, colName))
default:
	transformations = append(transformations, fmt.Sprintf("($1:\"%s\")::%s AS \"%s\"", colName, colType, colName))

🚫 [golangci] reported by reviewdog 🐶
line is 122 characters (lll)
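
One possible way to satisfy the `lll` check is to break the `append` call after its first argument. The sketch below reuses the two format strings from the diff; the `column` type and `buildTransformations` function are hypothetical wrappers added so the example runs on its own.

```go
package main

import "fmt"

// column is a hypothetical pairing of a column name with its
// destination type, used only to make this sketch self-contained.
type column struct {
	Name string
	Type string
}

// buildTransformations mirrors the switch in the diff, with each long
// append call wrapped to stay under the line-length limit.
func buildTransformations(cols []column) []string {
	transformations := make([]string, 0, len(cols))
	for _, c := range cols {
		switch c.Type {
		case "NUMBER":
			transformations = append(transformations,
				fmt.Sprintf("$1:\"%s\" AS \"%s\"", c.Name, c.Name))
		default:
			transformations = append(transformations,
				fmt.Sprintf("($1:\"%s\")::%s AS \"%s\"", c.Name, c.Type, c.Name))
		}
	}
	return transformations
}

func main() {
	cols := []column{{Name: "id", Type: "NUMBER"}, {Name: "name", Type: "VARCHAR"}}
	for _, t := range buildTransformations(cols) {
		fmt.Println(t)
	}
}
```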

return qvalue.QValueKindInvalid


🚫 [golangci] reported by reviewdog 🐶
unnecessary trailing newline (whitespace)

	Kind:  customTypeToQKind(typeName),
	Value: values[i],
}
record.Set(i, *&customTypeVal)

🚫 [golangci] reported by reviewdog 🐶
SA4001: *&x will be simplified to x. It will not copy x. (staticcheck)
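
A small illustration of what the check is pointing out: taking the address of a value and immediately dereferencing it is the same expression as the value itself, and passing or assigning a struct by value already copies it. The `value` type here is a hypothetical stand-in for the struct in the diff.

```go
package main

import "fmt"

// value is a hypothetical stand-in for the struct passed to record.Set.
type value struct {
	Kind string
	Val  string
}

func main() {
	v := value{Kind: "geometry", Val: "POINT(1 2)"}

	a := *&v // what staticcheck SA4001 flags; equivalent to plain v
	b := v   // a plain assignment already copies the struct

	// Mutating the copy leaves the original untouched.
	b.Kind = "geography"
	fmt.Println(a == v, v.Kind, b.Kind)
}
```

Under that reading, the flagged call could simply pass `customTypeVal` directly.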

Amogh-Bharadwaj force-pushed the postgis-support branch 2 times, most recently from 92f36f6 to 9618cb8, on October 16, 2023 14:46
-'{"key": "value"}', 15
+'{"key": "value"}', 15,
+'POINT(1 2)','POINT(40.7128 -74.0060)',
+'LINESTRING(0 0, 1 1, 2 2)','LINESTRING(-74.0060 40.7128, -73.9352 40.7306, -73.9123 40.7831)',

🚫 [golangci] reported by reviewdog 🐶
line is 123 characters (lll)

'{"key": "value"}', 15,
'POINT(1 2)','POINT(40.7128 -74.0060)',
'LINESTRING(0 0, 1 1, 2 2)','LINESTRING(-74.0060 40.7128, -73.9352 40.7306, -73.9123 40.7831)',
'POLYGON((0 0, 0 1, 1 1, 1 0, 0 0))','POLYGON((-74.0060 40.7128, -73.9352 40.7306, -73.9123 40.7831, -74.0060 40.7128))'

🚫 [golangci] reported by reviewdog 🐶
line is 148 characters (lll)

Amogh-Bharadwaj merged commit 5fb024f into main on Oct 16, 2023
12 checks passed
Amogh-Bharadwaj added a commit that referenced this pull request Jan 2, 2024
Similar to #516 

Leverages https://cloud.google.com/bigquery/docs/geospatial-data to
sync Postgres' PostGIS types to BigQuery's GEOGRAPHY data type, for
QRep, Initial Load and CDC.
`qrep_avro_sync` now has a function that transforms data on the staging
table before it is copied to the destination. This was needed here and
makes it easier to support more data types in QRep for BQ.

Tests added for QRep and CDC
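
As a rough illustration of the staging-table transformation described in that commit message, the sketch below builds the SELECT expressions for a copy from staging to destination and wraps GEOGRAPHY columns in BigQuery's `ST_GEOGFROMTEXT`. The `column` type, the `stagingSelectExprs` function, and the choice of `ST_GEOGFROMTEXT` are assumptions; the referenced commit's actual code is not shown here.

```go
package main

import (
	"fmt"
	"strings"
)

// column is a hypothetical pairing of a column name with its BigQuery type.
type column struct {
	Name string
	Type string
}

// stagingSelectExprs returns one SELECT expression per column. GEOGRAPHY
// columns are assumed to hold WKT text in the staging table and are parsed
// with ST_GEOGFROMTEXT; every other column is copied through unchanged.
func stagingSelectExprs(cols []column) []string {
	exprs := make([]string, 0, len(cols))
	for _, c := range cols {
		if c.Type == "GEOGRAPHY" {
			exprs = append(exprs,
				fmt.Sprintf("ST_GEOGFROMTEXT(`%s`) AS `%s`", c.Name, c.Name))
			continue
		}
		exprs = append(exprs, fmt.Sprintf("`%s`", c.Name))
	}
	return exprs
}

func main() {
	cols := []column{{Name: "id", Type: "INT64"}, {Name: "location", Type: "GEOGRAPHY"}}
	// The table name is a placeholder for the example.
	fmt.Printf("SELECT %s FROM staging_table\n",
		strings.Join(stagingSelectExprs(cols), ", "))
}
```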
serprex deleted the postgis-support branch on July 19, 2024 15:19