NEOS-1692:add in transform uuid transformer #3079

Merged
merged 10 commits into from
Jan 6, 2025

Conversation

evisdrenova
Contributor

@evisdrenova evisdrenova commented Dec 23, 2024

This PR adds a Transform UUID transformer. By default, it generates new v4 UUIDs. You can pass in a seed value to derive the output UUID deterministically from the input UUID. This is useful if you consider UUIDs to be sensitive data but want them to stay consistent across tables and databases: given the same seed value, an input UUID always generates the same output UUID.
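A minimal sketch of the two modes described above, using only the Go standard library so it runs without dependencies. The `UUID` type and the `randomV4` and `seededV5` functions are hypothetical names, not the transformer's actual code. The seeded path follows the RFC 4122 version 5 construction (SHA-1 over a namespace followed by a name); how the seed is turned into name bytes here is an assumption made for illustration.

```go
package main

import (
	"crypto/rand"
	"crypto/sha1"
	"encoding/binary"
	"fmt"
	mathrand "math/rand"
)

// UUID is a hypothetical local stand-in for a 16-byte UUID type.
type UUID [16]byte

// String renders the UUID in the usual 8-4-4-4-12 hex layout.
func (u UUID) String() string {
	return fmt.Sprintf("%x-%x-%x-%x-%x", u[0:4], u[4:6], u[6:8], u[8:10], u[10:16])
}

// randomV4 returns a fresh random (version 4) UUID: the default mode.
func randomV4() (UUID, error) {
	var u UUID
	if _, err := rand.Read(u[:]); err != nil {
		return u, err
	}
	u[6] = (u[6] & 0x0f) | 0x40 // version 4
	u[8] = (u[8] & 0x3f) | 0x80 // RFC 4122 variant
	return u, nil
}

// seededV5 derives a name-based (version 5) UUID from the input UUID and a
// seed. The same (input, seed) pair always yields the same output.
func seededV5(input UUID, seed int64) UUID {
	// Assumption: one value is drawn from a generator seeded with `seed`
	// and encoded as 8 little-endian bytes to serve as the name.
	r := mathrand.New(mathrand.NewSource(seed))
	var name [8]byte
	binary.LittleEndian.PutUint64(name[:], uint64(r.Int63()))

	h := sha1.New()
	h.Write(input[:]) // the input UUID acts as the namespace
	h.Write(name[:])

	var out UUID
	copy(out[:], h.Sum(nil))        // keep the first 16 bytes of the digest
	out[6] = (out[6] & 0x0f) | 0x50 // version 5
	out[8] = (out[8] & 0x3f) | 0x80 // RFC 4122 variant
	return out
}

func main() {
	in := UUID{0x6b, 0xa7, 0xb8, 0x10, 0x9d, 0xad, 0x11, 0xd1,
		0x80, 0xb4, 0x00, 0xc0, 0x4f, 0xd4, 0x30, 0xc8}

	fresh, err := randomV4()
	if err != nil {
		panic(err)
	}
	fmt.Println("default (random v4):", fresh)
	fmt.Println("seeded, first call: ", seededV5(in, 42))
	fmt.Println("seeded, second call:", seededV5(in, 42))
}
```

The two seeded lines print the same value, which is the property that keeps a UUID consistent wherever it appears across tables.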

Demo:
https://www.loom.com/share/cc48d55850e246a488f90976b90e2b3f?sid=965d7b94-31fd-412b-93de-d90db181bde5

@evisdrenova evisdrenova added the Feature Created by Linear-GitHub Sync label Dec 23, 2024

linear bot commented Dec 23, 2024


vercel bot commented Dec 23, 2024

The latest updates on your projects.

Name | Status | Updated (UTC)
neosync-docs | ✅ Ready | Jan 6, 2025 7:47pm


github-actions bot commented Dec 23, 2024

The latest Buf updates on your PR. Results from workflow Buf / buf (pull_request).

Build | Format | Lint | Breaking | Updated (UTC)
✅ passed | ✅ passed | ✅ passed | ✅ passed | Jan 6, 2025, 7:45 PM


codecov bot commented Dec 23, 2024

Codecov Report

Attention: Patch coverage is 38.46154% with 88 lines in your changes missing coverage. Please review.

Project coverage is 31.90%. Comparing base (5723274) to head (cdd33bd).
Report is 5 commits behind head on main.

Files with missing lines | Patch % | Lines
...ker/pkg/benthos/transformers/gen_transform_uuid.go | 16.27% | 35 Missing and 1 partial ⚠️
worker/pkg/benthos/transformers/transform_uuid.go | 60.00% | 17 Missing and 9 partials ⚠️
backend/sql/postgresql/models/transformers.go | 0.00% | 8 Missing ⚠️
worker/pkg/rng/rng.go | 0.00% | 8 Missing ⚠️
...nal/benthos/benthos-builder/builders/processors.go | 0.00% | 6 Missing ⚠️
...kg/benthos/transformers/transformer_initializer.go | 75.00% | 2 Missing and 1 partial ⚠️
...g/benthos/transformers/gen_neosync_transformers.go | 0.00% | 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3079      +/-   ##
==========================================
+ Coverage   31.88%   31.90%   +0.01%     
==========================================
  Files         358      360       +2     
  Lines       41672    41815     +143     
==========================================
+ Hits        13288    13340      +52     
- Misses      26839    26918      +79     
- Partials     1545     1557      +12     

Member

@nickzelei nickzelei left a comment

Left a few comments regarding the algorithm for transforming the uuids

binary.LittleEndian.PutUint64(seedBytes[8:], uint64(randomInt))

// Create a new UUID using SHA1 namespace
output := uuid.NewSHA1(uuid.Nil, seedBytes).String()
Member

Do we actually want to put all of the generated uuids in the same Nil namespace?
The original design we came up with puts the new uuids inside of the original uuid's namespace.
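To make the namespace question concrete, here is a minimal sketch comparing the two constructions. It uses only the standard library; `newSHA1` is a hypothetical local helper that follows the RFC 4122 version 5 construction (SHA-1 over the namespace bytes followed by the data), and the fixed value `7` stands in for whatever the randomizer produces. One observable difference under these assumptions: when only the first 8 bytes of the input are hashed under the Nil namespace, two inputs that share those 8 bytes map to the same output.

```go
package main

import (
	"crypto/sha1"
	"encoding/binary"
	"fmt"
)

// UUID is a hypothetical local stand-in for a 16-byte UUID type.
type UUID [16]byte

// newSHA1 builds a version 5 UUID: SHA-1 over namespace || data, truncated
// to 16 bytes, with the version and variant bits set.
func newSHA1(space UUID, data []byte) UUID {
	h := sha1.New()
	h.Write(space[:])
	h.Write(data)
	var out UUID
	copy(out[:], h.Sum(nil))
	out[6] = (out[6] & 0x0f) | 0x50 // version 5
	out[8] = (out[8] & 0x3f) | 0x80 // RFC 4122 variant
	return out
}

// nilNamespace follows the shape of the snippet under review: the first
// 8 bytes of the input plus 8 bytes of an encoded value, hashed under the
// all-zero (Nil) namespace.
func nilNamespace(input UUID, v uint64) UUID {
	seedBytes := make([]byte, 16)
	copy(seedBytes, input[:8])
	binary.LittleEndian.PutUint64(seedBytes[8:], v)
	return newSHA1(UUID{}, seedBytes)
}

// inputNamespace hashes the encoded value inside the input UUID's own
// namespace, so all 16 input bytes influence the result.
func inputNamespace(input UUID, v uint64) UUID {
	data := make([]byte, 8)
	binary.LittleEndian.PutUint64(data, v)
	return newSHA1(input, data)
}

func main() {
	// Two inputs that differ only after the first 8 bytes.
	a := UUID{1, 2, 3, 4, 5, 6, 7, 8, 0xaa}
	b := UUID{1, 2, 3, 4, 5, 6, 7, 8, 0xbb}

	fmt.Println("Nil namespace collides:  ", nilNamespace(a, 7) == nilNamespace(b, 7))
	fmt.Println("input namespace collides:", inputNamespace(a, 7) == inputNamespace(b, 7))
}
```

The first comparison is true because both calls hash identical bytes; the second is false because the full input UUID is part of the hashed data.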

Contributor Author

updated

Comment on lines 95 to 101
seedBytes := make([]byte, 16)

// Use the first 8 bytes from the input UUID
copy(seedBytes, inputUuid[:8])

randomInt := randomizer.Float64()
binary.LittleEndian.PutUint64(seedBytes[8:], uint64(randomInt))
Member

Curious as to why we are going through this whole song and dance to generate the input data when NewSHA1 accepts an arbitrary amount of data.

In other words, why not just uuid.NewSHA1(inputUuid, []byte(randomizer.Float64()))

Contributor Author

@evisdrenova evisdrenova Jan 2, 2025

NewSHA1() takes a byte slice as its second argument, and you can't turn a float64 into a byte slice by wrapping it in []byte(). You have to encode it in order to convert it. That said, we can simplify it.
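A small sketch of the encoding point, standard library only; the function names are illustrative. It also shows a pitfall with the `uint64(randomInt)` conversion in the snippet under review: converting a float64 to an integer type in Go discards the fractional part, so if `randomizer.Float64()` returns values in [0.0, 1.0) the way `math/rand`'s `Float64` does (an assumption about the randomizer), the converted value is always 0. `math.Float64bits` keeps the full bit pattern instead.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math"
)

// truncatedBytes encodes f after a plain uint64 conversion, which drops
// the fractional part of the float.
func truncatedBytes(f float64) [8]byte {
	var b [8]byte
	binary.LittleEndian.PutUint64(b[:], uint64(f))
	return b
}

// bitPatternBytes encodes the IEEE 754 bit pattern of f, so distinct
// floats produce distinct bytes.
func bitPatternBytes(f float64) [8]byte {
	var b [8]byte
	binary.LittleEndian.PutUint64(b[:], math.Float64bits(f))
	return b
}

func main() {
	for _, f := range []float64{0.25, 0.75} {
		fmt.Printf("f=%v truncated=%x bits=%x\n", f, truncatedBytes(f), bitPatternBytes(f))
	}
}
```

Both inputs yield all-zero bytes under truncation, while the bit-pattern encoding distinguishes them.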

worker/pkg/benthos/transformers/transform_uuid_test.go (outdated, resolved)
@evisdrenova evisdrenova requested a review from nickzelei January 2, 2025 21:19
// Create a new UUID using SHA1 namespace
output := uuid.NewSHA1(uuid.Nil, seedBytes).String()
bytes := make([]byte, 16)
binary.LittleEndian.PutUint64(bytes, uint64(randomInt))
Member

@nickzelei nickzelei Jan 3, 2025

If you go into the randomizer you can expose the Int63 or Int() function, then the Float64 can just go away entirely.

Then it can become uuid.NewSHA1(inputUuid, []byte(randomizer.Int63()))

@nickzelei nickzelei added the enhancement New feature or request label Jan 6, 2025
@evisdrenova evisdrenova merged commit 0ef11ea into main Jan 6, 2025
19 checks passed
@evisdrenova evisdrenova deleted the transformUuid branch January 6, 2025 20:04