Table Generation Guaranteed Values #30

Open
stanbrub opened this issue Feb 8, 2023 · 0 comments
Labels
enhancement New feature or request

stanbrub (Collaborator) commented Feb 8, 2023

During table generation for benchmark tests, random values are often used as a quick way to produce a non-sequential distribution of data. The problem is that some tests look up specific generated values, which may or may not be present in the generated table. Consider the following snippet:

result = source.partition_by(['column3']).get_constituent(['random1'])

"random1" is a randomly generated value for the "column3" column. Depending on the scale selected, there is no guarantee that any row will actually contain "random1" in "column3".
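
To get a feel for how likely the miss is, here is a rough sketch. It assumes each row draws its value uniformly and independently from a fixed set of distinct values, which may not match the actual generator; `miss_probability` is a hypothetical helper, not part of the benchmark code.

```python
def miss_probability(row_count: int, distinct_values: int) -> float:
    """Chance that one specific value never appears in the column.

    Assumes row_count independent, uniform draws from distinct_values
    possible values (an assumption, not the generator's documented behavior).
    """
    if distinct_values < 1:
        raise ValueError("distinct_values must be at least 1")
    # Each draw misses the target with probability (1 - 1/k); all n draws must miss.
    return (1.0 - 1.0 / distinct_values) ** row_count


if __name__ == "__main__":
    # Compare a small scale against a larger one for the same value range.
    for rows in (100, 1_000, 10_000):
        print(rows, miss_probability(rows, 100))
```

Under these assumptions the miss chance shrinks quickly as the row count grows relative to the range, which is why the problem shows up mainly at small scales.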

Possible solutions:

  • Replace random() in table generation with a generator that always emits the first value in the defined range, then produces random values from then on
  • Don't use random values for columns at all. Generate incremental data with overlapping ranges (e.g. col1=[1-100], col2=[1-101]), then shuffle the rows
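
Both options could look roughly like the following sketch. It uses plain Python lists rather than the benchmark's table generator, and the names `guaranteed_random`, `incremental_cycle`, and `generate_rows` are hypothetical. The second function is one reading of "incremental data with overlapping ranges": each column cycles through its own range, and the rows are shuffled afterwards.

```python
import random


def guaranteed_random(low, high, count, seed=None):
    """Option 1: random ints in [low, high], but the first value is always `low`."""
    if count <= 0:
        return []
    rng = random.Random(seed)
    # Inject the first value of the range, then draw randomly for the rest.
    return [low] + [rng.randint(low, high) for _ in range(count - 1)]


def incremental_cycle(low, high, count):
    """Option 2 building block: low, low+1, ..., high, low, ... for `count` rows."""
    span = high - low + 1
    return [low + i % span for i in range(count)]


def generate_rows(count, seed=None):
    """Option 2: incremental columns over overlapping ranges, then shuffled rows."""
    col1 = incremental_cycle(1, 100, count)
    col2 = incremental_cycle(1, 101, count)
    rows = list(zip(col1, col2))
    # Shuffle whole rows so the data is non-sequential but the contents are known.
    random.Random(seed).shuffle(rows)
    return rows
```

With option 1, a test can rely on the first value of the range being present at any scale. With option 2, every value in a column's range is present once the row count reaches the size of that range.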
stanbrub added the enhancement label Feb 8, 2023