Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(root): introduce schema-seed #376

Merged

Conversation

dkropachev
Copy link
Collaborator

@dkropachev dkropachev commented Jul 2, 2023

Statement data and schema seed are the same at the moment.
We need to split them in order to be able to run different data flow on the same schema

@dkropachev dkropachev linked an issue Jul 2, 2023 that may be closed by this pull request
@dkropachev dkropachev force-pushed the dk/370-gemini-generates-mutations-non-deterministically branch from 74d1bb3 to b7d3e79 Compare July 2, 2023 02:55
@dkropachev dkropachev requested review from vponomaryov and nuivall July 2, 2023 03:01
@dkropachev dkropachev force-pushed the dk/370-gemini-generates-mutations-non-deterministically branch from b7d3e79 to d9160c1 Compare July 2, 2023 13:01
@nuivall
Copy link
Member

nuivall commented Jul 4, 2023

I don't immediately see why splitting is a must. Could you elaborate?

@dkropachev
Copy link
Collaborator Author

dkropachev commented Jul 4, 2023

I don't immediately see why splitting is a must. Could you elaborate?

Say you want to change payload, but keep the schema, in current implementation you have to provide a schema file.
Better solution would be to split schema seed from payload seed, which allow you to pick schema by schema-seed and pick payload by payload-seed.

As of now we use global Rand instance from rnd to generate it, changing it's seed, which is not straightforward and could be unexpectedly accidentially poisoned.
This PR changes schema generation code to use local instance of rnd.Rand and intrduce --schema-seed command line option.

@nuivall
Copy link
Member

nuivall commented Jul 4, 2023

If there is an option to provide schema file then I guess it will be different than one generated by seed, so no need to be compatible here.

I think the problem is subtle, and solving it needs to take concurrency into consideration

@dkropachev
Copy link
Collaborator Author

dkropachev commented Jul 4, 2023

If there is an option to provide schema file then I guess it will be different than one generated by seed, so no need to be compatible here.

Not exactly, when gemini starts it outputs schema it generated, you can take it as it is or updated and feed back to gemini, it allows you to have either exact same schema or something different.
Not quite get compatibility issue, could you please elaborate?

I think the problem is subtle, and solving it needs to take concurrency into consideration

Agree, it is not major issue.
There is no concurency in the schema generation.

@nuivall
Copy link
Member

nuivall commented Jul 4, 2023

You can but why would you do it? If you want to achieve the same behaviour you need to provide the same seed. In such case you don't need to provide any file because generated schema will be the same anyway.

@dkropachev
Copy link
Collaborator Author

dkropachev commented Jul 4, 2023

You can but why would you do it? If you want to achieve the same behaviour you need to provide the same seed. In such case you don't need to provide any file because generated schema will be the same anyway.

Ahh, 100% agree, but there is one problem - if we change way schema is generated, change order of invoking rnd.Rand methods or add additional things there, then same schema seed will give different schema, while schema file gives you persistent result.

@nuivall
Copy link
Member

nuivall commented Jul 4, 2023

Such changes are breaking and need to be announced (once generation is stable, now it doesn't matter much). And you won't achieve the same result if you've changed something the requests likely also will be different.

Moreover this can be alleviated by pinning version of gemini to test run, and it would be a solution covering more cases.

@dkropachev
Copy link
Collaborator Author

dkropachev commented Jul 4, 2023

Such changes are breaking and need to be announced (once generation is stable, now it doesn't matter much). And you won't achieve the same result if you've changed something the requests likely also will be different.

True, but it does not relate to schema, queries is another topic.

Moreover this can be alleviated by pinning version of gemini to test run, and it would be a solution covering more cases.

Agree.
schema file is option to have persistent schema from version to version
schema seed is option to have persistent schema, but it's persistence could be violated when you move from version to version

@dkropachev dkropachev force-pushed the dk/370-gemini-generates-mutations-non-deterministically branch from d9160c1 to 46ef027 Compare July 7, 2023 10:18
@dkropachev dkropachev requested a review from roydahan July 7, 2023 10:19
@dkropachev dkropachev force-pushed the dk/370-gemini-generates-mutations-non-deterministically branch 2 times, most recently from 9d5b1b7 to 674400b Compare July 7, 2023 10:32
@dkropachev
Copy link
Collaborator Author

@nuivall, are you ok with merging it ?

@dkropachev dkropachev force-pushed the dk/370-gemini-generates-mutations-non-deterministically branch from 674400b to 6d737e4 Compare July 7, 2023 12:48
@dkropachev dkropachev force-pushed the dk/370-gemini-generates-mutations-non-deterministically branch from 6d737e4 to ab185de Compare July 8, 2023 00:13
@dkropachev dkropachev merged commit c819a9a into master Jul 13, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Gemini generates mutations non-deterministically
2 participants