Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use configuration file to set database collation. #235

Merged
merged 2 commits into from
Sep 25, 2023

Conversation

rmar3a
Copy link
Contributor

@rmar3a rmar3a commented Sep 25, 2023

Looking at how we pass around the values in the configuration file, there's probably a cleaner way to do it if we need to add/remove the list of options in the configuration file. However, this is probably good enough for now.

In addition, it looks like the collation might have been causing the non-deterministic results we were seeing in the build. Making sure the collation is consistent for all the databases (input and output) seems to make the output in the build consistent now. Guess we will have a better idea after more builds.

Update: Looks like the non-deterministic issue is still there and seems to have a different set of possible outcomes (each collation seems to have its own distinct set of possible non-deterministic outcomes) 😢

The issue is with the order of the numbers for this "table" in the BIF file that gets generated:

<DEFINITION>
	<FOR>capability(prof0,student0)</FOR>
	<GIVEN>RA(prof0,student0)</GIVEN>
	<GIVEN>salary(prof0,student0)</GIVEN>
	<TABLE> 0.166667 0.166667 0.166667 0.166667 0.166665 0.166667 0.166667 0.166667 0.166667 0.166667 0.166665 0.166667 0.166667 0.166667 0.166667 0.166667 0.166665 0.166667 0.000491 0.000491 0.000491 0.000491 0.000491 0.997545 0.008621 0.008621 0.181034 0.439655 0.353448 0.008621 0.375 0.375 0.196429 0.017857 0.017857 0.017857 0.322917 0.21875 0.427082 0.010417 0.010417 0.010417 0.166667 0.166667 0.166667 0.166667 0.166665 0.166667</TABLE>
</DEFINITION>

Not sure if the order matters, will need to look into what the values in this table are exactly (some sort of conditional probabilities?).

@rmar3a rmar3a force-pushed the collation-configuration branch 8 times, most recently from de90a4c to 4816895 Compare September 25, 2023 05:06
- The executable comments found in the SQL files were being removed due to the
  logic in the prepareFile() method.
- Updated areas in the codebase that set the collation to set the collation via
  a value drawn from the configuration file passed to FactorBase.
@rmar3a rmar3a force-pushed the collation-configuration branch from 4816895 to 444eb3d Compare September 25, 2023 05:09
@rmar3a rmar3a requested a review from oschulte September 25, 2023 05:12
@oschulte
Copy link
Collaborator

yes they are conditional probabilities. In this case P(capability|RA, salary). That's too bad about the non-determinism. we should discuss this with Mingyi too.

@oschulte oschulte merged commit c63bbf6 into master Sep 25, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants