Commit

Yield the same order regardless of order in the file and database
ViacheslavP committed Nov 28, 2024
1 parent b605610 commit f8c2ac9
Showing 1 changed file with 4 additions and 0 deletions.
4 changes: 4 additions & 0 deletions soda/scientific/soda/scientific/distribution/comparison.py
@@ -277,6 +277,10 @@ def evaluate(self) -> dict[str, float]:
ref_data_frequencies = ref_data_frequencies + 1
test_data_frequencies = test_data_frequencies + 1

+ # sort the data to make sure that the categories are in the same order before cast to array_like
+ ref_data_frequencies = ref_data_frequencies.sort_index()
+ test_data_frequencies = test_data_frequencies.sort_index()
+
# Normalise because scipy wants sums of observed and reference counts to be equal
# workaround found and discussed in: https://github.com/UDST/synthpop/issues/75#issuecomment-907137304
stat_value, p_value = chisquare(
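
A minimal sketch of why the added `sort_index()` calls matter, assuming the frequencies are pandas Series indexed by category (the category names, counts, and the `chi2_stat` helper below are hypothetical, not taken from the repository). `scipy.stats.chisquare` compares its two arrays position by position, so two Series holding the same categories in a different row order produce a different statistic unless both are sorted first:

```python
import pandas as pd
from scipy.stats import chisquare

# Hypothetical counts: same categories, but the rows arrive in a different order,
# as can happen when one side comes from a file and the other from a database.
ref = pd.Series({"a": 50, "b": 30, "c": 20})
test = pd.Series({"c": 22, "a": 48, "b": 30})


def chi2_stat(observed: pd.Series, expected: pd.Series) -> float:
    """Chi-square statistic on the raw positional values of two Series."""
    obs = observed.to_numpy(dtype=float)
    exp = expected.to_numpy(dtype=float)
    # scipy requires the observed and expected sums to match, so rescale expected.
    exp = exp * obs.sum() / exp.sum()
    stat, _ = chisquare(obs, exp)
    return float(stat)


# Without sorting, "c" is compared against "a", "a" against "b", and so on.
unsorted_stat = chi2_stat(test, ref)

# With sort_index() on both sides, each category lines up with itself.
sorted_stat = chi2_stat(test.sort_index(), ref.sort_index())

# Any other row order gives the same result once both sides are sorted.
reordered = test[["b", "c", "a"]]
reordered_stat = chi2_stat(reordered.sort_index(), ref.sort_index())

print(f"unsorted: {unsorted_stat:.2f}, sorted: {sorted_stat:.2f}")
```

Sorting both Series by index is enough here only because they share the same set of categories; how the real code guarantees that is outside this hunk.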
