Unable to observe skew data, what am i doing wrong ? #3

Open
archit-rastogi opened this issue Feb 7, 2025 · 0 comments

archit-rastogi commented Feb 7, 2025

Steps:

mkdir -p data && cd data && ../dbgen -f -v -k -s 1 -b ../dists.dss
# load data into all tables
# using copy <table_name> from syntax

But when I inspect, for example, the customer-nation join, I observe a nearly uniform distribution of customers across nations.
I expect that in each of the 5 regions, one nation holds about 18% of all customers while each of the remaining 4 nations gets about 0.4%.

fyi:

select n_nationkey, n_regionkey, count(c_custkey) from customer join nation on nation.n_nationkey = customer.c_nationkey group by n_regionkey, n_nationkey order by n_regionkey;
 n_nationkey | n_regionkey | count 
-------------+-------------+-------
           5 |           0 |  5952
          16 |           0 |  5974
          15 |           0 |  5921
          14 |           0 |  5992
           0 |           0 |  5925
          24 |           1 |  5983
           1 |           1 |  5975
           2 |           1 |  5999
           3 |           1 |  6020
          17 |           1 |  5975
          21 |           2 |  6008
           9 |           2 |  6161
          12 |           2 |  5948
          18 |           2 |  6024
           8 |           2 |  6042
          19 |           3 |  6100
          22 |           3 |  6078
          23 |           3 |  6011
           7 |           3 |  5908
           6 |           3 |  6100
           4 |           4 |  5995
          11 |           4 |  5963
          13 |           4 |  6033
          10 |           4 |  6009
          20 |           4 |  5904
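
To put numbers on the gap between what I expect and what I see, here is a minimal Python sketch that converts the counts above into percentages of all customers. The counts are copied from the query output; the variable names are only illustrative.

```python
# Customer counts per nation, copied from the query output above,
# keyed by (n_regionkey, n_nationkey).
counts = {
    (0, 5): 5952, (0, 16): 5974, (0, 15): 5921, (0, 14): 5992, (0, 0): 5925,
    (1, 24): 5983, (1, 1): 5975, (1, 2): 5999, (1, 3): 6020, (1, 17): 5975,
    (2, 21): 6008, (2, 9): 6161, (2, 12): 5948, (2, 18): 6024, (2, 8): 6042,
    (3, 19): 6100, (3, 22): 6078, (3, 23): 6011, (3, 7): 5908, (3, 6): 6100,
    (4, 4): 5995, (4, 11): 5963, (4, 13): 6033, (4, 10): 6009, (4, 20): 5904,
}

# Share of all customers held by each nation, in percent.
total = sum(counts.values())
shares = {key: 100.0 * n / total for key, n in counts.items()}

for (region, nation), pct in sorted(shares.items()):
    print(f"region {region} nation {nation:2d}: {pct:.2f}%")

print(f"total={total} min={min(shares.values()):.2f}% max={max(shares.values()):.2f}%")
```

The total is 150000 customers, and every nation lands between 3.94% and 4.11%, which is what a uniform 1/25 split would give, not the 18% / 0.4% pattern I expected.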

It would be great if you could point out what I am missing.
