** The CUDA versions of the algorithms are in the cuda folder.
- cd into the cuda folder
- Run make build. This generates the "main_gpu" binary
- Run ./main_gpu power_of_10 func_name file1 file2 operator numberOfThread numberOfPartition (numberOfThread is redundant for the GPU version but must still be supplied)
** Make sure file1 and file2 exist
Available func_name: partitionedSortParallel, sortParallel (un-partitioned), partitionedHashParallel
Build process: Run make (generates the "main" executable). Run make build_generator to build the data generator. Make sure file1 and file2 exist before execution.
Run ./main power_of_10 func_name file1 file2 operator numberOfThread numberOfPartition
Example: ./main 7 partitionedSortParallel file1 file2 = numberOfThread numberOfPartition
- The command above reads 10^7 elements from file1 and file2 and joins them using the parallel partitioned sort algorithm
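As a rough illustration of what a parallel partitioned sort join does, here is a minimal Python sketch. It is not the repository's implementation: the function names, the modulo partitioning scheme, and the use of a thread pool are all assumptions made for the example. Both inputs are split into numberOfPartition buckets by key, and each pair of matching buckets is then sort-merge joined by a pool of numberOfThread workers.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(keys, num_partitions):
    """Split keys into buckets by key modulo num_partitions (assumed scheme)."""
    buckets = [[] for _ in range(num_partitions)]
    for k in keys:
        buckets[k % num_partitions].append(k)
    return buckets


def sort_merge_join(left, right):
    """Equijoin two key lists by sorting both and merging them."""
    left, right = sorted(left), sorted(right)
    i = j = 0
    out = []
    while i < len(left) and j < len(right):
        if left[i] < right[j]:
            i += 1
        elif left[i] > right[j]:
            j += 1
        else:
            # Find the run of equal keys on each side and emit every pairing.
            key = left[i]
            i_end = i
            while i_end < len(left) and left[i_end] == key:
                i_end += 1
            j_end = j
            while j_end < len(right) and right[j_end] == key:
                j_end += 1
            out.extend((key, key) for _ in range((i_end - i) * (j_end - j)))
            i, j = i_end, j_end
    return out


def partitioned_sort_join(r, s, num_threads, num_partitions):
    """Partition both inputs, then join matching partitions in parallel."""
    r_parts = partition(r, num_partitions)
    s_parts = partition(s, num_partitions)
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        results = pool.map(sort_merge_join, r_parts, s_parts)
    return [pair for part in results for pair in part]
```

Equal keys always land in the same partition, so each partition pair can be joined independently of the others. In CPython the thread pool only shows the structure; CPU-bound threads do not run in parallel there.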
Available operator: "=". Only equijoin is supported, but this can be extended to the > and < operators.
Available func_name: basicNestedLoop, blockedNestedLoop, partitionedSortSerial, partitionedSortSerialSIMD (with SIMD sort), partitionedSortParallel, partitionedSortParallelSIMD, basicSort, partitionedHashSerial, partitionedHashSerialSIMD (with SIMD probe), basicHash
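To make the note about extending the operator concrete, here is a small Python sketch of a nested loop join that takes the comparison as a parameter. It is only an illustration and is not the repository's basicNestedLoop; the OPERATORS table and the function name are hypothetical.

```python
import operator

# Hypothetical mapping from the command-line operator string to a comparison.
OPERATORS = {"=": operator.eq, "<": operator.lt, ">": operator.gt}


def nested_loop_join(r, s, op="="):
    """Compare every element of r with every element of s using op."""
    compare = OPERATORS[op]
    return [(a, b) for a in r for b in s if compare(a, b)]
```

A nested loop handles any comparison unchanged. Sort-based and hash-based joins rely on equal keys meeting in the same place, so supporting > and < there would likely need more than swapping the comparison.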
To generate file1 and file2, run make build_generator and then ./generate power_of_10
- For example: ./generate 8. This command generates 2 sets of 10^8 unique, random elements and writes them to the files file1 and file2
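For readers who want to produce comparable inputs without the generator, the following Python sketch draws a set of unique random integers and writes it to a file. The value range (ten times the element count), the one-key-per-line text format, and the function names are assumptions for illustration; the format that ./generate actually writes may differ.

```python
import random


def generate_unique(power_of_10, seed=None):
    """Return 10**power_of_10 distinct random integers."""
    n = 10 ** power_of_10
    rng = random.Random(seed)
    # Sampling without replacement guarantees uniqueness; the range is
    # larger than n so the result is not just a permutation of 0..n-1.
    return rng.sample(range(10 * n), n)


def write_keys(path, keys):
    """Write one key per line (assumed file format)."""
    with open(path, "w") as f:
        for k in keys:
            f.write(f"{k}\n")
```

Calling generate_unique twice with different seeds and writing the results to file1 and file2 gives two sets that are each unique internally but may share keys, which is what produces join matches.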
make stat arg=num_element func_name=function_name p=partition_number threads=thread_number
Example: make stat arg=7 func_name=partitionedSortParallel p=1021 threads=4