metricTest and variablesTest fail at setup #3

Open
dieexbr opened this issue Dec 16, 2020 · 2 comments
Labels
enhancement New feature or request

Comments


dieexbr commented Dec 16, 2020

I compiled the HTR code with the following modules and environment variables:

    module load gcc
    module load cuda/10.1
    module load openmpi/2.1.6

    module load python3/3.8.5
    module load python3-as-python

    export CC=gcc
    export CXX=g++
    export CONDUIT=ibv
    export GPU_ARCH=volta

    export USE_CUDA=1
    export USE_OPENMP=1
    export USE_GASNET=1
    export USE_HDF=1
    export MAX_DIM=3

and

    scripts/setup_env.py --llvm-version 60 --terra-url 'https://github.com/mariodirenzo/terra.git' --terra-branch 'luajit2.1'

When I run python3 testAll.py in unitTests, all tests fail unless I add the following to the library path:

    export LD_LIBRARY_PATH="$LEGION_DIR"/bindings/regent:$LD_LIBRARY_PATH
    export LD_LIBRARY_PATH="$LEGION_DIR"/language/hdf/install/lib:$LD_LIBRARY_PATH

With those exports, only metricTest and variablesTest fail at setup. There might be another library missing, but I can't figure out which one it could be. The solver does not run and gives no meaningful error output. The error from metricTest is

    /scratch/n23/db6768/HTR-solver/src/prometeo_metric_ConstPropMix_cpu.o: In function `void Legion::LegionTaskWrapper::legion_task_wrapper<&(void TaskHelper::base_gpu_wrapper<InitializeMetricTask>(Legion::Task const*, std::vector<Legion::PhysicalRegion, std::allocator<Legion::PhysicalRegion> > const&, Legion::Internal::TaskContext*, Legion::Runtime*))>(void const*, unsigned long, void const*, unsigned long, Realm::Processor)':
    prometeo_metric.cc:(.text._ZN6Legion17LegionTaskWrapper19legion_task_wrapperIXadL_ZN10TaskHelper16base_gpu_wrapperI20InitializeMetricTaskEEvPKNS_4TaskERKSt6vectorINS_14PhysicalRegionESaIS9_EEPNS_8Internal11TaskContextEPNS_7RuntimeEEEEEvPKvmSK_mN5Realm9ProcessorE[_ZN6Legion17LegionTaskWrapper19legion_task_wrapperIXadL_ZN10TaskHelper16base_gpu_wrapperI20InitializeMetricTaskEEvPKNS_4TaskERKSt6vectorINS_14PhysicalRegionESaIS9_EEPNS_8Internal11TaskContextEPNS_7RuntimeEEEEEvPKvmSK_mN5Realm9ProcessorE]+0x5a): undefined reference to `InitializeMetricTask::gpu_base_impl(InitializeMetricTask::Args const&, std::vector<Legion::PhysicalRegion, std::allocator<Legion::PhysicalRegion> > const&, std::vector<Legion::Future, std::allocator<Legion::Future> > const&, Legion::Internal::TaskContext*, Legion::Runtime*)'
    /scratch/n23/db6768/HTR-solver/src/prometeo_metric_ConstPropMix_cpu.o: In function `void Legion::LegionTaskWrapper::legion_task_wrapper<&(void TaskHelper::base_gpu_wrapper<CorrectGhostMetricTask<(direction)0> >(Legion::Task const*, std::vector<Legion::PhysicalRegion, std::allocator<Legion::PhysicalRegion> > const&, Legion::Internal::TaskContext*, Legion::Runtime*))>(void const*, unsigned long, void const*, unsigned long, Realm::Processor)':
    prometeo_metric.cc:(.text._ZN6Legion17LegionTaskWrapper19legion_task_wrapperIXadL_ZN10TaskHelper16base_gpu_wrapperI22CorrectGhostMetricTaskIL9direction0EEEEvPKNS_4TaskERKSt6vectorINS_14PhysicalRegionESaISB_EEPNS_8Internal11TaskContextEPNS_7RuntimeEEEEEvPKvmSM_mN5Realm9ProcessorE[_ZN6Legion17LegionTaskWrapper19legion_task_wrapperIXadL_ZN10TaskHelper16base_gpu_wrapperI22CorrectGhostMetricTaskIL9direction0EEEEvPKNS_4TaskERKSt6vectorINS_14PhysicalRegionESaISB_EEPNS_8Internal11TaskContextEPNS_7RuntimeEEEEEvPKvmSM_mN5Realm9ProcessorE]+0x5a): undefined reference to `CorrectGhostMetricTask<(direction)0>::gpu_base_impl(CorrectGhostMetricTask<(direction)0>::Args const&, std::vector<Legion::PhysicalRegion, std::allocator<Legion::PhysicalRegion> > const&, std::vector<Legion::Future, std::allocator<Legion::Future> > const&, Legion::Internal::TaskContext*, Legion::Runtime*)'
    /scratch/n23/db6768/HTR-solver/src/prometeo_metric_ConstPropMix_cpu.o: In function `void Legion::LegionTaskWrapper::legion_task_wrapper<&(void TaskHelper::base_gpu_wrapper<CorrectGhostMetricTask<(direction)1> >(Legion::Task const*, std::vector<Legion::PhysicalRegion, std::allocator<Legion::PhysicalRegion> > const&, Legion::Internal::TaskContext*, Legion::Runtime*))>(void const*, unsigned long, void const*, unsigned long, Realm::Processor)':
    prometeo_metric.cc:(.text._ZN6Legion17LegionTaskWrapper19legion_task_wrapperIXadL_ZN10TaskHelper16base_gpu_wrapperI22CorrectGhostMetricTaskIL9direction1EEEEvPKNS_4TaskERKSt6vectorINS_14PhysicalRegionESaISB_EEPNS_8Internal11TaskContextEPNS_7RuntimeEEEEEvPKvmSM_mN5Realm9ProcessorE[_ZN6Legion17LegionTaskWrapper19legion_task_wrapperIXadL_ZN10TaskHelper16base_gpu_wrapperI22CorrectGhostMetricTaskIL9direction1EEEEvPKNS_4TaskERKSt6vectorINS_14PhysicalRegionESaISB_EEPNS_8Internal11TaskContextEPNS_7RuntimeEEEEEvPKvmSM_mN5Realm9ProcessorE]+0x5a): undefined reference to `CorrectGhostMetricTask<(direction)1>::gpu_base_impl(CorrectGhostMetricTask<(direction)1>::Args const&, std::vector<Legion::PhysicalRegion, std::allocator<Legion::PhysicalRegion> > const&, std::vector<Legion::Future, std::allocator<Legion::Future> > const&, Legion::Internal::TaskContext*, Legion::Runtime*)'
    /scratch/n23/db6768/HTR-solver/src/prometeo_metric_ConstPropMix_cpu.o: In function `void Legion::LegionTaskWrapper::legion_task_wrapper<&(void TaskHelper::base_gpu_wrapper<CorrectGhostMetricTask<(direction)2> >(Legion::Task const*, std::vector<Legion::PhysicalRegion, std::allocator<Legion::PhysicalRegion> > const&, Legion::Internal::TaskContext*, Legion::Runtime*))>(void const*, unsigned long, void const*, unsigned long, Realm::Processor)':
    prometeo_metric.cc:(.text._ZN6Legion17LegionTaskWrapper19legion_task_wrapperIXadL_ZN10TaskHelper16base_gpu_wrapperI22CorrectGhostMetricTaskIL9direction2EEEEvPKNS_4TaskERKSt6vectorINS_14PhysicalRegionESaISB_EEPNS_8Internal11TaskContextEPNS_7RuntimeEEEEEvPKvmSM_mN5Realm9ProcessorE[_ZN6Legion17LegionTaskWrapper19legion_task_wrapperIXadL_ZN10TaskHelper16base_gpu_wrapperI22CorrectGhostMetricTaskIL9direction2EEEEvPKNS_4TaskERKSt6vectorINS_14PhysicalRegionESaISB_EEPNS_8Internal11TaskContextEPNS_7RuntimeEEEEEvPKvmSM_mN5Realm9ProcessorE]+0x5a): undefined reference to `CorrectGhostMetricTask<(direction)2>::gpu_base_impl(CorrectGhostMetricTask<(direction)2>::Args const&, std::vector<Legion::PhysicalRegion, std::allocator<Legion::PhysicalRegion> > const&, std::vector<Legion::Future, std::allocator<Legion::Future> > const&, Legion::Internal::TaskContext*, Legion::Runtime*)'
    collect2: error: ld returned 1 exit status
    make: *** [Makefile:86: operatorsTest_Periodic.exec] Error 1

and the error from variablesTest is

    /scratch/n23/db6768/HTR-solver/src/prometeo_variables_AirMix_cpu.o: In function `void Legion::LegionTaskWrapper::legion_task_wrapper<&(void TaskHelper::base_gpu_wrapper<UpdatePropertiesFromPrimitiveTask>(Legion::Task const*, std::vector<Legion::PhysicalRegion, std::allocator<Legion::PhysicalRegion> > const&, Legion::Internal::TaskContext*, Legion::Runtime*))>(void const*, unsigned long, void const*, unsigned long, Realm::Processor)':
    prometeo_variables.cc:(.text._ZN6Legion17LegionTaskWrapper19legion_task_wrapperIXadL_ZN10TaskHelper16base_gpu_wrapperI33UpdatePropertiesFromPrimitiveTaskEEvPKNS_4TaskERKSt6vectorINS_14PhysicalRegionESaIS9_EEPNS_8Internal11TaskContextEPNS_7RuntimeEEEEEvPKvmSK_mN5Realm9ProcessorE[_ZN6Legion17LegionTaskWrapper19legion_task_wrapperIXadL_ZN10TaskHelper16base_gpu_wrapperI33UpdatePropertiesFromPrimitiveTaskEEvPKNS_4TaskERKSt6vectorINS_14PhysicalRegionESaIS9_EEPNS_8Internal11TaskContextEPNS_7RuntimeEEEEEvPKvmSK_mN5Realm9ProcessorE]+0x5a): undefined reference to `UpdatePropertiesFromPrimitiveTask::gpu_base_impl(UpdatePropertiesFromPrimitiveTask::Args const&, std::vector<Legion::PhysicalRegion, std::allocator<Legion::PhysicalRegion> > const&, std::vector<Legion::Future, std::allocator<Legion::Future> > const&, Legion::Internal::TaskContext*, Legion::Runtime*)'
    /scratch/n23/db6768/HTR-solver/src/prometeo_variables_AirMix_cpu.o: In function `void Legion::LegionTaskWrapper::legion_task_wrapper<&(void TaskHelper::base_gpu_wrapper<GetVelocityGradientsTask>(Legion::Task const*, std::vector<Legion::PhysicalRegion, std::allocator<Legion::PhysicalRegion> > const&, Legion::Internal::TaskContext*, Legion::Runtime*))>(void const*, unsigned long, void const*, unsigned long, Realm::Processor)':
    prometeo_variables.cc:(.text._ZN6Legion17LegionTaskWrapper19legion_task_wrapperIXadL_ZN10TaskHelper16base_gpu_wrapperI24GetVelocityGradientsTaskEEvPKNS_4TaskERKSt6vectorINS_14PhysicalRegionESaIS9_EEPNS_8Internal11TaskContextEPNS_7RuntimeEEEEEvPKvmSK_mN5Realm9ProcessorE[_ZN6Legion17LegionTaskWrapper19legion_task_wrapperIXadL_ZN10TaskHelper16base_gpu_wrapperI24GetVelocityGradientsTaskEEvPKNS_4TaskERKSt6vectorINS_14PhysicalRegionESaIS9_EEPNS_8Internal11TaskContextEPNS_7RuntimeEEEEEvPKvmSK_mN5Realm9ProcessorE]+0x5a): undefined reference to `GetVelocityGradientsTask::gpu_base_impl(GetVelocityGradientsTask::Args const&, std::vector<Legion::PhysicalRegion, std::allocator<Legion::PhysicalRegion> > const&, std::vector<Legion::Future, std::allocator<Legion::Future> > const&, Legion::Internal::TaskContext*, Legion::Runtime*)'
    /scratch/n23/db6768/HTR-solver/src/prometeo_variables_AirMix_cpu.o: In function `void Legion::LegionTaskWrapper::legion_task_wrapper<&(void TaskHelper::base_gpu_wrapper<GetTemperatureGradientTask>(Legion::Task const*, std::vector<Legion::PhysicalRegion, std::allocator<Legion::PhysicalRegion> > const&, Legion::Internal::TaskContext*, Legion::Runtime*))>(void const*, unsigned long, void const*, unsigned long, Realm::Processor)':
    prometeo_variables.cc:(.text._ZN6Legion17LegionTaskWrapper19legion_task_wrapperIXadL_ZN10TaskHelper16base_gpu_wrapperI26GetTemperatureGradientTaskEEvPKNS_4TaskERKSt6vectorINS_14PhysicalRegionESaIS9_EEPNS_8Internal11TaskContextEPNS_7RuntimeEEEEEvPKvmSK_mN5Realm9ProcessorE[_ZN6Legion17LegionTaskWrapper19legion_task_wrapperIXadL_ZN10TaskHelper16base_gpu_wrapperI26GetTemperatureGradientTaskEEvPKNS_4TaskERKSt6vectorINS_14PhysicalRegionESaIS9_EEPNS_8Internal11TaskContextEPNS_7RuntimeEEEEEvPKvmSK_mN5Realm9ProcessorE]+0x5a): undefined reference to `GetTemperatureGradientTask::gpu_base_impl(GetTemperatureGradientTask::Args const&, std::vector<Legion::PhysicalRegion, std::allocator<Legion::PhysicalRegion> > const&, std::vector<Legion::Future, std::allocator<Legion::Future> > const&, Legion::Internal::TaskContext*, Legion::Runtime*)'
    collect2: error: ld returned 1 exit status
    make: *** [Makefile:84: variablesTest.exec] Error 1
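
One way to check whether the CPU object files named in these logs reference GPU implementations is to list their undefined symbols. The following is a minimal sketch, assuming GNU binutils `nm` is on the PATH. The helper name `list_gpu_refs` and the `OBJ` variable are illustrative and not part of HTR, and the default path assumes the command is run from the HTR-solver checkout.

```shell
# Hypothetical helper: print undefined symbols in an object file that
# mention gpu_base_impl (the symbol named in the linker errors above).
list_gpu_refs() {
    obj="$1"
    if [ ! -f "$obj" ]; then
        echo "missing: $obj"
        return 1
    fi
    # -C demangles C++ names; --undefined-only lists unresolved symbols.
    nm -C --undefined-only "$obj" | grep 'gpu_base_impl'
}

# OBJ is an assumed override; the default is relative to the checkout.
list_gpu_refs "${OBJ:-src/prometeo_metric_ConstPropMix_cpu.o}" \
    || echo "no undefined gpu_base_impl references found (or file missing)"
```

If the listing is non-empty, the object was compiled with the GPU wrappers enabled and the link step needs the matching GPU object files, rather than an extra shared library.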
Collaborator

mariodirenzo commented Dec 21, 2020

Thanks for reporting the issue. Unfortunately, neither the unitTests nor the solverTests are ready to be executed on GPUs. I hope to fix this in a future release.
In the meantime, I suggest running the tests with USE_CUDA=0 in your environment.
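
A minimal sketch of that workaround, for reference. `HTR_DIR` is an assumed variable pointing at the HTR-solver checkout and does not appear in the report above. Object files already built with USE_CUDA=1 may need to be rebuilt first, and how to do that depends on the Makefile.

```shell
# Disable the CUDA code path for this shell session.
export USE_CUDA=0

# Assumption: HTR_DIR points at the HTR-solver checkout.
# Run the unit tests only if the directory is actually there.
if [ -n "${HTR_DIR:-}" ] && [ -d "$HTR_DIR/unitTests" ]; then
    (cd "$HTR_DIR/unitTests" && python3 testAll.py)
else
    echo "HTR_DIR is not set or has no unitTests directory; skipping test run"
fi
```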

@mariodirenzo mariodirenzo self-assigned this Dec 21, 2020
@mariodirenzo mariodirenzo added the enhancement New feature or request label Dec 21, 2020
@mariodirenzo
Collaborator

The test suite has been updated in the latest release of the solver. It should now execute correctly on both CPUs and GPUs.
