Tpetra: copy and permute improvements #13714

Draft
wants to merge 2 commits into
base: develop
from

Conversation

tjfulle
Contributor

@tjfulle tjfulle commented Jan 8, 2025

@trilinos/tpetra

Work completed by @skennon10

Motivation

In some simulations, up to 70-80% of the run time can be spent in Tpetra::CrsMatrix::copy_and_permute. This PR reduces that time dramatically.

Supersedes #13682

Stakeholder Feedback

From an affected customer:

Great news! I was able to run our code against Steve’s branch and it looks like everything is working great. I ran two short GPU test problems and the time spent in panzer::AssemblyEngine::evaluate_scatter decreased from 69 seconds to 0.8 seconds in one case and 8.8 seconds to 0.1 seconds in the other. In both cases, evaluate_scatter has gone from 70-80% of the total Jacobian construction cost to now <5%. This appears to fully resolve our Tpetra performance issue -- please let us know when the branch has been merged onto Trilinos develop.
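
For reference, the timings quoted above work out to speedups of roughly 86x and 88x in evaluate_scatter. A minimal sketch of that arithmetic follows. The case labels and the `speedup` helper are illustrative names, not part of the customer's report, and the reported times are rounded, so the factors are approximate.

```python
# Seconds spent in panzer::AssemblyEngine::evaluate_scatter, (before, after),
# as quoted in the stakeholder feedback. The case labels are hypothetical:
# the feedback only describes "two short GPU test problems".
timings = {
    "case_1": (69.0, 0.8),
    "case_2": (8.8, 0.1),
}


def speedup(before: float, after: float) -> float:
    """Return the before/after ratio, i.e. how many times faster the run became."""
    if after <= 0:
        raise ValueError("after must be positive")
    return before / after


speedups = {name: speedup(b, a) for name, (b, a) in timings.items()}

for name, factor in speedups.items():
    print(f"{name}: about {factor:.0f}x faster")
```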

Testing

Local Tpetra tests passed.

@tjfulle tjfulle added the pkg: Tpetra and PA: Data Services labels Jan 8, 2025
@tjfulle tjfulle requested review from rppawlo, csiefer2 and jhux2 January 8, 2025 23:41
@tjfulle tjfulle self-assigned this Jan 8, 2025
@tjfulle tjfulle requested a review from a team as a code owner January 8, 2025 23:41
@jhux2 jhux2 added the AT: PRE-TEST INSPECTED label Jan 8, 2025
@tjfulle tjfulle marked this pull request as draft January 14, 2025 18:36
@cgcgcg
Contributor

cgcgcg commented Jan 14, 2025

Should this PR be tested? Or is it WIP?

Contributor

@cgcgcg cgcgcg left a comment

Approving so that the AT runs.

@cgcgcg cgcgcg added the AT: RETEST label (causes the PR autotester to run a new round of PR tests on the next iteration) Jan 14, 2025
@trilinos-autotester
Contributor

Status Flag 'Pull Request AutoTester' - User Requested Retest - Label AT: RETEST will be reset after testing.

@trilinos-autotester
Contributor

Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects:

Pull Request Auto Testing STARTING

Build Information

Test Name: PR_gcc-openmpi-openmp

  • Build Num: 1008
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_sems-gnu-8.5.0-openmpi-4.1.6-openmp_release-debug_static_no-kokkos-arch_no-asan_no-complex_no-fpic_mpi_no-pt_no-rdc_no-uvm_deprecated-on_no-package-enables
PR_LABELS pkg: Tpetra;AT: RETEST;PA: Data Services;AT: PRE-TEST INSPECTED
PULLREQUESTNUM 13714
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8
TRILINOS_SOURCE_REPO https://github.com/trilinos/Trilinos
TRILINOS_SOURCE_SHA 5dd8609
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA cc1fe7a

Build Information

Test Name: PR_gcc

  • Build Num: 1058
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_sems-gnu-8.5.0-serial_release-debug_shared_no-kokkos-arch_no-asan_no-complex_no-fpic_no-mpi_no-pt_no-rdc_no-uvm_deprecated-on_no-package-enables
PR_LABELS pkg: Tpetra;AT: RETEST;PA: Data Services;AT: PRE-TEST INSPECTED
PULLREQUESTNUM 13714
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8
TRILINOS_SOURCE_REPO https://github.com/trilinos/Trilinos
TRILINOS_SOURCE_SHA 5dd8609
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA cc1fe7a

Build Information

Test Name: PR_gcc-openmpi_debug

  • Build Num: 1059
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_sems-gnu-8.5.0-openmpi-4.1.6-serial_debug_shared_no-kokkos-arch_no-asan_no-complex_no-fpic_mpi_no-pt_no-rdc_no-uvm_deprecated-on_no-package-enables
PR_LABELS pkg: Tpetra;AT: RETEST;PA: Data Services;AT: PRE-TEST INSPECTED
PULLREQUESTNUM 13714
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8
TRILINOS_SOURCE_REPO https://github.com/trilinos/Trilinos
TRILINOS_SOURCE_SHA 5dd8609
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA cc1fe7a

Build Information

Test Name: PR_clang

  • Build Num: 1057
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_sems-clang-11.0.1-openmpi-4.0.5-serial_release-debug_shared_no-kokkos-arch_no-asan_no-complex_no-fpic_mpi_no-pt_no-rdc_no-uvm_deprecated-on_no-package-enables
PR_LABELS pkg: Tpetra;AT: RETEST;PA: Data Services;AT: PRE-TEST INSPECTED
PULLREQUESTNUM 13714
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8
TRILINOS_SOURCE_REPO https://github.com/trilinos/Trilinos
TRILINOS_SOURCE_SHA 5dd8609
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA cc1fe7a

Build Information

Test Name: PR_cuda

  • Build Num: 1056
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_sems-cuda-11.4.2-gnu-10.1.0-openmpi-4.1.6_release_static_Volta70_no-asan_complex_no-fpic_mpi_pt_no-rdc_no-uvm_deprecated-on_no-package-enables
PR_LABELS pkg: Tpetra;AT: RETEST;PA: Data Services;AT: PRE-TEST INSPECTED
PULLREQUESTNUM 13714
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8-gpu
TRILINOS_SOURCE_REPO https://github.com/trilinos/Trilinos
TRILINOS_SOURCE_SHA 5dd8609
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA cc1fe7a

Build Information

Test Name: PR_intel

  • Build Num: 977
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_sems-intel-2021.3-sems-openmpi-4.1.6_release-debug_shared_no-kokkos-arch_no-asan_no-complex_fpic_mpi_no-pt_no-rdc_no-uvm_deprecated-on_no-package-enables
PR_LABELS pkg: Tpetra;AT: RETEST;PA: Data Services;AT: PRE-TEST INSPECTED
PULLREQUESTNUM 13714
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8
TRILINOS_SOURCE_REPO https://github.com/trilinos/Trilinos
TRILINOS_SOURCE_SHA 5dd8609
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA cc1fe7a

Build Information

Test Name: PR_cuda-uvm

  • Build Num: 1056
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_sems-cuda-11.4.2-gnu-10.1.0-openmpi-4.1.6_release_static_Volta70_no-asan_complex_no-fpic_mpi_pt_no-rdc_uvm_deprecated-on_no-package-enables
PR_LABELS pkg: Tpetra;AT: RETEST;PA: Data Services;AT: PRE-TEST INSPECTED
PULLREQUESTNUM 13714
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8
TRILINOS_SOURCE_REPO https://github.com/trilinos/Trilinos
TRILINOS_SOURCE_SHA 5dd8609
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA cc1fe7a

Using Repos:

Repo: TRILINOS (trilinos/Trilinos)
  • Branch: tpetra/copy-and-permute-3
  • SHA: 5dd8609
  • Mode: TEST_REPO

Pull Request Author: tjfulle

@trilinos-autotester
Contributor

Status Flag 'Pull Request AutoTester' - Jenkins Testing: 1 or more Jobs FAILED

Note: Testing will normally be attempted again in approx. 2 Hrs 30 Mins. If a change to the PR source branch occurs, the testing will be attempted again on next available autotester run.

Pull Request Auto Testing has FAILED

Build Information

Test Name: PR_gcc-openmpi-openmp

  • Build Num: 1008
  • Status: FAILED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_sems-gnu-8.5.0-openmpi-4.1.6-openmp_release-debug_static_no-kokkos-arch_no-asan_no-complex_no-fpic_mpi_no-pt_no-rdc_no-uvm_deprecated-on_no-package-enables
PR_LABELS pkg: Tpetra;AT: RETEST;PA: Data Services;AT: PRE-TEST INSPECTED
PULLREQUESTNUM 13714
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8
TRILINOS_SOURCE_REPO https://github.com/trilinos/Trilinos
TRILINOS_SOURCE_SHA 5dd8609
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA cc1fe7a

Build Information

Test Name: PR_gcc

  • Build Num: 1058
  • Status: FAILED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_sems-gnu-8.5.0-serial_release-debug_shared_no-kokkos-arch_no-asan_no-complex_no-fpic_no-mpi_no-pt_no-rdc_no-uvm_deprecated-on_no-package-enables
PR_LABELS pkg: Tpetra;AT: RETEST;PA: Data Services;AT: PRE-TEST INSPECTED
PULLREQUESTNUM 13714
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8
TRILINOS_SOURCE_REPO https://github.com/trilinos/Trilinos
TRILINOS_SOURCE_SHA 5dd8609
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA cc1fe7a

Build Information

Test Name: PR_gcc-openmpi_debug

  • Build Num: 1059
  • Status: FAILED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_sems-gnu-8.5.0-openmpi-4.1.6-serial_debug_shared_no-kokkos-arch_no-asan_no-complex_no-fpic_mpi_no-pt_no-rdc_no-uvm_deprecated-on_no-package-enables
PR_LABELS pkg: Tpetra;AT: RETEST;PA: Data Services;AT: PRE-TEST INSPECTED
PULLREQUESTNUM 13714
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8
TRILINOS_SOURCE_REPO https://github.com/trilinos/Trilinos
TRILINOS_SOURCE_SHA 5dd8609
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA cc1fe7a

Build Information

Test Name: PR_clang

  • Build Num: 1057
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_sems-clang-11.0.1-openmpi-4.0.5-serial_release-debug_shared_no-kokkos-arch_no-asan_no-complex_no-fpic_mpi_no-pt_no-rdc_no-uvm_deprecated-on_no-package-enables
PR_LABELS pkg: Tpetra;AT: RETEST;PA: Data Services;AT: PRE-TEST INSPECTED
PULLREQUESTNUM 13714
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8
TRILINOS_SOURCE_REPO https://github.com/trilinos/Trilinos
TRILINOS_SOURCE_SHA 5dd8609
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA cc1fe7a

Build Information

Test Name: PR_cuda

  • Build Num: 1056
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_sems-cuda-11.4.2-gnu-10.1.0-openmpi-4.1.6_release_static_Volta70_no-asan_complex_no-fpic_mpi_pt_no-rdc_no-uvm_deprecated-on_no-package-enables
PR_LABELS pkg: Tpetra;AT: RETEST;PA: Data Services;AT: PRE-TEST INSPECTED
PULLREQUESTNUM 13714
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8-gpu
TRILINOS_SOURCE_REPO https://github.com/trilinos/Trilinos
TRILINOS_SOURCE_SHA 5dd8609
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA cc1fe7a

Build Information

Test Name: PR_intel

  • Build Num: 977
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_sems-intel-2021.3-sems-openmpi-4.1.6_release-debug_shared_no-kokkos-arch_no-asan_no-complex_fpic_mpi_no-pt_no-rdc_no-uvm_deprecated-on_no-package-enables
PR_LABELS pkg: Tpetra;AT: RETEST;PA: Data Services;AT: PRE-TEST INSPECTED
PULLREQUESTNUM 13714
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8
TRILINOS_SOURCE_REPO https://github.com/trilinos/Trilinos
TRILINOS_SOURCE_SHA 5dd8609
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA cc1fe7a

Build Information

Test Name: PR_cuda-uvm

  • Build Num: 1056
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_sems-cuda-11.4.2-gnu-10.1.0-openmpi-4.1.6_release_static_Volta70_no-asan_complex_no-fpic_mpi_pt_no-rdc_uvm_deprecated-on_no-package-enables
PR_LABELS pkg: Tpetra;AT: RETEST;PA: Data Services;AT: PRE-TEST INSPECTED
PULLREQUESTNUM 13714
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8
TRILINOS_SOURCE_REPO https://github.com/trilinos/Trilinos
TRILINOS_SOURCE_SHA 5dd8609
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA cc1fe7a


CDash Test Results for PR# 13714.


Wiki: How to Reproduce PR Testing Builds and Errors.

@trilinos-autotester trilinos-autotester removed the AT: RETEST label Jan 15, 2025
@jhux2
Member

jhux2 commented Jan 15, 2025

@cgcgcg I thought pre-test inspection was sufficient to start the AT. Is approval now needed?

@cgcgcg
Contributor

cgcgcg commented Jan 15, 2025

@jhux2 I'm confused by this as well.

Labels
AT: PRE-TEST INSPECTED (Required to test outside contributions. This label alone will not allow a PR to merge.)
PA: Data Services (Issues that fall under the Trilinos Data Services Product Area)
pkg: Tpetra