Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Warn on potential data races #1541

Closed
wants to merge 21 commits into from
Closed

Warn on potential data races #1541

wants to merge 21 commits into from

Conversation

luca-patrignani
Copy link
Contributor

@luca-patrignani luca-patrignani commented Mar 4, 2024

This PR aims to raise a warning if two or more memlets are writing to the same access node and their ranges are overlapping. It also adds a new config entry to activate this check.
This PR tries to close issue #424.

Needs confirmation:

  • Are the new tests overcomplicated?
  • Do we need some unit tests for are_intersecting?

@luca-patrignani
Copy link
Contributor Author

@tbennun when you got some free time can you review this? Thanks

Copy link
Collaborator

@tbennun tbennun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the contribution! Overall this is a good start, but it will not cover the majority of cases of data races that may occur in dace programs (see comments). Also, please use the subsets.intersects API.

dace/config_schema.yml Outdated Show resolved Hide resolved
dace/config_schema.yml Outdated Show resolved Hide resolved
dace/sdfg/validation.py Outdated Show resolved Hide resolved
dace/subsets.py Outdated Show resolved Hide resolved
dace/viewer/webclient Outdated Show resolved Hide resolved
tests/warn_on_potential_data_race_test.py Show resolved Hide resolved
@luca-patrignani luca-patrignani marked this pull request as draft March 12, 2024 18:37
@luca-patrignani luca-patrignani marked this pull request as ready for review March 15, 2024 08:57
Copy link
Collaborator

@tbennun tbennun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is much improved! The data race check still needs one more change to be correct (see comment)

tests/warn_on_potential_data_race_test.py Outdated Show resolved Hide resolved
tests/warn_on_potential_data_race_test.py Outdated Show resolved Hide resolved
tests/warn_on_potential_data_race_test.py Outdated Show resolved Hide resolved
tests/warn_on_potential_data_race_test.py Outdated Show resolved Hide resolved
tests/warn_on_potential_data_race_test.py Outdated Show resolved Hide resolved
tests/warn_on_potential_data_race_test.py Outdated Show resolved Hide resolved
tests/warn_on_potential_data_race_test.py Outdated Show resolved Hide resolved
dace/sdfg/validation.py Outdated Show resolved Hide resolved
dace/sdfg/validation.py Outdated Show resolved Hide resolved
write_accesses[node.label].extend([e.data.dst_subset for e in state.in_edges(node)])
read_accesses[node.label].extend([e.data.src_subset for e in state.out_edges(node)])

for node_label in node_labels:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few notes on this test:

  1. You should keep the actual access nodes in your dictionary. If a map that reads from A[i] also writes to A[i], that would trigger a validation warning incorrectly. Instead you need to check if:
  • The node is the same (in that case, write-write conflicts may still happen)
  • If the nodes are not the same, whether there is a path from the first to the second (use has_path to check that). If there is a path, then there is no conflict (read/write or write/write). This is another reason for this check being expensive. There may be some analysis passes in dace.transformation.passes that give you the paths between dependent access nodes, but creating a dictionary would be an optimization. To be safe you can use has_path for now and leave a comment that this could be optimized.
  1. Write-conflict resolution memlets (memlet.wcr is not None) are exempt from write-write conflicts (maybe not read-write)

Please also add an elementwise map test (A->mapentry-A[i]->tasklet-A[i]->mapexit->A) just to make sure.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to follow your notes but I felt kind of lost, especially for point 2. Consider this two test cases:

def test1():
    sdfg = dace.SDFG('memlet_overlap_with_wcr')
    state = sdfg.add_state()
    sdfg.add_array("A", (20,), dace.int32)
    sdfg.add_array("B", (1,), dace.int32)
    A = state.add_read("A")
    B = state.add_write("B")

    state.add_mapped_tasklet(
        name="first_reduction",
        code="b = a",
        inputs={"a": dace.Memlet(data="A", subset="k")},
        outputs={"b": dace.Memlet(data="B", subset="0", wcr="lambda old, new: old + new")},
        map_ranges={"k": "0:20"},
        external_edges=True,
        input_nodes={"A": A},
        output_nodes={"B": B}
    )

    state.add_mapped_tasklet(
        name="second_reduction",
        code="b = a",
        inputs={"a": dace.Memlet(data="A", subset="k")},
        outputs={"b": dace.Memlet(data="B", subset="0", wcr="lambda old, new: old * new")},
        map_ranges={"k": "0:20"},
        external_edges=True,
        input_nodes={"A": A},
        output_nodes={"B": B}
    )

    with warnings.catch_warnings():
        warnings.simplefilter("error", UserWarning)
        with dace.config.set_temporary('experimental', 'check_race_conditions', value=True):
            sdfg.validate()

def test2():
    sdfg = dace.SDFG('memlet_overlap_with_wcr')
    state = sdfg.add_state()
    sdfg.add_array("A", (20,), dace.int32)
    sdfg.add_array("B", (1,), dace.int32)
    A = state.add_read("A")
    B = state.add_write("B")

    state.add_mapped_tasklet(
        name="first_reduction",
        code="b = a",
        inputs={"a": dace.Memlet(data="A", subset="k")},
        outputs={"b": dace.Memlet(data="B", subset="0", wcr="lambda old, new: old + new")},
        map_ranges={"k": "0:20"},
        external_edges=True,
        input_nodes={"A": A},
        output_nodes={"B": B}
    )

    state.add_mapped_tasklet(
        name="second_reduction",
        code="b = a",
        inputs={"a": dace.Memlet(data="A", subset="k")},
        outputs={"b": dace.Memlet(data="B", subset="0")},
        map_ranges={"k": "0:20"},
        external_edges=True,
        input_nodes={"A": A},
        output_nodes={"B": B}
    )

    with warnings.catch_warnings():
        warnings.simplefilter("error", UserWarning)
        with dace.config.set_temporary('experimental', 'check_race_conditions', value=True):
            sdfg.validate()

Are they considered data races? At this moment my code raises no warnings.
I'm probably missing something important, sorry for this long PR review.

@luca-patrignani
Copy link
Contributor Author

@tbennun when I got some free time can you review the last changes? I left a reply on your last comment.

@phschaad
Copy link
Collaborator

phschaad commented May 16, 2024

Pinging @tbennun , I assume this was burried

Copy link
Collaborator

@tbennun tbennun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for making the corrections, LGTM!

@luca-patrignani luca-patrignani closed this by deleting the head repository Jun 10, 2024
@tbennun
Copy link
Collaborator

tbennun commented Oct 29, 2024

@luca-patrignani @phschaad if possible, can we revive this so we can merge it? Thanks!

@phschaad
Copy link
Collaborator

phschaad commented Oct 29, 2024

@tbennun The PR is revived in #1712 :)

github-merge-queue bot pushed a commit that referenced this pull request Oct 29, 2024
Re-implement changes made by @luca-patrignani in #1541, see that PR for
more information.

Author: @luca-patrignani
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants