Skip to content

Commit

Permalink
[SYCL][Graph] Refine barrier semantics
Browse files Browse the repository at this point in the history
Changes the semantics of
[sycl_ext_oneapi_enqueue_barrier](https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/supported/sycl_ext_oneapi_enqueue_barrier.asciidoc) commands recorded from a queue into a
SYCL-Graph.

The current semantics are that a barrier without a wait-list will
depend on all leaf nodes in the graph, and be a dependency of any
subsequent node added to the graph.

After discussion with users, this change updates the behavior of
barriers to only depend on leaf nodes from the same recording queue,
and only commands recorded from the same queue will have a dependency
on the barrier.
  • Loading branch information
EwanC committed Jul 1, 2024
1 parent 1f3f02b commit 58cb380
Show file tree
Hide file tree
Showing 6 changed files with 518 additions and 124 deletions.
23 changes: 16 additions & 7 deletions sycl/doc/extensions/experimental/sycl_ext_oneapi_graph.asciidoc
Original file line number Diff line number Diff line change
Expand Up @@ -1736,15 +1736,24 @@ passed an invalid event.
The new handler methods, and queue shortcuts, defined by
link:../supported/sycl_ext_oneapi_enqueue_barrier.asciidoc[sycl_ext_oneapi_enqueue_barrier]
can only be used in graph nodes created using the Record & Replay API, as
barriers rely on events to enforce dependencies. For barriers with an empty
wait list parameter, the semantics are that the barrier node being added to
will depend on all the existing graph leaf nodes, not only the leaf nodes
that were added from the queue being recorded.
barriers rely on events to enforce dependencies.

A synchronous exception will be thrown with error code `invalid` if a user
tries to add them to a graph using the Explicit API. Empty nodes created with
the `node::depends_on_all_leaves` property can be used instead of barriers when
a user is building a graph with the explicit API.
tries to add a barrier command to a graph using the explicit API. Empty nodes
created with the `node::depends_on_all_leaves` property can be used instead of
barriers when a user is building a graph with the explicit API.

The semantics of barriers are defined in `sycl_ext_oneapi_enqueue_barrier` for
a single command-queue, and correlate as follows to a graph that may contain
nodes that are recorded from multiple queues and/or added by the explicit API:

* Barriers with an empty wait list parameter will only depend on the leaf nodes
that were added to the graph from the queue the barrier command is being
recorded from.

* The only commands which have an implicit dependency on the barrier command
added to the graph are those recorded from the same queue the barrier command
was submitted to.

==== sycl_ext_oneapi_memcpy2d

Expand Down
20 changes: 7 additions & 13 deletions sycl/source/detail/graph_impl.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -353,9 +353,6 @@ graph_impl::add(const std::shared_ptr<graph_impl> &Impl,

const std::shared_ptr<node_impl> &NodeImpl = std::make_shared<node_impl>();

// Add any deps from the vector of extra dependencies
Deps.insert(Deps.end(), MExtraDependencies.begin(), MExtraDependencies.end());

MNodeStorage.push_back(NodeImpl);

addDepsToNode(NodeImpl, Deps);
Expand Down Expand Up @@ -488,20 +485,12 @@ graph_impl::add(node_type NodeType,
// list
Deps.insert(Deps.end(), UniqueDeps.begin(), UniqueDeps.end());

// Add any deps from the extra dependencies vector
Deps.insert(Deps.end(), MExtraDependencies.begin(), MExtraDependencies.end());

const std::shared_ptr<node_impl> &NodeImpl =
std::make_shared<node_impl>(NodeType, std::move(CommandGroup));
MNodeStorage.push_back(NodeImpl);

addDepsToNode(NodeImpl, Deps);

// Set barrier nodes as prerequisites (new start points) for subsequent nodes
if (NodeImpl->MCGType == sycl::detail::CG::Barrier) {
MExtraDependencies.push_back(NodeImpl);
}

return NodeImpl;
}

Expand Down Expand Up @@ -610,12 +599,17 @@ void graph_impl::makeEdge(std::shared_ptr<node_impl> Src,
removeRoot(Dest); // remove receiver from root node list
}

std::vector<sycl::detail::EventImplPtr> graph_impl::getExitNodesEvents() {
std::vector<sycl::detail::EventImplPtr> graph_impl::getExitNodesEvents(
std::weak_ptr<sycl::detail::queue_impl> RecordedQueue) {
std::vector<sycl::detail::EventImplPtr> Events;

auto RecordedQueueSP = RecordedQueue.lock();
for (auto &Node : MNodeStorage) {
if (Node->MSuccessors.empty()) {
Events.push_back(getEventForNode(Node));
auto EventForNode = getEventForNode(Node);
if (EventForNode->getSubmittedQueue() == RecordedQueueSP) {
Events.push_back(getEventForNode(Node));
}
}
}

Expand Down
46 changes: 23 additions & 23 deletions sycl/source/detail/graph_impl.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -1184,26 +1184,26 @@ class graph_impl {
size_t getNumberOfNodes() const { return MNodeStorage.size(); }

/// Traverse the graph recursively to get the events associated with the
/// output nodes of this graph.
/// output nodes of this graph associated with a specific queue.
/// @param[in] Queue The queue exit nodes must have been recorded from.
/// @return vector of events associated to exit nodes.
std::vector<sycl::detail::EventImplPtr> getExitNodesEvents();

/// Removes all Barrier nodes from the list of extra dependencies
/// MExtraDependencies.
/// @return vector of events associated to previous barrier nodes.
std::vector<sycl::detail::EventImplPtr>
removeBarriersFromExtraDependencies() {
std::vector<sycl::detail::EventImplPtr> Events;
for (auto It = MExtraDependencies.begin();
It != MExtraDependencies.end();) {
if ((*It)->MCGType == sycl::detail::CG::Barrier) {
Events.push_back(getEventForNode(*It));
It = MExtraDependencies.erase(It);
} else {
++It;
}
}
return Events;
getExitNodesEvents(std::weak_ptr<sycl::detail::queue_impl> Queue);

/// Store the last barrier node that was submitted to the queue.
/// @param[in] Queue The queue the barrier was recorded from.
/// @param[in] BarrierNodeImpl The created barrier node.
void setBarrierDep(std::weak_ptr<sycl::detail::queue_impl> Queue,
std::shared_ptr<node_impl> BarrierNodeImpl) {
MBarrierDependencyMap[Queue] = BarrierNodeImpl;
}

/// Get the last barrier node that was submitted to the queue.
/// @param[in] Queue The queue to find the last barrier node of. An empty
/// shared_ptr is returned if no barrier node has been recorded to the queue.
std::shared_ptr<node_impl>
getBarrierDep(std::weak_ptr<sycl::detail::queue_impl> Queue) {
return MBarrierDependencyMap[Queue];
}

private:
Expand Down Expand Up @@ -1281,11 +1281,11 @@ class graph_impl {
/// presence of the assume_buffer_outlives_graph property.
bool MAllowBuffers = false;

/// List of nodes that must be added as extra dependencies to new nodes when
/// added to this graph.
/// This list is mainly used by barrier nodes which must be considered
/// as predecessors for all nodes subsequently added to the graph.
std::list<std::shared_ptr<node_impl>> MExtraDependencies;
/// Mapping from queues to barrier nodes. For each queue the last barrier
/// node recorded to the graph from the queue is stored.
std::map<std::weak_ptr<sycl::detail::queue_impl>, std::shared_ptr<node_impl>,
std::owner_less<std::weak_ptr<sycl::detail::queue_impl>>>
MBarrierDependencyMap;
};

/// Class representing the implementation of command_graph<executable>.
Expand Down
26 changes: 12 additions & 14 deletions sycl/source/handler.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -403,19 +403,6 @@ event handler::finalize() {
case detail::CG::Barrier:
case detail::CG::BarrierWaitlist: {
if (auto GraphImpl = getCommandGraph(); GraphImpl != nullptr) {
// if no event to wait for was specified, we add all exit
// nodes/events of the graph
if (MEventsWaitWithBarrier.size() == 0) {
MEventsWaitWithBarrier = GraphImpl->getExitNodesEvents();
// Graph-wide barriers take precedence over previous one.
// We therefore remove the previous ones from ExtraDependencies list.
// The current barrier is then added to this list in the graph_impl.
std::vector<detail::EventImplPtr> EventsBarriers =
GraphImpl->removeBarriersFromExtraDependencies();
MEventsWaitWithBarrier.insert(std::end(MEventsWaitWithBarrier),
std::begin(EventsBarriers),
std::end(EventsBarriers));
}
CGData.MEvents.insert(std::end(CGData.MEvents),
std::begin(MEventsWaitWithBarrier),
std::end(MEventsWaitWithBarrier));
Expand Down Expand Up @@ -533,6 +520,7 @@ event handler::finalize() {
// it to the graph to create a node, rather than submit it to the scheduler.
if (auto GraphImpl = MQueue->getCommandGraph(); GraphImpl) {
auto EventImpl = std::make_shared<detail::event_impl>();
EventImpl->setSubmittedQueue(MQueue);
std::shared_ptr<ext::oneapi::experimental::detail::node_impl> NodeImpl =
nullptr;

Expand Down Expand Up @@ -564,7 +552,17 @@ event handler::finalize() {
// queue.
GraphImpl->setLastInorderNode(MQueue, NodeImpl);
} else {
NodeImpl = GraphImpl->add(NodeType, std::move(CommandGroup));
auto LastBarrierRecordedFromQueue = GraphImpl->getBarrierDep(MQueue);
if (LastBarrierRecordedFromQueue) {
NodeImpl = GraphImpl->add(NodeType, std::move(CommandGroup),
{LastBarrierRecordedFromQueue});
} else {
NodeImpl = GraphImpl->add(NodeType, std::move(CommandGroup));
}

if (NodeImpl->MCGType == sycl::detail::CG::Barrier) {
GraphImpl->setBarrierDep(MQueue, NodeImpl);
}
}

// Associate an event with this new node and return the event.
Expand Down
Loading

0 comments on commit 58cb380

Please sign in to comment.