Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SYCL] Remove memory allocation/free nodes #79

Merged
merged 3 commits into from
Mar 16, 2023

Conversation

EwanC
Copy link
Collaborator

@EwanC EwanC commented Feb 20, 2023

There are concerns with the currently specified approach so remove the API and advise users to allocate memory before submitting the graph. Marked as draft for now as this requires meeting discussion.

Conflicts with semantics of these PRs, which should be closed if we merge this:

Addressed #39

@EwanC EwanC added the Graph Specification Extension Specification related label Feb 20, 2023
Copy link
Collaborator

@Bensuo Bensuo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just a minor comment.

@sommerlukas
Copy link

As already discussed a bit over in the graph fusion proposal, the current proposal for graph fusion builds on top of add_malloc_device to specify internalization of pointers through a property.

Removing the add_malloc_device API from the graph proposal does not make internalization of USM pointers impossible, as the same property could be passed to the core USM functions malloc_device/malloc_shared, to mark a USM pointer for internalization (although this might slightly complicate the implementation).

However, attaching the property to add_malloc_device has two advantages over attaching it to malloc_device/malloc_shared:

  • Granularity: When attaching the property to add_malloc_device, it is possible to limit the scope of the internalization semantics to just the graph. When attaching the same property to malloc_device/malloc_shared, the internalization semantics would apply to all graphs using this USM pointer. Of course, this can be avoided by allocating device memory just for the use in the graph by the user, but users would need to make sure to not accidentally use a pointer with internalization semantics in a graph where this is not intended.
  • Optimization: When the internalization property is attached to add_malloc_device and the JIT compiler is successful in his attempt to internalize during graph::finalize, it is possible to completely remove the device allocation from the graph. This can reduce the number of device allocations, potentially leading to higher performance. If the USM allocation happens outside of the graph and the graph has no control over it, this optimization won't be possible.

tl;dr: Removing add_malloc_device will not make USM internalization in graph fusion impossible, but it may hinder some optimizations and require extra care from the user.

Copy link
Owner

@reble reble left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Some minor comments.

There are [concerns with the currently specified
approach](intel#5626 (comment)) so
remove the API and advise users to allocate memory before submitting the
graph.
@EwanC EwanC force-pushed the ewan/remove-usm-alloc-node branch from 02b0218 to 0c9f37a Compare March 16, 2023 11:12
@EwanC EwanC marked this pull request as ready for review March 16, 2023 13:21
@EwanC
Copy link
Collaborator Author

EwanC commented Mar 16, 2023

Merging this, since although we're in the middle of considering other ideas of the memory allocation API, the existing API isn't one we currently want to promote, so removing it better represents the current status of the feature.

Once we have a concrete idea of the memory allocation API we want to advocate, then we can create a PR for that. Even reverting this commit from the git history, in the case it does turn out we want to bring back what we had.

@EwanC EwanC merged commit 8cef0c2 into sycl-graph-update Mar 16, 2023
@Bensuo Bensuo deleted the ewan/remove-usm-alloc-node branch August 7, 2023 16:27
mfrancepillois pushed a commit that referenced this pull request Oct 2, 2023
…… (#67069)

We noticed some performance issue while in lldb-vscode for grabing the
name of the SBValue. Profiling shows SBValue::GetName() can cause
synthetic children provider of shared/unique_ptr to deference underlying
object and complete it type.

This patch lazily moves the dereference from synthetic child provider's
Update() method to GetChildAtIndex() so that SBValue::GetName() won't
trigger the slow code path.

Here is the culprit slow code path:
```
...
frame #59: 0x00007ff4102e0660 liblldb.so.15`SymbolFileDWARF::CompleteType(this=<unavailable>, compiler_type=0x00007ffdd9829450) at SymbolFileDWARF.cpp:1567:25 [opt]
...
frame #67: 0x00007ff40fdf9bd4 liblldb.so.15`lldb_private::ValueObject::Dereference(this=0x0000022bb5dfe980, error=0x00007ffdd9829970) at ValueObject.cpp:2672:41 [opt]
frame #68: 0x00007ff41011bb0a liblldb.so.15`(anonymous namespace)::LibStdcppSharedPtrSyntheticFrontEnd::Update(this=0x000002298fb94380) at LibStdcpp.cpp:403:40 [opt]
frame #69: 0x00007ff41011af9a liblldb.so.15`lldb_private::formatters::LibStdcppSharedPtrSyntheticFrontEndCreator(lldb_private::CXXSyntheticChildren*, std::shared_ptr<lldb_private::ValueObject>) [inlined] (anonymous namespace)::LibStdcppSharedPtrSyntheticFrontEnd::LibStdcppSharedPtrSyntheticFrontEnd(this=0x000002298fb94380, valobj_sp=<unavailable>) at LibStdcpp.cpp:371:5 [opt]
...
frame #78: 0x00007ff40fdf6e42 liblldb.so.15`lldb_private::ValueObject::CalculateSyntheticValue(this=0x000002296c66a500) at ValueObject.cpp:1836:27 [opt]
frame #79: 0x00007ff40fdf1939 liblldb.so.15`lldb_private::ValueObject::GetSyntheticValue(this=<unavailable>) at ValueObject.cpp:1867:3 [opt]
frame #80: 0x00007ff40fc89008 liblldb.so.15`ValueImpl::GetSP(this=0x0000022c71b90de0, stop_locker=0x00007ffdd9829d00, lock=0x00007ffdd9829d08, error=0x00007ffdd9829d18) at SBValue.cpp:141:46 [opt]
frame #81: 0x00007ff40fc7d82a liblldb.so.15`lldb::SBValue::GetSP(ValueLocker&) const [inlined] ValueLocker::GetLockedSP(this=0x00007ffdd9829d00, in_value=<unavailable>) at SBValue.cpp:208:21 [opt]
frame #82: 0x00007ff40fc7d817 liblldb.so.15`lldb::SBValue::GetSP(this=0x00007ffdd9829d90, locker=0x00007ffdd9829d00) const at SBValue.cpp:1047:17 [opt]
frame #83: 0x00007ff40fc7da6f liblldb.so.15`lldb::SBValue::GetName(this=0x00007ffdd9829d90) at SBValue.cpp:294:32 [opt]
...
```

Differential Revision: https://reviews.llvm.org/D159542
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Graph Specification Extension Specification related
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants