-
Notifications
You must be signed in to change notification settings - Fork 132
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Develop zerocopy #1053
Develop zerocopy #1053
Conversation
…lper XLink functions for FDs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Filippo!
I left a small nitpicks.
You can run clangformat on the code with cmake --build build --target clangformat
.
- I think we should extend
SharedMemory
so it can be allocated on core side (conditionally for linux builds). - And to extend
writingThread
inDataQueue.cpp
to send out frames with shared memory when possible.
To allow for fast transfer in both directions.
CMakeLists.txt
Outdated
@@ -15,7 +15,7 @@ if(WIN32) | |||
set(DEPTHAI_CURL_USE_OPENSSL OFF) | |||
else() | |||
set(DEPTHAI_CURL_USE_SCHANNEL OFF) | |||
set(DEPTHAI_CURL_USE_OPENSSL ON) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this can be removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can remove this one before the merge.
@@ -6,6 +6,8 @@ | |||
#include "depthai/utility/Memory.hpp" | |||
#include "depthai/utility/Serialization.hpp" | |||
#include "depthai/utility/VectorMemory.hpp" | |||
#include "depthai/utility/SharedMemory.hpp" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, removed
|
||
/** | ||
* @param data Moves data to internal buffer | ||
*/ | ||
void setData(std::vector<std::uint8_t>&& data); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Run clangformat target
SharedMemory() { | ||
kind = MemoryKinds::MEMORY_KIND_SHARED_MEMORY; | ||
fd = -1; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it make sense to allow this?
I think it would make more sense to only allow to pass in a size and allocate the shared memory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah probably useless, i removed it.
I think it would make more sense to only allow to pass in a size and allocate the shared memory.
Well, yes, that makes sense. So both passing the FD externally or automatically allocating with a shared memory buffer.
Also, i think it would be worth it looking into memfd_create()
, since it is automatically closed when all references are released, which would be better than shm_open
, which must be unlinked.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please look into memfd_create()
in that case so we can send data zero copy in the other direction as well then yeah.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is pretty nice. It's like shm_open()
, but better. You can do sealing and it has no need for unlinking.
The only issue would be name generation, which is the same for shm_open()
.
I'd wager we could use a static class variable to create a string like "shared_memory_" and then add the number of instances created. I think that should last us up until 2^64 instances :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh btw, that constructor is actually needed because of the GstBufferMemory
constructor in DepthAI device, as we set the FD in the constructor itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we set it in the constructor, can we just initialize shared memory in the initializer list?
GstBufferMemory(int fd): SharedMemory(fd) {}...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I see what you mean (checked the source).
Let's leave it then.
void setSize(std::size_t size) override { | ||
ftruncate(fd, size); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if size is bigger than current size?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, technically ftruncate extends it, but how that would interact with the various kinds of shared memory depends on the implementation
If it's for example memory created with shm_open, it will just expand it. If it's a DMA buffer, it depends on the implementation. For Gst, it seems to be the same as shm_open.
This behavior should be overridden by implementations if it's different.
However, we probably need to unmap and map the FD so that we can actually have a mapping of the correct size.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense, I though truncate only downsizes.
VectorMemory() { kind = MemoryKinds::MEMORY_KIND_VECTOR_MEMORY; } | ||
VectorMemory(const std::vector<std::uint8_t>& d) : vector(std::move(d)) { kind = MemoryKinds::MEMORY_KIND_VECTOR_MEMORY; } | ||
VectorMemory(std::vector<std::uint8_t>&& d) : vector(std::move(d)) { kind = MemoryKinds::MEMORY_KIND_VECTOR_MEMORY; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can remove the kind I think and use casts instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the kind is important, as for example in XLinkOut in DepthAI device, we need it to discern how to output. Unless we want to understand which it is by using sizeof(), which is imperfect, this system works better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use if(std::dynamic_ptr_cast<SharedMemory>(data))
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't think about that...
That should work yes. I've tested it on a separate program and it works. I'm not sure about overhead, but it should be comparable anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like it doesn't work. I've reinstated the kind variables and they work now.
Branch is ready for merge |
There is a missing |
This update leverages the new FD protocol in XLink to avoid unecessary copying, allowing for a zerocopy runtime on localhost.
This is allowed by the new memory class, SharedMemory, which instread of containing the data itself, like VectorMemory, it only holds the FD and the mapping.