-
Notifications
You must be signed in to change notification settings - Fork 201
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[LibOS] Implement mmap
for encrypted and tmpfs files
#550
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 16 of 17 files at r1, 1 of 1 files at r2, all commit messages.
Reviewable status: all files reviewed, 18 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: ITL), "fixup! " found in commit messages' one-liners (waiting on @pwmarcz)
-- commits
line 9 at r2:
on an
-> on
LibOS/shim/include/shim_fs.h
line 113 at r2 (raw file):
*/ int (*msync)(struct shim_handle* hdl, void* addr, size_t size, int prot, int flags, uint64_t offset);
Why do we need prot
argument here? UPDATE: Ok, because we can have a write-only protection and we want to read the contents of the VMA to write them into the file. So we memorize the original protections, then add R and perform msync, then restore original protections.
LibOS/shim/src/bookkeep/shim_vma.c
line 1266 at r2 (raw file):
} static int _msync_all(uintptr_t begin, uintptr_t end, struct shim_handle* hdl) {
Looks like a potentially very slow operation because it iterates through all VMAs every time it's called. I guess nothing can be done about it, but do you think it can be a perf bottleneck? I can imagine big applications with thousands of VMAs allocated...
LibOS/shim/src/bookkeep/shim_vma.c
line 1285 at r2 (raw file):
uintptr_t msync_end = MIN(end, (uintptr_t)vma_info->addr + vma_info->length); ret = file->fs->fs_ops->msync(file, (void*)msync_begin, msync_end - msync_begin, vma_info->prot, vma_info->flags, vma_info->file_offset);
It's unclear to me if any data races are possible here. Say you collected the relevant VMAs into the vma_infos
array and started calling msync()
on each of these VMAs. What if another thread unmapped this VMA, mapped another VMA in the same memory range, and now your msync()
writes wrong values back into the file?
Hm, on the other hand, looks like it's normal behavior of Linux, where no guarantees for such cases are given for msync
.
LibOS/shim/src/fs/shim_fs_mem.c
line 98 at r2 (raw file):
assert((size_t)size <= buf_size); if (buf_size != mem->buf_size) { int ret = mem_file_resize(mem, buf_size);
Actually, what happens if the user supplies size = 0
? Won't we end up with buf_size = 0
and then mem_file_resize()
will fail with -ENOMEM
? Which is not how truncate is supposed to work.
LibOS/shim/src/fs/shim_fs_util.c
line 173 at r2 (raw file):
assert((size_t)count <= read_size); read_size -= count; read_addr += count;
Technically we shouldn't use void*
because it doesn't allow pointer arithmetic. We should technically use char*
, but I don't insist.
LibOS/shim/src/fs/shim_fs_util.c
line 195 at r2 (raw file):
int generic_emulated_msync(struct shim_handle* hdl, void* addr, size_t size, int prot, int flags, uint64_t offset) { assert(!(flags & MAP_PRIVATE));
Well, you could add more flags here, e.g. ANONYMOUS
. Why did you decide on this particular assert?
LibOS/shim/src/fs/shim_fs_util.c
line 222 at r2 (raw file):
} if (count == 0)
Can this happen?
LibOS/shim/src/fs/shim_fs_util.c
line 227 at r2 (raw file):
assert((size_t)count <= write_size); write_size -= count; write_addr += count;
ditto
LibOS/shim/src/fs/chroot/encrypted.c
line 31 at r2 (raw file):
* - truncate (the current `truncate` operation works only for extending the file, support for * truncation needs to be added to the `protected_files` module) * - flush all files on process exit
Why didn't you implement it in this PR? It will be a follow-up?
LibOS/shim/src/fs/chroot/encrypted.c
line 373 at r2 (raw file):
assert(hdl->inode->type == S_IFREG); int ret = msync_handle(hdl);
I'm confused here. Why do we have both msync_handle()
and encrypted_file_flush()
? Aren't they doing the same?
LibOS/shim/src/fs/chroot/encrypted.c
line 405 at r2 (raw file):
lock(&hdl->inode->lock); int ret = encrypted_file_read(enc, buf, count, *pos, &actual_count);
Ouch. So this bug went unnoticed because the only user of this callback was shim_do_read()
(which anyhow supplies &hdl->pos
in the pos
argument)? And now you added the mmap
emulation that uses this callback too, but from an arbitrary pos
, and that's how you exposed the bug?
Same question for the bug in chroot_encrypted_write()
below.
LibOS/shim/src/fs/tmpfs/fs.c
line 295 at r2 (raw file):
are currently not flexible enough for correct tmpfs implementation. In particular, shim_do_mmap() pre-allocates memory region at a specific address (making it impossible to have two mmaps on the same tmpfs file), and shim_do_munmap() doesn't have a callback into tmpfs at all. */
Actually, is it possible now to have two mmaps on the same tmpfs file? Did we change anything that enables this scenario now?
And since we're here, could we add such a case in our tests?
LibOS/shim/src/sys/shim_mmap.c
line 459 at r2 (raw file):
if (flags & MS_SYNC) { return msync_range(start, start + len);
Nice side-effect!
LibOS/shim/test/fs/test_tmpfs.py
line 69 at r2 (raw file):
pass # This overrides parent class to remove @expectedFailureIf(HAS_SGX)
What's the deal with this expectedFailureIf
? Why would the parent class have it at all? Isn't it better to change the parent class and remove this attribute completely? After all, now all mmaps should work. And then we won't need to overwrite these methods (well, maybe we'll need to adjust the timeouts, but it may be done in the parent class).
LibOS/shim/test/regression/mmap_file_emulated.c
line 7 at r2 (raw file):
/* * Test file mapping emulated by Gramine (encrypted: tmpfs): try reading and writing a file through
ecnrypted, tmpfs
(comma instead of colon)
LibOS/shim/test/regression/mmap_file_emulated.c
line 35 at r2 (raw file):
const char* path = argv[1]; setbuf(stdout, NULL);
Why do we care?
LibOS/shim/test/regression/mmap_file_emulated.c
line 67 at r2 (raw file):
if (fd == -1) err(1, "open");
Please remove one empty line
LibOS/shim/test/regression/mmap_file_emulated.c
line 69 at r2 (raw file):
void* addr = mmap(NULL, mmap_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
Actually, a test with offset != 0 would be also good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 12 of 17 files reviewed, 15 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: ITL), "squash! " and "fixup! " found in commit messages' one-liners (waiting on @dimakuv)
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
on an
->on
I added a squash commit.
LibOS/shim/include/shim_fs.h
line 113 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Why do we need
prot
argument here? UPDATE: Ok, because we can have a write-only protection and we want to read the contents of the VMA to write them into the file. So we memorize the original protections, then add R and perform msync, then restore original protections.
That's right, we need to know how the memory is currently mapped.
LibOS/shim/src/bookkeep/shim_vma.c
line 1266 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Looks like a potentially very slow operation because it iterates through all VMAs every time it's called. I guess nothing can be done about it, but do you think it can be a perf bottleneck? I can imagine big applications with thousands of VMAs allocated...
You're right. I used the function for dumping all VMAs, but it can be adapted to dump all VMAs in range (and they're stored in an AVL tree, so lookup will be cheap). Fixed.
LibOS/shim/src/bookkeep/shim_vma.c
line 1285 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
It's unclear to me if any data races are possible here. Say you collected the relevant VMAs into the
vma_infos
array and started callingmsync()
on each of these VMAs. What if another thread unmapped this VMA, mapped another VMA in the same memory range, and now yourmsync()
writes wrong values back into the file?Hm, on the other hand, looks like it's normal behavior of Linux, where no guarantees for such cases are given for
msync
.
Yes, unfortunately... This msync
could write wrong values, or even crash if the memory is unmapped by another thread.
I don't see an easy way around it: VMA operations take a lock, but you cannot do anything while holding that lock (since even malloc
might trigger mapping more memory).
LibOS/shim/src/fs/shim_fs_mem.c
line 98 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Actually, what happens if the user supplies
size = 0
? Won't we end up withbuf_size = 0
and thenmem_file_resize()
will fail with-ENOMEM
? Which is not how truncate is supposed to work.
Yeah... And I don't particularly want to deal with mem_file_resize
to 0. I'll introduce a minimum size for resizing.
LibOS/shim/src/fs/shim_fs_util.c
line 173 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Technically we shouldn't use
void*
because it doesn't allow pointer arithmetic. We should technically usechar*
, but I don't insist.
OK, let's use char*
.
LibOS/shim/src/fs/shim_fs_util.c
line 195 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Well, you could add more flags here, e.g.
ANONYMOUS
. Why did you decide on this particular assert?
It's hard for me to imagine that we'll call msync
on a non-file mapping, since hdl
and flags
likely come from the same source. But it's possible to make a mistake and call it on a private mapping, which should not be persisted to a file.
LibOS/shim/src/fs/shim_fs_util.c
line 222 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Can this happen?
I think it cannot, the current cases (tmpfs and protected-files). I'm not so sure about other filesystems that might use this in the future (especially if we're writing to host).
Still, this should be an error, not success.
LibOS/shim/src/fs/shim_fs_util.c
line 227 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
ditto
Converted to char*
.
LibOS/shim/src/fs/chroot/encrypted.c
line 31 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Why didn't you implement it in this PR? It will be a follow-up?
Actually, I'm not sure how to even do that. I would like to call flush()
on all the handles in the process, but do we have a list of them? There is the "handle map" which maps FDs to handles, but we might keep some open handles without FDs in the memory mappings (if someone did mmap()
and then close()
).
LibOS/shim/src/fs/chroot/encrypted.c
line 373 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
I'm confused here. Why do we have both
msync_handle()
andencrypted_file_flush()
? Aren't they doing the same?
flush
is a callback for the sync
system call, and it's supposed to also flush file mappings. We could call chroot_encrypted_msync
here, but we don't know what mappings we should call it on (and I thought it would be better NOT to duplicate this information in handle). So we call msync_handle
, means basically "find all shared mappings of this file and call msync
on them".
LibOS/shim/src/fs/chroot/encrypted.c
line 405 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Ouch. So this bug went unnoticed because the only user of this callback was
shim_do_read()
(which anyhow supplies&hdl->pos
in thepos
argument)? And now you added themmap
emulation that uses this callback too, but from an arbitrarypos
, and that's how you exposed the bug?Same question for the bug in
chroot_encrypted_write()
below.
It would also be tested by the LibOS/shim/test/fs
tests, since they use sendfile
, but I haven't switched them to test the encrypted
filesystem (instead of protected files) yet.
LibOS/shim/src/fs/tmpfs/fs.c
line 295 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Actually, is it possible now to have two mmaps on the same tmpfs file? Did we change anything that enables this scenario now?
And since we're here, could we add such a case in our tests?
It's possible, but unlike Linux, they will not be synchronized: if you modify one, the other will stay the same.
I added a second mapping to the test.
LibOS/shim/test/fs/test_tmpfs.py
line 69 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
What's the deal with this
expectedFailureIf
? Why would the parent class have it at all? Isn't it better to change the parent class and remove this attribute completely? After all, now all mmaps should work. And then we won't need to overwrite these methods (well, maybe we'll need to adjust the timeouts, but it may be done in the parent class).
Unfortunately, no, because test uses "allowed" files and tries to write to them using mmap
, which is something PAL doesn't support:
if (!(prot & PAL_PROT_WRITECOPY) && (prot & PAL_PROT_WRITE)) {
log_warning(
"file_map does not currently support writable pass-through mappings on SGX. You "
"may add the PAL_PROT_WRITECOPY (MAP_PRIVATE) flag to your file mapping to keep "
"the writes inside the enclave but they won't be reflected outside of the "
"enclave.");
return -PAL_ERROR_DENIED;
}
I think the current solution could be extended to allowed files, although I don't want to do it right in this PR.
LibOS/shim/test/regression/mmap_file_emulated.c
line 7 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
ecnrypted, tmpfs
(comma instead of colon)
Fixed.
LibOS/shim/test/regression/mmap_file_emulated.c
line 35 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Why do we care?
Otherwise, all the output from printf
is at the bottom (probably because libc doesn't see Gramine's stdout as a terminal, and sets it to block-buffered; and stderr is always unbuffered).
I guess we could output everything to stderr, since Gramine joins the streams anyway. But that's what we ended up doing in most of these tests: "test ok" using stdout, errors using stderr.
LibOS/shim/test/regression/mmap_file_emulated.c
line 67 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Please remove one empty line
Done.
LibOS/shim/test/regression/mmap_file_emulated.c
line 69 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Actually, a test with offset != 0 would be also good.
I added a second mapping with offset != 0. This covers the following cases: two mappings, shared/private, and different offsets.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 5 of 5 files at r3, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: ITL), "squash! " and "fixup! " found in commit messages' one-liners (waiting on @boryspoplawski, @mkow, and @pwmarcz)
LibOS/shim/src/bookkeep/shim_vma.c
line 1266 at r2 (raw file):
Previously, pwmarcz (Paweł Marczewski) wrote…
You're right. I used the function for dumping all VMAs, but it can be adapted to dump all VMAs in range (and they're stored in an AVL tree, so lookup will be cheap). Fixed.
Awesome solution!
LibOS/shim/src/bookkeep/shim_vma.c
line 1285 at r2 (raw file):
Previously, pwmarcz (Paweł Marczewski) wrote…
Yes, unfortunately... This
msync
could write wrong values, or even crash if the memory is unmapped by another thread.I don't see an easy way around it: VMA operations take a lock, but you cannot do anything while holding that lock (since even
malloc
might trigger mapping more memory).
Yep. Anyway, resolved.
LibOS/shim/src/fs/chroot/encrypted.c
line 31 at r2 (raw file):
Previously, pwmarcz (Paweł Marczewski) wrote…
Actually, I'm not sure how to even do that. I would like to call
flush()
on all the handles in the process, but do we have a list of them? There is the "handle map" which maps FDs to handles, but we might keep some open handles without FDs in the memory mappings (if someone didmmap()
and thenclose()
).
Oh no. @boryspoplawski @mkow Do you know how we can collect all handles, even the ones without an atteched FD?
But anyway, I think this is for a follow-up PR.
LibOS/shim/src/fs/chroot/encrypted.c
line 373 at r2 (raw file):
Previously, pwmarcz (Paweł Marczewski) wrote…
flush
is a callback for thesync
system call, and it's supposed to also flush file mappings. We could callchroot_encrypted_msync
here, but we don't know what mappings we should call it on (and I thought it would be better NOT to duplicate this information in handle). So we callmsync_handle
, means basically "find all shared mappings of this file and callmsync
on them".
Ok. Pawel's explanation didn't help me much, but I explained it to myself like this:
msync_handle()
ends up inipf_write()
-- which performs memcpy from mmapped memory region into the corresponding nodes of the protected/encrypted file (i.e., from the mmap memory region into the file-blocks-in-plaintext memory region)encrypted_file_flush()
ends up inipf_internal_flush()
-- which performs writes of file-blocks-in-plaintext into the host-level files.
So we have two levels of indirection here: first sync from mmap region to file-blocks region, then sync from file-blocks region to host-level file. That's why two functions are used together.
Maybe will be helpful for other reviewers or for future.
LibOS/shim/src/fs/tmpfs/fs.c
line 295 at r2 (raw file):
Previously, pwmarcz (Paweł Marczewski) wrote…
It's possible, but unlike Linux, they will not be synchronized: if you modify one, the other will stay the same.
I added a second mapping to the test.
Wait, but is it Ok?
LibOS/shim/test/regression/mmap_file_emulated.c
line 69 at r2 (raw file):
Previously, pwmarcz (Paweł Marczewski) wrote…
I added a second mapping with offset != 0. This covers the following cases: two mappings, shared/private, and different offsets.
Thanks, looks awesome!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 11 of 17 files at r1, 1 of 1 files at r2, 5 of 5 files at r3, all commit messages.
Reviewable status: all files reviewed, 33 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: ITL), "squash! " and "fixup! " found in commit messages' one-liners (waiting on @dimakuv, @mkow, and @pwmarcz)
-- commits
line 3 at r3:
Most of these commits seem unrelated to each other, would be better to have them in separate PRs, but I guess too late now.
LibOS/shim/include/shim_fs.h
line 89 at r3 (raw file):
* * \param hdl File handle. * \param addr Address of the memory region. Cannot be NULL.
Why do we need this artificial disallowance of NULL
?
Update: is it because it's passed as is to PAL API which still requires it?
LibOS/shim/include/shim_fs.h
line 91 at r3 (raw file):
* \param addr Address of the memory region. Cannot be NULL. * \param size Size of the memory region. * \param prot Permissions for the memory region (`PAL_PROT_*`).
Why these are PAL_PROT_*
??
Update: from what I can see these are Linux prot, not PAL...
LibOS/shim/include/shim_fs.h
line 112 at r3 (raw file):
* region. */ int (*msync)(struct shim_handle* hdl, void* addr, size_t size, int prot, int flags,
Please comment all args.
LibOS/shim/include/shim_fs.h
line 112 at r3 (raw file):
* region. */ int (*msync)(struct shim_handle* hdl, void* addr, size_t size, int prot, int flags,
Why do we need flags
argument?
LibOS/shim/include/shim_vma.h
line 131 at r3 (raw file):
/* Call `msync` for file mappings of `hdl` */ int msync_handle(struct shim_handle* hdl);
Why do we need this? In which cases we need to call msync based on handle not address range?
LibOS/shim/src/bookkeep/shim_vma.c
line 1119 at r3 (raw file):
for (vma = _lookup_vma(begin); vma && vma->begin < end; vma = _get_next_vma(vma)) { if (vma->end <= begin)
How is this case possible?
LibOS/shim/src/bookkeep/shim_vma.c
line 1173 at r3 (raw file):
int dump_all_vmas(struct shim_vma_info** ret_infos, size_t* ret_count, bool include_unmapped) { return dump_vmas(ret_infos, ret_count, /*begin=*/0, /*end=*/(uintptr_t)-1,
Please use UINTPTR_MAX
.
LibOS/shim/src/bookkeep/shim_vma.c
line 1174 at r3 (raw file):
int dump_all_vmas(struct shim_vma_info** ret_infos, size_t* ret_count, bool include_unmapped) { return dump_vmas(ret_infos, ret_count, /*begin=*/0, /*end=*/(uintptr_t)-1, include_unmapped ? &vma_filter_all : &vma_filter_exclude_unmapped,
No need for &
with function pointers.
LibOS/shim/src/bookkeep/shim_vma.c
line 1246 at r3 (raw file):
return false; if (!vma->file)
Is this case possible? We check for MAP_ANONYMOUS
above.
LibOS/shim/src/bookkeep/shim_vma.c
line 1252 at r3 (raw file):
return false; if (!(vma->file->acc_mode & MAY_WRITE))
Don't we need a lock for this?
Conceptually we shouldn't, but on the current master I think it's required, but I'm not sure.
LibOS/shim/src/bookkeep/shim_vma.c
line 1261 at r3 (raw file):
} static int _msync_all(uintptr_t begin, uintptr_t end, struct shim_handle* hdl) {
Please assert that begin
and end
are page aligned.
LibOS/shim/src/bookkeep/shim_vma.c
line 1261 at r3 (raw file):
} static int _msync_all(uintptr_t begin, uintptr_t end, struct shim_handle* hdl) {
We usually use names with leading underscore to highlight functions which must be called with some locks taken.
Not blocking - this is rather informal convention.
LibOS/shim/src/bookkeep/shim_vma.c
line 1265 at r3 (raw file):
size_t count; int ret = dump_vmas(&vma_infos, &count, begin, end, &vma_filter_needs_msync, hdl);
No need for &
with function pointer
LibOS/shim/src/bookkeep/shim_vma.c
line 1296 at r3 (raw file):
int msync_handle(struct shim_handle* hdl) { return _msync_all(/*begin=*/0, /*end=*/(uintptr_t)-1, hdl);
UINTPTR_MAX
LibOS/shim/src/fs/shim_fs_mem.c
line 91 at r3 (raw file):
size_t buf_size = mem->buf_size; if ((size_t)size > buf_size) {
Could we have this condition in the same order as the one below?
LibOS/shim/src/fs/shim_fs_mem.c
line 98 at r3 (raw file):
} } else { while (buf_size / 2 > (size_t)size)
Shouldn't this be >=
?
LibOS/shim/src/fs/shim_fs_util.c
line 238 at r3 (raw file):
if (pal_prot != pal_prot_readable) { int protect_ret = DkVirtualMemoryProtect(addr, size, pal_prot); if (ret < 0) {
This should be protect_ret
LibOS/shim/src/fs/shim_fs_util.c
line 240 at r3 (raw file):
if (ret < 0) { log_debug("%s: DkVirtualMemoryProtect failed on cleanup: %d; memory will be left " "readable", __func__, protect_ret);
I think we should kill the app, since the state is inconsistent (and now the app won't even know about it).
LibOS/shim/src/fs/chroot/encrypted.c
line 31 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Oh no. @boryspoplawski @mkow Do you know how we can collect all handles, even the ones without an atteched FD?
But anyway, I think this is for a follow-up PR.
I think the only way is to get rid of all handle references and have flushing on release. To do that we would probably need to close all fds and unmmap all user memory.
LibOS/shim/src/fs/chroot/encrypted.c
line 373 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Ok. Pawel's explanation didn't help me much, but I explained it to myself like this:
msync_handle()
ends up inipf_write()
-- which performs memcpy from mmapped memory region into the corresponding nodes of the protected/encrypted file (i.e., from the mmap memory region into the file-blocks-in-plaintext memory region)encrypted_file_flush()
ends up inipf_internal_flush()
-- which performs writes of file-blocks-in-plaintext into the host-level files.So we have two levels of indirection here: first sync from mmap region to file-blocks region, then sync from file-blocks region to host-level file. That's why two functions are used together.
Maybe will be helpful for other reviewers or for future.
I'm still a bit confused. Why we don't have to call msync_handle
inside encrypted_file_flush
?
LibOS/shim/src/fs/tmpfs/fs.c
line 295 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Wait, but is it Ok?
+1 it seems that it's impossible to have correct mmap on tmpfs with SGX PAL.
LibOS/shim/src/sys/shim_exit.c
line 18 at r3 (raw file):
#include "shim_thread.h" #include "shim_utils.h" #include "shim_vma.h"
Why we need this?
LibOS/shim/src/sys/shim_mmap.c
line 140 at r3 (raw file):
goto out_handle; } /* Flush any file mappings we're about to replace */
Is this correct? In case of MAP_FIXED_NOREPLACE
we might fail without doing anything. I guess random msync
is still correct, because it only affects existing mappings (so we not need to worry if the memory is present), which are shared, so they can be flushed at any point?
LibOS/shim/src/sys/shim_mmap.c
line 449 at r3 (raw file):
} if (flags & MS_INVALIDATE) {
NP and not blocking: maybe move this to the if below, which kind of iterates through all flags?
LibOS/shim/test/regression/mmap_file_emulated.c
line 29 at r3 (raw file):
int main(int argc, char** argv) { ssize_t ret; int fd;
Let's move these to lines with first usage (we use c11 so we can :) )
LibOS/shim/test/regression/mmap_file_emulated.c
line 62 at r3 (raw file):
err(1, "failed to write file"); if ((size_t)ret < MESSAGE_LEN) err(1, "not enough bytes written");
NP: errx
LibOS/shim/test/regression/mmap_file_emulated.c
line 74 at r3 (raw file):
err(1, "failed to write file"); if ((size_t)ret < MESSAGE_LEN) err(1, "not enough bytes written");
ditto
LibOS/shim/test/regression/mmap_file_emulated.c
line 80 at r3 (raw file):
err(1, "close"); printf("CREATE OK\n");
NP: puts
LibOS/shim/test/regression/mmap_file_emulated.c
line 85 at r3 (raw file):
fd = open(path, O_RDWR, 0); if (fd == -1)
NP: this check differs from all others (< 0
)
LibOS/shim/test/regression/mmap_file_emulated.c
line 104 at r3 (raw file):
for (size_t i = MESSAGE_LEN; i < mmap_size; i++) { if (addr_private[i] != 0) errx(1, "unexpected non-zero byte at addr_private[%zu]", i);
Can we also check addr_shared
?
LibOS/shim/test/regression/mmap_file_emulated.c
line 107 at r3 (raw file):
} printf("MAP OK\n");
NP: puts
LibOS/shim/test/regression/mmap_file_emulated.c
line 122 at r3 (raw file):
err(1, "munmap"); ret = close(fd);
Could you move close
to right after mmaps? To make sure these changes are flushed on munmap
, not close
.
LibOS/shim/test/regression/mmap_file_emulated.c
line 126 at r3 (raw file):
err(1, "close"); printf("WRITE OK\n");
ditto (puts
)
LibOS/shim/test/regression/mmap_file_emulated.c
line 132 at r3 (raw file):
char buf[file_size]; ret = posix_file_read(path, buf, sizeof(buf));
Shouldn't we also check ret < 0
?
LibOS/shim/test/regression/mmap_file_emulated.c
line 142 at r3 (raw file):
errx(1, "wrong file content"); printf("TEST OK\n");
ditto (puts
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 8 of 17 files reviewed, 33 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: ITL), "fixup! " found in commit messages' one-liners (waiting on @boryspoplawski, @dimakuv, and @mkow)
LibOS/shim/include/shim_fs.h
line 89 at r3 (raw file):
Previously, boryspoplawski (Borys Popławski) wrote…
Why do we need this artificial disallowance of
NULL
?
Update: is it because it's passed as is to PAL API which still requires it?
Yes. PAL interprets NULL as "use any address".
LibOS/shim/include/shim_fs.h
line 91 at r3 (raw file):
Previously, boryspoplawski (Borys Popławski) wrote…
Why these are
PAL_PROT_*
??
Update: from what I can see these are Linux prot, not PAL...
Right. Fixed.
LibOS/shim/include/shim_fs.h
line 112 at r3 (raw file):
Previously, boryspoplawski (Borys Popławski) wrote…
Please comment all args.
OK, they're the same as mmap
though. I modified the description to make that clear.
LibOS/shim/include/shim_fs.h
line 112 at r3 (raw file):
Previously, boryspoplawski (Borys Popławski) wrote…
Why do we need
flags
argument?
See the implementation. We sometimes need to call PAL to change permissions, and that requires both prot
and flags
:
pal_prot_flags_t pal_prot = LINUX_PROT_TO_PAL(prot, flags);
LibOS/shim/include/shim_vma.h
line 131 at r3 (raw file):
Previously, boryspoplawski (Borys Popławski) wrote…
Why do we need this? In which cases we need to call msync based on handle not address range?
When the user requests to sync a specific handle, using the sync
syscall (our flush
callback).
LibOS/shim/src/bookkeep/shim_vma.c
line 1119 at r3 (raw file):
Previously, boryspoplawski (Borys Popławski) wrote…
How is this case possible?
My mistake, somehow I thought _lookup_vma
returns the VMA before address, not after. Removed.
LibOS/shim/src/bookkeep/shim_vma.c
line 1173 at r3 (raw file):
Previously, boryspoplawski (Borys Popławski) wrote…
Please use
UINTPTR_MAX
.
Done.
LibOS/shim/src/bookkeep/shim_vma.c
line 1174 at r3 (raw file):
Previously, boryspoplawski (Borys Popławski) wrote…
No need for
&
with function pointers.
I thought we use &
for clarity; I pretty universally used it in my code so far and nobody complained (most notably, all the code where we setup pseudofs). On the other hand, I see @mkow not using &
in his sysfs code with callbacks.
Removed &
for now.
LibOS/shim/src/bookkeep/shim_vma.c
line 1246 at r3 (raw file):
Previously, boryspoplawski (Borys Popławski) wrote…
Is this case possible? We check for
MAP_ANONYMOUS
above.
Yeah, it makes sense that vma->file
would always be present. I wasn't sure because this module never enforces or documents this, so we depend on callers.
Replaced with assertion.
LibOS/shim/src/bookkeep/shim_vma.c
line 1252 at r3 (raw file):
Previously, boryspoplawski (Borys Popławski) wrote…
Don't we need a lock for this?
Conceptually we shouldn't, but on the current master I think it's required, but I'm not sure.
Yeah, we still modify it in socket code. I'll add a lock and a TODO here.
LibOS/shim/src/bookkeep/shim_vma.c
line 1261 at r3 (raw file):
Previously, boryspoplawski (Borys Popławski) wrote…
Please assert that
begin
andend
are page aligned.
OK. This is a bit awkward because of UINTPTR_MAX
, but I don't know how to do it better.
LibOS/shim/src/bookkeep/shim_vma.c
line 1261 at r3 (raw file):
Previously, boryspoplawski (Borys Popławski) wrote…
We usually use names with leading underscore to highlight functions which must be called with some locks taken.
Not blocking - this is rather informal convention.
I called this _msync_all
back when I actually had a function called msync_all
(it wasn't about a lock, but about defining a helper function). That's not necessary now, so I'm removing the underscore.
LibOS/shim/src/bookkeep/shim_vma.c
line 1265 at r3 (raw file):
Previously, boryspoplawski (Borys Popławski) wrote…
No need for
&
with function pointer
Done.
LibOS/shim/src/bookkeep/shim_vma.c
line 1296 at r3 (raw file):
Previously, boryspoplawski (Borys Popławski) wrote…
UINTPTR_MAX
Done.
LibOS/shim/src/fs/shim_fs_mem.c
line 91 at r3 (raw file):
Previously, boryspoplawski (Borys Popławski) wrote…
Could we have this condition in the same order as the one below?
Done (and same in the function above).
LibOS/shim/src/fs/shim_fs_mem.c
line 98 at r3 (raw file):
Previously, boryspoplawski (Borys Popławski) wrote…
Shouldn't this be
>=
?
Makes sense, but then I have to also add a case for size == 0
. Done.
LibOS/shim/src/fs/shim_fs_util.c
line 238 at r3 (raw file):
Previously, boryspoplawski (Borys Popławski) wrote…
This should be
protect_ret
Fixed.
LibOS/shim/src/fs/shim_fs_util.c
line 240 at r3 (raw file):
Previously, boryspoplawski (Borys Popławski) wrote…
I think we should kill the app, since the state is inconsistent (and now the app won't even know about it).
Added BUG()
.
LibOS/shim/src/fs/chroot/encrypted.c
line 31 at r2 (raw file):
Previously, boryspoplawski (Borys Popławski) wrote…
I think the only way is to get rid of all handle references and have flushing on release. To do that we would probably need to close all fds and unmmap all user memory.
Sounds good.
LibOS/shim/src/fs/chroot/encrypted.c
line 373 at r2 (raw file):
Previously, boryspoplawski (Borys Popławski) wrote…
I'm still a bit confused. Why we don't have to call
msync_handle
insideencrypted_file_flush
?
encrypted_file_flush
only works with the encrypted-file object. It has no idea that we're also keeping emulated mappings (and in fact, does not depend on shim_handle
or shim_vma
or anything else in the system; it just takes care of that one object).
I added comments here to explain that better.
LibOS/shim/src/fs/tmpfs/fs.c
line 295 at r2 (raw file):
Previously, boryspoplawski (Borys Popławski) wrote…
+1 it seems that it's impossible to have correct mmap on tmpfs with SGX PAL.
What do you mean? Emulated mappings should work exactly the same for tmpfs and encrypted fs:
- it is possible to have many mappings of a single file,
- the mappings are not updated automatically, we just fill them on mmap and write back to file on mmap.
This is not "correct", but a useful approximation, and what we always had for protected files.
LibOS/shim/src/sys/shim_mmap.c
line 140 at r3 (raw file):
Previously, boryspoplawski (Borys Popławski) wrote…
Is this correct? In case of
MAP_FIXED_NOREPLACE
we might fail without doing anything. I guess randommsync
is still correct, because it only affects existing mappings (so we not need to worry if the memory is present), which are shared, so they can be flushed at any point?
Yes. But you're right, we can skip this on MAP_FIXED_NOREPLACE
.
LibOS/shim/src/sys/shim_mmap.c
line 449 at r3 (raw file):
Previously, boryspoplawski (Borys Popławski) wrote…
NP and not blocking: maybe move this to the if below, which kind of iterates through all flags?
Moved below.
LibOS/shim/test/regression/mmap_file_emulated.c
line 29 at r3 (raw file):
Previously, boryspoplawski (Borys Popławski) wrote…
Let's move these to lines with first usage (we use c11 so we can :) )
OK.
LibOS/shim/test/regression/mmap_file_emulated.c
line 62 at r3 (raw file):
Previously, boryspoplawski (Borys Popławski) wrote…
NP:
errx
Done.
LibOS/shim/test/regression/mmap_file_emulated.c
line 74 at r3 (raw file):
Previously, boryspoplawski (Borys Popławski) wrote…
ditto
Done.
LibOS/shim/test/regression/mmap_file_emulated.c
line 80 at r3 (raw file):
Previously, boryspoplawski (Borys Popławski) wrote…
NP:
puts
Done.
LibOS/shim/test/regression/mmap_file_emulated.c
line 85 at r3 (raw file):
Previously, boryspoplawski (Borys Popławski) wrote…
NP: this check differs from all others (
< 0
)
Fixed.
LibOS/shim/test/regression/mmap_file_emulated.c
line 104 at r3 (raw file):
Previously, boryspoplawski (Borys Popławski) wrote…
Can we also check
addr_shared
?
OK, sure.
LibOS/shim/test/regression/mmap_file_emulated.c
line 107 at r3 (raw file):
Previously, boryspoplawski (Borys Popławski) wrote…
NP:
puts
Done.
LibOS/shim/test/regression/mmap_file_emulated.c
line 122 at r3 (raw file):
Previously, boryspoplawski (Borys Popławski) wrote…
Could you move
close
to right after mmaps? To make sure these changes are flushed onmunmap
, notclose
.
Done.
LibOS/shim/test/regression/mmap_file_emulated.c
line 126 at r3 (raw file):
Previously, boryspoplawski (Borys Popławski) wrote…
ditto (
puts
)
Done.
LibOS/shim/test/regression/mmap_file_emulated.c
line 132 at r3 (raw file):
Previously, boryspoplawski (Borys Popławski) wrote…
Shouldn't we also check
ret < 0
?
Right. Fixed.
LibOS/shim/test/regression/mmap_file_emulated.c
line 142 at r3 (raw file):
Previously, boryspoplawski (Borys Popławski) wrote…
ditto (
puts
)
Done.
LibOS/shim/src/sys/shim_exit.c
line 18 at r3 (raw file):
Previously, boryspoplawski (Borys Popławski) wrote…
Why we need this?
Leftover from earlier attempt to flush-on-exit, before I decided I don't want to do it now. Fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 9 of 9 files at r4, all commit messages.
Reviewable status: all files reviewed, 4 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: ITL), "fixup! " found in commit messages' one-liners (waiting on @dimakuv, @mkow, and @pwmarcz)
LibOS/shim/include/shim_fs.h
line 89 at r3 (raw file):
Previously, pwmarcz (Paweł Marczewski) wrote…
Yes. PAL interprets NULL as "use any address".
Right
LibOS/shim/include/shim_fs.h
line 112 at r3 (raw file):
Previously, pwmarcz (Paweł Marczewski) wrote…
See the implementation. We sometimes need to call PAL to change permissions, and that requires both
prot
andflags
:pal_prot_flags_t pal_prot = LINUX_PROT_TO_PAL(prot, flags);
Not sure how much of an implementation details it is, but they only care for MAP_PRIVATE
, which we know to be false in msync
.
LibOS/shim/src/bookkeep/shim_vma.c
line 1174 at r3 (raw file):
Previously, pwmarcz (Paweł Marczewski) wrote…
I thought we use
&
for clarity; I pretty universally used it in my code so far and nobody complained (most notably, all the code where we setup pseudofs). On the other hand, I see @mkow not using&
in his sysfs code with callbacks.Removed
&
for now.
Hmm, I guess we have no official policy and it's a matter of preference. Should have mentioned it, but not blocked, my bad.
I personally prefer no &
, because these are function pointers, and that should be clear from the name itself and the &
is not needed, like, we already know it's a function/pointer.
LibOS/shim/src/bookkeep/shim_vma.c
line 1252 at r3 (raw file):
Previously, pwmarcz (Paweł Marczewski) wrote…
Yeah, we still modify it in socket code. I'll add a lock and a TODO here.
Hmmm, actually, are we allowed to take any locks here? We are inside vmas traversing function, under global vma lock, I think.
In reality acc_mode
can't change for anything other than sockets, so maybe we can just ignore the lock, leave a XXX
describing the situation, to be removed when we clear semantics of accessing acc_mode
(which we should do after sockets rewrite)
wdyt? Above I assumed that only sockets are the issue.
LibOS/shim/src/bookkeep/shim_vma.c
line 1261 at r3 (raw file):
Previously, pwmarcz (Paweł Marczewski) wrote…
OK. This is a bit awkward because of
UINTPTR_MAX
, but I don't know how to do it better.
Right. But I guess it's ok
LibOS/shim/src/fs/chroot/encrypted.c
line 373 at r2 (raw file):
Previously, pwmarcz (Paweł Marczewski) wrote…
encrypted_file_flush
only works with the encrypted-file object. It has no idea that we're also keeping emulated mappings (and in fact, does not depend onshim_handle
orshim_vma
or anything else in the system; it just takes care of that one object).I added comments here to explain that better.
Well, I'm not familiar with the new PF implementation, so I would have to check it out. Resolving.
LibOS/shim/src/fs/tmpfs/fs.c
line 295 at r2 (raw file):
Previously, pwmarcz (Paweł Marczewski) wrote…
What do you mean? Emulated mappings should work exactly the same for tmpfs and encrypted fs:
- it is possible to have many mappings of a single file,
- the mappings are not updated automatically, we just fill them on mmap and write back to file on mmap.
This is not "correct", but a useful approximation, and what we always had for protected files.
I guess nothing really relies on mapping the same file twice? Maybe the current version won't hurt anybody, idk
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 3 of 9 files at r4, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions, not enough approvals from maintainers (1 more required), "fixup! " found in commit messages' one-liners (waiting on @boryspoplawski and @pwmarcz)
LibOS/shim/src/bookkeep/shim_vma.c
line 1252 at r3 (raw file):
Previously, boryspoplawski (Borys Popławski) wrote…
Hmmm, actually, are we allowed to take any locks here? We are inside vmas traversing function, under global vma lock, I think.
In realityacc_mode
can't change for anything other than sockets, so maybe we can just ignore the lock, leave aXXX
describing the situation, to be removed when we clear semantics of accessingacc_mode
(which we should do after sockets rewrite)
wdyt? Above I assumed that only sockets are the issue.
+1. I don't like the idea of having a lock in this function. I think we can safely assume that after the previous check on msync
callback existence, we can be sure that it's a file handle and then its acc_mode
is never changed. We can add a comment about this.
LibOS/shim/src/fs/tmpfs/fs.c
line 295 at r2 (raw file):
Previously, boryspoplawski (Borys Popławski) wrote…
I guess nothing really relies on mapping the same file twice? Maybe the current version won't hurt anybody, idk
Yeah, nobody ever complained about this. So I think this is not a realistic scenario. Indeed, why would an app want to map the same file twice. I'm resolving.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 15 of 17 files reviewed, 2 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: ITL), "fixup! " found in commit messages' one-liners (waiting on @boryspoplawski and @dimakuv)
LibOS/shim/include/shim_fs.h
line 112 at r3 (raw file):
Previously, boryspoplawski (Borys Popławski) wrote…
Not sure how much of an implementation details it is, but they only care for
MAP_PRIVATE
, which we know to be false inmsync
.
Oooh, actually, the PAL mapping should not be shared. The Linux-SGX PAL ignores that distinction anyway, but we don't want Linux PAL to create a shared region between processes. So right, the flags
parameter will be unused here.
I still want to leave the parameter, because:
- it makes for a nice symmetry with the
mmap
callback: they have the same parameters, omitting just one of them would be surprising, - we get to have an
assert
that we're not flushing a Gramine-levelMAP_PRIVATE
mapping.
LibOS/shim/src/bookkeep/shim_vma.c
line 1252 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
+1. I don't like the idea of having a lock in this function. I think we can safely assume that after the previous check on
msync
callback existence, we can be sure that it's a file handle and then itsacc_mode
is never changed. We can add a comment about this.
Oh, right, this is under vma_tree_lock
, I shouldn't do that.
Changed as suggested.
LibOS/shim/src/fs/tmpfs/fs.c
line 295 at r2 (raw file):
Indeed, why would an app want to map the same file twice.
Well, there's the magic ring buffer trick :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 2 files at r5, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions, not enough approvals from maintainers (1 more required), "fixup! " found in commit messages' one-liners (waiting on @boryspoplawski)
LibOS/shim/src/fs/tmpfs/fs.c
line 295 at r2 (raw file):
Previously, pwmarcz (Paweł Marczewski) wrote…
Indeed, why would an app want to map the same file twice.
Well, there's the magic ring buffer trick :)
One memcpy
instead of two memcpy
s to allow the compiler to apply vectorization optimizations? Wow, I wonder if people are actually using such tricks to squeze out an additional... 0.001% performance? But maybe I'm wrong, maybe it's useful somewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 2 files at r5, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions, not enough approvals from maintainers (1 more required), "fixup! " found in commit messages' one-liners (waiting on @pwmarcz)
LibOS/shim/include/shim_fs.h
line 112 at r3 (raw file):
Ok, let's leave it.
we don't want Linux PAL to create a shared region between processes
Why is that? Ideally most of the PALs should support shared mappings, so should msync. I guess some of filesystems (e.g. tmpfs) are fully emulated and as such won't create a shared mapping, but that's not part of this interface.
LibOS/shim/src/fs/shim_fs_util.c
line 147 at r5 (raw file):
int ret; /* The underlying PAL mapping should always be private. */
I think this deserves more explanation, i.e. why it should be private
LibOS/shim/src/fs/shim_fs_util.c
line 206 at r5 (raw file):
unlock(&hdl->inode->lock); /* The underlying PAL mapping should always be private. */
ditto
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 17 files at r1, 1 of 1 files at r2.
Reviewable status: all files reviewed, 2 unresolved discussions, not enough approvals from maintainers (1 more required), "fixup! " found in commit messages' one-liners (waiting on @pwmarcz)
LibOS/shim/src/fs/chroot/encrypted.c
line 31 at r2 (raw file):
Previously, pwmarcz (Paweł Marczewski) wrote…
Sounds good.
+1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 16 of 17 files reviewed, 2 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: ITL), "fixup! " found in commit messages' one-liners (waiting on @boryspoplawski, @dimakuv, and @pwmarcz)
LibOS/shim/include/shim_fs.h
line 112 at r3 (raw file):
Previously, boryspoplawski (Borys Popławski) wrote…
Ok, let's leave it.
we don't want Linux PAL to create a shared region between processes
Why is that? Ideally most of the PALs should support shared mappings, so should msync. I guess some of filesystems (e.g. tmpfs) are fully emulated and as such won't create a shared mapping, but that's not part of this interface.
I agree that this interface (mmap, msync) should include flags
, and allow shared mappings.
I meant that the current implementation of emulated mappings (exactly the case you mention) should probably not share the underlying mappings between processes. It could end up flushing changes made by another process, and I'm not sure if it's a good idea.
However, I just realized that it actually doesn't matter: even if we create a shared anonymous mapping, it won't really be shared with another process. LibOS in the child process will just create an identical mapping during checkpoint restore.
So I think both solutions (using LINUX_PROT_TO_PAL(prot, flags)
and using LINUX_PROT_TO_PAL(prot, MAP_PRIVATE)
) make some sense. I prefer to use the first one, for simplicity: we do the same elsewhere, and I don't want a special case here that actually has zero effect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r6, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions, not enough approvals from maintainers (1 more required), "fixup! " found in commit messages' one-liners (waiting on @pwmarcz)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed all commit messages.
Reviewable status: all files reviewed, 3 unresolved discussions, not enough approvals from maintainers (1 more required), "fixup! " found in commit messages' one-liners (waiting on @pwmarcz)
LibOS/shim/include/shim_fs.h
line 112 at r3 (raw file):
Previously, pwmarcz (Paweł Marczewski) wrote…
I agree that this interface (mmap, msync) should include
flags
, and allow shared mappings.I meant that the current implementation of emulated mappings (exactly the case you mention) should probably not share the underlying mappings between processes. It could end up flushing changes made by another process, and I'm not sure if it's a good idea.
However, I just realized that it actually doesn't matter: even if we create a shared anonymous mapping, it won't really be shared with another process. LibOS in the child process will just create an identical mapping during checkpoint restore.
So I think both solutions (using
LINUX_PROT_TO_PAL(prot, flags)
and usingLINUX_PROT_TO_PAL(prot, MAP_PRIVATE)
) make some sense. I prefer to use the first one, for simplicity: we do the same elsewhere, and I don't want a special case here that actually has zero effect.
But these are 2 different versions, because (as per comment) flags
must contain MAP_SHARED
(not MAP_PRIVATE
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 2 unresolved discussions, not enough approvals from maintainers (1 more required), "fixup! " found in commit messages' one-liners (waiting on @pwmarcz)
LibOS/shim/include/shim_fs.h
line 112 at r3 (raw file):
Previously, boryspoplawski (Borys Popławski) wrote…
But these are 2 different versions, because (as per comment)
flags
must containMAP_SHARED
(notMAP_PRIVATE
).
I mean, one day, when we have a fully working shared mappings on some PALs, how do one find that these places needs to be fixed? Or maybe the wouldn't need to be?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r6.
Reviewable status: all files reviewed, all discussions resolved, "fixup! " found in commit messages' one-liners
LibOS/shim/include/shim_fs.h
line 112 at r3 (raw file):
Previously, boryspoplawski (Borys Popławski) wrote…
I mean, one day, when we have a fully working shared mappings on some PALs, how do one find that these places needs to be fixed? Or maybe the wouldn't need to be?
Ok, I guess these are implementations of mmap
and msync
, so they will need to be revisited then.
23ea777
to
2628ef6
Compare
Signed-off-by: Paweł Marczewski <[email protected]>
Repeated `truncate` operations on in-memory files always ended up reallocating the buffer. As a result, the `copy_mmap_seq` FS test (which I'm about to enable for tmpfs) was very slow. Signed-off-by: Paweł Marczewski <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 2 files at r7, all commit messages.
Reviewable status: all files reviewed, all discussions resolved, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: ITL)
This adds support for "emulated" memory mappings in LibOS. Memory is written back to the file using a new `msync` callback. Signed-off-by: Paweł Marczewski <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 4 of 9 files at r4, 2 of 2 files at r7, all commit messages.
Reviewable status: complete! all files reviewed, all discussions resolved
Description of the changes
This adds support for "emulated" memory mappings in LibOS. Memory is written back to the file using a new
msync
callback.This (and #539) bring us to feature parity with SGX protected files, so the next step will be to replace the old implementation (see issue: #371).
How to test this PR?
There is a new LibOS regression test,
mmap_file_emulated
, that checks reading and writing an emulated file. In addition, some tmpfs tests are now unskipped.This change is