Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[LibOS] Implement mmap for encrypted and tmpfs files #550

Merged
merged 3 commits into from
Apr 29, 2022

Conversation

pwmarcz
Copy link
Contributor

@pwmarcz pwmarcz commented Apr 26, 2022

Description of the changes

This adds support for "emulated" memory mappings in LibOS. Memory is written back to the file using a new msync callback.

This (and #539) bring us to feature parity with SGX protected files, so the next step will be to replace the old implementation (see issue: #371).

How to test this PR?

There is a new LibOS regression test, mmap_file_emulated, that checks reading and writing an emulated file. In addition, some tmpfs tests are now unskipped.


This change is Reviewable

Copy link

@dimakuv dimakuv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 16 of 17 files at r1, 1 of 1 files at r2, all commit messages.
Reviewable status: all files reviewed, 18 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: ITL), "fixup! " found in commit messages' one-liners (waiting on @pwmarcz)


-- commits line 9 at r2:
on an -> on


LibOS/shim/include/shim_fs.h line 113 at r2 (raw file):

     */
    int (*msync)(struct shim_handle* hdl, void* addr, size_t size, int prot, int flags,
                uint64_t offset);

Why do we need prot argument here? UPDATE: Ok, because we can have a write-only protection and we want to read the contents of the VMA to write them into the file. So we memorize the original protections, then add R and perform msync, then restore original protections.


LibOS/shim/src/bookkeep/shim_vma.c line 1266 at r2 (raw file):

}

static int _msync_all(uintptr_t begin, uintptr_t end, struct shim_handle* hdl) {

Looks like a potentially very slow operation because it iterates through all VMAs every time it's called. I guess nothing can be done about it, but do you think it can be a perf bottleneck? I can imagine big applications with thousands of VMAs allocated...


LibOS/shim/src/bookkeep/shim_vma.c line 1285 at r2 (raw file):

        uintptr_t msync_end = MIN(end, (uintptr_t)vma_info->addr + vma_info->length);
        ret = file->fs->fs_ops->msync(file, (void*)msync_begin, msync_end - msync_begin,
                                      vma_info->prot, vma_info->flags, vma_info->file_offset);

It's unclear to me if any data races are possible here. Say you collected the relevant VMAs into the vma_infos array and started calling msync() on each of these VMAs. What if another thread unmapped this VMA, mapped another VMA in the same memory range, and now your msync() writes wrong values back into the file?

Hm, on the other hand, looks like it's normal behavior of Linux, where no guarantees for such cases are given for msync.


LibOS/shim/src/fs/shim_fs_mem.c line 98 at r2 (raw file):

    assert((size_t)size <= buf_size);
    if (buf_size != mem->buf_size) {
        int ret = mem_file_resize(mem, buf_size);

Actually, what happens if the user supplies size = 0? Won't we end up with buf_size = 0 and then mem_file_resize() will fail with -ENOMEM? Which is not how truncate is supposed to work.


LibOS/shim/src/fs/shim_fs_util.c line 173 at r2 (raw file):

        assert((size_t)count <= read_size);
        read_size -= count;
        read_addr += count;

Technically we shouldn't use void* because it doesn't allow pointer arithmetic. We should technically use char*, but I don't insist.


LibOS/shim/src/fs/shim_fs_util.c line 195 at r2 (raw file):

int generic_emulated_msync(struct shim_handle* hdl, void* addr, size_t size, int prot, int flags,
                           uint64_t offset) {
    assert(!(flags & MAP_PRIVATE));

Well, you could add more flags here, e.g. ANONYMOUS. Why did you decide on this particular assert?


LibOS/shim/src/fs/shim_fs_util.c line 222 at r2 (raw file):

        }

        if (count == 0)

Can this happen?


LibOS/shim/src/fs/shim_fs_util.c line 227 at r2 (raw file):

        assert((size_t)count <= write_size);
        write_size -= count;
        write_addr += count;

ditto


LibOS/shim/src/fs/chroot/encrypted.c line 31 at r2 (raw file):

 * - truncate (the current `truncate` operation works only for extending the file, support for
 *   truncation needs to be added to the `protected_files` module)
 * - flush all files on process exit

Why didn't you implement it in this PR? It will be a follow-up?


LibOS/shim/src/fs/chroot/encrypted.c line 373 at r2 (raw file):

    assert(hdl->inode->type == S_IFREG);

    int ret = msync_handle(hdl);

I'm confused here. Why do we have both msync_handle() and encrypted_file_flush()? Aren't they doing the same?


LibOS/shim/src/fs/chroot/encrypted.c line 405 at r2 (raw file):

    lock(&hdl->inode->lock);
    int ret = encrypted_file_read(enc, buf, count, *pos, &actual_count);

Ouch. So this bug went unnoticed because the only user of this callback was shim_do_read() (which anyhow supplies &hdl->pos in the pos argument)? And now you added the mmap emulation that uses this callback too, but from an arbitrary pos, and that's how you exposed the bug?

Same question for the bug in chroot_encrypted_write() below.


LibOS/shim/src/fs/tmpfs/fs.c line 295 at r2 (raw file):

   are currently not flexible enough for correct tmpfs implementation. In particular, shim_do_mmap()
   pre-allocates memory region at a specific address (making it impossible to have two mmaps on the
   same tmpfs file), and shim_do_munmap() doesn't have a callback into tmpfs at all. */

Actually, is it possible now to have two mmaps on the same tmpfs file? Did we change anything that enables this scenario now?

And since we're here, could we add such a case in our tests?


LibOS/shim/src/sys/shim_mmap.c line 459 at r2 (raw file):

    if (flags & MS_SYNC) {
        return msync_range(start, start + len);

Nice side-effect!


LibOS/shim/test/fs/test_tmpfs.py line 69 at r2 (raw file):

        pass

    # This overrides parent class to remove @expectedFailureIf(HAS_SGX)

What's the deal with this expectedFailureIf? Why would the parent class have it at all? Isn't it better to change the parent class and remove this attribute completely? After all, now all mmaps should work. And then we won't need to overwrite these methods (well, maybe we'll need to adjust the timeouts, but it may be done in the parent class).


LibOS/shim/test/regression/mmap_file_emulated.c line 7 at r2 (raw file):

/*
 * Test file mapping emulated by Gramine (encrypted: tmpfs): try reading and writing a file through

ecnrypted, tmpfs (comma instead of colon)


LibOS/shim/test/regression/mmap_file_emulated.c line 35 at r2 (raw file):

    const char* path = argv[1];

    setbuf(stdout, NULL);

Why do we care?


LibOS/shim/test/regression/mmap_file_emulated.c line 67 at r2 (raw file):

    if (fd == -1)
        err(1, "open");

Please remove one empty line


LibOS/shim/test/regression/mmap_file_emulated.c line 69 at r2 (raw file):

    void* addr = mmap(NULL, mmap_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

Actually, a test with offset != 0 would be also good.

Copy link
Contributor Author

@pwmarcz pwmarcz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 12 of 17 files reviewed, 15 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: ITL), "squash! " and "fixup! " found in commit messages' one-liners (waiting on @dimakuv)


-- commits line 9 at r2:

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

on an -> on

I added a squash commit.


LibOS/shim/include/shim_fs.h line 113 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Why do we need prot argument here? UPDATE: Ok, because we can have a write-only protection and we want to read the contents of the VMA to write them into the file. So we memorize the original protections, then add R and perform msync, then restore original protections.

That's right, we need to know how the memory is currently mapped.


LibOS/shim/src/bookkeep/shim_vma.c line 1266 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Looks like a potentially very slow operation because it iterates through all VMAs every time it's called. I guess nothing can be done about it, but do you think it can be a perf bottleneck? I can imagine big applications with thousands of VMAs allocated...

You're right. I used the function for dumping all VMAs, but it can be adapted to dump all VMAs in range (and they're stored in an AVL tree, so lookup will be cheap). Fixed.


LibOS/shim/src/bookkeep/shim_vma.c line 1285 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

It's unclear to me if any data races are possible here. Say you collected the relevant VMAs into the vma_infos array and started calling msync() on each of these VMAs. What if another thread unmapped this VMA, mapped another VMA in the same memory range, and now your msync() writes wrong values back into the file?

Hm, on the other hand, looks like it's normal behavior of Linux, where no guarantees for such cases are given for msync.

Yes, unfortunately... This msync could write wrong values, or even crash if the memory is unmapped by another thread.

I don't see an easy way around it: VMA operations take a lock, but you cannot do anything while holding that lock (since even malloc might trigger mapping more memory).


LibOS/shim/src/fs/shim_fs_mem.c line 98 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Actually, what happens if the user supplies size = 0? Won't we end up with buf_size = 0 and then mem_file_resize() will fail with -ENOMEM? Which is not how truncate is supposed to work.

Yeah... And I don't particularly want to deal with mem_file_resize to 0. I'll introduce a minimum size for resizing.


LibOS/shim/src/fs/shim_fs_util.c line 173 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Technically we shouldn't use void* because it doesn't allow pointer arithmetic. We should technically use char*, but I don't insist.

OK, let's use char*.


LibOS/shim/src/fs/shim_fs_util.c line 195 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Well, you could add more flags here, e.g. ANONYMOUS. Why did you decide on this particular assert?

It's hard for me to imagine that we'll call msync on a non-file mapping, since hdl and flags likely come from the same source. But it's possible to make a mistake and call it on a private mapping, which should not be persisted to a file.


LibOS/shim/src/fs/shim_fs_util.c line 222 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Can this happen?

I think it cannot, the current cases (tmpfs and protected-files). I'm not so sure about other filesystems that might use this in the future (especially if we're writing to host).

Still, this should be an error, not success.


LibOS/shim/src/fs/shim_fs_util.c line 227 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

ditto

Converted to char*.


LibOS/shim/src/fs/chroot/encrypted.c line 31 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Why didn't you implement it in this PR? It will be a follow-up?

Actually, I'm not sure how to even do that. I would like to call flush() on all the handles in the process, but do we have a list of them? There is the "handle map" which maps FDs to handles, but we might keep some open handles without FDs in the memory mappings (if someone did mmap() and then close()).


LibOS/shim/src/fs/chroot/encrypted.c line 373 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

I'm confused here. Why do we have both msync_handle() and encrypted_file_flush()? Aren't they doing the same?

flush is a callback for the sync system call, and it's supposed to also flush file mappings. We could call chroot_encrypted_msync here, but we don't know what mappings we should call it on (and I thought it would be better NOT to duplicate this information in handle). So we call msync_handle, means basically "find all shared mappings of this file and call msync on them".


LibOS/shim/src/fs/chroot/encrypted.c line 405 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Ouch. So this bug went unnoticed because the only user of this callback was shim_do_read() (which anyhow supplies &hdl->pos in the pos argument)? And now you added the mmap emulation that uses this callback too, but from an arbitrary pos, and that's how you exposed the bug?

Same question for the bug in chroot_encrypted_write() below.

It would also be tested by the LibOS/shim/test/fs tests, since they use sendfile, but I haven't switched them to test the encrypted filesystem (instead of protected files) yet.


LibOS/shim/src/fs/tmpfs/fs.c line 295 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Actually, is it possible now to have two mmaps on the same tmpfs file? Did we change anything that enables this scenario now?

And since we're here, could we add such a case in our tests?

It's possible, but unlike Linux, they will not be synchronized: if you modify one, the other will stay the same.

I added a second mapping to the test.


LibOS/shim/test/fs/test_tmpfs.py line 69 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

What's the deal with this expectedFailureIf? Why would the parent class have it at all? Isn't it better to change the parent class and remove this attribute completely? After all, now all mmaps should work. And then we won't need to overwrite these methods (well, maybe we'll need to adjust the timeouts, but it may be done in the parent class).

Unfortunately, no, because test uses "allowed" files and tries to write to them using mmap, which is something PAL doesn't support:

    if (!(prot & PAL_PROT_WRITECOPY) && (prot & PAL_PROT_WRITE)) {
        log_warning(
            "file_map does not currently support writable pass-through mappings on SGX. You "
            "may add the PAL_PROT_WRITECOPY (MAP_PRIVATE) flag to your file mapping to keep "
            "the writes inside the enclave but they won't be reflected outside of the "
            "enclave.");
        return -PAL_ERROR_DENIED;
    }

I think the current solution could be extended to allowed files, although I don't want to do it right in this PR.


LibOS/shim/test/regression/mmap_file_emulated.c line 7 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

ecnrypted, tmpfs (comma instead of colon)

Fixed.


LibOS/shim/test/regression/mmap_file_emulated.c line 35 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Why do we care?

Otherwise, all the output from printf is at the bottom (probably because libc doesn't see Gramine's stdout as a terminal, and sets it to block-buffered; and stderr is always unbuffered).

I guess we could output everything to stderr, since Gramine joins the streams anyway. But that's what we ended up doing in most of these tests: "test ok" using stdout, errors using stderr.


LibOS/shim/test/regression/mmap_file_emulated.c line 67 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Please remove one empty line

Done.


LibOS/shim/test/regression/mmap_file_emulated.c line 69 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Actually, a test with offset != 0 would be also good.

I added a second mapping with offset != 0. This covers the following cases: two mappings, shared/private, and different offsets.

Copy link

@dimakuv dimakuv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 5 of 5 files at r3, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: ITL), "squash! " and "fixup! " found in commit messages' one-liners (waiting on @boryspoplawski, @mkow, and @pwmarcz)


LibOS/shim/src/bookkeep/shim_vma.c line 1266 at r2 (raw file):

Previously, pwmarcz (Paweł Marczewski) wrote…

You're right. I used the function for dumping all VMAs, but it can be adapted to dump all VMAs in range (and they're stored in an AVL tree, so lookup will be cheap). Fixed.

Awesome solution!


LibOS/shim/src/bookkeep/shim_vma.c line 1285 at r2 (raw file):

Previously, pwmarcz (Paweł Marczewski) wrote…

Yes, unfortunately... This msync could write wrong values, or even crash if the memory is unmapped by another thread.

I don't see an easy way around it: VMA operations take a lock, but you cannot do anything while holding that lock (since even malloc might trigger mapping more memory).

Yep. Anyway, resolved.


LibOS/shim/src/fs/chroot/encrypted.c line 31 at r2 (raw file):

Previously, pwmarcz (Paweł Marczewski) wrote…

Actually, I'm not sure how to even do that. I would like to call flush() on all the handles in the process, but do we have a list of them? There is the "handle map" which maps FDs to handles, but we might keep some open handles without FDs in the memory mappings (if someone did mmap() and then close()).

Oh no. @boryspoplawski @mkow Do you know how we can collect all handles, even the ones without an atteched FD?

But anyway, I think this is for a follow-up PR.


LibOS/shim/src/fs/chroot/encrypted.c line 373 at r2 (raw file):

Previously, pwmarcz (Paweł Marczewski) wrote…

flush is a callback for the sync system call, and it's supposed to also flush file mappings. We could call chroot_encrypted_msync here, but we don't know what mappings we should call it on (and I thought it would be better NOT to duplicate this information in handle). So we call msync_handle, means basically "find all shared mappings of this file and call msync on them".

Ok. Pawel's explanation didn't help me much, but I explained it to myself like this:

  • msync_handle() ends up in ipf_write() -- which performs memcpy from mmapped memory region into the corresponding nodes of the protected/encrypted file (i.e., from the mmap memory region into the file-blocks-in-plaintext memory region)
  • encrypted_file_flush() ends up in ipf_internal_flush() -- which performs writes of file-blocks-in-plaintext into the host-level files.

So we have two levels of indirection here: first sync from mmap region to file-blocks region, then sync from file-blocks region to host-level file. That's why two functions are used together.

Maybe will be helpful for other reviewers or for future.


LibOS/shim/src/fs/tmpfs/fs.c line 295 at r2 (raw file):

Previously, pwmarcz (Paweł Marczewski) wrote…

It's possible, but unlike Linux, they will not be synchronized: if you modify one, the other will stay the same.

I added a second mapping to the test.

Wait, but is it Ok?


LibOS/shim/test/regression/mmap_file_emulated.c line 69 at r2 (raw file):

Previously, pwmarcz (Paweł Marczewski) wrote…

I added a second mapping with offset != 0. This covers the following cases: two mappings, shared/private, and different offsets.

Thanks, looks awesome!

Copy link
Contributor

@boryspoplawski boryspoplawski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 11 of 17 files at r1, 1 of 1 files at r2, 5 of 5 files at r3, all commit messages.
Reviewable status: all files reviewed, 33 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: ITL), "squash! " and "fixup! " found in commit messages' one-liners (waiting on @dimakuv, @mkow, and @pwmarcz)


-- commits line 3 at r3:
Most of these commits seem unrelated to each other, would be better to have them in separate PRs, but I guess too late now.


LibOS/shim/include/shim_fs.h line 89 at r3 (raw file):

     *
     * \param hdl     File handle.
     * \param addr    Address of the memory region. Cannot be NULL.

Why do we need this artificial disallowance of NULL?
Update: is it because it's passed as is to PAL API which still requires it?


LibOS/shim/include/shim_fs.h line 91 at r3 (raw file):

     * \param addr    Address of the memory region. Cannot be NULL.
     * \param size    Size of the memory region.
     * \param prot    Permissions for the memory region (`PAL_PROT_*`).

Why these are PAL_PROT_*??
Update: from what I can see these are Linux prot, not PAL...


LibOS/shim/include/shim_fs.h line 112 at r3 (raw file):

     * region.
     */
    int (*msync)(struct shim_handle* hdl, void* addr, size_t size, int prot, int flags,

Please comment all args.


LibOS/shim/include/shim_fs.h line 112 at r3 (raw file):

     * region.
     */
    int (*msync)(struct shim_handle* hdl, void* addr, size_t size, int prot, int flags,

Why do we need flags argument?


LibOS/shim/include/shim_vma.h line 131 at r3 (raw file):

/* Call `msync` for file mappings of `hdl` */
int msync_handle(struct shim_handle* hdl);

Why do we need this? In which cases we need to call msync based on handle not address range?


LibOS/shim/src/bookkeep/shim_vma.c line 1119 at r3 (raw file):

    for (vma = _lookup_vma(begin); vma && vma->begin < end; vma = _get_next_vma(vma)) {
        if (vma->end <= begin)

How is this case possible?


LibOS/shim/src/bookkeep/shim_vma.c line 1173 at r3 (raw file):

int dump_all_vmas(struct shim_vma_info** ret_infos, size_t* ret_count, bool include_unmapped) {
    return dump_vmas(ret_infos, ret_count, /*begin=*/0, /*end=*/(uintptr_t)-1,

Please use UINTPTR_MAX.


LibOS/shim/src/bookkeep/shim_vma.c line 1174 at r3 (raw file):

int dump_all_vmas(struct shim_vma_info** ret_infos, size_t* ret_count, bool include_unmapped) {
    return dump_vmas(ret_infos, ret_count, /*begin=*/0, /*end=*/(uintptr_t)-1,
                     include_unmapped ? &vma_filter_all : &vma_filter_exclude_unmapped,

No need for & with function pointers.


LibOS/shim/src/bookkeep/shim_vma.c line 1246 at r3 (raw file):

        return false;

    if (!vma->file)

Is this case possible? We check for MAP_ANONYMOUS above.


LibOS/shim/src/bookkeep/shim_vma.c line 1252 at r3 (raw file):

        return false;

    if (!(vma->file->acc_mode & MAY_WRITE))

Don't we need a lock for this?
Conceptually we shouldn't, but on the current master I think it's required, but I'm not sure.


LibOS/shim/src/bookkeep/shim_vma.c line 1261 at r3 (raw file):

}

static int _msync_all(uintptr_t begin, uintptr_t end, struct shim_handle* hdl) {

Please assert that begin and end are page aligned.


LibOS/shim/src/bookkeep/shim_vma.c line 1261 at r3 (raw file):

}

static int _msync_all(uintptr_t begin, uintptr_t end, struct shim_handle* hdl) {

We usually use names with leading underscore to highlight functions which must be called with some locks taken.
Not blocking - this is rather informal convention.


LibOS/shim/src/bookkeep/shim_vma.c line 1265 at r3 (raw file):

    size_t count;

    int ret = dump_vmas(&vma_infos, &count, begin, end, &vma_filter_needs_msync, hdl);

No need for & with function pointer


LibOS/shim/src/bookkeep/shim_vma.c line 1296 at r3 (raw file):

int msync_handle(struct shim_handle* hdl) {
    return _msync_all(/*begin=*/0, /*end=*/(uintptr_t)-1, hdl);

UINTPTR_MAX


LibOS/shim/src/fs/shim_fs_mem.c line 91 at r3 (raw file):

    size_t buf_size = mem->buf_size;
    if ((size_t)size > buf_size) {

Could we have this condition in the same order as the one below?


LibOS/shim/src/fs/shim_fs_mem.c line 98 at r3 (raw file):

        }
    } else {
        while (buf_size / 2 > (size_t)size)

Shouldn't this be >=?


LibOS/shim/src/fs/shim_fs_util.c line 238 at r3 (raw file):

    if (pal_prot != pal_prot_readable) {
        int protect_ret = DkVirtualMemoryProtect(addr, size, pal_prot);
        if (ret < 0) {

This should be protect_ret


LibOS/shim/src/fs/shim_fs_util.c line 240 at r3 (raw file):

        if (ret < 0) {
            log_debug("%s: DkVirtualMemoryProtect failed on cleanup: %d; memory will be left "
                      "readable", __func__, protect_ret);

I think we should kill the app, since the state is inconsistent (and now the app won't even know about it).


LibOS/shim/src/fs/chroot/encrypted.c line 31 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Oh no. @boryspoplawski @mkow Do you know how we can collect all handles, even the ones without an atteched FD?

But anyway, I think this is for a follow-up PR.

I think the only way is to get rid of all handle references and have flushing on release. To do that we would probably need to close all fds and unmmap all user memory.


LibOS/shim/src/fs/chroot/encrypted.c line 373 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Ok. Pawel's explanation didn't help me much, but I explained it to myself like this:

  • msync_handle() ends up in ipf_write() -- which performs memcpy from mmapped memory region into the corresponding nodes of the protected/encrypted file (i.e., from the mmap memory region into the file-blocks-in-plaintext memory region)
  • encrypted_file_flush() ends up in ipf_internal_flush() -- which performs writes of file-blocks-in-plaintext into the host-level files.

So we have two levels of indirection here: first sync from mmap region to file-blocks region, then sync from file-blocks region to host-level file. That's why two functions are used together.

Maybe will be helpful for other reviewers or for future.

I'm still a bit confused. Why we don't have to call msync_handle inside encrypted_file_flush?


LibOS/shim/src/fs/tmpfs/fs.c line 295 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Wait, but is it Ok?

+1 it seems that it's impossible to have correct mmap on tmpfs with SGX PAL.


LibOS/shim/src/sys/shim_exit.c line 18 at r3 (raw file):

#include "shim_thread.h"
#include "shim_utils.h"
#include "shim_vma.h"

Why we need this?


LibOS/shim/src/sys/shim_mmap.c line 140 at r3 (raw file):

            goto out_handle;
        }
        /* Flush any file mappings we're about to replace */

Is this correct? In case of MAP_FIXED_NOREPLACE we might fail without doing anything. I guess random msync is still correct, because it only affects existing mappings (so we not need to worry if the memory is present), which are shared, so they can be flushed at any point?


LibOS/shim/src/sys/shim_mmap.c line 449 at r3 (raw file):

    }

    if (flags & MS_INVALIDATE) {

NP and not blocking: maybe move this to the if below, which kind of iterates through all flags?


LibOS/shim/test/regression/mmap_file_emulated.c line 29 at r3 (raw file):

int main(int argc, char** argv) {
    ssize_t ret;
    int fd;

Let's move these to lines with first usage (we use c11 so we can :) )


LibOS/shim/test/regression/mmap_file_emulated.c line 62 at r3 (raw file):

        err(1, "failed to write file");
    if ((size_t)ret < MESSAGE_LEN)
        err(1, "not enough bytes written");

NP: errx


LibOS/shim/test/regression/mmap_file_emulated.c line 74 at r3 (raw file):

        err(1, "failed to write file");
    if ((size_t)ret < MESSAGE_LEN)
        err(1, "not enough bytes written");

ditto


LibOS/shim/test/regression/mmap_file_emulated.c line 80 at r3 (raw file):

        err(1, "close");

    printf("CREATE OK\n");

NP: puts


LibOS/shim/test/regression/mmap_file_emulated.c line 85 at r3 (raw file):

    fd = open(path, O_RDWR, 0);
    if (fd == -1)

NP: this check differs from all others (< 0)


LibOS/shim/test/regression/mmap_file_emulated.c line 104 at r3 (raw file):

    for (size_t i = MESSAGE_LEN; i < mmap_size; i++) {
        if (addr_private[i] != 0)
            errx(1, "unexpected non-zero byte at addr_private[%zu]", i);

Can we also check addr_shared?


LibOS/shim/test/regression/mmap_file_emulated.c line 107 at r3 (raw file):

    }

    printf("MAP OK\n");

NP: puts


LibOS/shim/test/regression/mmap_file_emulated.c line 122 at r3 (raw file):

        err(1, "munmap");

    ret = close(fd);

Could you move close to right after mmaps? To make sure these changes are flushed on munmap, not close.


LibOS/shim/test/regression/mmap_file_emulated.c line 126 at r3 (raw file):

        err(1, "close");

    printf("WRITE OK\n");

ditto (puts)


LibOS/shim/test/regression/mmap_file_emulated.c line 132 at r3 (raw file):

    char buf[file_size];

    ret = posix_file_read(path, buf, sizeof(buf));

Shouldn't we also check ret < 0?


LibOS/shim/test/regression/mmap_file_emulated.c line 142 at r3 (raw file):

        errx(1, "wrong file content");

    printf("TEST OK\n");

ditto (puts)

Copy link
Contributor Author

@pwmarcz pwmarcz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 8 of 17 files reviewed, 33 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: ITL), "fixup! " found in commit messages' one-liners (waiting on @boryspoplawski, @dimakuv, and @mkow)


LibOS/shim/include/shim_fs.h line 89 at r3 (raw file):

Previously, boryspoplawski (Borys Popławski) wrote…

Why do we need this artificial disallowance of NULL?
Update: is it because it's passed as is to PAL API which still requires it?

Yes. PAL interprets NULL as "use any address".


LibOS/shim/include/shim_fs.h line 91 at r3 (raw file):

Previously, boryspoplawski (Borys Popławski) wrote…

Why these are PAL_PROT_*??
Update: from what I can see these are Linux prot, not PAL...

Right. Fixed.


LibOS/shim/include/shim_fs.h line 112 at r3 (raw file):

Previously, boryspoplawski (Borys Popławski) wrote…

Please comment all args.

OK, they're the same as mmap though. I modified the description to make that clear.


LibOS/shim/include/shim_fs.h line 112 at r3 (raw file):

Previously, boryspoplawski (Borys Popławski) wrote…

Why do we need flags argument?

See the implementation. We sometimes need to call PAL to change permissions, and that requires both prot and flags:

pal_prot_flags_t pal_prot = LINUX_PROT_TO_PAL(prot, flags);

LibOS/shim/include/shim_vma.h line 131 at r3 (raw file):

Previously, boryspoplawski (Borys Popławski) wrote…

Why do we need this? In which cases we need to call msync based on handle not address range?

When the user requests to sync a specific handle, using the sync syscall (our flush callback).


LibOS/shim/src/bookkeep/shim_vma.c line 1119 at r3 (raw file):

Previously, boryspoplawski (Borys Popławski) wrote…

How is this case possible?

My mistake, somehow I thought _lookup_vma returns the VMA before address, not after. Removed.


LibOS/shim/src/bookkeep/shim_vma.c line 1173 at r3 (raw file):

Previously, boryspoplawski (Borys Popławski) wrote…

Please use UINTPTR_MAX.

Done.


LibOS/shim/src/bookkeep/shim_vma.c line 1174 at r3 (raw file):

Previously, boryspoplawski (Borys Popławski) wrote…

No need for & with function pointers.

I thought we use & for clarity; I pretty universally used it in my code so far and nobody complained (most notably, all the code where we setup pseudofs). On the other hand, I see @mkow not using & in his sysfs code with callbacks.

Removed & for now.


LibOS/shim/src/bookkeep/shim_vma.c line 1246 at r3 (raw file):

Previously, boryspoplawski (Borys Popławski) wrote…

Is this case possible? We check for MAP_ANONYMOUS above.

Yeah, it makes sense that vma->file would always be present. I wasn't sure because this module never enforces or documents this, so we depend on callers.

Replaced with assertion.


LibOS/shim/src/bookkeep/shim_vma.c line 1252 at r3 (raw file):

Previously, boryspoplawski (Borys Popławski) wrote…

Don't we need a lock for this?
Conceptually we shouldn't, but on the current master I think it's required, but I'm not sure.

Yeah, we still modify it in socket code. I'll add a lock and a TODO here.


LibOS/shim/src/bookkeep/shim_vma.c line 1261 at r3 (raw file):

Previously, boryspoplawski (Borys Popławski) wrote…

Please assert that begin and end are page aligned.

OK. This is a bit awkward because of UINTPTR_MAX, but I don't know how to do it better.


LibOS/shim/src/bookkeep/shim_vma.c line 1261 at r3 (raw file):

Previously, boryspoplawski (Borys Popławski) wrote…

We usually use names with leading underscore to highlight functions which must be called with some locks taken.
Not blocking - this is rather informal convention.

I called this _msync_all back when I actually had a function called msync_all (it wasn't about a lock, but about defining a helper function). That's not necessary now, so I'm removing the underscore.


LibOS/shim/src/bookkeep/shim_vma.c line 1265 at r3 (raw file):

Previously, boryspoplawski (Borys Popławski) wrote…

No need for & with function pointer

Done.


LibOS/shim/src/bookkeep/shim_vma.c line 1296 at r3 (raw file):

Previously, boryspoplawski (Borys Popławski) wrote…

UINTPTR_MAX

Done.


LibOS/shim/src/fs/shim_fs_mem.c line 91 at r3 (raw file):

Previously, boryspoplawski (Borys Popławski) wrote…

Could we have this condition in the same order as the one below?

Done (and same in the function above).


LibOS/shim/src/fs/shim_fs_mem.c line 98 at r3 (raw file):

Previously, boryspoplawski (Borys Popławski) wrote…

Shouldn't this be >=?

Makes sense, but then I have to also add a case for size == 0. Done.


LibOS/shim/src/fs/shim_fs_util.c line 238 at r3 (raw file):

Previously, boryspoplawski (Borys Popławski) wrote…

This should be protect_ret

Fixed.


LibOS/shim/src/fs/shim_fs_util.c line 240 at r3 (raw file):

Previously, boryspoplawski (Borys Popławski) wrote…

I think we should kill the app, since the state is inconsistent (and now the app won't even know about it).

Added BUG().


LibOS/shim/src/fs/chroot/encrypted.c line 31 at r2 (raw file):

Previously, boryspoplawski (Borys Popławski) wrote…

I think the only way is to get rid of all handle references and have flushing on release. To do that we would probably need to close all fds and unmmap all user memory.

Sounds good.


LibOS/shim/src/fs/chroot/encrypted.c line 373 at r2 (raw file):

Previously, boryspoplawski (Borys Popławski) wrote…

I'm still a bit confused. Why we don't have to call msync_handle inside encrypted_file_flush?

encrypted_file_flush only works with the encrypted-file object. It has no idea that we're also keeping emulated mappings (and in fact, does not depend on shim_handle or shim_vma or anything else in the system; it just takes care of that one object).

I added comments here to explain that better.


LibOS/shim/src/fs/tmpfs/fs.c line 295 at r2 (raw file):

Previously, boryspoplawski (Borys Popławski) wrote…

+1 it seems that it's impossible to have correct mmap on tmpfs with SGX PAL.

What do you mean? Emulated mappings should work exactly the same for tmpfs and encrypted fs:

  • it is possible to have many mappings of a single file,
  • the mappings are not updated automatically, we just fill them on mmap and write back to file on mmap.

This is not "correct", but a useful approximation, and what we always had for protected files.


LibOS/shim/src/sys/shim_mmap.c line 140 at r3 (raw file):

Previously, boryspoplawski (Borys Popławski) wrote…

Is this correct? In case of MAP_FIXED_NOREPLACE we might fail without doing anything. I guess random msync is still correct, because it only affects existing mappings (so we not need to worry if the memory is present), which are shared, so they can be flushed at any point?

Yes. But you're right, we can skip this on MAP_FIXED_NOREPLACE.


LibOS/shim/src/sys/shim_mmap.c line 449 at r3 (raw file):

Previously, boryspoplawski (Borys Popławski) wrote…

NP and not blocking: maybe move this to the if below, which kind of iterates through all flags?

Moved below.


LibOS/shim/test/regression/mmap_file_emulated.c line 29 at r3 (raw file):

Previously, boryspoplawski (Borys Popławski) wrote…

Let's move these to lines with first usage (we use c11 so we can :) )

OK.


LibOS/shim/test/regression/mmap_file_emulated.c line 62 at r3 (raw file):

Previously, boryspoplawski (Borys Popławski) wrote…

NP: errx

Done.


LibOS/shim/test/regression/mmap_file_emulated.c line 74 at r3 (raw file):

Previously, boryspoplawski (Borys Popławski) wrote…

ditto

Done.


LibOS/shim/test/regression/mmap_file_emulated.c line 80 at r3 (raw file):

Previously, boryspoplawski (Borys Popławski) wrote…

NP: puts

Done.


LibOS/shim/test/regression/mmap_file_emulated.c line 85 at r3 (raw file):

Previously, boryspoplawski (Borys Popławski) wrote…

NP: this check differs from all others (< 0)

Fixed.


LibOS/shim/test/regression/mmap_file_emulated.c line 104 at r3 (raw file):

Previously, boryspoplawski (Borys Popławski) wrote…

Can we also check addr_shared?

OK, sure.


LibOS/shim/test/regression/mmap_file_emulated.c line 107 at r3 (raw file):

Previously, boryspoplawski (Borys Popławski) wrote…

NP: puts

Done.


LibOS/shim/test/regression/mmap_file_emulated.c line 122 at r3 (raw file):

Previously, boryspoplawski (Borys Popławski) wrote…

Could you move close to right after mmaps? To make sure these changes are flushed on munmap, not close.

Done.


LibOS/shim/test/regression/mmap_file_emulated.c line 126 at r3 (raw file):

Previously, boryspoplawski (Borys Popławski) wrote…

ditto (puts)

Done.


LibOS/shim/test/regression/mmap_file_emulated.c line 132 at r3 (raw file):

Previously, boryspoplawski (Borys Popławski) wrote…

Shouldn't we also check ret < 0?

Right. Fixed.


LibOS/shim/test/regression/mmap_file_emulated.c line 142 at r3 (raw file):

Previously, boryspoplawski (Borys Popławski) wrote…

ditto (puts)

Done.


LibOS/shim/src/sys/shim_exit.c line 18 at r3 (raw file):

Previously, boryspoplawski (Borys Popławski) wrote…

Why we need this?

Leftover from earlier attempt to flush-on-exit, before I decided I don't want to do it now. Fixed.

Copy link
Contributor

@boryspoplawski boryspoplawski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 9 of 9 files at r4, all commit messages.
Reviewable status: all files reviewed, 4 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: ITL), "fixup! " found in commit messages' one-liners (waiting on @dimakuv, @mkow, and @pwmarcz)


LibOS/shim/include/shim_fs.h line 89 at r3 (raw file):

Previously, pwmarcz (Paweł Marczewski) wrote…

Yes. PAL interprets NULL as "use any address".

Right


LibOS/shim/include/shim_fs.h line 112 at r3 (raw file):

Previously, pwmarcz (Paweł Marczewski) wrote…

See the implementation. We sometimes need to call PAL to change permissions, and that requires both prot and flags:

pal_prot_flags_t pal_prot = LINUX_PROT_TO_PAL(prot, flags);

Not sure how much of an implementation details it is, but they only care for MAP_PRIVATE, which we know to be false in msync.


LibOS/shim/src/bookkeep/shim_vma.c line 1174 at r3 (raw file):

Previously, pwmarcz (Paweł Marczewski) wrote…

I thought we use & for clarity; I pretty universally used it in my code so far and nobody complained (most notably, all the code where we setup pseudofs). On the other hand, I see @mkow not using & in his sysfs code with callbacks.

Removed & for now.

Hmm, I guess we have no official policy and it's a matter of preference. Should have mentioned it, but not blocked, my bad.
I personally prefer no &, because these are function pointers, and that should be clear from the name itself and the & is not needed, like, we already know it's a function/pointer.


LibOS/shim/src/bookkeep/shim_vma.c line 1252 at r3 (raw file):

Previously, pwmarcz (Paweł Marczewski) wrote…

Yeah, we still modify it in socket code. I'll add a lock and a TODO here.

Hmmm, actually, are we allowed to take any locks here? We are inside vmas traversing function, under global vma lock, I think.
In reality acc_mode can't change for anything other than sockets, so maybe we can just ignore the lock, leave a XXX describing the situation, to be removed when we clear semantics of accessing acc_mode (which we should do after sockets rewrite)
wdyt? Above I assumed that only sockets are the issue.


LibOS/shim/src/bookkeep/shim_vma.c line 1261 at r3 (raw file):

Previously, pwmarcz (Paweł Marczewski) wrote…

OK. This is a bit awkward because of UINTPTR_MAX, but I don't know how to do it better.

Right. But I guess it's ok


LibOS/shim/src/fs/chroot/encrypted.c line 373 at r2 (raw file):

Previously, pwmarcz (Paweł Marczewski) wrote…

encrypted_file_flush only works with the encrypted-file object. It has no idea that we're also keeping emulated mappings (and in fact, does not depend on shim_handle or shim_vma or anything else in the system; it just takes care of that one object).

I added comments here to explain that better.

Well, I'm not familiar with the new PF implementation, so I would have to check it out. Resolving.


LibOS/shim/src/fs/tmpfs/fs.c line 295 at r2 (raw file):

Previously, pwmarcz (Paweł Marczewski) wrote…

What do you mean? Emulated mappings should work exactly the same for tmpfs and encrypted fs:

  • it is possible to have many mappings of a single file,
  • the mappings are not updated automatically, we just fill them on mmap and write back to file on mmap.

This is not "correct", but a useful approximation, and what we always had for protected files.

I guess nothing really relies on mapping the same file twice? Maybe the current version won't hurt anybody, idk

dimakuv
dimakuv previously approved these changes Apr 28, 2022
Copy link

@dimakuv dimakuv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 3 of 9 files at r4, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions, not enough approvals from maintainers (1 more required), "fixup! " found in commit messages' one-liners (waiting on @boryspoplawski and @pwmarcz)


LibOS/shim/src/bookkeep/shim_vma.c line 1252 at r3 (raw file):

Previously, boryspoplawski (Borys Popławski) wrote…

Hmmm, actually, are we allowed to take any locks here? We are inside vmas traversing function, under global vma lock, I think.
In reality acc_mode can't change for anything other than sockets, so maybe we can just ignore the lock, leave a XXX describing the situation, to be removed when we clear semantics of accessing acc_mode (which we should do after sockets rewrite)
wdyt? Above I assumed that only sockets are the issue.

+1. I don't like the idea of having a lock in this function. I think we can safely assume that after the previous check on msync callback existence, we can be sure that it's a file handle and then its acc_mode is never changed. We can add a comment about this.


LibOS/shim/src/fs/tmpfs/fs.c line 295 at r2 (raw file):

Previously, boryspoplawski (Borys Popławski) wrote…

I guess nothing really relies on mapping the same file twice? Maybe the current version won't hurt anybody, idk

Yeah, nobody ever complained about this. So I think this is not a realistic scenario. Indeed, why would an app want to map the same file twice. I'm resolving.

Copy link
Contributor Author

@pwmarcz pwmarcz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 15 of 17 files reviewed, 2 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: ITL), "fixup! " found in commit messages' one-liners (waiting on @boryspoplawski and @dimakuv)


LibOS/shim/include/shim_fs.h line 112 at r3 (raw file):

Previously, boryspoplawski (Borys Popławski) wrote…

Not sure how much of an implementation details it is, but they only care for MAP_PRIVATE, which we know to be false in msync.

Oooh, actually, the PAL mapping should not be shared. The Linux-SGX PAL ignores that distinction anyway, but we don't want Linux PAL to create a shared region between processes. So right, the flags parameter will be unused here.

I still want to leave the parameter, because:

  • it makes for a nice symmetry with the mmap callback: they have the same parameters, omitting just one of them would be surprising,
  • we get to have an assert that we're not flushing a Gramine-level MAP_PRIVATE mapping.

LibOS/shim/src/bookkeep/shim_vma.c line 1252 at r3 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

+1. I don't like the idea of having a lock in this function. I think we can safely assume that after the previous check on msync callback existence, we can be sure that it's a file handle and then its acc_mode is never changed. We can add a comment about this.

Oh, right, this is under vma_tree_lock, I shouldn't do that.

Changed as suggested.


LibOS/shim/src/fs/tmpfs/fs.c line 295 at r2 (raw file):

Indeed, why would an app want to map the same file twice.

Well, there's the magic ring buffer trick :)

dimakuv
dimakuv previously approved these changes Apr 28, 2022
Copy link

@dimakuv dimakuv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 2 files at r5, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions, not enough approvals from maintainers (1 more required), "fixup! " found in commit messages' one-liners (waiting on @boryspoplawski)


LibOS/shim/src/fs/tmpfs/fs.c line 295 at r2 (raw file):

Previously, pwmarcz (Paweł Marczewski) wrote…

Indeed, why would an app want to map the same file twice.

Well, there's the magic ring buffer trick :)

One memcpy instead of two memcpys to allow the compiler to apply vectorization optimizations? Wow, I wonder if people are actually using such tricks to squeze out an additional... 0.001% performance? But maybe I'm wrong, maybe it's useful somewhere.

Copy link
Contributor

@boryspoplawski boryspoplawski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 2 files at r5, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions, not enough approvals from maintainers (1 more required), "fixup! " found in commit messages' one-liners (waiting on @pwmarcz)


LibOS/shim/include/shim_fs.h line 112 at r3 (raw file):
Ok, let's leave it.

we don't want Linux PAL to create a shared region between processes

Why is that? Ideally most of the PALs should support shared mappings, so should msync. I guess some of filesystems (e.g. tmpfs) are fully emulated and as such won't create a shared mapping, but that's not part of this interface.


LibOS/shim/src/fs/shim_fs_util.c line 147 at r5 (raw file):

    int ret;

    /* The underlying PAL mapping should always be private. */

I think this deserves more explanation, i.e. why it should be private


LibOS/shim/src/fs/shim_fs_util.c line 206 at r5 (raw file):

    unlock(&hdl->inode->lock);

    /* The underlying PAL mapping should always be private. */

ditto

Copy link
Member

@mkow mkow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 17 files at r1, 1 of 1 files at r2.
Reviewable status: all files reviewed, 2 unresolved discussions, not enough approvals from maintainers (1 more required), "fixup! " found in commit messages' one-liners (waiting on @pwmarcz)


LibOS/shim/src/fs/chroot/encrypted.c line 31 at r2 (raw file):

Previously, pwmarcz (Paweł Marczewski) wrote…

Sounds good.

+1

Copy link
Contributor Author

@pwmarcz pwmarcz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 16 of 17 files reviewed, 2 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: ITL), "fixup! " found in commit messages' one-liners (waiting on @boryspoplawski, @dimakuv, and @pwmarcz)


LibOS/shim/include/shim_fs.h line 112 at r3 (raw file):

Previously, boryspoplawski (Borys Popławski) wrote…

Ok, let's leave it.

we don't want Linux PAL to create a shared region between processes

Why is that? Ideally most of the PALs should support shared mappings, so should msync. I guess some of filesystems (e.g. tmpfs) are fully emulated and as such won't create a shared mapping, but that's not part of this interface.

I agree that this interface (mmap, msync) should include flags, and allow shared mappings.

I meant that the current implementation of emulated mappings (exactly the case you mention) should probably not share the underlying mappings between processes. It could end up flushing changes made by another process, and I'm not sure if it's a good idea.

However, I just realized that it actually doesn't matter: even if we create a shared anonymous mapping, it won't really be shared with another process. LibOS in the child process will just create an identical mapping during checkpoint restore.

So I think both solutions (using LINUX_PROT_TO_PAL(prot, flags) and using LINUX_PROT_TO_PAL(prot, MAP_PRIVATE)) make some sense. I prefer to use the first one, for simplicity: we do the same elsewhere, and I don't want a special case here that actually has zero effect.

dimakuv
dimakuv previously approved these changes Apr 29, 2022
Copy link

@dimakuv dimakuv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r6, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions, not enough approvals from maintainers (1 more required), "fixup! " found in commit messages' one-liners (waiting on @pwmarcz)

Copy link
Contributor

@boryspoplawski boryspoplawski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed all commit messages.
Reviewable status: all files reviewed, 3 unresolved discussions, not enough approvals from maintainers (1 more required), "fixup! " found in commit messages' one-liners (waiting on @pwmarcz)


LibOS/shim/include/shim_fs.h line 112 at r3 (raw file):

Previously, pwmarcz (Paweł Marczewski) wrote…

I agree that this interface (mmap, msync) should include flags, and allow shared mappings.

I meant that the current implementation of emulated mappings (exactly the case you mention) should probably not share the underlying mappings between processes. It could end up flushing changes made by another process, and I'm not sure if it's a good idea.

However, I just realized that it actually doesn't matter: even if we create a shared anonymous mapping, it won't really be shared with another process. LibOS in the child process will just create an identical mapping during checkpoint restore.

So I think both solutions (using LINUX_PROT_TO_PAL(prot, flags) and using LINUX_PROT_TO_PAL(prot, MAP_PRIVATE)) make some sense. I prefer to use the first one, for simplicity: we do the same elsewhere, and I don't want a special case here that actually has zero effect.

But these are 2 different versions, because (as per comment) flags must contain MAP_SHARED (not MAP_PRIVATE).

Copy link
Contributor

@boryspoplawski boryspoplawski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 2 unresolved discussions, not enough approvals from maintainers (1 more required), "fixup! " found in commit messages' one-liners (waiting on @pwmarcz)


LibOS/shim/include/shim_fs.h line 112 at r3 (raw file):

Previously, boryspoplawski (Borys Popławski) wrote…

But these are 2 different versions, because (as per comment) flags must contain MAP_SHARED (not MAP_PRIVATE).

I mean, one day, when we have a fully working shared mappings on some PALs, how do one find that these places needs to be fixed? Or maybe the wouldn't need to be?

boryspoplawski
boryspoplawski previously approved these changes Apr 29, 2022
Copy link
Contributor

@boryspoplawski boryspoplawski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r6.
Reviewable status: all files reviewed, all discussions resolved, "fixup! " found in commit messages' one-liners


LibOS/shim/include/shim_fs.h line 112 at r3 (raw file):

Previously, boryspoplawski (Borys Popławski) wrote…

I mean, one day, when we have a fully working shared mappings on some PALs, how do one find that these places needs to be fixed? Or maybe the wouldn't need to be?

Ok, I guess these are implementations of mmap and msync, so they will need to be revisited then.

Repeated `truncate` operations on in-memory files always ended up
reallocating the buffer. As a result, the `copy_mmap_seq` FS test (which
I'm about to enable for tmpfs) was very slow.

Signed-off-by: Paweł Marczewski <[email protected]>
Copy link
Contributor

@boryspoplawski boryspoplawski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 2 files at r7, all commit messages.
Reviewable status: all files reviewed, all discussions resolved, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: ITL)

This adds support for "emulated" memory mappings in LibOS. Memory is
written back to the file using a new `msync` callback.

Signed-off-by: Paweł Marczewski <[email protected]>
Copy link

@dimakuv dimakuv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 4 of 9 files at r4, 2 of 2 files at r7, all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants