Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[LibOS] Add support for timerfd system calls #1734

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

kailun-qin
Copy link
Contributor

@kailun-qin kailun-qin commented Jan 26, 2024

Description of the changes

This commit adds support for system calls that create and operate on a timer that delivers timer expiration notifications via a file descriptor, specifically: timerfd_create(), timerfd_settime() and timerfd_gettime(). The timerfd object is associated with a dummy eventfd created on the host to trigger notifications (e.g., in epoll). The object is created inside Gramine, with all it operations resolved entirely inside Gramine.

The emulation is currently implemented at the level of a single process. However, it may sometimes work for multi-process applications, e.g., if the child process inherits the timerfd object but doesn't use it; to support these cases, we introduce the
sys.experimental__allow_timerfd_fork manifest option.

LibOS regression tests are also added.

How to test this PR?

CI + newly added regression tests. TODO: test it on some real workloads. Also tested w/ stress-ng --timerfd.


This change is Reviewable

@kailun-qin kailun-qin force-pushed the kailun-qin/timerfd-support branch 2 times, most recently from 617d0aa to 56310a1 Compare January 26, 2024 15:27
Copy link

@dimakuv dimakuv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 27 of 33 files at r1, 9 of 11 files at r2, all commit messages.
Reviewable status: 36 of 39 files reviewed, 24 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "fixup! " and "WIP" found in commit messages' one-liners (waiting on @kailun-qin)

a discussion (no related file):
I did a partial review. Looks fine generally. Closely related to #1728



libos/include/libos_fs.h line 187 at r2 (raw file):

    /* Verify a single handle after poll. Must update `pal_ret_events` in-place with only allowed
     * ones. Used in e.g. secure timerfd FS. */
    void (*post_poll)(struct libos_handle* hdl, pal_wait_flags_t* pal_ret_events);

This is a shared change as in #1728.


libos/include/libos_handle.h line 139 at r2 (raw file):

struct libos_timerfd_handle {
    spinlock_t expiration_lock; /* protecting below fields */

TODO(myself): understand why we need two locks.


libos/include/libos_handle.h line 234 at r2 (raw file):

     * `libos_inode.lock`. Must be used *only* via lock_pos_handle() and unlock_pos_handle(); these
     * functions make sure that the lock is acquired only on those handle types that can change the
     * position (e.g. not on eventfds or pipes). */

This change is taken from here: #1736. Blocking as a prerequisite PR.


libos/include/linux_abi/timerfd.h line 8 at r2 (raw file):

#pragma once

/* Types and structures used by various Linux ABIs (e.g. syscalls). */

Please modify or remove this comment.


libos/include/linux_abi/timerfd.h line 11 at r2 (raw file):

/* These need to be binary-identical with the ones used by Linux. */

#include <linux/timerfd.h>

Wait, but the point of this header is to remove Linux-host ones. Please copy only the relevant bits from that header into this file.


libos/src/libos_async.c line 27 at r2 (raw file):

    void* arg;
    PAL_HANDLE object;       /* handle (async IO) to wait on */
    PAL_HANDLE timer_object; /* handle to identify timer object; currently used for timerfd */

Why not re-use object?

Alternatively, please rename object to something more specific like async_io_object.


libos/src/meson.build line 47 at r2 (raw file):

    'fs/sys/node_info.c',
    'fs/tmpfs/fs.c',
    'fs/timerfd/fs.c',

Doesn't look sorted


libos/src/bookkeep/libos_handle.c line 34 at r2 (raw file):

#define INIT_HANDLE_MAP_SIZE 32

static void lock_unlock_pos_handle(struct libos_handle* hdl, bool is_lock) {

This whole file is taken from #1736.

Blocking as a prerequisite PR.


libos/src/fs/timerfd/fs.c line 1 at r2 (raw file):

/* SPDX-License-Identifier: LGPL-3.0-or-later */

TODO(myself): carefully review this file.


libos/src/fs/timerfd/fs.c line 83 at r2 (raw file):

    if (*pal_ret_events & (PAL_WAIT_ERROR | PAL_WAIT_HANG_UP | PAL_WAIT_WRITE)) {
        /* impossible: we control eventfd inside the LibOS, and we never raise such conditions */

eventfd -> timerfd


libos/src/sys/libos_timerfd.c line 1 at r2 (raw file):

/* SPDX-License-Identifier: LGPL-3.0-or-later */

TODO(myself): carefully review this file.


libos/src/sys/libos_timerfd.c line 25 at r2 (raw file):

 * polling mechanisms (select/poll/epoll):
 *
 * a. Malicious host may inject the notification too early: POLLIN when nothing was written

nothing was written sounds wrong -- you should use smth like no timer expired yet


libos/src/sys/libos_timerfd.c line 26 at r2 (raw file):

 *
 * a. Malicious host may inject the notification too early: POLLIN when nothing was written
 *    yet. This may lead to a synchronization failure of the app. To prevent this, eventfd

eventfd -> timerfd


libos/src/sys/libos_timerfd.c line 27 at r2 (raw file):

 * a. Malicious host may inject the notification too early: POLLIN when nothing was written
 *    yet. This may lead to a synchronization failure of the app. To prevent this, eventfd
 *    implements a callback `post_poll()` where it verifies that some data was indeed written (i.e.,

ditto (not written but timer expired)


libos/src/sys/libos_timerfd.c line 32 at r2 (raw file):

 *    This is a Denial of Service (DoS), which we don't care about.
 * c. Malicious host may inject POLLERR, POLLHUP, POLLRDHUP, POLLNVAL, POLLOUT. This is impossible
 *    as we control eventfd objects inside the LibOS, and we never raise such conditions. So the

timerfd objects


libos/src/sys/libos_timerfd.c line 112 at r2 (raw file):

    if (clockid != CLOCK_REALTIME) {
        if (FIRST_TIME()) {
            log_warning("Unsupported clockid; replaced by the system-wide real-time clock.");

Please add ...in timerfd_create()


libos/src/sys/libos_timerfd.c line 138 at r2 (raw file):

    spinlock_lock(&hdl->info.timerfd.expiration_lock);

    if (hdl->info.timerfd.num_expirations < UINT64_MAX) {

What happens in Linux kernel is number of expirations overflows?


libos/src/sys/libos_timerfd.c line 152 at r2 (raw file):

static void callback_itimer(IDTYPE caller, void* arg) {
    // XXX: Can we simplify this code or streamline with the other callback?

What's this comment?


libos/src/sys/libos_timerfd.c line 198 at r2 (raw file):

    /* NOTE: cancelable timer (for the case where reads on timerfd would return `ECANCELED` when the
     * real-time clock undergoes a discontinuous change) is currently unsupported; needs to be
     * specified along with `TFD_TIMER_ABSTIME`. */

But why not? Gramine doesn't implement clock_settime(), so this flag becomes always a no-op. So it looks like it's benign to allow this flag.


libos/test/ltp/ltp.cfg line 2446 at r2 (raw file):

skip = yes

# cancelable timer (set with `TFD_TIMER_CANCEL_ON_SET` flag) is unsupported

Why not supported? I think we can have a dummy support for this flag -- Gramine doesn't allow clock_settime() anyway, so this flag will just never be "triggered" anyway.

Or maybe you mean that Gramine doesn't support clock_settime() -- in this case modify the comment.


libos/test/regression/tests_musl.toml line 126 at r2 (raw file):

  "tcp_msg_peek",
  "timerfd",
  "timerfd_fork",

Looks like this line must be removed


libos/test/regression/timerfd.c line 1 at r2 (raw file):

/* SPDX-License-Identifier: LGPL-3.0-or-later */

TODO(myself): carefully review this file.


libos/test/regression/timerfd_fork.c line 28 at r2 (raw file):

static void set_timerfd(int fd) {
    struct itimerspec new_value;

You can use the struct init syntax of C:

struct itimerspec new_value = { .it_value.tv_sec = TIMEOUT_VALUE };

The other fields will be by default set to zeros.


libos/test/regression/timerfd_fork.c line 42 at r2 (raw file):

    if (pid == 0) {
        uint64_t expirations;
        /* child: wait for the timer to expire and then read the timerfd */

A more correct comment would be

child: wait on a blocking read for the timer to expire

Copy link

@dimakuv dimakuv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 36 of 39 files reviewed, 25 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "fixup! " and "WIP" found in commit messages' one-liners (waiting on @kailun-qin)


libos/src/sys/libos_timerfd.c line 233 at r2 (raw file):

    if (next_value) {
        int64_t install_ret = install_async_event(hdl->pal_handle, next_value, absolute_time,
                                                  &callback_itimer, (void*)hdl);

We share ownership of the timerfd LibOS handle object with the Async Helper thread here.

What happens on close(timerfd)? Looks like Async Helper thread will be left with a dangling pointer to the object.

This seems to be a similar problem as #1721. If it is, please help review #1721 or suggest an alternative solution.

@kailun-qin kailun-qin force-pushed the kailun-qin/timerfd-support branch 5 times, most recently from 6390b0b to f362403 Compare June 3, 2024 09:23
Copy link
Contributor Author

@kailun-qin kailun-qin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 2 of 40 files reviewed, 25 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "WIP" found in commit messages' one-liners (waiting on @dimakuv)


libos/include/libos_handle.h line 234 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

This change is taken from here: #1736. Blocking as a prerequisite PR.

Rebased, done.


libos/src/libos_async.c line 27 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Why not re-use object?

Alternatively, please rename object to something more specific like async_io_object.

Reusing object now but adding a event type to distinguish between an async IO event and timerfd (both can have object == NULL and time_us ==0).


libos/src/meson.build line 47 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Doesn't look sorted

Done.


libos/src/fs/timerfd/fs.c line 83 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

eventfd -> timerfd

Done.


libos/src/sys/libos_timerfd.c line 25 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

nothing was written sounds wrong -- you should use smth like no timer expired yet

Done.


libos/src/sys/libos_timerfd.c line 26 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

eventfd -> timerfd

Done.


libos/src/sys/libos_timerfd.c line 27 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

ditto (not written but timer expired)

Done.


libos/src/sys/libos_timerfd.c line 32 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

timerfd objects

Done.


libos/src/sys/libos_timerfd.c line 112 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Please add ...in timerfd_create()

Done.


libos/src/sys/libos_timerfd.c line 138 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

What happens in Linux kernel is number of expirations overflows?

Explained in the comment.


libos/src/sys/libos_timerfd.c line 152 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

What's this comment?

copied from

// XXX: Can we simplify this code or streamline with the other callback?
-- I removed it now.


libos/src/sys/libos_timerfd.c line 198 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

But why not? Gramine doesn't implement clock_settime(), so this flag becomes always a no-op. So it looks like it's benign to allow this flag.

Done.


libos/src/sys/libos_timerfd.c line 233 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

We share ownership of the timerfd LibOS handle object with the Async Helper thread here.

What happens on close(timerfd)? Looks like Async Helper thread will be left with a dangling pointer to the object.

This seems to be a similar problem as #1721. If it is, please help review #1721 or suggest an alternative solution.

Did not fix this yet but I agree w/ the problem -- I'll look into the linked PR.


libos/test/ltp/ltp.cfg line 2446 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Why not supported? I think we can have a dummy support for this flag -- Gramine doesn't allow clock_settime() anyway, so this flag will just never be "triggered" anyway.

Or maybe you mean that Gramine doesn't support clock_settime() -- in this case modify the comment.

Done.


libos/test/regression/tests_musl.toml line 126 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Looks like this line must be removed

Not relevant any more.


libos/test/regression/timerfd_fork.c line 28 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

You can use the struct init syntax of C:

struct itimerspec new_value = { .it_value.tv_sec = TIMEOUT_VALUE };

The other fields will be by default set to zeros.

Done.


libos/test/regression/timerfd_fork.c line 42 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

A more correct comment would be

child: wait on a blocking read for the timer to expire

Done.


libos/include/linux_abi/timerfd.h line 8 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Please modify or remove this comment.

Moved to time.h done.


libos/include/linux_abi/timerfd.h line 11 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Wait, but the point of this header is to remove Linux-host ones. Please copy only the relevant bits from that header into this file.

Moved to time.h done.


libos/src/bookkeep/libos_handle.c line 34 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

This whole file is taken from #1736.

Blocking as a prerequisite PR.

Rebased, done.

Copy link

@dimakuv dimakuv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 34 of 38 files at r3, 1 of 1 files at r4, all commit messages.
Reviewable status: 37 of 40 files reviewed, 15 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "WIP" found in commit messages' one-liners (waiting on @kailun-qin)


-- commits line 2 at r4:
Why is it still WIP? It should be ready for review, right?

If yes, please update the PR description too.


-- commits line 16 at r4:
is not -> are not


Documentation/devel/features.md line 2884 at r4 (raw file):

multi-process applications, e.g., if the child process inherits the timerfd object but doesn't use
it. However, all timerfds created in the parent process are marked as invalid in child processes,
i.e. inter-process timing signals via timerfds is not allowed.

is not allowed -> are not allowed


libos/src/libos_async.c line 62 at r4 (raw file):

                            uint64_t time_us, bool absolute_time,
                            void (*callback)(IDTYPE caller, void* arg), void* arg) {
    /* if event happens on async IO object, time_us must be zero */

I would remove this comment, as this is misleadingly short. And we already have a proper explanation in the top-level comment (The async event type is specified ...)


libos/src/libos_async.c line 94 at r4 (raw file):

        struct async_event* n;
        LISTP_FOR_EACH_ENTRY_SAFE(tmp, n, &async_list, list) {
            if (tmp->object == object && tmp->expire_time_us) {

Do I understand correctly that this silently assumes that alarm() and setitimer() have object == NULL, and thus the comparison becomes NULL == NULL (thus only the second check is interesting for these two syscalls)?

In other words, check for the object is only meaningful for timer_settime().


libos/src/libos_parser.c line 520 at r4 (raw file):

    [__NR_signalfd] = {.slow = false, .name = "signalfd", .parser = {NULL}},
    [__NR_timerfd_create] = {.slow = false, .name = "timerfd_create", .parser = {parse_long_arg,
                             parse_integer_arg, parse_integer_arg}},

Here and below: would be good to add proper parsers (at least for flags, but also for struct itimerspec args). But not blocking.


libos/src/sys/libos_epoll.c line 195 at r4 (raw file):

            needs_et = true;
            if (!in)
                __atomic_store_n(&handle->needs_et_poll_in, true, __ATOMIC_RELEASE);

Hm, I can't find on the internet what happens with edge-triggered timerfd objects. @kailun-qin Could you explain your reasoning behind this change?

I think timerfd is actually unsupported with EPOLLET?


libos/test/regression/timerfd_fork.c line 39 at r4 (raw file):

        uint64_t expirations;
        /* child: wait on a blocking read for the timer to expire */
        CHECK(read(fd, &expirations, sizeof(expirations)));

The test currently fails here, because read() returns a negative error code, right? I would put a comment about the current state of affairs.

@kailun-qin kailun-qin force-pushed the kailun-qin/timerfd-support branch from f362403 to fb57269 Compare June 5, 2024 08:47
Copy link
Contributor Author

@kailun-qin kailun-qin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 32 of 40 files reviewed, 14 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel) (waiting on @dimakuv)


-- commits line 2 at r4:

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Why is it still WIP? It should be ready for review, right?

If yes, please update the PR description too.

Yeah, I rebased it and expected to look into #1721 first. Anyway, I removed WIP now.


-- commits line 16 at r4:

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

is not -> are not

Done.


Documentation/devel/features.md line 2884 at r4 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

is not allowed -> are not allowed

Done.


libos/src/libos_async.c line 62 at r4 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

I would remove this comment, as this is misleadingly short. And we already have a proper explanation in the top-level comment (The async event type is specified ...)

Done.


libos/src/libos_async.c line 94 at r4 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Do I understand correctly that this silently assumes that alarm() and setitimer() have object == NULL, and thus the comparison becomes NULL == NULL (thus only the second check is interesting for these two syscalls)?

In other words, check for the object is only meaningful for timer_settime().

Yes, correct.


libos/src/libos_parser.c line 520 at r4 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Here and below: would be good to add proper parsers (at least for flags, but also for struct itimerspec args). But not blocking.

Done.


libos/src/sys/libos_epoll.c line 195 at r4 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Hm, I can't find on the internet what happens with edge-triggered timerfd objects. @kailun-qin Could you explain your reasoning behind this change?

I think timerfd is actually unsupported with EPOLLET?

I explained a bit in the comment.


libos/test/regression/timerfd_fork.c line 39 at r4 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

The test currently fails here, because read() returns a negative error code, right? I would put a comment about the current state of affairs.

Done.

@kailun-qin kailun-qin force-pushed the kailun-qin/timerfd-support branch from fb57269 to d840592 Compare June 5, 2024 10:48
@kailun-qin kailun-qin marked this pull request as ready for review June 5, 2024 14:11
@kailun-qin kailun-qin changed the title [WIP] [LibOS] Add support for timerfd system calls [LibOS] Add support for timerfd system calls Jun 5, 2024
Copy link

@dimakuv dimakuv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 7 of 8 files at r5, 3 of 3 files at r6, all commit messages.
Reviewable status: all files reviewed, 42 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel) (waiting on @kailun-qin)


Documentation/devel/features.md line 2879 at r6 (raw file):

the host. This is purely for triggering read notifications (e.g., in epoll); timerfd data is
verified inside Gramine and is never exposed to the host. Since the host is used purely for
notifications, a malicious host can only induce Denial of Service (DoS) attacks.

We should have a note somewhere here that TFD_TIMER_ABSTIME is silently ignored, and we do this because we don't have "discontinuous changes of time" in Gramine (via e.g. settimeofday()).


Documentation/devel/features.md line 2880 at r6 (raw file):

verified inside Gramine and is never exposed to the host. Since the host is used purely for
notifications, a malicious host can only induce Denial of Service (DoS) attacks.

We should mention that TFD_IOC_SET_TICKS is not supported.


Documentation/devel/features.md line 2881 at r6 (raw file):

notifications, a malicious host can only induce Denial of Service (DoS) attacks.

The emulation is currently implemented at the level of a single process. The emulation *may* work for

Exceeds 100 chars per line


libos/include/libos_handle.h line 139 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

TODO(myself): understand why we need two locks.

Done:

  • expiration_lock synchronizes the operations of reading/writing/polling on the timerfd object (and its underlying eventfd host object)
  • timer_lock synchronizes the accesses to the time fields: absolute timeout and relative reset.

This makes sense to me, even though I think a single lock would also work totally fine.


libos/include/libos_handle.h line 154 at r6 (raw file):

    spinlock_t timer_lock; /* protecting below fields */
    uint64_t timeout;

Please add that timeout is always an absolute time in LibOS logic


libos/src/fs/timerfd/fs.c line 27 at r6 (raw file):

}

static void timerfd_dummy_host_read(struct libos_handle* hdl) {

Could you add a comment that this implementation is the same as in fs/eventfd.c


libos/src/fs/timerfd/fs.c line 41 at r6 (raw file):

}

static void timerfd_dummy_host_wait(struct libos_handle* hdl) {

Could you add a comment that this implementation is the same as in fs/eventfd.c


libos/src/fs/timerfd/fs.c line 81 at r6 (raw file):

    /* perform a read (not supposed to block) to clear the event from polling threads and to send an
     * event to writing threads */

We don't have writing threads in timerfds -- they don't implement the write() system call. So please remove that second part of the comment ...and to send an event to writing threads.

We only do this dummy host read to reset the polling threads.


libos/src/fs/timerfd/fs.c line 125 at r6 (raw file):

    .checkin   = &timerfd_checkin,
    .read      = &timerfd_read,
    .post_poll = &timerfd_post_poll,

Aren't you supposed to also implement .close callback? Otherwise IIUC a periodic timerfd will just continue firing, even though its corresponding FD was closed.


libos/src/sys/libos_timerfd.c line 19 at r6 (raw file):

 * child processes, i.e. inter-process timing signals via timerfds are not allowed.
 *
 * The host's timerfd object is "dummy" and used purely for notifications -- to unblock blocking

host's timerfd object -> host's eventfd object


libos/src/sys/libos_timerfd.c line 21 at r6 (raw file):

 * The host's timerfd object is "dummy" and used purely for notifications -- to unblock blocking
 * read/select/poll/epoll system calls. The read notify logic is already hardened, by
 * double-checking that the object was indeed updated. However, there are three possible attacks on

that the object was indeed updated -> that the timerfd object indeed expired


libos/src/sys/libos_timerfd.c line 31 at r6 (raw file):

 *    This is a Denial of Service (DoS), which we don't care about.
 * c. Malicious host may inject POLLERR, POLLHUP, POLLRDHUP, POLLNVAL, POLLOUT. This is impossible
 *    as we control timerfd objects inside the LibOS, and we never raise such conditions. So the

In other parts of this text you said in Gramine, but here you say in inside LibOS. Please unify to one term.


libos/src/sys/libos_timerfd.c line 45 at r6 (raw file):

#include "linux_eventfd.h"
#include "pal.h"
#include "toml_utils.h"

Not needed?


libos/src/sys/libos_timerfd.c line 47 at r6 (raw file):

#include "toml_utils.h"

static void timerfd_dummy_host_write(struct libos_handle* hdl) {

Could you add a comment that this implementation is the same as in fs/eventfd.c


libos/src/sys/libos_timerfd.c line 62 at r6 (raw file):

static int create_timerfd_pal_handle(PAL_HANDLE* out_pal_handle) {
    int ret;

Merge this variable declaration with its first usage.


libos/src/sys/libos_timerfd.c line 101 at r6 (raw file):

    hdl->flags = O_RDONLY | (flags & TFD_NONBLOCK ? O_NONBLOCK : 0);
    hdl->acc_mode = MAY_READ;
    hdl->info.timerfd.broken_in_child = false;

Could you set to NULL/zeros all other fields? Like num_expirations, dummy_host_val, ...


libos/src/sys/libos_timerfd.c line 117 at r6 (raw file):

        log_warning("Child process tried to access timerfd created by parent process. This is "
                    "disallowed in Gramine.");
        die_or_inf_loop();

Can we end up in this code path?

If yes (meaning that the timerfd got expired in the child process), isn't it too drastic to just kill the child process? I would prefer to have a log_warning() and that's it -- not killing the process.


libos/src/sys/libos_timerfd.c line 129 at r6 (raw file):

        /* perform a write (not supposed to block) to send an event to reading/polling threads */
        timerfd_dummy_host_write(hdl);

If I understand the comment about overflow correctly, you actually need to keep only num_expirations++ inside the IF body. The other two actions (incrementing the host value and performing a host write) must always be executed (to inform about the firing timer). Or am I wrong?


libos/src/sys/libos_timerfd.c line 134 at r6 (raw file):

    spinlock_unlock(&hdl->info.timerfd.expiration_lock);

    maybe_epoll_et_trigger(hdl, /*ret=*/0, /*in=*/false, /*unused was_partial=*/false);

This looks useless, because we don't have POLLOUT events for timerfd objects. I think you can safely remove this line.


libos/src/sys/libos_timerfd.c line 184 at r6 (raw file):

    }

    if (flags & ~TFD_SETTIME_FLAGS) {

We should have a comment somewhere here that TFD_TIMER_ABSTIME is silently ignored, and we do this because we don't have "discontinuous changes of time" in Gramine (via e.g. settimeofday()).


libos/src/sys/libos_timerfd.c line 196 at r6 (raw file):

    }

    uint64_t next_value = timespec_to_us(&value->it_value);

Since you have current_timeout, it would be more uniform to call this variable new_timeout


libos/src/sys/libos_timerfd.c line 197 at r6 (raw file):

    uint64_t next_value = timespec_to_us(&value->it_value);
    uint64_t next_reset = timespec_to_us(&value->it_interval);

I think new_reset is a better name


libos/src/sys/libos_timerfd.c line 225 at r6 (raw file):

        }
    }

I don't see a "disarm the timer" logic, when next_value == 0. Gramine currently cannot support the disarming?


libos/test/regression/timerfd.c line 10 at r6 (raw file):

 * `timerfd_gettime()`).
 *
 * The tests involve cases including reading a blocking/non-blocking timerfd, poll/epoll/selecting

selecting -> select


libos/test/regression/timerfd.c line 84 at r6 (raw file):

    }

    int max_fd = MAX(fds[0], fds[1]) + 1;

Here you assume that NUM_FDS == 2. Please detect the maximum FD in the for loop above, otherwise this code will glitch if we ever change to NUM_FDS = 3.

Then you'll also be able to remove the macro MAX.


libos/test/regression/timerfd.c line 85 at r6 (raw file):

    int max_fd = MAX(fds[0], fds[1]) + 1;
    CHECK(select(max_fd, &rfds, NULL, NULL, NULL));

Can you also check the return value -- that it is not zero (meaning that the timeout expired).


libos/test/regression/timerfd.c line 91 at r6 (raw file):

            uint64_t expirations;
            CHECK(read(fds[i], &expirations, sizeof(expirations)));
            if (expirations != 1)

Aren't you supposed to use EXPECTED_EXPIRATIONS instead of 1? Ditto for other places like this below. Alternatively, remove that macro.


libos/test/regression/timerfd.c line 105 at r6 (raw file):

    }

    CHECK(poll(pfds, NUM_FDS, -1));

Can you also check the return value -- that it is not zero (meaning that the timeout expired).


libos/test/regression/timerfd.c line 129 at r6 (raw file):

    struct epoll_event events[NUM_FDS];
    int nfds = CHECK(epoll_wait(epfd, events, NUM_FDS, -1));

Can you also check this return value nfds -- that it is not zero (meaning that the timeout expired).


libos/test/regression/timerfd.c line 131 at r6 (raw file):

    int nfds = CHECK(epoll_wait(epfd, events, NUM_FDS, -1));

    for (int n = 0; n < nfds; ++n) {

Can you replace with a classic:

for (int i = 0; i < nfds; i++) {

Otherwise it looks like a copy-paste from some other project :)


libos/test/regression/timerfd.c line 138 at r6 (raw file):

    }

    close(epfd);

CHECK()


libos/test/regression/timerfd.c line 141 at r6 (raw file):

}

static void test_epoll_modes(int fd) {

Please add a comment that this test expects the fd timerfd to be a periodic timer


libos/test/regression/timerfd.c line 156 at r6 (raw file):

    /* waiting for another event without reading the expiration count */
    nfds = CHECK(epoll_wait(epfd, events, 1, /*timeout=*/2000));

You can't hard-code 2000 here because what if we change PERIODIC_INTERVAL to say 5 seconds? I suggest to use smth like PERIODIC_INTERVAL * 1000 * 2.


libos/test/regression/timerfd.c line 169 at r6 (raw file):

        errx(1, "epoll: unexpected number of fds (expected 1, got %u)", nfds);

    /* waiting for another event without reading the expiration count */

Please add a comment that here, even though the timer expired at least one, there is no event reported because we're in edge-triggered mode (which does not "reset" the event since there was no read)


libos/test/regression/timerfd.c line 170 at r6 (raw file):

    /* waiting for another event without reading the expiration count */
    nfds = CHECK(epoll_wait(epfd, events, 1, /*timeout=*/2000));

ditto


libos/test/regression/timerfd.c line 175 at r6 (raw file):

             "(expected 0, got %u)", nfds);

    close(epfd);

CHECK()


libos/test/regression/timerfd.c line 199 at r6 (raw file):

static void test_periodic_timer(int fd) {
    pthread_t thread;
    CHECK(pthread_create(&thread, NULL, timerfd_read_thread_periodic_timer, &fd));

I'm confused. Why did you have to create a separate thread here? Why can't you do this while loop of read(timerfd) in the main thread? I don't mind checking the two-thread scenario, just miss a comment as to why you decide to test like this.


libos/test/regression/timerfd.c line 208 at r6 (raw file):

    pthread_mutex_unlock(&mutex);

    if (expiration_count != EXPECTED_PERIODIC_TIMER_EXPIRATION_COUNT)

Don't you want to read expiration_count under the mutex? Otherwise in a super-improbable case, the other thread may increase expiration_count and we'll have a error here.


libos/test/regression/timerfd.c line 226 at r6 (raw file):

}

static void test_threaded_read(int fd) {

You must comment that this test requires a periodic timer (so that all NUM_THREADS threads have something to read).


libos/test/regression/timerfd.c line 283 at r6 (raw file):

}

static void test_read(int fd, bool non_blocking) {

You must put a comment here that this func must be executed twice: first with blocking to make the non-periodic timer expire, and then with non-blocking to timeout (because the timer is disarmed at that point). Without this comment it's unclear why we would expect -EAGAIN in the non-blocking case.


libos/test/regression/timerfd.c line 309 at r6 (raw file):

    create_timerfds(fds);

    set_timerfds_relative(fds, /*periodic*/false);

Here and below /*periodic=*/false (you forgot the = sign)


libos/test/regression/timerfd.c line 335 at r6 (raw file):

    test_absolute_time(fds[1]);

Can we have a small test on disarming the timer? I'd like to test timerfd_settime(it_interval = {0,0}, &old_value). In other words, the test can be like this:

  1. First set timerfd normally (2 second timeout)
  2. Then immediately disarm the timer (by supplying it_interval = {0, 0}).
  3. At the same time, get the old value of the timer and check that it's around 2 seconds -- same as you do in the test_timerfd_gettime() test.
  4. Finally, do a test_poll(timeout=3sec) that will fail with a timeout -- this proves that the timer was disarmed.

Copy link
Contributor Author

@kailun-qin kailun-qin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 42 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel) (waiting on @dimakuv)


Documentation/devel/features.md line 2879 at r6 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

We should have a note somewhere here that TFD_TIMER_ABSTIME is silently ignored, and we do this because we don't have "discontinuous changes of time" in Gramine (via e.g. settimeofday()).

Done.


Documentation/devel/features.md line 2880 at r6 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

We should mention that TFD_IOC_SET_TICKS is not supported.

Done.


Documentation/devel/features.md line 2881 at r6 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Exceeds 100 chars per line

Done.


libos/include/libos_handle.h line 154 at r6 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Please add that timeout is always an absolute time in LibOS logic

Done.


libos/src/fs/timerfd/fs.c line 27 at r6 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Could you add a comment that this implementation is the same as in fs/eventfd.c

Done.


libos/src/fs/timerfd/fs.c line 41 at r6 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Could you add a comment that this implementation is the same as in fs/eventfd.c

Done.


libos/src/fs/timerfd/fs.c line 81 at r6 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

We don't have writing threads in timerfds -- they don't implement the write() system call. So please remove that second part of the comment ...and to send an event to writing threads.

We only do this dummy host read to reset the polling threads.

Done.


libos/src/fs/timerfd/fs.c line 125 at r6 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Aren't you supposed to also implement .close callback? Otherwise IIUC a periodic timerfd will just continue firing, even though its corresponding FD was closed.

Done.


libos/src/sys/libos_timerfd.c line 19 at r6 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

host's timerfd object -> host's eventfd object

Done.


libos/src/sys/libos_timerfd.c line 21 at r6 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

that the object was indeed updated -> that the timerfd object indeed expired

Done.


libos/src/sys/libos_timerfd.c line 31 at r6 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

In other parts of this text you said in Gramine, but here you say in inside LibOS. Please unify to one term.

aligned to LibOS. Done.


libos/src/sys/libos_timerfd.c line 45 at r6 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Not needed?

Done.


libos/src/sys/libos_timerfd.c line 47 at r6 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Could you add a comment that this implementation is the same as in fs/eventfd.c

Done.


libos/src/sys/libos_timerfd.c line 62 at r6 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Merge this variable declaration with its first usage.

Done.


libos/src/sys/libos_timerfd.c line 101 at r6 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Could you set to NULL/zeros all other fields? Like num_expirations, dummy_host_val, ...

Done.


libos/src/sys/libos_timerfd.c line 117 at r6 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Can we end up in this code path?

If yes (meaning that the timerfd got expired in the child process), isn't it too drastic to just kill the child process? I would prefer to have a log_warning() and that's it -- not killing the process.

No we can't (as long as the child does not reconfigure the timerfd -- which we should disallow).


libos/src/sys/libos_timerfd.c line 129 at r6 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

If I understand the comment about overflow correctly, you actually need to keep only num_expirations++ inside the IF body. The other two actions (incrementing the host value and performing a host write) must always be executed (to inform about the firing timer). Or am I wrong?

Your understanding is correct, fixed.


libos/src/sys/libos_timerfd.c line 134 at r6 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

This looks useless, because we don't have POLLOUT events for timerfd objects. I think you can safely remove this line.

Done.


libos/src/sys/libos_timerfd.c line 184 at r6 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

We should have a comment somewhere here that TFD_TIMER_ABSTIME is silently ignored, and we do this because we don't have "discontinuous changes of time" in Gramine (via e.g. settimeofday()).

Done.


libos/src/sys/libos_timerfd.c line 196 at r6 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Since you have current_timeout, it would be more uniform to call this variable new_timeout

Done.


libos/src/sys/libos_timerfd.c line 197 at r6 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

I think new_reset is a better name

Done.


libos/src/sys/libos_timerfd.c line 225 at r6 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

I don't see a "disarm the timer" logic, when next_value == 0. Gramine currently cannot support the disarming?

Done.


libos/test/regression/timerfd.c line 10 at r6 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

selecting -> select

Done.


libos/test/regression/timerfd.c line 84 at r6 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Here you assume that NUM_FDS == 2. Please detect the maximum FD in the for loop above, otherwise this code will glitch if we ever change to NUM_FDS = 3.

Then you'll also be able to remove the macro MAX.

Done.


libos/test/regression/timerfd.c line 85 at r6 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Can you also check the return value -- that it is not zero (meaning that the timeout expired).

Done.


libos/test/regression/timerfd.c line 91 at r6 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Aren't you supposed to use EXPECTED_EXPIRATIONS instead of 1? Ditto for other places like this below. Alternatively, remove that macro.

Done.


libos/test/regression/timerfd.c line 105 at r6 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Can you also check the return value -- that it is not zero (meaning that the timeout expired).

Done.


libos/test/regression/timerfd.c line 129 at r6 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Can you also check this return value nfds -- that it is not zero (meaning that the timeout expired).

Done.


libos/test/regression/timerfd.c line 131 at r6 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Can you replace with a classic:

for (int i = 0; i < nfds; i++) {

Otherwise it looks like a copy-paste from some other project :)

Done.


libos/test/regression/timerfd.c line 138 at r6 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

CHECK()

Done.


libos/test/regression/timerfd.c line 141 at r6 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Please add a comment that this test expects the fd timerfd to be a periodic timer

Done.


libos/test/regression/timerfd.c line 156 at r6 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

You can't hard-code 2000 here because what if we change PERIODIC_INTERVAL to say 5 seconds? I suggest to use smth like PERIODIC_INTERVAL * 1000 * 2.

Done.


libos/test/regression/timerfd.c line 169 at r6 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Please add a comment that here, even though the timer expired at least one, there is no event reported because we're in edge-triggered mode (which does not "reset" the event since there was no read)

Done.


libos/test/regression/timerfd.c line 170 at r6 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

ditto

Done.


libos/test/regression/timerfd.c line 175 at r6 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

CHECK()

Done.


libos/test/regression/timerfd.c line 199 at r6 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

I'm confused. Why did you have to create a separate thread here? Why can't you do this while loop of read(timerfd) in the main thread? I don't mind checking the two-thread scenario, just miss a comment as to why you decide to test like this.

I can't recall why I tested in this way -- changing to do this while loop of read(timerfd) in the main thread.


libos/test/regression/timerfd.c line 208 at r6 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Don't you want to read expiration_count under the mutex? Otherwise in a super-improbable case, the other thread may increase expiration_count and we'll have a error here.

Not relevant any more.


libos/test/regression/timerfd.c line 226 at r6 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

You must comment that this test requires a periodic timer (so that all NUM_THREADS threads have something to read).

Done.


libos/test/regression/timerfd.c line 283 at r6 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

You must put a comment here that this func must be executed twice: first with blocking to make the non-periodic timer expire, and then with non-blocking to timeout (because the timer is disarmed at that point). Without this comment it's unclear why we would expect -EAGAIN in the non-blocking case.

Done.


libos/test/regression/timerfd.c line 309 at r6 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Here and below /*periodic=*/false (you forgot the = sign)

Done.


libos/test/regression/timerfd.c line 335 at r6 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Can we have a small test on disarming the timer? I'd like to test timerfd_settime(it_interval = {0,0}, &old_value). In other words, the test can be like this:

  1. First set timerfd normally (2 second timeout)
  2. Then immediately disarm the timer (by supplying it_interval = {0, 0}).
  3. At the same time, get the old value of the timer and check that it's around 2 seconds -- same as you do in the test_timerfd_gettime() test.
  4. Finally, do a test_poll(timeout=3sec) that will fail with a timeout -- this proves that the timer was disarmed.

Done.

Copy link

@dimakuv dimakuv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 5 of 5 files at r7, all commit messages.
Reviewable status: all files reviewed, 4 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "fixup! " found in commit messages' one-liners (waiting on @kailun-qin)


Documentation/devel/features.md line 2879 at r6 (raw file):

Previously, kailun-qin (Kailun Qin) wrote…

Done.

Oops, I incorrectly wrote TFD_TIMER_ABSTIME when in reality I meant TFD_TIMER_CANCEL_ON_SET. But Kailun correctly wrote his text.


libos/src/sys/libos_timerfd.c line 233 at r2 (raw file):

Previously, kailun-qin (Kailun Qin) wrote…

Did not fix this yet but I agree w/ the problem -- I'll look into the linked PR.

This is the only comment that I have now; all others were resolved.


libos/src/sys/libos_timerfd.c line 169 at r7 (raw file):

                    "disallowed in Gramine.");
        return -EIO;
    }

You should move this check after the next one (type != TYPE_TIMERFD).

Otherwise the user app may call timerfd_settime(<fd-corresponding-to-regular-file>), and the field broken_in_child will read from totally different field.


libos/src/sys/libos_timerfd.c line 260 at r7 (raw file):

                    "disallowed in Gramine.");
        return -EIO;
    }

ditto (move further down)


libos/test/regression/timerfd.c line 213 at r7 (raw file):

    /* waiting for another event without reading the expiration count: here, even though the timer
     * expired at least one, there is no event reported because we're in edge-triggered mode (which

one -> once

Copy link
Contributor Author

@kailun-qin kailun-qin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 4 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "fixup! " found in commit messages' one-liners (waiting on @dimakuv)


libos/src/sys/libos_timerfd.c line 169 at r7 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

You should move this check after the next one (type != TYPE_TIMERFD).

Otherwise the user app may call timerfd_settime(<fd-corresponding-to-regular-file>), and the field broken_in_child will read from totally different field.

Good point, done.


libos/src/sys/libos_timerfd.c line 260 at r7 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

ditto (move further down)

Done.


libos/test/regression/timerfd.c line 213 at r7 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

one -> once

Done.

Copy link

@dimakuv dimakuv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 2 files at r8, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "fixup! " found in commit messages' one-liners (waiting on @kailun-qin)

a discussion (no related file):
You have this in PR description:

TODO: test it on some real workloads.

Did you already do this? Do you plan to do this?


Copy link
Contributor Author

@kailun-qin kailun-qin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 2 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "fixup! " found in commit messages' one-liners (waiting on @dimakuv)

a discussion (no related file):

Did you already do this?

Yes, I tested w/ stress-ng --timerfd.


Copy link
Contributor Author

@kailun-qin kailun-qin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 2 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "fixup! " found in commit messages' one-liners (waiting on @dimakuv)

a discussion (no related file):

Yes, I tested w/ stress-ng --timerfd.

Well, it's not that real -- but it's the only suggested workload I got from internal teams.


@kailun-qin kailun-qin force-pushed the kailun-qin/timerfd-support branch from c327e2f to 7d8e249 Compare July 25, 2024 01:53
Copy link

@dimakuv dimakuv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 9 of 9 files at r9, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel) (waiting on @kailun-qin)

a discussion (no related file):

Previously, kailun-qin (Kailun Qin) wrote…

Yes, I tested w/ stress-ng --timerfd.

Well, it's not that real -- but it's the only suggested workload I got from internal teams.

@kailun-qin Can you then update the PR description?


Copy link
Contributor Author

@kailun-qin kailun-qin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 2 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel) (waiting on @dimakuv)

a discussion (no related file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

@kailun-qin Can you then update the PR description?

Done.


Copy link
Contributor

@mwkmwkmwk mwkmwkmwk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 33 files at r1, 17 of 38 files at r3, 2 of 5 files at r7, 4 of 9 files at r9.
Reviewable status: all files reviewed, 3 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel) (waiting on @kailun-qin)


libos/src/libos_parser.c line 522 at r9 (raw file):

                          parse_integer_arg, parse_pointer_arg, parse_pointer_arg}},
    [__NR_signalfd] = {.slow = false, .name = "signalfd", .parser = {NULL}},
    [__NR_timerfd_create] = {.slow = false, .name = "timerfd_create", .parser = {parse_long_arg,

the first argument is not an int — it's an enum of CLOCK_* values

while we're at it, these consts should be moved to linux_abi/


libos/src/libos_parser.c line 1162 at r9 (raw file):

    }

    buf_printf(buf, "intvl:[%ld,%ld] val: [%ld,%ld]",

this is quite internally inconsistent formatting (space after : or not)

Copy link
Member

@mkow mkow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 15 of 38 files at r3, 1 of 5 files at r7, 2 of 9 files at r9, all commit messages.
Reviewable status: all files reviewed, 6 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel) (waiting on @kailun-qin)


-- commits line 13 at r9:
That's not true, it doesn't "work" for multi-process, it just doesn't make Gramine exit right after fork. only if the app actually tries to use it in the child.

Code quote:

it may sometimes work for multi-process applications

Documentation/devel/features.md line 2876 at r9 (raw file):

and all operations are resolved entirely inside Gramine

Kinda yes, but no. This gives an impression that the whole working of timerfd is trusted, but in reality the time source in Gramine on SGX is not.


Documentation/devel/features.md line 2905 at r9 (raw file):

-`timer_delete()`: may be implemented in the future

-`timerfd_create()`: see notes above

ditto below

Suggestion:

see the notes above

@mkow
Copy link
Member

mkow commented Aug 12, 2024

libos/src/sys/libos_timerfd.c line 233 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

This is the only comment that I have now; all others were resolved.

So, this isn't fixed yet, right? Let's wait with this PR and fix this problem first.

Copy link

@dimakuv dimakuv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 6 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel) (waiting on @kailun-qin and @mkow)


libos/src/sys/libos_timerfd.c line 233 at r2 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

So, this isn't fixed yet, right? Let's wait with this PR and fix this problem first.

@kailun-qin Any progress on this?

Copy link
Contributor Author

@kailun-qin kailun-qin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 36 of 42 files reviewed, 6 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "fixup! " and "squash! " found in commit messages' one-liners (waiting on @dimakuv, @mkow, and @mwkmwkmwk)


-- commits line 13 at r9:

Previously, mkow (Michał Kowalczyk) wrote…

That's not true, it doesn't "work" for multi-process, it just doesn't make Gramine exit right after fork. only if the app actually tries to use it in the child.

Done.


Documentation/devel/features.md line 2876 at r9 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

and all operations are resolved entirely inside Gramine

Kinda yes, but no. This gives an impression that the whole working of timerfd is trusted, but in reality the time source in Gramine on SGX is not.

Done.


Documentation/devel/features.md line 2905 at r9 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

ditto below

Well, all similar places in this doc use "see notes above". Anyway, I switched to the version w/ "the" in our case.


libos/src/libos_parser.c line 522 at r9 (raw file):

the first argument is not an int — it's an enum of CLOCK_* values

But it's really an int for timerfd_create()?

while we're at it, these consts should be moved to linux_abi/

Sorry I'm not sure I undersood the request correctly. Pls take a look at the update and see if it's what you expect.


libos/src/libos_parser.c line 1162 at r9 (raw file):

Previously, mwkmwkmwk (Marcelina Kościelnicka) wrote…

this is quite internally inconsistent formatting (space after : or not)

Done.


libos/src/sys/libos_timerfd.c line 233 at r2 (raw file):

Any progress on this?

Just started testing and looking at it.

Copy link
Member

@mkow mkow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 38 files at r3, 1 of 6 files at r10, all commit messages.
Reviewable status: 37 of 42 files reviewed, 4 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "fixup! " and "squash! " found in commit messages' one-liners (waiting on @dimakuv and @mwkmwkmwk)


Documentation/devel/features.md line 2905 at r9 (raw file):

Previously, kailun-qin (Kailun Qin) wrote…

Well, all similar places in this doc use "see notes above". Anyway, I switched to the version w/ "the" in our case.

Yeah, but I think this is the correct version.


libos/src/libos_parser.c line 522 at r9 (raw file):
What @mwkmwkmwk meant is that it's an enum and we should pretty-print it.

Sorry I'm not sure I undersood the request correctly. Pls take a look at the update and see if it's what you expect.

It's about CLOCK_* consts, which should be added to our own headers, not taken from the build host (which theoretically may not even be Linux).

This commit adds support for system calls that create and operate on a
timer that delivers timer expiration notifications via a file
descriptor, specifically: `timerfd_create()`, `timerfd_settime()` and
`timerfd_gettime()`. The timerfd object is associated with a dummy
eventfd created on the host to trigger notifications (e.g., in epoll).
The object is created inside Gramine, with all its operations resolved
entirely inside Gramine (note that the time source in Gramine SGX is
still untrusted).

The emulation is currently implemented at the level of a single process.
All timerfds created in the parent process are marked as invalid in
child processes. In multi-process applications, Gramine does not exit
immediately after fork; it only exits if the application attempts to use
timerfds in the child. Therefore, inter-process timing signals via
timerfds are not allowed.

LibOS regression tests are also added.

Signed-off-by: Kailun Qin <[email protected]>
@kailun-qin kailun-qin force-pushed the kailun-qin/timerfd-support branch from dd9d90c to a717ed5 Compare September 10, 2024 23:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Working on it
Development

Successfully merging this pull request may close these issues.

4 participants