The proposed RW VMA lock (see Solution) is more heavy-weight on the slow path, as it uses PalEventSet() and PalEventWait().
In contention cases, PalEventSet() and PalEventWait() may perform a futex OCALL, whose cost can outweigh the benefits of switching to the RW lock. On the other hand, it is typically better to sleep on a futex in a contention case, and these PAL APIs are optimized to elide the OCALL when possible.
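To make the "elide the OCALL when possible" point concrete, here is a toy model in C. The issue does not describe how the PAL events achieve this, so the waiter-count check below is an assumption, and every name (toy_event, toy_event_set, toy_event_try_wait) is hypothetical rather than a Gramine API. The expensive wake-up is simulated with a plain counter.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Toy event; hypothetical type, not Gramine's PAL event. */
struct toy_event {
    atomic_bool signaled; /* set by the signaller, consumed by a waiter */
    atomic_int waiters;   /* number of threads assumed to be asleep on the event */
    int expensive_wakes;  /* simulated futex-wake OCALLs (single-threaded demo counter) */
};

/* Signal the event. The costly wake-up is only "issued" if someone sleeps. */
static void toy_event_set(struct toy_event* e) {
    atomic_store(&e->signaled, true);
    if (atomic_load(&e->waiters) > 0)
        e->expensive_wakes++; /* stands in for the futex OCALL */
}

/* Fast path of a wait: consume a pending signal without sleeping.
 * Returns false if the caller would have to take the slow (sleeping) path. */
static bool toy_event_try_wait(struct toy_event* e) {
    bool expected = true;
    return atomic_compare_exchange_strong(&e->signaled, &expected, false);
}
```

In this model an uncontended set/wait pair never touches the simulated OCALL; only a set that finds a registered waiter pays for it.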
Problem
Multi-threaded workloads with many syscalls stress the VMA subsystem a lot, because almost all syscalls verify their buffers for read/write access using the following functions:
is_user_memory_readable()
is_user_memory_writable()
is_user_string_readable()
is_user_memory_writable_no_skip()
All these functions call the test_user_memory() helper: gramine/libos/src/bookkeep/libos_signal.c, lines 393 to 405 in 4afc550.
This helper in turn calls the is_in_adjacent_user_vmas() function: gramine/libos/src/bookkeep/libos_vma.c, lines 1204 to 1219 in 4afc550.
The important part is the spinlock_lock(&vma_tree_lock) and spinlock_unlock(&vma_tree_lock) pair. In a multi-threaded app, contention on this lock becomes the bottleneck.
Gramine introduced a workaround to sidestep this bottleneck, via the libos.check_invalid_pointers manifest option; it translates to the g_check_invalid_ptrs variable. However, this cannot be used in all cases:
Some runtimes like Java rely on being able to check invalid pointers. Thus they cannot set libos.check_invalid_pointers = false; this would lead to Java apps failing.
The is_user_memory_writable_no_skip() function does not honor the libos.check_invalid_pointers manifest option; this is because in certain situations Gramine really must decide whether the VMA is writable or read-only, see e.g. the ppoll() case which emulates how Linux works.
Solution
Use the RW lock that was previously introduced in the Gramine codebase: https://github.com/gramineproject/gramine/blob/master/libos/include/libos_rwlock.h
Example usage of this RW lock: f071450
Benchmark results
TODO