Deadlock when a process tries to open and lock on the same file twice #315

youkaichao opened this issue Mar 24, 2024 · 5 comments

youkaichao commented Mar 24, 2024

Minimal reproducible example:

from filelock import FileLock

lock_path = "high_ground.txt.lock"

lock1 = FileLock(lock_path)

with lock1:
    lock2 = FileLock(lock_path)
    with lock2:  # deadlocks here
        print("I have the high ground!")

This is dangerous: even when we create only one FileLock, we cannot know whether some other code in the same process has already created a FileLock on the same file and is currently holding it. (On Unix, lock2 opens a second file descriptor on the lock file and blocks in flock() while lock1's descriptor still holds the lock, so the process deadlocks on itself.)

A possible solution would be to check for an existing fd on the file in the UnixFileLock._acquire function. On Linux this is easy, because a Python fd is just an integer and every open fd is listed under /proc/{process_id}/fd/ .
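
A minimal sketch of such a check, assuming Linux's /proc layout; the helper name fds_open_on is made up for illustration:

import os

def fds_open_on(path):
    # Hypothetical helper: return the fds in this process that currently
    # refer to *path*, by resolving the symlinks in /proc/self/fd (Linux-only).
    target = os.path.realpath(path)
    fds = []
    for entry in os.listdir("/proc/self/fd"):
        try:
            if os.path.realpath(f"/proc/self/fd/{entry}") == target:
                fds.append(int(entry))
        except OSError:
            continue  # the fd was closed between listdir() and realpath()
    return fds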

cc @gaborbernat

YouJiacheng commented Mar 24, 2024

If we use /proc/self/fd, the lock will be re-entrant across threads (see also https://py-filelock.readthedocs.io/en/latest/#filelocks-and-threads). Maybe we can try /proc/thread-self/fd or equivalently /proc/{pid}/task/{tid}/fd, but unfortunately the fds are still shared across threads.

from threading import Barrier, Condition, Thread


def thread_fds():
    from pathlib import Path
    return {int(p.stem) for p in Path('/proc/thread-self/fd').iterdir()}


barrier = Barrier(3)
i, j = 0, 0
cv = Condition()


parent_fds = thread_fds()


def foo(tid):
    global i, j

    with cv:
        cv.wait_for(lambda: i == tid)  # print initial fds in order
        print(tid, sorted(thread_fds() - parent_fds))
        i += 1  # wake up next thread
        cv.notify_all()

    barrier.wait()  # open file after all threads print initial fds

    with open('/dev/null'):

        barrier.wait()  # print new fds after all threads open file

        with cv:
            cv.wait_for(lambda: j == tid)  # print new fds in order
            print(tid, sorted(thread_fds() - parent_fds))
            j += 1  # wake up next thread
            cv.notify_all()

        barrier.wait()  # close file after all threads print new fds


def main():
    for tid in range(3):
        Thread(target=foo, args=(tid,)).start()


main()

Output on my machine:

0 []
1 []
2 []
0 [4, 5, 6]
1 [4, 5, 6]
2 [4, 5, 6]

@YouJiacheng

TL;DR: use is_singleton=True.

The core problem is the granularity of the re-entrance control.
Currently, filelock has thread-object-level re-entrance, i.e. the lock is re-entrant only within the same thread and the same FileLock object.
Checking fds would give process-level re-entrance, i.e. the lock would be re-entrant anywhere within the same process.
Ideally we want thread-level re-entrance; this can be achieved by introducing a thread-local context and registering all FileLock objects in that context.

When I tried building a wheel myself, I suddenly found that is_singleton=True can "perfectly" solve this issue.

param is_singleton: If this is set to True then only one instance of this class will be created per lock file. This is useful if you want to use the lock object for reentrant locking without needing to pass the same object around.

Using is_singleton=True together with thread_local=True, we can achieve thread-level re-entrance (as long as there is only one interpreter): all FileLock objects with the same lock path in the same Python interpreter will be the same instance, but backed by a ThreadLocalFileContext (i.e. the data of that one instance is not shared across threads). Thus the lock path, as a lock, will be re-entrant in, and only in, the same thread.
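
For example (a sketch assuming a filelock release that supports the is_singleton parameter), the reproducer at the top of this issue no longer deadlocks:

from filelock import FileLock

lock_path = "high_ground.txt.lock"

lock1 = FileLock(lock_path, is_singleton=True)
lock2 = FileLock(lock_path, is_singleton=True)
assert lock1 is lock2  # one instance per lock file

with lock1:
    with lock2:  # re-enters the same lock instead of blocking on a second fd
        print("I have the high ground!")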

However there are some minor problems:

  1. We cannot create FileLock objects with the same lock path but different arguments if is_singleton=True. In particular, we cannot create a thread_local=True FileLock and a thread_local=False FileLock for the same path (see the sketch after this list). This undesirable property prevents us from making is_singleton=True the default behavior.
  2. As long as is_singleton=True is not the default behavior, a library author cannot assume that all other code uses is_singleton=True.
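
A sketch of the constraint in item 1; how the conflicting second construction is handled depends on the filelock version, so treat the comment as an assumption:

from filelock import FileLock

a = FileLock("high_ground.txt.lock", thread_local=True, is_singleton=True)
# Conflicting arguments for the same singleton path; recent filelock
# versions are expected to reject this (assumption):
b = FileLock("high_ground.txt.lock", thread_local=False, is_singleton=True)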

@youkaichao

It seems that all threads in one process share the same file descriptor table, and /proc/{pid}/task/{tid}/fd actually has the same content as /proc/{pid}/fd, according to https://unix.stackexchange.com/questions/202678/how-to-display-open-file-descriptors-with-thread-id-without-using-lsof-command .

So there is no simple way to tell whether the current thread has already opened a file.

That being said, I think we can at least emit a warning when we detect that the lock file is already open in the current process. That is a necessary condition for this kind of deadlock, although not a sufficient one. And if it does cause a deadlock, users will find the root cause quickly.
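
A sketch of what such a warning could look like, assuming filelock's UnixFileLock backend and its internal _acquire hook; the subclass and helper names are made up:

import os
import warnings

from filelock import UnixFileLock


def _already_open_in_process(path: str) -> bool:
    # Same /proc/self/fd scan as sketched earlier in this thread (Linux-only).
    target = os.path.realpath(path)
    return any(
        os.path.realpath(f"/proc/self/fd/{fd}") == target
        for fd in os.listdir("/proc/self/fd")
    )


class WarningFileLock(UnixFileLock):
    """Hypothetical subclass: warn before acquiring if the lock file is
    already open somewhere in this process (a necessary condition for the
    self-deadlock described above)."""

    def _acquire(self) -> None:
        if _already_open_in_process(self.lock_file):
            warnings.warn(
                f"{self.lock_file} is already open in this process; "
                "acquiring it again may deadlock"
            )
        super()._acquire()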


YouJiacheng commented Mar 24, 2024

I had read that stackexchange question (actually, I was inspired by Dave Reikher's comment there) before writing the code above to verify that file descriptors are shared across the threads of a process (as we learned in an OS course), but this behavior does not seem to be guaranteed by the POSIX spec.

Anyway, I agree that "there is not a simple method to tell if the current thread has already opened a file" and that we can do something when "the lock file has been opened in the current process".

Since checking all file descriptors might be time-consuming (if the process has opened many files), this behavior can't be enabled by default, in my opinion. Maybe there could be some DEBUG mode?

Or we could use something like the following (this is very similar to the singleton approach; maybe we could factor out timeout?):

from __future__ import annotations  # for the PEP 604 / PEP 585 annotations below

from dataclasses import dataclass
from threading import local

@dataclass
class FileLockContext:
    """A dataclass which holds the context for a ``BaseFileLock`` object."""

    # The context is held in a separate class to allow optional use of thread local storage via the
    # ThreadLocalFileContext class.

    #: The path to the lock file.
    lock_file: str

    #: The default timeout value.
    timeout: float

    #: The mode for the lock files
    mode: int

    #: The file descriptor for the *_lock_file* as it is returned by the os.open() function, not None when lock held
    lock_file_fd: int | None = None

    #: The lock counter is used for implementing the nested locking mechanism.
    lock_counter: int = 0  # When the lock is acquired is increased and the lock is only released, when this value is 0

class ThreadLocalFileContext(FileLockContext, local):
    """A thread local version of the ``FileLockContext`` class."""


class ContextStore:
    """Maps (lock_file, timeout, mode, thread_local) to a shared context."""

    def __init__(self) -> None:
        self.data: dict[tuple[str, float, int, bool], FileLockContext] = {}  # maybe use WeakValueDictionary


default_context_store = ContextStore()


class FileLock:
    def __init__(
        self,
        lock_file: str,
        timeout: float = -1,
        mode: int = 0o644,
        thread_local: bool = True,
        *,
        is_singleton: bool = False,
        context_store: ContextStore | None = None
    ) -> None:
        if context_store is None:
            context_store = default_context_store

        args = (lock_file, timeout, mode)

        # Locks constructed with identical arguments share one context, so the
        # lock counter (and hence re-entrance) is shared between them.
        self._context = context_store.data.setdefault(
            (*args, thread_local),
            (ThreadLocalFileContext if thread_local else FileLockContext)(*args),
        )
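
A quick check of the intended behavior of this sketch (assuming the definitions above):

a = FileLock("high_ground.txt.lock")
b = FileLock("high_ground.txt.lock")
assert a._context is b._context  # same args share one context, so re-entrance is shared

c = FileLock("high_ground.txt.lock", timeout=5)
assert a._context is not c._context  # a different timeout yields a separate context

The last assertion is exactly why factoring out timeout from the key might be desirable: two locks on the same file that differ only in timeout would not see each other's lock counter.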

@YouJiacheng

See #282 and #283. It seems that the singleton was proposed to solve this re-entrance problem, but it has some problems as well.
