
fork support #50

Open
renkun-ken opened this issue Oct 14, 2021 · 4 comments


@renkun-ken
Contributor

I tested the client's behaviour in forked processes under Linux.

library(parallel)

# connect once in the parent process; `password` is defined elsewhere
redis <- redux::hiredis(host = "my-redis-host")
redis$AUTH(password)

# issue commands through the shared client from forked child processes
tasks <- list(
  task1 = mcparallel({
    redis$HKEYS("test1")
  }),
  task2 = mcparallel({
    redis$HSET("test1", "key1", format(rnorm(1)))
  }),
  task3 = mcparallel({
    redis$HSET("test1", "key2", format(rnorm(1)))
  })
)
result <- mccollect(tasks)
result

Unlike #19, in this case the redis client still works properly. Is the client guaranteed to work here, perhaps because it is stateless in some way?

@renkun-ken
Contributor Author

In a loop, it seems that if two forked processes happen to execute a redis command via the same client at the same time, the program gets stuck forever.

# write_log() is a user-defined logging helper
while (TRUE) {
  write_log("Update")
  tasks <- list(
    task1 = mcparallel({
      redis$HKEYS("test1")
    }),
    task2 = mcparallel({
      redis$HSET("test1", "key1", format(rnorm(1)))
    }),
    task3 = mcparallel({
      redis$HSET("test1", "key2", format(rnorm(1)))
    }),
    task4 = mcparallel({
      redis$HSET("test1", "key3", format(rnorm(1)))
    })
  )
  result <- mccollect(tasks)
  Sys.sleep(0.1)
}

@renkun-ken
Contributor Author

renkun-ken commented Oct 14, 2021

Using filelock to serialise access to the client seems to solve the problem:

requireNamespace("filelock")

lock_file <- tempfile()
with_lock <- function(expr) {
  lock <- filelock::lock(lock_file)
  on.exit(filelock::unlock(lock))
  expr
}

while (TRUE) {
  write_log("Update")
  tasks <- list(
    task1 = mcparallel({
      with_lock(redis$HKEYS("test1"))
    }),
    task2 = mcparallel({
      with_lock(redis$HSET("test1", "key1", format(rnorm(1))))
    }),
    task3 = mcparallel({
      with_lock(redis$HSET("test1", "key2", format(rnorm(1))))
    }),
    task4 = mcparallel({
      with_lock(redis$HSET("test1", "key3", format(rnorm(1))))
    })
  )
  result <- mccollect(tasks)
  Sys.sleep(0.1)
}
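(Note that the lock serialises every Redis call across the forks, so the commands no longer run concurrently.)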

@richfitz
Owner

Hi - this is also not supported: as you can see, the underlying C client does not cope with its internal data being shared across forks. My guess is that, from the server's point of view, you still have just one client, but because the forks are now free to send queries simultaneously, the protocol gets confused.

A far better approach would be to create your connection in each forked process; the overhead of creating a client connection is small compared with the other overheads of mclapply and friends (e.g., serialising the results) or the network round trip to Redis.
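A minimal sketch of that approach, reusing the placeholder host from the example above:

library(parallel)

tasks <- list(
  task1 = mcparallel({
    # each child opens its own connection, so no client state is shared
    r <- redux::hiredis(host = "my-redis-host")
    r$HKEYS("test1")
  }),
  task2 = mcparallel({
    r <- redux::hiredis(host = "my-redis-host")
    r$HSET("test1", "key1", format(rnorm(1)))
  })
)
result <- mccollect(tasks)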

@richfitz
Owner

It is also worth noting that Redis itself queues commands and processes them serially, so controlling that order is generally to your advantage. If you're trying to speed something up (I can't tell what your underlying problem is, or whether this is just curiosity), look at pipelining your commands, as this typically improves performance immensely.
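For example, a sketch using redux's pipeline method together with the redux::redis command builder (same placeholder host as above; see the redux documentation for the exact interface):

r <- redux::hiredis(host = "my-redis-host")
# queue several commands and send them in a single round trip;
# redux::redis builds the commands without executing them
res <- r$pipeline(
  keys = redux::redis$HKEYS("test1"),
  set1 = redux::redis$HSET("test1", "key1", format(rnorm(1))),
  set2 = redux::redis$HSET("test1", "key2", format(rnorm(1)))
)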
