Message for pool saturated, shutdown pool instead of disconnect, logging around failover, etc #338
In most scenarios, `node.disconnect` doesn't make much sense (e.g. when the connection pool is saturated, it's not going to be very helpful to try to get a connection again just to disconnect it). This is especially true given the way the new underlying connection pool works: if you are operating in separate `pool.with` blocks (not nested), the pool may have already returned the connection you were working with to the general pool (if it's the last on the stack of connections, it goes back into the available pool, and since `pool.with` always ensures that a connection is checked in, we would have to do a checkout and ensure a checkin on our own if we wanted to work on the same connection or have the disconnect happen within the existing block).

It seems much cleaner to just shut down the pool (as that is the only API the connection pool really gives us to disconnect all connections) and create a new one, so we can ensure we have a fresh pool and fresh connections; see the sketch below. Otherwise we would need to change the underlying way a node deals with connections and the pool.

In general, it's a bit frustrating that sessions/clusters/nodes are not shared at all across threads, as it prevents them from communicating events/failovers/state without each thread discovering it separately. Correct me if I'm wrong, but this seems at least true with Mongoid's implementation via `Session::SessionFactory` when you call `Mongoid.default_session`. I was working on another PR to move the pools onto the node objects to simplify that logic when I realized every thread had a different copy of the node, which is clearly why `ConnectionManager` is there in the first place. If the current architecture remains, does it make sense to have a `NodeManager` that can keep state across threads to record and mark nodes down, etc.? IMHO, there are three or four layers of thread protection now that seem redundant, but I'm not familiar enough with all of the pipelining that is happening, even though that already seems to be built into the `Executable` module.
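To make the idea concrete, here is a minimal sketch of the shutdown-and-replace approach, assuming the node keeps its pool in `@pool` and can build connections with a `connect` method; the method name `refresh_pool!` and the `pool_size`/`pool_timeout` readers are illustrative, not Moped's actual API:

```ruby
require "connection_pool"

# Hypothetical sketch of shutdown-and-replace. Assume @pool holds the node's
# ConnectionPool and connect builds a fresh connection to the host.
def refresh_pool!
  old_pool = @pool
  # Swap in a brand-new pool first, so callers never check out a stale
  # connection while the old pool is being torn down.
  @pool = ConnectionPool.new(size: pool_size, timeout: pool_timeout) do
    connect
  end
  # ConnectionPool#shutdown yields each connection as it is reaped -- the
  # only hook the pool exposes for disconnecting all of its connections.
  old_pool.shutdown { |conn| conn.disconnect }
end
```

The ordering matters: because the new pool is installed before the old one is shut down, threads in other `pool.with` blocks simply finish on the old pool's connections (which get disconnected on checkin) while new work checks out fresh connections.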
I've made some more comments in the code below. But in general, this PR rescues the `Timeout::Error` that `ConnectionPool` may throw when it either cannot get a connection within `pool_timeout` because they are all in use, or cannot create any new connection instances because `pool_size` connections already exist (this `Timeout::Error` has nothing to do with whether a connection can be established to a host). It then raises a new exception, `Errors::PoolSaturated` (both `Errors::PoolSaturated` and `Errors::PoolTimeout` already existed in the Moped codebase; I chose the former), with a message explaining that a connection could not be obtained within the allotted time and suggesting an increase to `pool_size` (it might be nice in the future to auto-grow the pool size?). In combination with this PR for Mongoid, https://github.com/mongoid/mongoid/pull/3884, exceptions that happen DURING a command/query should be logged alongside them so they don't look like they are succeeding.
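For clarity, a minimal sketch of the rescue-and-reraise behavior described above, assuming the pool is the `connection_pool` gem's `ConnectionPool` held in `@pool`; `with_connection`, `address`, and the message wording are illustrative, not the exact code in this PR:

```ruby
# Checkout failures from the pool are translated into a domain error that
# tells the user what actually went wrong (pool exhaustion, not the host).
def with_connection
  @pool.with do |conn|
    yield conn
  end
rescue Timeout::Error
  # All pool_size connections are checked out and none were freed within
  # pool_timeout -- the host itself may still be perfectly reachable.
  raise Errors::PoolSaturated,
    "Could not check out a connection from the pool for #{address} " \
    "within #{pool_timeout} seconds. Consider increasing the pool_size option."
end
```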
I'm also adding logging around events that seem crucial, especially failover events that might need to be recorded. I'm going to continue expanding logging/debugging in future PRs, as Mongoid/Moped are rather devoid of it; while 99% of the time we have zero issues with either, we've recently run into some situations where it would have been extremely helpful.
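As an illustration of the kind of log line this adds (the exact call sites and wording differ in the actual diff), assuming `Moped.logger` as the logging hook:

```ruby
# Illustrative only: record a failover event when a node is marked down,
# so it no longer happens silently.
Moped.logger.warn(
  "MOPED: Marking node #{node.address} down after #{e.class}: #{e.message}; " \
  "shutting down its connection pool and creating a fresh one."
)
```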