Merge pull request #74 from diogob/listener-supervisor
Create a listener connection supervisor so we can recover in case of database connection failure
diogob authored Jun 17, 2021
2 parents e26a0ca + 427a32f commit 040cdde
Showing 10 changed files with 414 additions and 319 deletions.
11 changes: 9 additions & 2 deletions README.md
@@ -1,4 +1,4 @@
# postgres-websockets

![CI](https://github.com/diogob/postgres-websockets/actions/workflows/ci.yml/badge.svg)
[![Hackage Matrix CI](https://matrix.hackage.haskell.org/api/v2/packages/postgres-websockets/badge)](https://matrix.hackage.haskell.org/package/postgres-websockets)
@@ -128,4 +128,11 @@ For instance, if we use the configuration in the [sample-env](./sample-env) we w
{"event":"ConnectionOpen","channel":"server-info","payload":"server-info","claims":{"mode":"rw","message_delivered_at":1.602719440727465893e9}}
```

You can monitor these messages on another websocket connection with a proper read token for the channel `server-info` or also having an additional database listener on the `PGWS_LISTEN_CHANNEL`.

## Recovering from listener database connection failures

The database connection used to wait for notifications (where the `LISTEN` command is issued) can cause problems when it fails. To prevent these problems from completely disrupting our websockets server, there are two ways to configure postgres-websockets:

* Self-healing connection - postgres-websockets comes with a connection supervisor baked in. You just need to set `PGWS_CHECK_LISTENER_INTERVAL` to a number of milliseconds; this interval is the maximum amount of time during which messages can be lost. There is a cost, since at each interval an additional SELECT query is issued to ensure the listener connection is still active. If the connection is gone, the connection thread is killed and respawned (a simplified sketch of this loop follows the list). This method has the advantage of keeping all channels and websocket connections alive while the database connection is severed (although messages sent in the meantime will be lost).
* Using external supervision - you can also unset `PGWS_CHECK_LISTENER_INTERVAL`, in which case postgres-websockets will try to shut down the server when the database connection is lost. This does not work in 100% of cases, since in theory it is possible for the database connection to be closed while the producer thread lingers. But in most cases it should work, and some external process can then restart the server. The downside is that all websocket connections will be lost.
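
A simplified, self-contained sketch of the self-healing loop described above (the real implementation is `superviseMultiplexer` in `src/PostgresWebsockets/Broadcast.hs`, added by this pull request); `healLoop`, `checkListener` and `restartProducer` are placeholder names, not names from the project:

```haskell
-- Simplified sketch of the self-healing strategy: at every interval run a
-- cheap liveness check against the listener connection and, when it fails,
-- kill and refork the producer thread. Placeholder names, base-only imports.
module HealSketch where

import Control.Concurrent (threadDelay)
import Control.Monad (forever, unless)

healLoop :: Int -> IO Bool -> IO () -> IO ()
healLoop intervalMs checkListener restartProducer =
  forever $ do
    threadDelay (intervalMs * 1000) -- PGWS_CHECK_LISTENER_INTERVAL is in milliseconds
    alive <- checkListener          -- e.g. an extra SELECT on the listener connection
    unless alive restartProducer    -- lose the thread, keep channels and websocket clients
```
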
4 changes: 4 additions & 0 deletions sample-env
@@ -21,3 +21,7 @@ export PGWS_PORT=3000
## (use "@filename" to load from separate file)
export PGWS_JWT_SECRET="auwhfdnskjhewfi34uwehdlaehsfkuaeiskjnfduierhfsiweskjcnzeiluwhskdewishdnpwe"
export PGWS_JWT_SECRET_BASE64=False

## Check the database listener every 10 seconds
## Comment it out to disable the check and shut down the server on listener errors (can be useful when using external process supervisors)
export PGWS_CHECK_LISTENER_INTERVAL=10000
173 changes: 113 additions & 60 deletions src/PostgresWebsockets/Broadcast.hs
@@ -1,53 +1,77 @@
{-|
Module : PostgresWebsockets.Broadcast
Description : Distribute messages from one producer to several consumers.
{-# LANGUAGE DeriveGeneric #-}

PostgresWebsockets functions to broadcast messages to several listening clients
This module provides a type called Multiplexer.
The multiplexer contains a map of channels and a producer thread.
-- |
-- Module : PostgresWebsockets.Broadcast
-- Description : Distribute messages from one producer to several consumers.
--
-- PostgresWebsockets functions to broadcast messages to several listening clients
-- This module provides a type called Multiplexer.
-- The multiplexer contains a map of channels and a producer thread.
--
-- This module avoids any database implementation details, it is used by HasqlBroadcast where
-- the database logic is combined.
module PostgresWebsockets.Broadcast
( Multiplexer,
Message (..),
newMultiplexer,
onMessage,
relayMessages,
relayMessagesForever,
superviseMultiplexer,

This module avoids any database implementation details, it is used by HasqlBroadcast where
the database logic is combined.
-}
module PostgresWebsockets.Broadcast ( Multiplexer (src)
, Message (..)
, newMultiplexer
, onMessage
, relayMessages
, relayMessagesForever
-- * Re-exports
, readTQueue
, writeTQueue
, readTChan
) where
-- * Re-exports
readTQueue,
writeTQueue,
readTChan,
)
where

import Protolude hiding (toS)
import Protolude.Conv
import qualified StmContainers.Map as M
import Control.Concurrent.STM.TChan
import Control.Concurrent.STM.TQueue
import qualified Data.Aeson as A
import Protolude hiding (toS)
import Protolude.Conv (toS)
import qualified StmContainers.Map as M

import GHC.Show
data Message = Message
{ channel :: Text,
payload :: Text
}
deriving (Eq, Show)

data Message = Message { channel :: ByteString
, payload :: ByteString
} deriving (Eq, Show)
data Multiplexer = Multiplexer
{ channels :: M.Map Text Channel,
messages :: TQueue Message,
producerThreadId :: MVar ThreadId,
reopenProducer :: IO ThreadId
}

data Multiplexer = Multiplexer { channels :: M.Map ByteString Channel
, src :: ThreadId
, messages :: TQueue Message
}
data MultiplexerSnapshot = MultiplexerSnapshot
{ channelsSize :: Int,
messageQueueEmpty :: Bool,
producerId :: Text
}
deriving (Generic)

instance Show Multiplexer where
show Multiplexer{} = "Multiplexer"
data Channel = Channel
{ broadcast :: TChan Message,
listeners :: Integer
}

data Channel = Channel { broadcast :: TChan Message
, listeners :: Integer
}
instance A.ToJSON MultiplexerSnapshot

-- | Given a multiplexer, derive a snapshot value that can be printed for debugging or logging purposes
takeSnapshot :: Multiplexer -> IO MultiplexerSnapshot
takeSnapshot multi =
MultiplexerSnapshot <$> size <*> e <*> thread
where
size = atomically $ M.size $ channels multi
thread = show <$> readMVar (producerThreadId multi)
e = atomically $ isEmptyTQueue $ messages multi
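
A sketch of how this snapshot could be used for ad-hoc debugging. Note that `takeSnapshot` is not in the export list, so the snippet assumes it is added inside `PostgresWebsockets.Broadcast` itself (and relies on the same OverloadedStrings setup that `superviseMultiplexer` below uses for its log line); the function name and sample values are made up:

```haskell
-- A sketch, assuming it lives inside PostgresWebsockets.Broadcast where
-- takeSnapshot is in scope (it is not exported): log the current state of a
-- multiplexer. With the Generic-derived ToJSON instance the output looks
-- roughly like {"channelsSize":1,"messageQueueEmpty":true,"producerId":"ThreadId 57"}
-- (values here are illustrative).
logMultiplexerState :: Multiplexer -> IO ()
logMultiplexerState multi = do
  snap <- takeSnapshot multi
  putStrLn $ "Multiplexer state: " <> A.encode snap
```
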

-- | Opens a thread that relays messages from the producer thread to the channels forever
relayMessagesForever :: Multiplexer -> IO ThreadId
relayMessagesForever = forkIO . forever . relayMessages

-- | Reads the messages from the producer and relays them to the active listeners in their respective channels.
relayMessages :: Multiplexer -> IO ()
@@ -59,30 +83,59 @@ relayMessages multi =
Nothing -> return ()
Just c -> writeTChan (broadcast c) m

newMultiplexer :: (TQueue Message -> IO a)
-> (Either SomeException a -> IO ())
-> IO Multiplexer
newMultiplexer ::
(TQueue Message -> IO a) ->
(Either SomeException a -> IO ()) ->
IO Multiplexer
newMultiplexer openProducer closeProducer = do
msgs <- newTQueueIO
m <- liftA2 Multiplexer M.newIO (forkFinally (openProducer msgs) closeProducer)
return $ m msgs
let forkNewProducer = forkFinally (openProducer msgs) closeProducer
tid <- forkNewProducer
multiplexerMap <- M.newIO
producerThreadId <- newMVar tid
pure $ Multiplexer multiplexerMap msgs producerThreadId forkNewProducer
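
For orientation, a minimal usage sketch of the new constructor (not part of this commit); the demo producer, channel name and payload are made up, and OverloadedStrings is assumed for the `Text` literals:

```haskell
{-# LANGUAGE OverloadedStrings #-}

-- A minimal sketch (not part of this commit): build a Multiplexer whose
-- producer pushes one demo Message per second into the shared queue, and
-- whose close handler just logs why the producer thread ended.
module DemoBroadcast where

import Control.Concurrent.STM.TQueue (TQueue)
import PostgresWebsockets.Broadcast
import Protolude

demoMultiplexer :: IO Multiplexer
demoMultiplexer = newMultiplexer producer closer
  where
    producer :: TQueue Message -> IO ()
    producer msgs = forever $ do
      atomically $ writeTQueue msgs (Message "demo" "tick")
      threadDelay 1000000 -- one second between demo messages
    closer :: Either SomeException () -> IO ()
    closer reason = putStrLn ("demo producer exited: " <> show reason :: Text)
```

The returned `Multiplexer` would then typically be handed to `relayMessagesForever` and, with this commit, to `superviseMultiplexer`.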

-- | Given a multiplexer, an interval in milliseconds and an IO computation that returns a boolean,
-- runs the IO computation at every interval and reopens the multiplexer's producer
-- when the resulting boolean is true.
-- When the interval is 0 this is a NOOP, so the minimum interval is 1ms.
-- Call this when you want to ensure the producer thread is killed and restarted under a certain condition.
superviseMultiplexer :: Multiplexer -> Int -> IO Bool -> IO ()
superviseMultiplexer multi msInterval shouldRestart = do
void $
forkIO $
forever $ do
threadDelay $ msInterval * 1000
sr <- shouldRestart
when sr $ do
snapBefore <- takeSnapshot multi
void $ killThread <$> readMVar (producerThreadId multi)
new <- reopenProducer multi
void $ swapMVar (producerThreadId multi) new
snapAfter <- takeSnapshot multi
putStrLn $
"Restarting producer. Multiplexer updated: "
<> A.encode snapBefore
<> " -> "
<> A.encode snapAfter
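
A wiring sketch for the new supervisor (not part of this commit): in postgres-websockets the `shouldRestart` action checks the listener database connection; here an `IORef` stands in so the snippet stays self-contained, and `superviseDemo`/`listenerDead` are placeholder names:

```haskell
-- A sketch (not part of this commit): supervise a multiplexer with a
-- stand-in health check. In postgres-websockets the check queries the
-- listener database connection; here an IORef plays that role.
module DemoSupervise where

import PostgresWebsockets.Broadcast
import Protolude

superviseDemo :: Multiplexer -> IORef Bool -> IO ()
superviseDemo multi deadFlag =
  superviseMultiplexer multi 10000 listenerDead -- check every 10 seconds
  where
    listenerDead :: IO Bool
    listenerDead = readIORef deadFlag -- True => kill and refork the producer
```

The hard-coded 10000 here corresponds to `PGWS_CHECK_LISTENER_INTERVAL=10000` in the sample-env above.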

openChannel :: Multiplexer -> ByteString -> STM Channel
openChannel :: Multiplexer -> Text -> STM Channel
openChannel multi chan = do
c <- newBroadcastTChan
let newChannel = Channel{ broadcast = c
, listeners = 0
}
M.insert newChannel chan (channels multi)
return newChannel
c <- newBroadcastTChan
let newChannel =
Channel
{ broadcast = c,
listeners = 0
}
M.insert newChannel chan (channels multi)
return newChannel

{- | Adds a listener to a certain multiplexer's channel.
The listener must be a function that takes a 'TChan Message' and perform any IO action.
All listeners run in their own thread.
The first listener will open the channel; when a listener dies it will check whether there are
any others and close the channel when there are none.
-}
onMessage :: Multiplexer -> ByteString -> (Message -> IO()) -> IO ()
-- | Adds a listener to a certain multiplexer's channel.
-- The listener must be a function that takes a 'TChan Message' and performs any IO action.
-- All listeners run in their own threads.
-- The first listener will open the channel; when a listener dies it will check whether there are
-- any others and close the channel when there are none.
onMessage :: Multiplexer -> Text -> (Message -> IO ()) -> IO ()
onMessage multi chan action = do
listener <- atomically $ openChannelWhenNotFound >>= addListener
void $ forkFinally (forever (atomically (readTChan listener) >>= action)) disposeListener
@@ -92,13 +145,13 @@ onMessage multi chan action = do
let c = fromMaybe (panic $ "trying to remove listener from non existing channel: " <> toS chan) mC
M.delete chan (channels multi)
when (listeners c - 1 > 0) $
M.insert Channel{ broadcast = broadcast c, listeners = listeners c - 1 } chan (channels multi)
M.insert Channel {broadcast = broadcast c, listeners = listeners c - 1} chan (channels multi)
openChannelWhenNotFound =
M.lookup chan (channels multi) >>= \case
Nothing -> openChannel multi chan
Just ch -> return ch
Nothing -> openChannel multi chan
Just ch -> return ch
addListener ch = do
M.delete chan (channels multi)
let newChannel = Channel{ broadcast = broadcast ch, listeners = listeners ch + 1}
let newChannel = Channel {broadcast = broadcast ch, listeners = listeners ch + 1}
M.insert newChannel chan (channels multi)
dupTChan $ broadcast newChannel
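
A small consumer sketch against the now `Text`-based `onMessage` API (not part of this commit); the `server-info` channel name comes from the README section above, OverloadedStrings is assumed, and `printServerInfo` is a placeholder name:

```haskell
{-# LANGUAGE OverloadedStrings #-}

-- A sketch (not part of this commit): subscribe a listener that prints the
-- payload of every Message broadcast on the "server-info" channel. Each
-- listener runs in its own thread; the first one opens the channel.
module DemoListen where

import PostgresWebsockets.Broadcast
import Protolude

printServerInfo :: Multiplexer -> IO ()
printServerInfo multi =
  onMessage multi "server-info" (putStrLn . payload)
```
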
12 changes: 6 additions & 6 deletions src/PostgresWebsockets/Claims.hs
@@ -23,13 +23,13 @@ import qualified Data.Aeson as JSON


type Claims = M.HashMap Text JSON.Value
type ConnectionInfo = ([ByteString], ByteString, Claims)
type ConnectionInfo = ([Text], Text, Claims)

{-| Given a secret, a token and a timestamp it validates the claims and returns
either an error message or a triple containing channel, mode and claims hashmap.
-}
validateClaims
:: Maybe ByteString
:: Maybe Text
-> ByteString
-> LByteString
-> UTCTime
@@ -58,16 +58,16 @@ validateClaims requestChannel secret jwtToken time = runExceptT $ do
pure (validChannels, mode, cl')

where
claimAsJSON :: Text -> Claims -> Maybe ByteString
claimAsJSON :: Text -> Claims -> Maybe Text
claimAsJSON name cl = case M.lookup name cl of
Just (JSON.String s) -> Just $ encodeUtf8 s
Just (JSON.String s) -> Just s
_ -> Nothing

claimAsJSONList :: Text -> Claims -> Maybe [ByteString]
claimAsJSONList :: Text -> Claims -> Maybe [Text]
claimAsJSONList name cl = case M.lookup name cl of
Just channelsJson ->
case JSON.fromJSON channelsJson :: JSON.Result [Text] of
JSON.Success channelsList -> Just $ encodeUtf8 <$> channelsList
JSON.Success channelsList -> Just channelsList
_ -> Nothing
Nothing -> Nothing

