Pubsub not configured Error #46
I am having the same issue.

This may be related to phoenixframework/phoenix#3179

Chris' reply to this is helpful but not the full story, since Absinthe.Phoenix right now is written to expect you to do […]. In the meantime I highly suggest implementing "readiness" checks alongside your health checks, where you don't route traffic to your node until it's fully booted.
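In case it helps, a readiness check can be as small as a plug that answers 503 until the application marks itself booted. This is only a sketch, not anything from Absinthe or Phoenix: the `/ready` path, the `:my_app` app name, and the `:ready` flag are all hypothetical names.

```elixir
defmodule MyAppWeb.ReadinessPlug do
  @moduledoc """
  Hypothetical readiness endpoint: returns 503 until the application
  sets a `:ready` flag at the end of its boot sequence, so a load
  balancer or Kubernetes readiness probe holds traffic back until then.
  """
  import Plug.Conn

  def init(opts), do: opts

  def call(%Plug.Conn{request_path: "/ready"} = conn, _opts) do
    # The app would run `Application.put_env(:my_app, :ready, true)`
    # as the last step of its supervision tree startup.
    if Application.get_env(:my_app, :ready, false) do
      conn |> send_resp(200, "ok") |> halt()
    else
      conn |> send_resp(503, "booting") |> halt()
    end
  end

  def call(conn, _opts), do: conn
end
```

Placed early in the endpoint's plug pipeline, this lets the orchestrator distinguish "process is up" (liveness) from "safe to route traffic here" (readiness).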
@benwilson512 are there any recent recommendations on this? Since I upgraded to Phoenix 1.5, I see the many messages like this in my logs when my server shuts down:
The error seems to happen for my already-connected clients, not new clients connecting while the app is being shut down. The client is connected to the socket, and the socket itself is fine. The highlighted line seems to be where it happens. Only when Phoenix.Endpoint closes the WebSocket connection does the client try to reconnect to a new socket, and there they get to a healthy server. Any guidance would be much appreciated. Thanks for Absinthe! :)

@sebastianseilund What order do you have things in your supervision tree? Have you considered using something like https://github.com/derekkraan/ranch_connection_drainer ? Ideally you don't keep routing traffic to a node that is shutting down.
Order:

If I put […] I'm not routing any new traffic to my nodes once shutdown starts. As described above, the error happens in the sockets that are already connected (via WebSocket). This is the order of things:

I haven't looked at https://github.com/derekkraan/ranch_connection_drainer before. Would you put that after […]?
We are seeing the exact same error that @sebastianseilund posted above, for very similar reasons.

@benwilson512 I looked into https://github.com/derekkraan/ranch_connection_drainer. It waits for connections to end before letting the Endpoint shut itself down, so I don't think it'll fix the issue here, since the WebSocket connections stay open.
Fair. I think the best option here would be to figure out a way to have the supervision tree look like this:

Then the endpoint would shut down before the pubsub. Let me see what I can do.
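For reference, an OTP supervisor stops its children in the reverse of their start order, so a tree shaped like the following (module names are placeholders, not taken from this thread) would stop the endpoint first and the pubsub last:

```elixir
children = [
  # Started first, stopped last: pubsub outlives the endpoint on shutdown.
  {Phoenix.PubSub, name: MyApp.PubSub},
  # Subscription bookkeeping, still alive while the endpoint drains.
  {Absinthe.Subscription, MyAppWeb.Endpoint},
  # Started last, stopped first: stops accepting traffic while pubsub is up.
  MyAppWeb.Endpoint
]
```

With this ordering, in-flight subscription publishes during shutdown can still reach a live pubsub process.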
I think that would be amazing 🙏

@benwilson512 Is there any movement on this? I am seeing ~150 of these errors daily in production.
@lstrzebinczyk the best solution here is to use a load balancer that routes traffic away from your node on deploy. Even if I split out the pubsub from the endpoint, it will simply result in different issues for your clients. If you aren't routing traffic away from your node on deploy then when they are disconnected they may try to reconnect to the same node, which will now fail. I will try to find time to make the proposed improvements here this week, but it is merely going to help with error messages; it isn't going to solve the root issue. The root issue is that you are not disconnecting traffic properly from a node that is going down.
This doesn't happen. Kubernetes does route new traffic away from pods before shutting down the Elixir container. The existing network connections are not terminated by anything in Kubernetes though. When the client reconnects, they'll be sent to one of the new pods (or one of the old ones that haven't begun shutdown yet).
I don't think you'd want to terminate active connections outside of Elixir's control. If you have a regular slow HTTP request, you don't want it to be terminated half-way. That could leave stuff in a bad state, or at least end up as an unnecessary 5xx error to the client. Just let it drain. When […]

That's fair, there are definitely shutdown patterns that make sense for Elixir to lead. OK, I'll prioritize working on that this week.
Any update on this?

Any updates?

I'm encountering this as well. I'm curious if there is an update on the horizon, similar to what @benwilson512 seemed to have in mind in #46 (comment).
Experienced this today, and fixed it by adding a supervisor to my application.ex (might have missed it when setting up). Hope this helps.

```elixir
def start(_type, _args) do
  import Supervisor.Spec

  children = [
    AppWeb.Telemetry,
    {Phoenix.PubSub, name: App.PubSub},
    AppWeb.Endpoint,
    supervisor(Absinthe.Subscription, [AppWeb.Endpoint]) # <- Added this
  ]

  opts = [strategy: :one_for_one, name: App.Supervisor]
  Supervisor.start_link(children, opts)
end
```
- Purpose is an experiment to decouple from Phoenix.Endpoint as described in absinthe-graphql#46
@alvnrapada's fix didn't seem to work on my application when I was testing by doing a rolling restart of a deployment in k8s, so I made this experimental "hack PR" #89, as outlined by #46 (comment). Prior to making the change indicated by this PR to my Kubernetes-deployed app, I too was receiving the "pubsub not configured" error whenever I triggered a rolling restart in Kubernetes as a test. I tried adding tests to the PR, but the Phoenix.ChannelTest helper pathways seemed to expect an ETS-backed endpoint configuration, and I wasn't quite sure of the best way to tackle this at the moment (open to suggestions). Testing notes:
I will try to put together a demo application, especially if people think it might help in any way with testing (I'd also be curious about the results others might get), but here is the outline of what I did:

- application.ex
- user_socket.ex
- endpoint.ex
- new decoupled pubsub implementation file
- whatever_file_where_a_subscription_is_published.ex
- config.exs
In terms of my k8s deployment.yaml config, it might also be worth noting I had a readiness probe and a terminationGracePeriodSeconds of 65, like this (abbreviated yaml):
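A sketch of what that abbreviated yaml might look like; only the readiness probe and `terminationGracePeriodSeconds: 65` come from the comment above, while the path, port, and timings are illustrative:

```yaml
spec:
  template:
    spec:
      # Give in-flight connections time to drain before SIGKILL.
      terminationGracePeriodSeconds: 65
      containers:
        - name: app
          # Hold traffic back until the node is fully booted.
          readinessProbe:
            httpGet:
              path: /ready   # hypothetical readiness endpoint
              port: 4000
            initialDelaySeconds: 5
            periodSeconds: 10
```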
We've just started seeing this issue in the last week or so. Wondering if we can help at all?

Has anyone been able to reproduce this issue in a locally running application? That might simplify testing any proposed fixes here.

It happens in our dev environment every time it restarts while running under […]
So here's what I'm trying to do to get the supervision tree to end with the endpoint.

The supervision tree setup:

```elixir
children = [
  {Phoenix.PubSub, [name: MyappWeb.PubSub, adapter: Phoenix.PubSub.PG2]},
  {Absinthe.Subscription, MyappWeb.Endpoint},
  MyappWeb.Endpoint,
]
```

The Endpoint for my app:

```elixir
defmodule MyappWeb.Endpoint do
  # Important that this goes before `use Phoenix.Endpoint`
  use MyappWeb.OverridePhoenixEndpoint, otp_app: :myapp
  use Phoenix.Endpoint, otp_app: :myapp
  use Absinthe.Phoenix.Endpoint

  # ...
end
```

The override for `pubsub_server!`:

```elixir
defmodule MyappWeb.OverridePhoenixEndpoint do
  defmacro __using__(otp_app: app_name) do
    quote do
      defp pubsub_server! do
        # Read from the endpoint's ETS config table if it exists;
        # otherwise fall back to the application environment.
        server =
          if :ets.whereis(__MODULE__) != :undefined do
            config(:pubsub_server)
          else
            Application.get_env(unquote(app_name), __MODULE__)
            |> Keyword.get(:pubsub_server)
          end

        server ||
          raise ArgumentError, "no :pubsub_server configured for #{inspect(__MODULE__)}"
      end
    end
  end
end
```

This seems to me to be working, but it also seems like it's not the greatest idea in the world. Does anyone have better suggestions, or reasons why this is dangerous?
I'm seeing this error as well in my local environment. I'm attempting to capture process exits to gracefully shut down. When I issue a […]

I was also able to reproduce it by sending a SIGTERM signal to the process running […]

For reference, SIGTERM is the signal sent by the AWS ECS system to shut down task containers. "Today, ECS always sends a SIGTERM" (https://aws.amazon.com/blogs/containers/graceful-shutdowns-with-ecs/)

I was also able to repro in our codebase by sending SIGTERM to the process while there's an active subscription.
Any news here? It seems that having Subscription and Endpoint coupled like this causes race conditions, and your supervision tree ends up being unpredictable.

This issue still occurs, though I only get one error every few days running in prod. Anyone found a "neat" fix?

There is a (bit dirty) fix that we use now in production: prosapient/absinthe@39e5a92. Note: it handles SIGTERM (perfect for k8s), just to understand when pubsub is not configured and when there is progress with app shutdown.
@benwilson512 Any movement on this issue? We have been implementing more subscription-based functionality in our app, and as a consequence we're seeing this PubSub failure a lot more now when users are connected to the socket during a restart. @bfad's solution technically does work, but we get a lot of warning logs like […]. With that solution, killing the app (with […]) logs these warnings before closing gracefully.
Sentry catches a lot of `RuntimeError`s saying `Pubsub not configured! Subscriptions require a configured pubsub module.` I could not find out in which cases it is failing; I just see around 600 errors over a timespan of 30 days. I've never experienced the error myself, and it only occurs in a low percentage of all the subscriptions.

Full Exception

Versions

Setup

- Absinthe.Subscription added to Application
- Absinthe.Phoenix.Socket in Socket implementation
- Absinthe.Plug (via ApolloTracing.Pipeline)