Invalid context_id panic in dispatcher #304

Open
dragosc28 opened this issue Feb 21, 2025 · 1 comment

@dragosc28

I'm facing a weird panic in my app. It doesn't happen frequently, and I can't seem to reproduce it.

The panic is this one:

[2024-10-29 12:35:08.448][39][critical][wasm] [source/extensions/common/wasm/context.cc:1181] wasm log envoy.vm: panicked at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/proxy-wasm-0.2.2/src/dispatcher.rs:375:13:
invalid context_id
[2024-10-29 12:35:08.448][39][error][wasm] [source/extensions/common/wasm/wasm_vm.cc:38] Function: proxy_on_response_headers failed: Uncaught RuntimeError: unreachable
Proxy-Wasm plugin in-VM backtrace:
  0:  0x37cec - __rust_start_panic
  1:  0x37cba - rust_panic
  2:  0x37cac - _ZN3std9panicking20rust_panic_with_hook17h844c7fdc9e749b51E
  3:  0x336a5 - _ZN3std9panicking11begin_panic28_$u7b$$u7b$closure$u7d$$u7d$17h2d5a84dc577a6a1fE
  4:  0x33668 - _ZN3std10sys_common9backtrace26__rust_end_short_backtrace17h1cd545de85e32f17E
  5:  0x33a93 - _ZN3std9panicking11begin_panic17hef1cc66353531458E
  6:  0x29caf - _ZN10proxy_wasm10dispatcher10Dispatcher24on_http_response_headers17ha0ec21a408e1e8a6E
  7:  0x2f992 - proxy_on_response_headers
libc++abi: Pure virtual function called!
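For context on where this fires: the SDK's dispatcher keeps a map from `context_id` to the plugin's per-stream context, and panics when the host invokes a callback for an id that isn't in the map. A rough sketch of the logic around `dispatcher.rs:375` (paraphrased from proxy-wasm 0.2.2, not the verbatim source; the return value and effective-context bookkeeping are omitted):

```rust
// Paraphrased sketch of proxy-wasm-rust-sdk's dispatcher, not verbatim 0.2.2 source.
// `http_streams` maps context_id -> the plugin's HttpContext for that stream.
fn on_http_response_headers(&self, context_id: u32, num_headers: usize, end_of_stream: bool) {
    if let Some(http_stream) = self.http_streams.borrow_mut().get_mut(&context_id) {
        // Normal path: forward the host callback to the plugin's handler.
        http_stream.on_http_response_headers(num_headers, end_of_stream);
    } else {
        // The id was never created, or was already removed after on_done:
        // this is the "invalid context_id" panic in the backtrace above.
        panic!("invalid context_id")
    }
}
```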

A few details about this specific Envoy setup: I'm running an external_processor filter followed by a wasm filter, and Envoy's clusters and routes are configured dynamically from an xDS server.

One interesting detail is that all my replicas seem to crash at the same time. That pointed me towards the shared xDS service, but I can't rule out a lifecycle race condition somewhere that occurs after failures in my external_processor gRPC service.

I tried reproducing this with a similar setup (ext_proc + wasm), having the ext_proc time out for half of the requests while also sending CDS updates every 10ms, but it all works as expected.

Does the team here have any ideas about what could cause this, or how to investigate further? It looks like the response_headers handler is being called after the context has already been removed (on_done).
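In case it helps with investigating: a minimal lifecycle-logging plugin along these lines (a sketch against the 0.2.x API, not my actual code; it assumes the `log` crate as in the SDK examples) would show the create / request / response / on_done ordering for every stream:

```rust
use log::trace;
use proxy_wasm::traits::{Context, HttpContext, RootContext};
use proxy_wasm::types::{Action, ContextType, LogLevel};

proxy_wasm::main! {{
    proxy_wasm::set_log_level(LogLevel::Trace);
    proxy_wasm::set_root_context(|_| -> Box<dyn RootContext> { Box::new(Root) });
}}

struct Root;

impl Context for Root {}

impl RootContext for Root {
    fn get_type(&self) -> Option<ContextType> {
        Some(ContextType::HttpContext)
    }

    fn create_http_context(&self, context_id: u32) -> Option<Box<dyn HttpContext>> {
        // Logs every per-stream context the host asks us to create.
        trace!("create_http_context: {}", context_id);
        Some(Box::new(Stream { context_id }))
    }
}

struct Stream {
    context_id: u32,
}

impl Context for Stream {
    fn on_done(&mut self) -> bool {
        // Returning true lets the host delete this context; logging here shows
        // exactly when each stream's context goes away.
        trace!("on_done: {}", self.context_id);
        true
    }
}

impl HttpContext for Stream {
    fn on_http_request_headers(&mut self, _num_headers: usize, _end_of_stream: bool) -> Action {
        trace!("on_http_request_headers: {}", self.context_id);
        Action::Continue
    }

    fn on_http_response_headers(&mut self, _num_headers: usize, _end_of_stream: bool) -> Action {
        trace!("on_http_response_headers: {}", self.context_id);
        Action::Continue
    }
}
```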

@PiotrSikora
Member

It seems that your Proxy-Wasm plugin is receiving an HTTP response for an HTTP request that it never saw.

One scenario that comes to mind (though I haven't verified it) is that your ext_proc filter (or something else, really) returns a local error response, and Envoy sends it through the whole filter chain, which then triggers proxy_on_response_headers without a prior proxy_on_request_headers in your plugin.

That would explain this behavior, and there have been a number of issues in the past related to sending locally generated error responses through the whole filter chain.

If you could run Envoy with trace-level logging, the log would include all Proxy-Wasm calls, and we could see the whole flow for the panicking request to confirm whether that's the case.
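For reference (worth double-checking against your Envoy version): that can be enabled globally with `envoy -c envoy.yaml -l trace`, or more selectively with `--component-log-level wasm:trace`.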

Perhaps @mpwarres @martijneken @leonm1 have more insights.
