404 Session Not Found error When accessing gradio via a proxy #6920

Closed
1 task done
lykeven opened this issue Jan 2, 2024 · 69 comments · Fixed by #7935
Assignees: abidlabs
Labels: bug (Something isn't working), cloud (Issues that only happen when deploying Gradio on cloud services), Priority (High priority issues)

Comments

@lykeven

lykeven commented Jan 2, 2024

Describe the bug

I am running a Gradio application locally. The click handler of a button sends a request (using the requests library) to a remote server and returns the result to a component.
Everything works fine when I access the app directly. If I turn on a proxy (Shadowsocks) to access the Gradio application, requests with short response times still return normally, while requests that take longer raise exceptions.

Have you searched existing issues? 🔎

  • I have searched and found no existing issues

Reproduction

#!/usr/bin/env python

import gradio as gr
import os
import json
import requests
import base64

URL = os.environ.get("URL")

def image_to_base64(image_path):
    with open(image_path, "rb") as image_file:
        encoded_string = base64.b64encode(image_file.read())
        return encoded_string.decode('utf-8')

def post(
        input_text,
        image_prompt,
        ):
    headers = {
        "Content-Type": "application/json; charset=UTF-8",
        "User-Agent": "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.87 Safari/537.36",
    }
    if image_prompt:
        encoded_img = image_to_base64(image_prompt)
    else:
        return "", []

    data = json.dumps({
        'text': input_text,
        'history': [],
        'image': encoded_img
    })
    try:
        response = requests.request("POST", URL, headers=headers, data=data, timeout=(60, 100)).json()
    except Exception as e:
        return "", []
    answer = str(response['result'])
    return "", [[input_text, answer]]


def main():
    gr.close_all()
    with gr.Blocks() as demo:
        with gr.Row():
            with gr.Column(scale=4.5):
                with gr.Group():
                    input_text = gr.Textbox(label='Input Text', placeholder='Please enter text prompt below and press ENTER.')
                    with gr.Row():
                        run_button = gr.Button('Generate')
                    image_prompt = gr.Image(type="filepath", label="Image Prompt", value=None)

            with gr.Column(scale=5.5):
                result_text = gr.components.Chatbot(label='Multi-round conversation History', value=[("", "Hi, What do you want to know about this image?")], height=550)
                
        run_button.click(fn=post,inputs=[input_text, image_prompt],
                         outputs=[input_text, result_text])
    demo.launch(server_port=7862)

if __name__ == '__main__':
    main()

Screenshot

image

Logs

ERROR:    Exception in ASGI application
Traceback (most recent call last):
  File "/home/user/anaconda3/envs/py3.8/lib/python3.8/site-packages/starlette/_exception_handler.py", line 44, in wrapped_app
    await app(scope, receive, sender)
  File "/home/user/anaconda3/envs/py3.8/lib/python3.8/site-packages/starlette/routing.py", line 73, in app
    await response(scope, receive, send)
  File "/home/user/anaconda3/envs/py3.8/lib/python3.8/site-packages/starlette/responses.py", line 259, in __call__
    await wrap(partial(self.listen_for_disconnect, receive))
  File "/home/user/anaconda3/envs/py3.8/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 597, in __aexit__
    raise exceptions[0]
  File "/home/user/anaconda3/envs/py3.8/lib/python3.8/site-packages/starlette/responses.py", line 255, in wrap
    await func()
  File "/home/user/anaconda3/envs/py3.8/lib/python3.8/site-packages/starlette/responses.py", line 244, in stream_response
    async for chunk in self.body_iterator:
  File "/home/user/anaconda3/envs/py3.8/lib/python3.8/site-packages/gradio/routes.py", line 660, in sse_stream
    raise e
  File "/home/user/anaconda3/envs/py3.8/lib/python3.8/site-packages/gradio/routes.py", line 601, in sse_stream
    raise HTTPException(
fastapi.exceptions.HTTPException: 404: Session not found.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/user/anaconda3/envs/py3.8/lib/python3.8/site-packages/uvicorn/protocols/http/h11_impl.py", line 408, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
  File "/home/user/anaconda3/envs/py3.8/lib/python3.8/site-packages/uvicorn/middleware/proxy_headers.py", line 84, in __call__
    return await self.app(scope, receive, send)
  File "/home/user/anaconda3/envs/py3.8/lib/python3.8/site-packages/fastapi/applications.py", line 1054, in __call__
    await super().__call__(scope, receive, send)
  File "/home/user/anaconda3/envs/py3.8/lib/python3.8/site-packages/starlette/applications.py", line 116, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/home/user/anaconda3/envs/py3.8/lib/python3.8/site-packages/starlette/middleware/errors.py", line 186, in __call__
    raise exc
  File "/home/user/anaconda3/envs/py3.8/lib/python3.8/site-packages/starlette/middleware/errors.py", line 164, in __call__
    await self.app(scope, receive, _send)
  File "/home/user/anaconda3/envs/py3.8/lib/python3.8/site-packages/starlette/middleware/cors.py", line 83, in __call__
    await self.app(scope, receive, send)
  File "/home/user/anaconda3/envs/py3.8/lib/python3.8/site-packages/starlette/middleware/exceptions.py", line 62, in __call__
    await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
  File "/home/user/anaconda3/envs/py3.8/lib/python3.8/site-packages/starlette/_exception_handler.py", line 55, in wrapped_app
    raise exc
  File "/home/user/anaconda3/envs/py3.8/lib/python3.8/site-packages/starlette/_exception_handler.py", line 44, in wrapped_app
    await app(scope, receive, sender)
  File "/home/user/anaconda3/envs/py3.8/lib/python3.8/site-packages/starlette/routing.py", line 746, in __call__
    await route.handle(scope, receive, send)
  File "/home/user/anaconda3/envs/py3.8/lib/python3.8/site-packages/starlette/routing.py", line 288, in handle
    await self.app(scope, receive, send)
  File "/home/user/anaconda3/envs/py3.8/lib/python3.8/site-packages/starlette/routing.py", line 75, in app
    await wrap_app_handling_exceptions(app, request)(scope, receive, send)
  File "/home/user/anaconda3/envs/py3.8/lib/python3.8/site-packages/starlette/_exception_handler.py", line 59, in wrapped_app
    raise RuntimeError(msg) from exc
RuntimeError: Caught handled exception, but response already started.

System Info

Python 3.8.16
Gradio 4.12.0
requests 2.31.0
fastapi 0.108.0

System: Ubuntu
Browser: Chrome 120.0.6099.129, Safari 16.1
Proxy: Shadowsocks

Severity

Blocking usage of gradio

@lykeven lykeven added the bug Something isn't working label Jan 2, 2024
@abidlabs abidlabs added the cloud Issues that only happen when deploying Gradio on cloud services label Jan 2, 2024
@shimizust
Contributor

shimizust commented Jan 5, 2024

I am also experiencing this issue after bumping gradio from 3.50.2 to 4.12.0.

I basically have a Gradio app deployed on a k8s cluster. Port-forwarding directly to the pod works as expected, but accessing it externally via emissary causes this same error.

@shimizust
Contributor

@abidlabs Do you have any ideas on what might be the issue and how to work around it, as it's blocking the major version upgrade for us? Is it related to the switch to using SSE by default in v4? Is there a way to disable it?

I think this may be an important issue as more "production" Gradio apps being served on internal infra try upgrading to v4.

@freddyaboulton
Collaborator

Hi @shimizust ! I think the root_path parameter needs to be set when running behind a proxy. See this guide: https://www.gradio.app/guides/running-gradio-on-your-web-server-with-nginx#run-your-gradio-app-on-your-web-server

It uses nginx but I think it should apply to other proxies
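For readers who have not used the parameter, here is a minimal sketch of passing root_path to launch() when the app is served under a sub-path of a reverse proxy. The PROXY_PREFIX environment variable and the helper name are assumptions made for illustration; root_path, server_name and server_port are launch() parameters.

```python
import os


def build_launch_kwargs(port=7860):
    """Collect launch() keyword arguments for an app served behind a reverse proxy."""
    # PROXY_PREFIX is a hypothetical variable name, e.g. "/gradio-demo".
    prefix = os.environ.get("PROXY_PREFIX", "").strip()
    kwargs = {"server_name": "0.0.0.0", "server_port": port}
    if prefix and prefix != "/":
        # Normalize by dropping any trailing slash before handing it to Gradio.
        kwargs["root_path"] = prefix.rstrip("/")
    return kwargs


def main():
    import gradio as gr  # imported here so the helper above has no dependencies

    with gr.Blocks() as demo:
        gr.Markdown("Hello from behind a proxy")
    demo.launch(**build_launch_kwargs())
```

Call main() to start the app. If the app is served at the root of its domain, leave the variable unset and root_path is simply not passed.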

@shimizust
Contributor

Thanks @freddyaboulton, although I think this may be a different issue. My app is running at the root of the domain already.

I think this has to do with session affinity and the use of SSE when the app runs on multiple pods. Once I configured session affinity at the ingress layer (or at the service layer, if you are accessing the app from within the cluster), I was able to get past the "An Unexpected Error Occurred" message on initial load of the app. However, I still get these 404 Session Not Found errors while using the app:

image

These go away if I decrease the number of pods to a single pod, which isn't ideal. I'd like to be able to put multiple pods behind a load balancer. Not sure if anyone has any insights. Again, this wasn't an issue with gradio 3.x.
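The hypothesis in this comment can be illustrated with a toy simulation. It assumes each pod keeps session state only in its own memory, so a request routed to a different pod than the one that registered the session finds nothing. The Replica class and its method names are made up for this sketch and are not Gradio internals.

```python
import itertools


class Replica:
    """Stand-in for one pod that keeps session state only in its own memory."""

    def __init__(self, name):
        self.name = name
        self.sessions = set()

    def join(self, session_hash):
        # The pod that receives the first request records the session.
        self.sessions.add(session_hash)
        return 200

    def stream(self, session_hash):
        # A pod that never saw the session has nothing to stream back.
        return 200 if session_hash in self.sessions else 404


def run_session(pick_replica, session_hash):
    """Send the two requests of one session; return the status of the second."""
    pick_replica().join(session_hash)
    return pick_replica().stream(session_hash)


pods = [Replica("pod-a"), Replica("pod-b")]

# Round-robin: consecutive requests alternate between pods.
rotation = itertools.cycle(pods)
round_robin_status = run_session(lambda: next(rotation), "session-1")

# Sticky: every request of the session goes to the same pod.
sticky_status = run_session(lambda: pods[0], "session-2")

print(round_robin_status, sticky_status)  # 404 200
```

Under these assumptions the second request fails only when it lands on a different pod, which matches the observation that the errors disappear with a single pod.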

@pseudotensor
Contributor

pseudotensor commented Feb 11, 2024

@abidlabs @freddyaboulton I also see the same issue with gradio 4.17.0 on k8s, even though I'm not accessing it externally, only across pods. 3.50.2 worked perfectly in the exact same setup.

Unfortunately, I will probably have to revert to 3.50.2 again (I've tried to upgrade 4 times :( ).

Note that we use nginx perfectly fine on 4.17.0, so it's not just a proxy issue.

@ilyashusterman

ilyashusterman commented Feb 12, 2024

I'm also running with FastAPI, using gradio.queue and a mounted Gradio app,
and it's definitely a problem with Gradio's logic for accessing the session ID across routes. I tried the following Kubernetes YAML config and still get the same exception (response already started):

  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 3600 # 1 hour

I also tried gunicorn, with no success and the same exception:

./.venv/bin/gunicorn main:app --workers 4 --worker-class uvicorn.workers.UvicornWorker --bind 127.0.0.1:7860

Maybe there is a problem with my configuration. Could you check as well?

pseudotensor added a commit to h2oai/h2ogpt that referenced this issue Feb 12, 2024
@pseudotensor
Contributor

pseudotensor commented Feb 12, 2024

@abidlabs Note that this is a regression; 3.50.2 worked fine. I'd hope it can be fixed. I'm unable to upgrade to gradio 4 due to this, even though all non-networking things are wonderful with gradio 4.

@abidlabs
Member

Looking into this!

@pseudotensor
Contributor

Collecting info and repro-ness.

When things are bad on k8, on 4.17.0 (before nginx issue), this is one failure:

I have no name!@h2ogpte-core-7899c6665-n4nn8:/app$ python3                          
Python 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from gradio_client import Client
>>> x = Client('http://h2ogpt-web/')
>>> y = x.predict(api_name='/system_hash')
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/gradio_client/client.py", line 192, in stream_messages
    event_id = resp["event_id"]
KeyError: 'event_id'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.10/dist-packages/gradio_client/client.py", line 1590, in result
    return super().result(timeout=timeout)
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 458, in result
    return self.__get_result()
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
    raise self._exception
  File "/usr/lib/python3.10/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/local/lib/python3.10/dist-packages/gradio_client/client.py", line 973, in _inner
    predictions = _predict(*data)
  File "/usr/local/lib/python3.10/dist-packages/gradio_client/client.py", line 1008, in _predict
    result = utils.synchronize_async(
  File "/usr/local/lib/python3.10/dist-packages/gradio_client/utils.py", line 870, in synchronize_async
    return fsspec.asyn.sync(fsspec.asyn.get_loop(), func, *args, **kwargs)  # type: ignore
  File "/usr/local/lib/python3.10/dist-packages/fsspec/asyn.py", line 103, in sync
    raise return_result
  File "/usr/local/lib/python3.10/dist-packages/fsspec/asyn.py", line 56, in _runner
    result[0] = await coro
  File "/usr/local/lib/python3.10/dist-packages/gradio_client/client.py", line 1206, in _sse_fn_v1_v2
    return await utils.get_pred_from_sse_v1_v2(
  File "/usr/local/lib/python3.10/dist-packages/gradio_client/utils.py", line 414, in get_pred_from_sse_v1_v2
    raise exception
  File "/usr/local/lib/python3.10/dist-packages/gradio_client/utils.py", line 524, in stream_sse_v1_v2
    raise CancelledError()
concurrent.futures._base.CancelledError
>>>

Another one is:

INFO:     10.255.6.190:60640 - "GET /queue/data?session_hash=a3c4ade8-099a-4837-ba21-da4358400402 HTTP/1.1" 200 OK
Exception in ASGI application
Traceback (most recent call last):
  File "/h2ogpt_conda/lib/python3.10/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    await app(scope, receive, sender)
  File "/h2ogpt_conda/lib/python3.10/site-packages/starlette/routing.py", line 77, in app
    await response(scope, receive, send)
  File "/h2ogpt_conda/lib/python3.10/site-packages/starlette/responses.py", line 257, in __call__
    async with anyio.create_task_group() as task_group:
  File "/h2ogpt_conda/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 597, in __aexit__
    raise exceptions[0]
  File "/h2ogpt_conda/lib/python3.10/site-packages/starlette/responses.py", line 260, in wrap
    await func()
  File "/h2ogpt_conda/lib/python3.10/site-packages/starlette/responses.py", line 249, in stream_response
    async for chunk in self.body_iterator:
  File "/h2ogpt_conda/lib/python3.10/site-packages/gradio/routes.py", line 663, in sse_stream
    raise e
  File "/h2ogpt_conda/lib/python3.10/site-packages/gradio/routes.py", line 604, in sse_stream
    raise HTTPException(
fastapi.exceptions.HTTPException: 404: Session not found.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/h2ogpt_conda/lib/python3.10/site-packages/uvicorn/protocols/http/httptools_impl.py", line 419, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
  File "/h2ogpt_conda/lib/python3.10/site-packages/uvicorn/middleware/proxy_headers.py", line 84, in __call__
    return await self.app(scope, receive, send)
  File "/h2ogpt_conda/lib/python3.10/site-packages/fastapi/applications.py", line 1054, in __call__
    await super().__call__(scope, receive, send)
  File "/h2ogpt_conda/lib/python3.10/site-packages/starlette/applications.py", line 123, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/h2ogpt_conda/lib/python3.10/site-packages/starlette/middleware/errors.py", line 186, in __call__
    raise exc
  File "/h2ogpt_conda/lib/python3.10/site-packages/starlette/middleware/errors.py", line 164, in __call__
    await self.app(scope, receive, _send)
  File "/h2ogpt_conda/lib/python3.10/site-packages/starlette/middleware/cors.py", line 83, in __call__
    await self.app(scope, receive, send)
  File "/h2ogpt_conda/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 62, in __call__
    await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
  File "/h2ogpt_conda/lib/python3.10/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
    raise exc
  File "/h2ogpt_conda/lib/python3.10/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    await app(scope, receive, sender)
  File "/h2ogpt_conda/lib/python3.10/site-packages/starlette/routing.py", line 758, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/h2ogpt_conda/lib/python3.10/site-packages/starlette/routing.py", line 778, in app
    await route.handle(scope, receive, send)
  File "/h2ogpt_conda/lib/python3.10/site-packages/starlette/routing.py", line 299, in handle
    await self.app(scope, receive, send)
  File "/h2ogpt_conda/lib/python3.10/site-packages/starlette/routing.py", line 79, in app
    await wrap_app_handling_exceptions(app, request)(scope, receive, send)
  File "/h2ogpt_conda/lib/python3.10/site-packages/starlette/_exception_handler.py", line 68, in wrapped_app
    raise RuntimeError(msg) from exc
RuntimeError: Caught handled exception, but response already started.

@abidlabs
Member

Could someone here please try installing this version of gradio and seeing if the issue is resolved?

pip install https://gradio-builds.s3.amazonaws.com/9b8810ff9af4d9a50032752af09cefcf2ef7a7ac/gradio-4.18.0-py3-none-any.whl

@oobabooga

I am also getting these "404: Session not found." errors all the time after upgrading to gradio==4.19. This is the error message, very similar to the one above by @pseudotensor:

Exception in ASGI application
Traceback (most recent call last):
  File "/home/me/.miniconda3/envs/textgen/lib/python3.10/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    await app(scope, receive, sender)
  File "/home/me/.miniconda3/envs/textgen/lib/python3.10/site-packages/starlette/routing.py", line 77, in app 
    await response(scope, receive, send)
  File "/home/me/.miniconda3/envs/textgen/lib/python3.10/site-packages/starlette/responses.py", line 257, in __call__
    async with anyio.create_task_group() as task_group:
  File "/home/me/.local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 597, in __aexit__
    raise exceptions[0]
  File "/home/me/.miniconda3/envs/textgen/lib/python3.10/site-packages/starlette/responses.py", line 260, in wrap
    await func()
  File "/home/me/.miniconda3/envs/textgen/lib/python3.10/site-packages/starlette/responses.py", line 249, in stream_response
    async for chunk in self.body_iterator:
  File "/home/me/.miniconda3/envs/textgen/lib/python3.10/site-packages/gradio/routes.py", line 665, in sse_stream
    raise e
  File "/home/me/.miniconda3/envs/textgen/lib/python3.10/site-packages/gradio/routes.py", line 605, in sse_stream
    raise HTTPException(
fastapi.exceptions.HTTPException: 404: Session not found.

> Could someone here please try installing this version of gradio and seeing if the issue is resolved?
>
> pip install https://gradio-builds.s3.amazonaws.com/9b8810ff9af4d9a50032752af09cefcf2ef7a7ac/gradio-4.18.0-py3-none-any.whl

I have tried this wheel, and it seems to make the exception go away, but I still get several "Connection errored out." popups in the UI if I refresh it a few times with F5/Ctrl+F5. This is the offending line in index.js:

print

I don't have a simple example to reproduce the issue, but it happens all the time in the dev branch of my project, which now uses Gradio 4.19.

@abidlabs
Member

Thanks @oobabooga, are you running behind a proxy as well?

Also, when you say that you see this error after upgrading to 4.19, what version were you upgrading from? I.e., what was the latest version that did not have this issue for you?

@oobabooga

oobabooga commented Feb 15, 2024

I am not using a proxy, just launching the server with server_name='0.0.0.0' and accessing the UI from another computer in the same local network. The error above doesn't happen every time I open the UI, but if I refresh the page a few times, it always ends up happening after a few attempts.

For clarity, several error popups with the message "404: Session not found." appear in the UI when the stacktrace that I posted happens.

The last gradio version I used was 3.50.2, and that issue never happened there.

print

@oobabooga

I found that if I comment out my interface.load events, the error stops happening. That's lines 149 to 154 here:

https://github.com/oobabooga/text-generation-webui/blob/7123ac3f773baa120d644e7b8ab10027758d1813/server.py#L149

@abidlabs
Member

Ok, I think I know why it's happening on k8s, but I'm not sure why it's happening for you @oobabooga. It seems like that's a separate issue. If you are able to put together a more self-contained repro, that would be very appreciated.

@oobabooga

I have been trying to come up with a minimal example to reproduce the issue, but it has been difficult. I did find that the same error has been happening in other repositories:

daswer123/xtts-finetune-webui#7

invoke-ai/invoke-training#92

It may be the case that the problem in this issue is not the use of a proxy, but the fact that events behind a proxy take longer to run, somehow triggering the error.

@abidlabs abidlabs added the Priority High priority issues label Mar 2, 2024
@mgirard772

mgirard772 commented Mar 5, 2024

Subscribing to this issue. Hoping a fix can be issued soon since the newest version without this issue has known security vulnerabilities.

@abidlabs abidlabs changed the title Error When accessing gradio via a proxy 404 Session Not Found error When accessing gradio via a proxy Mar 6, 2024
@arian81
Contributor

arian81 commented Mar 12, 2024

@abidlabs Are there any updates on fixing this issue in versions after 4.16?

@freddyaboulton
Collaborator

Have you tried with the latest version (4.21.0)? I think this PR may fix it (https://github.com/gradio-app/gradio/pull/7641/files). It improves the logic for determining the URL the gradio app is hosted in when behind a proxy but you can also set the root_path argument in launch to set it yourself.

@arian81
Contributor

arian81 commented Mar 12, 2024

> Have you tried with the latest version (4.21.0)? I think this PR may fix it (https://github.com/gradio-app/gradio/pull/7641/files). It improves the logic for determining the URL the gradio app is hosted in when behind a proxy but you can also set the root_path argument in launch to set it yourself.

I haven't tested anything, because the only way for me to test is to push to prod, which I hesitate to do in case it breaks again. @abidlabs seemed to know exactly what my specific issue was, as he mentioned a change in 4.17 that broke it for me. Maybe he can comment on whether 4.21 will work in my case, and then I'll try it out.

@mgirard772

mgirard772 commented Mar 12, 2024

Confirming that this is still broken.

I did two deployments:

  1. Just bumping the gradio version to 4.21.0
  2. Bumping the gradio version to 4.21.0 and setting GRADIO_ROOT_PATH to my Route53 domain (https://${DOMAIN}/)

Neither of these works; both still produce the same error.

My environment consists of the following:

  • Route53
  • Application Load Balancer
  • ECS, with 2 replicas

Note this seems to only break in environments with more than one replica.

image

@abidlabs
Member

@arian81 if my hypothesis is correct, then the fix should be out in 4.21 and I suggest giving it a shot.

That being said, it looks like the issue is not resolved with multiple replicas, as @mgirard772 pointed out, so I'm still taking a look at how ECS works.

@mgirard772

@abidlabs For what it's worth, @edsna's suggestion of rolling back to 3.50.2 works, and that's what I have deployed in stage and production. I'm not sure exactly what breaking changes were made since then, though. I do want to update to a newer version to resolve the security vulnerabilities that dependabot lists for versions prior to 4.19.2, but I can't do that until this issue is resolved.

@abidlabs abidlabs assigned abidlabs and unassigned aliabid94 Mar 28, 2024
@mgirard772

mgirard772 commented Apr 3, 2024

Any update here? Is there anything I can do to help move us towards a solution? I ended up deploying 4.22.0 through to production due to security concerns, which has left my production deployment with just one replica.

I'd strongly prefer to have this in HA at least in production, but in order to achieve that this issue needs to be resolved.

@abidlabs
Member

abidlabs commented Apr 3, 2024

I'm looking into this issue, currently with @edsna's repro. What would be helpful is if we had a repro that was fully local, e.g. using minikube.

@abidlabs
Member

abidlabs commented Apr 3, 2024

Can you try setting up sticky sessions (i.e., ensuring that connections from the same client IP are routed to the same machine)? Here's an example of how to do that: https://gist.github.com/fjudith/e8acc791f015adf6fd47e5ad7be736cb
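As a rough illustration of what routing by client IP means, the sketch below hashes the client address to pick a backend, so repeat requests from one address land on the same machine. The function name and the addresses are made up; real load balancers implement this themselves, and this is not how any particular one does it.

```python
import hashlib


def pick_backend(client_ip, backends):
    """Map a client IP to one backend so repeat requests land on the same machine."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


backends = ["10.0.0.11:7860", "10.0.0.12:7860", "10.0.0.13:7860"]  # made-up addresses
first = pick_backend("203.0.113.7", backends)
again = pick_backend("203.0.113.7", backends)
print(first == again)  # True
```

Two caveats with this approach: the mapping shifts whenever the backend list changes, and clients that share one address (for example behind NAT) all land on the same backend. Cookie-based affinity avoids the second problem.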

@mgirard772

mgirard772 commented Apr 3, 2024

@abidlabs Good news. Per your second comment, I enabled stickiness in the target group for my ALB in AWS and added a replica in ECS and that seems to have resolved the issue.

For reference, here are the settings I used in the AWS Console:
image

For those who use Terraform, you'll want to add a stickiness block like this to your target group definition:

  stickiness {
    type = "lb_cookie"
    cookie_duration = 86400
    enabled = true
  }
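For deployments scripted in Python rather than Terraform, the same settings can probably be applied with boto3. This is a sketch under that assumption: the two function names are made up, the target group ARN is supplied by the caller, and the call needs AWS credentials and the boto3 package.

```python
def stickiness_attributes(duration_seconds=86400):
    """Build target group attributes equivalent to the Terraform stickiness block."""
    return [
        {"Key": "stickiness.enabled", "Value": "true"},
        {"Key": "stickiness.type", "Value": "lb_cookie"},
        {"Key": "stickiness.lb_cookie.duration_seconds", "Value": str(duration_seconds)},
    ]


def enable_stickiness(target_group_arn, duration_seconds=86400):
    """Apply the attributes with boto3 (needs AWS credentials and the boto3 package)."""
    import boto3  # imported lazily so the builder above works without it

    client = boto3.client("elbv2")
    return client.modify_target_group_attributes(
        TargetGroupArn=target_group_arn,
        Attributes=stickiness_attributes(duration_seconds),
    )
```

Only stickiness_attributes() runs without AWS access, which makes it easy to check the values before applying them.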

@freddyaboulton
Collaborator

Glad it's working @mgirard772 ! We should probably make a note of that in the deployment guide

@abidlabs
Member

abidlabs commented Apr 4, 2024

Awesome! I’ll add this to the docs, maybe in the nginx guide (did you have a different one in mind @freddyaboulton?)

In the meantime, if anyone else is still facing this issue and the above solution doesn’t work for them, please let us know so that we can investigate further if needed

@shimizust
Contributor

Thanks for all the efforts to investigate the issue. For others' reference, if using Emissary Ingress (previously Ambassador) to route external requests to your k8s service, you can update the Emissary mapping resource to inject a cookie to enable sticky sessions like below. I was able to deploy multiple replicas without issue this way.

apiVersion: getambassador.io/v2
kind: Mapping
metadata:
  name: gradio-ui-emissary-mapping
spec:
  ...
  service: http://<service>.<namespace>:7860
  load_balancer:
    policy: ring_hash
    cookie:
      name: emissary-sticky-cookie
      ttl: 7200s

@SamridhiVaid

I am still getting the error that @edsna got. I am using gradio==4.26.0 and gradio-client==0.15.1. I have built a Docker image of the Gradio app and hosted it on a cloud service. When I try to access it using a URL, my app keeps loading.
Screenshot 2024-04-18 at 11 40 52 AM

@abidlabs
Member

Hi @SamridhiVaid are you running multiple replicas? Can you provide more details on your app and your deployment setup? Are there any Python or JS console logs? We'll need more information to help

@SamridhiVaid

> Hi @SamridhiVaid are you running multiple replicas? Can you provide more details on your app and your deployment setup? Are there any Python or JS console logs? We'll need more information to help

Hi. I am not running multiple replicas. I have two Docker containers hosted on a cloud service. One runs vLLM and the other runs the Gradio app. The textbox in the Gradio app should display the output of the LLM. I can verify that the LLM and Gradio are able to communicate with each other; the only problem is that Gradio doesn't display the output. I am not sure whether I should just downgrade my gradio version and check again.
Screenshot 2024-04-18 at 12 02 39 PM

@michaeltremeer

Just confirming that this issue was also occurring with Azure App Service when deployed with multiple instances, but enabling session affinity fixed the issue immediately.

@FrancescoSaverioZuppichini

> Just confirming that this issue was also occurring with Azure App Service when deployed with multiple instances, but enabling session affinity fixed the issue immediately.

I also have this issue with an Azure web app!

@FrancescoSaverioZuppichini

Turned on session affinity and it worked like a charm! Thanks a lot

Screenshot 2024-09-02 at 10 36 03

@satyajitghana

For anyone trying this out with minikube and the nginx ingress, adding these annotations should fix it:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
    name: classifier-ingress
    annotations:
        nginx.ingress.kubernetes.io/affinity: "cookie"
        nginx.ingress.kubernetes.io/affinity-mode: "balanced"
        nginx.ingress.kubernetes.io/session-cookie-name: "INGRESSCOOKIE"
        nginx.ingress.kubernetes.io/session-cookie-expires: "172800"
        nginx.ingress.kubernetes.io/session-cookie-max-age: "172800"
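One way to check that an affinity cookie is actually being issued is to inspect the Set-Cookie headers of a response. The helper below is a small sketch using only the standard library; the function name and the sample header value are made up, and the default cookie name is the one from the annotations above.

```python
from http.cookies import SimpleCookie


def find_sticky_cookie(set_cookie_headers, name="INGRESSCOOKIE"):
    """Return the named cookie's value and Max-Age, or None if it was not set."""
    for header in set_cookie_headers:
        jar = SimpleCookie()
        jar.load(header)
        if name in jar:
            morsel = jar[name]
            return {"value": morsel.value, "max_age": morsel["max-age"]}
    return None


# Example header as a response might carry it; the value is made up.
sample = ["INGRESSCOOKIE=abc123; Max-Age=172800; Path=/; HttpOnly"]
print(find_sticky_cookie(sample))
```

Real headers can be collected with, for example, urllib.request.urlopen(url).headers.get_all("Set-Cookie") or [] and passed to the helper. For the Emissary mapping shown earlier, pass name="emissary-sticky-cookie".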
