Bug: Dockerized OpenIM Server Reports "Unhealthy" and Experiences Kafka Connection Issues #1571

Closed
cubxxw opened this issue Dec 18, 2023 · 5 comments
Labels
bug Categorizes issue or PR as related to a bug.

Comments


cubxxw commented Dec 18, 2023

What happened?

I have deployed the OpenIM server using Docker, and it's currently reporting as "unhealthy" in the Docker status. When checking the logs, there are multiple errors related to Kafka connections and log file rotation.

What did you expect to happen?

The container should stay healthy; the OpenIM server should run without these errors.

How can we reproduce it (as minimally and precisely as possible)?

  1. Deployed the OpenIM server using Docker.
  2. Ran docker ps | grep openim-server to check the status.
  3. The server status is shown as "Up 36 hours (unhealthy)".
  4. Executed docker logs openim-server to inspect the logs.

The OpenIM server should start without any errors and should not be in an unhealthy state.
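For anyone reproducing this, the health state and the output of the failing check can be pulled straight from Docker. A minimal sketch, assuming the container is named openim-server as in the output above:

```console
# Show the container's health status plus the output of its most recent health checks
$ docker inspect --format '{{json .State.Health}}' openim-server

# Tail only the recent logs instead of the full 36-hour history
$ docker logs --tail 200 openim-server
```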

Actual Behavior:

The Docker container for the OpenIM server is marked as unhealthy, and the logs indicate multiple errors:

  • Failed log file rotation: failed to rotate: open /openim/openim-server/logs//OpenIM.log.all.2023-12-17_lock: file exists
  • Kafka connection issues: panic: kafka: client has run out of available brokers to talk to: read tcp 172.28.0.8:57066->172.28.0.1:19092: read: connection reset by peer
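A few hedged checks that help separate the two failures. The broker address 172.28.0.1:19092 and the lock-file path are taken verbatim from the logs below; the Kafka container name and the presence of nc inside the image are assumptions:

```console
# Is the Kafka broker reachable from inside the server container?
# (requires nc/netcat in the image; substitute your own connectivity check if it is missing)
$ docker exec openim-server sh -c 'nc -zv 172.28.0.1 19092'

# Did the Kafka container itself die or restart? ("kafka" is a guess at the service name)
$ docker ps -a | grep -i kafka
$ docker logs --tail 100 kafka

# The rotation error comes from a leftover lock file of a previous run; with the server
# stopped it can likely be removed (path copied verbatim from the log)
$ rm /openim/openim-server/logs//OpenIM.log.all.2023-12-17_lock
```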

Anything else we need to know?

Log Snippet:

# Begin to check all openim service

## Check all dependent service ports

## Check OpenIM service name
Found 4 processes named /openim/openim-server/_output/bin/platforms/linux/amd64

## Check all OpenIM service ports
Checking ports: 10110 10120 10130 10140 10150 10160 10170 10180 10190 10002 10001

### Started ports:
Port 10110 - Command: openim-rpc-user, PID: 2705, FD: 3, Started: Sat Dec 16 22:38:04 2023
Port 10120 - Command: openim-rpc-frie, PID: 2736, FD: 3, Started: Sat Dec 16 22:38:04 2023
Port 10130 - Command: openim-rpc-msg, PID: 2767, FD: 3, Started: Sat Dec 16 22:38:04 2023
Port 10140 - Command: openim-msggatew, PID: 4611, FD: 8, Started: Sat Dec 16 22:38:07 2023
Port 10150 - Command: openim-rpc-grou, PID: 2798, FD: 3, Started: Sat Dec 16 22:38:04 2023
Port 10160 - Command: openim-rpc-auth, PID: 2831, FD: 3, Started: Sat Dec 16 22:38:05 2023
Port 10170 - Command: openim-push, PID: 3507, FD: 3, Started: Sat Dec 16 22:38:06 2023
Port 10180 - Command: openim-rpc-conv, PID: 2863, FD: 6, Started: Sat Dec 16 22:38:05 2023
Port 10190 - Command: openim-rpc-thir, PID: 2895, FD: 3, Started: Sat Dec 16 22:38:05 2023
Port 10002 - Command: openim-api, PID: 2223, FD: 9, Started: Sat Dec 16 22:38:02 2023
Port 10001 - Command: openim-msggatew, PID: 4611, FD: 3, Started: Sat Dec 16 22:38:07 2023
[success 1216 23:45:11] ==>  All specified processes are running.
failed to rotate: open /openim/openim-server/logs//OpenIM.log.all.2023-12-17_lock: file exists
failed to rotate: open /openim/openim-server/logs//OpenIM.log.all.2023-12-17_lock: file exists
panic: kafka: client has run out of available brokers to talk to: read tcp 172.28.0.8:57066->172.28.0.1:19092: read: connection reset by peer

goroutine 99 [running]:
github.com/openimsdk/open-im-server/v3/pkg/common/kafka.(*MConsumerGroup).RegisterHandleAndConsumer(0xc00044f740, {0x17419d0, 0xc00007c6c0})
        /openim/openim-server/pkg/common/kafka/consumer_group.go:71 +0x145
created by github.com/openimsdk/open-im-server/v3/internal/push.(*Consumer).Start
        /openim/openim-server/internal/push/consumer_init.go:31 +0x8f

panic: kafka: client has run out of available brokers to talk to: EOF
panic: kafka: client has run out of available brokers to talk to: read tcp 172.28.0.8:57078->172.28.0.1:19092: read: connection reset by peer
panic: kafka: client has run out of available brokers to talk to: read tcp 172.28.0.8:57016->172.28.0.1:19092: read: connection reset by peer
panic: kafka: client has run out of available brokers to talk to: read tcp 172.28.0.8:57018->172.28.0.1:19092: read: connection reset by peer

(Several more goroutines panicked concurrently, so their stack traces are interleaved in the raw log. The recoverable, deduplicated frames are:)

goroutine ... [running]:
github.com/openimsdk/open-im-server/v3/pkg/common/kafka.(*MConsumerGroup).RegisterHandleAndConsumer(...)
        /openim/openim-server/pkg/common/kafka/consumer_group.go:71 +0x145
created by github.com/openimsdk/open-im-server/v3/internal/msgtransfer.(*MsgTransfer).Start
        /openim/openim-server/internal/msgtransfer/init.go:99 +0x27f (one goroutine at init.go:98 +0x1ea)

Additional Information:
Kafka configuration details, if applicable.
Any specific changes made to the default configuration.
Network setup details if relevant.

version

OpenIM Server Version: 3.5.0-rc.6
Docker version:

smile@smile:/data/workspaces/openim-docker$ docker version
Client:
 Version:           24.0.5
 API version:       1.43
 Go version:        go1.20.3
 Git commit:        24.0.5-0ubuntu1~22.04.1
 Built:             Mon Aug 21 19:50:14 2023
 OS/Arch:           linux/amd64
 Context:           default

Server:
 Engine:
  Version:          24.0.5
  API version:      1.43 (minimum version 1.12)
  Go version:       go1.20.3
  Git commit:       24.0.5-0ubuntu1~22.04.1
  Built:            Mon Aug 21 19:50:14 2023
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.7.2
  GitCommit:        
 runc:
  Version:          1.1.7-0ubuntu1~22.04.1
  GitCommit:        
 docker-init:
  Version:          0.19.0
  GitCommit:   

Host Operating System: ubuntu 22.04
Deployment: Docker

Cloud provider

OS version

```console
# On Linux:
$ cat /etc/os-release
# paste output here
$ uname -a
# paste output here

# On Windows:
C:\> wmic os get Caption, Version, BuildNumber, OSArchitecture
# paste output here
```

Install tools

cubxxw added the bug label on Dec 18, 2023

cubxxw commented Dec 18, 2023

name-of-archive.tar.gz


kubbot commented Feb 16, 2024

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 7 days.


cubxxw commented Feb 18, 2024

On the surface this might look normal, but the underlying problem is that Kafka may not be functioning properly. Besides exploring a more lightweight alternative to Kafka, we could also check whether the machine has enough memory to run Kafka.
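If memory pressure is the suspicion, a rough check on the host and per container is enough to confirm or rule it out. A sketch, assuming the broker runs in a container whose name contains "kafka":

```console
# Overall host memory
$ free -h

# Per-container memory/CPU snapshot
$ docker stats --no-stream

# Did the kernel OOM-kill the broker? (may need root)
$ dmesg | grep -iE 'killed process|out of memory' | tail -n 20

# Broker-side errors around the time of the panic ("kafka" is a guessed container name)
$ docker logs --since 24h kafka 2>&1 | grep -iE 'error|shut' | tail -n 50
```

If the broker does turn out to be OOM-killed, capping its JVM heap (for example via KAFKA_HEAP_OPTS in the compose file, if the image honors it) or giving the host more memory are the usual remedies.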


kubbot commented Apr 19, 2024

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 7 days.


kubbot commented Apr 28, 2024

This issue was closed because it has been stalled for 7 days with no activity.

kubbot closed this as not planned on Apr 28, 2024