Bug: online user count and memory usage only increase, and a large number of TCP connections are in the FIN-WAIT-2 state #2142

Closed
manlinux opened this issue Mar 26, 2024 · 5 comments
Labels
bug Categorizes issue or PR as related to a bug.

manlinux commented Mar 26, 2024

What happened?

On the Grafana monitoring page for OpenIM, online_user num and user_login_total increase continuously and stay high even in the early morning.

The system also has many TCP connections to the OpenIM server (port 10001) in the FIN_WAIT2 state.

[Three screenshots attached: 4ffd99bd35d0dfd1b0677985abe21bb, Clip_2024-03-26_13-23-27, 3eb7a54d0a42370a7ff1996638a3dc2]

What did you expect to happen?

We'd like to understand why this happens and how to fix it, preferably so that the real number of online users is shown.
Memory should also be released once it is no longer needed.

How can we reproduce it (as minimally and precisely as possible)?

https://github.com/openimsdk/openim-docker/blob/v3.5.0/docker-compose.yaml

The Dockerfile of openim-oimws:


# Build Stage
FROM golang:1.20 AS builder

# Set go mod installation source and proxy
ARG GO111MODULE=on
ARG GOPROXY=https://goproxy.cn,direct
ENV GO111MODULE=$GO111MODULE
ENV GOPROXY=$GOPROXY

# Set up the working directory
WORKDIR /openim/openim-server

COPY go.mod go.sum ./
RUN go mod download

# Copy all files to the container
ADD . .

RUN make clean
RUN make build

FROM ghcr.io/openim-sigs/openim-ubuntu-image:latest

ENV OPENIM_API_IP ${OPENIM_API_IP}
ENV OPENIM_API_PORT  ${OPENIM_API_PORT}
ENV OPENIM_WS_IP ${OPENIM_WS_IP}
ENV OPENIM_WS_PORT ${OPENIM_WS_PORT}
ENV SDK_WS_PORT  ${SDK_WS_PORT}
ENV OPENIM_LOG_LEVEL  ${OPENIM_LOG_LEVEL}
ENV OPENIM_DB_DIR ${OPENIM_DB_DIR}

WORKDIR /app
COPY --from=builder /openim/openim-server/_output/bin/ /app/bin/

CMD  /app/bin/main -openIM_api_address="http://${OPENIM_API_IP}:${OPENIM_API_PORT}" -openIM_ws_address="ws://${OPENIM_WS_IP}:${OPENIM_WS_PORT}" -sdk_ws_port=${SDK_WS_PORT} -openIM_log_level=${OPENIM_LOG_LEVEL} -openIMDbDir="${OPENIM_DB_DIR}"

Anything else we need to know?

No response

version

openim server 3.5
docker pull openim/openim-server:release-v3.5
openim-chat release-v1.5
docker pull openim/openim-chat:release-v1.5
openim-oimws v3.5.1-alpha.8
https://github.com/openim-sigs/oimws/tree/v3.5.1-alpha.8

Cloud provider

OS version

NAME="Oracle Linux Server" VERSION="7.8"

Linux OpenIMServer 4.14.35-1902.300.11.el7uek.x86_64

Install tools

manlinux added the bug label on Mar 26, 2024

Contributor

kubbot commented Mar 26, 2024

Hello! Thank you for filing an issue.

If this is a bug report, please include relevant logs to help us debug the problem.

Join slack 🤖 to connect and communicate with our developers.

Contributor

cubxxw commented Mar 26, 2024

Could you please confirm if this Dockerfile was authored by you? Additionally, would you be able to provide specific version information used? Thank you.

Author

manlinux commented Mar 27, 2024

> Could you please confirm if this Dockerfile was authored by you? Additionally, would you be able to provide specific version information used? Thank you.

Yes, I wrote the openim-oimws Dockerfile myself.

mariadb:10.6
kafka:3.5.1
mongo:6.0.2
redis:7.0.0

openim server image version: release-v3.5
openim chat image version: release-v1.5
openim-admin:toc-base-open-docker.35
oimws git branch: v3.5.1-alpha.8

Additionally, I applied the following kernel parameters (sysctl) on the Linux system:
net.core.somaxconn = 65535
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_syn_retries = 2
net.ipv4.tcp_fin_timeout = 10
net.ipv4.tcp_max_syn_backlog = 8192
net.ipv4.tcp_fastopen = 3
net.ipv4.tcp_slow_start_after_idle = 0
net.ipv4.tcp_keepalive_time = 60
net.ipv4.tcp_keepalive_probes = 3
net.ipv4.tcp_keepalive_intvl = 6
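
To confirm that the kernel parameters listed above are actually in effect, the values can be compared against what the kernel reports under /proc/sys. The following is a minimal Python sketch, not part of OpenIM; the helper names (parse_sysctl, sysctl_path, check_settings) are made up for illustration. It may also be worth running it inside the openim containers: as far as I know, many net.ipv4.* parameters are scoped per network namespace, so values set on the host are not necessarily the ones a container sees. That point is an assumption to verify on this kernel, not a diagnosis.

```python
from pathlib import Path

# Settings copied from the comment above; spacing around "=" may vary.
SETTINGS_TEXT = """
net.core.somaxconn=65535
net.ipv4.tcp_tw_reuse=1
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_syn_retries = 2
net.ipv4.tcp_fin_timeout = 10
net.ipv4.tcp_max_syn_backlog = 8192
net.ipv4.tcp_fastopen = 3
net.ipv4.tcp_slow_start_after_idle = 0
net.ipv4.tcp_keepalive_time = 60
net.ipv4.tcp_keepalive_probes = 3
net.ipv4.tcp_keepalive_intvl = 6
"""


def parse_sysctl(text):
    """Parse 'key = value' lines into a dict, skipping blanks and comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", ";")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def sysctl_path(key, root="/proc/sys"):
    """Map a dotted sysctl key to its file under /proc/sys."""
    return Path(root) / key.replace(".", "/")


def check_settings(expected, root="/proc/sys"):
    """Return {key: (expected, actual)} for values that differ or cannot be read."""
    mismatches = {}
    for key, want in expected.items():
        try:
            have = " ".join(sysctl_path(key, root).read_text().split())
        except OSError:
            have = None  # missing file or no permission
        if have != want:
            mismatches[key] = (want, have)
    return mismatches


if __name__ == "__main__":
    for key, (want, have) in check_settings(parse_sysctl(SETTINGS_TEXT)).items():
        print(f"{key}: expected {want}, found {have}")
```

If the script prints nothing, every listed value matches in the namespace where it was run.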

When I run "netstat -nat | grep FIN_WAIT2 | wc -l" on the Linux console, the number of TCP connections in FIN_WAIT2 is currently about 28245.
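
The netstat pipeline above gives a single total. A per-state breakdown limited to port 10001 may make it easier to see how FIN_WAIT2 compares with ESTABLISHED over time. Below is a small Python sketch that tallies states from "netstat -nat" text. The function name count_states and the SAMPLE rows are invented for illustration; they are not real output from this server.

```python
from collections import Counter

# Illustrative sample only; these rows are made up, not captured from the reporter's host.
SAMPLE = """\
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 0.0.0.0:10001           0.0.0.0:*               LISTEN
tcp        0      0 10.195.1.1:10001        10.195.1.12:35564       FIN_WAIT2
tcp        0      0 10.195.1.1:10001        10.195.1.12:35566       FIN_WAIT2
tcp        0      0 10.195.1.1:10001        10.195.1.12:35570       ESTABLISHED
tcp        0      0 10.195.1.1:22           10.195.1.50:51000       ESTABLISHED
"""


def count_states(netstat_output, port=None):
    """Count TCP connections per state in `netstat -nat` output.

    If `port` is given, keep only rows where the local or the foreign
    endpoint uses that port.
    """
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # TCP rows: proto, recv-q, send-q, local address, foreign address, state
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        local, foreign, state = fields[3], fields[4], fields[5]
        if port is not None:
            ports = {local.rsplit(":", 1)[-1], foreign.rsplit(":", 1)[-1]}
            if str(port) not in ports:
                continue
        counts[state] += 1
    return counts


if __name__ == "__main__":
    for state, n in count_states(SAMPLE, port=10001).most_common():
        print(f"{state:12} {n}")
```

To use it on a live system, pass the text captured from "netstat -nat" to count_states instead of SAMPLE.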

The openim-oimws log files contain warnings such as these:
openim-oimws | 2024-03-26 19:35:02.156 WARN [PID:7] [interaction/long_conn_mgr.go:186] reConn {"error": "read tcp 10.195.1.12:35564->10.195.1.1:10001: read: connection reset by peer"}
openim-oimws | 2024-03-26 19:35:02.160 WARN [PID:7] [interaction/long_conn_mgr.go:186] reConn {"operationID": "1711449856320880343", "error": "read tcp 10.195.1.12:35568->10.195.1.1:10001: read: connection reset by peer"}
openim-oimws | 2024-03-26 19:35:02.160 WARN [PID:7] [interaction/long_conn_mgr.go:186] reConn {"error": "read tcp 10.195.1.12:35566->10.195.1.1:10001: read: connection reset by peer"}
openim-oimws | 2024-03-26 19:35:02.170 WARN [PID:7] [interaction/long_conn_mgr.go:186] reConn {"error": "read tcp 10.195.1.12:35570->10.195.1.1:10001: read: connection reset by peer"}
openim-oimws | 2024-03-26 19:35:02.170 WARN [PID:7] [interaction/long_conn_mgr.go:186] reConn {"operationID": "1711449858558726627", "error": "read tcp 10.195.1.12:35574->10.195.1.1:10001: read: connection reset by peer"}
openim-oimws | 2024-03-26 19:35:02.170 WARN [PID:7] [interaction/long_conn_mgr.go:186] reConn {"operationID": "1711449856066444155", "error": "read tcp 10.195.1.12:35576->10.195.1.1:10001: read: connection reset by peer"}
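
To see how often these resets occur and which endpoint they involve, the warning lines can be grouped by remote address and error reason. The Python sketch below assumes every line follows the layout of the lines quoted above (service name, timestamp, level, PID, source location, message, JSON fields). The regular expressions and function names are illustrative and are not part of OpenIM.

```python
import json
import re
from collections import Counter

# Layout assumed from the log lines quoted above.
LOG_RE = re.compile(
    r"^(?P<service>\S+)\s+\|\s+"
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+\[PID:\d+\]\s+\[(?P<source>[^\]]+)\]\s+"
    r"(?P<msg>\S+)\s+(?P<fields>\{.*\})\s*$"
)
# Error text shape: "read tcp <local>-><remote>: read: <reason>"
CONN_RE = re.compile(r"read tcp (?P<local>\S+)->(?P<remote>\S+): read: (?P<reason>.+)$")


def parse_line(line):
    """Return a dict describing one log line, or None if it does not match."""
    m = LOG_RE.match(line.strip())
    if not m:
        return None
    try:
        fields = json.loads(m.group("fields"))
    except json.JSONDecodeError:
        return None
    record = m.groupdict()
    record["fields"] = fields
    conn = CONN_RE.match(fields.get("error", ""))
    record["remote"] = conn.group("remote") if conn else None
    record["reason"] = conn.group("reason") if conn else None
    return record


def summarize(lines):
    """Count read errors per (remote endpoint, reason)."""
    counts = Counter()
    for line in lines:
        rec = parse_line(line)
        if rec and rec["reason"]:
            counts[(rec["remote"], rec["reason"])] += 1
    return counts


if __name__ == "__main__":
    sample = [
        'openim-oimws | 2024-03-26 19:35:02.156 WARN [PID:7] [interaction/long_conn_mgr.go:186] reConn {"error": "read tcp 10.195.1.12:35564->10.195.1.1:10001: read: connection reset by peer"}',
        'openim-oimws | 2024-03-26 19:35:02.160 WARN [PID:7] [interaction/long_conn_mgr.go:186] reConn {"operationID": "1711449856320880343", "error": "read tcp 10.195.1.12:35568->10.195.1.1:10001: read: connection reset by peer"}',
    ]
    for (remote, reason), n in summarize(sample).items():
        print(f"{remote}  {reason}  x{n}")
```

Run over a full log file, a summary like this would show whether all resets come from the same endpoint on port 10001, which could help the maintainers narrow down where connections are being dropped.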

manlinux reopened this on Mar 29, 2024
@skiffer-git
Member

I recommend you update to release-v3.8. If you encounter any new issues, please reopen this issue or create a new one.

@skiffer-git
Member

The JS SDK has been updated with a new solution and will be released this week.
