
Intermittent EOF Errors #1464

Open · 0xrelapse opened this issue Jan 4, 2025 · 5 comments
Labels: bug, cloud (ClickHouse Cloud related tests)

Comments

@0xrelapse

Observed

We're getting this error periodically via the Go ClickHouse driver when connecting to ClickHouse Cloud:

read:
    github.com/ClickHouse/ch-go/proto.(*Reader).ReadFull
        /go/pkg/mod/github.com/!click!house/[email protected]/proto/reader.go:62
  - EOF

Here is our config (with connection details omitted):

	// nolint:exhaustruct
	conn, err := clickhouse.Open(&clickhouse.Options{
		TLS:      ...,
		Protocol: clickhouse.Native,
		Addr:     ....,
		Auth: clickhouse.Auth{
			Username: ...,
			Password: ...,
			Database: ...,
		},
		ClientInfo: clickhouse.ClientInfo{
			Products: []struct {
				Name    string
				Version string
			}{
				{Name: "....", Version: "0.1"},
			},
		},
		Compression: &clickhouse.Compression{
			Method: clickhouse.CompressionLZ4,
		},
		BlockBufferSize: 10,
		MaxOpenConns:    70,
		MaxIdleConns:    50,
	})

After this error occurs, other queries (both reads and writes) also fail with the same EOF error.

Because this is intermittent, it's a little hard to reproduce.

I wonder if there's a race condition in the connection lifetime cleanup routine?
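If stale pooled connections are part of the problem, one commonly suggested mitigation (an assumption here, not a confirmed fix) is to cap connection lifetime so the pool retires connections before a server-side idle timeout can silently sever them. A sketch against the same Options struct as above; ConnMaxLifetime and DialTimeout are real clickhouse-go v2 fields, but the durations are illustrative guesses:

```go
// Sketch only: recycle pooled connections periodically so the pool never
// hands out a connection the server may have already closed. The
// durations below are illustrative assumptions, not tested recommendations.
conn, err := clickhouse.Open(&clickhouse.Options{
	// ... same connection details as above ...
	DialTimeout:     30 * time.Second, // leave headroom for a Cloud wake-up
	ConnMaxLifetime: 10 * time.Minute, // retire connections before idle cutoffs
	MaxOpenConns:    70,
	MaxIdleConns:    50,
})
```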


Environment

  • clickhouse-go version: v2.30.0
  • Interface: database/sql compatible driver
  • Go version: 1.22.10
  • Operating system: amazon-linux-2023
  • ClickHouse version: 24.8
  • Is it ClickHouse Cloud? Yes
  • ClickHouse Server non-default settings, if any: everything is default
@jkaflik (Contributor) commented Jan 7, 2025

@SpencerTorres could you take a look?

@SpencerTorres (Member)
@0xrelapse Could you add some details about how you're connecting and what types of queries you're running? How frequently, etc.? Also let me know if you have any special settings in the TLS config. This issue will be hard to reproduce.

@begelundmuller commented Feb 4, 2025

@SpencerTorres We're also seeing this issue. It seems to happen when connecting to a ClickHouse Cloud service that is idle, so the connection has to wait for the service to scale up from zero.

We are connecting with the HTTP protocol (not native protocol) and with TLS enabled, but apart from that there's no other custom connection config.

@SpencerTorres (Member)

@begelundmuller I appreciate the extra insight... This makes more sense as the root cause. I suppose this would need to be handled at the application level? Perhaps some kind of retry or health check to verify the server is ready for connections? If this is a production instance you could also disable the service's sleep timeout.

I'll ask around internally to see if we have any other suggestions for this. What can the client do if it's ultimately a networking issue? We could add some extra logic to verify the connection is ready, but either way the application should already be prepared to handle network outages.

@SpencerTorres added the cloud (ClickHouse Cloud related tests) label and removed the needs triage label on Feb 4, 2025
@begelundmuller commented Feb 5, 2025

@SpencerTorres Thank you for investigating! My impression from the docs is that the autoscaling works similarly to services like AWS Lambda, i.e. the proxy keeps the connection alive until the underlying cluster has scaled up and is ready to serve queries. So I wonder if this might just be a bug in the proxy implementation.

If retries are required in the application logic, it would be helpful if you could provide guidance on how to implement them. However, if retries are indeed needed, I think they might have to be implemented inside this driver, since the database/sql interface doesn't provide granular control over the connection pool.

4 participants