Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Memory allocations optimization #1695

Open
dkropachev opened this issue May 9, 2023 · 6 comments
Open

Memory allocations optimization #1695

dkropachev opened this issue May 9, 2023 · 6 comments

Comments

@dkropachev
Copy link

dkropachev commented May 9, 2023

There are lots of peaces of code where reciever is concrete value:

type CollectionType struct {
	NativeType
	Key  TypeInfo // only used for TypeMap
	Elem TypeInfo // only used for TypeMap, TypeList and TypeSet
}

func (t CollectionType) NewWithError() (interface{}, error) {
	typ, err := goType(t)
	if err != nil {
		return nil, err
	}
	return reflect.New(typ).Interface(), nil
}

As result when NewWithError is called golang recreate copy of reciever, i.e. CollectionType struct.
There is simple way to optimize it is to convert all of them to pointer reciever, which will make these allocations go away.
I have run test on that and here is results.

Before the fix (concrete receiver)

➜  gocql git:(dk/1693-potential-panic-on-deserialization) ✗ go test -bench . -benchmem               
goos: linux
goarch: amd64
pkg: github.com/gocql/gocql
cpu: 12th Gen Intel(R) Core(TM) i9-12900HK
BenchmarkMarshal-20      1532197               751.1 ns/op           504 B/op         13 allocs/op

After the fix (pointer receiver)

➜  gocql git:(dk/1693-potential-panic-on-deserialization) ✗ go test -bench . -benchmem               
goos: linux
goarch: amd64
pkg: github.com/gocql/gocql
cpu: 12th Gen Intel(R) Core(TM) i9-12900HK
BenchmarkMarshal-20      1793576               676.4 ns/op           312 B/op         11 allocs/op

As you can see size-wize memory allocations went from 504 to 312 B/op and allocations went from 13 to 11 per op.

Test code:

func BenchmarkMarshal(b *testing.B) {
	typeInfo := UDTTypeInfo{NativeType{proto: 3, typ: TypeUDT}, "", "xyz", []UDTField{
		{Name: "x", Type: &NativeType{proto: 3, typ: TypeInt}},
		{Name: "y", Type: &NativeType{proto: 3, typ: TypeInt}},
		{Name: "z", Type: &NativeType{proto: 3, typ: TypeInt}},
	}}
	type NewWithError struct {
		Value interface{}
		Error error
	}

	type ResultStore struct {
		NewWithError NewWithError
		New          interface{}
		String       string
		Type         Type
		Version      byte
		Custom       string
	}

	store := make([]ResultStore, b.N)
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		rec := &store[i]
		rec.NewWithError.Value, rec.NewWithError.Error = typeInfo.NewWithError()
		rec.New = typeInfo.New()
		rec.String = typeInfo.String()
		rec.Type = typeInfo.Type()
		rec.Version = typeInfo.Version()
		rec.Custom = typeInfo.Custom()
	}
}
@martin-sucha
Copy link
Contributor

Unfortunately, it seems that changing usage of TypeInfo family of types to pointer receiver is a backward incompatible change, so it would need to be in a new major version. Those types are used outside of the gocql package (for example in kiwicom/easycql, there is a type assertion like udt, ok := info.(gocql.UDTTypeInfo) that would break).

@dkropachev
Copy link
Author

Unfortunately, it seems that changing usage of TypeInfo family of types to pointer receiver is a backward incompatible change, so it would need to be in a new major version. Those types are used outside of the gocql package (for example in kiwicom/easycql, there is a type assertion like udt, ok := info.(gocql.UDTTypeInfo) that would break).

True.

Given the fact that serialization and deseralization as a whole are not implemented in the best memory efficient way whole thing could be rewritten to improve performance.

@dkropachev
Copy link
Author

@martin-sucha , does it mean I can proceed working on it?

@martin-sucha
Copy link
Contributor

There are no concrete plans for v2 now. There are other things that require breaking changes and should be included in v2. Some of those other things are marked with the semver-major label as well. For example we should get rid of global logger, global batch, split host selection policies to separate packages and make them testable, etc. Remove support of old protocol versions, maybe add support for protocol v5 (if it requires breaking changes in the API, there may not be the need to tie it to v2). It would also help to change the Marshaller/Unmarshaller interfaces, there is currently no way to register custom unmarshaller for a CQL type protocol extension (#1675 (comment)).

Releasing a v2 also means that the other packages in the ecosystem like scylladb/gocqlx will need to add support.

Also there is https://github.com/scylladb/scylla-go-driver that I want to try out, so personally I can't promise anything regarding gocql v2 now. I am open to discussions though. If the community agrees on what the scope of what should be (and what shouldn't be) in v2 and who will implement all the changes required for v2, and how long v1 should be supported alongside v2, I guess I could help with releasing the changes.

cc @jameshartig

@joao-r-reis
Copy link
Contributor

joao-r-reis commented Oct 30, 2024

We're currrently working on a v2 release and this issue is labeled with semver-major so I took a look at it. It's not immediately clear if the performance benefit makes the breaking change worth it... it doesn't seem that great and the breaking change impact seems significant. I don't have a very strong opinion on this though.

@joao-r-reis
Copy link
Contributor

I'd like to hear from other people in the community (maybe a DISCUSS thread on the mailing list) before putting this in scope for the v2 release.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants