SNOW-1786191: SNOW-1747415 High Memory Usage SFReusableChunk #1055

Open
hadighattas opened this issue Nov 4, 2024 · 4 comments
Labels
- backend changes needed: Change must be implemented on the Snowflake service, and not in the client driver.
- enhancement: The issue is a request for improvement or a new feature
- status-blocked: Progress cannot be made to this issue due to an outside blocking factor.
- status-triage_done: Initial triage done, will be further handled by the driver team

Comments

@hadighattas

  1. What version of .NET driver are you using?
    4.1.0. I also tried building and using the latest master, including the fixes mentioned in SNOW-1612981: SNOW-1640968 SNOW-1629635 Massive Memory Load #1004.

  2. What operating system and processor architecture are you using?
    macOS 14.7 ARM

  3. What version of .NET framework are you using?
    .NET Standard 2.0

  4. What did you do?

Pulling a lot of data is memory intensive.
We tried pulling 100M rows, and memory usage averaged around 800-900 MB in a unit test; forcing garbage collection did not change anything.
This test uses the Dapper unbuffered API, which fully supports streaming.
The memory profiler indicates that almost all of the allocated objects are in SFReusableChunk BlockResultData.

Profiler screenshots (four images, not reproduced here)

Reproducing this issue is straightforward: we pulled 100M records from the sample data using this query:
SELECT * FROM snowflake_sample_data.TPCDS_SF100TCL.inventory
  5. What did you expect to see?

Memory buffer sizes (chunks) should be more conservative/configurable.

  6. Can you set logging to DEBUG and collect the logs?

Reproducing this is pretty straightforward. I can do it if necessary.

@hadighattas hadighattas added the bug label Nov 4, 2024
@github-actions github-actions bot changed the title High Memory Usage SFReusableChunk SNOW-1786191: High Memory Usage SFReusableChunk Nov 4, 2024
@sfc-gh-dszmolka sfc-gh-dszmolka added the status-triage_done Initial triage done, will be further handled by the driver team label Nov 5, 2024
@sfc-gh-dszmolka sfc-gh-dszmolka changed the title SNOW-1786191: High Memory Usage SFReusableChunk SNOW-1786191: SNOW-1747415 High Memory Usage SFReusableChunk Nov 5, 2024
@sfc-gh-dszmolka
Contributor

hi - this issue looks awfully similar to the ones we have in the

because the issue is coming from the backend; currently the chunk sizes are not really configurable, and if you're retrieving a huge amount of data, it will be memory intensive (or crash with OOM, depending on how limited the memory is)

You can give setting CLIENT_RESULT_CHUNK_SIZE=16 a shot ([reference](https://docs.snowflake.com/en/sql-reference/parameters#client-result-chunk-size)), but I can imagine it won't change the situation very much.
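For anyone who wants to try this, here is a minimal sketch of setting the parameter at session level. It assumes the parameter can be changed with ALTER SESSION on the same connection that later runs the query, and that the value is in MB as described in the linked parameter reference; whether the .NET driver honors it is exactly what is in question here.

```sql
-- Assumption: run on the same session/connection that will execute the large SELECT.
-- 16 is the value suggested above.
ALTER SESSION SET CLIENT_RESULT_CHUNK_SIZE = 16;

-- Check that the session picked up the new value.
SHOW PARAMETERS LIKE 'CLIENT_RESULT_CHUNK_SIZE' IN SESSION;
```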

As a workaround while the backend issue is sorted out, you can add a LIMIT ... OFFSET ... clause to your SELECT to 'paginate' through the huge result set; this should work if you need to keep memory usage under a certain level.
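A rough sketch of what that pagination could look like for the query in this issue. The page size of 1,000,000 rows is an arbitrary assumption, and the ORDER BY columns are assumed from the standard TPC-DS inventory schema; some deterministic ordering is needed, otherwise consecutive pages may overlap or skip rows.

```sql
-- Page 1: first 1,000,000 rows (page size is an illustrative assumption).
SELECT *
FROM snowflake_sample_data.TPCDS_SF100TCL.inventory
ORDER BY inv_date_sk, inv_item_sk, inv_warehouse_sk  -- assumed key columns
LIMIT 1000000 OFFSET 0;

-- Page 2: same query, with OFFSET advanced by the page size.
SELECT *
FROM snowflake_sample_data.TPCDS_SF100TCL.inventory
ORDER BY inv_date_sk, inv_item_sk, inv_warehouse_sk
LIMIT 1000000 OFFSET 1000000;
```

The client keeps increasing OFFSET by the page size until a page comes back with fewer rows than the LIMIT. Note that sorting a table of this size on every page is likely to be expensive, so this trades query time for a lower memory ceiling.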

@sfc-gh-dszmolka sfc-gh-dszmolka added enhancement The issue is a request for improvement or a new feature and removed bug labels Nov 5, 2024
@sfc-gh-dszmolka
Contributor

hi @hadighattas, there's something you can try that may help. In driver version 4.2.0 we improved memory usage in SFReusableChunk. Can you please upgrade to 4.2.0 or higher and see if the issue goes away? Thank you in advance for the feedback!

@hadighattas
Author

Hello @sfc-gh-dszmolka
Thanks for your response!
I tried both setting CLIENT_RESULT_CHUNK_SIZE and upgrading to 4.2.0; in both cases, the memory usage for my test is the same as with 4.1.0 😕

@sfc-gh-dszmolka
Contributor

Thank you for testing and for the feedback! This was necessary to confirm the cause and rule out other reasons besides what I originally suspected (see my first response). We'll need time to address this on the backend.

@sfc-gh-dszmolka sfc-gh-dszmolka added status-blocked Progress cannot be made to this issue due to an outside blocking factor. backend changes needed Change must be implemented on the Snowflake service, and not in the client driver. labels Dec 18, 2024
3 participants