[Bug] Repetitive connection failure causing memory leak #783

Open
spasypaddy opened this issue Nov 30, 2024 · 24 comments
Labels
bug Something isn't working

Comments

@spasypaddy

Home Assistant Version

2024.11.3

Bambu Lab Integration Version

v2.0.38

Describe the bug

For some reason, this week the integration has started causing a memory leak on my Raspberry Pi 4; I can't explain why. When it's disabled, my memory usage stays in the 50% range. With it enabled, it climbs constantly until the Pi crashes and Home Assistant goes down.

To Reproduce

Borrow my Home Assistant setup on a Raspberry Pi? No idea...

Expected Behaviour

No memory leak

What device are you using?

P1S

Diagnostic Output

{
  "home_assistant": {
    "installation_type": "Home Assistant OS",
    "version": "2024.11.3",
    "dev": false,
    "hassio": true,
    "virtualenv": false,
    "python_version": "3.12.4",
    "docker": true,
    "arch": "aarch64",
    "timezone": "Europe/London",
    "os_name": "Linux",
    "os_version": "6.6.31-haos-raspi",
    "supervisor": "2024.11.4",
    "host_os": "Home Assistant OS 13.2",
    "docker_version": "27.2.0",
    "chassis": "embedded",
    "run_as_root": true
  },
  "custom_components": {
    "hacs": {
      "documentation": "https://hacs.xyz/docs/configuration/start",
      "version": "2.0.1",
      "requirements": [
        "aiogithubapi>=22.10.1"
      ]
    },
    "hass_agent": {
      "documentation": "https://github.com/LAB02-Research/HASS.Agent-Integration",
      "version": "2022.11.9",
      "requirements": []
    },
    "localtuya": {
      "documentation": "https://github.com/rospogrigio/localtuya/",
      "version": "5.2.1",
      "requirements": []
    },
    "sonoff": {
      "documentation": "https://github.com/AlexxIT/SonoffLAN",
      "version": "3.8.1",
      "requirements": [
        "pycryptodome>=3.6.6"
      ]
    },
    "dreo": {
      "documentation": "https://github.com/jeffsteinbok/hass-dreo/blob/master/README.md",
      "version": "1.0.0",
      "requirements": [
        "websockets"
      ]
    },
    "watchman": {
      "documentation": "https://github.com/dummylabs/thewatchman",
      "version": "0.6.5",
      "requirements": [
        "prettytable==3.12.0"
      ]
    },
    "bermuda": {
      "documentation": "https://github.com/agittins/bermuda",
      "version": "0.7.2",
      "requirements": []
    },
    "format_ble_tracker": {
      "documentation": "https://github.com/formatBCE/Format-BLE-Tracker/blob/main/README.md",
      "version": "0.0.8",
      "requirements": []
    },
    "eufy_security": {
      "documentation": "https://github.com/fuatakgun/eufy_security",
      "version": "8.1.0",
      "requirements": [
        "websocket-client==1.8.0",
        "aiortsp==1.4.0"
      ]
    },
    "fontawesome": {
      "documentation": "https://github.com/thomasloven/hass-fontawesome",
      "version": "2.2.3",
      "requirements": []
    },
    "garmin_connect": {
      "documentation": "https://github.com/cyberjunky/home-assistant-garmin_connect",
      "version": "0.2.22",
      "requirements": [
        "garminconnect>=0.2.23",
        "tzlocal"
      ]
    },
    "mass": {
      "documentation": "https://music-assistant.io",
      "version": "2024.11.4",
      "requirements": [
        "music-assistant-client==1.0.6"
      ]
    },
    "nest_protect": {
      "documentation": "https://github.com/imicknl/ha-nest-protect",
      "version": "0.4.0b7",
      "requirements": []
    },
    "scheduler": {
      "documentation": "https://github.com/nielsfaber/scheduler-component",
      "version": "v0.0.0",
      "requirements": []
    },
    "webrtc": {
      "documentation": "https://github.com/AlexxIT/WebRTC",
      "version": "v3.6.0",
      "requirements": []
    },
    "meross_lan": {
      "documentation": "https://github.com/krahabb/meross_lan",
      "version": "5.4.0",
      "requirements": []
    },
    "bambu_lab": {
      "documentation": "https://github.com/greghesp/ha-bambulab",
      "version": "2.0.38",
      "requirements": [
        "cloudscraper"
      ]
    },
    "spotcast": {
      "documentation": "https://github.com/fondberg/spotcast",
      "version": "v4.0.0",
      "requirements": [
        "spotipy==2.23.0"
      ]
    }
  },
  "integration_manifest": {
    "domain": "bambu_lab",
    "name": "Bambu Lab",
    "codeowners": [
      "greghesp",
      "AdrianGarside"
    ],
    "config_flow": true,
    "dependencies": [
      "device_automation",
      "ffmpeg",
      "mqtt"
    ],
    "documentation": "https://github.com/greghesp/ha-bambulab",
    "iot_class": "local_push",
    "issue_tracker": "https://github.com/greghesp/ha-bambulab/issues",
    "requirements": [
      "cloudscraper"
    ],
    "ssdp": [
      {
        "st": "urn:bambulab-com:device:3dprinter:1"
      }
    ],
    "version": "2.0.38",
    "is_built_in": false,
    "overwrites_built_in": false
  },
  "setup_times": {},
  "data": {
    "config_entry": {
      "created_at": "1970-01-01T00:00:00+00:00",
      "data": {
        "device_type": "P1S",
        "serial": "**REDACTED**"
      },
      "discovery_keys": {},
      "disabled_by": null,
      "domain": "bambu_lab",
      "entry_id": "6d07f055b04f83f55648de2c84fb8cde",
      "minor_version": 1,
      "modified_at": "2024-11-18T12:37:13.429688+00:00",
      "options": {
        "access_code": "",
        "auth_token": "**REDACTED**",
        "email": "",
        "host": "",
        "local_mqtt": false,
        "name": "",
        "region": "",
        "usage_hours": 121.82333333333332,
        "username": "**REDACTED**"
      },
      "pref_disable_new_entities": false,
      "pref_disable_polling": false,
      "source": "user",
      "title": "**REDACTED**",
      "unique_id": null,
      "version": 2
    },
    "push_all": null,
    "get_version": null
  }
}

Log Extracts

No response

Other Information

No response

spasypaddy added the "bug" (Something isn't working) label Nov 30, 2024
@AdrianGarside
Collaborator

AdrianGarside commented Dec 1, 2024

Does this occur with older versions? I'm not (yet) seeing any memory increase in my test HA instance running inside Docker, although I've only been monitoring it for a short while. Does turning on debug logging shed any light on what might be causing the leak?

@spasypaddy
Author

spasypaddy commented Dec 1, 2024 via email

@AdrianGarside
Collaborator

Did you recently update the HA version or the integration version? Or is the sudden instability out of the blue without updating either?

@jbeardon

jbeardon commented Dec 10, 2024

@AdrianGarside
I'm afraid I came here to report the same thing. It's taken a week to diagnose what was causing out-of-memory crashes of my HA installation on Proxmox. I have 6 Bambu printers. With the BambuLab integration enabled, memory use is drastically higher than without. Usage increases and will eventually cause HASSOS to crash. As I say, it's taken over a week to diagnose this, as there is nothing in the logs. I had updated a few HACS components at the same time and disabled all of them when this issue was first discovered.
After enabling each of the disabled integrations one-by-one over the course of the week and monitoring for between 12 and 24 hours before moving onto the next, the memory issues did not return until the Bambu integrations were re-enabled.

The attached system monitor log shows quite clearly when the disabled printers were re-enabled this evening. Memory use has jumped significantly and is gently creeping up (it will not stop).
I would estimate that HASSOS would crash with out-of-memory errors in the console within 6 hours. I have disabled the integrations again and restarted HA; memory use dropped to below 2 GB and will remain stable.

image

@greghesp
Owner

What are you running that on?

@jbeardon

Dell 3040 Micro, i5 6500T, 16 GB RAM. Running Proxmox, with HA in a VM.
I've increased the VM RAM to 6 GB since this became an issue. It still runs away and ends in an OOM crash if I enable the Bambu integrations.

@AdrianGarside
Collaborator

AdrianGarside commented Dec 14, 2024

@jbeardon can you post the integration diagnostics and enable debug logging to see if anything is listed there that could shed light on this? Does the memory increase correlate with activity (such as printing)? Does it stop if you disable the camera?
image
So far, I'm not able to observe any increase in memory usage over time with my setup.

@jbeardon

Will do - I will however need to kill the integration before I go to bed. HA crashing overnight will not be acceptable to she who must be obeyed!

@jbeardon

Diagnostics from one of the configs (let me know if you want all of them).
config_entry-bambu_lab-41624040c4490a8674cd7de037924024.json

@jbeardon

This may have a bearing!
image
Associated log attached.
home-assistant_2024-12-14T18-11-05.776Z.log

@jbeardon

RAM usage has increased by over 1 GB in 15 minutes. That's with the cameras disabled.
image

@jbeardon

OK, so digging further. I've got 3 P1Ps, 2 A1s and an A1 Mini.
All of the P1P sensors were showing as unknown/unavailable. I've clicked through the configure process on each of those existing entries and they have come back to life. Memory usage did not drop (it had increased by another 400 MB since my last post), so I am restarting. I will continue to monitor, but I am assuming the auth errors in the logs were the underlying cause.
I will provide a further update once I've had time to monitor after reconfiguring the integration entries.

@AdrianGarside
Collaborator

Yes. One of your printers (a P1P) no longer had working credentials, so it was repeatedly trying to connect and failing. Any memory leak in that path would therefore go from noise to significant. You also got rate limited by Cloudflare when accessing the Bambu Cloud APIs because of that repeated activity. I'll need to think about how to handle that state better.
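
For illustration only, one common way to keep a failing connection path from running hot is capped exponential backoff that stops outright when the credentials are rejected. The sketch below is not the integration's code: `AuthError`, `next_delay`, `connect_with_backoff`, and the delay values are all hypothetical names and numbers chosen for the example.

```python
import random
import time


class AuthError(Exception):
    """Hypothetical: raised when the server rejects the credentials."""


def next_delay(attempt, base=5.0, cap=600.0):
    """Capped exponential delay in seconds for a 0-based attempt number."""
    return min(cap, base * (2 ** attempt))


def connect_with_backoff(connect, max_attempts=6, sleep=time.sleep):
    """Call connect() until it succeeds, backing off between failures.

    Returns "connected", "needs_reauth" (credentials rejected, so retrying
    is pointless), or "gave_up" (transient failures used up all attempts).
    """
    for attempt in range(max_attempts):
        try:
            connect()
            return "connected"
        except AuthError:
            # Bad credentials will not fix themselves: stop and surface it.
            return "needs_reauth"
        except ConnectionError:
            if attempt < max_attempts - 1:
                # Up to 10% jitter so several printers do not retry in lockstep.
                delay = next_delay(attempt)
                sleep(delay + random.uniform(0, delay * 0.1))
    return "gave_up"
```

With these assumed defaults the waits grow as 5 s, 10 s, 20 s and so on up to 10 minutes, and a rejected login ends the loop immediately instead of hitting the cloud API again.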

@AdrianGarside
Collaborator

Going through configure would automatically take working credentials from one of the other integration instances and fix up the broken one.

@jbeardon

It's curious that this occurred, because the credentials have not changed. It did seem to coincide with the update to 2.0.38, but that may just be a coincidence.

After the restart, memory use has dropped back down to below 2 GB and is not spiking, so all appears well again.

Thanks for poking me in the right direction to get it fixed 👌

@AdrianGarside
Collaborator

AdrianGarside commented Dec 16, 2024

@spasypaddy I expect you're hitting the same authentication problem, as you're configured for a Bambu Cloud MQTT connection and the diagnostics output shows no data successfully retrieved from your printer:
"push_all": null,
"get_version": null

A debug log would confirm this.
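
As a rough aid for anyone checking their own diagnostics file, the same two fields can be inspected with a few lines of Python. This is a sketch that assumes the JSON layout shown in the diagnostic output at the top of this issue (`data.push_all`, `data.get_version`, `data.config_entry.options.local_mqtt`); `diagnose` is a hypothetical helper, not part of the integration, and reading `local_mqtt: false` as "cloud MQTT" is an interpretation of the comment above.

```python
import json


def diagnose(diagnostics):
    """Return a list of findings for one parsed diagnostics dict."""
    data = diagnostics.get("data", {})
    options = data.get("config_entry", {}).get("options", {})
    findings = []
    if not options.get("local_mqtt", False):
        findings.append("cloud MQTT connection (local_mqtt is false)")
    for key in ("push_all", "get_version"):
        # null in the JSON becomes None once parsed.
        if data.get(key) is None:
            findings.append(f"{key} is null: no data received from the printer")
    return findings


# Trimmed-down sample with the same shape as the diagnostics above.
sample = json.loads("""
{"data": {"config_entry": {"options": {"local_mqtt": false}},
          "push_all": null, "get_version": null}}
""")
for finding in diagnose(sample):
    print(finding)
```

To check a real download, replace the sample with `json.load(open(path))` for the diagnostics file. An empty result only means these two fields are populated, not that the connection is healthy.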

@AdrianGarside
Collaborator

This may have a bearing! image Associated log attached. home-assistant_2024-12-14T18-11-05.776Z.log

That second "connection failed" error should have been flagged in the Home Assistant UI, which would have helped us understand this even without debug logging enabled. Did HA not flag the error from the integration?

@jbeardon

@AdrianGarside, unfortunately I can't be sure, but I cannot remember seeing anything related to the Bambu integration in the logs when this problem arose. As you say, it would have also given me a reason to delve further myself.
On the positive side, all has been well since I re-authenticated.

@tofuSCHNITZEL

I have the same issue on an RPi 4: Home Assistant keeps crashing and rebooting completely every 1-1.5 hours. No more crashing since disabling this integration.

@jbeardon

Have you tried re-authenticating? That was the fix for my issue.

@tofuSCHNITZEL

tofuSCHNITZEL commented Dec 21, 2024

I removed my printer and re-added it (doing a reauth in the process). Will see if it crashes again.
Edit: yes, looks good so far...

AdrianGarside changed the title from "[Bug] Causing memory leak" to "[Bug] Repetitive connection failure causing memory leak" on Dec 31, 2024
@tofuSCHNITZEL

Still happens, but less frequently... This is a screenshot of the memory usage sensor:
image

Since disabling the printer entity, the memory usage has not grown like before.

@AdrianGarside
Collaborator

So HA is crashing every 1.5-2 days still with the integration enabled? And the integration is otherwise behaving normally? I.e. All sensors working correctly?

@AdrianGarside
Collaborator

Is there anything interesting in the debug logs while this is occurring?

5 participants