Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use methodtools.lru_cache instead of functools.lru_cache in class methods #37757

Merged
merged 2 commits into from
Apr 3, 2024

Conversation

Taragolis
Copy link
Contributor

@Taragolis Taragolis commented Feb 27, 2024

This is attempt to prevent use functools.lru_cache or functools.cache on methods which lead to memory leak, e.g. object can't be garbage collected if it use methods decorated by this decorators.

Simple reproducible example

import functools

class A:
    def __init__(self):
        self.a = {"a": 1, "b": 2}

    @functools.lru_cache(maxsize=None)
    def get_val(self, key: str):
        return self.a[key]

for _ in range(100):
    xxx = A()
    xxx.get_val(key="a")
    xxx.get_val("a")
    xxx.get_val("b")
    del xxx

import gc
gc.collect()
a_instances = [o for o in gc.get_objects() if isinstance(o, A)]
assert len(a_instances) == 0, len(a_instances)
# AssertionError: 100

Instead of that better to use use functools.cached_property (if it could be a property), ruff B018 rule suggestion or use some third-party library, e.g. methodtools

import methodtools

class A:
    def __init__(self):
        self.a = {"a": 1, "b": 2}

    @methodtools.lru_cache(maxsize=None)
    def get_val(self, key: str):
        return self.a[key]

for _ in range(100):
    xxx = A()
    xxx.get_val(key="a")
    xxx.get_val(key="a")
    xxx.get_val("b")
    del xxx

import gc
gc.collect()
a_instances = [o for o in gc.get_objects() if isinstance(o, A)]
assert len(a_instances) == 0, len(a_instances) # Do not raise assertion error
  • ruff do not detect compat decorator airflow.compat.functools.cache for avoid use it in methods use simple pre-commit check. It is only required until we have Python 3.9 as min version, and do not use airflow.compat.functools.cache in code base.
  • There is couple of places where cache used into the Airflow core, some of them might be a result of memory leakage, the simple way is use methodtools.lru_cache - this one would not breaks public interfaces or behaviour and do no required to rewrite. In another word we have bug fix "Here and Now" instead of think how to rewrite without additional dependency.

^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

.pre-commit-config.yaml Outdated Show resolved Hide resolved
@Taragolis
Copy link
Contributor Author

@potiuk I've also found that ruff do not detect compat decorator airflow.compat.functools.cache and we use it in some methods places into the core, which might cause memory leakage

@cache
def get_parse_time_mapped_ti_count(self) -> int:
"""

@classmethod
@cache
def get_serialized_fields(cls):
# Not using 'cls' here since we only want to serialize base fields.
return frozenset(attr.fields_dict(MappedOperator)) - {

@cache # Prevent multiple database access.
def _get_previous_dagrun_success() -> DagRun | None:
return task_instance.get_previous_dagrun(state=DagRunState.SUCCESS, session=session)

@cache
def get_yesterday_ds() -> str:
return (logical_date - timedelta(1)).strftime("%Y-%m-%d")

@cache
def get_tomorrow_ds() -> str:
return (logical_date + timedelta(1)).strftime("%Y-%m-%d")

@cache
def get_prev_execution_date():
# For manually triggered dagruns that aren't run on a schedule,

@cache
def get_prev_ds() -> str | None:
execution_date = get_prev_execution_date()

@cache
def get_parse_time_mapped_ti_count(self) -> int:
"""
Return the Number of instances a task in this group should be mapped to, when a DAG run is created.

@classmethod
@cache
def all_weight_rules(cls) -> set[str]:
"""Return all weight rules."""
return set(cls.__members__.values())

@cache
def get_task_group_children_getter() -> operator.methodcaller:
sort_order = conf.get("webserver", "grid_view_sorting_order", fallback="topological")

@Taragolis
Copy link
Contributor Author

WeightRule is enum, and members of enum are singletons, so it is pretty safe to use it there, but in the other cases it still looks like a problem

@Taragolis Taragolis requested a review from uranusjr as a code owner April 1, 2024 14:21
@Taragolis Taragolis added full tests needed We need to run full set of tests for this PR to merge all versions If set, the CI build will be forced to use all versions of Python/K8S/DBs and removed provider:google Google (including GCP) related issues area:providers labels Apr 1, 2024
@Taragolis Taragolis closed this Apr 1, 2024
@Taragolis Taragolis reopened this Apr 1, 2024
@Taragolis Taragolis added the upgrade to newer dependencies If set, upgrade to newer dependencies is forced label Apr 1, 2024
.pre-commit-config.yaml Outdated Show resolved Hide resolved
@Taragolis
Copy link
Contributor Author

The problem here with tests that Prod images build do not reflect changes into requirements in hatch_build.py and it doesn't install methodtools

@Taragolis Taragolis added the type:bug-fix Changelog: Bug Fixes label Apr 2, 2024
@potiuk
Copy link
Member

potiuk commented Apr 2, 2024

The problem here with tests that Prod images build do not reflect changes into requirements in hatch_build.py and it doesn't install methodtools

That's already fixed in #38678 -> I rebased your PR @Taragolis

@potiuk
Copy link
Member

potiuk commented Apr 2, 2024

(and even better - extraction of the common code added then is in #38682 - ready for review.

@Taragolis
Copy link
Contributor Author

That's already fixed in #38678 -> I rebased your PR @Taragolis

Hehe, I've also rebased it. DOUBLE REBASE 🤣

@Taragolis
Copy link
Contributor Author

But this failure also show that docker compose tests can't show useful information in case if API failed because it just fail before show logs from the container. I will add workaround for that later on in the separate PR

@potiuk
Copy link
Member

potiuk commented Apr 2, 2024

But this failure also show that docker compose tests can't show useful information in case if API failed because it just fail before show logs from the container. I will add workaround for that later on in the separate PR

Yep we could dump and upload logs as artifacts from all conainers as we do in case of failure in k8s/regular tests

@Taragolis Taragolis merged commit c65b083 into apache:main Apr 3, 2024
91 checks passed
@Taragolis Taragolis deleted the enable-B019 branch April 3, 2024 08:42
@Taragolis Taragolis added this to the Airflow 2.9.1 milestone Apr 4, 2024
jedcunningham pushed a commit that referenced this pull request Apr 26, 2024
… methods (#37757)

* Use `methodtools.lru_cache` instead of `functools.lru_cache` in class methods

* Rename pre-commit script

(cherry picked from commit c65b083)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
all versions If set, the CI build will be forced to use all versions of Python/K8S/DBs area:dev-tools full tests needed We need to run full set of tests for this PR to merge type:bug-fix Changelog: Bug Fixes upgrade to newer dependencies If set, upgrade to newer dependencies is forced
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants