Utility cleanup for keeping artifacts based on @version #38

Open
rpocase opened this issue Dec 10, 2018 · 2 comments

Comments

@rpocase
Contributor

rpocase commented Dec 10, 2018

A predominant use case for us internally is generic repositories that hold multiple types of artifacts within a given folder, e.g.:

repo
├── module1
│   └── version
│       ├── module1-version.pdf
│       └── module1-version.tgz
└── module2
    └── version
        ├── module2-version.pdf
        └── module2-version.tgz

Count-based retention works well here, except when some artifacts should be kept regardless of age (e.g., artifacts containing metadata that can't be expressed well in properties).

Using the versions API, we could be more intelligent about keeping the latest set of versioned artifacts from particular repositories (or particular folders under a given repository).

E.g.
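A rough sketch of the idea, independent of lavatory and of any Artifactory API: it assumes version folders are dotted integers (`1.10.0`) and that each item is a dict with AQL-style `path` (`<module>/<version>`) and `name` keys. `version_key` and `purgeable_by_version` are hypothetical names used only for illustration.

```python
from collections import defaultdict


def version_key(version):
    """Turn '1.10.0' into (1, 10, 0) so versions sort numerically."""
    return tuple(int(part) for part in version.split("."))


def purgeable_by_version(items, keep=1):
    """Return items whose version is not among the newest `keep` of its module.

    Assumes every item has a 'path' of the form '<module>/<version>'.
    """
    # Collect the distinct versions seen for each module.
    versions = defaultdict(set)
    for item in items:
        module, version = item["path"].split("/", 1)
        versions[module].add(version)

    # Remember the (module, version) pairs that must survive.
    kept = set()
    for module, found in versions.items():
        newest = sorted(found, key=version_key, reverse=True)[:keep]
        kept.update((module, version) for version in newest)

    # Everything outside the kept pairs is a purge candidate,
    # regardless of how many file types each version contains.
    return [
        item for item in items
        if tuple(item["path"].split("/", 1)) not in kept
    ]
```

Because retention is decided per version rather than per file, the `.pdf` and `.tgz` of a kept version always survive together.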

My workaround for the time being is shown below. It combines time_based_retention and count_based_retention and accepts extra_aql for additional filtering. It still requires structural changes to how I post files that should be kept (e.g., they go in a separate tree that gets filtered out by AQL).

import datetime

from lavatory.utils.artifactory import Artifactory


def time_count_based_retention(artifactory: Artifactory, retention_count=1, keep_days=15,
                               project_depth=1,
                               artifact_depth=2,
                               extra_aql=None):
    """
    Discard any folders older than keep_days while maintaining at least retention_count folders

    :param artifactory: Artifactory instance provided to policy purgelist
    :param retention_count: Number of versions to keep
    :param keep_days: keep_days as defined by artifactory.time_based_retention
    :param project_depth: project_depth as defined by artifactory.count_based_retention
    :param artifact_depth: artifact_depth as defined by artifactory.count_based_retention
    :param extra_aql: extra_aql as defined by lavatory utils
    :return: List of folders that appear in both purge lists
    """
    if not extra_aql:
        extra_aql = []
    versions = _get_retention_count(artifactory, extra_aql=extra_aql,
                                    retention_count=retention_count,
                                    project_depth=project_depth, artifact_depth=artifact_depth)
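    # Use UTC so the value agrees with the "Z" suffix in the timestamp format below
    now = datetime.datetime.now(datetime.timezone.utc)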
    before = now - datetime.timedelta(days=keep_days)
    created_before = before.strftime("%Y-%m-%dT%H:%M:%SZ")
    keep_days_versions = _get_retention_count(artifactory,
                                              extra_aql=extra_aql + [{'created': {"$lt": created_before}}],
                                              retention_count=retention_count,
                                              project_depth=project_depth,
                                              artifact_depth=artifact_depth)
    return [artifact for artifact in keep_days_versions if artifact in versions]


def _get_retention_count(artifactory, extra_aql=None, retention_count=1,
                         project_depth=1, artifact_depth=2):
    return artifactory.count_based_retention(retention_count=retention_count,
                                             project_depth=project_depth,
                                             artifact_depth=artifact_depth, item_type='folder',
                                             extra_aql=extra_aql)
@rpocase
Contributor Author

rpocase commented Dec 10, 2018

Worth noting that the approach above only works for structures where artifacts are stored in version folders. The simple-default layout versions artifacts directly, without the version folder layer. In that layout, count_based_retention only works if you produce a single type of artifact. You can certainly change your posting structure, but that typically has far-reaching ramifications.
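To illustrate what version-aware counting would need in the flat case, here is a minimal sketch that groups file names by the version embedded in the name, so that a `.pdf` and a `.tgz` of the same version count once. The `module-version.ext` naming pattern, the dotted-integer version format, and the names `NAME_PATTERN`, `group_by_version`, and `purgeable_names` are all assumptions for illustration, not part of lavatory.

```python
import re
from collections import defaultdict

# Assumed naming scheme: "<module>-<dotted integer version>.<ext>"
NAME_PATTERN = re.compile(
    r"^(?P<module>.+)-(?P<version>\d+(?:\.\d+)*)\.(?P<ext>.+)$"
)


def group_by_version(names):
    """Map (module, version tuple) to the file names sharing that version."""
    groups = defaultdict(list)
    for name in names:
        match = NAME_PATTERN.match(name)
        if match is None:
            continue  # names outside the scheme are never purge candidates
        version = tuple(int(part) for part in match["version"].split("."))
        groups[(match["module"], version)].append(name)
    return dict(groups)


def purgeable_names(names, keep=1):
    """Return names whose version is not among the newest `keep` of its module."""
    groups = group_by_version(names)
    by_module = defaultdict(list)
    for module, version in groups:
        by_module[module].append(version)

    purge = []
    for module, versions in by_module.items():
        # Skip the newest `keep` versions, purge every file of the rest.
        for version in sorted(versions, reverse=True)[keep:]:
            purge.extend(groups[(module, version)])
    return purge
```

Counting groups instead of files is the piece that count_based_retention cannot do on its own when several artifact types share one folder.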

@sijis
Contributor

sijis commented Jul 30, 2020

I realize this is an older issue. Replying in case this could help someone else.

It is possible to use native AQL to do what you want:

    def purgelist(artifactory):
        terms = [
            {"stat.downloaded": {"$before": "1mo"}},  # last downloaded more than a month ago
            {"@build.correlation_ids": {"$nmatch": "*"}},
            {"name": {"$match": "manifest.json"}},  # only files named manifest.json
            {"path": {"$nmatch": "*/latest"}},  # skip paths ending in /latest
        ]
        purgeable = artifactory.filter(terms=terms, depth=None, item_type="file")
        return purgeable
