DAOS-16818 common: Evict NE memory bucket when empty - MD_ON_SSD_p2 #15519

Open
wants to merge 4 commits into
base: master

sherintg
Collaborator

Pre-enabler changes to the umem_cache_map routine that let the allocator mark a non-evictable page as evictable and vice versa. The allocator will make use of these changes as follows:

  • When all memory blocks in a non-evictable page are marked unused, the allocator can mark the entire page as unused and notify the umem_cache to mark the page as evictable. No further allocations from this memory block will happen from now onwards.
  • The allocator will request the umem_cache to map the page again as non-evictable when the non-evictable region needs to be extended, or as an evictable page when a new evictable page has to be reserved.
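
The page life cycle described in the two points above can be sketched roughly as follows. This is only an illustration: `struct sketch_page`, `sketch_block_free` and `sketch_page_remap` are hypothetical names invented for this sketch and are not the umem_cache or DAV_V2 interfaces touched by the PR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-page bookkeeping; not the real umem_cache structures. */
struct sketch_page {
	uint32_t used_blocks; /* blocks currently allocated from this page */
	bool     evictable;   /* may the cache evict this page? */
	bool     mapped;      /* is the page available to the allocator? */
};

/* The allocator frees one block; once the page is empty it becomes evictable. */
static void
sketch_block_free(struct sketch_page *pg)
{
	assert(pg->mapped && pg->used_blocks > 0);
	pg->used_blocks--;
	if (pg->used_blocks == 0 && !pg->evictable) {
		pg->evictable = true;  /* the cache is now free to evict it */
		pg->mapped    = false; /* no further allocations from this page */
	}
}

/* The allocator asks for the empty page to be mapped again, as either kind. */
static void
sketch_page_remap(struct sketch_page *pg, bool evictable)
{
	assert(!pg->mapped && pg->used_blocks == 0);
	pg->evictable = evictable;
	pg->mapped    = true;
}
```

In this sketch a page that starts non-evictable with two used blocks stays non-evictable after the first free, flips to evictable after the second, and can then be remapped as either kind.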

Before requesting gatekeeper:

  • Two review approvals and any prior change requests have been resolved.
  • Testing is complete and all tests passed or there is a reason documented in the PR why it should be force landed and forced-landing tag is set.
  • Features: (or Test-tag*) commit pragma was used or there is a reason documented that there are no appropriate tags for this PR.
  • Commit messages follow the guidelines outlined here.
  • Any tests skipped by the ticket being addressed have been run and passed in the PR.

Gatekeeper:

  • You are the appropriate gatekeeper to be landing the patch.
  • The PR has 2 reviews by people familiar with the code, including appropriate owners.
  • Githooks were used. If not, request that user install them and check copyright dates.
  • Checkpatch issues are resolved. Pay particular attention to ones that will show up on future PRs.
  • All builds have passed. Check non-required builds for any new compiler warnings.
  • Sufficient testing is done. Check feature pragmas and test tags and that tests skipped for the ticket are run and now pass with the changes.
  • If applicable, the PR has addressed any potential version compatibility issues.
  • Check the target branch. If it is master branch, should the PR go to a feature branch? If it is a release branch, does it have merge approval in the JIRA ticket.
  • Extra checks if forced landing is requested
    • Review comments are sufficiently resolved, particularly by prior reviewers that requested changes.
    • No new NLT or valgrind warnings. Check the classic view.
    • Quick-build or Quick-functional is not used.
  • Fix the commit message upon landing. Check the standard here. Edit it to create a single commit. If necessary, ask submitter for a new summary.

Pre-enabler changes to the umem_cache_map routine that let the allocator
mark a non-evictable page as evictable and vice versa. The allocator
will make use of these changes as follows:
- When all memory blocks in a non-evictable page are marked unused,
  the allocator can mark the entire page as unused and notify the
  umem_cache to mark the page as evictable. No further allocations
  from this memory block will happen from now onwards.
- The allocator will request the umem_cache to map the page again as
  non-evictable when the non-evictable region needs to be extended,
  or as an evictable page when a new evictable page has to be reserved.

Signed-off-by: Sherin T George <[email protected]>

Ticket title is 'Evict NE memory bucket when empty - MD_ON_SSD_p2'
Status is 'Open'
Labels: 'hpe,md_on_ssd2'
https://daosio.atlassian.net/browse/DAOS-16818

NiuYawei
NiuYawei previously approved these changes Nov 20, 2024
src/common/mem.c Outdated
@@ -3231,6 +3232,10 @@ cache_evict_page(struct umem_cache *cache, bool for_sys)
/* The page is referenced by others while flushing */
if ((pinfo->pi_ref > 0) || is_page_dirty(pinfo) || pinfo->pi_io == 1)
return -DER_AGAIN;

/* The status of the page changed to non-evictable */
if (!is_id_evictable(cache, pinfo->pi_pg_id))
Contributor

It would be clearer to check "pinfo->pi_evictable" here.

Collaborator Author

fixed

Addressed review comments on mem.c

Signed-off-by: Sherin T George <[email protected]>
Enhanced the DAV_V2 allocator to support evicting empty
non-evictable memory buckets from the umem cache after a DAOS GC or
aggregation. This in turn enables more evictable memory buckets to
be cached.

Signed-off-by: Sherin T George <[email protected]>
@sherintg sherintg marked this pull request as ready for review November 25, 2024 12:35
@sherintg sherintg requested review from a team as code owners November 25, 2024 12:35
@daosbuild1
Collaborator

Test stage Unit Test bdev with memcheck on EL 8.8 completed with status UNSTABLE. https://build.hpdd.intel.com/job/daos-stack/job/daos//view/change-requests/job/PR-15519/2/testReport/


if (!heap->rt->empty_nemb_gcth) {
heap->rt->empty_nemb_gcth = HEAP_NEMB_EMPTY_THRESHOLD;
d_getenv_uint("DAOS_NEMB_EMPTY_RECYCLE_THRESHOLD", &heap->rt->empty_nemb_gcth);
Contributor

Is 0 a valid value for this env var? I think we should fall back to the default value if the env var isn't set (or is set to an invalid value), and the logic must ensure that d_getenv_uint() is called only once, on first initialization.

Common practice in DAOS is to call getenv only on initialization. I'd suggest moving it to pool open/create to avoid potential complications caused by calling getenv within a tx.
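
A minimal sketch of the pattern suggested here: read the variable once at initialization and fall back to a default when it is unset or invalid. It uses plain `getenv`/`strtoul` rather than DAOS's `d_getenv_uint()`, and the `sketch_*` names and the default of 16 are placeholders for illustration only. Treating 0 as invalid is an assumption made for the sketch; the question above leaves that open.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_DEFAULT_THRESHOLD 16u /* placeholder, not the DAOS default */

/* Parse a decimal threshold; return dflt when unset, malformed, 0 or too large. */
static uint32_t
sketch_parse_threshold(const char *val, uint32_t dflt)
{
	char         *end = NULL;
	unsigned long n;

	/* Reject NULL, empty strings, signs and leading whitespace up front. */
	if (val == NULL || !isdigit((unsigned char)val[0]))
		return dflt;
	errno = 0;
	n = strtoul(val, &end, 10);
	if (errno != 0 || *end != '\0' || n == 0 || n > UINT32_MAX)
		return dflt; /* assumption: 0 is treated as invalid */
	return (uint32_t)n;
}

static uint32_t sketch_threshold;
static bool     sketch_threshold_set;

/* Call once from an init path (e.g. pool open/create), never inside a tx. */
static void
sketch_threshold_init(void)
{
	if (sketch_threshold_set)
		return; /* the environment is consulted only on the first call */
	sketch_threshold = sketch_parse_threshold(
	    getenv("DAOS_NEMB_EMPTY_RECYCLE_THRESHOLD"), SKETCH_DEFAULT_THRESHOLD);
	sketch_threshold_set = true;
}

/* Hot paths read the cached value and never touch the environment. */
static uint32_t
sketch_threshold_get(void)
{
	assert(sketch_threshold_set);
	return sketch_threshold;
}
```

Keeping the parse step separate from the one-time init makes the fallback rules easy to unit test without modifying the process environment.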

rg.cr_off = GET_ZONE_OFFSET(zone_id);
rg.cr_size =
((heap->size - rg.cr_off) > ZONE_MAX_SIZE) ? ZONE_MAX_SIZE : heap->size - rg.cr_off;
rc = umem_cache_map(heap->layout_info.store, &rg, 1);
Contributor

Hmm, will this cause the page to be loaded if the zone is E? That would be a problem, since it could happen inside a tx.

D_ERROR("Force GC failed to free up enough nembs, cnt = %d",
heap->rt->empty_nemb_cnt);

return 0;
Contributor

Could you estimate the CPU time and transaction size in the extreme case (assume the meta blob size is 1TB and most of the zones are empty and need to be reclaimed)?

If it could generate too large a tx or hog the CPU for too long (> 5ms), I think we need to introduce a limit for each round of force recycle.

Addressed the review comment about checking the env variable during pool
open/create and fixed a valgrind-related failure.

Signed-off-by: Sherin T George <[email protected]>
@daosbuild1
Collaborator

Test stage Build DEB on Ubuntu 20.04 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-15519/3/execution/node/340/log

@daosbuild1
Collaborator

Test stage Build RPM on Leap 15.5 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-15519/3/execution/node/345/log

3 participants