-
Notifications
You must be signed in to change notification settings - Fork 301
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DAOS-16818 common: Evict NE memory bucket when empty - MD_ON_SSD_p2 #15519
base: master
Are you sure you want to change the base?
Conversation
Pre-enabler changes to umem_cache_map routine to enable allocator to mark a non-evictable page as evictable and visa-versa. The allocator will make use of the above changes as follows: - When all memory blocks in a non-evictable page is marked unused, allocator can mark the entire page as unused and will notify the umem_cache to mark the page as evictable. No further allocations from this memory block will happen from now ownwards. - Allocator will request the umem_cache to map the page again as non-evictable when there is a need to extend non-evictable region or as evictable page when a new evictable page has to be reserved. Signed-off-by: Sherin T George <[email protected]>
Ticket title is 'Evict NE memory bucket when empty - MD_ON_SSD_p2' |
src/common/mem.c
Outdated
@@ -3231,6 +3232,10 @@ cache_evict_page(struct umem_cache *cache, bool for_sys) | |||
/* The page is referenced by others while flushing */ | |||
if ((pinfo->pi_ref > 0) || is_page_dirty(pinfo) || pinfo->pi_io == 1) | |||
return -DER_AGAIN; | |||
|
|||
/* The status of the page changed to non-evictable */ | |||
if (!is_id_evictable(cache, pinfo->pi_pg_id)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be more clear if we check "pinfo->pi_evictable" here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
Addressed review comments on mem.c Signed-off-by: Sherin T George <[email protected]>
Enhanced the DAV_V2 allocator to support evicting empty non-evictable memory buckets from umem cache post a DAOS GC or aggregation. This inturn enables more evictable memory buckets to be cached. Signed-off-by: Sherin T George <[email protected]>
Test stage Unit Test bdev with memcheck on EL 8.8 completed with status UNSTABLE. https://build.hpdd.intel.com/job/daos-stack/job/daos//view/change-requests/job/PR-15519/2/testReport/ |
src/common/dav_v2/heap.c
Outdated
|
||
if (!heap->rt->empty_nemb_gcth) { | ||
heap->rt->empty_nemb_gcth = HEAP_NEMB_EMPTY_THRESHOLD; | ||
d_getenv_uint("DAOS_NEMB_EMPTY_RECYCLE_THRESHOLD", &heap->rt->empty_nemb_gcth); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is 0 a valid value for this env var? I think we'd set it to default value if the env var isn't set (or set to an invalid value), and the logic must ensure the d_getenv_uint() is called only once on first initialization.
DAOS common practice is calling the getenv on initialization only, I'd suggest moving it to pool open/create to avoid potential complications caused by calling the getenv within a tx.
rg.cr_off = GET_ZONE_OFFSET(zone_id); | ||
rg.cr_size = | ||
((heap->size - rg.cr_off) > ZONE_MAX_SIZE) ? ZONE_MAX_SIZE : heap->size - rg.cr_off; | ||
rc = umem_cache_map(heap->layout_info.store, &rg, 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm, will it cause the page being loaded if the zone is E? That'll cause problem since it could be in a tx.
D_ERROR("Force GC failed to free up enough nembs, cnt = %d", | ||
heap->rt->empty_nemb_cnt); | ||
|
||
return 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you estimate the CPU time & transaction size in extreme case (assume meta blob size is 1TB, most of the zones are empty and to be reclaimed case).
If it could generate too large tx or hog CPU for too long (> 5ms), I think we need to introduce a limit for each round of force recycle.
Address review comment on checking the env variable during pool open/create and fixed valgrind related failure. Signed-off-by: Sherin T George <[email protected]>
Test stage Build DEB on Ubuntu 20.04 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-15519/3/execution/node/340/log |
Test stage Build RPM on Leap 15.5 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-15519/3/execution/node/345/log |
Pre-enabler changes to umem_cache_map routine to enable allocator to mark a non-evictable page as evictable and visa-versa. The allocator will make use of the above changes as follows:
Before requesting gatekeeper:
Features:
(orTest-tag*
) commit pragma was used or there is a reason documented that there are no appropriate tags for this PR.Gatekeeper: