Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IBX-5502: Added additional tag cleaning to limit down number of orphaned tag en… #477

Open
wants to merge 6 commits into
base: 4.6
Choose a base branch
from

Conversation

mateuszbieniek
Copy link
Contributor

@mateuszbieniek mateuszbieniek commented Feb 12, 2025

Rebase of ezsystems/ezplatform-kernel#372

Question Answer
JIRA issue IBX-5502
Type bug
Target Ibexa version v4.6
BC breaks no

Due to how the cache is implemented in Symfony-based projects using the Redis adapter (and probably - not only Redis), entries (Members) inside a Tag (Set) are removed only when the tag is purged. Due to that, we can encounter on multiple occasions a situation, when a Tag is purged, resulting in cache items (Keys) being purged as well, but only the Tag that triggered a purge is cleaned up. All other tags that contain entries pointing to cache items that no longer exist will remain there until the affected tag is purged - which often is "never".

This results in orphaned entries in tags (or even in whole orphaned tags) filling up the memory. Memory allocated to tags in Redis is nonevictable, therefore it is freed only when purged on purpose (maxmemory-policy setting won't affect orphaned members/sets). On very big and/or constantly changing content structures, this unevictable memory allocated to orphaned tags and entries can significantly lower the amount of memory available for standard cache items to be stored.

Additional Tag purging added in this PR reduces this effect, but - unfortunately - it is not possible to resolve it fully with current Symfony's cache adapters implementation, due to the fact that there is no available interface to remove an entry from a Tag, without purging it whole - and that would result in purging cache items that were nor related to the original action performed on content.

Additional tag purging introduced by PR:

  • ContentHandler::updateContent:
    Multiple content-related entries (ibx-c-<contentId>-<versionNo>-0, ibx-c-<contentId>-<versionNo>-<languageCode>, ibx-cvi-<contentId>, ibx-cl-<contentId>-pfd are orphaned in tags: l-<locationId>, lp-<locationId>, c-<contentId>-v-<versionNo>, c-<contentId>-v-<prevVersionNo>. Added purging of those tags to avoid it.

  • TrashHandler::trashSubtree:
    Multiple content-related entries are orphaned (ibx-c-<contentId>-v<versionNumber>, ibx-cl-<contentId>, ibx-cvi-<contentId>, ibx-l-<locationId>-<languageCode>, ibx-c-<contentId>-<versionNo>-0, ibx-c-<contentId>-<versionNo>-<languageCode>, ibx-cvi-<contentId> ... (many more) are orphaned in tags: l-<locationId>, c-<contentId>-v-<versionNo>.

Handling those two occurrences helps to limit greatly the number of orphaned members.

My tests show, that one of the biggest culprits here is lp-<locationIdOfParent> tag, which can have multiple orphaned entries - therefore it might be a good idea to purge it as well, to avoid blocked memory in cost of potentially small performance loss - up for discussion.

This PR won't resolve the issue fully but limits the scope of it using available tools. To resolve this fully, we will need (in my opinion):

  • an interface to remove entries from sets
  • a garbage collector, as entries in tags might become orphaned as well when a key is evicted by Redis when maxmemory limit is reached.
    But those two are out of this PR scope.

Checklist:

  • Provided PR description.
  • Tested the solution manually.
  • Checked that target branch is set correctly (master for features, the oldest supported for bugs).
  • Ran PHP CS Fixer for new PHP code (use $ composer fix-cs).
  • Asked for a review (ping @ezsystems/engineering-team).

@mateuszbieniek mateuszbieniek changed the title Added additional tag cleaning to limit down number of orphaned tag en… IBX-5502: Added additional tag cleaning to limit down number of orphaned tag en… Feb 12, 2025
Copy link
Member

@alongosz alongosz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The solution to the issue concerns me a bit. The idea behind tags was always to catch everything by using just one, so it feels like this needs to be resolved in a different place. However maybe I'm wrong or maybe there's no other way to solve it - POV ping @ibexa/php-dev.

@mateuszbieniek before moving with the remarks to the solution, see the one about getContentTags - in that case prior diff comments to that are not applicable.

@@ -335,12 +335,33 @@ public function updateContent($contentId, $versionNo, UpdateStruct $struct)
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While at it, given your changes affect method PHPDoc and signature, could you do full improvement on that?
The signature should now be as follows:

    /**
     * @throws \Ibexa\Contracts\Core\Repository\Exceptions\InvalidArgumentException
     * @throws \Psr\Cache\InvalidArgumentException
     */
    public function updateContent($contentId, $versionNo, UpdateStruct $struct): Content

Comment on lines +342 to +344
$locationPathTags = array_map(function (Content\Location $location): string {
return $this->cacheIdentifierGenerator->generateTag(self::LOCATION_PATH_IDENTIFIER, [$location->id]);
}, $locations);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given LOCATION_PATH_IDENTIFIER tag exists only in conjunction with LOCATION_IDENTIFIER tag, are you sure both are necessary?
Moreover, I don't see LOCATION_PATH_IDENTIFIER tags (lp-%s) in the list off affected by the issue.

@@ -335,12 +335,33 @@ public function updateContent($contentId, $versionNo, UpdateStruct $struct)
{
$this->logger->logCall(__METHOD__, ['content' => $contentId, 'version' => $versionNo, 'struct' => $struct]);
$content = $this->persistenceHandler->contentHandler()->updateContent($contentId, $versionNo, $struct);
$this->cache->invalidateTags([
$this->cacheIdentifierGenerator->generateTag(
$locations = $this->persistenceHandler->locationHandler()->loadLocationsByContent($contentId);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you checked the impact of this on performance? Every load inside caching system is a potential risk. I'd like to avoid the situation where we release some memory, but make longer update time as the cost.

[$contentId, $versionNo]
),
]);
[$contentId, $versionNo - 1]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not invalidate tags for an unrelated versions. The cache performance side effects of that might be difficult to predict.

Comment on lines +339 to +341
$locationTags = array_map(function (Content\Location $location): string {
return $this->cacheIdentifierGenerator->generateTag(self::LOCATION_IDENTIFIER, [$location->id]);
}, $locations);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not very readable to me. How about:

Suggested change
$locationTags = array_map(function (Content\Location $location): string {
return $this->cacheIdentifierGenerator->generateTag(self::LOCATION_IDENTIFIER, [$location->id]);
}, $locations);
$locationTags = array_map(
fn (Content\Location $location): string => $this->cacheIdentifierGenerator->generateTag(
self::LOCATION_IDENTIFIER,
[$location->id]
),
$locations
);

->willReturn($innerContentHandlerMock);

$innerContentHandlerMock
->expects($this->exactly($loadReverseRelationsCount))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
->expects($this->exactly($loadReverseRelationsCount))
->expects(self::exactly($loadReverseRelationsCount))

);

$this->persistenceHandlerMock
->expects($this->exactly($locationHandlerCount))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
->expects($this->exactly($locationHandlerCount))
->expects(self::exactly($locationHandlerCount))

->willReturn($innerLocationHandlerMock);

$innerLocationHandlerMock
->expects($this->exactly($loadLocationsByContentCount))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
->expects($this->exactly($loadLocationsByContentCount))
->expects(self::exactly($loadLocationsByContentCount))

@@ -149,18 +153,29 @@ public function testTrashSubtree()
->method($handlerMethodName)
->willReturn($innerHandler);

$contentHandlerMock
->expects($this->once())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
->expects($this->once())
->expects(self::once())

$innerHandler
->expects($this->once())
->method('trashSubtree')
->with($locationId)
->willReturn(null);

$this->cacheIdentifierGeneratorMock
->expects($this->exactly(2))
->expects($this->exactly(4))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
->expects($this->exactly(4))
->expects(self::exactly(4))

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Something isn't working Ready for review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants