FIX Unpublishing parent pages should include child pages #89

ssmarco · 2023-09-28T08:37:06Z

Issue #88

Uses the `SiteTree::enforce_strict_hierarchy` config when getting the dependent documents, so that child pages are included in indexing.

GuySartorelli · 2023-10-04T21:26:28Z

src/DataObject/DataObjectDocument.php

+        foreach ($page->AllChildren() as $record) {
+            $document = DataObjectDocument::create($record);
+            $docs[$document->getIdentifier()] = $document;
+            $docs = array_merge($docs, $document->getDependentDocuments());


Can calling getDependentDocuments() here ever result in an infinite loop?

If my theory is correct, the CMS will prevent this from happening, especially for SiteTree, which I used as a guide.

https://github.com/silverstripe/silverstripe-cms/blob/668744e728154818873bc24cb0b8d45f73728b95/code/Model/SiteTree.php#L1752
The delete method on each child page also runs `onBeforeDelete`, which amounts to the same thing: the delete function is called recursively.

My initial change only accounted for the immediate children (SiteTree) of a given page, but @michalkleiner made a valid point that the other relationships on each child page should be accounted for as well, hence these changes.

I am thinking of asking other vendors to stress test this implementation, since the issue came from them and they have more content.
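
The loop concern raised above can be illustrated with a minimal, framework-free sketch. It assumes a hypothetical `Node` class standing in for a page and its related records, and a hypothetical `collectDependents()` helper; neither is part of this module or of Silverstripe. A map keyed by identifier acts as both the result and the cycle guard, so the recursion terminates even when two records reference each other.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for a record with related records (not a Silverstripe class).
final class Node
{
    /** @var Node[] */
    public array $related = [];

    public function __construct(public string $id)
    {
    }
}

/**
 * Collects every record reachable from $node, keyed by identifier.
 * The $seen map doubles as the result and as the cycle guard.
 *
 * @param array<string, Node> $seen
 * @return array<string, Node>
 */
function collectDependents(Node $node, array $seen = []): array
{
    foreach ($node->related as $child) {
        if (isset($seen[$child->id])) {
            continue; // already collected: skip it so the recursion cannot loop forever
        }
        $seen[$child->id] = $child;
        $seen = collectDependents($child, $seen);
    }

    return $seen;
}

// A parent with one child, where the child points back at the parent (a cycle).
$parent = new Node('page-1');
$child = new Node('page-2');
$parent->related[] = $child;
$child->related[] = $parent;

$ids = array_keys(collectDependents($parent));
sort($ids);
echo implode(',', $ids), PHP_EOL;
```

Without the `isset()` check, the two records above would recurse into each other indefinitely. Whether the real `getDependentDocuments()` needs such a guard depends on which relationships it follows, which this sketch does not model.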

GuySartorelli · 2023-10-05T01:00:55Z

I don't have enough context into this module (let alone a way to test it) to feel comfortable approving it, I was only looking at it 'cause Max asked me to. But from what I can tell, it looks okay.

ssmarco · 2023-10-05T01:03:34Z

> I don't have enough context into this module (let alone a way to test it) to feel comfortable approving it, I was only looking at it 'cause Max asked me to. But from what I can tell, it looks okay.

If you are keen, we can have a catch-up to set up Elasticsearch on your local environment.

ssmarco · 2024-04-05T05:31:46Z

@blueo Ready for another look. Added a few more test cases, plus comments wherever I found something worth knowing for future reference.

GuySartorelli

Tested this locally - it works as expected. All descendants of the page being unpublished (and that page itself) are removed from the index.
No pages which should not be removed are removed.

I'll leave it to Bernie to merge (or to comment if he has any requested changes).

P.S. loving your DDEV addon, Marco. It makes testing this so easy.

blueo

Great changes - particularly the test coverage. I have a couple of questions around where the SiteTree logic should go and the way we're getting 'removed' objects.

blueo · 2024-04-10T01:07:48Z

src/Jobs/RemoveDataObjectJob.php

+
+                    // Taking into account that this queued job has a reference of existing child pages
+                    // We need to make sure that we are able to send these pages to ElasticSearch etc. for removal
+                    $oldRecord = $doc->getDataObject();


`__unserialize` does this a bit differently by getting the 'latest' version - do we want to do that to be consistent? I'm assuming this is about getting documents that have been removed (i.e. were not picked up by the previous condition because of stage=Live).

I have tested your `__unserialize` PR on top of this one, and your fix works as expected. As discussed, this job only checks the relationships between DataObjects, not their contents. Sending the live data is a seemingly module-wide issue that the other PR resolves, and this PR needs that fix too.

Good to know they work together. I was thinking though: should these two code blocks work the same way?

From this Job

    $oldRecord = $doc->getDataObject();
    if ($oldRecord->isArchived() || $oldRecord->isOnDraft()) {
        $document = DataObjectDocument::create($oldRecord);
        $carry[$document->getIdentifier()] = $document;
    }

From the `__unserialize` function

    if (!$dataObject && DataObject::has_extension($data['className'], Versioned::class) && $data['fallback']) {
        // get the latest version - usually this is an object that has been deleted
        $dataObject = Versioned::get_latest_version(
            $data['className'],
            $data['id']
        );
    }

E.g. yours checks `isArchived()` + `isOnDraft()` if there is no DataObject, whereas `__unserialize` makes a `Versioned::get_latest_version` call. Or does it not matter, and we get the same result in the end anyway?

As discussed, the `__unserialize` function will look after returning a draft/deleted version of the DataObject, so we don't need the `$oldRecord->isArchived() || $oldRecord->isOnDraft()` check. We can simply add the current DataObject, and the job will then look after either indexing it or removing it from the index.

blueo · 2024-04-10T01:09:32Z

tests/Jobs/RemoveDataObjectJobTest.php

@@ -16,7 +17,7 @@
 class RemoveDataObjectJobTest extends SearchServiceTest
 {

-    protected static $fixture_file = '../fixtures.yml'; // phpcs:ignore
+    protected static $fixture_file = '../fixtures.yml'; // @phpcs:ignore


AFAIK you shouldn't need the @ - ideally just ignore a specific error.

I was torn about suppressing a specific error, since doing so makes the line of code rather long and unpleasant to read. I just felt that this is not really an error for a Silverstripe developer, and that it should be an accepted convention rather than something dictated by third-party conventions that do not take Silverstripe into account.
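
For reference, the "ignore a specific error" form suggested above can be kept readable by putting the suppression on its own line, where it applies to the next line only. The sketch below is an illustration under assumptions: `ExampleJobTest` is a hypothetical class (the real test extends `SearchServiceTest`), and the sniff code shown is an assumed example, since the thread does not say which sniff the line trips. Running `phpcs -s` reports the actual code to substitute.

```php
<?php

// Hypothetical test class; the real one extends SearchServiceTest.
class ExampleJobTest
{
    // Suppress one named sniff for the following line only.
    // The sniff code is an assumed example, not taken from this PR.
    // phpcs:ignore Squiz.NamingConventions.ValidVariableName.MemberNotCamelCaps
    protected static $fixture_file = '../fixtures.yml';

    // Small accessor so the value can be inspected from outside the class.
    public static function fixtureFile(): string
    {
        return static::$fixture_file;
    }
}

echo ExampleJobTest::fixtureFile(), PHP_EOL;
```

This keeps the property declaration itself short while still limiting the suppression to a single rule.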

blueo · 2024-04-10T01:09:55Z

tests/Jobs/RemoveDataObjectJobTest.php

@@ -76,11 +97,25 @@ public function testJob(): void

        $resultTitles = [];

+        // This determines whether the document should be added or removed from the index


great additions

blueo

Great work @ssmarco. I think this is ready to go now.

michalkleiner reviewed Sep 28, 2023

src/DataObject/DataObjectDocument.php (outdated, resolved)

ssmarco changed the title ~~FIX Unpublising parent pages should include chid pages~~ FIX Unpublising parent pages should include child pages Sep 28, 2023

ssmarco force-pushed the feature/3-remove-children-from-index branch from 9c67c32 to 737945e Compare September 28, 2023 13:19

GuySartorelli reviewed Oct 4, 2023

src/Jobs/RemoveDataObjectJob.php (outdated, resolved)

FIX Unpublising parent pages should include child pages

188f6bd

Issue silverstripe#88. Uses `SiteTree::enforce_strict_hierarchy` config in getting the dependent documents for indexing child pages.

ssmarco force-pushed the feature/3-remove-children-from-index branch from 737945e to 188f6bd Compare October 5, 2023 00:48

ssmarco force-pushed the feature/3-remove-children-from-index branch from 45775e0 to 0f9e26c Compare April 5, 2024 00:48

MNT: Additional PHPUnit tests

4109616

ssmarco force-pushed the feature/3-remove-children-from-index branch from 0f9e26c to 4109616 Compare April 5, 2024 05:29

GuySartorelli approved these changes Apr 8, 2024

blueo requested changes Apr 10, 2024

FIX Update comments on src/DataObject/DataObjectDocument.php

de17f14

Co-authored-by: Bernard Hamlin

ssmarco force-pushed the feature/3-remove-children-from-index branch from 00384eb to de17f14 Compare April 10, 2024 04:32

Marco Hermo added 6 commits April 11, 2024 19:04

Extract to extension for SiteTree related documents

59a4a5e

Indexer: Making sure we get the Live version of the DataObject before indexing the document

4efcc96

Merge pull request #1 from ssmarco/feature/3-remove-children-extension

b38d94e

Extract to extension for SiteTree related documents

Fix linting suppression

de32cd3

Indexer additional for published data

a73313a

Index - Undo versioned check to combine with __unserialise fix

ddf1115

blueo approved these changes Apr 12, 2024

blueo merged commit 0bede9c into silverstripe:3 Apr 12, 2024
9 checks passed

blueo mentioned this pull request Apr 12, 2024

Unpublishing the parent page does not include the children #88

Closed

lukereative mentioned this pull request Jul 3, 2024

Dataobjects using SearchServiceExtension become unindexed upon upgrading #99

Closed

2 tasks
