IBX-5385: Add option content-type to reindex command #370

Closed

@papcio122 papcio122 commented Apr 24, 2023

| Question | Answer |
| --- | --- |
| JIRA issue | IBX-5385 |
| Type | bug/improvement |
| Target Ibexa version | v3.3+ |
| BC breaks | no |

Checklist:

  • Provided PR description.
  • Tested the solution manually.
  • Provided automated test coverage.
  • Checked that target branch is set correctly (master for features, the oldest supported for bugs).
  • Ran PHP CS Fixer for new PHP code (use $ composer fix-cs).
  • Asked for a review (ping @ezsystems/engineering-team).

@papcio122 papcio122 requested review from a team April 24, 2023 09:47
@papcio122 papcio122 marked this pull request as ready for review April 24, 2023 09:47
Changed from VALUE_OPTIONAL to VALUE_REQUIRED for content-type option

Co-authored-by: Paweł Niedzielski <[email protected]>
eZ/Publish/SPI/Search/Content/IndexerGateway.php (outdated)
@@ -40,6 +40,18 @@ public function getContentInSubtree(string $locationPath, int $iterationCount):
*/
public function countContentInSubtree(string $locationPath): int;

/**
* @throws \Doctrine\DBAL\Exception
Contributor

As far as the indexing is concerned, I'm not sure if we want to declare that this can throw this exception. The fact that we are using Doctrine is irrelevant for indexing (at least for the interface). @alongosz ?

Member

> As far as the indexing is concerned, I'm not sure if we want to declare that this can throw this exception. The fact that we are using Doctrine is irrelevant for indexing (at least for the interface). @alongosz ?

You're right. On SPI level we can throw SPI or API exceptions.
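
One way to keep a storage-specific exception out of a contract, as discussed above, is to translate it inside the implementation. The sketch below is only an illustration: `StorageDriverException`, `GatewayException`, `CountingGateway`, and `DriverBackedGateway` are hypothetical stand-ins, not classes from ezplatform-kernel or Doctrine.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-ins: a storage-specific exception and a contract-level one.
final class StorageDriverException extends Exception
{
}

final class GatewayException extends RuntimeException
{
}

interface CountingGateway
{
    /**
     * The contract only mentions its own exception type.
     *
     * @throws GatewayException
     */
    public function countContent(string $contentTypeIdentifier): int;
}

final class DriverBackedGateway implements CountingGateway
{
    /** @var callable */
    private $runQuery;

    // $runQuery is an assumed callable that executes the query and may throw StorageDriverException.
    public function __construct(callable $runQuery)
    {
        $this->runQuery = $runQuery;
    }

    public function countContent(string $contentTypeIdentifier): int
    {
        try {
            return (int)($this->runQuery)($contentTypeIdentifier);
        } catch (StorageDriverException $e) {
            // Translate the driver failure; keep the original as "previous" for debugging.
            throw new GatewayException('Could not count content items.', 0, $e);
        }
    }
}
```

Callers of the interface then depend only on `GatewayException`, while the original failure stays reachable through `getPrevious()`.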

'content-type',
null,
InputOption::VALUE_REQUIRED,
'Content type identifier to refresh (deleted/updated/added). Implies "no-purge", cannot be combined with "since", "subtree" or "content-ids"'
Contributor

Side note, outside of the scope of this PR:
We are not actually checking that conflicting options are passed.
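
A check for conflicting options could look roughly like the sketch below. It is not part of this PR; `assertExclusiveOptions` is a hypothetical helper, and it assumes options that were not passed arrive as `null` (or `false` for flag-style options).

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: rejects combinations of mutually exclusive options.
 *
 * @param array<string, mixed> $options option name => value (null or false when not passed)
 * @param string[] $exclusive names of options that cannot be combined
 *
 * @throws InvalidArgumentException when more than one exclusive option is set
 */
function assertExclusiveOptions(array $options, array $exclusive): void
{
    $passed = [];
    foreach ($exclusive as $name) {
        $value = $options[$name] ?? null;
        if ($value !== null && $value !== false) {
            $passed[] = $name;
        }
    }

    if (count($passed) > 1) {
        throw new InvalidArgumentException(sprintf(
            'Options "%s" cannot be combined.',
            implode('", "', $passed)
        ));
    }
}
```

In a Symfony Console command this could be called at the start of `execute()` with `$input->getOptions()` and the list `['since', 'subtree', 'content-ids', 'content-type']`, provided none of those options defines a non-null default.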

@adamwojs adamwojs changed the title IBX-5385 add option content-type to reindex command IBX-5385: Add option content-type to reindex command Apr 24, 2023
Relaxed constraints for getContentWithContentTypeIdentifier in IndexerGateway interface.

Co-authored-by: Paweł Niedzielski <[email protected]>
@sonarqubecloud

Kudos, SonarCloud Quality Gate passed!

Bugs: 0 (rating A)
Vulnerabilities: 0 (rating A)
Security Hotspots: 0 (rating A)
Code Smells: 5 (rating A)

Coverage: no information
Duplication: 0.0%

@papcio122 papcio122 requested a review from a team July 19, 2023 07:52
@@ -63,6 +63,22 @@ public function countContentInSubtree(string $locationPath): int
return (int)$query->execute()->fetchOne();
}

public function getContentWithContentTypeIdentifier(string $contentTypeIdentifier, int $iterationCount): Generator
Contributor

The issue reported is relevant. Narrowing the return type in an implementation is available only from PHP 7.4, and this version of ezplatform-kernel supports PHP 7.3, so you cannot use this feature here.

Suggested change
- public function getContentWithContentTypeIdentifier(string $contentTypeIdentifier, int $iterationCount): Generator
+ public function getContentWithContentTypeIdentifier(string $contentTypeIdentifier, int $iterationCount): iterable
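
To illustrate the suggestion: a method that uses `yield` may still declare `iterable` as its return type, so the implementation can match the interface exactly. The sketch below uses hypothetical, trimmed-down stand-ins (`ExampleGateway`, `InMemoryGateway`, `getContentIds`) rather than the real gateway classes.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for the gateway contract.
interface ExampleGateway
{
    public function getContentIds(int $batchSize): iterable;
}

final class InMemoryGateway implements ExampleGateway
{
    /** @var int[] */
    private $ids;

    /**
     * @param int[] $ids
     */
    public function __construct(array $ids)
    {
        $this->ids = $ids;
    }

    // Declares the same return type as the interface, so no narrowing is involved.
    // Declaring ": Generator" here is the narrowing the comment above says needs PHP 7.4+.
    public function getContentIds(int $batchSize): iterable
    {
        foreach (array_chunk($this->ids, $batchSize) as $batch) {
            yield $batch;
        }
    }
}

// Usage: callers only rely on the value being iterable.
foreach ((new InMemoryGateway([1, 2, 3, 4, 5]))->getContentIds(2) as $batch) {
    echo implode(',', $batch), PHP_EOL;
}
```

The method still returns a generator at runtime; only the declared type is wider, which keeps the signature identical to the contract.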

Comment on lines +48 to +53
public function getContentWithContentTypeIdentifier(string $contentTypeIdentifier, int $iterationCount): iterable;

/**
* @throws \Doctrine\DBAL\Exception
*/
public function countContentWithContentTypeIdentifier(string $contentTypeIdentifier): int;
Contributor

Are we actually ok with introducing new method into this class @alongosz? It is a contract.

Member

> Are we actually ok with introducing new method into this class @alongosz? It is a contract.

No, this is strict SPI; any addition to the contracts there (except for SPI\Persistence) is a hard BC break.

OT: Though, TBH I'm not sure what idea I had in mind when placing this in SPI. Maybe I thought it's gonna be implemented by each search engine, but AFAICS it should just fetch data related to indexing directly from Repository. Maybe the purpose was rather Legacy Storage strategy...

Member

@alongosz alongosz left a comment

This improvement is not directly related to a bug fix; it is rather a new feature that makes it easier to reindex data after a bug fix, right? Moreover, AFAICS the related CS tickets are for 4.x. We fix bugs and performance issues on the oldest supported version, but we don't add features to it.

As a compromise, I would accept this solution for 4.5 instead of 4.6.x-dev, if @webhdx agrees.

Review remarks:

@@ -128,23 +128,28 @@ protected function configure()
'since',
null,
InputOption::VALUE_OPTIONAL,
- 'Refresh changes since a time provided in any format understood by DateTime. Implies "no-purge", cannot be combined with "content-ids" or "subtree"'
+ 'Refresh changes since a time provided in any format understood by DateTime. Implies "no-purge", cannot be combined with "content-ids", "subtree" or "content-type"'
Member

Punctuation: please use the Oxford comma here and in the other related places.

Suggested change
- 'Refresh changes since a time provided in any format understood by DateTime. Implies "no-purge", cannot be combined with "content-ids", "subtree" or "content-type"'
+ 'Refresh changes since a time provided in any format understood by DateTime. Implies "no-purge", cannot be combined with "content-ids", "subtree", or "content-type"'

Comment on lines +265 to +267
$count = $this->gateway->countContentWithContentTypeIdentifier($contentType);
$generator = $this->gateway->getContentWithContentTypeIdentifier($contentType, $iterationCount);
$purge = false;
Member

Given that you can't add anything to that Gateway without breaking the BC promise (see previous comment), I think a better solution would be to either use Repository Filtering (\eZ\Publish\API\Repository\ContentService::find) or add a dedicated method to \eZ\Publish\SPI\Persistence\Content\Handler. The latter option would be more performant and would require implementing a cache layer (which is good).

@papcio122
Contributor Author

Closing this PR, moving changes to ibexa/core#259 in order to merge this to 4.5

@papcio122 papcio122 closed this Sep 8, 2023

6 participants