Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HSEARCH-4487 + HSEARCH-4980 Two fixes for Jakarta Batch Mass Indexing job: MySQL with jdbcFetchSize=Integer.MIN_VALUE + embedded ids #3754

Merged
merged 8 commits into from
Oct 2, 2023

Conversation

yrodiere
Copy link
Member

@yrodiere yrodiere commented Sep 28, 2023

  • HSEARCH-4980: Jakarta Batch job fails for entities with embedded ids
  • HSEARCH-4487: Jakarta Batch Mass Indexing job fails with @IndexedEmbedded entity and MySQL

For HSEARCH-4980, it's just that tests were disabled so the implementation was garbage and didn't even work. I would know, I committed that nonsense. We might want to backport the fix to 6.2, though I'm not sure it's easy.

For HSEARCH-4487, the problem was that we were using a scroll on entities. When using MySQL with jdbcFetchSize=Integer.MIN_VALUE (AKA "I mean scrolling for real, not loading the whole database in memory"), this is fundamentally incompatible with lazy loading (because scrolling basically takes over the whole connection and one can't use it for lazy loading anymore).

So, I solved this by not scrolling entities anymore: we're doing things like in the MassIndexer, with one session (connection) scrolling on IDs, and the other loading entities in batches.
This obviously changes the performance profile quite a lot, so we should brace for performance regressions.
This also implies some API changes as well: in particular, I changed the "custom HQL" feature to be closer to what we have in the MassIndexer, and removed the (now nonsensical) sessionClearInterval setting.
But AFAIK this is the only way to solve the problem.

I won't bother backporting HSEARCH-4487 to 6.2, as backporting without breaking changes would require too much work and duplication.

@yrodiere yrodiere force-pushed the HSEARCH-4487 branch 2 times, most recently from 43f8365 to fe83edd Compare September 28, 2023 14:47
@yrodiere
Copy link
Member Author

Ok, I had to push two updates:

  1. One to use stateless sessions where relevant, mostly as an optimization
  2. One to fix some JQAssistant constraint violations

Copy link
Member

@marko-bekhta marko-bekhta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😃 👍🏻

@yrodiere yrodiere added the Waiting for CI Ready to merge once CI passes label Sep 28, 2023
@yrodiere yrodiere force-pushed the HSEARCH-4487 branch 5 times, most recently from c40f929 to 3c3613e Compare September 29, 2023 13:15
These tests shouldn't have been disabled in the first place.
…mples

The documentation example was contrived (weird data).

There is no need to use "e." everywhere, as conditions work fine without
that prefix.
… mass indexing job

No need for a full-blown session just to count entities or scroll on
identifiers.
@sonarcloud
Copy link

sonarcloud bot commented Oct 2, 2023

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 6 Code Smells

92.8% 92.8% Coverage
1.8% 1.8% Duplication

@yrodiere yrodiere merged commit da756ee into hibernate:main Oct 2, 2023
11 checks passed
@yrodiere yrodiere deleted the HSEARCH-4487 branch November 29, 2023 08:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Waiting for CI Ready to merge once CI passes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants