-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove IndexSearcher#search(Query,Collector) in favor of IndexSearcher#search(Query,CollectorManager) [LUCENE-10002] #11041
Comments
Greg Miller (@gsmiller) (migrated from JIRA) +1. I also don't love that users can setup an |
Zach Chen (@zacharymorn) (migrated from JIRA) Hi @jpountz @gsmiller, I have created a PR for this to deprecate the collector API in favor of the collector manager API, as well as some initial refactoring to some tests and the parts in DrillSideways that use TopScoreDocCollector & TopFieldCollector to use the latter API. I plan to submit more PRs afterward for other areas in the codebase. Please note that I did first try to remove the collector API entirely, but that ended up resulting in way too many changes than I'm comfortable with in a single PR, and I also feel this API is such a commonly used one that users may prefer a more gradual deprecation / transition period. Hence I reverted my previous effort and adopted a phased approach. |
Greg Miller (@gsmiller) (migrated from JIRA) Nice @zacharymorn! Quite a large change for sure! I took a look at the DrillSideways changes and they appear correct to me at first glance, but I'll see if I can spend more time going through the whole PR in the next couple of days. In the meantime, I went ahead and spun off #11088 to track making a similar API change to DrillSideways. |
Zach Chen (@zacharymorn) (migrated from JIRA)
Sounds great, thanks Greg! |
Mark Robert Miller (@markrmiller) (migrated from JIRA) I did this some time ago. Read in Mike's blog you can just toss in an executor, tossed it, looked for my that was easy button. +1 on making it obvious that I literally had changed nothing. Unfortunately, would not have saved me from the complications around implementing the "map reduce" collectors, all hopefully efficiently, and working out when it makes sense to use them vs the std search method, given the overhead and the index size and number of segments needed to make the tradeoff worth it, or at least not a performance hit. Or working out how to deal with changing segment count makeup, what kind of merge control or shenanigans make sense when or how and when and why to manage the segment slices and how they are grouped. Etc. Etc. Went from, "drop in an executor? sweet" to ... |
ASF subversion and git services (migrated from JIRA) Commit 11006fb in lucene's branch refs/heads/main from Luca Cavanna LUCENE-10002: Replace simple usages of TotalHitCountCollector with IndexSearcher#count (#612) In case only number of documents are collected, IndexSearcher#search(Query, Collector) is commonly used, which does not use the executor that's been eventually set to the searcher. Calling Co-authored-by: Adrien Grand <[email protected]> |
ASF subversion and git services (migrated from JIRA) Commit 794f941 in lucene's branch refs/heads/branch_9x from Luca Cavanna LUCENE-10002: Replace simple usages of TotalHitCountCollector with IndexSearcher#count (#612) In case only number of documents are collected, IndexSearcher#search(Query, Collector) is commonly used, which does not use the executor that's been eventually set to the searcher. Calling Co-authored-by: Adrien Grand <[email protected]> |
ASF subversion and git services (migrated from JIRA) Commit ee7a8d6 in lucene's branch refs/heads/main from Luca Cavanna LUCENE-10002: Replace some IndexSearcher#search(Collector, Query) in tests (#639) Also use LuceneTestCase#newSearcher |
ASF subversion and git services (migrated from JIRA) Commit dba9b18 in lucene's branch refs/heads/branch_9x from Luca Cavanna LUCENE-10002: Replace some IndexSearcher#search(Collector, Query) in tests (#639) Also use LuceneTestCase#newSearcher |
ASF subversion and git services (migrated from JIRA) Commit 1b083ea in lucene's branch refs/heads/main from Luca Cavanna LUCENE-10002: Replace test usages of TopScoreDocCollector with a corresponding collector manager (#716) In the effort or replacing usages of IndexSearcher#search(Query, Collector) with IndexSearcher#search(Query, CollectorManager), this commit replaces many test usages of TopScoreDocCollector with its corresponding CollectorManager created by calling TopScoreDocCollector#createSharedManager. |
ASF subversion and git services (migrated from JIRA) Commit bfe7096 in lucene's branch refs/heads/branch_9x from Luca Cavanna LUCENE-10002: Replace test usages of TopScoreDocCollector with a corresponding collector manager (#716) In the effort or replacing usages of IndexSearcher#search(Query, Collector) with IndexSearcher#search(Query, CollectorManager), this commit replaces many test usages of TopScoreDocCollector with its corresponding CollectorManager created by calling TopScoreDocCollector#createSharedManager. |
ASF subversion and git services (migrated from JIRA) Commit bff4246 in lucene's branch refs/heads/main from Adrien Grand LUCENE-10002: Fix test failure. When IndexSearcher is created with a threadpool it becomes impossible to assert |
ASF subversion and git services (migrated from JIRA) Commit 2a6b2ca in lucene's branch refs/heads/branch_9x from Adrien Grand LUCENE-10002: Fix test failure. When IndexSearcher is created with a threadpool it becomes impossible to assert |
ASF subversion and git services (migrated from JIRA) Commit 66bbc95 in lucene's branch refs/heads/main from Luca Cavanna LUCENE-10002: Add FixedBitSetCollector and corresponding collector manager to test framework (#766) Some tests collect matching docs in a FixedBitSet. In the effort of moving such tests to using IndexSearcher#search(Query, CollectorManager) as part of LUCENE-10002, this commit adds a new FixedBitSetCollector class that exposes this functionality as well as a createManager method that returns a corresponding CollectorManager. |
ASF subversion and git services (migrated from JIRA) Commit 3bd6e32 in lucene's branch refs/heads/branch_9x from Luca Cavanna LUCENE-10002: Add FixedBitSetCollector and corresponding collector manager to test framework (#766) Some tests collect matching docs in a FixedBitSet. In the effort of moving such tests to using IndexSearcher#search(Query, CollectorManager) as part of LUCENE-10002, this commit adds a new FixedBitSetCollector class that exposes this functionality as well as a createManager method that returns a corresponding CollectorManager. |
ASF subversion and git services (migrated from JIRA) Commit e7f9f2c in lucene's branch refs/heads/main from Luca Cavanna LUCENE-10002: Replaces usages of FacetsCollector with FacetsCollectorManager (#764) In the effort of decreasing usages of IndexSearcher#search(query, Collector) by using the corresponding method that accepts a collector manager, this commit replaces many usages of FacetsCollector with its corresponding existing collector manager. |
ASF subversion and git services (migrated from JIRA) Commit 7ed0f3d in lucene's branch refs/heads/main from Luca Cavanna #11520: Add support for concurrent facets random sampling (#765) This commit adds a new createManager static method to RandomSamplingFacetsCollector that allows users to perform random sampling concurrently. The returned collector manager is very similar to the existing FacetsCollectorManager but it exposes a specialized reduced RandomSamplingFacetsCollector. This relates to LUCENE-10002. It allows users to use a collector manager instead of a collector when doing random sampling, in the effort of reducing usages of IndexSearcher#search(Query, Collector). |
ASF subversion and git services (migrated from JIRA) Commit 395126a in lucene's branch refs/heads/branch_9x from Luca Cavanna LUCENE-10002: Replaces usages of FacetsCollector with FacetsCollectorManager (#764) In the effort of decreasing usages of IndexSearcher#search(query, Collector) by using the corresponding method that accepts a collector manager, this commit replaces many usages of FacetsCollector with its corresponding existing collector manager. |
ASF subversion and git services (migrated from JIRA) Commit b0ecba4 in lucene's branch refs/heads/branch_9x from Luca Cavanna #11520: Add support for concurrent facets random sampling (#765) This commit adds a new createManager static method to RandomSamplingFacetsCollector that allows users to perform random sampling concurrently. The returned collector manager is very similar to the existing FacetsCollectorManager but it exposes a specialized reduced RandomSamplingFacetsCollector. This relates to LUCENE-10002. It allows users to use a collector manager instead of a collector when doing random sampling, in the effort of reducing usages of IndexSearcher#search(Query, Collector). |
ASF subversion and git services (migrated from JIRA) Commit 74e9716 in lucene's branch refs/heads/main from Luca Cavanna LUCENE-10002: move MemoryIndex to search(Query, CollectorManager) (#785) |
ASF subversion and git services (migrated from JIRA) Commit 1cf1b30 in lucene's branch refs/heads/main from Luca Cavanna LUCENE-10002: replace more usages of search(Query, Collector) in tests (#787) This commit replaces more usages of search(Query, Collector) with calling the corresponding search(Query, CollectorManager) instead. This round focuses on tests that implement custom collector, that need a corresponding collector manager. |
ASF subversion and git services (migrated from JIRA) Commit 37434ff in lucene's branch refs/heads/branch_9x from Luca Cavanna LUCENE-10002: move MemoryIndex to search(Query, CollectorManager) (#785) |
ASF subversion and git services (migrated from JIRA) Commit ccd21fa in lucene's branch refs/heads/branch_9x from Luca Cavanna LUCENE-10002: replace more usages of search(Query, Collector) in tests (#787) This commit replaces more usages of search(Query, Collector) with calling the corresponding search(Query, CollectorManager) instead. This round focuses on tests that implement custom collector, that need a corresponding collector manager. |
These can now be replaced by the corresponding methods added to FacetsCollectorManager that accept a FacetsCollectorManager as last argument in place of a Collector Relates to apache#11041
These can now be replaced by the corresponding methods added to FacetsCollectorManager that accept a FacetsCollectorManager as last argument in place of a Collector Relates to #11041
Only the actual removal of the deprecated search(Query, Collector) is left to do. That is tracked by #12892 hence this issue can be closed. |
It's a bit trappy that you can create an IndexSearcher with an executor, but that it would always search on the caller thread when calling
IndexSearcher#search(Query,Collector)
.Let's remove
IndexSearcher#search(Query,Collector)
, point our users toIndexSearcher#search(Query,CollectorManager)
instead, and change factory methods of our main collectors (e.g.TopScoreDocCollector#create
) to return aCollectorManager
instead of aCollector
?Migrated from LUCENE-10002 by Adrien Grand (@jpountz), updated Apr 06 2022
Pull requests: #240
The text was updated successfully, but these errors were encountered: