Lucene Change Log
For more information on past and future Lucene versions, please see:
http://s.apache.org/luceneversions
======================= Lucene 10.0.0 =======================
API Changes
---------------------
* LUCENE-10010: AutomatonQuery, CompiledAutomaton, RunAutomaton, RegExp
classes no longer determinize NFAs. Instead it is the responsibility
of the caller to determinize. (Robert Muir)
* LUCENE-10368: IntTaxonomyFacets has been made pkg-private and serves only as an internal
implementation detail of taxonomy-faceting. (Greg Miller)
* LUCENE-10400: Remove deprecated dictionary constructors in Kuromoji and Nori (Tomoko Uchida)
* LUCENE-10440: TaxonomyFacets and FloatTaxonomyFacets have been made pkg-private and only serve
as internal implementation details of taxonomy-faceting. (Greg Miller)
* LUCENE-10431: MultiTermQuery.setRewriteMethod() has been removed. (Alan Woodward)
* LUCENE-10436: Remove deprecated DocValuesFieldExistsQuery, NormsFieldExistsQuery and
KnnVectorFieldExistsQuery. (Zach Chen, Adrien Grand)
* LUCENE-10561: Reduce class/member visibility of all normalizer and stemmer classes. (Rushabh Shah)
* LUCENE-10266: Move nearest-neighbor search on points to core. (Rushabh Shah)
* LUCENE-10603: Remove SortedSetDocValues#NO_MORE_ORDS definition. (Greg Miller)
* GITHUB#11813: Remove Operations.isFinite: the recursive implementation could be problematic
for large automatons (WildcardQuery, PrefixQuery, RegExpQuery, etc). (taroplus, Robert Muir)
* GITHUB#11840: Query rewrite now takes an IndexSearcher instead of IndexReader to enable concurrent
rewriting. (Patrick Zhai)
* GITHUB#11933: Remove IOContext from Directory#openChecksumInput. (Zach Chen)
* GITHUB#11814: Support deletions in IndexRearranger. (Stefan Vodita)
New Features
---------------------
* LUCENE-10010: Introduce NFARunAutomaton to run NFA directly. (Patrick Zhai)
* LUCENE-10626: Hunspell: add tools to aid dictionary editing:
analysis introspection, stem expansion and stem/flag suggestion (Peter Gromov)
Improvements
---------------------
* LUCENE-10416: Update Korean Dictionary to mecab-ko-dic-2.1.1-20180720 for Nori.
(Uihyun Kim)
* LUCENE-10614: Properly support getTopChildren in RangeFacetCounts. (Yuting Gan)
* LUCENE-10652: Add a top-n range faceting example to RangeFacetsExample. (Yuting Gan)
Optimizations
---------------------
* GITHUB#11857, GITHUB#11859, GITHUB#11893, GITHUB#11909: Hunspell: improved suggestion performance (Peter Gromov)
Bug Fixes
---------------------
* LUCENE-10599: LogMergePolicy is more likely to keep merging segments until
they reach the maximum merge size. (Adrien Grand)
Other
---------------------
* LUCENE-10376: Roll up the loop in VInt/VLong in DataInput. (Guo Feng)
* LUCENE-10283: The minimum required Java version was bumped from 11 to 17.
(Adrien Grand, Uwe Schindler, Dawid Weiss, Robert Muir)
* LUCENE-10253: The @BadApple annotation has been removed from the test
framework. (Adrien Grand)
* LUCENE-10393: Unify binary dictionary and dictionary writer in Kuromoji and Nori.
(Tomoko Uchida, Robert Muir)
* LUCENE-10475: Merge dictionary builders in `util` package into `dict` package in Kuromoji and Nori.
All classes in `org.apache.lucene.analysis.[ja|ko].util` were moved to `org.apache.lucene.analysis.[ja|ko].dict`.
(Tomoko Uchida)
* LUCENE-10493: Factor out Viterbi algorithm in Kuromoji and Nori to analysis-common. (Tomoko Uchida)
* GITHUB#977, LUCENE-9500: Remove the deflater hack introduced because of JDK-8252739 (Uwe Schindler)
* GITHUB#11960: Hunspell: supported empty dictionaries (Peter Gromov)
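As background for LUCENE-10376 above, the following is a minimal sketch of variable-length int decoding written as a loop rather than as hand-unrolled steps. The class VIntSketch and its methods are hypothetical names used for illustration only; this is not Lucene's DataInput code, and it assumes the usual vInt layout of 7 data bits per byte with the high bit marking that more bytes follow.

```java
import java.io.ByteArrayOutputStream;

/** Hypothetical illustration class; not part of Lucene. */
final class VIntSketch {

  /** Encodes an int using 7 data bits per byte; a set high bit means "more bytes follow". */
  static byte[] encode(int value) {
    ByteArrayOutputStream out = new ByteArrayOutputStream(5);
    while ((value & ~0x7F) != 0) {
      out.write((value & 0x7F) | 0x80); // low 7 bits plus continuation flag
      value >>>= 7;
    }
    out.write(value); // final byte, high bit clear
    return out.toByteArray();
  }

  /** Decodes with a single loop; an int needs at most 5 bytes (shifts 0, 7, 14, 21, 28). */
  static int decode(byte[] bytes) {
    int result = 0;
    for (int i = 0, shift = 0; shift < 35; i++, shift += 7) {
      byte b = bytes[i];
      result |= (b & 0x7F) << shift;
      if (b >= 0) { // high bit clear: this was the last byte
        return result;
      }
    }
    throw new IllegalArgumentException("Invalid vInt: more than 5 bytes");
  }

  public static void main(String[] args) {
    int original = 300;
    byte[] encoded = encode(original);
    System.out.println(encoded.length + " byte(s), decoded = " + decode(encoded));
  }
}
```

The loop form keeps the termination condition in one place, which is the readability gain a rolled-up loop is meant to provide; whether it matches the unrolled version's speed depends on the JIT.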
======================== Lucene 9.5.0 =======================
API Changes
---------------------
* GITHUB#11998: Add new stored fields and termvectors interfaces: IndexReader.storedFields()
and IndexReader.termVectors(). Deprecate IndexReader.document() and IndexReader.getTermVector().
The new APIs do not rely upon ThreadLocal storage for each index segment, which can greatly
reduce RAM requirements when there are many threads and/or segments.
(Adrien Grand, Robert Muir)
* GITHUB#11742: MatchingFacetSetsCounts#getTopChildren now properly returns "top" children instead
of all children. (Greg Miller)
* GITHUB#11772: Removed native subproject and WindowsDirectory implementation from lucene.misc. Recommendation:
use MMapDirectory implementation on Windows. (Robert Muir, Uwe Schindler, Dawid Weiss)
* GITHUB#11804: FacetsCollector#collect is no longer final, allowing extension. (Greg Miller)
* GITHUB#11761: TieredMergePolicy now allows a maximum allowable deletes percentage as low as 5%, and the default
maximum allowable deletes percentage has changed from 33% to 20%. (Marc D'Mello)
* GITHUB#11822: Configure replicator PrimaryNode replica shutdown timeout. (Steven Schlansker)
* GITHUB#11930: Added IOContext#LOAD for files that are a small fraction of the
total index size and heavily accessed with a random access pattern. Some
Directory implementations may choose to load files that use this IOContext in
memory to provide stronger guarantees on query latency.
(Adrien Grand, Uwe Schindler)
* GITHUB#11941: QueryBuilder#add and #newSynonymQuery methods now take a `field` parameter,
to avoid possible exceptions when building queries from an empty term list. The helper
TermAndBoost class now holds a BytesRef rather than a Term. (Alan Woodward)
* GITHUB#11961: VectorValues#EMPTY was removed as this instance was not
necessary and also illegal as it reported a number of dimensions equal to
zero. (Adrien Grand)
* GITHUB#11962: VectorValues#cost() now delegates to VectorValues#size().
(Adrien Grand)
* GITHUB#11984: Improved TimeLimitBulkScorer to check the timeout at an exponential rate.
(Costin Leau)
* GITHUB#12004: Add new KnnByteVectorQuery for querying vector fields that are encoded as BYTE. Removes the ability to
use KnnVectorQuery against fields encoded as BYTE (Ben Trent)
* GITHUB#11997: Introduce IntField, LongField, FloatField and DoubleField.
These new fields index both 1D points and sorted numeric doc values and
provide best performance for filtering and sorting.
(Francisco Fernández Castaño, Adrien Grand)
New Features
---------------------
* GITHUB#11795: Add ByteWritesTrackingDirectoryWrapper to expose metrics for bytes merged, flushed, and overall
write amplification factor. (Marc D'Mello)
* GITHUB#11929: MMapDirectory gives more granular control on which files to
preload. (Adrien Grand, Uwe Schindler)
* GITHUB#11999: MemoryIndex now supports stored fields. (Alan Woodward)
* GITHUB#11997: Add IntField, LongField, FloatField and DoubleField: easy to
use numeric fields that perform well both for filtering and sorting.
(Francisco Fernández Castaño)
* GITHUB#12033: Support for the Java 19 foreign memory API is now enabled by default,
no need to pass "--enable-preview" on the command line. If exactly Java 19 is used,
MMapDirectory will mmap Lucene indexes in chunks of 16 GiB (instead of 1 GiB) and
indexes closed while queries are running can no longer crash the JVM. (Uwe Schindler)
Improvements
---------------------
* GITHUB#11778: Detailed part-of-speech information for particle(조사) and ending(어미) on Nori
is now tagged. (Namgyu Kim)
* GITHUB#11785: Improve Tessellator performance by delaying calls to the method
#isIntersectingPolygon (Ignacio Vera)
* GITHUB#687: speed up IndexSortSortedNumericDocValuesRangeQuery#BoundedDocIdSetIterator
construction using bkd binary search. (Jianping Weng)
* GITHUB#11985: ExitableTerms to override Terms#getMin and Terms#getMax in order to avoid
iterating through the terms when the wrapped implementation caches such values. (Luca Cavanna)
* GITHUB#11860: Improve storage efficiency of connections in the HNSW graph that Lucene uses for
vector search. (Ben Trent)
* GITHUB#12008: Clean up LongRange#verifyAndEncode logic to remove unnecessary NaN checks. (Greg Miller)
* GITHUB#12003: Minor cleanup/improvements to IndexSortSortedNumericDocValuesRangeQuery. (Greg Miller)
* GITHUB#12016: Upgrade lucene/expressions to use antlr 4.11.1 (Andriy Redko)
Bug Fixes
---------------------
* GITHUB#11726: Indexing term vectors on large documents could fail due to
trying to apply a dictionary whose size is greater than the maximum supported
window size for LZ4. (Adrien Grand)
* GITHUB#11768: Taxonomy and SSDV faceting now correctly breaks ties by preferring smaller ordinal
values. (Greg Miller)
* GITHUB#11807: Don't rewrite queries in unified highlighter. (Alan Woodward)
* GITHUB#11907: Fix latent casting bugs in BKDWriter. (Ben Trent)
* GITHUB#11954: Remove QueryTimeout#isTimeoutEnabled method and move check to caller. (Shubham Chaudhary)
* GITHUB#11950: Fix NPE in BinaryRangeFieldRangeQuery variants when the queried field doesn't exist
in a segment or is of the wrong type. (Greg Miller)
* GITHUB#11990: PassageSelector now has a larger minimum size for its priority queue,
so that subsequent passage merges don't mean that we return too few passages in
total. (Alan Woodward, Dawid Weiss)
* GITHUB#11986: Fix algorithm that chooses the bridge between a polygon and a hole when there is
a common vertex. (Ignacio Vera)
* GITHUB#12020: Fixes bug whereby very flat polygons can incorrectly contain intersecting geometries. (Craig Taverner)
Optimizations
---------------------
* GITHUB#11738: Optimize MultiTermQueryConstantScoreWrapper when a term is present that matches all
docs in a segment. (Greg Miller)
* GITHUB#11735: KeywordRepeatFilter + OpenNLPLemmatizer always drops last token of a stream.
(Luke Kot-Zaniewski)
* GITHUB#11771: KeywordRepeatFilter + OpenNLPLemmatizer sometimes arbitrarily exits token stream.
(Luke Kot-Zaniewski)
* GITHUB#11803: DrillSidewaysScorer has been improved to leverage "advance" instead of "next" where
possible, and splits out first and second phase checks to delay match confirmation. (Greg Miller)
* GITHUB#11828: Tweak TermInSetQuery "dense" optimization to only require all terms present in a
given field to match a term (rather than all docs in a segment). This is consistent with
MultiTermQueryConstantScoreWrapper. (Greg Miller)
* GITHUB#11876: Use ByteArrayComparator to speed up PointInSetQuery in single dimension case.
(Guo Feng)
* GITHUB#11880: Use ByteArrayComparator to speed up BinaryRangeFieldRangeQuery, RangeFieldQuery,
LatLonPointDistanceFeatureQuery and CheckIndex. (Guo Feng)
* GITHUB#11881: Further optimize drill-sideways scoring by specializing the single dimension case
and borrowing some concepts from "min should match" scoring. (Greg Miller)
* GITHUB#11884: Simplify the logic of matchAll() in IndexSortSortedNumericDocValuesRangeQuery. (Lu Xugang)
* GITHUB#11895: count() in BooleanQuery can now terminate early. (Lu Xugang)
* GITHUB#11972: `IndexSortSortedNumericDocValuesRangeQuery` can now also
optimize query execution with points for descending sorts. (Adrien Grand)
* GITHUB#12006: Compare ints directly instead of using ArrayUtil#compareUnsigned4 in LatlonPointQueries. (Guo Feng)
* GITHUB#12017: Aggressive count in BooleanWeight. (Lu Xugang)
Other
---------------------
* GITHUB#11856: Fix nanos to millis conversion for tests (Marios Trivyzas)
* LUCENE-10423: Remove usages of System.currentTimeMillis() from tests. (Marios Trivyzas)
* GITHUB#11811: Upgrade google java format to 1.15.0 (Dawid Weiss)
* GITHUB#11834: Upgrade forbiddenapis to version 3.4. (Uwe Schindler)
* LUCENE-10635: Ensure test coverage for WANDScorer by using a test query. (Zach Chen, Adrien Grand)
* GITHUB#11752: Added interface to relate a LatLonShape with another shape represented as Component2D. (Navneet Verma)
* GITHUB#11983: Make constructors for OffsetFromPositions and OffsetsFromMatchIterator
public. (Alan Woodward)
* LUCENE-10546: Update Faceting user guide. (Egor Potemkin)
Build
---------------------
* GITHUB#11886: Upgrade to gradle 7.5.1 (Dawid Weiss)
======================== Lucene 9.4.2 =======================
Bug Fixes
---------------------
* GITHUB#11905: Fix integer overflow when seeking the vector index for connections in a single segment.
This addresses a bug that was introduced in 9.2.0 where having many vectors is not handled well
in the vector connections reader.
* GITHUB#11939: Fix incorrect cost calculation in DocIdSetBuilder after upgradeToBitSet when doc list is growing.
This addresses a bug where the cost of TermRangeQuery/TermInSetQuery and some other queries will be highly underestimated.
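The overflow class behind fixes such as GITHUB#11905 can be illustrated with a small, generic sketch. The class OffsetOverflowSketch, its method names, and the sample numbers are hypothetical and are not taken from Lucene's vector reader; the sketch only shows why an offset computed with int arithmetic goes wrong once the product exceeds Integer.MAX_VALUE, and how widening to long before multiplying avoids it.

```java
/** Hypothetical illustration class; not part of Lucene. */
final class OffsetOverflowSketch {

  /** Buggy pattern: the multiplication happens in int and wraps before it is widened to long. */
  static long offsetWithIntMath(int ordinal, int bytesPerEntry) {
    return ordinal * bytesPerEntry;
  }

  /** Safe pattern: widen one operand first so the multiplication is done in long. */
  static long offsetWithLongMath(int ordinal, int bytesPerEntry) {
    return (long) ordinal * bytesPerEntry;
  }

  public static void main(String[] args) {
    int ordinal = 20_000_000; // assumed entry index
    int bytesPerEntry = 128;  // assumed entry size in bytes
    System.out.println("int math:  " + offsetWithIntMath(ordinal, bytesPerEntry));
    System.out.println("long math: " + offsetWithLongMath(ordinal, bytesPerEntry));
  }
}
```

Where a silent wrap would be unacceptable, Math.multiplyExact can be used instead; it throws ArithmeticException on overflow rather than returning a wrapped value.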
Improvements
---------------------
* GITHUB#11912, GITHUB#11918: Port generic exception handling from MemorySegmentIndexInput
to ByteBufferIndexInput. This also adds the invalid position while seeking or reading
to the exception message. Allows better debugging and analysis of bugs like GITHUB#11905.
(Uwe Schindler, Robert Muir)
* GITHUB#11916: Improve CheckIndex to be more thorough for vectors. (Ben Trent)
======================== Lucene 9.4.1 =======================
Bug Fixes
---------------------
* GITHUB#11858: Fix kNN vectors format validation on large segments. This
addresses a regression in 9.4.0 where validation could fail, preventing
further writes or searches on the index. (Julie Tibshirani)
======================== Lucene 9.4.0 =======================
API Changes
---------------------
* LUCENE-10577: Add VectorEncoding to enable byte-encoded HNSW vectors (Michael Sokolov, Julie Tibshirani)
New Features
---------------------
* LUCENE-10654: Add new ShapeDocValuesField for LatLonShape and XYShape. (Nick Knize)
* LUCENE-10629: Support match set filtering with a query in MatchingFacetSetCounts. (Stefan Vodita, Shai Erera)
* LUCENE-10633: SortField#setOptimizeSortWithIndexedData and
SortField#getOptimizeSortWithIndexedData were introduced to provide
an option to disable sort optimization for various sort fields. (Mayya Sharipova)
* GITHUB#912: Support for the Java 19 foreign memory API was added. Applications started
with command line parameter "java --enable-preview" will automatically use the new
foreign memory API of Java 19 to access indexes on disk with MMapDirectory. This is
an opt-in feature and requires an explicit Java command line flag! When enabled, Lucene logs
a notice using java.util.logging. Please test thoroughly and report bugs/slowness to Lucene's
mailing list. When the new API is used, MMapDirectory will mmap Lucene indexes in chunks of
16 GiB (instead of 1 GiB) and indexes closed while queries are running can no longer crash
the JVM. (Uwe Schindler)
Improvements
---------------------
* LUCENE-10592: Build HNSW Graph on indexing. (Mayya Sharipova, Adrien Grand, Julie Tibshirani)
* LUCENE-10207: TermInSetQuery can now provide a ScoreSupplier with cost estimation, making it
usable in IndexOrDocValuesQuery. (Greg Miller)
* LUCENE-10216: Use MergePolicy to define and MergeScheduler to trigger the reader merges
required by addIndexes(CodecReader[]) API. (Vigya Sharma, Michael McCandless)
* GITHUB#11715: Add Integer awareness to RamUsageEstimator.sizeOf (Mike Drob)
Optimizations
---------------------
* LUCENE-10661: Reduce memory copy in BytesStore. (luyuncheng)
* GITHUB#1020: Support #scoreSupplier and small optimizations to DocValuesRewriteMethod. (Greg Miller)
* LUCENE-10633: Added support for dynamic pruning to queries sorted by a string
field that is indexed with terms and SORTED or SORTED_SET doc values.
(Adrien Grand)
* LUCENE-10627: Use ByteBuffersDataInput to reduce memory copies when compressing data. (luyuncheng)
* GITHUB#1062: Optimize TermInSetQuery when a term is present that matches all docs in a segment.
(Greg Miller)
Bug Fixes
---------------------
* LUCENE-10663: Fix KnnVectorQuery explain with multiple segments. (Shiming Li)
* LUCENE-10673: Improve check of equality for latitudes for spatial3d GeoBoundingBox (Ignacio Vera)
* LUCENE-10678: Fix potential overflow when building a BKD tree with more than 4 billion points. The overflow
occurs when computing the partition point. (Ignacio Vera)
* LUCENE-10644: Facets#getAllChildren testing should ignore child order. (Yuting Gan)
* LUCENE-10665, GITHUB#11701: Fix classloading deadlock in analysis factories / AnalysisSPILoader
initialization. (Uwe Schindler)
* LUCENE-10674: Ensure BitSetConjDISI returns NO_MORE_DOCS when sub-iterator exhausts. (Jack Mazanec)
* GITHUB#11794: Guard FieldExistsQuery against null pointers (Luca Cavanna)
Build
---------------------
* GITHUB#11720: Upgrade randomizedtesting to 2.8.1 (potential fix for odd wall clock - related
timeout failures). (Dawid Weiss)
* LUCENE-10669: The build should be more helpful when generated resources are touched (Dawid Weiss)
Other
---------------------
* LUCENE-10559: Add Prefilter Option to KnnGraphTester (Kaival Parikh)
======================== Lucene 9.3.0 =======================
API Changes
---------------------
* LUCENE-10603: SortedSetDocValues#NO_MORE_ORDS marked @deprecated in favor of iterating with
SortedSetDocValues#docValueCount(). (Greg Miller)
* GITHUB#978: Deprecate (remove in Lucene 10) obsolete constants in oal.util.Constants; remove
code which is no longer executed after Java 9. (Uwe Schindler)
New Features
---------------------
* LUCENE-10550: Add getAllChildren functionality to facets (Yuting Gan)
* LUCENE-10274: Added facetsets module for high dimensional (hyper-rectangle) faceting
(Shai Erera, Marc D'Mello, Greg Miller)
* LUCENE-10151: Enable timeout support in IndexSearcher. (Deepika Sharma)
Improvements
---------------------
* LUCENE-10078: Merge on full flush is now enabled by default with a timeout of
500ms. (Adrien Grand)
* LUCENE-10585: Facet module code cleanup (copy/paste scrubbing, simplification and some very minor
optimization tweaks). (Greg Miller)
* LUCENE-10603: Update SortedSetDocValues iteration to use SortedSetDocValues#docValueCount().
(Greg Miller, Stefan Vodita)
* LUCENE-10619: Optimize the writeBytes in TermsHashPerField. (Tang Donghai)
* GITHUB#983: AbstractSortedSetDocValueFacetCounts internal code cleanup/refactoring. (Greg Miller)
Optimizations
---------------------
* LUCENE-8519: MultiDocValues.getNormValues should not call getMergedFieldInfos (Rushabh Shah)
* GITHUB#961: BooleanQuery can return quick counts for simple boolean queries.
(Adrien Grand)
* LUCENE-10618: Implement BooleanQuery rewrite rules based on minimumShouldMatch. (Fang Hou)
* LUCENE-10480: Implement Block-Max-Maxscore scorer for 2 clauses disjunction. (Zach Chen, Adrien Grand)
* LUCENE-10606: For KnnVectorQuery, optimize case where filter is backed by BitSetIterator (Kaival Parikh)
* LUCENE-10593: Vector similarity function and NeighborQueue reverse removal. (Alessandro Benedetti)
* GITHUB#984: Use primitive type data structures in FloatTaxonomyFacets and IntTaxonomyFacets
#getAllChildren() internal implementation to avoid some garbage creation. (Greg Miller)
* GITHUB#1010: Specialize ordinal encoding for common case in SortedSetDocValues. (Greg Miller)
* LUCENE-10657: CopyBytes now saves one memory copy on ByteBuffersDataOutput. (luyuncheng)
* GITHUB#1007: Optimize IntersectVisitor#visit implementations for certain bulk-add cases.
(Greg Miller)
* LUCENE-10653: BlockMaxMaxscoreScorer uses heapify instead of individual adds. (Greg Miller)
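The difference between building a heap with individual adds and building it in bulk, as mentioned in LUCENE-10653 above, can be sketched with the JDK's java.util.PriorityQueue. This is an illustration of the general technique only: the class HeapifySketch and its methods are hypothetical, and the code does not use Lucene's own PriorityQueue or BlockMaxMaxscoreScorer. It assumes the OpenJDK behaviour in which the collection constructor copies all elements and then heapifies the backing array once.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical illustration class; not part of Lucene. */
final class HeapifySketch {

  /** Adds elements one at a time; each add may sift up, so O(n log n) in the worst case. */
  static PriorityQueue<Integer> buildByIndividualAdds(List<Integer> values) {
    PriorityQueue<Integer> pq = new PriorityQueue<>();
    for (Integer v : values) {
      pq.add(v);
    }
    return pq;
  }

  /** Hands over the whole collection so the queue can establish heap order in one pass. */
  static PriorityQueue<Integer> buildInBulk(List<Integer> values) {
    return new PriorityQueue<>(values);
  }

  /** Polls until empty, returning elements in ascending order. */
  static List<Integer> drain(PriorityQueue<Integer> pq) {
    List<Integer> out = new ArrayList<>();
    while (!pq.isEmpty()) {
      out.add(pq.poll());
    }
    return out;
  }

  public static void main(String[] args) {
    List<Integer> values = Arrays.asList(5, 1, 4, 2, 3);
    System.out.println(drain(buildByIndividualAdds(values)));
    System.out.println(drain(buildInBulk(values)));
  }
}
```

Both construction paths yield the same ordering when drained; only the cost of building the heap differs.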
Changes in runtime behavior
---------------------
* GITHUB#978: IndexWriter diagnostics written to index only contain java's runtime version
and vendor. (Uwe Schindler)
Bug Fixes
---------------------
* LUCENE-10574: Prevent pathological O(N^2) merging. (Adrien Grand)
* LUCENE-10584: Properly support #getSpecificValue for hierarchical dims in SSDV faceting.
(Greg Miller)
* LUCENE-10582: Fix merging of overridden CollectionStatistics in CombinedFieldQuery (Yannick Welsch)
* LUCENE-10563: Fix failure to tessellate complex polygon (Craig Taverner)
* LUCENE-10605: Fix error in 32bit jvm object alignment gap calculation (Sun Wuqiang)
* GITHUB#956: Make sure KnnVectorQuery applies search boost. (Julie Tibshirani)
* LUCENE-10598: SortedSetDocValues#docValueCount() should be always greater than zero. (Lu Xugang)
* LUCENE-10600: SortedSetDocValues#docValueCount should be an int, not long (Lu Xugang)
* LUCENE-10611: Fix failure when KnnVectorQuery has very selective filter (Kaival Parikh)
* LUCENE-10607: Fix potential integer overflow in maxArcs computations (Tang Donghai)
* GITHUB#986: Fix FieldExistsQuery rewrite when all docs have vectors. (Julie Tibshirani)
* LUCENE-10623: Fix incorrect implementation of docValueCount for SortingSortedSetDocValues (Lu Xugang)
* GITHUB#1028: Fix error in TieredMergePolicy (Lin Jian)
Other
---------------------
* GITHUB#991: Update randomizedtesting to 2.8.0, hppc to 0.9.1, morfologik to 2.1.9. (Dawid Weiss)
* LUCENE-10370: pass proper classpath/module arguments for forking jvms from within tests. (Dawid Weiss)
* LUCENE-10604: Improve ability to test and debug triangulation algorithm in Tessellator.
(Craig Taverner)
* GITHUB#922: Remove unused and confusing FacetField indexing options (Gautam Worah)
Build
---------------------
* GITHUB#976: Exclude Lucene's own JAR files from classpath entries in Eclipse config.
(Uwe Schindler)
======================= Lucene 9.2.0 =======================
API Changes
---------------------
* LUCENE-10325: Facets API extended to support getTopFacets. (Yuting Gan)
* LUCENE-10482: Allow users to create their own DirectoryTaxonomyReaders with empty taxoArrays instead of letting the
taxoEpoch decide. Add a test case that demonstrates the inconsistencies caused when you reuse taxoArrays on older
checkpoints. (Gautam Worah)
* LUCENE-10558: Add new constructors to Kuromoji and Nori dictionary classes to support classpath /
module system usage. It is now possible to use JDK's Class/ClassLoader/Module#getResource(...) apis
and pass their returned URL to dictionary constructors to load resources from Classpath or Module
resources. (Uwe Schindler, Tomoko Uchida, Mike Sokolov)
New Features
---------------------
* LUCENE-10312: Add PersianStemmer based on the Arabic stemmer. (Ramin Alirezaee)
* LUCENE-10539: Return a stream of completions from FSTCompletion. (Dawid Weiss)
* LUCENE-10385: Implement Weight#count on IndexSortSortedNumericDocValuesRangeQuery
to speed up computing the number of hits when possible. (Lu Xugang, Luca Cavanna, Adrien Grand)
* LUCENE-10422: Monitor Improvements: `Monitor` can use a custom `Directory`
implementation. `Monitor` can be created with a readonly `QueryIndex` in order to
have readonly `Monitor` instances. (Niko Usai)
* LUCENE-10456: Implement rewrite and Weight#count for MultiRangeQuery
by merging overlapping ranges. (Jianping Weng)
* LUCENE-10444: Support alternate aggregation functions in association facets. (Greg Miller)
Improvements
---------------------
* LUCENE-10229: return -1 for unknown offsets in ExtendedIntervalsSource. Modify highlighting to
work properly with or without offsets. (Dawid Weiss)
* LUCENE-10494: Implement method to bulk add all collection elements to a PriorityQueue.
(Bauyrzhan Sakhariyev)
* LUCENE-10484: Add support for concurrent random sampling by calling
RandomSamplingFacetsCollector#createManager. (Luca Cavanna)
* LUCENE-10467: Throws IllegalArgumentException for Facets#getAllDims and Facets#getTopChildren
if topN <= 0. (Yuting Gan)
* LUCENE-9848: Correctly sort HNSW graph neighbors when applying diversity criterion (Mayya
Sharipova, Michael Sokolov)
* LUCENE-10527: Use 2*maxConn for the last layer in HNSW (Mayya Sharipova)
Optimizations
---------------------
* LUCENE-10555: avoid NumericLeafComparator#iteratorCost repeated initialization
when NumericLeafComparator#setScorer is called. (Jianping Weng)
* LUCENE-10452: Hunspell: call checkCanceled less frequently to reduce the overhead (Peter Gromov)
* LUCENE-10451: Hunspell: don't perform potentially expensive spellchecking after timeout (Peter Gromov)
* LUCENE-10418: More `Query#rewrite` optimizations for the non-scoring case.
(Adrien Grand)
* LUCENE-10436: Deprecate DocValuesFieldExistsQuery, NormsFieldExistsQuery and KnnVectorFieldExistsQuery
in favor of FieldExistsQuery. (Zach Chen, Michael McCandless, Adrien Grand)
* LUCENE-10481: FacetsCollector will not request scores if it does not use them. (Mike Drob)
* LUCENE-10503: Potential speedup for pure disjunctions whose clauses produce
scores that are very close to each other. (Adrien Grand)
* LUCENE-10315: Use SIMD instructions to decode BKD doc IDs. (Guo Feng, Adrien Grand, Ignacio Vera)
* LUCENE-8836: Speed up calls to TermsEnum#lookupOrd on doc values terms enums
and sequences of increasing ords. (Bruno Roustant, Adrien Grand)
* LUCENE-10536: Doc values terms dictionaries now use the first (uncompressed)
term of each block as a dictionary when compressing suffixes of the other 63
terms of the block. (Adrien Grand)
* LUCENE-10411: Add nearest neighbors vectors support to ExitableDirectoryReader.
(Zach Chen, Adrien Grand, Julie Tibshirani, Tomoko Uchida)
* LUCENE-10542: FieldSource exists implementations can avoid value retrieval (Kevin Risden)
* LUCENE-10534: MinFloatFunction / MaxFloatFunction exists check can be slow (Kevin Risden)
* LUCENE-10496: Queries sorted by field now better handle the degenerate case
when the search order and the index order are in opposite directions.
(Jianping Weng)
* LUCENE-10502: Use IndexedDISI to store docIds and DirectMonotonicWriter/Reader to handle
ordToDoc in HNSW vectors (Lu Xugang)
* LUCENE-10488: Facets#getTopDims optimized for taxonomy faceting and
ConcurrentSortedSetDocValuesFacetCounts. (Yuting Gan)
Bug Fixes
---------------------
* LUCENE-10477: Highlighter: WeightedSpanTermExtractor.extractWeightedSpanTerms now calls Query#rewrite
multiple times if necessary. (Christine Poerschke, Adrien Grand)
* LUCENE-10491: A correctness bug in the way scores are provided within TaxonomyFacetSumValueSource
was fixed. (Michael McCandless, Greg Miller)
* LUCENE-10466: Ensure IndexSortSortedNumericDocValuesRangeQuery handles sort field
types besides LONG (Andriy Redko)
* LUCENE-10292: Suggest: Fix AnalyzingInfixSuggester / BlendedInfixSuggester to correctly return
existing lookup() results during concurrent build(). Fix other FST based suggesters so that
getCount() returned results consistent with lookup() during concurrent build(). (hossman)
* LUCENE-10508: Fixes some edge cases where GeoArea were built in a way that vertical planes
could not evaluate their sign, either because the planes were the same or the center between those
planes was lying in one of the planes. (Ignacio Vera)
* LUCENE-10495: Fix return statement of siblingsLoaded() in TaxonomyFacets. (Yuting Gan)
* LUCENE-10533: SpellChecker.formGrams is missing bounds check (Kevin Risden)
* LUCENE-10529: Properly handle when TestTaxonomyFacetAssociations test case randomly indexes
no documents instead of throwing an NPE. (Greg Miller)
* LUCENE-10470: Check if polygon has been successfully tessellated before we fail (we are failing some valid
tessellations) and allow filtering edges that fold on top of the previous one. (Ignacio Vera)
* LUCENE-10530: Avoid floating point precision test case bug in TestTaxonomyFacetAssociations.
(Greg Miller)
* LUCENE-10552: KnnVectorQuery has incorrect equals/hashCode. (Lu Xugang)
* LUCENE-10558: Restore behaviour of deprecated Kuromoji and Nori dictionary constructors for
custom dictionary support. Please also use new URL-based constructors for classpath/module
system resources. (Uwe Schindler, Tomoko Uchida, Mike Sokolov)
* LUCENE-10564: Make sure SparseFixedBitSet#or updates ramBytesUsed. (Julie Tibshirani)
Build
---------------------
* GITHUB#768: Upgrade forbiddenapis to version 3.3. (Uwe Schindler)
* GITHUB#890: Detect CI builds on Github or Jenkins and enable errorprone. (Uwe Schindler, Dawid Weiss)
* LUCENE-10532: Remove LuceneTestCase.Slow annotation. All tests can be fast. (Robert Muir)
Other
---------------------
* LUCENE-10526: Test-framework: Add FilterFileSystemProvider.wrapPath(Path) method for mock filesystems
to override if they need to extend the Path implementation. (Gautam Worah, Robert Muir)
* LUCENE-10525: Test-framework: Add detection of illegal windows filenames to WindowsFS. (Gautam Worah)
* LUCENE-10541: Test-framework: limit the default length of MockTokenizer tokens to 255.
(Robert Muir, Uwe Schindler, Tomoko Uchida, Dawid Weiss)
* GITHUB#854: Allow linking to GitHub pull requests from CHANGES. (Tomoko Uchida, Jan Høydahl)
======================= Lucene 9.1.0 =======================
API Changes
---------------------
* LUCENE-10244: MultiCollector::getCollectors is now public, allowing users to access the wrapped
collectors. (Andriy Redko)
* LUCENE-10197: UnifiedHighlighter now has a Builder to construct it. The UH's setters are now
deprecated. (Animesh Pandey, David Smiley)
* LUCENE-10301: The test framework is now a module. All the classes have been moved from
org.apache.lucene.* to org.apache.lucene.tests.* to avoid package name conflicts with the
core module. (Dawid Weiss)
* LUCENE-10183: KnnVectorsWriter#writeField now takes KnnVectorsReader instead of VectorValues.
(Zach Chen, Michael Sokolov, Julie Tibshirani, Adrien Grand)
* LUCENE-10335: Deprecate helper methods for resource loading in IOUtils and StopwordAnalyzerBase
that are not compatible with the module system (Class#getResourceAsStream() and Class#getResource()
are caller sensitive in Java 11). Instead add utility method IOUtils#requireResourceNonNull(T)
to test existence of resource based on null return value. (Uwe Schindler, Dawid Weiss)
* LUCENE-10349: WordListLoader methods now return unmodifiable CharArraySets. (Uwe Schindler)
* LUCENE-10377: SortField.getComparator() has changed signature. The second parameter is now
a boolean indicating whether or not skipping should be enabled on the comparator.
(Alan Woodward)
* LUCENE-10381: Require users to provide FacetsConfig for SSDV faceting. (Greg Miller)
* LUCENE-10368: IntTaxonomyFacets has been deprecated and is no longer a supported extension point
for user-created faceting implementations. (Greg Miller)
* LUCENE-10400: Add constructors that take external resource Paths to dictionary classes in Kuromoji and Nori:
ConnectionCosts, TokenInfoDictionary, and UnknownDictionary. Old constructors that take resource scheme and
resource path in those classes are deprecated; these are replaced with the new constructors and are planned to be
removed in a future release. (Tomoko Uchida, Uwe Schindler, Mike Sokolov)
* LUCENE-10050: Deprecate DrillSideways#search(Query, Collector) in favor of
DrillSideways#search(Query, CollectorManager). This reflects the change (LUCENE-10002) being made in
IndexSearcher#search that trends towards using CollectorManagers over Collectors. (Gautam Worah)
* LUCENE-10420: Move functional interfaces in IOUtils to top-level interfaces.
(David Smiley, Uwe Schindler, Dawid Weiss, Tomoko Uchida)
* LUCENE-10398: Add static method for getting Terms from LeafReader. (Spike Liu)
* LUCENE-10440: TaxonomyFacets and FloatTaxonomyFacets have been deprecated and are no longer
supported extension points for user-created faceting implementations. (Greg Miller)
* LUCENE-10431: MultiTermQuery.setRewriteMethod() has been deprecated, and constructor
parameters for the various implementations added. (Alan Woodward)
* LUCENE-10171: OpenNLPOpsFactory.getLemmatizerDictionary(String, ResourceLoader) now returns a
DictionaryLemmatizer object instead of a raw String serialization of the dictionary.
(Spyros Kapnissis via Michael Gibney, Alessandro Benedetti)
New Features
---------------------
* LUCENE-10255: Lucene JARs are now proper modules, with module descriptors and dependency information.
(Chris Hegarty, Uwe Schindler, Tomoko Uchida, Dawid Weiss)
* LUCENE-10342: Lucene Core now depends on java.logging (JUL) module and reports
if MMapDirectory cannot unmap mapped ByteBuffers or RamUsageEstimator's object size
calculations may be off. This was added especially for users running Lucene with the
Java Module System where some optional features are not available by default or supported.
For all apps using Lucene it is strongly recommended to explicitly require non-standard
JDK modules: jdk.unsupported (unmapping) and jdk.management (OOP size for RAM usage calculations).
It is also recommended to install JUL logging adapters to feed the log events into your app's
logging system. (Uwe Schindler, Dawid Weiss, Tomoko Uchida, Robert Muir)
* LUCENE-10330: Make MMapDirectory tests fail by default, if unmapping does not work.
(Uwe Schindler, Dawid Weiss)
* LUCENE-10223: Add interval function support to StandardQueryParser. Add min-should-match operator
support to StandardQueryParser. Update and clean up package documentation in flexible query parser
module. (Dawid Weiss, Alan Woodward)
* LUCENE-10220: Add a utility method to get IntervalSource from analyzed text (or token stream).
(Uwe Schindler, Dawid Weiss, Alan Woodward)
* LUCENE-10085: Added Weight#count on DocValuesFieldExistsQuery to speed up the query if terms or
points are indexed.
(Quentin Pradet, Adrien Grand)
* LUCENE-10263: Added Weight#count to NormsFieldExistsQuery to speed up the query if all
documents have the field. (Alan Woodward)
* LUCENE-10248: Add SpanishPluralStemFilter, for precise stemming of Spanish plurals.
For more information, see https://s.apache.org/spanishplural (Xavier Sanchez Loro)
* LUCENE-10243: StandardTokenizer, UAX29URLEmailTokenizer, and HTMLStripCharFilter have
been upgraded to Unicode 12.1 (Robert Muir)
* LUCENE-10335: Add ModuleResourceLoader as complement to ClasspathResourceLoader.
(Uwe Schindler)
* LUCENE-10245: MultiDoubleValues(Source) and MultiLongValues(Source) were added as multi-valued
versions of DoubleValues(Source) and LongValues(Source) to the facets module. LongValueFacetCounts,
LongRangeFacetCounts and DoubleRangeFacetCounts were augmented to support these new multi-valued
abstractions. DoubleRange and LongRange also support creating queries from these multi-valued
sources. (Greg Miller)
* LUCENE-10250: Add support for arbitrary length hierarchical SSDV facets. (Marc D'mello)
* LUCENE-10395: Add support for TotalHitCountCollectorManager, a collector manager
based on TotalHitCountCollector that allows users to parallelize counting the
number of hits. (Luca Cavanna, Adrien Grand)
* LUCENE-10403: Add ArrayUtil#grow(T[]). (Greg Miller)
* LUCENE-10414: Add fn:fuzzyTerm interval function to flexible query parser (Dawid Weiss,
Alan Woodward)
* LUCENE-10378: Implement Weight#count for PointRangeQuery to provide a faster way to calculate
the number of matching range docs when each doc has at most one point and the points are 1-dimensional.
(Gautam Worah, Ignacio Vera, Adrien Grand)
* LUCENE-10415: FunctionScoreQuery and IndexOrDocValuesQuery delegate Weight#count. (Ignacio Vera)
* LUCENE-10382: Add support for filtering in KnnVectorQuery. This allows for finding the
nearest k documents that also match a query. (Julie Tibshirani, Joel Bernstein)
* LUCENE-10237: Add MergeOnFlushMergePolicy to sandbox.
(Michael Froh, Anand Kotriwal)
Improvements
---------------------
* LUCENE-10313: Use java.util.logging in Luke. Add dynamic log filtering. Drop
the persistent log previously written to ~/.luke.d/luke.log. Configure Java's default
logging handlers to persist Luke logs according to your needs. (Tomoko Uchida, Dawid Weiss)
* LUCENE-10238: Upgrade icu4j dependency to 70.1. (Dawid Weiss)
* LUCENE-9820: Extract BKD tree interface and move intersecting logic to the
PointValues abstract class. (Ignacio Vera, Adrien Grand)
* LUCENE-10262: Lift up restrictions for navigating PointValues#PointTree
added in LUCENE-9820 (Ignacio Vera)
* LUCENE-9538: Detect polygon self-intersections in the Tessellator. (Ignacio Vera)
* LUCENE-10275: Speed up MultiRangeQuery by using an interval tree. (Ignacio Vera)
* LUCENE-10229: Unify behaviour of match offsets for interval queries on fields
with or without offsets enabled. (Patrick Zhai)
* LUCENE-10054: Make HnswGraph hierarchical. (Mayya Sharipova, Julie Tibshirani, Mike Sokolov,
Adrien Grand)
* LUCENE-10371: Make IndexRearranger able to arrange segments in a determined order.
(Patrick Zhai)
Optimizations
---------------------
* LUCENE-10329: Use computed block mask for DirectMonotonicReader#get. (Guo Feng)
* LUCENE-10280: Optimize BKD leaves' doc IDs codec when they are continuous. (Guo Feng)
* LUCENE-10233: Store BKD leaves' doc IDs as bitset in some cases (typically for low cardinality fields
or sorted indices) to speed up addAll. (Guo Feng, Adrien Grand)
* LUCENE-10225: Improve IntroSelector with 3-ways partitioning. (Bruno Roustant, Adrien Grand)
* LUCENE-10321: Tweak MultiRangeQuery interval tree creation to skip "pulling up" mins. (Greg Miller)
* LUCENE-10252: ValueSource.asDoubleValues and asLongValues should not compute the score unless
asked to -- typically never. This fixes a performance regression since 7.3 (LUCENE-8099) when some
older boosting queries were replaced with this. (David Smiley)
* LUCENE-10346: Optimize facet counting for single-valued TaxonomyFacetCounts. (Guo Feng)
* LUCENE-10356: Further optimize facet counting for single-valued TaxonomyFacetCounts. (Greg Miller)
* LUCENE-10379: Count directly into the dense values array in FastTaxonomyFacetCounts#countAll.
(Guo Feng, Greg Miller)
* LUCENE-10375: Speed up HNSW vectors merge by first writing combined vector
data to a file. (Julie Tibshirani, Adrien Grand)
* LUCENE-10388: Remove MultiLevelSkipListReader#SkipBuffer to make JVM less confused. (Guo Feng)
* LUCENE-10367: Optimize CoveringQuery for the case when the minimum number of
matching clauses is a constant. (LuYunCheng via Adrien Grand)
* LUCENE-10412: More `Query#rewrite` optimizations for MatchNoDocsQuery.
(Adrien Grand)
* LUCENE-10408: Better encoding of doc IDs in vectors. (Mayya Sharipova, Julie Tibshirani, Adrien Grand)
* LUCENE-10424, LUCENE-10439: Optimize the "everything matches" case for count query in PointRangeQuery. (Ignacio Vera, Lu Xugang)
* LUCENE-10084, LUCENE-10435: Rewrite DocValuesFieldExistsQuery to MatchAllDocsQuery whenever
terms or points have a docCount that is equal to maxDoc. (Vigya Sharma, Lu Xugang)
* LUCENE-10442: When indexQuery and/or dvQuery is a MatchAllDocsQuery,
IndexOrDocValuesQuery is now rewritten to MatchAllDocsQuery. (Lu Xugang)
* LUCENE-10450: IndexSortSortedNumericDocValuesRangeQuery can now be rewritten to MatchAllDocsQuery. (Lu Xugang)
* LUCENE-10453: Indexing and search speedup with KNN vectors when using
euclidean distance. (Adrien Grand)
* LUCENE-10455: IndexSortSortedNumericDocValuesRangeQuery now implements the scorerSupplier API. (Lu Xugang)
Changes in runtime behavior
---------------------
* LUCENE-10291: Lucene now only writes files for terms and postings if at least
one field is indexed with postings. (Yannick Welsch)
* LUCENE-10311: FixedBitSet#approximateCardinality now trades accuracy for
speed instead of delegating to FixedBitSet#cardinality.
(Robert Muir, Adrien Grand)
Bug Fixes
---------------------
* LUCENE-10316: Fix TestLRUQueryCache.testCachingAccountableQuery failure. (Patrick Zhai)
* LUCENE-10279: Fix equals in MultiRangeQuery. (Ignacio Vera)
* LUCENE-10349: Fix all analyzers to behave according to their documentation:
getDefaultStopSet() methods now return unmodifiable CharArraySets. (Uwe Schindler)
* LUCENE-10352: Add missing service provider entries: KoreanNumberFilterFactory,
DaitchMokotoffSoundexFilterFactory (Uwe Schindler, Robert Muir)
* LUCENE-10352: Fixed ctor argument checks: JapaneseKatakanaStemFilter,
DoubleMetaphoneFilter (Uwe Schindler, Robert Muir)
* LUCENE-10236: Stop duplicating norms when scoring in CombinedFieldQuery.
(Zach Chen, Jim Ferenczi, Julie Tibshirani)
* LUCENE-10353: Add random null injection to TestRandomChains. (Robert Muir,
Uwe Schindler)
* LUCENE-10377: CheckIndex could incorrectly throw an error when checking index sorts
defined on older indexes. (Alan Woodward)
* LUCENE-9952: Address inaccurate dim counts for SSDV faceting in cases where a dim is configured
as multi-valued. (Greg Miller)
* LUCENE-10401: Fix lookups on empty doc-value terms dictionaries to no longer
throw an ArrayIndexOutOfBoundsException. (Adrien Grand)
* LUCENE-10402: Prefix intervals should declare their automaton as binary, otherwise prefixes
containing multibyte characters will not correctly match. (Alan Woodward)
* LUCENE-10407: Containing intervals could sometimes yield incorrect matches when wrapped
in a disjunction. (Alan Woodward, Dawid Weiss)
* LUCENE-10405: When using the MemoryIndex, binary and sorted doc values are stored
as BytesRef instead of BytesRefHash so they don't have a limit on size. (Ignacio Vera)
* LUCENE-10428: Queries with a misbehaving score function can no longer cause
infinite loops in their parent BooleanQuery.
(Ankit Jain, Daniel Doubrovkine, Adrien Grand)
* LUCENE-10431: MultiTermQuery no longer includes its rewrite method in its hashcode
calculation, as this could cause problems with wrapper queries like BooleanQuery which
expect their child queries' hashcodes to be stable. (Alan Woodward)
* LUCENE-10469: Fix ScoreMode propagation by ConstantScoreQuery. (Adrien Grand)
Other
---------------------
* LUCENE-10273: Deprecate SpanishMinimalStemFilter in favor of SpanishPluralStemFilter. (Robert Muir)
* LUCENE-10284: Upgrade morfologik-stemming to 2.1.8. (Dawid Weiss)
* LUCENE-10310: TestXYDocValuesQueries#doRandomDistanceTest no longer produces random circles with
a radius of 0.
* LUCENE-10352: Removed duplicate instances of StringMockResourceLoader and migrated class to
test-framework. (Uwe Schindler, Robert Muir)
* LUCENE-10352: Convert TestAllAnalyzersHaveFactories and TestRandomChains to a global integration test
and discover classes to check from module system. The test now checks all analyzer modules,
so it may discover new bugs outside of analysis:common module. (Uwe Schindler, Robert Muir)
* LUCENE-10413: Make Ukrainian default stop words list available as a public getter. (Alan Woodward)
* LUCENE-10437: Polygon tessellator throws a more informative error message when the provided polygon
does not contain enough non-collinear points. (Ignacio Vera)
======================= Lucene 9.0.0 =======================
New Features
---------------------
* LUCENE-9322, LUCENE-9855: Vector-valued fields, Lucene90 Codec (Mike Sokolov, Julie Tibshirani, Tomoko Uchida)
* LUCENE-9004, LUCENE-10040: Approximate nearest vector search via NSW graphs (Mike Sokolov, Tomoko Uchida et al.)
* LUCENE-9659: SpanPayloadCheckQuery now supports inequalities. (Kevin Watters, Gus Heck)
* LUCENE-9589: Swedish Minimal Stemmer (janhoy)
* LUCENE-9313: Add SerbianAnalyzer based on the snowball stemmer. (Dragan Ivanovic)
* LUCENE-10095: Add NepaliAnalyzer based on the snowball stemmer. (Robert Muir)
* LUCENE-10096: Add TamilAnalyzer based on the snowball stemmer. (Robert Muir)
* LUCENE-10102: Add JapaneseCompletionFilter for Input Method-aware auto-completion (Tomoko Uchida, Robert Muir, Jun Ohtani)
System Requirements
---------------------
* LUCENE-8738: Move to Java 11 as minimum Java version.
(Adrien Grand, Uwe Schindler)
API Changes
---------------------
* LUCENE-8638: Remove many deprecated methods and classes including FST.lookupByOutput(),
LegacyBM25Similarity and Jaspell suggester.
* LUCENE-8982: Separate out native code to another module to allow cpp
build with gradle. This also changes the name of the native "posix-support"
library to LuceneNativeIO. (Zachary Chen, Dawid Weiss)
* LUCENE-9562: All binary analysis packages (and corresponding