- Added f64 field. Internally reuse u64 code the same way i64 does (@fdb-hiroshima)
- Various bugfixes in the query parser.
    - Better handling of hyphens in query parser. (#609)
    - Better handling of whitespaces.
- Closes #498 - add support for Elastic-style unbounded range queries for alphanumeric types, e.g. "title:>hello", "weight:>=70.5", "height:<200" (@petr-tik)
- API change around `Box<BoxableTokenizer>`. See detail in #629
- Avoid rebuilding Regex automaton whenever a regex query is reused. #630 (@brainlock)
- `Box<dyn BoxableTokenizer>` has been replaced by a `BoxedTokenizer` struct.
- Regex are now compiled when the `RegexQuery` instance is built. As a result, it can now return an error and handling the `Result` is required.
- Closes #544. A few users experienced problems with the directory watching system. Avoid watching the mmap directory until someone effectively creates a reader that uses this functionality.
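
The Elastic-style unbounded range syntax listed above (`title:>hello`, `weight:>=70.5`, `height:<200`) can be illustrated with a small standalone sketch. This is not tantivy's query parser: `Comparison` and `parse_unbounded_range` are hypothetical names introduced here, and the function only splits a clause into field, operator, and raw value.

```rust
/// Comparison operator of an Elastic-style unbounded range clause.
/// Hypothetical type for illustration; not part of tantivy's API.
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum Comparison {
    Greater,
    GreaterOrEqual,
    Less,
    LessOrEqual,
}

/// Splits a clause such as `weight:>=70.5` into (field, operator, value).
/// Returns `None` when the clause is not an unbounded range.
pub fn parse_unbounded_range(clause: &str) -> Option<(&str, Comparison, &str)> {
    let (field, rest) = clause.split_once(':')?;
    // Two-character operators must be tested before their one-character prefix.
    let (op, value) = if let Some(v) = rest.strip_prefix(">=") {
        (Comparison::GreaterOrEqual, v)
    } else if let Some(v) = rest.strip_prefix("<=") {
        (Comparison::LessOrEqual, v)
    } else if let Some(v) = rest.strip_prefix('>') {
        (Comparison::Greater, v)
    } else if let Some(v) = rest.strip_prefix('<') {
        (Comparison::Less, v)
    } else {
        return None;
    };
    // Reject clauses with a missing field name or a missing bound.
    if field.is_empty() || value.is_empty() {
        return None;
    }
    Some((field, op, value))
}
```

Calling `parse_unbounded_range("weight:>=70.5")` yields the field `weight`, the operator `GreaterOrEqual`, and the untyped value `70.5`. Interpreting the value according to the field type is left out of this sketch.
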
Tantivy 0.10.0 index format is compatible with the index format in 0.9.0.
- Added an API to easily tweak or entirely replace the default score. See `TopDocs::tweak_score` and `TopScore::custom_score` (@pmasurel)
- Added an ASCII folding filter (@drusellers)
- Bugfix in `query.count` in presence of deletes (@pmasurel)
- Added `.explain(...)` in `Query` and `Weight` (@pmasurel)
- Added an efficient way to `delete_all_documents` in `IndexWriter` (@petr-tik). All segments are simply removed.
- Switched to Rust 2018 (@uvd)
- Small simplification of the code. Calling .freq() or .doc() when .advance() has never been called on segment postings should panic from now on.
- Tokens exceeding `u16::max_value() - 4` chars are discarded silently instead of panicking.
- Fast fields are now preloaded when the `SegmentReader` is created.
- `IndexMeta` is now public. (@hntd187)
- `IndexWriter` `add_document`, `delete_term`. `IndexWriter` is `Sync`, making it possible to use it with an `Arc<RwLock<IndexWriter>>`. `add_document` and `delete_term` only require a read lock. (@pmasurel)
- Introducing `Opstamp` as an expressive type alias for `u64`. (@petr-tik)
- Stamper now relies on `AtomicU64` on all platforms (@petr-tik)
- Bugfix - Files get deleted slightly earlier
- Compilation resources improved (@fdb-hiroshima)
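
The `Arc<RwLock<IndexWriter>>` pattern mentioned above can be sketched with the standard library alone. `ToyWriter` is a hypothetical stand-in, not tantivy's `IndexWriter`; the sketch assumes that adding a document takes `&self` while committing takes `&mut self`, and it uses an `AtomicU64` counter to hand out `Opstamp` values.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::{Arc, Mutex, RwLock};
use std::thread;

/// Expressive alias, mirroring the `Opstamp` alias described above.
pub type Opstamp = u64;

/// Hypothetical stand-in for an index writer; NOT tantivy's `IndexWriter`.
pub struct ToyWriter {
    stamper: AtomicU64,
    pending: Mutex<Vec<(Opstamp, String)>>,
}

impl ToyWriter {
    pub fn new() -> Self {
        ToyWriter {
            stamper: AtomicU64::new(0),
            pending: Mutex::new(Vec::new()),
        }
    }

    /// Takes `&self`, so a read lock on the surrounding `RwLock` is enough.
    pub fn add_document(&self, doc: &str) -> Opstamp {
        let opstamp = self.stamper.fetch_add(1, Ordering::SeqCst);
        self.pending.lock().unwrap().push((opstamp, doc.to_string()));
        opstamp
    }

    /// Takes `&mut self`, so callers need the write lock.
    /// Returns the number of documents that were pending.
    pub fn commit(&mut self) -> usize {
        let pending = self.pending.get_mut().unwrap();
        let count = pending.len();
        pending.clear();
        count
    }
}

/// Adds `docs_per_thread` documents from each of `num_threads` threads,
/// then commits once and returns the number of committed documents.
pub fn index_concurrently(num_threads: usize, docs_per_thread: usize) -> usize {
    let writer = Arc::new(RwLock::new(ToyWriter::new()));
    let handles: Vec<_> = (0..num_threads)
        .map(|t| {
            let writer = Arc::clone(&writer);
            thread::spawn(move || {
                for i in 0..docs_per_thread {
                    // Many threads may hold the read lock at the same time.
                    writer.read().unwrap().add_document(&format!("doc {}-{}", t, i));
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    // Exclusive access is only needed for the commit.
    let committed = writer.write().unwrap().commit();
    committed
}
```

Because the indexing threads only take the read lock, they do not block each other on the `RwLock`; the write lock is taken once, for the commit.
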
Your program should be usable as is.
Fast fields used to be accessed directly from the `SegmentReader`.
The API changed: you are now required to acquire your fast field reader via `segment_reader.fast_fields()`, and use one of the typed methods:
- `.u64()`, `.i64()` if your field is single-valued;
- `.u64s()`, `.i64s()` if your field is multi-valued;
- `.bytes()` if your field is a bytes fast field.
0.9.0 index format is not compatible with the previous index format.
- MAJOR BUGFIX: Some `Mmap` objects were being leaked, and would never get released. (@fulmicoton)
- Removed most unsafe (@fulmicoton)
- Indexer memory footprint improved (VInt comp, inlining the first block). (@fulmicoton)
- Stemming in other languages possible (@pentlander)
- Segments with no docs are deleted earlier (@barrotsteindev)
- Added grouped add and delete operations. They are guaranteed to happen together (i.e. they cannot be split by a commit). In addition, adds are guaranteed to happen on the same segment. (@elbow-jason)
- Removed `INT_STORED` and `INT_INDEXED`. It is now possible to use `STORED` and `INDEXED` for int fields. (@fulmicoton)
- Added DateTime field (@barrotsteindev)
- Added IndexReader. By default, index is reloaded automatically upon new commits (@fulmicoton)
- SIMD linear search within blocks (@fulmicoton)
tantivy 0.9 brought some API breaking changes. To update from tantivy 0.8, you will need to go through the following steps.

- `schema::INT_INDEXED` and `schema::INT_STORED` should be replaced by `schema::INDEXED` and `schema::STORED`.
- The index no longer holds the pool of searchers. You are required to create an intermediary object called `IndexReader` for this.

    ```rust
    // Create the reader. You typically need to create 1 reader for the entire
    // lifetime of your program.
    let reader = index.reader()?;

    // Acquiring a searcher (previously `index.searcher()`) is now written:
    let searcher = reader.searcher();

    // With the default setting of the reader, you are not required to
    // call `index.load_searchers()` anymore.
    //
    // The IndexReader will pick up that change automatically, regardless
    // of whether the update was done in a different process or not.
    // If this behavior is not wanted, you can create your reader with
    // the `ReloadPolicy::Manual`, and manually decide when to reload the index
    // by calling `reader.reload()?`.
    ```

Fixing build for x86_64 platforms. (#496) No need to update from 0.8.1 if tantivy is building on your platform.
Hotfix of #476.
Merge was reflecting deletes before commit was passed. Thanks @barrotsteindev for reporting the bug.
No change in the index format
- API Breaking change in the collector API. (@jwolfe, @fulmicoton)
- Multithreaded search (@jwolfe, @fulmicoton)
No change in the index format
- Bugfix: NGramTokenizer panics on non ascii chars
- Added a space usage API
- Skip data for doc ids and positions (@fulmicoton), greatly improving performance
- Tantivy errors now rely on the failure crate (@drusellers)
- Added support for `AND`, `OR`, `NOT` syntax in addition to the `+`, `-` syntax
- Added a snippet generator with highlight (@vigneshsarma, @fulmicoton)
- Added a `TopFieldCollector` (@pentlander)
- Bugfix #324. GC was removing files that were still in use
- Added support for parsing AllQuery and RangeQuery via QueryParser
    - AllQuery: `*`
    - RangeQuery:
        - Inclusive `field:[startIncl to endIncl]`
        - Exclusive `field:{startExcl to endExcl}`
        - Mixed `field:[startIncl to endExcl}` and vice versa
        - Unbounded `field:[start to *]`, `field:[* to end]`
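
The bracket conventions listed above (square bracket for an inclusive bound, curly brace for an exclusive bound, `*` for no bound) can be made concrete with a small standalone sketch. It is not the `QueryParser` implementation: `RangeBound` and `parse_range` are hypothetical names, and the function only handles the part of the clause that follows `field:`.

```rust
/// One end of a range. Hypothetical type for illustration only.
#[derive(Debug, PartialEq)]
pub enum RangeBound {
    Inclusive(String),
    Exclusive(String),
    Unbounded,
}

/// Parses the part after `field:`, e.g. `[10 to 20}` or `[* to end]`.
/// Returns `(lower, upper)`, or `None` if the text is not a range.
pub fn parse_range(text: &str) -> Option<(RangeBound, RangeBound)> {
    let text = text.trim();
    let lower_inclusive = match text.chars().next()? {
        '[' => true,
        '{' => false,
        _ => return None,
    };
    let upper_inclusive = match text.chars().last()? {
        ']' => true,
        '}' => false,
        _ => return None,
    };
    // Both delimiters are ASCII, so one byte is removed at each end.
    let inner = text.get(1..text.len() - 1)?;
    let (start, end) = inner.split_once(" to ")?;
    // `*` means "no bound"; anything else inherits its bracket's inclusivity.
    let bound = |value: &str, inclusive: bool| match value.trim() {
        "" => None,
        "*" => Some(RangeBound::Unbounded),
        v if inclusive => Some(RangeBound::Inclusive(v.to_string())),
        v => Some(RangeBound::Exclusive(v.to_string())),
    };
    Some((bound(start, lower_inclusive)?, bound(end, upper_inclusive)?))
}
```

For the mixed form `[10 to 20}`, the sketch returns an inclusive lower bound `10` and an exclusive upper bound `20`; for `[* to end]`, the lower bound is `Unbounded`.
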
Special thanks to @drusellers and @jason-wolfe for their contributions to this release!
- Removed C code. Tantivy is now pure Rust. (@pmasurel)
- BM25 (@pmasurel)
- Approximate field norms encoded over 1 byte. (@pmasurel)
- Compiles on stable rust (@pmasurel)
- Add &[u8] fastfield for associating arbitrary bytes to each document (@jason-wolfe) (#270)
    - Completely uncompressed
    - Internally: One u64 fast field for indexes, one fast field for the bytes themselves.
- Add NGram token support (@drusellers)
- Add Stopword Filter support (@drusellers)
- Add a FuzzyTermQuery (@drusellers)
- Add a RegexQuery (@drusellers)
- Various performance improvements (@pmasurel)
- bugfix #274
- bugfix #280
- bugfix #289
- bugfix #254 : tantivy failed if no documents in a segment contained a specific field.
- Faceting
- RangeQuery
- Configurable tokenization pipeline
- Bugfix in PhraseQuery
- Various query optimisations
- Allowing very large indexes
    - 64 bits file address
    - Smarter encoding of the `TermInfo` objects
- Bugfix race condition when deleting files. (#198)
- Prevent usage of AVX2 instructions (#201)
- Bugfix for non-indexed fields. (#199)
- Raise the limit of number of fields (previously 256 fields) (@fulmicoton)
- Removed u32 fields. They are replaced by u64 and i64 fields (#65) (@fulmicoton)
- Optimized skip in SegmentPostings (#130) (@lnicola)
- Replacing rustc_serialize by serde. Kudos to @KodrAus and @lnicola
- Using error-chain (@KodrAus)
- QueryParser: (@fulmicoton)
    - Explicit error returned when searching for a term that is not indexed
    - Searching for an int term via the query parser was broken `(age:1)`
    - Searching for a non-indexed field returns an explicit Error
    - Phrase queries for non-tokenized fields are not tokenized by the query parser.
- Faster/Better indexing (@fulmicoton)
    - using murmurhash2
    - faster merging
    - more memory efficient fast field writer (@lnicola)
    - better handling of collisions
    - lesser memory usage
- Added API, most notably to iterate over ranges of terms (@fulmicoton)
- Fixed a bug that prevented segment files from being unmapped on index drop (@fulmicoton)
- Made the doc! macro public (@fulmicoton)
- Added an alternative implementation of the streaming dictionary (@fulmicoton)
- Expose a method to trigger files garbage collection
Special thanks to @KodrAus @lnicola @Ameobea @manuel-woelker @celaus for their contributions to this release.
Thanks also to everyone in tantivy gitter chat for their advice and company :)
https://gitter.im/tantivy-search/tantivy
Warning:
Tantivy 0.3 is NOT backward compatible with tantivy 0.2 code and index format. You should not expect backward compatibility before tantivy 1.0.
- Delete. You can now delete documents from an index.
- Support for windows (Thanks to @lnicola)
- Added CI for Windows (https://ci.appveyor.com/project/fulmicoton/tantivy) Thanks to @KodrAus ! (#108)
- Various dependency version updates (Thanks to @Ameobea) #76
- Fixed several race conditions in `Index.wait_merge_threads`
- Fixed #72. Mmap were never released.
- Fixed #80. Fast field used to take an amplitude of 32 bits after a merge. (Ouch!)
- Fixed #92. u32 are now encoded using big endian in the fst in order to make their enumeration consistent with the natural ordering.
- Building binary targets for tantivy-cli (Thanks to @KodrAus)
- Misc invisible bug fixes, and code cleanup.
- Use