Skip to content

Commit

Permalink
[lib] Use RadixTree in SearchIndex
Browse files Browse the repository at this point in the history
Summary:
This diff actually uses the `RadixTree` introduced in the parent diff from the `SearchIndex` code.

Linear tasks: [ENG-5137](https://linear.app/comm/issue/ENG-5137/change-searchindex-to-radixtree-or-compressed-trie) and [ENG-5480](https://linear.app/comm/issue/ENG-5480/investigate-chatmentioncontextprovider-tti-regression)

Depends on D9626

Test Plan:
1. I tested the chat mentions experience
2. I did some perf testing:

In combination with the previous diff, I used [this patch](https://gist.github.com/Ashoat/fc1c91a61009de0e9959527454be8236) to test performance before and after this change. I made sure I had at least three samples of each scenario. Will also link my [messy Gist of results](https://gist.github.com/Ashoat/b871afaaaee10b435b8676175d120d53), but it's not really interpretable by anyone other than me.

Here's the relevant portion:

```
BEFORE

 LOG  useChatMentionSearchIndex took 1801ms
 LOG  useChatMentionSearchIndex took 1748ms
 LOG  useChatMentionSearchIndex took 1730ms
 LOG  useChatMentionSearchIndex took 1831ms

 AVERAGE 1777.5ms

JUST DEDUP (parent diff)

 LOG  useChatMentionSearchIndex took 1027ms
 LOG  useChatMentionSearchIndex took 949ms
 LOG  useChatMentionSearchIndex took 957ms

 AVERAGE 977.7ms

DEDUP + RADIX TREE

 LOG  useChatMentionSearchIndex took 643ms
 LOG  useChatMentionSearchIndex took 629ms
 LOG  useChatMentionSearchIndex took 651ms
 LOG  useChatMentionSearchIndex took 609ms

 AVERAGE 633ms

JUST RADIX TREE

 LOG  useChatMentionSearchIndex took 1394ms
 LOG  useChatMentionSearchIndex took 1468ms
 LOG  useChatMentionSearchIndex took 1511ms
 LOG  useChatMentionSearchIndex took 1492ms
 LOG  useChatMentionSearchIndex took 1397ms

 AVERAGE 1452.4ms
```

Reviewers: tomek, atul, inka, rohan

Reviewed By: tomek

Subscribers: wyilio

Differential Revision: https://phab.comm.dev/D9627
  • Loading branch information
Ashoat committed Oct 30, 2023
1 parent 7555577 commit c3231dd
Show file tree
Hide file tree
Showing 2 changed files with 13 additions and 30 deletions.
37 changes: 12 additions & 25 deletions lib/shared/search-index.js
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,8 @@

import Tokenizer from 'tokenize-text';

import RadixTree from './radix-tree.js';

type Token = {
index: number,
match: {
Expand All @@ -20,30 +22,14 @@ const defaultTokenize: TokenizeFunc = new Tokenizer().words();

class SearchIndex {
tokenize: (str: string) => Token[];
fullTextIndex: { [token: string]: Set<string> };
partialTextIndex: { [token: string]: Set<string> };
radixTree: RadixTree<string> = new RadixTree<string>();

constructor(inputTokenize?: TokenizeFunc) {
this.tokenize = inputTokenize ?? defaultTokenize;
this.fullTextIndex = {};
this.partialTextIndex = {};
}

addAllPrefixes(id: string, value: string): void {
if (this.fullTextIndex[value] === undefined) {
this.fullTextIndex[value] = new Set();
}
this.fullTextIndex[value].add(id);
let partialString = '';
for (let i = 0; i < value.length; i++) {
const char = value[i];
partialString += char;
// TODO probably should do some stopwords here
if (this.partialTextIndex[partialString] === undefined) {
this.partialTextIndex[partialString] = new Set();
}
this.partialTextIndex[partialString].add(id);
}
this.radixTree.add(value, id);
}

addEntry(id: string, rawText: string) {
Expand All @@ -63,20 +49,21 @@ class SearchIndex {
const lastKeyword = keywords[keywords.length - 1];
const lastKeywordValue = lastKeyword.value.toLowerCase();
const lastMatchSet = lastKeyword.match.input.match(/\S$/)
? this.partialTextIndex[lastKeywordValue]
: this.fullTextIndex[lastKeywordValue];
if (!lastMatchSet) {
? this.radixTree.getAllMatchingPrefix(lastKeywordValue)
: this.radixTree.getAllMatchingExactly(lastKeywordValue);
if (lastMatchSet.length === 0) {
return [];
}
const fullKeywords = keywords.slice(0, -1).map(k => k.value.toLowerCase());

let possibleMatches: string[] = Array.from(lastMatchSet);
let possibleMatches = lastMatchSet;
for (const keyword of fullKeywords) {
const fullMatches = this.fullTextIndex[keyword];
if (!fullMatches) {
const fullMatches = this.radixTree.getAllMatchingExactly(keyword);
if (fullMatches.length === 0) {
return [];
}
possibleMatches = possibleMatches.filter(id => fullMatches.has(id));
const fullMatchesSet = new Set(fullMatches);
possibleMatches = possibleMatches.filter(id => fullMatchesSet.has(id));
if (possibleMatches.length === 0) {
return [];
}
Expand Down
6 changes: 1 addition & 5 deletions lib/shared/sentence-prefix-search-index.js
Original file line number Diff line number Diff line change
Expand Up @@ -26,11 +26,7 @@ class SentencePrefixSearchIndex extends SearchIndex {
}

getSearchResults(query: string): string[] {
const transformedQuery = query.toLowerCase();
if (this.partialTextIndex[transformedQuery]) {
return Array.from(this.partialTextIndex[transformedQuery]);
}
return [];
return this.radixTree.getAllMatchingPrefix(query.toLowerCase());
}
}

Expand Down

0 comments on commit c3231dd

Please sign in to comment.