Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Search suffix tree implementation #48652
Search suffix tree implementation #48652
Changes from 61 commits
5425082
54a7b60
8622670
01162fe
09e8aa7
fa81e13
e33142f
30424a9
1d11ed2
b502edc
c0e38b8
16e4078
955cdcb
d7c8d88
d07940d
84d3231
eb6b371
6002101
131c303
0a89089
def772b
0806fb0
c660312
24a59b7
146d51b
a2cdaf9
f6dbecf
bda1e34
2508d55
c32d57f
f4812c0
de0b467
c17a95b
0f94c05
fc54faa
96c93e7
81dffdd
25909ba
b85cfd4
5de310c
d51a4c3
9e9b574
4fa8390
8e583cb
56a1e8a
309556a
f7528f4
8b5b77f
a6b7939
8b41e88
4e273df
57af9b1
8b26a1f
af43c93
615a237
2ededdd
02f562d
e894a7f
54dcea8
8bb5b1e
b331193
18de46d
a91ffc9
8277d37
d1d028a
e829066
3b69cf4
8951708
4c4f1dc
050a320
017e9f6
4d5ad3d
1dc2a5e
fdac05f
e8fdd5c
418e2cd
49692fd
1f90a13
ee553da
29a1934
1aec94a
55acdc5
a1103a3
41207c6
736650f
4d115ed
35b2555
6f5be58
34d2dbd
6160173
fabe796
3539a57
fdf63af
fa68c64
162ad83
fd76f43
7c44fa7
29e5288
de7f3b5
f757fd2
59d2562
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we exceed this? Or is that pretty unlikely? What happens if we do?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question. I made this number to be based on the most extreme case i have seen so far (the search string generated for ten thousands of reports and personal details was below 400.000, don't remember the exact number anymore).
In case the search string is longer, we'd loose search results. I think no error would be thrown, but if the user you are searching for is on position 400.010 you wouldn't be able to find that person.
Let me brainstorm quickly with Szymon if we can do better!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We think it would be best to implement a resize mechanism. If we hit the limit, we'd resize the arrays to be bigger. However, that comes with a performance overhead. So we should pick a default size that makes sense for most users (400_00 might actually be a bit too much for the average case, but for some extreme cases it might be too little).
However, as this isn't directly breaking anything, would you be fine with us tweaking this in a follow up PR? I think the Pr is already quite huge and hard to understand
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Another solution for performance improvements on this could also be to store the array's arraybuffer in mmkv and rehydrate from there [mmkv has first class support for ArrayBuffers], but
again, it would be better to experiment with that in a follow up PR - i can open a follow up issue ticket, i think that would be best?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's fine to keep it as 400K for now, but have a follow-up issue to make it resizable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, thanks! I think this is something we can wait and see about - my initial thinking is that we still don't have any pagination for either personal details or reports. And we have also spent a lot of time tweaking things for some very high capacity usage.
Though, it was hard for me to tell how many personal details or reports we are talking about here when we'd be hitting that limit. If it's over 10k reports then nobody actually has that AFAIK.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it should never ever happen then why is the code necessary ?
Maybe we say something less generic? Like,
Oh no. Ask Hanno how this code works.
Just kidding, but seriously maybe "Search failed due to range error" or something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with more specific errors. Considering right now we have 2 with the same text, it would be hard to get to know where it failed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, let me update this. Actually this might be removable completely. Its a remnant of when the dataSetIndex could be
undefined
, which we knew could never happen and we only added that statement to satisfy typescript.However, i might keep the checks in place because .... you never know, right
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense to call this variable
internalNodes
? Is this variable representing all non-leaf nodes in the tree?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason we need to use
-1
as the default for "no transition" instead of0
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could use 0 but then we would need to move all the nodes as root is marked as 0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I don't really get what you're saying. I'm just assuming that since this is the largest array of nodes, avoiding having to fill them all with
-1
would be better. Also it would mean that this can beUInt32Array
instead ofInt32Array
(though I don't know if that matters at all for performance)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why
maxNodes * 4
?