Search suffix tree implementation #48652

Merged
merged 101 commits into from
Oct 17, 2024

Conversation

hannojg
Contributor

@hannojg hannojg commented Sep 5, 2024

Details

This PR aims to improve local search speed.
On a lower-end Android phone with a high-traffic account, a search currently takes around 344ms in the release build.
Our proposed suffix tree search implementation takes 0.14ms for a search on that same device.
That is roughly a 2,457x improvement.

Remaining todos:
  • @Szymon20000 Investigate the size of N. Initially it was 1_000_000; I set it to 25_000 because otherwise allocating the memory took a very long time. Can we approximate N? Can we use array buffers here?
  • @Szymon20000 We need to support numbers as well. There are users whose ID is, for example, a phone number, and searching for that phone number is currently not possible
  • @Szymon20000 We need to support Unicode characters other than a-z (or need a way to handle them properly), for example Spanish or French letters. One example: if we migrate the emoji trie to also use our suffix tree, the emoji keywords in Spanish have special characters such as ñ
  • The faster recursive depth search function is currently broken
  • There seems to be a bug where the tree gets recalculated while we are searching, i.e. the input data doesn't appear to be stable (could be due to React strict mode, but I think that shouldn't matter here). It would be good to investigate why that is (could be a separate ticket / PR) <- It happens because the first render gives personalDetails and recentReports as empty arrays, and the memo is recalculated when we get the actual data. useMemo is not called twice in StrictMode - that applies only to useEffect. The update during re-render happens because we send an HTTP request which can update reports in Onyx, which leads to a searchOptions recalculation
  • We need to clean up the suffix tree implementation. Right now there is a lot of code in the ChatFinderPage which we should hide in our implementation, such as the delimiter chars, the stringToArray function, and the manual mapping from indexes to OptionData
  • Add all search values for personal details and reports into the search string for the tree
  • Add the userToInvite case back, for now I skipped implementing this (I deleted the code, but I believe we need to keep the functionality)
  • Properly test and finish this PR
    • Marc produces different search output (1d11ed2)
    • Taras can not be found
  • Make a follow up issue to propose to replace the emoji trie implementation with our suffix tree

Fixed Issues

$ #46591
PROPOSAL:

Tests

  • Open the chat finder page (search icon on the start page when authenticated)
  • Perform searches, confirm that everything is found, make sure to also search for numbers
  • Search for an email address (e.g. [email protected]) and make sure it's found. Then search for hannomargeloio (removing all special chars) and make sure it's still found
  • You might want to open the production version and compare your search results with the dev version to make sure you got the same results
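
The email step above relies on the indexed value and the query being normalized the same way. A minimal sketch of that idea, assuming special characters are simply dropped on both sides. normalizeForSearch and matchesQuery are hypothetical names, and a plain substring check stands in for the suffix tree lookup.

```typescript
// Lowercases the value and drops everything except a-z and 0-9.
function normalizeForSearch(value: string): string {
    return value.toLowerCase().replace(/[^a-z0-9]/g, '');
}

// Substring check used here as a stand-in for the tree search.
function matchesQuery(indexedValue: string, query: string): boolean {
    const needle = normalizeForSearch(query);
    if (needle.length === 0) {
        return false; // nothing searchable is left after normalization
    }
    return normalizeForSearch(indexedValue).indexOf(needle) !== -1;
}
```

With a placeholder address such as jane.doe@example.com, both the original spelling and janedoeexamplecom would match under this scheme.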

Offline tests

Same as testing steps

QA Steps

Same as testing steps

PR Author Checklist

  • I linked the correct issue in the ### Fixed Issues section above
  • I wrote clear testing steps that cover the changes made in this PR
    • I added steps for local testing in the Tests section
    • I added steps for the expected offline behavior in the Offline steps section
    • I added steps for Staging and/or Production testing in the QA steps section
    • I added steps to cover failure scenarios (i.e. verify an input displays the correct error message if the entered data is not correct)
    • I turned off my network connection and tested it while offline to ensure it matches the expected behavior (i.e. verify the default avatar icon is displayed if app is offline)
    • I tested this PR with a High Traffic account against the staging or production API to ensure there are no regressions (e.g. long loading states that impact usability).
  • I included screenshots or videos for tests on all platforms
  • I ran the tests on all platforms & verified they passed on:
    • Android: Native
    • Android: mWeb Chrome
    • iOS: Native
    • iOS: mWeb Safari
    • MacOS: Chrome / Safari
    • MacOS: Desktop
  • I verified there are no console errors (if there's a console error not related to the PR, report it or open an issue for it to be fixed)
  • I followed proper code patterns (see Reviewing the code)
    • I verified that any callback methods that were added or modified are named for what the method does and never what callback they handle (i.e. toggleReport and not onIconClick)
    • I verified that the left part of a conditional rendering a React component is a boolean and NOT a string, e.g. myBool && <MyComponent />.
    • I verified that comments were added to code that is not self explanatory
    • I verified that any new or modified comments were clear, correct English, and explained "why" the code was doing something instead of only explaining "what" the code was doing.
    • I verified any copy / text shown in the product is localized by adding it to src/languages/* files and using the translation method
      • If any non-english text was added/modified, I verified the translation was requested/reviewed in #expensify-open-source and it was approved by an internal Expensify engineer. Link to Slack message:
    • I verified all numbers, amounts, dates and phone numbers shown in the product are using the localization methods
    • I verified any copy / text that was added to the app is grammatically correct in English. It adheres to proper capitalization guidelines (note: only the first word of header/labels should be capitalized), and is either coming verbatim from figma or has been approved by marketing (in order to get marketing approval, ask the Bug Zero team member to add the Waiting for copy label to the issue)
    • I verified proper file naming conventions were followed for any new files or renamed files. All non-platform specific files are named after what they export and are not named "index.js". All platform-specific files are named for the platform the code supports as outlined in the README.
    • I verified the JSDocs style guidelines (in STYLE.md) were followed
  • If a new code pattern is added I verified it was agreed to be used by multiple Expensify engineers
  • I followed the guidelines as stated in the Review Guidelines
  • I tested other components that can be impacted by my changes (i.e. if the PR modifies a shared library or component like Avatar, I verified the components using Avatar are working as expected)
  • I verified all code is DRY (the PR doesn't include any logic written more than once, with the exception of tests)
  • I verified any variables that can be defined as constants (ie. in CONST.js or at the top of the file that uses the constant) are defined as such
  • I verified that if a function's arguments changed that all usages have also been updated correctly
  • If any new file was added I verified that:
    • The file has a description of what it does and/or why is needed at the top of the file if the code is not self explanatory
  • If a new CSS style is added I verified that:
    • A similar style doesn't already exist
    • The style can't be created with an existing StyleUtils function (i.e. StyleUtils.getBackgroundAndBorderStyle(theme.componentBG))
  • If the PR modifies code that runs when editing or sending messages, I tested and verified there is no unexpected behavior for all supported markdown - URLs, single line code, code blocks, quotes, headings, bold, strikethrough, and italic.
  • If the PR modifies a generic component, I tested and verified that those changes do not break usages of that component in the rest of the App (i.e. if a shared library or component like Avatar is modified, I verified that Avatar is working as expected in all cases)
  • If the PR modifies a component related to any of the existing Storybook stories, I tested and verified all stories for that component are still working as expected.
  • If the PR modifies a component or page that can be accessed by a direct deeplink, I verified that the code functions as expected when the deeplink is used - from a logged in and logged out account.
  • If the PR modifies the UI (e.g. new buttons, new UI components, changing the padding/spacing/sizing, moving components, etc) or modifies the form input styles:
    • I verified that all the inputs inside a form are aligned with each other.
    • I added Design label and/or tagged @Expensify/design so the design team can review the changes.
  • If a new page is added, I verified it's using the ScrollView component to make it scrollable when more elements are added to the page.
  • If the main branch was merged into this PR after a review, I tested again and verified the outcome was still expected according to the Test steps.

Screenshots/Videos

Android: Native
Screen_Recording_20240927_212736_New.Expensify.Dev.mp4
Android: mWeb Chrome
Screen_Recording_20240927_211510_Chrome.mp4
iOS: Native
Screen.Recording.2024-09-27.at.21.28.56.mov
iOS: mWeb Safari
ScreenRecording_09-27-2024.21-34-17_1.MP4
MacOS: Chrome / Safari
Screen.Recording.2024-09-27.at.21.08.10.mov
MacOS: Desktop
Screen.Recording.2024-09-27.at.21.28.10.mov

@hannojg
Contributor Author

hannojg commented Sep 5, 2024

cc @kirillzyusko

@kirillzyusko kirillzyusko force-pushed the perf/search-suffix-ukkonen-tree branch from 2b586a8 to 01162fe Compare September 11, 2024 15:39
@kirillzyusko
Contributor

kirillzyusko commented Sep 13, 2024

I'll copy the TODOs and paste them here, since I took this PR over but can't edit the original post, so I will track my progress in this comment:

Remaining todos:

  • @Szymon20000 Investigate the size of N. Initially it was 1_000_000; I set it to 25_000 because otherwise allocating the memory took a very long time. Can we approximate N? Can we use array buffers here?
  • @Szymon20000 We need to support numbers as well. There are users whose ID is, for example, a phone number, and searching for that phone number is currently not possible
  • @Szymon20000 We need to support Unicode characters other than a-z (or need a way to handle them properly), for example Spanish or French letters. One example: if we migrate the emoji trie to also use our suffix tree, the emoji keywords in Spanish have special characters such as ñ
  • We probably want to have an "update" function for the tree when for example the personal details are updated (instead of recreating the tree from scratch) - this needs to be added to the tree implementation as well @Szymon20000
  • There seems to be a bug where the tree gets recalculated while we are searching, i.e. the input data doesn't appear to be stable (could be due to React strict mode, but I think that shouldn't matter here). It would be good to investigate why that is (could be a separate ticket / PR) <- It happens because the first render gives personalDetails and recentReports as empty arrays, and the memo is recalculated when we get the actual data. useMemo is not called twice in StrictMode - that applies only to useEffect. The update during re-render happens because we send an HTTP request which can update reports in Onyx, which leads to a searchOptions recalculation
  • We need to clean up the suffix tree implementation. Right now there is a lot of code in the ChatFinderPage which we should hide in our implementation, such as the delimiter chars, the stringToArray function, and the manual mapping from indexes to OptionData
  • Add all search values for personal details and reports into the search string for the tree
  • Add the userToInvite case back, for now I skipped implementing this (I deleted the code, but I believe we need to keep the functionality)
  • Properly test and finish this PR
  • Marc produces different search output (1d11ed2)
  • Taras can not be found
  • Make a follow up issue to propose to replace the emoji trie implementation with our suffix tree

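The cleanup todo above mentions delimiter chars and a manual mapping from indexes back to OptionData. A minimal sketch of how that bookkeeping could be hidden behind one small API, assuming all search values are joined into a single string with a delimiter. String.prototype.indexOf is used as a stand-in for the suffix tree lookup, and Searchable, SEARCH_DELIMITER, buildSearchIndex, findItemIndex and searchItems are hypothetical names, not the ones used in this PR.

```typescript
type Searchable = {id: string; text: string};
type SearchIndex = {haystack: string; starts: number[]};

// Hypothetical delimiter; it must never occur in the indexed text.
const SEARCH_DELIMITER = '|';

// Joins all texts into one string and records where each item starts.
function buildSearchIndex(items: Searchable[]): SearchIndex {
    let haystack = '';
    const starts: number[] = [];
    for (const item of items) {
        starts.push(haystack.length);
        haystack += item.text.toLowerCase() + SEARCH_DELIMITER;
    }
    return {haystack, starts};
}

// Binary search for the last item whose start offset is <= position.
function findItemIndex(starts: number[], position: number): number {
    let low = 0;
    let high = starts.length - 1;
    while (low < high) {
        const mid = Math.ceil((low + high) / 2);
        if (starts[mid] <= position) {
            low = mid;
        } else {
            high = mid - 1;
        }
    }
    return low;
}

function searchItems(items: Searchable[], index: SearchIndex, query: string): Searchable[] {
    const needle = query.toLowerCase();
    // A needle containing the delimiter could match across two items, so reject it.
    if (needle.length === 0 || needle.indexOf(SEARCH_DELIMITER) !== -1) {
        return [];
    }
    const matched: boolean[] = items.map(() => false);
    let position = index.haystack.indexOf(needle);
    while (position !== -1) {
        matched[findItemIndex(index.starts, position)] = true;
        position = index.haystack.indexOf(needle, position + 1);
    }
    // Filtering the original array keeps the input order and removes duplicates.
    return items.filter((_, i) => matched[i]);
}
```

Callers would only see buildSearchIndex and searchItems; the delimiter and the offset table stay internal, which is the kind of separation the todo asks for.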
@melvin-bot melvin-bot bot requested a review from marcaaron October 10, 2024 17:15
@danieldoglas
Contributor

@marcaaron all yours for a last look and merge!

marcaaron
marcaaron previously approved these changes Oct 10, 2024
Contributor

@marcaaron marcaaron left a comment

LGTM 👍

@marcaaron
Contributor

Re-running the perf test - I would first assume it failed for some other reason.

@marcaaron
Contributor

Hmm still failing.

@hannojg
Contributor Author

hannojg commented Oct 11, 2024

Sorry for not posting this earlier: the failed performance test has been discussed here. I think the conclusion was to do nothing in this PR and let Fabio handle it (the issue comes from switching to useOnyx)

@danieldoglas
Contributor

@hannojg can you please solve the conflicts here?

@marcaaron once the conflicts are solved, do you think we can merge this? I agree that if the performance issue is with the change to useOnyx, we should treat it separately from this PR

@marcaaron
Contributor

Yes, but I am waiting for confirmation about how the performance tests work. I am unsure whether merging this means that all App PRs will start to fail. Asked about it here.

@hannojg hannojg dismissed stale reviews from marcaaron and danieldoglas via f757fd2 October 17, 2024 07:31
@marcaaron marcaaron merged commit 91c6e0c into Expensify:main Oct 17, 2024
16 checks passed
@OSBotify
Contributor

✋ This PR was not deployed to staging yet because QA is ongoing. It will be automatically deployed to staging after the next production release.

Contributor

🚀 Deployed to staging by https://github.com/marcaaron in version: 9.0.51-1 🚀

platform result
🤖 android 🤖 success ✅
🖥 desktop 🖥 success ✅
🍎 iOS 🍎 success ✅
🕸 web 🕸 success ✅

@IuliiaHerets

This PR is failing for Android because of issue #51123

Contributor

🚀 Deployed to production by https://github.com/yuwenmemon in version: 9.0.51-4 🚀

platform result
🤖 android 🤖 success ✅
🖥 desktop 🖥 success ✅
🍎 iOS 🍎 failure ❌
🕸 web 🕸 success ✅

Contributor

🚀 Deployed to production by https://github.com/yuwenmemon in version: 9.0.51-4 🚀

platform result
🤖 android 🤖 success ✅
🖥 desktop 🖥 success ✅
🍎 iOS 🍎 success ✅
🕸 web 🕸 success ✅
