Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[$1000] Web - Onyx: out-of-order updates #28737

Closed
2 of 6 tasks
kbecciv opened this issue Oct 3, 2023 · 61 comments
Closed
2 of 6 tasks

[$1000] Web - Onyx: out-of-order updates #28737

kbecciv opened this issue Oct 3, 2023 · 61 comments
Assignees
Labels
Awaiting Payment Auto-added when associated PR is deployed to production Bug Something is broken. Auto assigns a BugZero manager. Daily KSv2 External Added to denote the issue can be worked on by a contributor

Comments

@kbecciv
Copy link

kbecciv commented Oct 3, 2023

If you haven’t already, check out our contributing guidelines for onboarding and email [email protected] to request to join our Slack channel!


Action Performed:

This bug comes from discussion in #26592
TL;DR: Onyx processes the successData and API response in the wrong order on iOS and Android.
Dive deeper:

API.write('AddMembersToWorkspace', params, {optimisticData, successData, failureData});

When adding a member to workspace here, we send the following data:

"optimisticData": {
    "onyxMethod": "set",
    "key": "report_3008335418540078",
    "value": {
        "type": "chat",
        "chatType": "policyExpenseChat",
        "ownerAccountID": 899964195,
        "participantAccountIDs": [
            15664021,
            899964195
        ],
        "policyID": "41A8E180FAEB43DF",
        "reportID": "3008335418540078",
        "reportName": "Chat Report",
        "pendingFields": {
            "createChat": "add"
        },
        "isOptimisticReport": true
        ...
    }
},
"successData": {
    "onyxMethod": "merge",
    "key": "report_3008335418540078",
    "value": {
        "pendingFields": {
            "createChat": null
        },
        "errorFields": {
            "createChat": null
        },
        "isOptimisticReport": false
    }
}

And the API call returns the following onyx merge_collection action:

{
    "jsonCode": 200,
    "requestID": "806fecea0afd3512-WAW",
    "onyxData": [
         {
            "onyxMethod": "mergecollection",
            "key": "report_",
            "value": {
                ...
                "report_3008335418540078": {
                    "reportID": "3008335418540078",
                    "reportName": "",
                    "type": "chat",
                    "chatType": "policyExpenseChat",
                    "ownerEmail": "[[email protected]](mailto:[email protected])",
                    "ownerAccountID": 15671904,
                    "policyID": "41A8E180FAEB43DF",
                    "participants": [
                        "[[email protected]](mailto:[email protected])",
                        "[[email protected]](mailto:[email protected])"
                    ],
                    "participantAccountIDs": [
                        15664021,
                        15671904
                    ],
                    "lastActorAccountID": 15671904,
                },   
            }
        },
    ]
}

It should replace the temporary accountID of the report.ownerAccountID with the real one, which comes from the server.
On the web, the onyx actions are executed in the following order:

  1. persist optimisticData;
  2. persist successData;
  3. persist API response;
    The issue is that on iOS, the onyx actions are executed in the following order:
  4. persist optimisticData;
  5. persist API response;
  6. persist successData;
    The successData holds merge operation of only several certain fields, which shouldn't affect the report much.
    However, here's the tricky part:
    This piece of code, which creates a promise for future merge operation, is executed before Onyx persists the API response:
    https://github.com/Expensify/react-native-onyx/blob/616c4c5286b7a7fadac18d477cd75edd60865b12/lib/Onyx.js#L1136-L1147
    Meaning the following execution chain:
  7. persist optimisticData;
  8. create persistSuccessData promise;
  9. persist API response;
  10. execute persistSuccessData promise;
    This means, at the moment persistSuccessData is created, the existingValue is still not updated, so even though it should update only a couple of fields, the modifiedData here is saved in a state of pre-API-response.
    Then the report is updated by the "persist API response" operation, but gets overwritten by the persistSuccessData operation with the outdated data.
    If we debug step-by-step, you'll notice that the chat name is displayed correctly between the steps 3 and 4 from above.
    As an outcome of this complete operation, the state of Onyx is the following:
  • personalDetailsList contains the new permanent ownerAccountID;
  • report has the temporary ownerAccountID, which is already removed from Onyx.

Expected Result:

Onyx updates are executed in correct order without data being overwritten

Actual Result:

Onyx updates face a race condition and some data is overwritten

Workaround:

Unknown

Platforms:

Which of our officially supported platforms is this issue occurring on?

  • Android / native
  • Android / Chrome
  • iOS / native
  • iOS / Safari
  • MacOS / Chrome / Safari
  • MacOS / Desktop

Version Number: 1.3.76.0
Reproducible in staging?: n/a
Reproducible in production?: n/a
If this was caught during regression testing, add the test name, ID and link from TestRail:
Email or phone of affected tester (no customers):
Logs: https://stackoverflow.com/c/expensify/questions/4856
Notes/Photos/Videos: Any additional supporting documentation

n/a

Expensify/Expensify Issue URL:
Issue reported by: @@paultsimura
Slack conversation: https://expensify.slack.com/archives/C049HHMV9SM/p1696287156456449

View all open jobs on GitHub

Upwork Automation - Do Not Edit
  • Upwork Job URL: https://www.upwork.com/jobs/~013373cff08a2575e9
  • Upwork Job ID: 1709846441891180544
  • Last Price Increase: 2023-10-05
@kbecciv kbecciv added Daily KSv2 Bug Something is broken. Auto assigns a BugZero manager. labels Oct 3, 2023
@melvin-bot
Copy link

melvin-bot bot commented Oct 3, 2023

Triggered auto assignment to @CortneyOfstad (Bug), see https://stackoverflow.com/c/expensify/questions/14418 for more details.

@melvin-bot
Copy link

melvin-bot bot commented Oct 3, 2023

Bug0 Triage Checklist (Main S/O)

  • This "bug" occurs on a supported platform (ensure Platforms in OP are ✅)
  • This bug is not a duplicate report (check E/App issues and #expensify-bugs)
    • If it is, comment with a link to the original report, close the issue and add any novel details to the original issue instead
  • This bug is reproducible using the reproduction steps in the OP. S/O
    • If the reproduction steps are clear and you're unable to reproduce the bug, check with the reporter and QA first, then close the issue.
    • If the reproduction steps aren't clear and you determine the correct steps, please update the OP.
  • This issue is filled out as thoroughly and clearly as possible
    • Pay special attention to the title, results, platforms where the bug occurs, and if the bug happens on staging/production.
  • I have reviewed and subscribed to the linked Slack conversation to ensure Slack/Github stay in sync

@ospfranco
Copy link
Contributor

I believe I found the cause.

When the update function is called, all the operations are fired in an async manner. However, this causes a race condition where certain operations will be written into the DB before others. It's a built-in race condition.

Just for documentation purposes: I suggested we need to make the update operations sequentially await. @marcaaron @tgolen are however worried about the app taking a big performance hit. I will try another solution and report back if I manage to get the app to do reliable updates.

@pradeepmdk
Copy link
Contributor

pradeepmdk commented Oct 5, 2023

I believe I found the cause.

When the update function is called, all the operations are fired in an async manner. However, this causes a race condition where certain operations will be written into the DB before others. It's a built-in race condition.

Just for documentation purposes: I suggested we need to make the update operations sequentially await. @marcaaron @tgolen are however worried about the app taking a big performance hit. I will try another solution and report back if I manage to get the app to do reliable updates.

for note: that is what I already told in my proposal. #26592 (comment)
cc: @Julesssss @ospfranco

@pradeepmdk
Copy link
Contributor

pradeepmdk commented Oct 5, 2023

Another problem in the App side:

Proposal

Please re-state the problem that we are trying to solve in this issue.

Onyx updating on race condition

What is the root cause of that problem?

Screenshot 2023-10-05 at 10 46 05 AM

https://github.com/Expensify/App/assets/16595846/736946e5-183f-4b16-969a-c0ef118ad228
Screenshot 2023-10-05 at 11 26 13 AM

@marcaaron I saw recently it changed to onyx to memory here
#25614

  1. we are flushed out to Onyx once all the api is completed in the queue (networkRequestQueue). I think this will create a performance issue right? ex: here in the queue 2035 is pending after coming online it gets a response object in the queuedOnyxUpdates to 46k object.

  2. Save queued updates to memory only #25614 Due to this change incase the middle of the response I refresh the page means it will never update to onyx.

What changes do you think we should make in order to solve the problem?

flushOnyxUpdatesQueue();

we need to flush out each api success or failure.

cc: @marcaaron @tgolen @ospfranco @mountiny

@Julesssss Julesssss added the External Added to denote the issue can be worked on by a contributor label Oct 5, 2023
@melvin-bot melvin-bot bot changed the title Web - Onyx: out-of-order updates [$500] Web - Onyx: out-of-order updates Oct 5, 2023
@melvin-bot
Copy link

melvin-bot bot commented Oct 5, 2023

Job added to Upwork: https://www.upwork.com/jobs/~013373cff08a2575e9

@melvin-bot melvin-bot bot added the Help Wanted Apply this label when an issue is open to proposals by contributors label Oct 5, 2023
@melvin-bot
Copy link

melvin-bot bot commented Oct 5, 2023

Triggered auto assignment to Contributor-plus team member for initial proposal review - @jjcoffee (External)

@Julesssss Julesssss changed the title [$500] Web - Onyx: out-of-order updates [$1000] Web - Onyx: out-of-order updates Oct 5, 2023
@melvin-bot
Copy link

melvin-bot bot commented Oct 5, 2023

Upwork job price has been updated to $1000

@Julesssss
Copy link
Contributor

@CortneyOfstad bumping the price as this one is pretty critical

@CortneyOfstad
Copy link
Contributor

@jjcoffee any thoughts on the comments/proposals above?

@jjcoffee
Copy link
Contributor

jjcoffee commented Oct 6, 2023

Unsure if this is still reproducible based on this, but also looks like @ospfranco has a PR in the works.

@pradeepmdk
Copy link
Contributor

@jjcoffee #26592 (comment), is fixed . but still have some issues can you check my #28737 (comment)

@melvin-bot melvin-bot bot added the Overdue label Oct 9, 2023
@CortneyOfstad
Copy link
Contributor

Bump @jjcoffee ^^^ Thanks!

@melvin-bot melvin-bot bot removed the Overdue label Oct 9, 2023
@jjcoffee
Copy link
Contributor

jjcoffee commented Oct 9, 2023

Sorry @pradeepmdk could you flesh out your proposal or try to explain it a bit better? I'm finding it hard to follow what new issue you've found.

@ospfranco
Copy link
Contributor

From what I understand API requests responses are placed in a networkRequestQueue, which was before persisted to disk but now it is only kept in memory. If one refreshes the tab, then this queue will be lost (also due to its size it will probably take some time to process). Which could potentially lead to an inconsistent state. IMO this is another issue, not even sure if it is an issue if it doesn't cause incosistent data in the app, and should be in a separate ticket.

my PR is merged and should have already fixed the out-of-order database writes. Let me know if there is anything else.

@Julesssss
Copy link
Contributor

Hi @ospfranco, the PR link you shared just links back to this issue.

@ospfranco
Copy link
Contributor

Oops, sorry, this is the PR

Expensify/react-native-onyx#384

@Julesssss
Copy link
Contributor

Ah, fantastic. So I think we can consider this issue finished once the Onyx lib is bumped.

@Julesssss
Copy link
Contributor

@CortneyOfstad yeah 👍

@CortneyOfstad
Copy link
Contributor

Sounds good @Julesssss!

@paultsimura I've adjusted the offer in Upwork, so please let me know once you accept!

@ospfranco I still need you to link me your Upwork profile so I can send you an offer. Thanks!

@paultsimura
Copy link
Contributor

Thank you @CortneyOfstad, accepted the offer

@CortneyOfstad
Copy link
Contributor

Thanks! Both @paultsimura and @Julesssss have been paid 👍

Will also ping @ospfranco in Slack 👍

@CortneyOfstad
Copy link
Contributor

Actually, can't find them in Slack 🤔

@shubham1206agra
Copy link
Contributor

@Julesssss Can you tell me if this issue is fixed or not? Cause I saw the linked PR and I am not able to understand how changes to withOnyx solved the race condition in Onyx.update.

@Julesssss
Copy link
Contributor

@shubham1206agra there are a couple of linked PRs, can you clarify the question please?

@shubham1206agra
Copy link
Contributor

shubham1206agra commented Dec 8, 2023

I am talking about this PR #29169. My question is how did this PR fix the out-of-order updates?

@shubham1206agra
Copy link
Contributor

@Julesssss Bump on the question ^

@melvin-bot melvin-bot bot added the Overdue label Dec 11, 2023
@Julesssss
Copy link
Contributor

There are explanations in the original and follow up onyx repo PR

@melvin-bot melvin-bot bot added Daily KSv2 Overdue and removed Overdue Daily KSv2 labels Dec 11, 2023
@Julesssss
Copy link
Contributor

Actually, can't find them in Slack 🤔

I found them on Twitter, but I got blocked for typing a message that sounded like a scam 😆

lets see if they get my second message

@melvin-bot melvin-bot bot removed the Overdue label Dec 13, 2023
@ospfranco
Copy link
Contributor

Ah, so actually my PR was reverted, I don't know which steps the team took to fix the issue if any at all. As it had entailed doing a large refactoring of the Onyx internals. I cannot in good conscience accept any payment since I'm no longer helping Margelo with the project. You can forgo any payment you are planning for me, it's fine :)

@melvin-bot melvin-bot bot added the Overdue label Dec 18, 2023
@Julesssss
Copy link
Contributor

@ospfranco thanks, I see. It's hard to tell but it looks like we would have reverted, then resubmitted with a small fix which would still be eligible for payment.

But I'll leave it up to you

@melvin-bot melvin-bot bot added Overdue and removed Overdue labels Dec 18, 2023
@CortneyOfstad
Copy link
Contributor

@ospfranco bump on the comment above ^^^ Thanks!

@melvin-bot melvin-bot bot removed the Overdue label Dec 20, 2023
@Julesssss
Copy link
Contributor

Sounds like he's happy not to. We can always reopen if that changes though!

Copy link

melvin-bot bot commented Dec 23, 2023

⚠️ Looks like this issue was linked to a Deploy Blocker here

If you are the assigned CME please investigate whether the linked PR caused a regression and leave a comment with the results.

If a regression has occurred and you are the assigned CM follow the instructions here.

If this regression could have been avoided please consider also proposing a recommendation to the PR checklist so that we can avoid it in the future.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Awaiting Payment Auto-added when associated PR is deployed to production Bug Something is broken. Auto assigns a BugZero manager. Daily KSv2 External Added to denote the issue can be worked on by a contributor
Projects
None yet
Development

No branches or pull requests

10 participants