Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[HOLD for payment 2024-03-26] [HOLD for payment 2024-03-20] [HOLD for payment 2024-03-13] Investigate a dashboard tool to monitor app performance improvement/degradation after deploys #29274

Closed
muttmuure opened this issue Oct 11, 2023 · 59 comments
Assignees
Labels
Awaiting Payment Auto-added when associated PR is deployed to production NewFeature Something to build that is a new item. ring0 Weekly KSv2

Comments

@muttmuure
Copy link
Contributor

muttmuure commented Oct 11, 2023

The goal of the #newdot-performance project is achieving, maintaining (and protecting) these benchmarks for actions taken in app:

TTI: 3 seconds
Chat switching: 500ms
Sending a message: 100ms
Switching Active Workspace: 300ms

We have made good progress, in that the app is stable and performant on an anecdotal basis, and the wider company can focus on other areas of the roadmap. We would like to solidify this feeling with hard evidence and protect the app from performance regressions.

We would like to begin formally reporting on the performance benchmarks so we know they are being maintained. And we will use this data to protect the app’s benchmarked performance by “time-stamping” when performance regressions originate.

@melvin-bot melvin-bot bot added the Monthly KSv2 label Oct 16, 2023
@mountiny mountiny changed the title [HELD on Reassure Testing] Investigate a dashboard tool to monitor app performance improvement/degradation after deploys Investigate a dashboard tool to monitor app performance improvement/degradation after deploys Oct 24, 2023
@adhorodyski
Copy link
Contributor

@rinej started working on this today - he'll investigate the ways we can utilise json outputs from Reassure to showcase historical data and how we can pipe this through to eg. https://grafana.com/

@adhorodyski
Copy link
Contributor

@mountiny I think we can assign @rinej to this one 🙂

@mountiny
Copy link
Contributor

Grafana would be ideal as we use that already!

@mountiny mountiny self-assigned this Nov 15, 2023
@mountiny mountiny moved this to MEDIUM in [#whatsnext] #quality Nov 15, 2023
@rinej
Copy link
Contributor

rinej commented Nov 15, 2023

@mountiny Hello, today I started investigating the approach, I plan to use grafana

Copy link

melvin-bot bot commented Nov 15, 2023

📣 @rinej! 📣
Hey, it seems we don’t have your contributor details yet! You'll only have to do this once, and this is how we'll hire you on Upwork.
Please follow these steps:

  1. Make sure you've read and understood the contributing guidelines.
  2. Get the email address used to login to your Expensify account. If you don't already have an Expensify account, create one here. If you have multiple accounts (e.g. one for testing), please use your main account email.
  3. Get the link to your Upwork profile. It's necessary because we only pay via Upwork. You can access it by logging in, and then clicking on your name. It'll look like this. If you don't already have an account, sign up for one here.
  4. Copy the format below and paste it in a comment on this issue. Replace the placeholder text with your actual details.
    Screen Shot 2022-11-16 at 4 42 54 PM
    Format:
Contributor details
Your Expensify account email: <REPLACE EMAIL HERE>
Upwork Profile Link: <REPLACE LINK HERE>

@mountiny mountiny added the NewFeature Something to build that is a new item. label Nov 15, 2023
Copy link

melvin-bot bot commented Nov 15, 2023

@melvin-bot melvin-bot bot added Weekly KSv2 and removed Monthly KSv2 labels Nov 15, 2023
@rinej
Copy link
Contributor

rinej commented Nov 17, 2023

Hello mountiny,

We created an example approach using the JSON REST API Grafana plugin -> https://grafana.com/grafana/plugins/marcusolsson-json-datasource/

Here is the example dashboard:
Screenshot 2023-11-17 at 15 42 22

In that scenario, we created a simple node server that exposed the endpoints with JSON data, which Grafana consumed. It has some limitations; we will have to maintain a separate instance of the server to deliver data.

We are exploring the other two approaches:

  1. Storing just the JSON file on the S3 bucket and updating it when the user runs tests.
  2. Storing the SQLite database file in the S3 bucket and updating it when the user runs tests.

This will not require any additional server, just, for example, an S3 bucket to store the data. I will let you know when we have some insights about the approaches.

Meanwhile, do you have any preferences over what needs to be displayed and shown on the graph? I can send you the example output file from Reassure; just let me know! :)

@mountiny
Copy link
Contributor

Thank you @rinej!

Storing just the JSON file on the S3 bucket and updating it when the user runs tests.

I thin kthis sounds good to me, we already have solutions where we save files to S3 bucket from the GH actions so it shoudl also be reasonable easy to achieve

@mountiny
Copy link
Contributor

Meanwhile, do you have any preferences over what needs to be displayed and shown on the graph?

I was imagining that we would output the results of main branch over time, when the results get worse we can see based on timestamp what PR caused it, but we could just keep data about the results on main for various test scenarios and see how they evolved overtime

@rinej
Copy link
Contributor

rinej commented Nov 22, 2023

Thanks mountiny you for your answers!
We created some POC using JSON rest API grafana plugin

JSON api plugin: https://grafana.com/grafana/plugins/marcusolsson-json-datasource/
Plugin needs one endpoint which returns the data

Data used for the POC response (3 test cases run on different dates on different branches):

{"name":"should render Composer with text input interactions","type":"render", "meanDuration":18.244273030757904, "creationDate":"2023-11-17T14:49:26.957Z", "commitHash":"137a4c761ee686bd6e094ec025b0dcc3f5f454e1", "branch":"some-branch" },
{"name":"should press add attachemnt button","type":"render", "meanDuration":24.91762231886387, "creationDate":"2023-11-17T14:49:26.957Z", "commitHash":"137a4c761ee686bd6e094ec025b0dcc3f5f454e1", "branch":"some-branch" },
{"name":"should press add emoji button","type":"render", "meanDuration":12.9931875705718993, "creationDate":"2023-11-17T14:49:26.957Z", "commitHash":"137a4c761ee686bd6e094ec025b0dcc3f5f454e1", "branch":"some-branch" },
{"name":"should render Composer with text input interactions","type":"render", "meanDuration":13.244273030757904, "creationDate":"2023-11-15T12:45:04.146Z", "commitHash":"137a4c761ee686bd6e094ec025b0dcc3f5f454e1", "branch":"branch123" },
{"name":"should press add attachemnt button","type":"render", "meanDuration":16.91762231886387, "creationDate":"2023-11-15T12:45:04.146Z", "commitHash":"137a4c761ee686bd6e094ec025b0dcc3f5f454e1", "branch":"branch123" },
{"name":"should press add emoji button","type":"render", "meanDuration":2.9931875705718993, "creationDate":"2023-11-15T12:45:04.146Z", "commitHash":"137a4c761ee686bd6e094ec025b0dcc3f5f454e1", "branch":"branch123" },
{"name":"should render Composer with text input interactions","type":"render", "meanDuration":14.244273030757904, "creationDate":"2023-11-16T14:49:26.957Z", "commitHash":"137a4c761ee686bd6e094ec025b0dcc3f5f454e1", "branch":"test-branch" },
{"name":"should press add attachemnt button","type":"render", "meanDuration":20.91762231886387, "creationDate":"2023-11-16T14:49:26.957Z", "commitHash":"137a4c761ee686bd6e094ec025b0dcc3f5f454e1", "branch":"test-branch" },
{"name":"should press add emoji button","type":"render", "meanDuration":5.9931875705718993, "creationDate":"2023-11-16T14:49:26.957Z", "commitHash":"137a4c761ee686bd6e094ec025b0dcc3f5f454e1", "branch":"test-branch" }

Example dashboards:
Screenshot 2023-11-21 at 17 05 08

A - table view - we can choose from a dropdown which test data we want to display
B - Tests grouped by name on a timeline - we can preview additional information about each point
C - Table view for all test data
D - Tests grouped by name on one graph

I think dashboards A and B will be the most useful, so we can see on the timeline when performance has dropped

🎯 What needs to be done:

  • Add github action to run reassure after merge to main
  • Upload output.json file to S3 bucket
    • It can be done either in reassure lib or in the script in Expensify repo
  • Create endpoint which gets JSONs files from S3, combines it and returns data in the above format
  • Create dashboard in Grafana which consumes the endpoint

Questions to mountiny?

  • Do you have some infrastructure so we can create and maintain the simple REST get endpoint? The logic in the endpoint:
  • Get all json performance files from S3 bucket (or any other storage place)
  • Combine and return the final json file which grafana will consume

@mountiny
Copy link
Contributor

@rinej Looks very promising, thanks! I will discuss with out infra team who could help us with the implementation as I do not have access to edit S3 or add this plugin to Graphana if we do not have it yet

Copy link

melvin-bot bot commented Nov 22, 2023

Triggered auto assignment to @johnmlee101 (ring0), see https://stackoverflow.com/c/expensify/questions/6102 for more details.

@mountiny
Copy link
Contributor

Thread we were discussing this is here

@rinej
Copy link
Contributor

rinej commented Nov 24, 2023

@mountiny I don't have access to the above link or the link is broken, could you give me the access?

@mountiny
Copy link
Contributor

Sorry cannot give you access to that one. It was Thanksgiving week so @johnmlee101 will be back this week and he is going to be able to help us with this I am sure

@johnmlee101
Copy link
Contributor

Upload output.json file to S3 bucket

Okay might need to figure out some of these details. I think S3 might be overkill if we just need to store and fetch.
Going to check one last thing with Infra

@johnmlee101
Copy link
Contributor

@mountiny @rinej
https://docs.github.com/en/actions/using-workflows/storing-workflow-data-as-artifacts

@justinpersaud brought up using artifacts if we don't care about the data lasting for too long. We have a max 90 day policy.

As for the REST endpoint if we can easily find a way to list all artifacts this way that would be ideal

@rinej
Copy link
Contributor

rinej commented Dec 5, 2023

@johnmlee101 @mountiny
you mentioned that you already use Grafana for some metrics, what kind of data source do you use? Do you use some service like https://prometheus.io/ recommended from Grafana?
Maybe adding integration to the existing flow would be more efficient, instead of creating something new (like S3)?

Do you use Grafana cloud service or standalone app?

@melvin-bot melvin-bot bot added Weekly KSv2 Daily KSv2 and removed Weekly KSv2 labels Mar 11, 2024
Copy link

melvin-bot bot commented Mar 13, 2024

Skipping the payment summary for this issue since all the assignees are employees or vendors. If this is incorrect, please manually add the payment summary SO.

@melvin-bot melvin-bot bot added Weekly KSv2 and removed Daily KSv2 labels Mar 13, 2024
@melvin-bot melvin-bot bot changed the title [HOLD for payment 2024-03-13] Investigate a dashboard tool to monitor app performance improvement/degradation after deploys [HOLD for payment 2024-03-20] [HOLD for payment 2024-03-13] Investigate a dashboard tool to monitor app performance improvement/degradation after deploys Mar 13, 2024
Copy link

melvin-bot bot commented Mar 13, 2024

Reviewing label has been removed, please complete the "BugZero Checklist".

@melvin-bot melvin-bot bot removed the Reviewing Has a PR in review label Mar 13, 2024
Copy link

melvin-bot bot commented Mar 13, 2024

The solution for this issue has been 🚀 deployed to production 🚀 in version 1.4.51-3 and is now subject to a 7-day regression period 📆. Here is the list of pull requests that resolve this issue:

If no regressions arise, payment will be issued on 2024-03-20. 🎊

For reference, here are some details about the assignees on this issue:

  • @rinej does not require payment (Contractor)

Copy link

melvin-bot bot commented Mar 13, 2024

BugZero Checklist: The PR adding this new feature has been merged! The following checklist (instructions) will need to be completed before the issue can be closed:

  • [@rinej] Please propose regression test steps to ensure the new feature will work correctly on production in further releases.
  • [@stephanieelliott] Link the GH issue for creating/updating the regression test once above steps have been agreed upon.

@stephanieelliott
Copy link
Contributor

This was a new feature/tool, no regression test needed here.

@mountiny
Copy link
Contributor

Correct! I think this is done but just checked the dashboard and cant see anything in it
image

asking if we moved it somewhere else

@mountiny
Copy link
Contributor

It just takes very long to load for anything more than a day

@melvin-bot melvin-bot bot added Reviewing Has a PR in review Weekly KSv2 and removed Weekly KSv2 labels Mar 15, 2024
@melvin-bot melvin-bot bot changed the title [HOLD for payment 2024-03-20] [HOLD for payment 2024-03-13] Investigate a dashboard tool to monitor app performance improvement/degradation after deploys [HOLD for payment 2024-03-26] [HOLD for payment 2024-03-20] [HOLD for payment 2024-03-13] Investigate a dashboard tool to monitor app performance improvement/degradation after deploys Mar 19, 2024
Copy link

melvin-bot bot commented Mar 19, 2024

Reviewing label has been removed, please complete the "BugZero Checklist".

@melvin-bot melvin-bot bot removed the Reviewing Has a PR in review label Mar 19, 2024
Copy link

melvin-bot bot commented Mar 19, 2024

The solution for this issue has been 🚀 deployed to production 🚀 in version 1.4.54-4 and is now subject to a 7-day regression period 📆. Here is the list of pull requests that resolve this issue:

If no regressions arise, payment will be issued on 2024-03-26. 🎊

For reference, here are some details about the assignees on this issue:

  • @rinej does not require payment (Contractor)

Copy link

melvin-bot bot commented Mar 19, 2024

BugZero Checklist: The PR adding this new feature has been merged! The following checklist (instructions) will need to be completed before the issue can be closed:

  • [@rinej] Please propose regression test steps to ensure the new feature will work correctly on production in further releases.
  • [@stephanieelliott] Link the GH issue for creating/updating the regression test once above steps have been agreed upon.

@melvin-bot melvin-bot bot added the Overdue label Mar 28, 2024
@stephanieelliott
Copy link
Contributor

No regression test needed, this was a new feature! Expert contrib so no payment due on this one.

@melvin-bot melvin-bot bot removed the Overdue label Mar 28, 2024
@github-project-automation github-project-automation bot moved this from CRITICAL to Done in [#whatsnext] #quality Mar 28, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Awaiting Payment Auto-added when associated PR is deployed to production NewFeature Something to build that is a new item. ring0 Weekly KSv2
Projects
Development

No branches or pull requests

6 participants