-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DO NOT MERGE: Debug gene info endpoint returning null #1153
Conversation
b6f146d
to
31fd26b
Compare
@@ -22,4 +22,4 @@ stack: | |||
- name: DATA_LOCATOR_DOMAIN | |||
value: $(__ARGUS_STACK_INGRESS_HOST) | |||
- name: CXG_BUCKET_PATH | |||
value: hosted-cellxgene-dev | |||
value: hosted-cellxgene-staging |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hspitzley-czi I'm trying to test a fix by changing Explorer BE from hitting dev to staging, but I'm getting:
Unable to discover s3 region name from s3://hosted-cellxgene-staging (type=value_error)
Is that because rdev role permission doesn't have access to staging bucket?
The bug I'm trying to address is that rdev gene info endpoint always returns null
🤔
CURL
curl 'https://certain-bull.dev-sc.dev.czi.team/cellxgene/s3_uri/s3%253A%252F%252Fhosted-cellxgene-dev%252Fsuper-cool-spatial.cxg/api/v0.3/geneinfo?geneID=&gene=AR' \
-H 'accept: application/json' \
-H 'accept-language: en-US,en;q=0.9,zh-TW;q=0.8,zh;q=0.7,zh-CN;q=0.6' \
-H 'cookie: argus_sticky_session=1734566048.418.1662.83615|e3f9a91a7a0db7bca0ae9c8322a67838' \
-H 'dnt: 1' \
-H 'priority: u=1, i' \
-H 'referer: https://certain-bull.dev-sc.dev.czi.team/d/super-cool-spatial.cxg/' \
-H 'sec-ch-ua: "Google Chrome";v="131", "Chromium";v="131", "Not_A Brand";v="24"' \
-H 'sec-ch-ua-mobile: ?0' \
-H 'sec-ch-ua-platform: "macOS"' \
-H 'sec-fetch-dest: empty' \
-H 'sec-fetch-mode: cors' \
-H 'sec-fetch-site: same-origin' \
-H 'user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36'
whereas on staging it returns actual info:
curl 'https://api.cellxgene.staging.single-cell.czi.technology/cellxgene/s3_uri/s3%253A%252F%252Fhosted-cellxgene-staging%252Fpbmc3k.cxg/api/v0.3/geneinfo?geneID=&gene=ACD' \
-H 'accept: application/json' \
-H 'accept-language: en-US,en;q=0.9,zh-TW;q=0.8,zh;q=0.7,zh-CN;q=0.6' \
-H 'cookie: argus_sticky_session=1734565349.449.38.739565|df4dba4aefd2407066eb62a1488673d2' \
-H 'dnt: 1' \
-H 'origin: https://cellxgene.staging.single-cell.czi.technology' \
-H 'priority: u=1, i' \
-H 'referer: https://cellxgene.staging.single-cell.czi.technology/' \
-H 'sec-ch-ua: "Google Chrome";v="131", "Chromium";v="131", "Not_A Brand";v="24"' \
-H 'sec-ch-ua-mobile: ?0' \
-H 'sec-ch-ua-platform: "macOS"' \
-H 'sec-fetch-dest: empty' \
-H 'sec-fetch-mode: cors' \
-H 'sec-fetch-site: same-site' \
-H 'user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, that seems likely. I'm not sure what the application code is running to produce that error but looking at the role used by rdev I'm not seeing permission for the staging bucket
https://github.com/chanzuckerberg/single-cell-explorer/blob/main/.infra/common.yaml#L24
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see! I think the code that produces the error is this line in the Explorer BE code:
raise ValueError(f"Unable to discover s3 region name from {path}") |
My hypothesis is that the Explorer BE is returning null
for gene info query now, because behind the scene Explorer GE actually queries Data Portal's dev gene info endpoint for data, and I believe dev data portal endpoint is no longer working, thus us seeing the issue
https://cellxgene.dev.single-cell.czi.technology/ <-- gets 503 service not available now
So I'm thinking the solution here is for Explorer rdev to switch from hitting dev to staging and thus needing permission to staging bucket
Should I just change the common iam
from dev
to staging
?
arn:aws:iam::699936264352:role/data_portal_staging_explorer
Let me try that out! Thanks for pointing me to the code!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You might want to check if there are other differences in the permissions between the dev role and staging role. You could probably pretty easily add permission to the dev role for the staging bucket
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah thanks for the callout! Okay let me test if using staging role actually works, if so I'll just add staging bucket permission to dev role 🙆♂️
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Huh yeah github org wide search didn't return anything o_o
https://github.com/search?q=org:chanzuckerberg%20jheath-argus-demo-deleteme&type=code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ooh it looks like the access key and secret key set with argus secrets belongs to that user. (pass -r
to see the values)
» argus list secrets --app single-cell-explorer --env rdev hspitzley@MacBookPro
Listing environment variables on environment rdev
+-----------------------+----------+-------------+
| KEY | VALUE | SOURCE |
+-----------------------+----------+-------------+
| AWS_ACCESS_KEY_ID | ******** | environment |
| AWS_SECRET_ACCESS_KEY | ******** | environment |
+-----------------------+----------+-------------+
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was some reason we needed it to assume a user instead of a role (I'm forgetting why right now) but this gives us a path forward. You could copy those values elsewhere so you can restore them later and then overwrite them with the corresponding values from staging and I think that will make it able to access the staging bucket
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh fantastic!! Thanks so much for helping me out with this 🤩 🙏 That's really good to know 💡
Okay will do and retest 🫡
Thanks again!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay can confirm that my rdev can now hit staging endpoint thank you!
However, somehow the staging endpoint is still returning null
, so I'm adding more logs to troubleshoot what's going on there. Will update this thread 👌
ad4cde2
to
f70b093
Compare
d0cbc71
to
366e793
Compare
4719a41
to
15b9998
Compare
67889c1
to
d98718d
Compare
d203f43
to
92a1302
Compare
92a1302
to
d81ca4c
Compare
89e3b23
to
18feafd
Compare
Stale Stack Warning The stack associated with this PR has not been updated since 2024-12-23 18:51:13 UTC (~10.2 days ago). The threshold for deletion specified in the 'stale_rdev_cleanup_threshold_days' setting for the associated app (single-cell-explorer) is 14 days. Unless a new commit is made to this branch before the threshold is reached, the stack will be deleted. |
Closing this for now to focus on higher priority stuff. The latest investigation shows that the rdev hitting the staging API still returns |
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #1153 +/- ##
============================
============================
☔ View full report in Codecov by Sentry. |
Argus Stack Details
This reverts commit 6d162bf.