add cacheable ParquetResponseEmptyError in first-rows-from-parquet #2101

Merged: 3 commits on Nov 14, 2023

@AndreaFrancis AndreaFrancis commented Nov 13, 2023

Part of #1443
Currently, the cache collection contains 303 records stored as UnexpectedError whose cause is ParquetResponseEmptyError: { _id: { cause: 'ParquetResponseEmptyError' }, count: 303 }.
There are two definitions of ParquetResponseEmptyError: one in parquet_utils.py (https://github.com/huggingface/datasets-server/blob/main/libs/libcommon/src/libcommon/parquet_utils.py#L29) and another in exceptions.py (https://github.com/huggingface/datasets-server/blob/main/libs/libcommon/src/libcommon/exceptions.py#L424).
SplitFirstRowsFromParquetJobRunner should raise the cacheable version of ParquetResponseEmptyError, but it currently does not handle the non-cacheable exception, so the failure is recorded as an UnexpectedError.
After this PR, I will refresh the cache of the 303 records.
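
To illustrate the problem described above, here is a minimal, self-contained sketch of the pattern. It is not the datasets-server code: every class and function below is a hypothetical stand-in, and it assumes the non-cacheable definition is the one in parquet_utils.py while the cacheable one lives in exceptions.py. The idea is that a job runner which lets the plain exception escape ends up with UnexpectedError, while one that catches it and re-raises the cacheable version gets a specific error code.

```python
from typing import Callable, Dict, List, Optional


class ParquetResponseEmptyError(Exception):
    """Stand-in for the plain, non-cacheable exception (assumed: parquet_utils.py)."""


class CacheableError(Exception):
    """Hypothetical base class for errors that carry a code to store in the cache."""

    def __init__(self, message: str, code: str, cause: Optional[BaseException] = None):
        super().__init__(message)
        self.code = code
        self.cause = cause


class CacheableParquetResponseEmptyError(CacheableError):
    """Stand-in for the cacheable definition (assumed: exceptions.py)."""

    def __init__(self, message: str, cause: Optional[BaseException] = None):
        super().__init__(message, code="ParquetResponseEmptyError", cause=cause)


def load_rows(parquet_files: List[str]) -> List[Dict[str, str]]:
    """Hypothetical low-level helper that only knows the plain exception."""
    if not parquet_files:
        raise ParquetResponseEmptyError("No parquet files found.")
    return [{"file": name} for name in parquet_files]


def compute_without_translation(parquet_files: List[str]) -> List[Dict[str, str]]:
    # Behaviour described as the bug: the plain exception escapes untouched.
    return load_rows(parquet_files)


def compute_with_translation(parquet_files: List[str]) -> List[Dict[str, str]]:
    # Behaviour described as the fix: catch the plain exception and
    # re-raise the cacheable one, keeping the original as the cause.
    try:
        return load_rows(parquet_files)
    except ParquetResponseEmptyError as err:
        raise CacheableParquetResponseEmptyError("No parquet files found.", cause=err) from err


def run_job(compute: Callable[[List[str]], list], parquet_files: List[str]) -> dict:
    """Hypothetical job wrapper: anything that is not cacheable becomes UnexpectedError."""
    try:
        return {"error_code": None, "rows": compute(parquet_files)}
    except CacheableError as err:
        cause = type(err.cause).__name__ if err.cause else None
        return {"error_code": err.code, "cause": cause}
    except Exception as err:
        return {"error_code": "UnexpectedError", "cause": type(err).__name__}


before = run_job(compute_without_translation, [])
after = run_job(compute_with_translation, [])
print(before["error_code"], after["error_code"])
```

In this sketch, the first call yields an UnexpectedError entry with cause ParquetResponseEmptyError, which mirrors the shape of the 303 records mentioned above, while the second call yields the specific ParquetResponseEmptyError code.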

@AndreaFrancis AndreaFrancis marked this pull request as ready for review November 13, 2023 18:54
@AndreaFrancis AndreaFrancis requested a review from a team November 13, 2023 18:55
@AndreaFrancis AndreaFrancis requested a review from severo November 14, 2023 11:59
@codecov-commenter
codecov-commenter commented Nov 14, 2023

Codecov Report

Attention: 1 line in your changes is missing coverage. Please review.

Comparison is base (67459d2) 90.45% compared to head (194867a) 90.37%.
Report is 3 commits behind head on main.

| Files | Patch % | Lines |
|---|---|---|
| libs/libcommon/src/libcommon/parquet_utils.py | 50.00% | 1 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2101      +/-   ##
==========================================
- Coverage   90.45%   90.37%   -0.09%     
==========================================
  Files         248      249       +1     
  Lines       15204    15221      +17     
==========================================
+ Hits        13753    13756       +3     
- Misses       1451     1465      +14     

| Flag | Coverage | Δ |
|---|---|---|
| jobs_cache_maintenance | 95.33% <ø> | (+0.03%) ⬆️ |
| jobs_mongodb_migration | 86.69% <ø> | (ø) |
| libs_libcommon | 90.06% <50.00%> | (-0.15%) ⬇️ |
| services_admin | 86.56% <ø> | (-1.08%) ⬇️ |
| services_api | 87.09% <ø> | (ø) |
| services_rows | 85.49% <ø> | (ø) |
| services_search | 80.48% <ø> | (ø) |
| services_sse-api | 94.21% <ø> | (ø) |
| services_worker | 92.58% <100.00%> | (+<0.01%) ⬆️ |

Flags with carried forward coverage won't be shown.

@AndreaFrancis AndreaFrancis merged commit 882de92 into main Nov 14, 2023
@AndreaFrancis AndreaFrancis deleted the parquet-reponse-empty-error branch November 14, 2023 15:12