upload requests failing #63

Closed
brenktt opened this issue Aug 14, 2024 · 25 comments
Labels
bug Something isn't working question Further information is requested

Comments

@brenktt

brenktt commented Aug 14, 2024

Hi, I'm having issues communicating with the volume system using the db_volume_write() function in the latest release (v0.2.4), with code that was working a couple of months ago.

I receive the following error:

Error in httr2::req_perform():
! Failed to perform HTTP request.
Caused by error in curl::curl_fetch_memory():
! Could not resolve host: https; Unknown error

It seems to me that the issue is with the host parameter. So far I have provided it in a format like https://adb-<many_digits>.<single_digit>.azuredatabricks.net/. According to the current documentation, host should be in a format like xxxxxxx.cloud.databricks.com.

Could you help me get the host address in the correct format?

@zacdav-db
Contributor

@brenktt presumably this is not specific to volumes, do other functions work?

I believe any of the following should be valid:

  • adb-<many_digits>.<single_digit>.azuredatabricks.net
  • https://adb-<many_digits>.<single_digit>.azuredatabricks.net

Remove the trailing / and try again please.
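
As a minimal sketch of the cleanup suggested above, the helper below strips a leading scheme and any trailing slashes so the host ends up in the bare form. The function name normalize_host and the all-digit host in the example are hypothetical placeholders, not part of brickster.

```r
# Hypothetical helper (not part of brickster): reduce a workspace URL
# to the bare host form, e.g. "adb-<digits>.<digit>.azuredatabricks.net".
normalize_host <- function(host) {
  host <- trimws(host)                                     # drop stray whitespace
  host <- sub("^https?://", "", host, ignore.case = TRUE)  # drop the scheme
  host <- sub("/+$", "", host)                             # drop trailing slashes
  host
}

# Placeholder host, for illustration only
normalize_host("https://adb-1234567890123456.7.azuredatabricks.net/")
```

The result can then be passed as the host argument. Whether the scheme has to be removed may depend on the brickster version, so treat that step as a precaution rather than a requirement.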

@brenktt
Author

brenktt commented Aug 15, 2024

Hi, I managed to get the connection working using the first option.
adb-<many_digits>.<single_digit>.azuredatabricks.net

However, I have run into a second issue when trying to upload files to a volume. The file starts uploading, but the upload immediately freezes and the expected upload time keeps increasing in the console.


In this example I'm trying to upload a very small Parquet file (15 KB).

I have tried reading from the volume and this works as expected, so the issue seems to be just with this db_volume_write() function.

@zacdav-db
Contributor

@brenktt can you try this please:

# adjust before running
vpath <- "/Volumes/<catalog>/<schema>/<volume>"

# save to tempdir
dir <- tempdir()
fpath <- file.path(dir, "cars.csv")
write.csv(cars, fpath)

# upload to volume
vol_dest <- file.path(vpath, "cars.csv")
brickster::db_volume_write(path = vol_dest, file = fpath, overwrite = TRUE)

# read from volume
local_dest <- file.path(dir, "vol_cars.csv")
path <- brickster::db_volume_read(path = vol_dest, destination = local_dest)

read.csv(path) # or `read.csv(local_dest)`

I'm unable to reproduce the issue so far, even with larger data.

zacdav-db self-assigned this on Aug 15, 2024
@brenktt
Author

brenktt commented Aug 15, 2024

I have tried your solution and it works, and to my surprise my code now works as well. There must have been some network issue at the time, or perhaps I messed up some of the function inputs.

Thank you so much for your help and sorry I wasted your time.

One more question from my side: I think I asked some time ago, but is there any plan to make the package available on CRAN? It was a life saver for me and it would be great if I did not have to install through GitHub.

@zacdav-db
Contributor

No worries, glad it's working now!

The CRAN process has been kicked off; I did the first review a few weeks ago. I have put some time aside to go through the feedback, and hopefully, all going well, it's on CRAN soon 🤞.

@brenktt
Author

brenktt commented Aug 15, 2024

Hopefully it works out for the best!

Could you please leave the issue open a little longer so I can test on larger datasets as well?

@brenktt
Author

brenktt commented Aug 15, 2024

It seems I was too quick with my conclusions, as I had tested on a file smaller than 16 KB. For some reason the upload freezes at exactly 16 KB for all files (whether Parquet or CSV).

@zacdav-db
Contributor

Is there an example file you can make that's reproducible?

@brenktt
Author

brenktt commented Aug 15, 2024

I just pasted a couple of mtcars data frames together so the file exceeds 16 KB:
write.csv(
  dplyr::bind_rows(
    mtcars, mtcars, mtcars, mtcars, mtcars, mtcars,
    mtcars, mtcars, mtcars, mtcars, mtcars, mtcars
  ),
  "mtcars.csv"
)
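
A rough, dependency-free variant of the same idea is sketched below. It swaps dplyr::bind_rows() for base R rbind(), writes to a temporary file instead of the working directory, and checks that the file really crosses the 16 KB mark before it is used in an upload test. The helper name make_test_csv and its arguments are hypothetical.

```r
# Hypothetical helper: stack n copies of mtcars and write them to a CSV.
# Base rbind() is used here instead of dplyr::bind_rows().
make_test_csv <- function(n_copies = 12, path = tempfile(fileext = ".csv")) {
  stacked <- do.call(rbind, replicate(n_copies, mtcars, simplify = FALSE))
  write.csv(stacked, path)
  path
}

fpath <- make_test_csv()
size_bytes <- file.size(fpath)

# Confirm the file is larger than the 16 KB point where uploads stalled
message("wrote ", size_bytes, " bytes to ", fpath)
size_bytes > 16 * 1024
```

Raising n_copies gives larger files for testing uploads well beyond the threshold.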

@zacdav-db
Contributor

Hmm, that works fine for me, I also tested data that was 150MB which worked as well.

e.g. adjusting my example to write 100k rows

write.csv(dplyr::sample_n(cars, 100000, TRUE), fpath)

@brenktt
Author

brenktt commented Aug 15, 2024

This is where the difference is between our sessions. Do you have any idea what could be causing this or is there anything else I could provide that could be investigated?

@zacdav-db
Contributor

@brenktt you can paste the output of sessionInfo().

Ensure httr2 is up to date and maybe try a different internet connection?

@brenktt
Author

brenktt commented Aug 20, 2024

Here is the output:

R version 4.2.1 (2022-06-23)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Red Hat Enterprise Linux Server 7.9 (Maipo)

Matrix products: default
BLAS/LAPACK: /usr/lib64/libopenblasp-r0.3.3.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices datasets  utils    
[6] methods   base     

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.9        rstudioapi_0.14   knitr_1.42       
 [4] magrittr_2.0.3    rappdirs_0.3.3    tidyselect_1.2.0 
 [7] bit_4.0.5         lattice_0.20-45   R6_2.5.1         
[10] rlang_1.1.4       fansi_1.0.3       stringr_1.5.0    
[13] httr2_1.0.2       tools_4.2.1       grid_4.2.1       
[16] xfun_0.39         png_0.1-8         arrow_14.0.0.2   
[19] utf8_1.2.2        DBI_1.1.3         cli_3.4.1        
[22] brickster_0.2.4   bit64_4.0.5       assertthat_0.2.1 
[25] tibble_3.1.8      lifecycle_1.0.3   Matrix_1.4-1     
[28] purrr_0.3.5       vctrs_0.5.2       glue_1.6.2       
[31] stringi_1.7.8     compiler_4.2.1    pillar_1.8.1     
[34] jsonlite_1.8.3    reticulate_1.38.0 renv_0.16.0      
[37] pkgconfig_2.0.3  

I have also spoken to our IT department and it seems only I have this issue. I will get back to you if this gets solved somewhere on the IT side. It is likely it is not actually an issue with the package.

zacdav-db added the question label on Aug 20, 2024
@zacdav-db
Contributor

Keep me posted. I'll close the issue in a week or two if I don't hear otherwise. Can always re-open.

@brenktt
Author

brenktt commented Aug 20, 2024

I will probably have an answer sometime at the start of September, so please keep the issue open until then.

@brenktt
Author

brenktt commented Sep 2, 2024

@zacdav-db So it turns out the issue is with httr2. The upload does not work with versions of the package above 1.0.1 (I have checked both 1.0.2 & 1.0.3). When I downgrade the package version this issue disappears.

@zacdav-db
Contributor

Thanks @brenktt, I can now repro the issue.

I'm having a dig through what's changed in {httr2}.
I wonder if the changes in r-lib/httr2#489 are to do with it 🤔

@zacdav-db
Contributor

I've tested the repro with the commit before the change and the commit with the change, and it's clear that the change is the culprit.

remotes::install_github(repo = "r-lib/httr2", ref = "ff16551") # before change, works
remotes::install_github(repo = "r-lib/httr2", ref = "bdb13fe") # after change, fails

@brenktt
Author

brenktt commented Sep 2, 2024

For my part, I'm happy this is now working, but of course it would be best to have the package working with the newest versions, as a lot of time was spent finding the issue.

@zacdav-db
Contributor

@brenktt of course. I'm investigating and will likely raise an issue with httr2 if it's indeed an issue there.

I want this to work with all versions without issue too!

@zacdav-db
Contributor

zacdav-db commented Sep 2, 2024

Raised an issue with httr2 (r-lib/httr2#524)

zacdav-db added the bug label on Sep 2, 2024
@zacdav-db
Contributor

I'll be waiting for a resolution before continuing with CRAN process - this is important before release.

@zacdav-db
Contributor

zacdav-db commented Sep 3, 2024

@brenktt The issue is now fixed in the development version of {httr2} - thanks again for raising the issue and initial debugging.

r-lib/httr2#525

@brenktt
Author

brenktt commented Sep 3, 2024

Thanks to you for the prompt investigation!

zacdav-db changed the title from "Authentication issues" to "upload requests failing" on Sep 10, 2024
@zacdav-db
Contributor

You can now install {httr2} 1.0.4 to resolve this issue.

https://github.com/r-lib/httr2/releases/tag/v1.0.4
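
To check whether a given setup falls in the affected range, something like the sketch below may help. It assumes, based only on this thread, that uploads work on httr2 1.0.1, fail on 1.0.2 and 1.0.3, and work again from 1.0.4. The helper name httr2_upload_affected is hypothetical, and treating every version between 1.0.1 and 1.0.4 as affected (development builds included) is an assumption.

```r
# Hypothetical helper: TRUE for httr2 versions reported in this thread as
# failing on uploads (above 1.0.1 and below the fixed release 1.0.4).
httr2_upload_affected <- function(version) {
  v <- package_version(version)
  v > package_version("1.0.1") && v < package_version("1.0.4")
}

# Check the installed httr2, if there is one
if (requireNamespace("httr2", quietly = TRUE)) {
  installed <- as.character(utils::packageVersion("httr2"))
  if (httr2_upload_affected(installed)) {
    message("httr2 ", installed, " may stall uploads; consider updating to 1.0.4 or later.")
  } else {
    message("httr2 ", installed, " is outside the range reported in this thread.")
  }
}
```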
