Error in read.dta(tmpfile) : a binary read error occurred #2

Closed
pssguy opened this issue Feb 24, 2014 · 12 comments

@pssguy

pssguy commented Feb 24, 2014

This occurs with the following command:

DpiData <- DpiGet()

Also, there appears to be a typo on the main page: you refer to pdData as not yet being on CRAN.

@christophergandrud
Contributor

Hi @pssguy

Can you try out the following code:

library(foreign)

url = 'http://bit.ly/1jZ3nmM'
tmpfile <- tempfile()
download.file(url, tmpfile)
DpiData <- read.dta(tmpfile)  
unlink(tmpfile)

Also can you send the output from:

sessionInfo()

in the case where DpiData gives you this error.

@pssguy
Author

pssguy commented Feb 24, 2014

Hi
I shut down and reopened RStudio and reran code
Hope this helps

library(psData)
library(foreign)

url = 'http://bit.ly/1jZ3nmM'
tmpfile <- tempfile()
download.file(url, tmpfile)
trying URL 'http://bit.ly/1jZ3nmM'
Content type 'text/plain' length 4098155 bytes (3.9 Mb)
opened URL
downloaded 3.9 Mb

DpiData <- read.dta(tmpfile)
Error in read.dta(tmpfile) : a binary read error occurred
unlink(tmpfile)

sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_Canada.1252 LC_CTYPE=English_Canada.1252 LC_MONETARY=English_Canada.1252
[4] LC_NUMERIC=C LC_TIME=English_Canada.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] foreign_0.8-59 psData_0.1.2

loaded via a namespace (and not attached):
[1] countrycode_0.16 data.table_1.8.10 DataCombine_0.1.17 plyr_1.8 rJava_0.9-6 tools_3.0.2
[7] xlsx_0.5.5 xlsxjars_0.5.0

@christophergandrud
Contributor

Hm, there is some issue with either download.file or read.dta.

When you run

url = 'http://bit.ly/1jZ3nmM'
tmpfile <- tempfile()
download.file(url, tmpfile)

Do you get a message that looks like this:

trying URL 'http://bit.ly/1jZ3nmM'
Content type 'text/plain' length 4098155 bytes (3.9 Mb)
opened URL
==================================================
downloaded 3.9 Mb

returned in your R console?

@pssguy
Author

pssguy commented Feb 24, 2014

Yep
Every time. See previous comment

@christophergandrud
Contributor

Huh, then the problem could be within read.dta.

What if you download the file manually (put http://bit.ly/1jZ3nmM into your web browser) and then load it into R with something like:

Test <- read.dta('THE_FILE_PATH_TO_THE_DTA_FILE')

@pssguy
Author

pssguy commented Feb 24, 2014

That seems to work fine
str(Test) # 'data.frame': 6764 obs. of 125 variables:

BTW where would I find column definitions?

@christophergandrud
Contributor

By the process of elimination, it must be something to do with where/how your computer is storing the tmpfile, i.e. something happens somewhere in:

tmpfile <- tempfile()
download.file(url, tmpfile)

This is really difficult to debug without having access to your computer and file system. What do you get with:

tempdir()

Stata .dta files don't store variable descriptions in a column; they are stored in the binary in a way that read.dta (I think) doesn't access. You can, though, easily find the full DPI variable descriptions here.
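As a side note on the column-definition question: the foreign documentation describes a "var.labels" attribute on the data frame that read.dta returns, so the labels may be reachable from R after all. A minimal sketch, assuming a small made-up data frame (the column names and labels below are purely illustrative, not taken from the DPI file):

```r
library(foreign)

# Illustrative data only; not the DPI data set.
df <- data.frame(year = c(2000L, 2001L), gdp = c(1.5, 1.7))

# write.dta picks up a "var.labels" attribute if one is present.
attr(df, "var.labels") <- c("Calendar year", "GDP growth (%)")

# Round-trip through a temporary Stata file.
path <- tempfile(fileext = ".dta")
write.dta(df, path)
back <- read.dta(path)

# Variable labels come back as an attribute, not as a column.
labels <- attr(back, "var.labels")
print(data.frame(variable = names(back), label = labels))

unlink(path)
```

For a file that is already loaded, attr(Test, "var.labels") would be the equivalent call; whether the DPI file carries useful labels is not something this thread confirms.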

@pssguy
Author

pssguy commented Feb 24, 2014

Here we go

library(psData)
library(foreign)
tempdir() # "C:\Users\Andy\AppData\Local\Temp\RtmpCaMTEP"
url = 'http://bit.ly/1jZ3nmM'
tempfile() # [1] "C:\Users\Andy\AppData\Local\Temp\RtmpCaMTEP\file2d70c7d293697"
tmpfile <- tempfile()
tmpfile # "C:\Users\Andy\AppData\Local\Temp\RtmpCaMTEP\file2d70c62631a0"
download.file(url, tmpfile)
tmpfile # "C:\Users\Andy\AppData\Local\Temp\RtmpCaMTEP\file2d70c62631a0" (4006kb, Type: file)
DpiData <- read.dta(tmpfile) # Error in read.dta(tmpfile) : a binary read error occurred

Is this relevant for read.dta? From its help page:
Description

Reads a file in Stata version 5–12 binary format into a data frame.

Frozen: will not support Stata formats after 12.

The default file format for Stata 13, format-115, is substantially different from those for Stata 5–12
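One quick check at this point is to compare the size of the temp file on disk with the length that download.file reported (4098155 bytes in the output earlier in the thread). A mismatch would suggest the bytes were changed while being written. A rough sketch; has_expected_size is a hypothetical helper, not part of foreign or psData:

```r
# Hypothetical helper: report whether a file on disk has the expected size.
has_expected_size <- function(path, expected_bytes) {
  actual <- file.info(path)$size
  message("on disk: ", actual, " bytes; expected: ", expected_bytes, " bytes")
  isTRUE(actual == expected_bytes)
}
```

In the session above this would be called as has_expected_size(tmpfile, 4098155) right after download.file(url, tmpfile).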

@christophergandrud
Contributor

I'm stumped. The data is definitely in a pre-Stata 13 format, and you found that read.dta can open it when it's not in a temp file. We can also see that R knows where the temp file is. The same series of functions works on other computers and passes a CRAN check.

So yeah, I'm stumped. You might want to contact the maintainers of the foreign package where read.dta is from. They might have a better idea of what is going on.

@christophergandrud
Contributor

I'm closing this issue for now as it's not exactly a psData issue. Though if anyone finds the answer, please reopen it.

@briatte
Contributor

briatte commented Feb 28, 2014

  • try specifying the mode: download.file(mode = "wb")
  • replace download.file by downloader::download
  • try getting the data through another GET function
  • try replacing the link by the original one

The first item should help, and the second one always solved my encoding issues, because it handles some Windows/Mac backend options that can cause issues from time to time.

@pssguy
Copy link
Author

pssguy commented Feb 28, 2014

Thanks briatte
Just adding the mode="wb" argument appears to solve the issue
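For anyone landing here later, the fix can be sketched as below. The download.file documentation lists "w" as the default mode and "wb" as the binary one; on Windows a text-mode write can alter the bytes of a binary file, which would be consistent with the read error above. read_dta_from_url is a hypothetical wrapper written for illustration, not a function from psData or foreign:

```r
library(foreign)

# Hypothetical helper: download a Stata file in binary mode and read it.
read_dta_from_url <- function(url) {
  tmpfile <- tempfile(fileext = ".dta")
  # Remove the temp file when the function exits, even on error.
  on.exit(unlink(tmpfile), add = TRUE)
  # mode = "wb" writes the downloaded bytes unchanged.
  download.file(url, tmpfile, mode = "wb")
  read.dta(tmpfile)
}
```

With the link used in this thread, the call would be DpiData <- read_dta_from_url('http://bit.ly/1jZ3nmM').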
